Three kinds of inference for means of normally-distributed data:
Two forms of inference for a population parameter:
Sir Ronald A. Fisher, 1890–1962.
Truth | Decision: do not reject null | Decision: reject null |
---|---|---|
Null true | Correct | Type I error |
Null false | Type II error | Correct |
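A small simulation can make the "Null true" row of the table concrete. This is only a sketch under assumed settings: the sample size of 25, the standard normal population, the seed, and the number of simulated samples are all hypothetical choices, not taken from these notes. Every sample is drawn with the null hypothesis true, so each rejection is a Type I error.

```r
# Every simulated sample has true mean 0, so the null (mu = 0) is TRUE;
# any rejection at level alpha is therefore a Type I error.
set.seed(1)      # arbitrary seed, for reproducibility only
alpha <- 0.05
n_sim <- 2000    # hypothetical number of simulated samples

p_values <- replicate(
  n_sim,
  t.test(rnorm(25, mean = 0, sd = 1), mu = 0)$p.value
)

# Proportion of samples in which the (true) null was rejected
type1_rate <- mean(p_values < alpha)
type1_rate
```

The proportion of rejections should come out close to `alpha`, which is the sense in which the significance level controls the Type I error rate.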
…as a .csv file:
This data set is small in one sense (only 25 observations) but has a lot of variables (21 columns).
To see the first few values of every variable, you can also use `glimpse`:
Rows: 25
Columns: 21
$ row <dbl> 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96…
$ game <dbl> 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 27, 28, 29, 30, 31, 3…
$ date <chr> "Monday, Apr 13", "Tuesday, Apr 14", "Wednesday, Apr 15", …
$ box <chr> "boxscore", "boxscore", "boxscore", "boxscore", "boxscore"…
$ team <chr> "TOR", "TOR", "TOR", "TOR", "TOR", "TOR", "TOR", "TOR", "T…
$ venue <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ opp <chr> "TBR", "TBR", "TBR", "TBR", "ATL", "ATL", "ATL", "BAL", "B…
$ result <chr> "L", "L", "W", "L", "L", "W-wo", "L", "W", "W", "W", "W", …
$ runs <dbl> 1, 2, 12, 2, 7, 6, 2, 13, 4, 7, 3, 3, 5, 7, 7, 3, 10, 2, 3…
$ Oppruns <dbl> 2, 3, 7, 4, 8, 5, 5, 6, 2, 6, 1, 6, 1, 0, 1, 6, 6, 3, 4, 4…
$ innings <dbl> NA, NA, NA, NA, NA, 10, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ wl <chr> "4-3", "4-4", "5-4", "5-5", "5-6", "6-6", "6-7", "7-7", "8…
$ position <dbl> 2, 3, 2, 4, 4, 3, 4, 2, 2, 1, 4, 5, 3, 3, 3, 3, 5, 5, 5, 5…
$ gb <chr> "1", "2", "1", "1.5", "2.5", "1.5", "1.5", "2", "1", "Tied…
$ winner <chr> "Odorizzi", "Geltz", "Buehrle", "Archer", "Martin", "Cecil…
$ loser <chr> "Dickey", "Castro", "Ramirez", "Sanchez", "Cecil", "Marimo…
$ save <chr> "Boxberger", "Jepsen", NA, "Boxberger", "Grilli", NA, "Gri…
$ `game time` <time> 02:30:00, 03:06:00, 03:02:00, 03:00:00, 03:09:00, 02:41:0…
$ Daynight <chr> "N", "N", "N", "N", "N", "D", "D", "N", "N", "N", "N", "N"…
$ attendance <dbl> 48414, 17264, 15086, 14433, 21397, 34743, 44794, 14184, 15…
$ streak <chr> "-", "--", "+", "-", "--", "+", "-", "+", "++", "+++", "+"…
The `t.test` function does both the CI and the test. Look at the CI first:
One Sample t-test
data: jays$attendance
t = 11.389, df = 24, p-value = 3.661e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
20526.82 29613.50
sample estimates:
mean of x
25070.16
One Sample t-test
data: jays$attendance
t = 11.389, df = 24, p-value = 3.661e-11
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
21303.93 28836.39
sample estimates:
mean of x
25070.16
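To see where the two intervals come from, they can be rebuilt from the numbers printed in the output above. This is a sketch, not what `t.test` does internally with the raw data: it recovers the standard error from the reported `t` and mean (valid because the default null mean is 0), so results agree with the output only up to rounding. The helper name `ci` is made up for illustration.

```r
# Values copied from the t.test output above
xbar <- 25070.16   # sample mean of attendance
t0   <- 11.389     # t statistic for the default null (mean = 0)
df   <- 24         # degrees of freedom, n - 1 with n = 25

# With null mean 0, t = xbar / se, so the standard error can be recovered
se <- xbar / t0

# CI = xbar +/- t_star * se, where t_star is the critical value for the level.
# `ci` is a hypothetical helper, not part of any package.
ci <- function(level) {
  t_star <- qt(1 - (1 - level) / 2, df)
  c(lower = xbar - t_star * se, upper = xbar + t_star * se)
}

ci95 <- ci(0.95)   # compare with the 95 percent interval above
ci90 <- ci(0.90)   # compare with the 90 percent interval above
ci95
ci90
```

The 90% interval is narrower than the 95% one because its critical value `t_star` is smaller; the centre (the sample mean) is the same for both.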
…`jays`” using `$`.
One Sample t-test
data: jays$attendance
t = -1.9338, df = 24, p-value = 0.06502
alternative hypothesis: true mean is not equal to 29327
95 percent confidence interval:
20526.82 29613.50
sample estimates:
mean of x
25070.16
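The test statistic and P-value for the null mean of 29327 can be checked by hand from the printed summaries. As a sketch, this takes the standard error from the earlier output with null mean 0 (mean divided by its `t` of 11.389), so it matches the output only up to rounding.

```r
# Values copied from the t.test outputs above
xbar <- 25070.16            # sample mean of attendance
mu0  <- 29327               # null-hypothesis mean being tested
df   <- 24                  # degrees of freedom
se   <- 25070.16 / 11.389   # standard error, recovered from the mu = 0 output

# t = (sample mean - null mean) / standard error
t_stat <- (xbar - mu0) / se

# Two-sided P-value: probability in both tails beyond |t|
p_value <- 2 * pt(-abs(t_stat), df)

t_stat    # compare with the reported t = -1.9338
p_value   # compare with the reported p-value = 0.06502
```

This agrees with the confidence interval: 29327 lies inside the 95% interval, and correspondingly the two-sided P-value is greater than 0.05.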
`read_delim` (check file to see)
`c` (control) before `t` (treatment) alphabetically, so proper alternative is “less”.
Welch Two Sample t-test
data: score by group
t = -2.3109, df = 37.855, p-value = 0.01319
alternative hypothesis: true difference in means between group c and group t is less than 0
95 percent confidence interval:
-Inf -2.691293
sample estimates:
mean in group c mean in group t
41.52174 51.47619
Two Sample t-test
data: score by group
t = -2.2666, df = 42, p-value = 0.01431
alternative hypothesis: true difference in means between group c and group t is less than 0
95 percent confidence interval:
-Inf -2.567497
sample estimates:
mean in group c mean in group t
41.52174 51.47619
…`alternative`:
Welch Two Sample t-test
data: score by group
t = -2.3109, df = 37.855, p-value = 0.02638
alternative hypothesis: true difference in means between group c and group t is not equal to 0
95 percent confidence interval:
-18.67588 -1.23302
sample estimates:
mean in group c mean in group t
41.52174 51.47619
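The P-values in the three outputs above are tail areas of a t distribution, so they can be checked from the reported `t` and `df` alone. This is a sketch using the rounded values as printed; it does not need the reading-scores data.

```r
# Welch test: t and df as printed in the output above
t_welch  <- -2.3109
df_welch <- 37.855

# One-sided ("less"): area to the LEFT of the observed t
p_less <- pt(t_welch, df_welch)

# Two-sided: area in both tails beyond |t|
p_two <- 2 * pt(-abs(t_welch), df_welch)

# Pooled (equal-variance) test: different t and df, same idea
p_pooled_less <- pt(-2.2666, 42)

p_less          # compare with the reported 0.01319
p_two           # compare with the reported 0.02638
p_pooled_less   # compare with the reported 0.01431
```

Because the observed `t` is on the side the alternative “less” predicts, the two-sided P-value is exactly twice the one-sided one.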
Some data:
The data compare 2 drugs for effectiveness at reducing pain.
In the reading example, each child tried only one reading method.
But here, each subject tried out both drugs, giving us two measurements per subject.
Design: randomly choose 6 of the 12 subjects to get drug A first; the other 6 get drug B first.
Values aligned in columns:
Paired t-test
data: druga and drugb
t = -2.1677, df = 11, p-value = 0.05299
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-4.29941513 0.03274847
sample estimates:
mean difference
-2.133333
One Sample t-test
data: diff
t = -2.1677, df = 11, p-value = 0.05299
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-4.29941513 0.03274847
sample estimates:
mean of x
-2.133333
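The two outputs above are identical because a paired t-test is a one-sample t-test on the differences. A sketch of that equivalence follows. The pain-relief values are not reproduced here, so this uses R's built-in `sleep` data as a stand-in (an assumption for illustration only): 10 patients, each measured under two drugs, stored with the patients in the same order within each group.

```r
# Built-in `sleep` data, NOT the pain-relief data from this section:
# `extra` is the measurement, `group` is the drug ("1" or "2").
x <- sleep$extra[sleep$group == "1"]   # each patient under drug 1
y <- sleep$extra[sleep$group == "2"]   # the same patients under drug 2

# Paired t-test on the two columns of measurements
paired <- t.test(x, y, paired = TRUE)

# One-sample t-test on the within-patient differences, null mean 0
one_sample <- t.test(x - y, mu = 0)

# Same t statistic, P-value, and confidence interval both ways
c(paired = unname(paired$statistic), one_sample = unname(one_sample$statistic))
c(paired = paired$p.value, one_sample = one_sample$p.value)
```

Either form works; the one-sample version makes it explicit that only the differences matter.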
Comments