Worksheet 8

Packages

library(tidyverse)
library(car)
library(lme4)

One not especially long scenario, but you may find yourself with some thinking to do.
You have intelligent rats
Each of 12 students trained rats to run a maze, and the number of successful runs (out of 50 attempts) was recorded on each of 5 days. Some of the students were told that the rats they were training were “bright” (intelligent) and some of them were told that their rats were “dull” (not intelligent). In actual fact, all the rats were from the same source, and none of them were any more intelligent than the others. Did it make a difference whether the students were told that their rats were intelligent on the number of mazes the rats successfully ran, and, if so, was the effect consistent over time? The same group of rats were trained by the same student throughout this experiment, so it makes sense to treat the data as repeated measures.
The data are in http://ritsokiguess.site/datafiles/intelligent-rats.csv. The columns of interest to us are:
Student: a numerical identifier for each student
Treatment: what the student was told about their rats
Day1 through Day5: the number of successful runs on each day.
There are some other variables that will not concern us here.
- Read in and display (some of) the data.
Nothing unexpected here:
my_url <- "http://ritsokiguess.site/datafiles/intelligent-rats.csv"
maze <- read_csv(my_url)
Rows: 12 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Treatment
dbl (11): Student, PriorExp, Block, Day1, Day2, Day3, Day4, Day5, Relax, Suc...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
maze
- Set up and run an appropriate repeated-measures ANOVA. What do you conclude from it? (Hint: set up the response variable, and go through the steps required to run the `Manova`. Obtain the appropriate P-values, describing your process.)
There’s actually a fair bit to do here. The first bit is to grab the five `Day` columns and make a response matrix out of them, either this way:

maze %>% select(starts_with("Day")) %>%
  as.matrix() -> response
response
Day1 Day2 Day3 Day4 Day5
[1,] 10 13 12 12 9
[2,] 18 20 21 23 30
[3,] 16 14 20 20 20
[4,] 11 20 27 17 16
[5,] 27 24 26 28 27
[6,] 16 14 14 13 15
[7,] 8 10 21 10 20
[8,] 13 20 18 20 22
[9,] 24 23 27 29 27
[10,] 13 12 23 17 16
[11,] 9 15 24 12 11
[12,] 19 17 23 33 29
or this way:
response1 <- with(maze, cbind(Day1, Day2, Day3, Day4, Day5))
response1
Day1 Day2 Day3 Day4 Day5
[1,] 10 13 12 12 9
[2,] 18 20 21 23 30
[3,] 16 14 20 20 20
[4,] 11 20 27 17 16
[5,] 27 24 26 28 27
[6,] 16 14 14 13 15
[7,] 8 10 21 10 20
[8,] 13 20 18 20 22
[9,] 24 23 27 29 27
[10,] 13 12 23 17 16
[11,] 9 15 24 12 11
[12,] 19 17 23 33 29
The second way is more like what I did in class, but also more typing. The `as.matrix` turns a dataframe into a `matrix`, which is what the response needs to be in the end.

Next, run `lm`, set up the time-dependence stuff, run `Manova`, and look at the results. This is basically cookie-cutter code: take what appears in the lecture notes and copy and paste it, changing a few names:
maze.1 <- lm(response ~ Treatment, data = maze)
times <- colnames(response)
times.df <- data.frame(times = factor(times))
maze.2 <- Manova(maze.1, idata = times.df, idesign = ~ times)
summary(maze.2)
Warning in summary.Anova.mlm(maze.2): HF eps > 1 treated as 1
Type II Repeated Measures MANOVA Tests:
------------------------------------------
Term: (Intercept)
Response transformation matrix:
(Intercept)
Day1 1
Day2 1
Day3 1
Day4 1
Day5 1
Sum of squares and products for the hypothesis:
(Intercept)
(Intercept) 104160.3
Multivariate Tests: (Intercept)
Df test stat approx F num Df den Df Pr(>F)
Pillai 1 0.97802 444.8761 1 10 1.2754e-09 ***
Wilks 1 0.02198 444.8761 1 10 1.2754e-09 ***
Hotelling-Lawley 1 44.48761 444.8761 1 10 1.2754e-09 ***
Roy 1 44.48761 444.8761 1 10 1.2754e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
------------------------------------------
Term: Treatment
Response transformation matrix:
(Intercept)
Day1 1
Day2 1
Day3 1
Day4 1
Day5 1
Sum of squares and products for the hypothesis:
(Intercept)
(Intercept) 4720.333
Multivariate Tests: Treatment
Df test stat approx F num Df den Df Pr(>F)
Pillai 1 0.6684447 20.16088 1 10 0.0011608 **
Wilks 1 0.3315553 20.16088 1 10 0.0011608 **
Hotelling-Lawley 1 2.0160877 20.16088 1 10 0.0011608 **
Roy 1 2.0160877 20.16088 1 10 0.0011608 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
------------------------------------------
Term: times
Response transformation matrix:
times1 times2 times3 times4
Day1 1 0 0 0
Day2 0 1 0 0
Day3 0 0 1 0
Day4 0 0 0 1
Day5 -1 -1 -1 -1
Sum of squares and products for the hypothesis:
times1 times2 times3 times4
times1 280.33333 193.33333 -67.666667 38.666667
times2 193.33333 133.33333 -46.666667 26.666667
times3 -67.66667 -46.66667 16.333333 -9.333333
times4 38.66667 26.66667 -9.333333 5.333333
Multivariate Tests: times
Df test stat approx F num Df den Df Pr(>F)
Pillai 1 0.6827637 3.766392 4 7 0.060953 .
Wilks 1 0.3172363 3.766392 4 7 0.060953 .
Hotelling-Lawley 1 2.1522241 3.766392 4 7 0.060953 .
Roy 1 2.1522241 3.766392 4 7 0.060953 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
------------------------------------------
Term: Treatment:times
Response transformation matrix:
times1 times2 times3 times4
Day1 1 0 0 0
Day2 0 1 0 0
Day3 0 0 1 0
Day4 0 0 0 1
Day5 -1 -1 -1 -1
Sum of squares and products for the hypothesis:
times1 times2 times3 times4
times1 27 51.00000 81 -6.000000
times2 51 96.33333 153 -11.333333
times3 81 153.00000 243 -18.000000
times4 -6 -11.33333 -18 1.333333
Multivariate Tests: Treatment:times
Df test stat approx F num Df den Df Pr(>F)
Pillai 1 0.6538624 3.305793 4 7 0.080236 .
Wilks 1 0.3461376 3.305793 4 7 0.080236 .
Hotelling-Lawley 1 1.8890244 3.305793 4 7 0.080236 .
Roy 1 1.8890244 3.305793 4 7 0.080236 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 20832.1 1 468.27 10 444.8761 1.275e-09 ***
Treatment 944.1 1 468.27 10 20.1609 0.0011608 **
times 294.3 4 411.07 40 7.1586 0.0001909 ***
Treatment:times 194.3 4 411.07 40 4.7259 0.0032260 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mauchly Tests for Sphericity
Test statistic p-value
times 0.69814 0.96424
Treatment:times 0.69814 0.96424
Greenhouse-Geisser and Huynh-Feldt Corrections
for Departure from Sphericity
GG eps Pr(>F[GG])
times 0.86937 0.0004321 ***
Treatment:times 0.86937 0.0052136 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
HF eps Pr(>F[HF])
times 1.389507 0.0001909378
Treatment:times 1.389507 0.0032260394
Finally, go down to near the bottom and check the sphericity tests. Sphericity is by no means rejected (P-value 0.964), so look at the “Univariate Type II Repeated-Measures ANOVA Assuming Sphericity” just above that. The interaction is significant, with a P-value of 0.0032, so that is our finding. In the context of the data, that means that there is an effect of treatment on the number of successful runs, but that effect depends on which day you are looking at. (This is as far as we can go for now.)
- Draw a suitable interaction plot (for treatment and time). How does this clarify your conclusions of the previous part? (Hint: you’ll need longer data. If you want to be consistent with me, use `Day` for the names of the days, and `runs` for the numbers of runs.)
The hint says that the first thing we need to do is to make “longer” data, with each observation on one row:1
maze %>%
  pivot_longer(starts_with("Day"), names_to = "Day", values_to = "runs") -> maze_long
maze_long
Then work out the mean number of runs for each combination of `Treatment` and `Day`:
maze_long %>%
  group_by(Treatment, Day) %>%
  summarize(mean_runs = mean(runs)) -> means
`summarise()` has grouped output by 'Treatment'. You can override using the
`.groups` argument.
means
And then make the interaction plot, bearing in mind that you need a `colour` and a `group` and they need to be the same:
ggplot(means, aes(x = Day, y = mean_runs, colour = Treatment, group = Treatment)) +
geom_point() + geom_line()
(You don’t need to display the intervening dataframes, though you probably should for yourself while you’re working on this.)
When the students were told that their rats were “bright”, the number of successful runs on average increased over time. But for the students told that their rats were “dull”, the mean number of successful runs increased up to day 3, peaked there, and decreased afterwards. You might also say that the traces over time are not parallel, and talk about how they are not.
This difference in pattern over time was the reason for the significant interaction.
Extra: I also drew a spaghetti plot, which uses the same long data that we just made, but has a `colour` and `group` that are different. The lines on this one need to join observations that come from the same student, so that’s what goes in `group`:
ggplot(maze_long, aes(x = Day, y = runs, colour = Treatment, group = Student)) +
geom_point() + geom_line()
The pattern is not so clear here (which is why I had you draw the interaction plot instead), but if you look carefully, you can see that several of the students with the supposedly “dull” rats had the number of successful runs peak at day 3, while the students with “bright” rats saw the number of runs on a more or less upward trajectory. There is some variability here, but the different patterns over time were consistent enough to make the interaction strongly significant.
The red traces are mostly above the blue ones (except perhaps at day 3), which is probably the reason there is a significant treatment main effect, but we would do better to look at simple effects over time to further understand the significant interaction (the treatment effect will probably not be consistent over time, as a consequence of the significant interaction).
- For each time point (day), run a simple effects analysis: that is, using an ordinary `aov`, test whether the number of successful runs depends on whether the rat was labelled as “dull” or “bright”.
I think I airily talked about how you can use simple effects to understand an interaction in this kind of analysis, but never actually showed you an example. There are two ways you might go:
- fix time and look at the effect of treatment at each time
- fix treatment and look at the effect of time within that treatment
In this example, and perhaps more generally, the first option makes more sense to me: we care more about the treatment effect, while the time is a sort of “block” within which to observe the treatment effect. The first option is also easier to do, because by focusing on only one time point at a time, there is no time dependence left and each student gives you only one observation at that time (the number of runs their rats made at that time). Hence, the analysis at each time point is only an ordinary `aov`.2
Last year, I had this as an Extra on an assignment (where students had to do the MANOVA-based approach above). In my solutions then, I used the longer dataframe that you made above to draw the interaction plot with, but thinking about it now, it actually goes more easily with the original wider data: pick out the day of interest, see whether the number of mazes completed that day depends on treatment, and then repeat for the other days.
Here’s day 1:
d.1 <- aov(Day1 ~ Treatment, data = maze)
summary(d.1)
Df Sum Sq Mean Sq F value Pr(>F)
Treatment 1 208.3 208.33 11.81 0.00636 **
Residuals 10 176.3 17.63
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is a treatment effect (P-value 0.006) on Day 1, which we know (from the interaction plot) is because the supposedly “bright” rats performed better than the “dull” ones. This is all we need to do for Day 1, because there were only two treatments, and we now know that they were significantly different in terms of mazes run (and we know from our graph which way around the difference is). If there had been three treatments, say a third one in which the students were told nothing about their rats, then we would run `TukeyHSD` to find out which treatments differed from which on Day 1. But there is no need for that here.
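For the record, the `TukeyHSD` follow-up is only one line. A sketch: with just the two treatments here it adds nothing (you get the single bright-vs-dull comparison you already have), but the same call would handle a hypothetical third treatment, giving all the pairwise comparisons:

```r
# follow up the Day 1 aov with Tukey's honestly significant differences;
# with two treatment levels this gives one comparison, with three it
# would give all three pairwise comparisons
d.1 <- aov(Day1 ~ Treatment, data = maze)
TukeyHSD(d.1)
```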
We would expect something similar to happen on Day 2. Copy, paste, and edit:
d.2 <- aov(Day2 ~ Treatment, data = maze)
summary(d.2)
Df Sum Sq Mean Sq F value Pr(>F)
Treatment 1 96.33 96.33 7.565 0.0205 *
Residuals 10 127.33 12.73
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Still significant (P-value 0.020), though not as strongly so. The “bright” rats (so-called) again did better than the “dull” ones.
Day 3 is where the significance of the treatment effect might go away, since this is where the “dull” rats peaked. Paste again (it is probably still on your clipboard):
d.3 <- aov(Day3 ~ Treatment, data = maze)
summary(d.3)
Df Sum Sq Mean Sq F value Pr(>F)
Treatment 1 16.33 16.33 0.691 0.425
Residuals 10 236.33 23.63
The P-value is 0.425, which is nowhere near significance, so there is no treatment effect on Day 3.
After that, you would expect “bright” to be better again, since the “dull” rats peaked and the “bright” ones continued to improve after Day 3:
d.4 <- aov(Day4 ~ Treatment, data = maze)
summary(d.4)
Df Sum Sq Mean Sq F value Pr(>F)
Treatment 1 432 432.0 23.61 0.000663 ***
Residuals 10 183 18.3
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
d.5 <- aov(Day5 ~ Treatment, data = maze)
summary(d.5)
Df Sum Sq Mean Sq F value Pr(>F)
Treatment 1 385.3 385.3 24.65 0.000566 ***
Residuals 10 156.3 15.6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
And so it proves, with P-values 0.00066 and 0.00057. Thus, there is a significant treatment effect on all the days except Day 3. This (small) inconsistency of pattern is what drives the significant interaction; if the interaction had not been significant, there would have been a consistent treatment effect over all five days.3
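As an aside, the five copy-paste-edit `aov` fits could be run in one go. A sketch using `map` from `purrr` (loaded with the `tidyverse`), looping over the day columns; the names `days` and `fits` are mine:

```r
library(purrr)
# build the five column names, then fit the same simple-effects aov
# for each day; reformulate() turns "Day1" into the formula Day1 ~ Treatment
days <- paste0("Day", 1:5)
fits <- map(days, \(d) summary(aov(reformulate("Treatment", response = d),
                                   data = maze)))
names(fits) <- days
fits
```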
- Do a mixed model analysis (hint: use the long data that you made for your interaction plot.)
Let’s remind ourselves of how the longer data looked:
maze_long
Make sure you have `lme4` installed and loaded.

The response variable is `runs`. That depends on `Treatment` and time (`Day`) and their interaction. In addition, you see that the first five measurements in the dataframe are from the rats of the same student (student 1), and because they are trained by the same student, we expect the numbers of runs from those rats to be correlated. This is incorporated in the analysis by adding a “random effect” for `Student`. In the jargon of mixed models, this is more precisely called a “random intercept”: it says that the mean number of `runs` is moved up or down for the particular student that trained those rats, the same for each day:

maze_long.1 <- lmer(runs ~ Treatment * Day + (1|Student), data = maze_long)
Then we run `drop1` to see if we can get rid of any of the fixed (non-random) effects. The `test` is `Chisq` because this model is based on maximum likelihood, not least squares:
drop1(maze_long.1, test = "Chisq")
The interaction between `Treatment` and `Day` is significant, so that is our finding: this is where we stop.
Extras:
- The `1` in the code `1|Student` is the R notation for “intercept”, so the code `1|Student` means “a different intercept for each `Student`”.
- Random effects, such as the random intercepts for students here, never get tested. We are allowing for the possibility that the students are different from each other, which is something we can estimate because we have repeated observations on the same students. This is not of immediate interest to us, because the students are the ones the original researchers could get: in principle, a random sample of all possible students, and so we don’t care especially much about these students. You might be reminded of “blocks” in a randomized blocks analysis: we don’t care about the blocks themselves, but we expect them to be different from each other and thus to affect the outcome in addition to whatever treatments we are interested in.
- You could also have a random effect of `Student` that increases over time, a so-called “random slope”. To do that, you pull out the number from `Day` and store it in, say, `Day_number`, and then write your random effect as `Day_number|Student`, which will introduce a random effect for each student that is a linear function of time.
- Having repeated observations on the same subjects is new. In C32, we didn’t have that, because each subject only gave us one observation. For example, in a two-sample experiment, the subjects are divided at random and each gets only one of treatment or control. The exception is a matched-pairs experiment, where each subject has a “before” and an “after” (or something like that), and we rig up one observation for each subject by taking the difference between before and after.
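The random-slope idea can be sketched like this, using `parse_number` from `readr` (part of the `tidyverse`) to pull the day number out of `Day`; the names `Day_number`, `maze_long2`, and `maze_long2.1` are mine, and with only five observations per student a model like this may well have trouble converging:

```r
# turn Day1, ..., Day5 into the numbers 1 through 5
maze_long %>%
  mutate(Day_number = parse_number(Day)) -> maze_long2
# (Day_number | Student) gives each student their own intercept
# and their own linear trend over time
maze_long2.1 <- lmer(runs ~ Treatment * Day + (Day_number | Student),
                     data = maze_long2)
```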
- Compare your findings from the mixed model to the ones from the `Manova` analysis you did earlier.
The findings are exactly the same, albeit with a smaller P-value: there is a significant interaction between `Treatment` and `Day`, so the effect of treatment is different for the different days.
If you want to follow this up by doing simple effects of treatment for each day, the analysis is exactly the same as the one you did before, because when you look at each day separately, there are no repeated measures.4
Footnotes
The definition of “tidy data” gets a bit blurry for repeated-measures data. Is “one observation” one value of a number of mazes run, with another column saying which student and day it was (longer), or is it the observations for all five days for the same student (wider)? You could make a case for either answer.↩︎
The second option is still repeated measures because you are still looking at all five time points.↩︎
The significant treatment and time effects that showed up in part (b), which we are not really supposed to interpret, say that on average there is a treatment effect over all days (that is, when you average up over all five days, the “bright” rats do better), and there is an effect over time on average regardless of treatment (probably driven by rats on both treatments improving over the first three days, ignoring what happened after that).↩︎

If you wanted to condition on treatment and look for a time effect for each treatment, that would still be repeated measures, and to be consistent here, you would use a mixed model for that as well.↩︎