More than one response measurement for each subject, same thing at different times
Generalization of matched pairs (“matched triples”, etc.).
Expect measurements on same subject to be correlated, so assumptions of independence will fail.
Repeated measures. Profile analysis uses Manova
(set up).
Another approach uses mixed models (random effects).
Variation: each subject does all treatments at different times (called crossover design).
8 dogs take part in experiment.
Dogs randomized to one of 2 different drugs.
Response: log of blood concentration of histamine 0, 1, 3 and 5 minutes after taking drug. (Repeated measures.)
Data in dogs.txt
, column-aligned.
lh0 lh1 lh3 lh5
[1,] -3.22 -1.61 -2.30 -2.53
[2,] -3.91 -2.81 -3.91 -3.91
[3,] -2.66 0.34 -0.73 -1.43
[4,] -1.77 -0.56 -1.05 -1.43
[5,] -3.51 -0.48 -1.17 -1.51
[6,] -3.51 0.05 -0.31 -0.51
[7,] -2.66 -0.19 0.07 -0.22
[8,] -2.41 1.14 0.72 0.21
lh0 lh1 lh3 lh5
[1,] -3.22 -1.61 -2.30 -2.53
[2,] -3.91 -2.81 -3.91 -3.91
[3,] -2.66 0.34 -0.73 -1.43
[4,] -1.77 -0.56 -1.05 -1.43
[5,] -3.51 -0.48 -1.17 -1.51
[6,] -3.51 0.05 -0.31 -0.51
[7,] -2.66 -0.19 0.07 -0.22
[8,] -2.41 1.14 0.72 0.21
Get list of response variable names; we call them times
. Save in data frame.
and pull out what you need to answer the question.
three sets of tests, for
two types of test for each of these:
univariate is more powerful if it applies; if it doesn’t, can make adjustments to it
Test statistic p-value
times 0.12334 0.084567
drug:times 0.12334 0.084567
Sphericity is not rejected; proceed to univariate tests.
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 71.342 1 22.1026 6 19.3664 0.004565 **
drug 11.520 1 22.1026 6 3.1272 0.127406
times 26.160 3 2.2534 18 69.6546 4.215e-10 ***
drug:times 5.111 3 2.2534 18 13.6095 7.050e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drug
and time: the pattern of log-histamine over time is different for the different drugs.then we would use the H-F adjusted P-values:
GG eps Pr(>F[GG]) HF eps Pr(>F[HF])
times 0.5261798 3.744618e-06 0.6822614 1.843418e-07
drug:times 0.5261798 2.348896e-03 0.6822614 7.307096e-04
attr(,"na.action")
(Intercept) drug
1 2
attr(,"class")
[1] "omit"
In this case (because sphericity was not rejected), these are very similar to the ones from the univariate tests, and the conclusion (significant interaction) was the same.
lh
-something. Call them all lh
and put them in one column, with the time they belong to labelled.pivot_longer
, try 1Not quite right: want new variable containing just number in time
: parse_number
. (Top 5 rows shown.)
I realized that pivot_longer
was going to produce something like lh1
, which I needed to do something further with, so this time I gave it a temporary name timex
(which we actually do use later).
This enabled me to use the name time
for the actual numeric time.
This works now, so next save into a new data frame dogs.long
.
This says:
Take data frame dogs, and then:
Combine the columns lh0
through lh5
into one column called lh
, with the column that each lh
value originally came from labelled by timex
, and then:
Pull out numeric values in timex
, saving in time
and then:
save the result in a data frame dogs.long
.
Plot mean lh
value at each time, joining points on same drug by lines.
drugs same at time 0
after that, Trimethaphan higher than Morphine.
Effect of drug not consistent over time: significant interaction.
Lines on interaction plot would then be parallel, and so interaction should no longer be significant.
Go back to original “wide” dogs
data frame.
Test statistic p-value
times 0.57597 0.25176
drug:times 0.57597 0.25176
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 24.2607 1 20.1874 6 7.2106 0.03628 *
drug 16.2197 1 20.1874 6 4.8207 0.07053 .
times 3.3250 2 0.7301 12 27.3251 3.406e-05 ***
drug:times 0.3764 2 0.7301 12 3.0929 0.08254 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Plot actual data: lh
against days
, labelling observations by drug: “spaghetti plot”.
Uses long data frame:
time
, lh
) points coloured by druggroup = dog
, but colour = drug
:For each dog over time, gradual decrease in log-histamine from time 1: significant time effect after we took out time 0.
Pattern about same for each dog, regardless of drug, hence non-significant interaction.
Most trimethaphan dogs (blue) have higher log-histamine throughout (time 1 and after), some morphine dogs (red) have lower.
But two morphine dogs have log-histamine profiles like trimethaphan dogs. This ambiguity probably why drug
effect not quite significant.
lme4
function lmer
(like lm
in some ways)lme4
)drop1
:Let’s pretend we are working at \(\alpha = 0.01\):
dogs.long %>% filter(timex != "lh0") -> dogs.long.no0
dogs.4 <- lmer(lh ~ drug * timex + (1|dog), data=dogs.long.no0)
drop1(dogs.4, test = "Chisq")
Interaction is not quite significant at \(\alpha = 0.01\). So we could remove it.
30 people took part in an exercise study.
Each subject randomly assigned to one of two diets (“low fat” or “non-low fat”) and to one of three exercise programs (“at rest”, “walking”, “running”).
\(2\times3 = 6\) experimental treatments, and thus each one replicated \(30/6=5\) times. (Two-way ANOVA, so far?)
However, each subject had pulse rate measured at three different times (1, 15 and 30 minutes after starting their exercise), so have repeated measures.
Separated by tabs:
“Long format”, usually what we want.
But for repeated measures analysis, we want wide format!
Keep track of which is which:
Manova
analysis: widerlmer
analysis: longer.pivot_wider
.
pivot_wider
needs: a column that is going to be split, and the column to make the values out of:min01, min15, min30
:diet
, exertype
, interaction using lm
:Manova
: Test statistic p-value
times 0.92416 0.40372
diet:times 0.92416 0.40372
exertype:times 0.92416 0.40372
diet:exertype:times 0.92416 0.40372
No problem with sphericity; go to univariate tests.
Sum Sq num Df Error SS den Df F value
(Intercept) 894608 1 2085.2 24 10296.6595
diet 1262 1 2085.2 24 14.5238
exertype 8326 2 2085.2 24 47.9152
diet:exertype 816 2 2085.2 24 4.6945
times 2067 2 1563.6 48 31.7206
diet:times 193 2 1563.6 48 2.9597
exertype:times 2723 4 1563.6 48 20.9005
diet:exertype:times 614 4 1563.6 48 4.7095
Pr(>F)
(Intercept) < 2.2e-16 ***
diet 0.0008483 ***
exertype 4.166e-09 ***
diet:exertype 0.0190230 *
times 1.662e-09 ***
diet:times 0.0613651 .
exertype:times 4.992e-10 ***
diet:exertype:times 0.0027501 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Three-way interactions are difficult to understand. To make an attempt, look at some graphs.
Plot time trace of pulse rates for each individual, joined by lines, and make separate plots for each diet-exertype
combo.
facet_grid(diet~exertype)
: do a separate plot for each combination of diet and exercise type, with diets going down the page and exercise types going across. (Graphs are usually landscape, so have the factor exertype
with more levels going across.)
ggplot
again. Using long data frame:At rest: no change in pulse rate over time
Walking: not much change in pulse rates over time.
Running: overall increase in pulse rate over time, but increase stronger for lowfat
group.
No consistent effect of:
Looks as if there is only any substantial time effect for the runners. For them, does diet have an effect?
Pull out only the runners from the wide data:
Test statistic p-value
times 0.81647 0.4918
diet:times 0.81647 0.4918
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 383522 1 339.2 8 9045.3333 1.668e-13 ***
diet 1920 1 339.2 8 45.2830 0.0001482 ***
times 4714 2 1242.0 16 30.3644 3.575e-06 ***
diet:times 789 2 1242.0 16 5.0795 0.0195874 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summarize
is data frame, so can save it (and do more with it if needed).
Comments