Analysis of Covariance

ANOVA: explanatory variables are categorical (they divide the data into groups).
Traditionally, analysis of covariance has categorical \(x\)’s plus one numerical \(x\) (the “covariate”) to be adjusted for.
lm handles this case too.
Simple example: two treatments (drugs) (a and b), with before and after scores.
Does knowing before score and/or treatment help to predict after score?
Is after score different by treatment/before score?

Data: treatment (column drug), before, after

Packages

library(tidyverse)
library(broom)
library(marginaleffects)

The last of these is for predictions.

Read in data

url <- "http://ritsokiguess.site/datafiles/ancova.txt"
prepost <- read_delim(url, " ")
prepost

Making a plot

ggplot(prepost, aes(x = before, y = after, colour = drug)) +
  geom_point() + geom_smooth(method = "lm")

Comments

As before score goes up, after score goes up.
Red points (drug A) generally above blue points (drug B), for comparable before score.
Suggests before score effect and drug effect.

The means

prepost %>%
  group_by(drug) %>%
  summarize(
    before_mean = mean(before),
    after_mean = mean(after)
  )

Mean “after” score slightly higher for treatment A.
Mean “before” score much higher for treatment B.
Greater improvement on treatment A.

Testing for interaction

prepost.1 <- lm(after ~ before * drug, data = prepost)
drop1(prepost.1, test = "F")

Interaction not significant. Will remove later.
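
The interaction test can also be read as a comparison of two nested models, one with the interaction and one without. The sketch below illustrates this on simulated data (the data frame toy and the models toy.1 and toy.2 are made up for illustration and are not the prepost data). With a single interaction term, the F statistic from drop1 should match the one from anova on the two fits.

```r
# Simulated data for illustration only (NOT the prepost data from the text)
set.seed(1)
toy <- data.frame(
  before = round(runif(20, 5, 25)),
  drug = rep(c("a", "b"), each = 10)
)
toy$after <- 18 + 0.8 * toy$before - 5 * (toy$drug == "b") +
  rnorm(20, sd = 2.5)

toy.1 <- lm(after ~ before * drug, data = toy)  # with interaction
toy.2 <- lm(after ~ before + drug, data = toy)  # without interaction

# drop1 respects marginality: only the interaction is eligible to be dropped
f_drop1 <- drop1(toy.1, test = "F")
# the same test, written as an explicit comparison of nested models
f_anova <- anova(toy.2, toy.1)

f_drop1
f_anova
```

The two tables report the same F statistic and P-value for the interaction, so either form can be used to decide whether to keep it.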

Predictions

Set up the values to predict for: the median and quartiles of before, combined with each of the two drugs:

new <- datagrid(before = c(9.75, 14, 21.25), 
                drug = c("a", "b"), model = prepost.1)
new

and then

cbind(predictions(prepost.1, newdata = new)) %>% 
  select(drug, before, estimate, conf.low, conf.high)

Predictions (with interaction included), plotted

plot_predictions(model = prepost.1, 
                 condition = c("before", "drug"))

Lines almost parallel, but not quite.

Taking out interaction

prepost.2 <- update(prepost.1, . ~ . - before:drug)
drop1(prepost.2, test = "F")

Take out non-significant interaction.
before and drug strongly significant.
Do predictions again and plot them.

Predictions (without interaction)

cbind(predictions(prepost.2, newdata = new)) %>% 
  select(drug, before, estimate)

Plot of predicted values

plot_predictions(prepost.2, condition = c("before", "drug"))

This time the lines are exactly parallel. No-interaction model forces them to have the same slope.

Different look at model output

drop1(prepost.2) tests for significant effect of before score and of drug, but doesn’t help with interpretation.
summary(prepost.2) views the model as a regression with slopes:

summary(prepost.2)


Call:
lm(formula = after ~ before + drug, data = prepost)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.6348 -2.5099 -0.2038  1.8871  4.7453 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  18.3600     1.5115  12.147 8.35e-10 ***
before        0.8275     0.0955   8.665 1.21e-07 ***
drugb        -5.1547     1.2876  -4.003 0.000921 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.682 on 17 degrees of freedom
Multiple R-squared:  0.817, Adjusted R-squared:  0.7955 
F-statistic: 37.96 on 2 and 17 DF,  p-value: 5.372e-07

Understanding those slopes

tidy(prepost.2)

before ordinary numerical variable; drug categorical.
lm uses the first category, drug a, as the baseline.
Intercept is the predicted after score for before score 0 on drug A.
before slope is the predicted change in after score when before score increases by 1 (the usual slope).
Slope for drugb is change in predicted after score for being on drug B rather than drug A. Same for any before score (no interaction).
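
To make this interpretation concrete, here is a small sketch that rebuilds predictions by hand from the rounded estimates printed in the summary output. The helper predict_after and the variable names are made up for illustration. Because the coefficients are rounded, results may differ slightly from what predictions() gives.

```r
# Rounded estimates copied from the summary(prepost.2) output
b_intercept <- 18.3600
b_before    <- 0.8275
b_drugb     <- -5.1547

# Hypothetical helper (not part of any package): prediction by hand.
# (drug == "b") counts as 1 for drug B and 0 for the baseline drug A.
predict_after <- function(before, drug) {
  b_intercept + b_before * before + b_drugb * (drug == "b")
}

# Same before scores as used for the predictions earlier
before_values <- c(9.75, 14, 21.25)
pred_a <- predict_after(before_values, "a")
pred_b <- predict_after(before_values, "b")

# The gap between drugs equals the drugb coefficient at every before score
data.frame(before = before_values, pred_a, pred_b, gap = pred_b - pred_a)
```

The gap column is the same in every row, which is what the parallel lines in the plot show.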

Summary

ANCOVA model: fits different regression line for each group, predicting response from covariate.
ANCOVA model with interaction between factor and covariate allows different slopes for each line.
Sometimes those lines can cross over!
If interaction not significant, take out. Lines then parallel.
With parallel lines, groups have consistent effect regardless of value of covariate.
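
For reference, the two models can be written out as equations. The notation is an assumption made here, not taken from the output: \(d_i\) is 1 if observation \(i\) is on drug B and 0 if on drug A, and the \(\beta\)'s are the regression coefficients.

\[
\begin{aligned}
\text{no interaction:}\quad & \text{after}_i = \beta_0 + \beta_1\,\text{before}_i + \beta_2\, d_i + \varepsilon_i \\
\text{with interaction:}\quad & \text{after}_i = \beta_0 + \beta_1\,\text{before}_i + \beta_2\, d_i + \beta_3\,\text{before}_i\, d_i + \varepsilon_i
\end{aligned}
\]

In the interaction model, the line for drug A has intercept \(\beta_0\) and slope \(\beta_1\), while the line for drug B has intercept \(\beta_0 + \beta_2\) and slope \(\beta_1 + \beta_3\). Setting \(\beta_3 = 0\) gives both lines slope \(\beta_1\), so they are parallel and sit a constant \(\beta_2\) apart.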