Worksheet 7

Published

February 14, 2025

Packages

library(tidyverse)
library(marginaleffects)
library(car)

Hair colour and pain tolerance

In lecture, we sped through a review example of one-way ANOVA, in which we investigated the effect of hair colour on pain tolerance. The data are in http://ritsokiguess.site/datafiles/hairpain.txt, with two columns: hair (hair colour) and pain (pain tolerance, with a higher value indicating that the individual can withstand more pain). The hair colours were brown and blond, each subdivided into light and dark. The data values are separated by single spaces.

In this problem, we take a different approach to the analysis of the same data. Specifically, we focus on three comparisons of pain tolerance:

light brown hair vs. dark brown hair
light blond hair vs. dark blond hair
the average of brown hair vs. the average of blond hair

These are comparisons we have chosen to make before looking at the data (because, we suppose, they are particular research questions of interest to us).

Read in and display (some of) the data.

For the analysis we are about to do, the categorical variable needs to be a factor. Create and save a dataframe for which this is the case.

Find out how many observations you have for each hair type. (This has the side-effect of telling you which order the hair types are in, as far as R is concerned.)

Set up contrasts to represent your comparisons of interest.

Verify that your contrasts are orthogonal.
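
As a rough illustration of the previous two steps, here is a minimal sketch of how contrasts can be written down and checked. It assumes, hypothetically, that the four hair types come out in the order dark blond, dark brown, light blond, light brown; use whatever order your own count shows. The names c_brown, c_blond and c_colour are made up for this sketch.

```r
# Assumed level order (hypothetical; replace with the order R shows you):
# dark blond, dark brown, light blond, light brown
c_brown  <- c(0, -1, 0, 1)              # light brown vs. dark brown
c_blond  <- c(-1, 0, 1, 0)              # light blond vs. dark blond
c_colour <- c(-1/2, 1/2, -1/2, 1/2)     # average of brown vs. average of blond

# Two contrasts are orthogonal if their elementwise products sum to zero
sum(c_brown * c_blond)
sum(c_brown * c_colour)
sum(c_blond * c_colour)
```

Each contrast also has coefficients that add up to zero, which is what makes it a comparison rather than an average.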

Set up your contrasts as a matrix, and set this matrix as the contrasts of your categorical variable. Then run the ANOVA as a regression.
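
The mechanics of this step can be sketched on made-up data. Everything below (the data frame toy, its columns group and y, and the values) is invented for illustration and is not the hair colour data; only the pattern of cbind, contrasts and lm is the point.

```r
# Made-up toy data, NOT the hair colour data: 4 groups, 3 values each
toy <- data.frame(
  group = factor(rep(c("a", "b", "c", "d"), each = 3)),
  y = c(5, 6, 7, 3, 4, 5, 8, 9, 10, 4, 5, 6)
)

# Three contrasts, one per column, in the order of levels(toy$group)
c1 <- c(0, -1, 0, 1)
c2 <- c(-1, 0, 1, 0)
c3 <- c(-1/2, 1/2, -1/2, 1/2)
m <- cbind(c1, c2, c3)

# Attach the contrast matrix to the factor, then fit the ANOVA as a regression
contrasts(toy$group) <- m
fit <- lm(y ~ group, data = toy)
summary(fit)
```

In the summary, each coefficient is named after the factor followed by the contrast's column name, so each row of the coefficient table tests one of the chosen comparisons.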

Interpret your results.

Cats

A total of 144 cats, 47 female and 97 male, were used in an experiment with the muscle-relaxing drug digitalis. For each cat, its Sex was recorded, along with the cat’s body weight (Bwt) in kilograms and the weight of its heart (Hwt) in grams. We are interested in the relationship between the weight of a cat’s heart (response) and its body weight (explanatory), and whether that relationship is different for male and female cats. The data are in the file http://ritsokiguess.site/datafiles/cats.csv.

Read in and display (some of) the data.

Make a suitable graph of the three variables. Add appropriate regression lines to your graph.

Fit a suitable analysis of covariance model, and display its output.

Is the interaction term significant? What does your answer mean in the context of the data?

For male and female cats of body weights 2.5 and 3.5 kg (all four combinations), obtain predicted heart weights.

(2 points) Using your predictions, verify that the slopes for males and females are different.
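
One way to approach the last two parts is sketched below using base R only. It assumes that the cats dataset shipped in the MASS package contains the same cats as cats.csv (it has the same column names, but treat this as an assumption and use the file you read in). The names cats.1, new, slope_f and slope_m are made up for this sketch; predictions from the marginaleffects package would be an alternative to predict.

```r
# Assumption: MASS::cats matches the cats.csv file used in the worksheet
data(cats, package = "MASS")

# Model with interaction: separate slope and intercept for each sex
cats.1 <- lm(Hwt ~ Bwt * Sex, data = cats)

# All four combinations of body weight and sex
new <- expand.grid(Bwt = c(2.5, 3.5), Sex = c("F", "M"))
new$pred <- as.numeric(predict(cats.1, newdata = new))
new

# The two body weights are 1 kg apart, so within each sex the difference
# in predicted heart weight is that sex's slope
slope_f <- new$pred[new$Sex == "F" & new$Bwt == 3.5] -
  new$pred[new$Sex == "F" & new$Bwt == 2.5]
slope_m <- new$pred[new$Sex == "M" & new$Bwt == 3.5] -
  new$pred[new$Sex == "M" & new$Bwt == 2.5]
c(female = slope_f, male = slope_m)
```

If the two differences are not equal, the fitted lines are not parallel, which is what an interaction term allows.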

Neurocognition in individuals with schizophrenia

A study was carried out to evaluate patterns and levels of performance on neurocognitive measures among individuals with schizophrenia and schizoaffective disorder using a well-validated, comprehensive neurocognitive battery specifically designed for individuals with psychosis.

The main interest was in determining how well these measures distinguished among all groups and whether there were variables that distinguished between the schizophrenia and schizoaffective groups. Age and sex were also measured for each individual, but we ignore those in our analysis.

Variables of interest, all quantitative except for the first one:

Dx: diagnostic group, categorical with levels Schizophrenia, Schizoaffective, Control
Speed: speed of processing score
Attention: attention/vigilance score
Memory: working memory score
Verbal: verbal learning score
Visual: visual learning score
ProbSolv: reasoning/problem solving score
SocialCog: social cognition score

The clinical sample comprised 116 male and female patients who had a diagnosis of schizophrenia (\(n = 70\)) or schizoaffective disorder (\(n = 46\)) confirmed by a standard test.

Non-psychiatric control participants (\(n = 146\)) were screened for medical and psychiatric illness and history of substance abuse. Patients were recruited from three outpatient clinics in Hamilton, Ontario, Canada. Control participants were recruited through local newspaper and online classified advertisements for paid research participation.

The data are in http://ritsokiguess.site/datafiles/NeuroCog.csv.

Read in and display (some of) the data.

Why is this dataset suitable for a multivariate ANOVA analysis? Explain briefly.

Create a suitable response variable for a MANOVA. Show the first few rows of your response variable. NOTE: if you display it all, all 242 rows will be displayed, and if the grader has to scroll through that, they may run out of time to mark the rest of your assignment. Be careful.

Run a suitable MANOVA using the manova command, displaying the output.
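
The pattern for the previous two steps can be sketched on R's built-in iris data, which stands in here for the neurocognition data: four quantitative measurements play the role of the seven scores, and Species plays the role of Dx. The names response and iris.1 are made up for this sketch.

```r
# Illustration on the built-in iris data, NOT the NeuroCog data
# Glue the quantitative columns together into one matrix-valued response
response <- with(iris, cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))

# Look at the first few rows only, not the whole thing
head(response)

# MANOVA of the combined response against the grouping variable
iris.1 <- manova(response ~ Species, data = iris)
summary(iris.1)
```

By default, summary on a manova fit reports Pillai's trace; the P-value in the row for the grouping variable tests whether the groups differ on the responses taken together.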

Run a suitable MANOVA using Manova from the car package, displaying the results.
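
A sketch of the same idea with the car package, again on the built-in iris data rather than the neurocognition data. Manova from car takes a multivariate linear model fitted with lm, rather than a formula; the names response, iris.2 and iris.3 are made up for this sketch.

```r
library(car)

# Illustration on the built-in iris data, NOT the NeuroCog data
response <- with(iris, cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))

# Fit a multivariate regression first, then hand it to Manova
iris.2 <- lm(response ~ Species, data = iris)
iris.3 <- Manova(iris.2)
summary(iris.3)
```

The summary here is longer than the one from manova because it shows all four multivariate test statistics, but the Pillai line should agree with the earlier analysis.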

What are you able to conclude from your analyses? (The conclusion should be the same for both of them.)

Carry out Box’s M test. What do you conclude from it?

Make boxplots of all seven response variables against diagnosis (Dx), on one ggplot. The best answer will display the graphs so that they are easy to read. Hint: the idea is the same as plotting residuals against all of the explanatory variables in a regression.
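
The hint can be sketched on the built-in iris data (standing in for the neurocognition data): pivot the quantitative columns longer so that there is one column of values and one column saying which variable each value came from, then facet by variable. The names long, measurement, value and g are made up for this sketch.

```r
library(tidyverse)

# Illustration on the built-in iris data, NOT the NeuroCog data
# One row per (observation, variable) pair
long <- iris %>%
  pivot_longer(-Species, names_to = "measurement", values_to = "value")

# One boxplot panel per variable, each with its own scales
g <- ggplot(long, aes(x = Species, y = value)) +
  geom_boxplot() +
  facet_wrap(~ measurement, scales = "free")
g
```

Using scales = "free" lets each panel use the range of its own variable, which tends to make the boxes easier to read when the variables are on different scales.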

Looking at your boxplots, why do you think your MANOVA came out significant, and what do your boxplots tell you about the relative test scores for patients with diagnoses of schizophrenia or schizoaffective disorder?