7.4 ANOVA for a General Linear Model

The case we have examined so far was an ANOVA table based on a Linear Mixed Model. Often, you will want an ANOVA table based on a General Linear Model instead. R has a base function for an ANOVA table, aov(). Perversely, it will only give you F-ratios based on type 1 calculations. You therefore need to use the contributed package car. This package contains a function Anova() that will give you all three types. (Note: Anova() with a capital A, to distinguish it from the base R function anova().) The rest of this section shows you how to get an ANOVA table for a General Linear Model in this way.
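To see why type 1 calculations can be a problem, here is a minimal sketch using simulated, unbalanced data. The variables g, x and y are invented for illustration and have nothing to do with the studies in this chapter. With type 1 (sequential) sums of squares, each term is adjusted only for the terms that come before it in the formula, so the answer for a given predictor depends on the order in which you list the predictors.

```r
# Simulated, deliberately unbalanced data (invented for illustration only)
set.seed(1)
g <- factor(rep(c("a", "b", "c"), times = c(10, 25, 40)))
x <- rnorm(75) + as.numeric(g)            # covariate correlated with group
y <- 2 * x + as.numeric(g) + rnorm(75)

# Base R anova() gives type 1 (sequential) sums of squares
t1 <- anova(lm(y ~ g + x))                # g entered first: not adjusted for x
t2 <- anova(lm(y ~ x + g))                # g entered second: adjusted for x

t1["g", "Sum Sq"]
t2["g", "Sum Sq"]

# The residual sum of squares is the same either way; only the way the
# explained variation is divided between g and x changes
t1["Residuals", "Sum Sq"]
t2["Residuals", "Sum Sq"]
```

The two sums of squares for g differ even though both lines fit the same model. Type 2 and type 3 calculations do not depend on the order of the terms in this way.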

Another of the studies in Nettle and Saxe (2020), study 7, addressed broadly the same question as the study 1 we have already worked with. However, the manipulation of the IV was done between subjects rather than within subjects. Thus, there was one response per participant and the data can be analysed using a General Linear Model rather than a Linear Mixed Model. We will do this, and test for experimental effects using ANOVA.

Go to the repository at https://osf.io/xrqae/ and get the data file ‘study7.data.csv’. You know the deal by now; save the file into your working directory, and then:

d7 <- read_csv("study7.data.csv")

Now, run install.packages('car'), and:

library(car)

The DV in study 7 was once again the percentage of the harvest that the participant thought should be shared out between the villages (in the data, variable redistlevel). The IV (Condition) was again the importance of luck in the production of food, with three levels: High (participant is told luck is important); Low (participant is told luck is not very important); and Unspecified (participant is not told anything about the role of luck). There was only one IV in this experiment, but we are going to include a non-manipulated continuous covariate in our model too. (ANOVA can handle continuous predictor variables, even though it is more often used for experimental IVs that are categorical.) This covariate is the participant’s political orientation on the left-to-right axis (variable leftright). The predictions of the study were that how much the participant thought should be shared out would depend on: how important luck was; whether they identified as left or right wing; and (maybe) some interaction between these two predictors.

Let’s make Condition a factor with Unspecified the first (and hence reference) level; centre leftright; and fit the model.

d7$Condition <- factor(d7$Condition, levels=c("Unspecified", "Low", "High"))
d7$leftright.c <- d7$leftright - mean(d7$leftright, na.rm=TRUE)
s1 <- lm(redistlevel ~ Condition*leftright.c, data=d7)
summary(s1)$coefficients

                          Estimate Std. Error t value  Pr(>|t|)
(Intercept)                39.3936     0.9514  41.407 3.10e-263
ConditionLow                0.3010     1.3449   0.224  8.23e-01
ConditionHigh               4.0201     1.3427   2.994  2.79e-03
leftright.c                -0.0970     0.0411  -2.363  1.82e-02
ConditionLow:leftright.c    0.0269     0.0580   0.464  6.43e-01
ConditionHigh:leftright.c  -0.0172     0.0578  -0.297  7.67e-01

Now let’s get the type-2 ANOVA table.

Anova(s1, type=2)

Anova Table (Type II tests)

Response: redistlevel
                      Sum Sq   Df F value  Pr(>F)    
Condition               5997    2    5.57  0.0039 ** 
leftright.c             8499    1   15.78 7.4e-05 ***
Condition:leftright.c    318    2    0.30  0.7443    
Residuals             961317 1785                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
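It can help to see where type-2 sums of squares come from. As a sketch under stated assumptions (the data below are simulated, and the names cond, lr and y are hypothetical stand-ins rather than the study 7 variables), the type-2 sum of squares for a term is the drop in residual sum of squares when that term is added to a model containing every other term that does not involve it. The F-ratio then uses the residual mean square of the full model as its denominator.

```r
# Simulated stand-in data (not the real study 7 data)
set.seed(2)
n <- 120
cond <- factor(sample(c("Unspecified", "Low", "High"), n, replace = TRUE),
               levels = c("Unspecified", "Low", "High"))
lr <- rnorm(n)
y <- 40 + 4 * (cond == "High") - 2 * lr + rnorm(n, sd = 10)

full      <- lm(y ~ cond * lr)   # main effects plus interaction
additive  <- lm(y ~ cond + lr)   # main effects only
only_lr   <- lm(y ~ lr)
only_cond <- lm(y ~ cond)

# deviance() of an lm object is its residual sum of squares
ss_cond <- deviance(only_lr)   - deviance(additive)  # cond, adjusted for lr
ss_lr   <- deviance(only_cond) - deviance(additive)  # lr, adjusted for cond
ss_int  <- deviance(additive)  - deviance(full)      # interaction

# F-ratio and p-value for cond (2 df), using the full model's error term
mse    <- deviance(full) / df.residual(full)
f_cond <- (ss_cond / 2) / mse
p_cond <- pf(f_cond, 2, df.residual(full), lower.tail = FALSE)

c(ss_cond = ss_cond, ss_lr = ss_lr, ss_int = ss_int)
c(F = f_cond, p = p_cond)
```

If you have car loaded, Anova(full, type=2) should give the same sums of squares, which is a useful check that you understand what the table is reporting.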

So, we have a significant main effect of Condition, a significant main effect of leftright, and no significant interaction. You would need to follow this up of course by working out which levels of Condition differed from which others (from the coefficients above, it looks like High is different from Unspecified, whereas Low is not), and which direction the association between leftright and redistlevel goes (no surprises there: people who identify as more right-wing think less of the harvest should be redistributed).

Model s1 is a case where the conclusion you would draw if you conducted statistical tests based on individual parameter estimates would depend a lot on your choice of reference category for Condition, and also whether or not you centre leftright. You can verify this for yourself by switching reference categories for Condition with the relevel function, and using the centred or un-centred version of leftright.
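Here is a minimal sketch of that check, again on simulated data so that it runs on its own. The data frame sim, its columns, and the 0 to 100 scale for lr are assumptions made for illustration; to try this on the real data, use d7 with Condition, leftright and leftright.c in their place.

```r
# Simulated stand-in data (not the real study 7 data)
set.seed(3)
n <- 150
cond <- factor(sample(c("Unspecified", "Low", "High"), n, replace = TRUE),
               levels = c("Unspecified", "Low", "High"))
lr <- runif(n, 0, 100)                    # assumed un-centred 0-100 scale
y  <- 40 + 4 * (cond == "High") - 0.1 * lr + rnorm(n, sd = 10)
sim <- data.frame(y, cond, lr, lr.c = lr - mean(lr))

# Same model, three parameterisations
m_unspec <- lm(y ~ cond * lr.c, data = sim)    # reference: Unspecified
sim$cond2 <- relevel(sim$cond, ref = "High")   # switch the reference level
m_high   <- lm(y ~ cond2 * lr.c, data = sim)   # reference: High
m_raw    <- lm(y ~ cond * lr, data = sim)      # un-centred covariate

# The 'main effect' slope is really the slope within the reference level
coef(m_unspec)["lr.c"]
coef(m_high)["lr.c"]

# The High coefficient is the difference from Unspecified at lr.c = 0
# (the mean of lr) in one model, and at lr = 0 in the other
coef(m_unspec)["condHigh"]
coef(m_raw)["condHigh"]

# The fit itself is identical in all three cases
deviance(m_unspec); deviance(m_high); deviance(m_raw)
```

The individual coefficients and their t-tests move around, but the residual sum of squares does not, which is why the type-2 ANOVA table gives the same answer whichever parameterisation you choose.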