How to do a one-way analysis of variance (ANOVA) (in Julia)

Task

If we have multiple independent samples of the same quantity (such as students’ SAT scores from several different schools), we may want to test whether the populations they come from all have the same mean. Analysis of Variance (ANOVA) can determine whether at least one of the group means differs significantly from the others, although it does not tell us which one. How can we do an ANOVA?

Solution

Let’s assume we have our samples in several different Julia arrays. Here I’ll construct some made-up data about SAT scores at four different schools.

school1_SATs = [ 1100, 1250, 1390, 970, 1510 ];
school2_SATs = [ 1010, 1050, 1090, 1110 ];
school3_SATs = [ 900, 1550, 1300, 1270, 1210 ];
school4_SATs = [ 900, 850, 1110, 1070, 910, 920 ];
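Before running the test, it can help to look at what is being compared. The following is a minimal sketch, not part of the original solution, that uses only Julia's `Statistics` standard library to print each group's size and sample mean. The names `samples`, `group_sizes`, and `group_means` are introduced here purely for illustration.

```julia
using Statistics  # standard library; provides mean

# Same made-up data as above, repeated so this block runs on its own
school1_SATs = [ 1100, 1250, 1390, 970, 1510 ]
school2_SATs = [ 1010, 1050, 1090, 1110 ]
school3_SATs = [ 900, 1550, 1300, 1270, 1210 ]
school4_SATs = [ 900, 850, 1110, 1070, 910, 920 ]

# Illustrative names (not from the original solution)
samples = [ school1_SATs, school2_SATs, school3_SATs, school4_SATs ]
group_sizes = length.( samples )   # number of observations in each group
group_means = mean.( samples )     # sample mean of each group

for i in eachindex( samples )
    println( "school ", i, ": n = ", group_sizes[i], ", mean = ", group_means[i] )
end
```

The group means come out as 1244, 1065, 1246, and 960. ANOVA asks whether differences of this size are larger than we would expect from the variation within each school.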

ANOVA tests the null hypothesis $H_0$ that all group means are equal. We choose $\alpha$, the probability of a Type I error (a false positive: rejecting $H_0$ when it is actually true). I will use $\alpha=0.05$ in this example.

using HypothesisTests
alpha = 0.05                   # significance level
# pvalue() extracts the p-value from the test object
p_value = pvalue( OneWayANOVATest( school1_SATs, school2_SATs, school3_SATs, school4_SATs ) )
reject_H0 = p_value < alpha    # true means we reject the null hypothesis
alpha, p_value, reject_H0

(0.05, 0.03405326535040251, true)

Because the p-value is less than $\alpha$, we reject $H_0$ and conclude that at least one pair of group means differs by a statistically significant amount.

If you are using the most common $\alpha$ value of $0.05$, you can save a few lines of code and get a more detailed printout by just printing out the test itself:

OneWayANOVATest( school1_SATs, school2_SATs, school3_SATs, school4_SATs )

One-way analysis of variance (ANOVA) test
-----------------------------------------
Population details:
    parameter of interest:   Means
    value under h_0:         "all equal"
    point estimate:          NaN

Test summary:
    outcome with 95% confidence: reject h_0
    p-value:                     0.0341

Details:
    number of observations: [5, 4, 5, 6]
    F statistic:            3.69513
    degrees of freedom:     (3, 16)
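
To see where the F statistic and degrees of freedom come from, here is a minimal sketch that computes them by hand using only the `Statistics` standard library. It assumes the usual textbook formula $F = \frac{SS_{between}/(k-1)}{SS_{within}/(N-k)}$, where $k$ is the number of groups and $N$ is the total number of observations. It is not how `HypothesisTests` is implemented, and names such as `groups`, `ss_between`, and `F_manual` are introduced here for illustration only.

```julia
using Statistics  # standard library; provides mean

# Same made-up data as above, repeated so this block runs on its own
school1_SATs = [ 1100, 1250, 1390, 970, 1510 ]
school2_SATs = [ 1010, 1050, 1090, 1110 ]
school3_SATs = [ 900, 1550, 1300, 1270, 1210 ]
school4_SATs = [ 900, 850, 1110, 1070, 910, 920 ]
groups = [ school1_SATs, school2_SATs, school3_SATs, school4_SATs ]

k = length( groups )                     # number of groups
N = sum( length.( groups ) )             # total number of observations
grand_mean = mean( vcat( groups... ) )   # mean of all observations pooled together

# Between-group sum of squares: how far each group mean is from the grand mean,
# weighted by the group's size
ss_between = sum( length( g ) * ( mean( g ) - grand_mean )^2 for g in groups )

# Within-group sum of squares: spread of each observation around its own group mean
ss_within = sum( sum( ( g .- mean( g ) ).^2 ) for g in groups )

df_between, df_within = k - 1, N - k
F_manual = ( ss_between / df_between ) / ( ss_within / df_within )
df_between, df_within, F_manual
```

For these data the sketch gives degrees of freedom (3, 16), matching the printout, and an F statistic of about 3.689, slightly different from the 3.69513 printed above. The printed value is what the same formula yields if the grand mean is replaced by the unweighted average of the four group means, which differs from the pooled mean only when group sizes are unequal, so the gap likely reflects how the package version used here computed the overall mean. Either value leads to the same decision at $\alpha=0.05$.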

Content last modified on 24 July 2023.

Contributed by Nathan Carter (ncarter@bentley.edu)