If we have samples from several independent populations, we might want to test whether the population medians are equal. We may not be able to assume anything about the populations’ variances, nor whether they are normally distributed, but we do assume that the populations have distributions that are approximately the same shape. The Kruskal-Wallis Test will allow us to test the medians for equality. It is similar to a One-Way ANOVA but using medians instead of means. How do we perform a Kruskal-Wallis Test?
- How to do a one-way analysis of variance (ANOVA)
- How to use Bonferroni’s Correction method
- How to do a Wilcoxon rank-sum test
For the purposes of this example, let’s say we have a sample of GPAs from matriculated students at three Ivy League institutions: Harvard, Dartmouth, and Columbia. This is example data, and you can replace it with your actual data when you re-use this code.
The function call we use below takes the numeric sample values and their category labels as two separate vectors of equal length. We could structure our data as follows.
    gpas <- c( 3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
               3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
               4.00, 3.75, 3.34 )
    schools <- c( "Harvard", "Harvard", "Harvard", "Harvard", "Harvard", "Harvard",
                  "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth",
                  "Columbia", "Columbia", "Columbia" )
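Before running the test, it can help to look at the sample median and sample size for each school. The following is a minimal sketch using base R's `tapply`; the variable names `group_medians` and `group_sizes` are our own choices for illustration, not part of the original solution.

```r
# Example data from the article, repeated so this block runs on its own.
gpas <- c(3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
          3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
          4.00, 3.75, 3.34)
schools <- c(rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3))

# Sample median and sample size for each school.
# (group_medians and group_sizes are illustrative names.)
group_medians <- tapply(gpas, schools, median)
group_sizes   <- tapply(gpas, schools, length)

print(group_medians)  # Columbia 3.75, Dartmouth 3.91, Harvard 3.62
print(group_sizes)    # Columbia 3, Dartmouth 6, Harvard 6
```

The sample medians differ, but with samples this small we need a formal test to judge whether the difference is more than chance variation.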
The Kruskal-Wallis Test uses a null hypothesis that the category medians are equal, $H_0: m_C = m_H = m_D$. The alternative hypothesis is that at least one of the medians differs from the others. We choose $\alpha$, the Type I error rate, as 0.05 and run the test as shown below.
    kruskal.test(gpas, schools)
    Kruskal-Wallis rank sum test

    data:  gpas and schools
    Kruskal-Wallis chi-squared = 3.706, df = 2, p-value = 0.1568
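To see where the reported statistic comes from, here is a rough sketch that recomputes it by hand from the ranks of the pooled data, using the standard Kruskal-Wallis formula with the usual correction for tied values. It is meant only as an illustration of the calculation under those assumptions; all variable names are our own, and in practice you would simply call `kruskal.test`.

```r
# Example data from the article, repeated so this block runs on its own.
gpas <- c(3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
          3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
          4.00, 3.75, 3.34)
schools <- c(rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3))

n <- length(gpas)
ranks <- rank(gpas)  # pooled ranks; tied values receive their average rank

# Rank sum and sample size within each school.
rank_sums   <- tapply(ranks, schools, sum)
group_sizes <- tapply(ranks, schools, length)

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1).
h_raw <- 12 / (n * (n + 1)) * sum(rank_sums^2 / group_sizes) - 3 * (n + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N), where t counts each repeated value.
tie_counts     <- table(gpas)
tie_correction <- 1 - sum(tie_counts^3 - tie_counts) / (n^3 - n)

h       <- h_raw / tie_correction
df      <- length(rank_sums) - 1
p_value <- pchisq(h, df = df, lower.tail = FALSE)

print(round(c(statistic = h, df = df, p_value = p_value), 4))
```

The degrees of freedom equal the number of groups minus one, and the p-value is the upper tail of a chi-squared distribution at the computed statistic, which is why the output labels the statistic "chi-squared".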
The p-value, 0.1568, is greater than $\alpha$, so we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the median GPAs of matriculated students at these three schools are different from each other.
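If you want to make this decision in code rather than by reading the printed output, one possible approach is to store the test result and compare its `p.value` component to $\alpha$. The names `alpha`, `result`, and `reject_null` below are illustrative choices, not part of the original solution.

```r
# Example data from the article, repeated so this block runs on its own.
gpas <- c(3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
          3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
          4.00, 3.75, 3.34)
schools <- c(rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3))

alpha  <- 0.05                         # chosen Type I error rate
result <- kruskal.test(gpas, schools)  # returns a list-like "htest" object

# Reject the null hypothesis only when the p-value falls below alpha.
reject_null <- result$p.value < alpha

if (reject_null) {
  message("Reject H0: at least one median appears to differ.")
} else {
  message("Fail to reject H0: insufficient evidence that the medians differ.")
}
```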
Content last modified on 24 July 2023.
Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)