# How to do a Kruskal-Wallis test (in R)


If we have samples from several independent populations, we might want to test whether the population medians are equal. We may not be able to assume anything about the populations’ variances or whether they are normally distributed, but we do assume that the populations’ distributions have approximately the same shape. The Kruskal-Wallis test lets us test the medians for equality. It is similar to a one-way ANOVA, but it is based on ranks and compares medians instead of means. How do we perform a Kruskal-Wallis test?

## Solution

For the purposes of this example, let’s say we have a sample of GPAs from matriculated students at three Ivy League institutions: Harvard, Dartmouth, and Columbia. This is example data, and you can replace it with your actual data when you re-use this code.

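The `kruskal.test` function in R accepts the numeric sample values and their categories as two separate vectors of equal length, where each entry in the categories vector labels the value at the same position. We could structure our data as follows.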

    # GPA values, listed school by school
    gpas <- c( 3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
               3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
               4.00, 3.75, 3.34 )
    # School for each GPA, in the same order as gpas
    schools <- c(
        "Harvard", "Harvard", "Harvard", "Harvard", "Harvard", "Harvard",
        "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth", "Dartmouth",
        "Columbia", "Columbia", "Columbia" )
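
Before testing, it can help to confirm that the two vectors line up and to look at the sample medians in each group. The following is a minimal sketch using base R's `table` and `tapply`; the names `group_sizes` and `group_medians` are illustrative and not part of the original example.

```r
# Same example data as above, repeated so this block runs on its own
gpas <- c( 3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
           3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
           4.00, 3.75, 3.34 )
schools <- c( rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3) )

# The two vectors must have one entry per observation
stopifnot(length(gpas) == length(schools))

# Number of observations per school
group_sizes <- table(schools)

# Sample median GPA per school (groups are ordered alphabetically)
group_medians <- tapply(gpas, schools, median)

print(group_sizes)
print(group_medians)
```

With this example data the groups are unbalanced (three Columbia observations against six each for the other two schools), which the Kruskal-Wallis test can accommodate.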


The Kruskal-Wallis test uses a null hypothesis that the category medians are equal, $H_0: m_C = m_H = m_D$. We choose $\alpha$, the Type I error rate, to be 0.05 and run the test as shown below.

    kruskal.test(gpas, schools)

    Kruskal-Wallis rank sum test

    data:  gpas and schools
    Kruskal-Wallis chi-squared = 3.706, df = 2, p-value = 0.1568
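
To see where the reported statistic comes from, it can be reproduced by hand. The sketch below assumes the standard textbook form of the Kruskal-Wallis statistic: rank all observations together (ties share an average rank), compute $H$ from the per-group rank sums, divide by a correction factor for ties, and compare against a chi-squared distribution with one fewer degree of freedom than the number of groups. All variable names here are illustrative.

```r
# Same example data as above, repeated so this block runs on its own
gpas <- c( 3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
           3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
           4.00, 3.75, 3.34 )
schools <- c( rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3) )

n_total <- length(gpas)

# Rank all GPAs together; rank() gives tied values their average rank by default
ranks <- rank(gpas)

# Sum of ranks and number of observations within each school
rank_sums   <- tapply(ranks, schools, sum)
group_sizes <- tapply(ranks, schools, length)

# Uncorrected H statistic
h_raw <- 12 / (n_total * (n_total + 1)) * sum(rank_sums^2 / group_sizes) -
  3 * (n_total + 1)

# Tie correction: based on how many times each distinct GPA value occurs
tie_counts     <- as.vector(table(gpas))
tie_correction <- 1 - sum(tie_counts^3 - tie_counts) / (n_total^3 - n_total)

h_stat   <- h_raw / tie_correction
deg_free <- length(rank_sums) - 1

# Upper-tail probability under the chi-squared approximation
p_value <- pchisq(h_stat, df = deg_free, lower.tail = FALSE)

print(round(h_stat, 3))
print(round(p_value, 4))
```

The values of `h_stat` and `p_value` should agree with the chi-squared statistic and p-value in the `kruskal.test` output.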


The p-value, 0.1568, is greater than $\alpha$, so we fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the median GPAs of matriculated students at these three schools are different from each other.
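
When re-using this code with other data, the comparison against $\alpha$ can be made in code rather than by reading the printed output. This is a minimal sketch that stores the test result and reads its `p.value` component; the names `alpha`, `result`, and `reject_null` are illustrative.

```r
# Same example data as above, repeated so this block runs on its own
gpas <- c( 3.40, 3.66, 3.90, 3.55, 3.90, 3.58,
           3.90, 3.97, 3.92, 3.83, 4.00, 3.68,
           4.00, 3.75, 3.34 )
schools <- c( rep("Harvard", 6), rep("Dartmouth", 6), rep("Columbia", 3) )

alpha <- 0.05  # chosen Type I error rate

# Store the result instead of printing it
result <- kruskal.test(gpas, schools)

# Reject the null hypothesis only if the p-value falls below alpha
reject_null <- result$p.value < alpha

if (reject_null) {
  message("Reject H0: at least one median appears to differ.")
} else {
  message("Fail to reject H0: no sufficient evidence that the medians differ.")
}
```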