How to do a Wilcoxon rank-sum test

Description

Assume we have two independent samples of data, $x_{1}, x_{2}, x_{3}, \dots x_{n}$ and $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots x_{m}^{'}$ , each from a different population. Also assume that the sample sizes are small or the populations are not normally distributed, but that the two population distributions are approximately the same shape. How can we test whether there is a significant difference between the two medians (or if one is significantly greater than or less than the other)? One method is the Wilcoxon Rank-Sum Test.

Using SciPy, in Python

We’re going to use fake data for illustrative purposes, but you can replace our fake data with your real data. Say our first sample, $x_{1}, x_{2}, x_{3}, \dots x_{n}$ , has median $m_{1}$ , and our second sample, $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots x_{m}^{'}$ , has median $m_{2}$ .

import numpy as np
# Replace sample1 and sample2 with your data
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
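Before running the test, it can help to look at the sample medians themselves. The following is a minimal sketch; the names median1 and median2 are illustrative and are not part of the original solution. Keep in mind that these are sample medians, while the hypotheses concern the population medians $m_{1}$ and $m_{2}$.

```python
import numpy as np

# Same fake data as above
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# Sample medians (descriptive only; the test itself works on ranks)
median1 = np.median(sample1)
median2 = np.median(sample2)
print(median1, median2)  # prints 140.0 48.0
```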

We choose a value, $0 \leq α \leq 1$ , as the Type I Error Rate. We’ll let $α$ be 0.05.

Two-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} = 0$ , that is, $m_{1} = m_{2}$ , we use a two-tailed test:

# Import only the function we need
from scipy.stats import ranksums
ranksums(sample1, sample2)

RanksumsResult(statistic=2.0892772350933626, pvalue=0.03668277440246522)

Our p-value, $0.03668$ , is less than $α = 0.05$ , so we have sufficient evidence to reject the null hypothesis. The population medians are significantly different from each other.
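To see where the reported statistic comes from, the sketch below recomputes it by hand using the usual normal approximation for the rank sum of the first sample, with no continuity or tie correction. This appears to be what ranksums does, but treat that as an assumption rather than a statement about SciPy's internals. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import norm, rankdata

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
n1, n2 = len(sample1), len(sample2)

# Rank all observations together, then sum the ranks belonging to sample1
ranks = rankdata(np.concatenate([sample1, sample2]))
rank_sum1 = ranks[:n1].sum()

# Mean and standard deviation of the rank sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std_dev = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardized statistic and two-tailed p-value from the normal distribution
z = (rank_sum1 - expected) / std_dev
p_two_sided = 2 * norm.sf(abs(z))
print(rank_sum1, z, p_two_sided)
```

With this data the rank sum is 122, and the resulting z and p-value agree with the statistic and pvalue fields in the output above.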

(The output above is slightly different from the output you would get by running this test in R, because SciPy's ranksums function computes the p-value from a normal approximation, whereas R computes an exact p-value from the Wilcoxon rank-sum distribution.)
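If you want Python output that is closer to R's exact test, one option is the equivalent Mann-Whitney U test in SciPy. This is a sketch that assumes SciPy 1.7 or later, where mannwhitneyu accepts a method argument, and data without ties. It is an alternative to ranksums, not the approach used in the solution above.

```python
import numpy as np
from scipy.stats import mannwhitneyu

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# method="exact" requests the exact null distribution instead of the
# normal approximation (valid here because the data contain no ties)
exact_result = mannwhitneyu(sample1, sample2,
                            alternative="two-sided", method="exact")
print(exact_result.statistic, exact_result.pvalue)
```

The statistic here is the U statistic for the first sample, which plays the same role as the W value that R reports.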

Right-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \leq 0$ , that is, $m_{1} \leq m_{2}$ , we use a right-tailed test:

ranksums(sample1, sample2, alternative = 'greater')

RanksumsResult(statistic=2.0892772350933626, pvalue=0.01834138720123261)

Our p-value, $0.01834$, is less than $α = 0.05$, so we have sufficient evidence to reject the null hypothesis. The first population median is significantly greater than the second.

(The output above is slightly different from the output you would get by running this test in R, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)

Left-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \geq 0$ , that is, $m_{1} \geq m_{2}$ , we use a left-tailed test:

ranksums(sample1, sample2, alternative = 'less')

RanksumsResult(statistic=2.0892772350933626, pvalue=0.9816586127987674)

Our p-value, $0.98166$, is greater than $α = 0.05$, so we do not have sufficient evidence to reject the null hypothesis. The first population median is not significantly smaller than the second population median.
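The three tests above follow the same decision rule: reject the null hypothesis when the p-value is below $α$. As a compact recap, the sketch below runs all three alternatives in a loop. The decisions dictionary and the loop are illustrative additions, and the alternative argument assumes SciPy 1.7 or later.

```python
import numpy as np
from scipy.stats import ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
alpha = 0.05  # Type I error rate chosen earlier

decisions = {}
for alternative in ("two-sided", "greater", "less"):
    result = ranksums(sample1, sample2, alternative=alternative)
    # True means we reject the null hypothesis at level alpha
    decisions[alternative] = bool(result.pvalue < alpha)
    verdict = "reject H0" if decisions[alternative] else "fail to reject H0"
    print(f"{alternative:>9}: p = {result.pvalue:.5f}, {verdict}")
```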

(The output above is slightly different from the output you would get by running this test in R, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)

Content last modified on 24 July 2023.

Solution, in R

We’re going to use fake data for illustrative purposes, but you can replace our fake data with your real data. Say our first sample, $x_{1}, x_{2}, x_{3}, \dots x_{n}$ , has median $m_{1}$ , and our second sample, $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots x_{m}^{'}$ , has median $m_{2}$ .

# Replace sample1 and sample2 with your data
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)

We choose a value, $0 \leq α \leq 1$ , as the Type I Error Rate. We’ll let $α$ be 0.05.

Two-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} = 0$ , that is, $m_{1} = m_{2}$ , we use a two-tailed test:

wilcox.test(sample1, sample2, alternative = "two.sided", mu = 0, paired = FALSE)

	Wilcoxon rank sum exact test

data:  sample1 and sample2
W = 77, p-value = 0.03813
alternative hypothesis: true location shift is not equal to 0

Our p-value, $0.03813$ , is less than $α = 0.05$ , so we have sufficient evidence to reject the null hypothesis. The population medians are significantly different from each other.

(The output above is slightly different from the output you would get by running this test in Python, because SciPy's ranksums function computes the p-value from a normal approximation, whereas R computes an exact p-value from the Wilcoxon rank-sum distribution.)

Right-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \leq 0$ , that is, $m_{1} \leq m_{2}$ , we use a right-tailed test:

wilcox.test(sample1, sample2, alternative = "greater", mu = 0, paired = FALSE)

	Wilcoxon rank sum exact test

data:  sample1 and sample2
W = 77, p-value = 0.01906
alternative hypothesis: true location shift is greater than 0

Our p-value, $0.01906$, is less than $α = 0.05$, so we have sufficient evidence to reject the null hypothesis. The first population median is significantly greater than the second.

(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)

Left-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \geq 0$ , that is, $m_{1} \geq m_{2}$ , we use a left-tailed test:

wilcox.test(sample1, sample2, alternative = "less", mu = 0, paired = FALSE)

	Wilcoxon rank sum exact test

data:  sample1 and sample2
W = 77, p-value = 0.9845
alternative hypothesis: true location shift is less than 0

Our p-value, $0.9845$ , is greater than $α$ , so we do not have sufficient evidence to reject the null hypothesis. The first population median is not significantly smaller than the second population median.

(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)

NOTE: If there are ties in the data and there are fewer than 50 observations in each sample, then R will compute a $p$-value using the normal approximation, and it will issue a warning (not an error) indicating that the exact $p$-value cannot be calculated.
