# How to do a Wilcoxon rank-sum test

## Description

Assume we have two independent samples of data, $x_1, x_2, x_3, \ldots, x_n$ and $x'_1, x'_2, x'_3, \ldots, x'_m$, each from a different population. Also assume that the sample sizes are small or the populations are not normally distributed, but that the two population distributions have approximately the same shape. How can we test whether the two medians differ significantly (or whether one is significantly greater or less than the other)? One method is the Wilcoxon rank-sum test.

Related tasks:

- How to do a Kruskal-Wallis test
- How to do a Wilcoxon signed-rank test
- How to do a Wilcoxon signed-rank test for matched pairs

## Using SciPy, in Python

We’re going to use fake data for illustrative purposes, but you can replace our fake data with your real data. Say our first sample, $x_1, x_2, x_3, \ldots, x_n$, has median $m_1$, and our second sample, $x'_1, x'_2, x'_3, \ldots, x'_m$, has median $m_2$.

```python
import numpy as np
# Replace sample1 and sample2 with your data
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
```

We choose a value, $0 \le \alpha \le 1$, as the Type I Error Rate. We’ll let $\alpha$ be 0.05.

### Two-tailed test

To test the null hypothesis $H_0: m_1 - m_2 = 0$, that is, $m_1=m_2$, we use a two-tailed test:

```python
from scipy.stats import ranksums
ranksums(sample1, sample2)
```

```
RanksumsResult(statistic=2.0892772350933626, pvalue=0.03668277440246522)
```

Our p-value, $0.03668$, is less than $\alpha=0.05$, so we have sufficient evidence to reject the null hypothesis. The population medians are significantly different from each other.
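To see where the statistic above comes from, here is a minimal sketch that recomputes it by hand, assuming the data contain no ties. It ranks the pooled data with `scipy.stats.rankdata`, sums the ranks belonging to the first sample, and standardizes that sum using its mean and standard deviation under the null hypothesis. The variable names are our own, chosen for illustration.

```python
import numpy as np
from scipy.stats import norm, rankdata, ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
n1, n2 = len(sample1), len(sample2)

# Rank the pooled data, then add up the ranks that belong to sample1
ranks = rankdata(np.concatenate([sample1, sample2]))
rank_sum = ranks[:n1].sum()

# Mean and standard deviation of the rank sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std_dev = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardized statistic and two-tailed p-value from the normal distribution
z = (rank_sum - expected) / std_dev
p_two_tailed = 2 * norm.sf(abs(z))
print(z, p_two_tailed)
```

For this data the rank sum of the first sample is 122, and `z` and `p_two_tailed` should agree with the `statistic` and `pvalue` returned by `ranksums(sample1, sample2)`.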

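(The output above is slightly different from the output you would get by running this test in R, because SciPy's `ranksums` computes the p-value from a normal approximation, whereas R's `wilcox.test` computes an exact p-value from the Wilcoxon rank-sum distribution when both samples are small and contain no ties.)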
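If an exact p-value is wanted in Python as well, one option is `scipy.stats.mannwhitneyu`, which performs the equivalent Mann-Whitney U test. The sketch below assumes SciPy 1.7 or later, where the `method` parameter is available, and assumes the data contain no ties.

```python
import numpy as np
from scipy.stats import mannwhitneyu

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# method="exact" uses the exact null distribution of U
# rather than a normal approximation
exact_result = mannwhitneyu(sample1, sample2,
                            alternative="two-sided", method="exact")
print(exact_result.statistic, exact_result.pvalue)
```

The statistic reported here is the U statistic for the first sample, which is the same quantity R's `wilcox.test` reports as W, so the result should closely match R's output for the same data.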

### Right-tailed test

To test the null hypothesis $H_0: m_1 - m_2 \le 0$, that is, $m_1\le m_2$, we use a right-tailed test:

```python
ranksums(sample1, sample2, alternative = 'greater')
```

```
RanksumsResult(statistic=2.0892772350933626, pvalue=0.01834138720123261)
```

Our p-value, $0.01834$, is less than $\alpha=0.05$, so we have sufficient evidence to reject the null hypothesis. The first population median is significantly greater than the second.


### Left-tailed test

To test the null hypothesis $H_0: m_1 - m_2 \ge 0$, that is, $m_1\ge m_2$, we use a left-tailed test:

```python
ranksums(sample1, sample2, alternative = 'less')
```

```
RanksumsResult(statistic=2.0892772350933626, pvalue=0.9816586127987674)
```

Our p-value, $0.98166$, is greater than $\alpha=0.05$, so we do not have sufficient evidence to reject the null hypothesis. The first population median is not significantly smaller than the second population median.
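The two-tailed, right-tailed, and left-tailed p-values all come from the same statistic. As a sketch, assuming the normal approximation that `ranksums` uses, each one can be recovered from the standardized statistic with `scipy.stats.norm`. The variable names are ours.

```python
import numpy as np
from scipy.stats import norm, ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# The statistic is the same regardless of the alternative hypothesis
z = ranksums(sample1, sample2).statistic

p_greater = norm.sf(z)              # right tail: area above z
p_less = norm.cdf(z)                # left tail: area below z
p_two_sided = 2 * norm.sf(abs(z))   # both tails

print(p_two_sided, p_greater, p_less)
```

The right-tailed and left-tailed p-values sum to 1, and because `z` is positive for this data, the two-tailed p-value is twice the right-tailed one.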


Content last modified on 24 July 2023.


## Solution, in R

We’re going to use fake data for illustrative purposes, but you can replace our fake data with your real data. Say our first sample, $x_1, x_2, x_3, \ldots, x_n$, has median $m_1$, and our second sample, $x'_1, x'_2, x'_3, \ldots, x'_m$, has median $m_2$.

```r
# Replace sample1 and sample2 with your data
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)
```

We choose a value, $0 \le \alpha \le 1$, as the Type I Error Rate. We’ll let $\alpha$ be 0.05.

### Two-tailed test

To test the null hypothesis $H_0: m_1 - m_2 = 0$, that is, $m_1=m_2$, we use a two-tailed test:

```r
wilcox.test(sample1, sample2, alternative = "two.sided", mu = 0, paired = FALSE)
```

```
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.03813
alternative hypothesis: true location shift is not equal to 0
```

Our p-value, $0.03813$, is less than $\alpha=0.05$, so we have sufficient evidence to reject the null hypothesis. The population medians are significantly different from each other.

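(The output above is slightly different from the output you would get by running this test in Python, because SciPy's `ranksums` computes the p-value from a normal approximation, whereas R's `wilcox.test` computes an exact p-value from the Wilcoxon rank-sum distribution when both samples are small and contain no ties.)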

### Right-tailed test

To test the null hypothesis $H_0: m_1 - m_2 \le 0$, that is, $m_1\le m_2$, we use a right-tailed test:

```r
wilcox.test(sample1, sample2, alternative = "greater", mu = 0, paired = FALSE)
```

```
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.01906
alternative hypothesis: true location shift is greater than 0
```

Our p-value, $0.01906$, is less than $\alpha=0.05$, so we have sufficient evidence to reject the null hypothesis. The first population median is significantly greater than the second.


### Left-tailed test

To test the null hypothesis $H_0: m_1 - m_2 \ge 0$, that is, $m_1\ge m_2$, we use a left-tailed test:

```r
wilcox.test(sample1, sample2, alternative = "less", mu = 0, paired = FALSE)
```

```
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.9845
alternative hypothesis: true location shift is less than 0
```

Our p-value, $0.9845$, is greater than $\alpha=0.05$, so we do not have sufficient evidence to reject the null hypothesis. The first population median is not significantly smaller than the second population median.


NOTE: If there are ties in the data and there are fewer than 50 observations in each sample, then R will compute the $p$-value using a normal approximation and will issue a warning that the exact $p$-value cannot be computed.



## Opportunities

This website does not yet contain a solution for this task in any of the following software packages.

- Excel
- Julia

If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.