How to do a Wilcoxon rank-sum test
Description
Assume we have two independent samples of data,
Related tasks:
- How to do a Kruskal-Wallis test
- How to do a Wilcoxon signed-rank test
- How to do a Wilcoxon signed-rank test for matched pairs
Using SciPy, in Python
We’re going to use fake data for illustrative purposes,
but you can replace our fake data with your real data.
Say our first sample,
import numpy as np
# Replace sample1 and sample2 with your data
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
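Before running the test, it can help to confirm the sample sizes and check whether any values are tied, since ties affect how rank-based p-values are computed. The following is a minimal sketch using the same fake data; the names `combined` and `n_tied_values` are our own, chosen for illustration.

```python
import numpy as np

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# Pool both samples, since ranks are assigned over the combined data
combined = np.concatenate([sample1, sample2])

# A tie is any value that appears more than once in the pooled data
values, counts = np.unique(combined, return_counts=True)
n_tied_values = int(np.sum(counts > 1))

print("n1 =", len(sample1), "n2 =", len(sample2))
print("distinct values that are tied:", n_tied_values)
```

With your own data, a nonzero tie count is a hint to read the software's documentation on how ties are handled.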
We choose a value,
Two-tailed test
To test the null hypothesis
from scipy.stats import ranksums
# The default alternative is 'two-sided'
ranksums(sample1, sample2)
RanksumsResult(statistic=2.0892772350933626, pvalue=0.03668277440246522)
Our p-value,
(The output above is slightly different from the output you would get by running this test in R, because SciPy's ranksums uses a normal approximation to compute the p-value, whereas R's wilcox.test computes an exact p-value from the Wilcoxon rank-sum distribution for this data.)
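To make the normal approximation concrete, the sketch below recomputes the statistic and two-sided p-value by hand. It uses the standard textbook formula for the rank-sum z-score, with no tie or continuity correction. We are assuming this is what ranksums does for this data; the fact that the numbers agree with the output above supports that. The variable names are ours.

```python
import numpy as np
from scipy.stats import norm, rankdata

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
n1, n2 = len(sample1), len(sample2)

# Rank all observations together, then sum the ranks that belong to sample1
ranks = rankdata(np.concatenate([sample1, sample2]))
rank_sum1 = ranks[:n1].sum()

# Mean and standard deviation of that rank sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std_dev = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardize, then read a two-sided p-value off the normal distribution
z = (rank_sum1 - expected) / std_dev
p_two_sided = 2 * norm.sf(abs(z))
print(z, p_two_sided)
```

Here the rank sum of the first sample is 122 against an expected 94.5, which is where the positive statistic comes from.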
Right-tailed test
To test the null hypothesis
ranksums(sample1, sample2, alternative='greater')
RanksumsResult(statistic=2.0892772350933626, pvalue=0.01834138720123261)
Our p-value,
(The output above is slightly different from the output you would get by running this test in R, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
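Turning a p-value into a decision is a single comparison against the significance level you chose in advance. The sketch below does this for the right-tailed test. The value 0.05 for `ALPHA` is an assumption made for illustration, not a value taken from the text above, so replace it with your own choice.

```python
import numpy as np
from scipy.stats import ranksums

ALPHA = 0.05  # assumed significance level; replace with your own choice

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

result = ranksums(sample1, sample2, alternative="greater")

# Reject the null hypothesis only when the p-value falls below alpha
reject_null = result.pvalue < ALPHA
print("p-value:", result.pvalue)
print("Reject the null hypothesis" if reject_null
      else "Fail to reject the null hypothesis")
```

The same comparison works for the two-tailed and left-tailed tests; only the `alternative` argument changes.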
Left-tailed test
To test the null hypothesis
ranksums(sample1, sample2, alternative='less')
RanksumsResult(statistic=2.0892772350933626, pvalue=0.9816586127987674)
Our p-value,
(The output above is slightly different from the output you would get by running this test in R, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
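If you would like Python results that are closer to R's, one option is SciPy's mannwhitneyu function. The Mann-Whitney U test is equivalent to the Wilcoxon rank-sum test, and in recent SciPy versions the function accepts `method="exact"`. The sketch below is hedged: we expect the U statistic and p-values to line up with R's W and exact p-values for data like this (small samples, no ties), but check against your own installation. The dictionary name `exact` is ours.

```python
import numpy as np
from scipy.stats import mannwhitneyu

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# method="exact" requests the exact null distribution of U instead of
# a normal approximation (needs a SciPy version that supports `method`)
exact = {
    alternative: mannwhitneyu(sample1, sample2,
                              alternative=alternative, method="exact")
    for alternative in ("two-sided", "greater", "less")
}

for alternative, res in exact.items():
    print(alternative, res.statistic, res.pvalue)
```

Note that the statistic reported here is U for the first sample, which is on a different scale from the z-score that ranksums reports.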
Content last modified on 24 July 2023.
Solution, in R
We’re going to use fake data for illustrative purposes,
but you can replace our fake data with your real data.
Say our first sample,
# Replace sample1 and sample2 with your data
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)
We choose a value,
Two-tailed test
To test the null hypothesis
wilcox.test(sample1, sample2, alternative = "two.sided", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.03813
alternative hypothesis: true location shift is not equal to 0
Our p-value,
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
Right-tailed test
To test the null hypothesis
wilcox.test(sample1, sample2, alternative = "greater", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.01906
alternative hypothesis: true location shift is greater than 0
Our p-value,
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
Left-tailed test
To test the null hypothesis
wilcox.test(sample1, sample2, alternative = "less", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.9845
alternative hypothesis: true location shift is less than 0
Our p-value,
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
NOTE: If there are ties in the data and there are fewer than 50 observations in each sample, then R will compute a