How to do a Wilcoxon rank-sum test (in Python, using SciPy)
Task
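Assume we have two independent samples of data, each drawn from a different population, and we want to test whether the two populations have the same distribution without assuming the data are normally distributed. The Wilcoxon rank-sum test does this by comparing the ranks of the pooled observations.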
Related tasks:
- How to do a Kruskal-Wallis test
- How to do a Wilcoxon signed-rank test
- How to do a Wilcoxon signed-rank test for matched pairs
Solution
We’re going to use fake data for illustrative purposes,
but you can replace our fake data with your real data.
Say our first sample, sample1, has 9 observations and our second sample, sample2, has 11.
import numpy as np
# Replace sample1 and sample2 with your data
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
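We choose a value, α, between 0 and 1 as our significance level (the Type I error rate) before running the test; 0.05 is a common choice.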
Two-tailed test
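To test the null hypothesis that the two samples come from the same distribution, against the alternative that they do not, we call ranksums with its default two-sided alternative.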
from scipy.stats import ranksums
# Two-sided test (the default alternative)
ranksums(sample1, sample2)
RanksumsResult(statistic=2.0892772350933626, pvalue=0.03668277440246522)
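Our p-value, about 0.0367, is below a significance level of 0.05, for example, so at that level we would reject the null hypothesis and conclude that the two samples come from different distributions.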
(The output above is slightly different from what you would get by running this test in R, because SciPy's ranksums uses a normal approximation, whereas R's wilcox.test uses the exact Wilcoxon rank-sum distribution for small samples without ties.)
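To see how much the normal approximation matters for these data, one option is to compare ranksums with an exact test. The sketch below uses scipy.stats.mannwhitneyu with method='exact', which assumes SciPy 1.7 or later; the Mann-Whitney U test is equivalent to the Wilcoxon rank-sum test, and the exact method is only appropriate when there are no ties, as is the case for our fake data.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# Normal approximation, as used by ranksums
approx = ranksums(sample1, sample2)

# Exact null distribution of the U statistic (assumes SciPy >= 1.7 and no ties)
exact = mannwhitneyu(sample1, sample2, alternative='two-sided', method='exact')

print("normal approximation p-value:", approx.pvalue)
print("exact p-value:", exact.pvalue)
```

The exact p-value should be closer to what R reports for small samples like these, though the two approaches usually lead to the same conclusion.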
Right-tailed test
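To test the null hypothesis that the two samples come from the same distribution, against the alternative that values in the first sample tend to be larger than values in the second, we pass alternative='greater'.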
ranksums(sample1, sample2, alternative='greater')
RanksumsResult(statistic=2.0892772350933626, pvalue=0.01834138720123261)
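Our p-value, about 0.0183, is below a significance level of 0.05, for example, so at that level we would reject the null hypothesis in favor of the alternative that values in the first sample tend to be larger.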
(The output above is slightly different from what you would get by running this test in R, because SciPy's ranksums uses a normal approximation, whereas R's wilcox.test uses the exact Wilcoxon rank-sum distribution for small samples without ties.)
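To make the normal approximation concrete, the statistic and right-tailed p-value can be reproduced by hand. The following is a minimal sketch of the standard rank-sum calculation (no tie or continuity correction), not a copy of SciPy's source; the variable names are our own.

```python
import numpy as np
from scipy.stats import norm, rankdata, ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
n1, n2 = len(sample1), len(sample2)

# Rank all observations together, then sum the ranks that belong to sample1
ranks = rankdata(np.concatenate([sample1, sample2]))
rank_sum1 = ranks[:n1].sum()

# Mean and standard deviation of that rank sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std_dev = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardize, then take the upper-tail area of the standard normal
z = (rank_sum1 - expected) / std_dev
p_greater = norm.sf(z)

result = ranksums(sample1, sample2, alternative='greater')
print(z, result.statistic)
print(p_greater, result.pvalue)
```

Both printed pairs should agree, which shows that the reported statistic is a z-score for the rank sum of the first sample.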
Left-tailed test
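To test the null hypothesis that the two samples come from the same distribution, against the alternative that values in the first sample tend to be smaller than values in the second, we pass alternative='less'.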
ranksums(sample1, sample2, alternative='less')
RanksumsResult(statistic=2.0892772350933626, pvalue=0.9816586127987674)
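Our p-value, about 0.9817, is far above any conventional significance level, so we do not reject the null hypothesis; the data give no evidence that values in the first sample tend to be smaller than values in the second.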
(The output above is slightly different from what you would get by running this test in R, because SciPy's ranksums uses a normal approximation, whereas R's wilcox.test uses the exact Wilcoxon rank-sum distribution for small samples without ties.)
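The three tests can also be run side by side and each p-value compared with a significance level. In the sketch below, alpha = 0.05 is an assumed value chosen for illustration, and the p_values dictionary is our own helper, not part of SciPy.

```python
import numpy as np
from scipy.stats import ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

alpha = 0.05  # assumed significance level; replace with your own choice

p_values = {}
for alternative in ('two-sided', 'greater', 'less'):
    p = ranksums(sample1, sample2, alternative=alternative).pvalue
    p_values[alternative] = p
    decision = 'reject' if p < alpha else 'fail to reject'
    print(f"{alternative}: p = {p:.4f}, {decision} the null hypothesis")
```

Because the test statistic is treated as normally distributed, the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them.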
Content last modified on 24 July 2023.
Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)