How to do a Wilcoxon rank-sum test (in Python, using SciPy)

Task

Assume we have two independent samples of data, $x_{1}, x_{2}, x_{3}, \dots x_{n}$ and $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots x_{m}^{'}$ , each from a different population. Also assume that the sample sizes are small or the populations are not normally distributed, but that the two population distributions are approximately the same shape. How can we test whether there is a significant difference between the two medians (or if one is significantly greater than or less than the other)? One method is the Wilcoxon Rank-Sum Test.

Solution

We’re going to use fake data for illustrative purposes, but you can replace our fake data with your real data. Say our first sample, $x_{1}, x_{2}, x_{3}, \dots x_{n}$ , has median $m_{1}$ , and our second sample, $x_{1}^{'}, x_{2}^{'}, x_{3}^{'}, \dots x_{m}^{'}$ , has median $m_{2}$ .

import numpy as np
# Replace sample1 and sample2 with your data
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
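
Before running the test, it can help to look at the sample medians themselves. The short sketch below does that with the fake data above; the names `median1` and `median2` are ours, chosen for illustration only.

```python
import numpy as np

# Same fake data as above
sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# Sample medians, which estimate the population medians m_1 and m_2
median1 = np.median(sample1)
median2 = np.median(sample2)
print(median1, median2)  # 140.0 48.0
```

The sample medians differ considerably, but only the test can tell us whether a difference this large is plausible under the null hypothesis.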

We choose a value, $0 \leq α \leq 1$ , as the Type I Error Rate. We’ll let $α$ be 0.05.

Two-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} = 0$ , that is, $m_{1} = m_{2}$ , we use a two-tailed test:

from scipy.stats import ranksums
# The default is a two-tailed test (alternative='two-sided')
ranksums(sample1, sample2)

RanksumsResult(statistic=2.0892772350933626, pvalue=0.03668277440246522)

Our p-value, $0.03668$ , is less than $α = 0.05$ , so we have sufficient evidence to reject the null hypothesis. The population medians are significantly different from each other.
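
To see where the statistic comes from, here is a minimal sketch that reproduces it by hand, assuming the textbook normal approximation for the rank sum with no tie correction. The variable names (`rank_sum1`, `expected`, `std`, `z`, `p_two_sided`) are ours, not part of SciPy.

```python
import numpy as np
from scipy import stats

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
n1, n2 = len(sample1), len(sample2)

# Rank all observations together, then sum the ranks of the first sample
ranks = stats.rankdata(np.concatenate([sample1, sample2]))
rank_sum1 = ranks[:n1].sum()

# Mean and standard deviation of that rank sum under the null hypothesis
expected = n1 * (n1 + n2 + 1) / 2
std = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Standardize, then take a two-tailed p-value from the standard normal
z = (rank_sum1 - expected) / std
p_two_sided = 2 * stats.norm.sf(abs(z))
print(rank_sum1, z, p_two_sided)
```

With this data the first sample's rank sum is 122, and `z` and `p_two_sided` agree with the statistic and p-value reported by `ranksums` up to floating-point rounding.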

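(This output, like those of the one-tailed tests below, may differ slightly from what you would get by running the test in R. SciPy's `ranksums` always uses a normal approximation, whereas R's `wilcox.test` computes an exact p-value from the Wilcoxon rank-sum distribution when the samples are small and contain no ties.)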
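
If you would like an exact p-value in Python, one option is `scipy.stats.mannwhitneyu`, which performs the equivalent Mann-Whitney U test. This is a sketch rather than part of the original solution, and it assumes SciPy 1.7 or later, where the `method` parameter is available. The exact method is appropriate here because the data contain no ties.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])

# Exact p-value from the null distribution of U (valid when there are no ties)
exact = mannwhitneyu(sample1, sample2, alternative='two-sided', method='exact')

# Normal approximation, as used in the solution above
approx = ranksums(sample1, sample2)

print(exact.statistic, exact.pvalue)
print(approx.statistic, approx.pvalue)
```

Note that the two functions report different statistics: `mannwhitneyu` returns the U statistic for the first sample, while `ranksums` returns a standardized z-score. Only the p-values are directly comparable.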

Right-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \leq 0$ , that is, $m_{1} \leq m_{2}$ , we use a right-tailed test:

ranksums(sample1, sample2, alternative='greater')

RanksumsResult(statistic=2.0892772350933626, pvalue=0.01834138720123261)

Our p-value, $0.01834$ , is less than $α = 0.05$ , so we have sufficient evidence to reject the null hypothesis. The first population median is significantly greater than the second population median.

Left-tailed test

To test the null hypothesis $H_{0} : m_{1} - m_{2} \geq 0$ , that is, $m_{1} \geq m_{2}$ , we use a left-tailed test:

ranksums(sample1, sample2, alternative='less')

RanksumsResult(statistic=2.0892772350933626, pvalue=0.9816586127987674)

Our p-value, $0.98165$ , is greater than $α$ , so we do not have sufficient evidence to reject the null hypothesis. The first population median is not significantly smaller than the second population median.
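
The three tests differ only in the `alternative` argument, so the decision rule can be applied to all of them in one pass. The sketch below is one way to do that; the `results` dictionary and the printed wording are our own illustration, and it assumes SciPy 1.7 or later, where `ranksums` accepts `alternative`.

```python
import numpy as np
from scipy.stats import ranksums

sample1 = np.array([56, 31, 190, 176, 119, 15, 140, 152, 167])
sample2 = np.array([45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85])
alpha = 0.05  # Type I error rate

# Run the two-tailed, right-tailed, and left-tailed tests
results = {
    alt: ranksums(sample1, sample2, alternative=alt)
    for alt in ('two-sided', 'greater', 'less')
}

# Reject the null hypothesis whenever the p-value falls below alpha
for alt, res in results.items():
    decision = 'reject H0' if res.pvalue < alpha else 'fail to reject H0'
    print(f"{alt}: p = {res.pvalue:.5f} -> {decision}")
```

Because the statistic is the same in all three cases, the right-tailed and left-tailed p-values sum to 1, and the two-tailed p-value is twice the smaller of the two.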

Content last modified on 24 July 2023.

Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)