How to do a Wilcoxon rank-sum test (in R)
Task
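Assume we have two independent samples of data, and we want to test whether the populations they come from differ in location (for example, in their medians) without assuming those populations are normally distributed.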
Related tasks:
- How to do a Kruskal-Wallis test
- How to do a Wilcoxon signed-rank test
- How to do a Wilcoxon signed-rank test for matched pairs
Solution
We’re going to use fake data for illustrative purposes,
but you can replace our fake data with your real data.
Say our first sample is stored in sample1 and our second sample is stored in sample2.
# Replace sample1 and sample2 with your data
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)
We choose a value for the significance level (the Type I error rate) before running the test.
Two-tailed test
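To test the null hypothesis that the location shift between the two populations is 0, against the alternative that it is not 0, we use a two-tailed test.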
wilcox.test(sample1, sample2, alternative = "two.sided", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.03813
alternative hypothesis: true location shift is not equal to 0
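Our p-value, 0.03813, is compared with the chosen significance level: if it is smaller, we reject the null hypothesis; otherwise we do not have sufficient evidence to reject it.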
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
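To see roughly what a normal-approximation implementation reports, R can be asked to skip the exact Wilcoxon distribution by passing exact = FALSE. The sketch below reuses the fake data from above; the variable names are ours, and whether either approximate p-value matches a particular SciPy function depends on that function's continuity-correction setting, which is not checked here.

```r
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)

# Exact test (what R chooses by default here: small samples, no ties)
exact_result <- wilcox.test(sample1, sample2, alternative = "two.sided")

# Normal approximation, with and without the continuity correction
approx_corrected   <- wilcox.test(sample1, sample2, exact = FALSE, correct = TRUE)
approx_uncorrected <- wilcox.test(sample1, sample2, exact = FALSE, correct = FALSE)

# Compare the three p-values side by side
c(exact              = exact_result$p.value,
  normal_corrected   = approx_corrected$p.value,
  normal_uncorrected = approx_uncorrected$p.value)
```

The W statistic is the same in all three calls; only the reference distribution used to turn W into a p-value changes.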
Right-tailed test
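To test the null hypothesis that the location shift of the first population relative to the second is less than or equal to 0, against the alternative that it is greater than 0, we use a right-tailed test.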
wilcox.test(sample1, sample2, alternative = "greater", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.01906
alternative hypothesis: true location shift is greater than 0
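Our p-value, 0.01906, is compared with the chosen significance level: if it is smaller, we reject the null hypothesis; otherwise we do not have sufficient evidence to reject it.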
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
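Rather than reading the p-value off the printed output, the decision can also be made in code. This is a small sketch: alpha = 0.05 is an assumed significance level chosen only for illustration (substitute whatever value you chose), and the names result and reject_null are ours.

```r
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)

alpha <- 0.05  # assumed significance level; replace with your own choice

result <- wilcox.test(sample1, sample2, alternative = "greater", mu = 0, paired = FALSE)

# wilcox.test returns an "htest" object; its pieces are ordinary list elements
result$statistic
result$p.value

# Reject the null hypothesis only when the p-value falls below alpha
reject_null <- result$p.value < alpha
if (reject_null) {
  message("Reject the null hypothesis at alpha = ", alpha)
} else {
  message("Fail to reject the null hypothesis at alpha = ", alpha)
}
```

The same pattern works for the two-tailed and left-tailed tests by changing the alternative argument.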
Left-tailed test
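To test the null hypothesis that the location shift of the first population relative to the second is greater than or equal to 0, against the alternative that it is less than 0, we use a left-tailed test.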
wilcox.test(sample1, sample2, alternative = "less", mu = 0, paired = FALSE)
Wilcoxon rank sum exact test
data: sample1 and sample2
W = 77, p-value = 0.9845
alternative hypothesis: true location shift is less than 0
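Our p-value, 0.9845, is compared with the chosen significance level: if it is smaller, we reject the null hypothesis; otherwise we do not have sufficient evidence to reject it.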
(The output above is slightly different from the output you would get by running this test in Python, because SciPy uses a normal distribution internally, but R uses a Wilcoxon distribution.)
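The statistic W = 77 and the three exact p-values reported above can be reproduced by hand, which may help clarify what the test measures. The sketch below assumes there are no ties (true for this fake data) and uses pwilcox, base R's distribution function for the rank-sum statistic; the variable names are ours.

```r
sample1 <- c(56, 31, 190, 176, 119, 15, 140, 152, 167)
sample2 <- c(45, 36, 78, 54, 12, 25, 39, 48, 52, 70, 85)
n1 <- length(sample1)
n2 <- length(sample2)

# W counts the pairs (x, y), x from sample1 and y from sample2, with x > y
W_pairs <- sum(outer(sample1, sample2, ">"))

# Equivalent form: rank sum of sample1 in the pooled data,
# minus the smallest value that rank sum could take
pooled_ranks <- rank(c(sample1, sample2))
W_ranks <- sum(pooled_ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Exact one-sided p-values from the null distribution of W
p_greater <- pwilcox(W_pairs - 1, n1, n2, lower.tail = FALSE)  # P(W >= observed)
p_less    <- pwilcox(W_pairs, n1, n2)                          # P(W <= observed)

# Two-sided p-value: double the smaller tail, capped at 1
p_two_sided <- min(2 * min(p_greater, p_less), 1)

c(W = W_pairs, two_sided = p_two_sided, greater = p_greater, less = p_less)
```

Both ways of computing W agree, and the three probabilities correspond to the two-tailed, right-tailed, and left-tailed calls to wilcox.test shown earlier.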
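NOTE: If there are ties in the data and there are fewer than 50 observations in each sample, then R will compute a p-value using a normal approximation rather than the exact Wilcoxon distribution, and it will warn that it cannot compute an exact p-value with ties.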
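A small illustration of how R handles ties, using made-up vectors with_ties_a and with_ties_b (hypothetical data, not part of the example above) that share the value 2. The code captures any warning text instead of assuming its exact wording, since that may vary between R versions.

```r
# Hypothetical samples containing tied values (the value 2 appears three times)
with_ties_a <- c(1, 2, 2, 5, 7)
with_ties_b <- c(2, 3, 4, 6, 8)

# Run the test while collecting warnings instead of printing them
warning_text <- character(0)
result <- withCallingHandlers(
  wilcox.test(with_ties_a, with_ties_b),
  warning = function(w) {
    warning_text <<- c(warning_text, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

result$method   # which variant of the test R actually ran
result$p.value
warning_text    # any warnings raised along the way
```

Checking result$method is a quick way to confirm whether a given call used the exact distribution or an approximation.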
Content last modified on 24 July 2023.
Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)