How to do a Spearman rank correlation test (in Python, using SciPy)
Task
When we want to determine whether there is a relationship between two variables, but our samples do not come from normally distributed populations, we can use the Spearman Rank Correlation Test. How do we conduct it?
Solution
We will use some fake data about height and weight measurements for people. You can replace it with your real data.
Our data should be NumPy arrays, as in the example below. (pandas DataFrame columns are Series backed by NumPy arrays, so they can be passed to SciPy in the same way.)
import numpy as np
heights = np.array([60, 76, 57, 68, 70, 62, 63])
weights = np.array([145, 178, 120, 143, 174, 130, 137])
Let’s say we want to test the correlation between height (inches) and weight (pounds).
Our null hypothesis would state that the Spearman rank correlation coefficient is equal to zero,
or that there is no monotonic relationship between height and weight.
from scipy.stats import spearmanr
# Two-sided test of the null hypothesis that the Spearman coefficient is zero
spearmanr(heights, weights)
SignificanceResult(statistic=0.7857142857142859, pvalue=0.03623846267982713)
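Our p-value is about 0.036. At a significance level of 0.05, for example, this is small enough to reject the null hypothesis and conclude that height and weight are related.
(This is a two-sided test, which is the default for spearmanr.)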
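To see where the test statistic comes from, recall that the Spearman coefficient is the Pearson correlation computed on the ranks of the data rather than on the raw values. The following is a minimal sketch of that check on the same fake data. The variable names height_ranks, weight_ranks, rho_from_ranks, and rho_spearman are ours, chosen for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

heights = np.array([60, 76, 57, 68, 70, 62, 63])
weights = np.array([145, 178, 120, 143, 174, 130, 137])

# Replace each value by its rank (tied values, if any, get the average rank).
height_ranks = rankdata(heights)
weight_ranks = rankdata(weights)

# Pearson correlation of the ranks...
rho_from_ranks, _ = pearsonr(height_ranks, weight_ranks)

# ...should agree with the statistic reported by spearmanr.
rho_spearman, p_value = spearmanr(heights, weights)

print(rho_from_ranks, rho_spearman)
```

Both printed values should agree up to floating-point rounding. Working on ranks is what lets the test avoid assuming normally distributed populations.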
Note that for right- or left-tailed tests, the following syntax can be used.
spearmanr(heights, weights, alternative="greater") # right-tailed
spearmanr(heights, weights, alternative="less") # left-tailed
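As a rough illustration of how the three alternatives compare, the sketch below runs each one on the same fake data and applies a simple decision rule. The significance level alpha = 0.05 is an assumption made for this example, not something fixed by the test, and the results dictionary is just a convenience for collecting the output.

```python
import numpy as np
from scipy.stats import spearmanr

heights = np.array([60, 76, 57, 68, 70, 62, 63])
weights = np.array([145, 178, 120, 143, 174, 130, 137])

alpha = 0.05  # assumed significance level; choose your own before testing

results = {}
for alternative in ("two-sided", "greater", "less"):
    # The result unpacks into the coefficient and the p-value.
    rho, p_value = spearmanr(heights, weights, alternative=alternative)
    results[alternative] = (rho, p_value)
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{alternative:>9}: rho = {rho:.4f}, p = {p_value:.4f} -> {decision}")
```

The coefficient is the same in all three cases, and only the p-value changes. Because the coefficient here is positive, the right-tailed p-value is half the two-sided one, while the left-tailed p-value is close to 1.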
Content last modified on 24 July 2023.
Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)