We often want to know whether a set of data is normally distributed, so that we can decide which inference tests are appropriate to conduct. One tool for checking whether a sample comes from a population that follows a normal distribution is the Jarque-Bera test for normality. How do we perform it?
- How to create a QQ-plot
- How to test data for normality with the D’Agostino-Pearson test
- How to test data for normality with Pearson’s chi-squared test
We’re going to use some fake restaurant data, but you can replace our fake data with your real data in the code below. The values in our fake data represent the amount of money that customers spent on a Sunday morning at the restaurant.
```python
# Replace your data here
spending = [ 34, 12, 19, 56, 54, 34, 45, 37, 13, 22, 65, 19, 16,
             45, 19, 50, 36, 23, 28, 56, 40, 61, 45, 47, 37 ]
```
The Jarque-Bera test is based on the skewness coefficient $S$ and the (excess) kurtosis coefficient $K$ of the sample, both of which are zero for a normal distribution. Our null hypothesis is therefore $H_0: S=K=0$, that is, that the sample data come from a normal distribution. We choose a value $0 \le \alpha \le 1$ as our Type 1 error rate. We’ll let $\alpha$ be 0.05 here.
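For reference, the test statistic is commonly written as shown below. This is a sketch of the standard textbook form, assuming $n$ denotes the sample size and $K$ is measured as excess kurtosis (kurtosis minus 3), which is consistent with the null hypothesis $K=0$ above.

$$JB = \frac{n}{6}\left(S^2 + \frac{K^2}{4}\right)$$

Under $H_0$, and for reasonably large samples, $JB$ approximately follows a chi-squared distribution with 2 degrees of freedom, which is where the $p$-value comes from.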
We can use the `jarque_bera()` function in SciPy’s stats package to run the hypothesis test.
```python
from scipy import stats
stats.jarque_bera( spending )
```
```
SignificanceResult(statistic=1.3347292970972002, pvalue=0.5130588882194849)
```
Our $p$-value of about $0.5131$ is greater than $\alpha$, so we fail to reject our null hypothesis. We would continue to operate under our original assumption that the data come from a normally distributed population.
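To make the calculation less of a black box, here is a minimal sketch that recomputes the statistic and $p$-value by hand, using only Python’s standard library. It assumes the moments are computed by dividing by $n$ (the biased, population-style estimates) and that the statistic is compared against a chi-squared distribution with 2 degrees of freedom. The function name `jarque_bera_by_hand` is our own, not part of SciPy.

```python
import math

# Same fake restaurant data as above; replace with your own data
spending = [ 34, 12, 19, 56, 54, 34, 45, 37, 13, 22, 65, 19, 16,
             45, 19, 50, 36, 23, 28, 56, 40, 61, 45, 47, 37 ]
alpha = 0.05

def jarque_bera_by_hand(data):
    """Return (statistic, p_value), assuming moments are divided by n."""
    n = len(data)
    mean = sum(data) / n
    # Second, third, and fourth central moments (dividing by n, not n - 1)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3   # zero for a normal distribution
    statistic = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
    # Upper-tail probability of a chi-squared distribution with
    # 2 degrees of freedom, which simplifies to exp(-x / 2)
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

statistic, p_value = jarque_bera_by_hand(spending)
print(statistic, p_value)

if p_value < alpha:
    print("Reject the null hypothesis of normality.")
else:
    print("Fail to reject the null hypothesis of normality.")
```

The printed statistic and $p$-value should agree with the SciPy result above up to floating-point rounding, and the final line reports the same decision: we fail to reject $H_0$.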
Content last modified on 24 July 2023.