# How to perform a chi-squared test on a contingency table (in R)

If we have a contingency table showing the frequencies observed in two categorical variables, how can we run a $\chi^2$ test to see if the two variables are independent?

## Solution

Here we will use a $2\times4$ matrix to store a contingency table of education vs. gender, taken from Penn State University’s online stats review website. You should use your own data. (Note: R’s table function is useful for creating contingency tables from data.)

    data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4,
                    dimnames = list( c('F','M'), c('HS','BS','MS','PhD') ),
                    byrow = TRUE )
    data

      HS BS MS PhD
    F 60 54 46  41
    M 40 44 53  57
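
If your data is stored as one row per observation rather than as counts, the `table` function mentioned above can build the contingency table for you. The following is a minimal sketch; the data frame `people`, its columns `gender` and `education`, and its values are all hypothetical and made up purely for illustration.

```r
# Hypothetical raw data: one row per person (values invented for illustration).
people <- data.frame(
  gender    = c('F', 'F', 'M', 'M', 'F', 'M'),
  education = c('HS', 'BS', 'HS', 'PhD', 'MS', 'BS')
)

# table() cross-tabulates the two columns into a contingency table
# with one row per gender and one column per education level.
counts <- table( people$gender, people$education )
counts
```

A table built this way can be passed to `chisq.test` in the same way as a matrix, although a real analysis would need far more than six observations.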


The $\chi^2$ test’s null hypothesis $H_0$ is that the two variables are independent. We choose a significance level $0\leq\alpha\leq1$, which is the probability of a Type I error we are willing to accept (a false positive: rejecting $H_0$ when it is actually true).
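
For reference, the test is based on the standard Pearson statistic, which compares each observed count with the count expected under independence. The notation here is introduced only for this summary and does not come from the original text: $O_{ij}$ is the observed count in row $i$ and column $j$, $R_i$ and $C_j$ are the row and column totals, $N$ is the grand total, and the table has $r$ rows and $c$ columns.

$$E_{ij} = \frac{R_i C_j}{N}, \qquad \chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij}-E_{ij})^2}{E_{ij}}$$

Under $H_0$, this statistic approximately follows a $\chi^2$ distribution with $(r-1)(c-1)$ degrees of freedom, which is $(2-1)(4-1)=3$ for the $2\times4$ table used here.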

R provides a chisq.test function that does exactly what we need.

    results <- chisq.test( data )
    results

        Pearson's Chi-squared test

    data:  data
    X-squared = 8.0061, df = 3, p-value = 0.04589
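
To see where these numbers come from, the same quantities can be computed by hand. This is only a sketch of the usual Pearson calculation, not a description of how `chisq.test` is implemented internally; the names `expected`, `statistic`, `dof`, and `p_value` are chosen here for illustration.

```r
# Same table as in the solution, rebuilt so this snippet runs on its own.
data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4,
                dimnames = list( c('F','M'), c('HS','BS','MS','PhD') ),
                byrow = TRUE )

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer( rowSums( data ), colSums( data ) ) / sum( data )

# Pearson statistic: sum over all cells of (observed - expected)^2 / expected.
statistic <- sum( ( data - expected )^2 / expected )

# Degrees of freedom: (rows - 1) * (columns - 1).
dof <- ( nrow( data ) - 1 ) * ( ncol( data ) - 1 )

# p-value: upper-tail probability of the chi-squared distribution.
p_value <- pchisq( statistic, df = dof, lower.tail = FALSE )

c( statistic = statistic, df = dof, p.value = p_value )
```

The three values should agree with the `X-squared`, `df`, and `p-value` that R reports for this table.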


We can manually compare the $p$-value to an $\alpha$ we’ve chosen, or ask R to do it.

    alpha <- 0.05            # or choose your own alpha here
    results$p.value < alpha  # reject the null hypothesis?

    [1] TRUE
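
Because the $p$-value is below $\alpha = 0.05$ here, we would reject the null hypothesis of independence at that level. As a possible follow-up, the object returned by `chisq.test` also stores the expected counts and the Pearson residuals. The sketch below uses them to check a common rule of thumb (expected counts of at least 5, a conventional guideline that is not part of the original solution) and to see which cells deviate most from independence.

```r
# Same table and test as in the solution, rebuilt so this snippet runs on its own.
data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4,
                dimnames = list( c('F','M'), c('HS','BS','MS','PhD') ),
                byrow = TRUE )
results <- chisq.test( data )

# Expected counts under independence.
results$expected

# Rule-of-thumb check: are all expected counts at least 5?
all( results$expected >= 5 )

# Pearson residuals, (observed - expected) / sqrt(expected);
# cells with larger absolute values contribute more to the statistic.
round( results$residuals, 2 )
```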