How to perform a chi-squared test on a contingency table (in R)

Task

If we have a contingency table showing the frequencies observed in two categorical variables, how can we run a $\chi^2$ test to see if the two variables are independent?

Solution

Here we will use a $2\times4$ matrix to store a contingency table of education vs. gender, taken from Penn State University’s online stats review website. You should use your own data. (Note: R’s table function is useful for creating contingency tables from data.)

data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4,
                dimnames = list( c('F','M'), c('HS','BS','MS','PhD') ),
                byrow = TRUE )
data
  HS BS MS PhD
F 60 54 46  41
M 40 44 53  57
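
If your data are stored as one row per observation rather than as counts, the table function mentioned above can build the contingency table for you. The following is a minimal sketch; the data frame `survey` and its columns `gender` and `education` are made up for illustration and are not part of the Penn State example.

```r
# Hypothetical raw data: one row per person (invented for illustration only).
survey <- data.frame(
  gender    = c( 'F', 'M', 'F', 'M', 'F', 'M', 'F', 'M' ),
  education = c( 'HS', 'HS', 'BS', 'MS', 'PhD', 'BS', 'HS', 'PhD' )
)

# table() counts how often each gender/education combination occurs,
# giving a contingency table with genders as rows and education as columns.
counts <- table( survey$gender, survey$education )
counts
```

A table built this way can be passed to chisq.test just like the matrix above, though a sample as small as this toy one is far too small for a meaningful test.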

The $\chi^2$ test’s null hypothesis, $H_0$, is that the two variables are independent. We choose a significance level $0\leq\alpha\leq1$, which is the probability of a Type I error we are willing to accept (a false positive: rejecting $H_0$ when it’s actually true).

R provides a chisq.test function that does exactly what we need.

results <- chisq.test( data )
results
	Pearson's Chi-squared test

data:  data
X-squared = 8.0061, df = 3, p-value = 0.04589
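
To see where these numbers come from, the statistic can be reproduced by hand. The sketch below assumes the standard Pearson formula: expected counts under independence are (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The variable names (`observed`, `expected`, `statistic`, `p_value`) are our own choices for illustration.

```r
# Same counts as the contingency table above.
observed <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4, byrow = TRUE )

# Expected counts if the two variables were independent.
expected <- outer( rowSums( observed ), colSums( observed ) ) / sum( observed )

# Pearson's chi-squared statistic and its degrees of freedom.
statistic <- sum( ( observed - expected )^2 / expected )
df <- ( nrow( observed ) - 1 ) * ( ncol( observed ) - 1 )

# Upper-tail probability of the chi-squared distribution gives the p-value.
p_value <- pchisq( statistic, df = df, lower.tail = FALSE )
c( statistic = statistic, df = df, p.value = p_value )
```

These values should agree with the X-squared, df, and p-value that chisq.test reports.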

We can manually compare the $p$-value to an $\alpha$ we’ve chosen, or ask R to do it.

alpha <- 0.05            # or choose your own alpha here
results$p.value < alpha  # reject the null hypothesis?
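[1] TRUE

Because the $p$-value (0.04589) is below $\alpha=0.05$, we reject the null hypothesis: these data give evidence that education level and gender are not independent.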
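
The same decision can also be reached by comparing the test statistic to a critical value of the $\chi^2$ distribution instead of comparing the $p$-value to $\alpha$. This is a sketch of that equivalent approach, not a different test; the name `critical` is our own.

```r
# Recreate the table and test so this snippet runs on its own.
data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4, byrow = TRUE )
results <- chisq.test( data )
alpha <- 0.05

# Critical value: the point with probability alpha in the upper tail
# of a chi-squared distribution with the test's degrees of freedom.
critical <- qchisq( 1 - alpha, df = results$parameter )

# Reject the null hypothesis when the statistic exceeds the critical value.
unname( results$statistic > critical )
```

With 3 degrees of freedom and $\alpha=0.05$ the critical value is about 7.81, so a statistic of 8.0061 leads to the same rejection.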

Content last modified on 24 July 2023.

Contributed by Nathan Carter (ncarter@bentley.edu)