How to perform a chi-squared test on a contingency table (in R)

Task

If we have a contingency table showing the frequencies observed in two categorical variables, how can we run a χ² test to see if the two variables are independent?

Solution

Here we will use a 2×4 matrix to store a contingency table of education vs. gender, taken from Penn State University’s online stats review website. You should use your own data. (Note: R’s table function is useful for creating contingency tables from data.)

data <- matrix( c( 60, 54, 46, 41, 40, 44, 53, 57 ), ncol = 4,
                dimnames = list( c('F','M'), c('HS','BS','MS','PhD') ),
                byrow = TRUE )
data
  HS BS MS PhD
F 60 54 46 41 
M 40 44 53 57 
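
The note above mentions R's table function. As a minimal sketch of that idea, the example below cross-tabulates raw observations into a contingency table. The data frame `people` and its six rows are made up for illustration and are not part of the Penn State example.

```r
# Hypothetical raw data, one row per person (invented for illustration).
people <- data.frame(
  gender    = c("F", "F", "M", "M", "F", "M"),
  education = c("HS", "BS", "HS", "MS", "HS", "BS")
)

# Count how often each (gender, education) pair occurs.
# Rows are the levels of the first argument, columns the levels of the second.
counts <- table(people$gender, people$education)
counts
```

A two-way table built this way can be passed to chisq.test just like the matrix above, although a sample this small would be far too sparse for the test to be meaningful.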

The χ² test’s null hypothesis, H0, is that the two variables are independent. We choose a value 0 ≤ α ≤ 1 as the probability of a Type I error (a false positive: rejecting H0 when it is actually true).

R provides a chisq.test function that does exactly what we need.

results <- chisq.test( data )
results
	Pearson's Chi-squared test

data:  data
X-squared = 8.0061, df = 3, p-value = 0.04589
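
To make the reported numbers less of a black box, here is a sketch of how Pearson's statistic can be computed by hand for the same table. The variable names (`observed`, `expected`, `statistic`, `p_value`) are our own. It assumes no continuity correction, which matches chisq.test's behavior for tables larger than 2×2.

```r
# Same contingency table as above, rebuilt so this block runs on its own.
observed <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE,
                   dimnames = list(c("F", "M"), c("HS", "BS", "MS", "PhD")))

# Expected count per cell under independence:
# (row total * column total) / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson's statistic: sum over cells of (observed - expected)^2 / expected.
statistic <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# p-value: upper-tail probability of the chi-squared distribution.
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

c(statistic = statistic, df = df, p_value = p_value)
```

These values should agree with the X-squared, df, and p-value printed by chisq.test above. The expected counts are also available from the test object as results$expected.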

We can manually compare the p-value to an α we’ve chosen, or ask R to do it.

alpha <- 0.05            # or choose your own alpha here
results$p.value < alpha  # reject the null hypothesis?
[1] TRUE
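
An equivalent way to reach the same decision is to compare the test statistic with the critical value of the χ² distribution at the chosen α. The sketch below rebuilds the table so it runs on its own. The names `observed`, `critical`, and `decision` are our own choices.

```r
# Same contingency table as above.
observed <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE,
                   dimnames = list(c("F", "M"), c("HS", "BS", "MS", "PhD")))

alpha   <- 0.05
results <- chisq.test(observed)

# Critical value: the (1 - alpha) quantile of the chi-squared distribution
# with the test's degrees of freedom (stored in results$parameter).
critical <- qchisq(1 - alpha, df = results$parameter)

# Reject the null hypothesis when the statistic exceeds the critical value.
decision <- unname(results$statistic) > unname(critical)
decision
```

Because the p-value is the upper-tail probability of the statistic, this comparison yields the same answer as results$p.value < alpha.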

Content last modified on 24 July 2023.

Contributed by Nathan Carter (ncarter@bentley.edu)