# How to add a polynomial term to a model (in R)

Sometimes, a simple linear model isn’t sufficient to describe the data. How can we include a higher-order term in a regression model, such as the square or cube of one of the predictors?

## Solution

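We’re going to use the pressure dataset as example data. It ships with R in the built-in datasets package, so no extra library is needed. It contains 19 observations of the vapor pressure of mercury (in mm Hg) at different temperatures (in degrees Celsius). You would use your own data instead.

```r
# pressure is part of R's built-in datasets package, so nothing needs to be installed
data("pressure")
head(pressure)  # two columns: temperature and pressure
```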


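Let’s model temperature as the dependent variable, with pressure and pressure squared as the predictors. To place the “pressure squared” term in the model, we use R’s poly function, as shown below. It automatically includes a first-degree pressure term as well (not squared). By default, poly builds orthogonal polynomial terms rather than the raw powers of pressure, which affects how the coefficients are read but not the fitted values.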

```r
# Build the model
model <- lm(temperature ~ poly(pressure, 2), data = pressure)
summary(model)
```

```
Call:
lm(formula = temperature ~ poly(pressure, 2), data = pressure)

Residuals:
     Min       1Q   Median       3Q      Max
-113.095  -44.543    6.157   50.459   75.791

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          180.00      14.31  12.581 1.03e-09 ***
poly(pressure, 2)1   361.84      62.36   5.802 2.70e-05 ***
poly(pressure, 2)2  -186.66      62.36  -2.993   0.0086 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 62.36 on 16 degrees of freedom
Multiple R-squared:  0.7271,	Adjusted R-squared:  0.693
F-statistic: 21.31 on 2 and 16 DF,  p-value: 3.079e-05
```


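Now we have a model of the form $\hat t = 180 + 361.84\,P_1(p) - 186.66\,P_2(p)$, where $t$ stands for temperature, $p$ for pressure, and $P_1$ and $P_2$ are the first- and second-degree orthogonal polynomials in $p$ that poly constructs by default. The coefficients therefore do not multiply $p$ and $p^2$ directly. To get coefficients on the raw powers, fit the model with poly(pressure, 2, raw = TRUE); the fitted values are the same either way.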
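Because poly uses orthogonal polynomial columns by default, the estimates in the summary are not coefficients on pressure and pressure squared themselves. The following is a minimal sketch, assuming the same built-in pressure data, of how one might fit the raw-power version and check that it describes the same curve. The names model_orth, model_raw, and model_hand are chosen here for illustration only.

```r
# Assumes the built-in pressure dataset used above
data("pressure")

# Default: orthogonal polynomial columns
model_orth <- lm(temperature ~ poly(pressure, 2), data = pressure)

# raw = TRUE: the columns are pressure and pressure^2 themselves
model_raw <- lm(temperature ~ poly(pressure, 2, raw = TRUE), data = pressure)

# The same raw model written out by hand; I() protects ^ inside a formula
model_hand <- lm(temperature ~ pressure + I(pressure^2), data = pressure)

coef(model_raw)   # intercept, coefficient on p, coefficient on p^2
coef(model_hand)  # same estimates under different names

# The parameterizations differ only in their coefficients, not in the fit
all.equal(unname(fitted(model_orth)), unname(fitted(model_raw)))
all.equal(unname(fitted(model_raw)), unname(fitted(model_hand)))
```

The orthogonal form is usually the numerically safer choice, while the raw form is easier to write down as an equation in $p$ and $p^2$.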

You can change the number in the poly function. For example, if we wanted to create a third-degree polynomial term then we would have specified poly(pressure, 3), and it would have included pressure, pressure squared, and pressure cubed.
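As a sketch of that change, again assuming the built-in pressure data, the code below fits the second- and third-degree models side by side and compares them with an F-test. The names model_quad and model_cubic are illustrative, and whether the extra term is worth keeping depends on your own data.

```r
# Assumes the built-in pressure dataset used above
data("pressure")

model_quad  <- lm(temperature ~ poly(pressure, 2), data = pressure)
model_cubic <- lm(temperature ~ poly(pressure, 3), data = pressure)

# The cubic model carries one more coefficient than the quadratic one
length(coef(model_quad))   # 3: intercept plus two polynomial terms
length(coef(model_cubic))  # 4: intercept plus three polynomial terms

# F-test of whether the cubic term improves on the quadratic model
anova(model_quad, model_cubic)
```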