How to add a polynomial term to a model (in R)
Task
Sometimes, a simple linear model isn’t sufficient to describe the data. How can we include a higher-order term in a regression model, such as the square or cube of one of the predictors?
Solution
We’re going to use the pressure dataset from R’s built-in datasets package as example data.
It contains observations of pressure and temperature.
You would use your own data instead.
# The pressure dataset ships with base R, so no extra packages are needed
data("pressure")
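Before fitting anything, it can help to confirm what the data frame contains. The following is a minimal sketch using only base R functions; it assumes nothing beyond the built-in pressure data.

```r
data("pressure")

# Column names and types: temperature and pressure, both numeric
str(pressure)

# First few rows of the data
head(pressure)

# Range and quartiles of each column
summary(pressure)
```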
Let’s model temperature as the dependent variable, with pressure and pressure squared as the independent variables. To place the “pressure squared” term in the model, we use R’s poly function, as shown below. It automatically includes a first-degree (not squared) pressure term as well.
# Build the model
model <- lm(temperature ~ poly(pressure, 2), data = pressure)
summary(model)
Call:
lm(formula = temperature ~ poly(pressure, 2), data = pressure)

Residuals:
     Min       1Q   Median       3Q      Max
-113.095  -44.543    6.157   50.459   75.791

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          180.00      14.31  12.581 1.03e-09 ***
poly(pressure, 2)1   361.84      62.36   5.802 2.70e-05 ***
poly(pressure, 2)2  -186.66      62.36  -2.993   0.0086 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 62.36 on 16 degrees of freedom
Multiple R-squared:  0.7271,  Adjusted R-squared:  0.693
F-statistic: 21.31 on 2 and 16 DF,  p-value: 3.079e-05
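Now we have a model of the form $\hat t = 180 + 361.84\,P_1(p) - 186.66\,P_2(p)$, where $t$ stands for temperature, $p$ for pressure, and $P_1$ and $P_2$ are the orthogonal polynomials of degree 1 and 2 that poly constructs from the pressure values by default. Because of that default, the coefficients do not multiply $p$ and $p^2$ directly. To get coefficients on the raw powers of pressure, use poly(pressure, 2, raw = TRUE).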
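Because poly builds orthogonal polynomial terms by default, its coefficients cannot be plugged into a formula in pressure and pressure squared by hand. The sketch below, assuming the same pressure data, fits the model both ways and checks that the two versions describe the same curve. The names model_orth, model_raw, and new_data are illustrative only.

```r
data("pressure")

# Default: orthogonal polynomial basis
model_orth <- lm(temperature ~ poly(pressure, 2), data = pressure)

# raw = TRUE: the terms are pressure and pressure^2 themselves
model_raw <- lm(temperature ~ poly(pressure, 2, raw = TRUE), data = pressure)

# Coefficients of model_raw multiply pressure and pressure^2 directly
coef(model_raw)

# Both parameterizations give the same fitted values
all.equal(fitted(model_orth), fitted(model_raw))

# predict() handles either basis, so there is no need to apply
# the coefficients by hand (the pressure values here are arbitrary)
new_data <- data.frame(pressure = c(10, 100))
predict(model_orth, newdata = new_data)
predict(model_raw, newdata = new_data)
```

The orthogonal form is usually the numerically safer choice, while the raw form is easier to write out as an equation.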
You can change the number in the poly function. For example, to fit a third-degree polynomial, we would specify poly(pressure, 3), which includes pressure, pressure squared, and pressure cubed.
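As a rough illustration of changing the degree, the sketch below fits second- and third-degree models on the same pressure data. The names model2 and model3 are illustrative, and the anova comparison is an optional extra step rather than part of the task itself.

```r
data("pressure")

# Fit second- and third-degree polynomial models of temperature on pressure
model2 <- lm(temperature ~ poly(pressure, 2), data = pressure)
model3 <- lm(temperature ~ poly(pressure, 3), data = pressure)

# The cubic model has one more coefficient than the quadratic one
names(coef(model3))

# Optional: F-test of whether the cubic term improves on the quadratic fit
anova(model2, model3)
```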
Content last modified on 24 July 2023.
Contributed by Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)