How to create a box (and whisker) plot
Description
A box plot, or a box and whisker plot, shows the quartiles of a single variable from a dataset (one of which is the median) and may also show the outliers. It is a simplified way to see the distribution of a variable. Sometimes multiple box plots (one for each of several variables) are shown side-by-side on a plot, to compare the variables. How can we create such graphs?
Related tasks:
- How to create basic plots
- How to add details to a plot
- How to create a histogram
- How to change axes, ticks, and scale in a plot
- How to create bivariate plots to compare groups
- How to plot interaction effects of treatments
Using Matplotlib, in Python
We will create some fake data using Python lists, for simplicity. But everything we show below also works if your data is in columns of a DataFrame, such as df['age'].
patient_id = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
patient_height = [ 60, 64, 64, 65, 66, 66, 70, 72, 72, 76 ]
patient_weight = [ 141, 182, 169, 204, 138, 198, 180, 175, 244, 196 ]
The conventional way to import matplotlib in Python is as follows.
import matplotlib.pyplot as plt
Creating a box-and-whisker plot, sometimes called just a box plot, requires just one line of code, plus one to show the plot.
plt.boxplot( patient_height )
plt.show()
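To see which numbers such a plot summarizes, the quartiles and outliers can be computed by hand. The following is a minimal sketch using only Python's standard library. The helper name box_plot_summary and its whisker_factor parameter are hypothetical names chosen for illustration. It assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the quartiles, which is also Matplotlib's default whisker setting. The "inclusive" quantile method is used here because it interpolates linearly between data points; other quartile conventions give slightly different values.

```python
import statistics

patient_height = [60, 64, 64, 65, 66, 66, 70, 72, 72, 76]
patient_weight = [141, 182, 169, 204, 138, 198, 180, 175, 244, 196]

def box_plot_summary(values, whisker_factor=1.5):
    """Return the quartiles and outliers that a box plot would display.

    Hypothetical helper for illustration; not part of Matplotlib.
    """
    # Three cut points split the sorted data into four equal parts.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the height of the box
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    # Points beyond the fences are drawn individually as outliers.
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

print(box_plot_summary(patient_height))
# {'q1': 64.25, 'median': 66.0, 'q3': 71.5, 'outliers': []}
print(box_plot_summary(patient_weight))
# {'q1': 170.5, 'median': 181.0, 'q3': 197.5, 'outliers': [244]}
```

In this fake data, the heights have no outliers, while the largest weight (244) lies above the upper fence and would appear as a separate point.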
You can show more than one variable’s box plot side-by-side by forming a list of the data.
plt.boxplot( [ patient_height, patient_weight ] )
plt.show()
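When the variables are on very different scales, as height and weight are here, a shared vertical axis squeezes one of the boxes. One possible alternative, sketched below, is to give each variable its own axes using plt.subplots. The titles and the variable names height_parts and weight_parts are illustrative choices, not part of the original solution.

```python
import matplotlib.pyplot as plt

patient_height = [60, 64, 64, 65, 66, 66, 70, 72, 72, 76]
patient_weight = [141, 182, 169, 204, 138, 198, 180, 175, 244, 196]

# One row, two columns: each variable gets its own vertical scale.
fig, (ax_height, ax_weight) = plt.subplots(1, 2)

# boxplot() returns a dict of the drawn pieces (boxes, medians, fliers, ...).
height_parts = ax_height.boxplot(patient_height)
ax_height.set_title("patient_height")

weight_parts = ax_weight.boxplot(patient_weight)
ax_weight.set_title("patient_weight")

fig.tight_layout()  # keep the two panels from overlapping
plt.show()
```

The dictionaries returned by boxplot() can be inspected if you need the plotted values; for example, weight_parts['fliers'] holds the outlier points drawn for the weights.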
Content last modified on 24 July 2023.
Solution, in R
We will create some fake data using vectors, for simplicity. But everything we show below also works if your data is in columns of a data frame.
patient_id <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
patient_height <- c(60, 64, 64, 65, 66, 66, 70, 72, 72, 76)
patient_weight <- c(141, 182, 169, 204, 138, 198, 180, 175, 244, 196)
We can use R’s boxplot() function to make the plot.
boxplot(patient_weight)
You can show more than one variable’s box plot side-by-side by passing both variables into the boxplot() function.
boxplot(patient_height, patient_weight)
Opportunities
This website does not yet contain a solution for this task in any of the following software packages.
- Excel
- Julia
If you can contribute a solution using any of these pieces of software, see our Contributing page for how to help extend this website.