How to create a box (and whisker) plot (in Python, using Matplotlib)
Task
A box plot, or a box and whisker plot, shows the quartiles of a single variable from a dataset (one of which is the median) and may also show the outliers. It is a simplified way to see the distribution of a variable. Sometimes multiple box plots (one for each of several variables) are shown side-by-side on a plot, to compare the variables. How can we create such graphs?
Related tasks:
- How to create basic plots
- How to add details to a plot
- How to create a histogram
- How to change axes, ticks, and scale in a plot
- How to create bivariate plots to compare groups
- How to plot interaction effects of treatments
Solution
We will create some fake data using Python lists, for simplicity. But everything we show below also works if your data is in columns of a DataFrame, such as df['age'].
patient_id = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
patient_height = [ 60, 64, 64, 65, 66, 66, 70, 72, 72, 76 ]
patient_weight = [ 141, 182, 169, 204, 138, 198, 180, 175, 244, 196 ]
The conventional way to import matplotlib in Python is as follows.
import matplotlib.pyplot as plt
Creating a box-and-whisker plot, sometimes called just a box plot, requires just one line of code, plus one to show the plot.
plt.boxplot( patient_height )
plt.show()
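To see which numbers the plot is built from, we can compute them by hand. The sketch below uses only Python's standard library and assumes Matplotlib's default settings: quartiles found by linear interpolation, and whiskers reaching to the most extreme data points within 1.5 times the interquartile range of the box. The helper name box_plot_summary is our own, invented for illustration, and is not part of Matplotlib.

```python
from statistics import quantiles

patient_height = [ 60, 64, 64, 65, 66, 66, 70, 72, 72, 76 ]
patient_weight = [ 141, 182, 169, 204, 138, 198, 180, 175, 244, 196 ]

def box_plot_summary( values, whis=1.5 ):
    # Hypothetical helper (not a Matplotlib function).
    # method='inclusive' interpolates linearly between data points,
    # which we assume matches Matplotlib's default quartile calculation.
    q1, median, q3 = quantiles( values, n=4, method='inclusive' )
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [ v for v in values if low_fence <= v <= high_fence ]
    # Anything outside the fences would be drawn as an outlier.
    outliers = [ v for v in values if v < low_fence or v > high_fence ]
    return {
        'q1': q1, 'median': median, 'q3': q3,
        'whisker_low': min( inside ), 'whisker_high': max( inside ),
        'outliers': outliers,
    }

height_summary = box_plot_summary( patient_height )
weight_summary = box_plot_summary( patient_weight )
print( height_summary )
print( weight_summary )
```

For the heights, the box runs from 64.25 to 71.5 with the median line at 66, and the whiskers reach 60 and 76 with no outliers. For the weights, the value 244 falls beyond the upper fence, so under these assumptions it would be drawn as a separate outlier point.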
You can show box plots for more than one variable side-by-side by passing a list of the datasets.
plt.boxplot( [ patient_height, patient_weight ] )
plt.show()
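The same call works when the data lives in a pandas DataFrame. The following is a minimal sketch, assuming pandas is installed; the column names height and weight are our own choices for illustration. It also labels each box, relying on the fact that Matplotlib places the boxes at x positions 1, 2, and so on by default.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same fake data as above, stored in DataFrame columns.
# The column names are assumptions made for this example.
df = pd.DataFrame( {
    'height': [ 60, 64, 64, 65, 66, 66, 70, 72, 72, 76 ],
    'weight': [ 141, 182, 169, 204, 138, 198, 180, 175, 244, 196 ],
} )

# Each column is passed as one dataset in the list.
result = plt.boxplot( [ df['height'], df['weight'] ] )

# Boxes sit at x = 1 and x = 2, so we replace those ticks with names.
plt.xticks( [ 1, 2 ], [ 'Height', 'Weight' ] )
```

As before, follow this with plt.show() to display the figure. The dictionary returned by plt.boxplot holds the drawn line objects (boxes, medians, whiskers, and outlier points), which can be useful if you want to restyle them afterward.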
Content last modified on 24 July 2023.
Contributed by Nathan Carter (ncarter@bentley.edu)