The boxplot is probably the most commonly used chart type to compare the distributions of several groups. However, keep in mind that the data distribution is hidden behind each box. For instance, a normal distribution could look exactly the same as a bimodal distribution. Please read more explanation on this matter, and consider a violin plot or a ridgeline chart instead.
The basic boxplot hides information: what does the underlying distribution look like? What are the category sample sizes?
Boxplots are built with the geom_boxplot() geom of ggplot2. See its basic usage in the first example below. Note that reordering groups is an important step toward a more insightful figure. Also, showing individual data points with jittering is a good way to avoid hiding the underlying distribution.
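As a minimal sketch of the basic usage, assuming the mpg dataset that ships with ggplot2 (the choice of variables is only illustrative):

```r
library(ggplot2)

# Basic boxplot: highway mileage (hwy) for each vehicle class
# in the mpg dataset bundled with ggplot2
p <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

print(p)
```

Each box summarizes one level of the variable mapped to x.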
An overview of the boxplot options offered by ggplot2 to customize chart appearance.
Control group order
Changing group order in a boxplot is a crucial step. Learn why and discover 3 methods to do so.
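Two possible ways to control group order are sketched here, assuming the mpg dataset from ggplot2. They are illustrations only and not necessarily the methods covered in the linked post; the manual order in the second one is arbitrary.

```r
library(ggplot2)

# Approach 1: reorder() sorts the classes by their median hwy
p_median <- ggplot(mpg, aes(x = reorder(class, hwy, FUN = median), y = hwy)) +
  geom_boxplot() +
  xlab("class")

# Approach 2: impose an explicit order through the factor levels
mpg_manual <- mpg
mpg_manual$class <- factor(
  mpg_manual$class,
  levels = c("2seater", "subcompact", "compact", "midsize",
             "minivan", "suv", "pickup")
)
p_manual <- ggplot(mpg_manual, aes(x = class, y = hwy)) +
  geom_boxplot()

print(p_median)
print(p_manual)
```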
Several examples showing the most common color customizations: uniform, discrete, using ColorBrewer, viridis and more.
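The color options listed above could look like the following sketch. It assumes the mpg dataset from ggplot2; the specific colors and the "Set2" palette are arbitrary choices.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = class, y = hwy))

# Uniform: one fill color for every box
p_uniform <- base +
  geom_boxplot(fill = "steelblue", color = "grey20", alpha = 0.7)

# Discrete: one fill color per group, default palette
p_discrete <- base + geom_boxplot(aes(fill = class))

# Same mapping, with a ColorBrewer palette
p_brewer <- p_discrete + scale_fill_brewer(palette = "Set2")

# Same mapping, with the discrete viridis palette
p_viridis <- p_discrete + scale_fill_viridis_d()

print(p_brewer)
```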
Highlight a group
Learn how to highlight a group on your chart to convey your message more efficiently.
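One way to highlight a group is to flag it in a helper column and map that column to the fill color. The sketch below assumes the mpg dataset; the helper column name `highlight` and the choice of the "suv" class are hypothetical.

```r
library(ggplot2)

# Flag the group of interest in a helper column
mpg_flag <- mpg
mpg_flag$highlight <- ifelse(mpg_flag$class == "suv", "highlighted", "other")

# Map the flag to fill and give the flagged group a contrasting color
p <- ggplot(mpg_flag, aes(x = class, y = hwy, fill = highlight)) +
  geom_boxplot() +
  scale_fill_manual(values = c(highlighted = "tomato", other = "grey80")) +
  theme(legend.position = "none")

print(p)
```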
A grouped boxplot displays the distribution of several categories organized in groups and subgroups.
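A grouped boxplot can be sketched by mapping the group to x and the subgroup to fill. This example assumes the built-in ToothGrowth dataset, with dose as the group and supp as the subgroup.

```r
library(ggplot2)

# Groups on the x axis (dose), subgroups as fill colors (supp)
p <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  geom_boxplot() +
  xlab("dose")

print(p)
```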
Faceting in boxplot
An alternative to the grouped boxplot, where each group or subgroup is displayed in a distinct panel.
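The faceted version could look like this sketch, again assuming the built-in ToothGrowth dataset: one panel per dose, with the supp subgroups inside each panel.

```r
library(ggplot2)

# One panel per dose; within each panel, one box per supplement type
p <- ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot() +
  facet_wrap(~ dose)

print(p)
```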
Boxplot from continuous variable
How to build a boxplot with ggplot2 where categories are actually bins of a numeric variable.
Add mean value
Explains how to add the mean value on top of a boxplot (remember that a boxplot displays the median, not the mean).
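A minimal sketch of the binning idea, assuming the mpg dataset and the cut_width() helper from ggplot2. The column name `displ_bin` and the bin width of 1 are arbitrary.

```r
library(ggplot2)

# Cut the numeric variable displ into bins of width 1, starting at 1
mpg_binned <- mpg
mpg_binned$displ_bin <- cut_width(mpg_binned$displ, width = 1, boundary = 1)

# Each bin is now a category on the x axis
p <- ggplot(mpg_binned, aes(x = displ_bin, y = hwy)) +
  geom_boxplot() +
  xlab("Engine displacement (binned)")

print(p)
```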
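The mean can be overlaid with stat_summary(), as in this sketch. It assumes the built-in ToothGrowth dataset and ggplot2 3.3.0 or later, where the `fun` argument is available.

```r
library(ggplot2)

# The box shows the median; the red dot added on top shows the mean
p <- ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 20, size = 4, color = "red")

print(p)
```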
Add individual observations
The downside of a boxplot is that it hides information. You can reveal the distribution underlying each box by showing individual observations with jitter.
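A possible sketch with geom_jitter(), assuming the built-in ToothGrowth dataset. The jitter width and transparency are arbitrary.

```r
library(ggplot2)

# outlier.shape = NA hides the boxplot's own outlier dots, so that
# outliers are not drawn twice once the raw points are added
p <- ggplot(ToothGrowth, aes(x = supp, y = len)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2, height = 0, alpha = 0.6)

print(p)
```

Setting height = 0 keeps the jitter horizontal only, so the plotted y values stay equal to the data.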
Building a boxplot with base R is totally doable thanks to the boxplot() function. Here are a few examples of its use:
X axis labels on several lines
How to display the X axis labels on several lines: an application to boxplots, showing the sample size of each group.
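A minimal base R sketch of two-line labels, assuming the built-in iris dataset. The variable names are illustrative.

```r
# Sample size of each species in the built-in iris dataset
n_per_group <- table(iris$Species)

# "\n" splits each label over two lines: group name, then sample size
two_line_labels <- paste0(names(n_per_group), "\n(n = ",
                          as.vector(n_per_group), ")")

# Draw the boxplot without its default x axis, then add a custom one
boxplot(Sepal.Length ~ Species, data = iris,
        xaxt = "n", xlab = "", ylab = "Sepal length")
axis(side = 1, at = seq_along(two_line_labels),
     labels = two_line_labels, padj = 0.5)
```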
Boxplot with jitter
Show individual observations on top of boxes, with jittering to avoid dot overlap.
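In base R, one option is to overlay stripchart() on the boxplot, as in this sketch. It assumes the built-in ToothGrowth dataset; the jitter amount and point color are arbitrary.

```r
# Boxes without their own outlier dots; ylim covers the full data range
# so that every raw point added afterwards stays visible
bp <- boxplot(len ~ supp, data = ToothGrowth, outline = FALSE,
              ylim = range(ToothGrowth$len), ylab = "Tooth length")

# Overlay the individual observations, jittered horizontally
stripchart(len ~ supp, data = ToothGrowth, method = "jitter", jitter = 0.15,
           vertical = TRUE, pch = 19, col = rgb(0, 0, 0, 0.4), add = TRUE)
```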
Grouped and ordered boxplot
How to build a grouped boxplot (groups & subgroups) and order it by increasing median.
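One way to sketch this in base R is to combine group and subgroup into a single factor and reorder it by median. The example assumes the built-in ToothGrowth dataset; the helper column `group` and the colors are hypothetical.

```r
tg <- ToothGrowth

# Combine subgroup (supp) and group (dose) into one factor
tg$group <- interaction(tg$supp, tg$dose, sep = " / ")

# Reorder the combined factor by increasing median of len
tg$group <- reorder(tg$group, tg$len, FUN = median)

# Color the boxes by supplement type so the subgroups remain readable
box_colors <- ifelse(grepl("^OJ", levels(tg$group)), "orange", "grey70")
bp <- boxplot(len ~ group, data = tg, col = box_colors,
              las = 2, xlab = "", ylab = "Tooth length")
```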
Boxplot with labels on top
Add labels on top of each category to display custom information like category sample size.
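A minimal sketch using text(), assuming the built-in iris dataset. boxplot() returns the per-group sample sizes in its `n` element, which are used here as the labels; the y position 8.3 is an arbitrary choice above the data.

```r
# Leave room above the boxes for the labels through ylim
bp <- boxplot(Sepal.Length ~ Species, data = iris,
              ylim = c(4, 8.5), ylab = "Sepal length")

# Write the sample size of each group above its box
text(x = seq_along(bp$n), y = 8.3, labels = paste("n =", bp$n))
```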
A Tukey test compares the means of all pairs of categories. Here is how to perform it and represent its result on a boxplot.
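A simplified sketch with aov() and TukeyHSD() from base R, assuming the built-in iris dataset. Listing the significant pairs above the plot is just one possible representation, not necessarily the one used in the linked post, and the 0.05 threshold is an assumption.

```r
# One-way ANOVA followed by Tukey's honest significant difference test
fit <- aov(Sepal.Length ~ Species, data = iris)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# One row per pair of groups; columns: diff, lwr, upr, p adj
pairs_table <- tukey$Species
significant <- rownames(pairs_table)[pairs_table[, "p adj"] < 0.05]

# Boxplot annotated with the pairs whose means differ significantly
boxplot(Sepal.Length ~ Species, data = iris, ylab = "Sepal length")
mtext(paste("Significantly different pairs:",
            paste(significant, collapse = ", ")),
      side = 3, line = 0.5, cex = 0.8)
```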
Box type around plot
Learn how the bty argument of the par() function lets you customize the box around a base R plot.
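The effect of bty can be sketched by drawing the same boxplot with a few box types, assuming the built-in iris dataset. Only four of the accepted values are shown.

```r
# Save the current settings and split the device into a 2 x 2 grid
old_par <- par(mfrow = c(2, 2), bty = "o")

# "o" = full box, "l" = left and bottom, "7" = top and right, "n" = no box
for (type in c("o", "l", "7", "n")) {
  par(bty = type)
  boxplot(Sepal.Length ~ Species, data = iris,
          main = paste0('bty = "', type, '"'))
}

# Restore the settings saved above
par(old_par)
```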