ggplot2 boxplot from continuous variable

A boxplot summarizes the distribution of a continuous variable. This post explains how to build a boxplot with ggplot2 where the categories are bins of a numeric variable, which is useful for studying the relationship between 2 numeric variables.


Let’s say we want to study the relationship between 2 numeric variables. It is possible to cut one of them into bins and to use the resulting groups to build a boxplot.

Here, the numeric variable called carat from the diamonds dataset is cut into bins of width 0.5 thanks to the cut_width function. Then, we just need to map the newly created variable to the X axis of ggplot2.
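
The steps described above can be sketched as follows. This is a minimal example, not necessarily the exact code behind the original chart: the choice of price for the Y axis, the boundary = 0 argument, the fill color, and the axis label are assumptions made for illustration.

```r
# Load ggplot2, which provides the diamonds dataset and cut_width()
library(ggplot2)

# Copy the dataset and add a factor holding the carat bins.
# width = 0.5 gives bins of length 0.5; boundary = 0 (an assumption)
# aligns the bin edges on 0, 0.5, 1, ...
binned <- diamonds
binned$bin <- cut_width(binned$carat, width = 0.5, boundary = 0)

# Map the binned variable to the X axis.
# price on the Y axis is an assumption: any numeric column would work.
p <- ggplot(binned, aes(x = bin, y = price)) +
  geom_boxplot(fill = "#69b3a2") +
  xlab("Carat")

print(p)
```

Each box then shows the distribution of the Y variable within one carat bin, so the trend across boxes gives a rough picture of how the two numeric variables relate.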

This document is a work by Yan Holtz. Any feedback is highly encouraged. You can file an issue on Github, drop me a message on Twitter, or send an email.
