A 2d density plot is useful for studying the relationship between 2 numeric variables when you have a huge number of points. To avoid overplotting (as in the scatterplot shown alongside), it divides the plot area into a multitude of small fragments and represents the number of points falling in each fragment.

There are several types of 2d density plots. Each has its own ggplot2 function. Let’s discover them:



2d Histogram

For a 2d histogram, the plot area is divided into a multitude of squares (it is a 2d version of the classic histogram). It is built with the geom_bin_2d function. This function offers a bins argument that controls the number of bins along each axis.
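Here is a minimal sketch of a 2d histogram. The data frame is synthetic and only there for illustration (the names `data`, `x` and `y`, and the value `bins = 70`, are assumptions, not part of the original post):

```r
library(ggplot2)

# Synthetic data (assumption for illustration): two correlated numeric variables
set.seed(1)
data <- data.frame(x = rnorm(10000))
data$y <- data$x + rnorm(10000)

# 2d histogram: `bins` sets the number of bins along each axis
p_bin <- ggplot(data, aes(x = x, y = y)) +
  geom_bin_2d(bins = 70) +
  theme_bw()

p_bin
```

Try a few values of `bins`: a small value gives a coarse picture, while a large value leaves many squares nearly empty.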




An alternative is to divide the plot area into a multitude of hexagons: this is called a hexbin plot, and it is made using the geom_hex function. This function provides the bins argument as well, to control the number of divisions per axis.
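A hexbin version could look like the sketch below. It assumes the hexbin package is installed, since geom_hex relies on it, and it again uses a synthetic data frame (`data`, `x`, `y` and `bins = 70` are illustrative choices):

```r
library(ggplot2)

# Synthetic data (assumption for illustration): two correlated numeric variables
set.seed(1)
data <- data.frame(x = rnorm(10000))
data$y <- data$x + rnorm(10000)

# Hexbin chart: same idea as the 2d histogram, with hexagons instead of squares.
# Requires the hexbin package to be installed.
p_hex <- ggplot(data, aes(x = x, y = y)) +
  geom_hex(bins = 70) +
  theme_bw()

p_hex
```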


2d Distribution

Just as you can plot a density chart instead of a histogram, it is possible to compute a 2d density estimate and represent it. ggplot2 offers several possibilities: you can show the contours of the distribution, fill the areas between them, or render the density as a raster.
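The three variants can be sketched as follows, on synthetic data (the data frame and object names are assumptions for illustration). The sketch uses `after_stat()`, which needs ggplot2 3.3.0 or later:

```r
library(ggplot2)

# Synthetic data (assumption for illustration): two correlated numeric variables
set.seed(1)
data <- data.frame(x = rnorm(2000))
data$y <- data$x + rnorm(2000)

base <- ggplot(data, aes(x = x, y = y))

# 1. Contour lines only
p_contour <- base + geom_density_2d()

# 2. Filled areas: polygons coloured by contour level
p_area <- base +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon")

# 3. Raster: one tile per grid cell, coloured by the estimated density
p_raster <- base +
  stat_density_2d(aes(fill = after_stat(density)),
                  geom = "raster", contour = FALSE) +
  scale_x_continuous(expand = c(0, 0)) +  # remove the margin around the raster
  scale_y_continuous(expand = c(0, 0))

p_contour
p_area
p_raster
```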


Customize the colour

Whether you use a 2d histogram, a hexbin or a 2d distribution, you can and should customize the colour of your chart. A good way to do so is to call an RColorBrewer palette using the scale_fill_distiller function.
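As a sketch, here is a 2d histogram recoloured with a ColorBrewer palette. The palette name `"Spectral"` and `direction = -1` are illustrative choices, and the data frame is synthetic:

```r
library(ggplot2)

# Synthetic data (assumption for illustration): two correlated numeric variables
set.seed(1)
data <- data.frame(x = rnorm(10000))
data$y <- data$x + rnorm(10000)

# scale_fill_distiller() interpolates a ColorBrewer palette for a continuous fill.
# direction = -1 reverses the default order of the colours.
p_col <- ggplot(data, aes(x = x, y = y)) +
  geom_bin_2d(bins = 70) +
  scale_fill_distiller(palette = "Spectral", direction = -1) +
  theme_bw()

p_col
```

The same scale can be added to the hexbin, filled-area and raster versions, since they all map a continuous value to `fill`.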




