A bubble plot is a close relative of the scatterplot. It shows the relationship between **2 numerical variables** and adds a **third dimension**: a third variable is mapped to the size of the points, which brings additional information to the plot.

Fortunately, ggplot2 makes it easy to link a column of a data frame to the size of the dots. You just have to add it to the **aesthetic** mapping (`aes()`) used by **geom_point()**. Here we start with a basic scatterplot representing the relationship between 2 columns of the diamonds data set: ‘carat’ (the weight of the diamonds) and ‘price’. Then, we map the depth to the size:

```r
# Libraries
library(tidyverse)

# Let's use the diamonds data set (provided by ggplot2), keeping a random sample of 200 rows
data <- diamonds %>% sample_n(200)

# A basic scatterplot = relationship between 2 values:
ggplot(data, aes(x = carat, y = price)) +
  geom_point()

# Now we see there is a link between carat and price
# But what if we want to know about depth at the same time?
ggplot(data, aes(x = carat, y = price, size = depth)) +
  geom_point(alpha = 0.2)
```

Once you understand this concept, you will probably want to control the size of the smallest and biggest bubbles. This is done with the **scale_size_continuous()** function and its `range` argument. Note that you can even apply a transformation to the size variable, which is handy if you want to highlight the highest values, for example:

```r
# You probably want to control the minimum and maximum size of the points:
ggplot(data, aes(x = carat, y = price, size = depth)) +
  geom_point(alpha = 0.2) +
  scale_size_continuous(range = c(0.5, 16))

# Note that you can add a transformation to your size variable.
# For example, if you want to highlight very high values, you can use an exponential transformation.
# Available: "asn", "atanh", "boxcox", "exp", "identity", "log", "log10", "log1p", "log2", "logit", "probability", "probit", "reciprocal", "reverse" and "sqrt"
ggplot(data, aes(x = carat, y = price, size = depth)) +
  geom_point(alpha = 0.2) +
  scale_size_continuous(trans = "exp", range = c(1, 25))
```
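To get a feel for why an exponential transformation highlights the highest values, here is a small sketch in base R. It only illustrates the idea and is not what ggplot2 computes internally: the depth values are made up, and `rescale01()` is a hypothetical helper written for this example.

```r
# Hypothetical depth values (in %), chosen for illustration only
depth <- c(58, 60, 62, 64, 66)

# Hypothetical helper (not part of ggplot2): rescale a numeric vector linearly to [0, 1]
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Relative size without transformation: evenly spread between 0 and 1
linear <- rescale01(depth)

# Relative size after an exponential transformation:
# everything except the largest values is squeezed towards 0
exponential <- rescale01(exp(depth))

print(round(data.frame(depth, linear, exponential), 3))
```

With the transformation, only the deepest diamonds keep a large relative size, so they stand out among much smaller bubbles.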

Note that you can go even further and add a **fourth dimension** using **colour**. But be careful: if you put too much information on the same graphic, the message gets hard to read. You can also use colour to highlight one of the axes (in this case you don’t need to show it in the legend).

```r
# Colour repeats the x axis (carat), so its legend is hidden
ggplot(data, aes(x = carat, y = price, size = depth, color = carat)) +
  geom_point(alpha = 0.4) +
  scale_size_continuous(trans = "exp", range = c(1, 25)) +
  scale_colour_continuous(guide = "none")
```
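The example above uses colour to repeat the x axis. For a real fourth dimension, colour has to be mapped to another column. A minimal sketch, assuming the ‘cut’ column of the diamonds data set is the variable of interest (the seed and the size range are arbitrary choices for this example):

```r
library(ggplot2)
library(dplyr)

# Fix the seed so the random sample is reproducible (arbitrary value)
set.seed(1)
data <- diamonds %>% sample_n(200)

# x = carat, y = price, size = depth, colour = cut (a categorical column of diamonds)
p <- ggplot(data, aes(x = carat, y = price, size = depth, color = cut)) +
  geom_point(alpha = 0.4) +
  scale_size_continuous(range = c(1, 10))

p
```

Here the colour legend is kept, since it carries information that is not available anywhere else on the plot.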

Last but not least, don’t forget to have a look at the ggplot2 section of the gallery to learn how to customise your plot. Here is an example of the appearance you can get with a few additional lines of code:

```r
# Open a PNG device, draw the plot, then close the device to write the file
png("MyFigure.png", height = 800, width = 800)

ggplot(data, aes(x = carat, y = price, size = depth, color = carat)) +
  geom_point(alpha = 0.4) +
  scale_size_continuous(trans = "exp", range = c(4, 25)) +
  scale_colour_continuous(guide = "none") +
  xlab("weight of the diamond") +
  labs(size = "Depth in %") +
  theme_bw() +
  theme(
    text = element_text(size = 20),
    # Legend in the bottom right corner of the panel
    # (ggplot2 >= 3.5.0 prefers legend.position = "inside" with legend.position.inside)
    legend.position = c(.95, .05),
    legend.justification = c("right", "bottom"),
    panel.border = element_blank()
  )

dev.off()
```
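As an alternative to the `png()` / `dev.off()` pair, ggplot2 provides `ggsave()`. The sketch below is not part of the original example: it stores a simplified version of the plot in an object `p` (a name chosen here) and assumes that 8 x 8 inches at 100 dpi, which gives 800 x 800 pixels, is an acceptable substitute for the pixel sizes used above.

```r
library(ggplot2)
library(dplyr)

data <- diamonds %>% sample_n(200)

# Store the plot in an object instead of printing it directly
p <- ggplot(data, aes(x = carat, y = price, size = depth, color = carat)) +
  geom_point(alpha = 0.4) +
  scale_colour_continuous(guide = "none") +
  xlab("weight of the diamond") +
  labs(size = "Depth in %") +
  theme_bw()

# 8 in x 8 in at 100 dpi = 800 x 800 pixels
ggsave("MyFigure.png", plot = p, width = 8, height = 8, dpi = 100)
```

Unlike `png()`, `ggsave()` opens and closes the graphics device itself, so there is no `dev.off()` to forget.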
