From Data to Viz | Find the graphic you need

I’m delighted to announce a new dataviz project called ‘Data to Viz’.



What it is

From Data to Viz is a classification of chart types based on input data format. It comes in the form of a decision tree leading to a set of potentially appropriate visualizations to represent the dataset.


The decision trees are available as a poster that was presented at UseR in Brisbane.


The project is built on two underlying philosophies. First, that most data analysis can be summarized in about twenty different dataset formats. Second, that both data and context determine the appropriate chart.

Thus, our suggested method consists of identifying and trying all feasible chart types to find out which best suits your data and idea.

Once this set of graphics is identified, From Data to Viz aims to guide you toward the best decision.



Several sections are available in addition to the decision tree:
portfolio – an overview of all chart possibilities. For each, an extensive description is given, showing variations, pros and cons, common pitfalls and more.
stories – for each input data format, a real-life example is analyzed to illustrate the different chart types applicable to it.
caveat gallery – a list of common dataviz pitfalls, with suggested workarounds.


Link with R

From Data to Viz aims to give general advice on data visualization and does not target R users specifically.

However, all of the charts are made using R, mostly with ggplot2 and the tidyverse. The reproducible code snippets are always available. Most of the website is built with R Markdown, using a good number of hacks described here.

The website is tightly linked with the R graph gallery. Once you’ve identified the graphic that suits your needs, you will be redirected to the appropriate section of the gallery to get the R code in minutes.



Next step

From Data to Viz is still in beta, and a lot remains to be done. The caveat gallery is incomplete and some chart types are missing. Also, a few inaccuracies may be present in the decision tree. Last but not least, the English is horrible, and unfortunately this is not likely to change; I apologize for that.

If you spot a mistake or see a potential improvement, please file an issue on GitHub, contact me by email or on Twitter, or leave a comment below. Any feedback will be very welcome.


A smooth transition between choropleth and cartogram

This post describes how to make a smooth transition GIF between a choropleth map and a cartogram. It starts with a basic map of Africa and then distorts country sizes using the cartogram library. ggplot2 is then used to make a nice choropleth version. Finally, the tweenr and gganimate libraries build a smooth transition between both maps. By the end of this post, you should obtain a .gif file showing this transition.

How to draw connecting routes on a map with R and great circles

This post explains how to draw connection lines between several locations on a map, using R. The method proposed here relies on the gcIntermediate function from the geosphere package. Instead of making straight lines, it draws the shortest routes, using great circles. Special care is taken with situations where cities are very far from each other and where the shortest connection thus passes behind the map.
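As a rough illustration of the idea, here is a minimal sketch. It is not the code from the full post: it assumes the geosphere package is installed, and the two cities and their approximate coordinates are picked only for the example.

```r
# Minimal sketch: a great-circle route between two cities with geosphere.
library(geosphere)

# Points are given as c(longitude, latitude); values are approximate
# and chosen only for illustration.
paris  <- c(2.35, 48.86)
sydney <- c(151.21, -33.87)

# 50 intermediate points along the shortest path, plus both end points.
route <- gcIntermediate(paris, sydney, n = 50, addStartEnd = TRUE)

# Empty canvas in longitude/latitude space, then the route and the cities.
plot(NA, xlim = c(-180, 180), ylim = c(-90, 90),
     xlab = "Longitude", ylab = "Latitude")
lines(route, col = "steelblue", lwd = 2)
points(rbind(paris, sydney), pch = 19)
```

For routes whose shortest path crosses the edge of the map, gcIntermediate has a breakAtDateLine argument that splits the route into two pieces, which can then be drawn separately.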

4 tricks for working with R, Leaflet and Shiny

I recently worked on a dataviz project involving Shiny and the Leaflet library. In this post I give 4 handy tricks we used to improve the app: 1/ how to use Leaflet's native widgets 2/ how to trigger an action when the user clicks on the map 3/ how to add a search bar to your map 4/ how to offer a “geolocalize me” button. For each trick, a reproducible code snippet is provided, so you just have to copy and paste it to reproduce the image.

The Wordcloud2 library

A word cloud (or tag cloud) is a visual representation of text data. Tags are usually single words, and the importance of each tag is shown with font size or color. This mode of representation is useful for quickly perceiving the most prominent terms in a list and determining their relative prominence. In R, two libraries let you create word clouds: Wordcloud and Wordcloud2. In this post I will introduce the main features of the awesome Wordcloud2 library developed by Chiffon Lang.
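To give a first idea, here is a minimal sketch, assuming the package is installed (it is loaded under the lowercase name wordcloud2). The words and frequencies are made up for the example.

```r
# Minimal sketch: a word cloud from a small, made-up frequency table.
library(wordcloud2)

# wordcloud2() expects a data frame with the words in the first column
# and their frequencies in the second one.
words <- data.frame(
  word = c("ggplot2", "tidyverse", "shiny", "leaflet", "cartogram", "tweenr"),
  freq = c(50, 40, 30, 25, 15, 10)
)

# Build the widget; font size reflects the frequency of each word.
cloud <- wordcloud2(words, size = 1, color = "random-dark")
```

In an interactive session, typing cloud at the console displays the widget in the viewer or the browser.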