This page describes how to build a basic interactive Sankey diagram with the networkD3 library. A Sankey diagram represents flows going from one node to another. Two input formats are commonly used to store this kind of information: a connection data frame and an incidence matrix. In this post I describe how to build a Sankey diagram from each of these two types of input.


1 – Connection data frame

A connection data frame lists all the connections one by one, with one row per connection. Usually it has a ‘source’ and a ‘target’ column. Optionally, you can add a column that gives further information about each connection, such as the value of the flow. This is the format the networkD3 library expects. Let’s build a connection data frame and represent it as a Sankey diagram:
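Here is a minimal sketch of that workflow. The group names and flow values are made up for illustration, and `IDsource` / `IDtarget` are helper column names chosen here, not something networkD3 requires. The one thing `sankeyNetwork()` does require is that sources and targets are given as zero-based indices into a node data frame, so the names must be converted first.

```r
library(networkD3)

# Connection data frame: one row per flow (illustrative values)
links <- data.frame(
  source = c("group_A", "group_A", "group_B", "group_C", "group_C", "group_E"),
  target = c("group_C", "group_D", "group_E", "group_F", "group_G", "group_H"),
  value  = c(2, 3, 2, 3, 1, 3),
  stringsAsFactors = FALSE
)

# Node data frame: every distinct name that appears as a source or a target
nodes <- data.frame(
  name = unique(c(links$source, links$target)),
  stringsAsFactors = FALSE
)

# networkD3 refers to nodes by position, starting at 0 (not 1 as in R)
links$IDsource <- match(links$source, nodes$name) - 1
links$IDtarget <- match(links$target, nodes$name) - 1

# Build the interactive diagram
p <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "IDsource", Target = "IDtarget",
  Value = "value", NodeID = "name",
  sinksRight = FALSE
)
p
```

Printing `p` in an interactive session or an R Markdown document displays the widget.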



2 – Incidence matrix

An incidence matrix is square or rectangular. The names of its rows and columns are the names of the nodes. The entry in row x and column y represents the flow from x to y. In the Sankey diagram we represent all the flows that are greater than 0. Since the networkD3 library expects a connection data frame, we will first convert the matrix to that format, and then reuse the code from above:
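The conversion can be sketched as follows, using base R only. The matrix `flows` and its values are invented for this example. `as.data.frame(as.table(...))` reshapes the matrix into one row per cell, and the zero flows are then dropped.

```r
library(networkD3)

# Illustrative rectangular incidence matrix: rows are sources, columns are targets
flows <- matrix(
  c(5, 0, 2, 0,
    0, 3, 0, 4,
    1, 0, 0, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("W", "X", "Y", "Z"))
)

# Reshape to long format: one row per (row name, column name, value) cell
links <- as.data.frame(as.table(flows), stringsAsFactors = FALSE)
names(links) <- c("source", "target", "value")

# Keep only the flows that are greater than 0
links <- links[links$value > 0, ]

# From here on, same steps as for a connection data frame
nodes <- data.frame(
  name = unique(c(links$source, links$target)),
  stringsAsFactors = FALSE
)
links$IDsource <- match(links$source, nodes$name) - 1
links$IDtarget <- match(links$target, nodes$name) - 1

p <- sankeyNetwork(
  Links = links, Nodes = nodes,
  Source = "IDsource", Target = "IDtarget",
  Value = "value", NodeID = "name",
  sinksRight = FALSE
)
p
```

With a square matrix whose rows and columns share the same names, a node can be both a source and a target. The same code applies, but flows that loop back on themselves may not render cleanly in a Sankey layout.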






