A dendrogram, or tree diagram, illustrates the hierarchical organisation of several entities. For example, it is often used to draw family trees. It consists of a root node that gives rise to several nodes, which in turn end in leaf nodes (the bottom of the tree). A dendrogram can be built from 2 types of dataset: i/ a numeric matrix where several variables describe the features of individuals, so that we can calculate the distance between individuals and cluster them; ii/ a hierarchical dataset where the relationship between entities is provided directly. These 2 cases are described below. Note that for clustering, it is good practice to also provide the corresponding heat map, which illustrates the structure of the data.
Dendrogram after clustering
This section is for you if you want to study the structure of your samples. If you have a numeric matrix, you can calculate a distance between each pair of samples using the dist() or the cor() function (a correlation must first be turned into a distance, for example 1 - correlation). The hclust() function then clusters the samples. Finally, R's plot() function recognizes the resulting object and draws a basic tree.
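The workflow above can be sketched in base R as follows. The choice of the built-in mtcars dataset, the 4 selected variables, and the "complete" linkage method are assumptions made for illustration; any numeric matrix and any linkage supported by hclust() would do.

```r
# Illustrative data: 10 cars described by 4 numeric variables (assumed example)
data <- scale(mtcars[1:10, c("mpg", "hp", "wt", "qsec")])

# Option 1: Euclidean distance between each pair of samples (rows)
d <- dist(data, method = "euclidean")

# Option 2: correlation between samples, converted to a distance object
d_cor <- as.dist(1 - cor(t(data)))

# Hierarchical clustering on the distance object
hc <- hclust(d, method = "complete")

# plot() recognizes the hclust object and draws a basic dendrogram
plot(hc, main = "Basic dendrogram", xlab = "", sub = "")

# Companion heat map: heatmap() reorders rows and columns by clustering
heatmap(data, scale = "none")
```

Swapping d for d_cor in the hclust() call clusters the samples on correlation instead of Euclidean distance.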
For further customization, the dendextend library is probably what you want.
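As a minimal sketch of what such customization can look like, the example below colours branches by cluster and shrinks the labels. It assumes the dendextend package is installed; the mtcars data, the number of clusters (k = 3) and the label size are arbitrary choices for illustration.

```r
library(dendextend)

# Same illustrative data as a basic clustering example (assumed, not prescribed)
data <- scale(mtcars[1:10, c("mpg", "hp", "wt", "qsec")])

# dendextend works on dendrogram objects, so convert the hclust result
dend <- as.dendrogram(hclust(dist(data)))

# Colour the branches according to a cut of the tree into 3 clusters
dend <- dendextend::set(dend, "branches_k_color", k = 3)

# Reduce the size of the leaf labels
dend <- dendextend::set(dend, "labels_cex", 0.7)

plot(dend, main = "Dendrogram customized with dendextend")
```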
Dendrogram from hierarchical data
Hierarchical data are usually stored as an edge list data frame or a nested data frame. In both cases, I strongly advise using the ggraph library to build your dendrogram. It provides all the customization you need and lets you quickly try related visualizations such as circle packing, treemap or network diagrams.
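Here is a minimal sketch of the edge list case, assuming the ggraph and igraph packages are installed. The edge list itself (a root with 2 groups of 2 leaves each) is a made-up example; only the from/to structure matters.

```r
library(igraph)
library(ggraph)

# Hypothetical edge list: each row links a parent (from) to a child (to)
edges <- data.frame(
  from = c("root", "root", "A", "A", "B", "B"),
  to   = c("A", "B", "A1", "A2", "B1", "B2")
)

# Build a graph object from the edge list
graph <- graph_from_data_frame(edges)

# Draw it with the dendrogram layout
p <- ggraph(graph, layout = "dendrogram", circular = FALSE) +
  geom_edge_diagonal() +
  geom_node_point() +
  geom_node_text(aes(label = name), vjust = -1) +
  theme_void()

print(p)
```

A nested data frame can be handled the same way once it has been reshaped into such an edge list.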
The collapsibleTree library is an alternative if you want to build an interactive tree (click on a node to unfold the tree). It is really handy to insert in an R Markdown document or a Shiny application.
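A minimal sketch of such an interactive tree is shown below, assuming the collapsibleTree package is installed. It uses the warpbreaks dataset that ships with R; the choice of dataset and of the 3 hierarchy columns is only for illustration.

```r
library(collapsibleTree)

# Each column listed in `hierarchy` becomes one level of the tree,
# from the top level (wool) down to the leaves (breaks)
tree <- collapsibleTree(
  warpbreaks,
  hierarchy = c("wool", "tension", "breaks")
)

# Printing the widget opens it in the viewer or browser;
# in R Markdown or Shiny it is embedded in the page
tree
```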