This page explains how to build a dotplot histogram with R, ggplot2 and plotly. This type of visualization is a specific kind of histogram: it shows the distribution of a numeric variable, but each individual observation is drawn as a dot instead of being aggregated into bars.
Interactivity is particularly useful for dotplot histograms: hovering over a data point reveals more information about the individual it represents.
The idea is to split the numeric variable into several bins, and to calculate a position on the Y axis for each individual observation within its bin. Once this information is available, geom_point can be used as if the chart were a scatterplot.
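To make the binning and stacking step concrete, here is a minimal sketch in base R on a small made-up vector. The values and the names `x`, `bin_width`, `bin` and `y` are illustrative assumptions, not part of the iris example, and a bin width of 0.5 is chosen so that the arithmetic is exact.

```r
# Toy values (hypothetical, not taken from iris), sorted in increasing order
x <- sort(c(1.2, 1.4, 1.7, 2.1, 2.3, 2.45))
bin_width <- 0.5

# Assign each value to the lower edge of its bin
bin <- floor(x / bin_width) * bin_width   # 1.0 1.0 1.5 2.0 2.0 2.0

# Within each bin, number the observations 1, 2, 3, ...
# This counter becomes the position on the Y axis
y <- ave(bin, bin, FUN = seq_along)       # 1 2 1 1 2 3

data.frame(x, bin, y)
```

Each row now has an X position (its bin) and a Y position (its rank inside the bin), which is all a scatterplot layer needs to stack the dots.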
# Libraries
library(ggplot2)
library(dplyr)
library(plotly)
# A classic histogram for the iris data set (left)
ggplot(iris, aes(x=Sepal.Length)) +
  geom_histogram()
# Transform the dataset a little to compute the dot positions
don = iris %>%
  mutate(id = row_number()) %>% # keep the original row number, since sorting changes the row order
  arrange(Sepal.Length) %>% # sort using the numeric variable that interests you
  mutate(var_rounded = floor(Sepal.Length / 0.2) * 0.2) %>% # This assigns each observation to a bin (lower edge of the bin). Here 0.2 is the size of the bin.
  mutate(y=ave(var_rounded, var_rounded, FUN=seq_along)) # This calculates the position on the Y axis: 1, 2, 3, 4...
# Make the plot (middle)
ggplot(don, aes(x=var_rounded, y=y) ) +
  geom_point( size=6, color="skyblue" )
# Improve the plot, and make it interactive (right)
don=don %>% mutate(text=paste("ID: ", id, "\n", "Sepal Length: ", Sepal.Length, "\n", "Species: ", Species, sep="" ))
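p=ggplot(don, aes(x=var_rounded, y=y) ) +
  geom_point( aes(text=text), size=6, color="skyblue" ) + # 'text' is not a ggplot2 aesthetic (expect a warning); ggplotly uses it for the tooltip
  xlab('Sepal Length') +
  ylab('# of individuals') +
  theme_classic() + # draws plain axis lines, so that removing the Y one below has an effect
  theme(
    axis.line.y = element_blank()
  )
# Use the magic of ggplotly to have an interactive version
ggplotly(p, tooltip="text")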
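The same preparation can be reused for other variables or bin sizes. The sketch below wraps it in a small helper written in base R only. `add_dot_positions` and its arguments are hypothetical names introduced here for illustration, not functions from ggplot2 or plotly.

```r
# Hypothetical helper: adds a bin column and a stack position column
# to a data frame, for any numeric variable and bin width
add_dot_positions <- function(data, var, bin_width) {
  # Sort by the numeric variable (base R keeps the original row names)
  data <- data[order(data[[var]]), , drop = FALSE]
  # Lower edge of the bin each observation falls into
  data$bin <- floor(data[[var]] / bin_width) * bin_width
  # Position on the Y axis: 1, 2, 3, ... within each bin
  data$y <- ave(data$bin, data$bin, FUN = seq_along)
  data
}

# Example: petal length of the iris data set, with bins of size 0.5
dots <- add_dot_positions(iris, "Petal.Length", bin_width = 0.5)
head(dots[, c("Petal.Length", "bin", "y")])
```

The result can be passed to ggplot with `aes(x=bin, y=y)` and geom_point, exactly like the `don` data frame above. In each bin, the highest `y` equals the number of observations, so the dot stacks have the same heights as the bars of a histogram built on the same bins.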