Clarifying label placement with ggrepel
Bioinformatics datasets often contain many thousands of data points, such as genomic positions or genes within a genome, and as part of our data analysis we frequently want to label some of them so that the reader can identify them. The problem is that labels easily overlap or clash in a plot. The ggrepel package provides geoms for ggplot2 that position labels much more clearly, using layout algorithms that make labels repel one another and the data points, with connecting lines drawn back to the points they describe. In this recipe, we’ll look at the most important options for applying these geoms to a genomics dataset.
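To make the difference concrete, here is a minimal sketch that contrasts plain text labels with repelled ones. It uses the built-in mtcars dataset as a stand-in rather than genomic data, and the object names (cars, base_plot, overlapping, repelled) are illustrative only.

```r
library(ggplot2)
library(ggrepel)

# mtcars ships with R; the car names serve as stand-in labels
cars <- data.frame(
  name = rownames(mtcars),
  wt   = mtcars$wt,
  mpg  = mtcars$mpg
)

base_plot <- ggplot(cars, aes(x = wt, y = mpg, label = name)) +
  geom_point()

# Plain text labels are drawn exactly at the point coordinates, so they can overlap
overlapping <- base_plot + geom_text()

# Repelled labels are moved away from each other and from the points;
# seed fixes the randomised layout, max.overlaps = Inf keeps every label
repelled <- base_plot + geom_text_repel(seed = 42, max.overlaps = Inf)

print(repelled)
```

Printing overlapping and repelled side by side shows why the repelled layout is usually easier to read once points cluster together.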
Getting ready
We’ll need the ggplot2 and ggrepel packages and the fission yeast gene expression dataset in the rbioinfcookbook data package. This data frame contains the yeast gene ID in one column, along with the log2 fold change in expression for that gene and the p-value from a statistical test.
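If the data package is not to hand, a simulated data frame with the same shape is enough to experiment with. The sketch below is an assumption-laden stand-in: the column names (id, log2fc, pvalue), the random values, and the labelling thresholds are all invented for illustration and are not taken from the real dataset.

```r
library(ggplot2)
library(ggrepel)

# Simulated stand-in for the expression data; column names are assumptions
set.seed(1)
toy_ge <- data.frame(
  id     = sprintf("gene%03d", 1:200),
  log2fc = rnorm(200, mean = 0, sd = 2),
  pvalue = runif(200)
)

# Label only the genes that pass both (arbitrary) thresholds
to_label <- subset(toy_ge, abs(log2fc) > 3 & pvalue < 0.2)

# Volcano-style plot: all genes as points, labels only for the chosen subset
volcano <- ggplot(toy_ge, aes(x = log2fc, y = -log10(pvalue))) +
  geom_point(colour = "grey60") +
  geom_text_repel(data = to_label, aes(label = id), seed = 1)

print(volcano)
```

Passing a filtered data frame to the label layer keeps the number of labels manageable while every gene still appears as a point.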