The ggplot2 package is based on the principle that all plots consist of a few basic components: data, a coordinate system, and a visual representation of the data. In ggplot2, you build plots incrementally, starting with the data and coordinates you want to use and then specifying the graphical features: lines, points, bars, color, etc.

We will use three datasets in this tutorial: our main dataset on phage-host interactions, the diamonds dataset that ships with the ggplot2 package, and the carbon dioxide dataset (CO2) built into R.

For a list of preconfigured datasets, simply type data().

data()
dataset <- read.delim("phages.tsv")
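
The column names used in the rest of this tutorial (for example molGC....) contain runs of dots because read.delim() sanitizes headers that are not valid R names. The sketch below shows the effect on a toy file; the file, its headers, and its values are invented for illustration and only assume that the real headers resemble `molGC (%)`.

```r
# Toy TSV with headers that are not valid R names (hypothetical values)
toy_path <- tempfile(fileext = ".tsv")
writeLines(c(
        "Accession\tmolGC (%)\tGenome Length (bp)",
        "A1\t45.2\t40000",
        "B2\t51.7\t210000"
), toy_path)

# Default: check.names = TRUE replaces each invalid character with a dot
toy <- read.delim(toy_path)
names(toy)

# Keep the headers as written; such columns then need [[ ]] or backticks
toy_raw <- read.delim(toy_path, check.names = FALSE)
toy_raw[["molGC (%)"]]

str(toy) # Check the type of each column before plotting
```

Running names(dataset) on the real file is the quickest way to confirm the exact spelling of each column before using it in a plot.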

Base R Plotting

Before we begin, load the ggplot2 package for R. ggplot2 is a graphics package that provides powerful plotting capabilities beyond R’s base plotting functions. We won’t actually get into ggplot2 itself quite yet. This will be a basic introduction to plotting in R.

library(ggplot2)
dataset

Histograms

A histogram is a univariate plot (a plot that displays one variable) that groups a numeric variable into bins and displays the number of observations that fall within each bin. A histogram is a useful tool for getting a sense of the distribution of a numeric variable.

hist(dataset$Positive.Strand....)

Note: When you create a plot in a local RStudio environment, it will appear in the bottom right pane under the “plots” tab. Use the left and right arrows to cycle through the plots you’ve created.
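
The shape of a histogram depends on how many bins are used. As a small sketch on simulated data (the vector x below is made up and is not part of the phage dataset), the breaks argument suggests a number of bins, and the object returned by hist() records what was actually drawn.

```r
set.seed(1) # Reproducible toy data
x <- rnorm(500, mean = 50, sd = 10) # Hypothetical percentages

# breaks is a suggestion; hist() rounds to "pretty" cut points
h <- hist(x,
        breaks = 20,
        main = "Histogram with about 20 bins",
        xlab = "Value"
)

h$breaks # Bin edges actually used
h$counts # Number of observations in each bin
```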

Box Plots

Boxplots are another type of univariate plot for summarizing distributions of numeric data graphically.

boxplot(dataset$molGC....)

The central box of the boxplot spans the middle 50% of the observations, and the bar inside it marks the median. The whiskers (the bars at the ends of the dotted lines) extend to the most extreme observations that lie within 1.5 times the interquartile range of the box, so they enclose the great majority of the data. Circles that lie beyond the ends of the whiskers are data points that may be outliers.
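
The numbers behind a boxplot can be inspected with boxplot.stats(), which computes the same statistics that boxplot() draws. The toy vector below is invented so that one value falls beyond the whiskers.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30) # Toy data with one extreme value

s <- boxplot.stats(x) # Same statistics that boxplot() draws
s$stats # Lower whisker, lower hinge, median, upper hinge, upper whisker
s$out # Points drawn as circles beyond the whiskers

boxplot(x, main = "Toy boxplot")
```

Here the box runs from 3 to 8, so the upper whisker can reach at most 8 + 1.5 * 5 = 15.5. It stops at 9, the largest value inside that limit, and 30 is flagged as a possible outlier.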

One of the most useful features of the boxplot() function is the ability to make side-by-side boxplots. A side-by-side boxplot takes a numeric variable and splits it based on a categorical variable, drawing a separate boxplot for each level of the categorical variable.

boxplot(dataset$molGC.... ~ dataset$Molecule) # Plot GC content split by molecule type

Density Plots

A density plot shows the distribution of a numeric variable with a continuous curve. It is similar to a histogram, but because it has no discrete bins, a density plot can give a better picture of the underlying shape of a distribution.

plot(density(dataset$molGC....))
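
How smooth a density curve looks depends on its bandwidth. The sketch below uses simulated bimodal data (not the phage dataset) to overlay density curves on a histogram and to show the effect of the adjust argument of density().

```r
set.seed(42)
# Toy bimodal data
x <- c(rnorm(300, mean = 35, sd = 4), rnorm(200, mean = 55, sd = 5))

# freq = FALSE puts the histogram on the density scale so the curves are comparable
hist(x, breaks = 30, freq = FALSE, main = "Histogram with density curves", xlab = "Value")
lines(density(x), lwd = 2) # Default bandwidth
lines(density(x, adjust = 0.5), lty = 2) # Narrower bandwidth, less smoothing
lines(density(x, adjust = 2), lty = 3) # Wider bandwidth, more smoothing
```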

Bar Plots

Barplots are graphs that visually display counts of categorical variables.

dataset$Jumbophage <- ifelse(dataset$Jumbophage, "Jumbophage", "Not Jumbophage")
head(dataset$Jumbophage) # Preview the first few of the 18,406 recoded values
[1] "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage" "Jumbophage"

barplot(table(dataset$Molecule))

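Passing a two-way table to barplot() produces a stacked barplot: one bar per molecule type, split by jumbophage status.

barplot(table(dataset$Jumbophage, dataset$Molecule),
        legend.text = TRUE # Use the table's row names as legend labels; levels() is NULL for a character vector
)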

A grouped barplot is an alternative to a stacked barplot that gives each stacked section its own bar. To make a grouped barplot, create a stacked barplot and add the extra argument beside = TRUE.

barplot(table(dataset$Jumbophage, dataset$Molecule),
        legend.text = TRUE, # Use the table's row names as legend labels
        beside = TRUE # Group instead of stacking
)
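
To see what barplot() receives, it helps to look at the two-way table itself. The sketch below uses a made-up six-row data frame in place of the phage table. It also shows that levels() returns NULL for a character vector, so one option for the legend is legend.text = TRUE, which takes the labels from the row names of the table.

```r
# Toy data standing in for the phage table (values are made up)
toy <- data.frame(
        Molecule = c("DNA", "DNA", "DNA", "RNA", "RNA", "ss-DNA"),
        Jumbophage = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)
)
toy$Jumbophage <- ifelse(toy$Jumbophage, "Jumbophage", "Not Jumbophage")

counts <- table(toy$Jumbophage, toy$Molecule) # Rows = status, columns = molecule
counts

levels(toy$Jumbophage) # NULL: a character vector has no levels

barplot(counts, beside = TRUE, legend.text = TRUE) # Legend taken from rownames(counts)
```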

Scatterplots

Scatterplots are bivariate (two variable) plots that take two numeric variables and plot data points on the x/y plane.

plot(dataset$molGC...., dataset$Positive.Strand....)

# Semi-transparent points make dense regions easier to see
plot(dataset$molGC....,
        dataset$Positive.Strand....,
        col = rgb(red = 0, green = 0, blue = 0, alpha = 0.1)
)
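
The rgb() call in the second plot builds a color with an alpha (opacity) channel. As a small sketch on simulated data (the x and y vectors are invented), the result is an ordinary hex string that any base plotting function accepts through col.

```r
set.seed(7)
x <- rnorm(5000)
y <- x + rnorm(5000) # Toy correlated data with heavy overplotting

faint_black <- rgb(red = 0, green = 0, blue = 0, alpha = 0.1) # Black at 10% opacity
faint_black # Hex string with the alpha channel appended

plot(x, y, col = faint_black, pch = 16) # pch = 16 draws filled circles
```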

We can make plots more presentable by adding axis labels, a title, and custom colors.

barplot(table(dataset$Jumbophage, dataset$Molecule),
        legend.text = TRUE, # Use the table's row names as legend labels
        beside = TRUE,
        xlab = "Molecular Type",
        ylab = "Count",
        main = "Molecular Type, Grouped by Jumbophage",
        col = c("#E0ED87", "#718200") # One color per Jumbophage group
)

Using ggplot()

The ggplot() function creates plots incrementally in layers. Every ggplot starts with a call to ggplot(), which takes the data set to be used and the aesthetic mappings from variables in the data set to visual properties of the plot, such as x and y position.

# install.packages("tidyverse")
library(tidyverse)

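ggplot2 also provides qplot() ("quick plot"), a shortcut whose syntax is closer to base R plotting. We are not going to spend much time on qplot(), since the ggplot() syntax is at the heart of the package and qplot() is deprecated as of ggplot2 3.4.0. Let’s look at one qplot for illustrative purposes and then move on.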

library(ggplot2)

qplot(
        x = carat, # x variable
        y = price, # y variable
        data = diamonds, # Data set
        geom = "point", # Plot type
        color = clarity, # Color points by variable clarity
        xlab = "Carat Weight", # x label
        ylab = "Price", # y label
        main = "Diamond Carat vs. Price"
)
Warning: `qplot()` was deprecated in ggplot2 3.4.0.

# Scatterplot of GC content for each accession
ggplot(
        dataset,
        aes(Accession, molGC....)
) +
        geom_point()

In the code above, we specify the data we want to work with and then assign the variables of interest, Accession and GC content (molGC....), to the x and y values of the plot. aes() is an aesthetics wrapper used in ggplot to map variables to visual properties. When you want a visual property to change based on the value of a variable, that specification belongs inside an aes() wrapper. If you are setting a fixed value that doesn’t change based on a variable, it belongs outside of aes().

Note: Add a new element to a plot by putting a “+” after the preceding element.
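
The difference between mapping a variable inside aes() and setting a fixed value outside it can be sketched with the diamonds dataset. The object names (small, p_map, p_set, p_oops) are chosen here for illustration only.

```r
library(ggplot2)

small <- diamonds[seq(1, nrow(diamonds), by = 50), ] # Thin the data to keep plots fast

# Mapping: colour varies with the clarity variable, so it goes inside aes()
p_map <- ggplot(small, aes(carat, price)) +
        geom_point(aes(colour = clarity))

# Setting: every point gets the same fixed colour, so it goes outside aes()
p_set <- ggplot(small, aes(carat, price)) +
        geom_point(colour = "steelblue")

# Common mistake: a constant inside aes() is treated as a one-level variable,
# so the points get a default palette colour and a legend entry labelled "steelblue"
p_oops <- ggplot(small, aes(carat, price)) +
        geom_point(aes(colour = "steelblue"))

p_map
p_set
p_oops
```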

The layers you add determine the type of plot you create. In this case, we used geom_point(), which simply draws the data as points at the specified x and y coordinates, creating a scatterplot. ggplot2 has a wide range of geoms to create different types of plots. Here is a list of geoms for all the plot types we covered earlier, plus a few more:

geom_histogram() # histogram
geom_density() # density plot
geom_boxplot() # boxplot
geom_violin() # violin plot (combination of boxplot and density plot)
geom_bar() # bar graph
geom_point() # scatterplot
geom_jitter() # scatterplot with points randomly perturbed to reduce overlap
geom_line() # line graph
geom_errorbar() # Add error bar
geom_smooth() # Add a best-fit line
geom_abline() # Add a line with specified slope and intercept
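
Because the geom determines the plot type, the same data and mapping can be reused with different geoms, and several geoms can be stacked in one plot. A minimal sketch using the diamonds dataset (the name base is chosen here for illustration):

```r
library(ggplot2)

base <- ggplot(diamonds, aes(x = price)) # Data and mapping only; no layer yet

base + geom_histogram(bins = 40) # Same data drawn as a histogram
base + geom_density() # Same data drawn as a density curve

# Layers stack: a narrow boxplot drawn on top of a violin for each cut
ggplot(diamonds, aes(x = cut, y = price)) +
        geom_violin() +
        geom_boxplot(width = 0.1, outlier.shape = NA) # Hide outlier points
```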

Notice that the scatterplot we made above has no coloring. We can map the color of each point to its molecule type.

ggplot(dataset, aes(Accession, molGC...., colour = Molecule)) +
        geom_point(alpha = 0.5)

ggplot(dataset, aes(Accession, molGC....)) +
        geom_point(aes(color = Molecule), alpha = 0.2)

We pass alpha in as an argument outside of the aes() mapping because we are setting alpha to a fixed value instead of mapping it to a variable.

By setting alpha to 0.2, each data point has 80% transparency. At such high transparency, single data points are hard to see, but it lets us focus on high-density areas.

dataset %>%
        ggplot(aes(Genome.Length..bp., molGC....)) +
        geom_point(aes(colour = Molecule), alpha = 0.5) +
        labs(x = "Genome Length", y = "GC Content")

geom_smooth()

This section illustrates geom_smooth() using the CO2 dataset built into R.

In ggplot2, the geom_smooth() function is used to add a smooth line or curve to a plot. It is commonly used to visualize the trend or relationship between variables.

sample_DataSet <- CO2
sample_DataSet # 84 rows; only the rows for the first plant (Qn1) are shown below
Grouped Data: uptake ~ conc | Plant
   Plant        Type  Treatment conc uptake
1    Qn1      Quebec nonchilled   95   16.0
2    Qn1      Quebec nonchilled  175   30.4
3    Qn1      Quebec nonchilled  250   34.8
4    Qn1      Quebec nonchilled  350   37.2
5    Qn1      Quebec nonchilled  500   35.3
6    Qn1      Quebec nonchilled  675   39.2
7    Qn1      Quebec nonchilled 1000   39.7
[ rows 8 to 84 omitted: the remaining plants cover both Types (Quebec, Mississippi) and both Treatments (nonchilled, chilled) ]

ggplot(sample_DataSet, aes(conc, uptake, colour = Treatment)) +
        geom_point(size = 3, alpha = 0.5) +
        geom_smooth(method = "lm", se = FALSE)
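
With a linear-model method and colour mapped to Treatment, geom_smooth() fits one straight line per treatment group. As a rough sketch of the equivalent fits outside ggplot2 (the names co2 and fits are chosen here for illustration):

```r
co2 <- as.data.frame(CO2) # Plain data frame copy of the built-in dataset

# One straight-line fit per treatment, mirroring the colour grouping in the plot
fits <- lapply(split(co2, co2$Treatment), function(d) lm(uptake ~ conc, data = d))

sapply(fits, coef) # Intercept and slope for each treatment
```

Each column of the result holds the intercept and slope of the line drawn for that treatment.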

To split the plot further into panels based on a categorical variable in your data, such as Type, add facet_wrap().

ggplot(sample_DataSet, aes(conc, uptake, colour = Treatment)) +
        geom_point(size = 3, alpha = 0.5) +
        geom_smooth(method = "lm", se = FALSE) +
        facet_wrap(~Type) +
        labs(title = "Concentration of CO2") +
        theme_bw()

More Plot Examples

Now that we know the basics of creating plots with ggplot(), let’s remake some of the plots we created earlier in base R and see how they look in ggplot2, starting with a histogram.

Histograms

ggplot(data = dataset, aes(x = molGC....)) +
        geom_histogram(
                fill = "skyblue",
                col = "black"
        ) +
        labs(x = "GC Content")
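
When no bin setting is given, geom_histogram() falls back to 30 bins and prints a message suggesting that you pick a better value. A short sketch of the two ways to control binning, using the diamonds dataset (the names p_width and p_bins are chosen here for illustration):

```r
library(ggplot2)

# binwidth is in the units of the x variable; here each bar spans 0.1 carat
p_width <- ggplot(diamonds, aes(x = carat)) +
        geom_histogram(binwidth = 0.1, fill = "skyblue", colour = "black") +
        labs(x = "Carat", y = "Count")

# Alternatively, fix the number of bins and let ggplot2 choose the width
p_bins <- ggplot(diamonds, aes(x = carat)) +
        geom_histogram(bins = 50, fill = "skyblue", colour = "black")

p_width
p_bins
```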

Box Plots

ggplot(dataset, aes(Molecule, Genome.Length..bp.)) +
        geom_jitter(
                alpha = 0.05, # Add jittered data points
                color = "yellow"
        ) +
        geom_boxplot() +
        labs(
                title = "Genome Length of Different Molecular Types",
                y = "Genome Length"
        )

Density Plot

ggplot(data = dataset, aes(x = molGC....)) +
        geom_density(
                position = "stack", # Create a stacked density chart
                aes(fill = Molecule), # Fill based on molecule type
                alpha = 0.5 # Set transparency
        )

Multidimensional Plotting and Faceting

One of the most powerful aspects of plots is the ability to visually illustrate relationships between 3 or more variables. When we create a plot, each different dimension (variable) needs to map to a different perceptual feature (aesthetic) such as x position, y position, symbol, size or color. Making use of several of these aesthetics at once lets us make plots involving many dimensions. We’ve already seen some examples of multidimensional plots, such as the first scatterplot in this lesson that displayed carat weight and price colored by clarity.

Faceting is another way to add an extra dimension to a plot. Faceting breaks a plot up based on a factor variable and draws a different plot for each level of the factor. You can create a faceted plot in ggplot2 by adding a facet_wrap() layer.

ggplot(data = diamonds, aes(x = carat, y = price)) + # Initialize plot
        geom_point(aes(color = color), # Color based on diamond color
                alpha = 0.5
        ) +
        facet_wrap(~clarity) + # Facet on clarity
        geom_smooth() + # Add an estimated fit line
        theme(
                legend.position = "inside", # Needs ggplot2 3.5.0 or later
                legend.position.inside = c(0.85, 0.16) # Set legend position
        )

Scales

Scales are parameters in ggplot2 that determine how a plot maps values to visual properties (aesthetics). If you don’t specify a scale for an aesthetic, the plot will use a default scale. For instance, the plots we split on color all used a default color scale. You can specify custom scales by adding scale elements to your plot. Scale elements have the following structure:

scale_aesthetic_scaletype()

For example, to set the fill colors of bars manually in ggplot2, as we did with the col argument in the base R grouped barplot above, the scale to use is:

scale_fill_manual()
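
As a sketch of the naming pattern in use, the bar chart below sets the fill colors by hand and renames the y axis. It uses the diamonds dataset; the cut_colours vector and the color choices are illustrative assumptions, not values from the phage analysis.

```r
library(ggplot2)

# Named vector: each level of cut is matched to a colour by name
cut_colours <- c(
        "Fair" = "#F5FCC2", "Good" = "#E0ED87", "Very Good" = "#CCDE57",
        "Premium" = "#94A813", "Ideal" = "#718200"
)

p_scale <- ggplot(diamonds, aes(x = cut, fill = cut)) +
        geom_bar() +
        scale_fill_manual(values = cut_colours) + # aesthetic = fill, scale type = manual
        scale_y_continuous(name = "Number of diamonds") # aesthetic = y, scale type = continuous
p_scale
```

Naming the values is optional, but it keeps each color attached to the intended level if the order of the levels ever changes.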

Let’s make a new scatterplot with several aesthetic properties and alter some of the scales.

ggplot(data = diamonds, aes(x = carat, y = price)) + # Initialize plot
        geom_point(aes(
                size = carat, # Size points based on carat
                color = color, # Color based on diamond color
                alpha = clarity # Set transparency based on clarity
        )) +
        scale_color_manual(values = c(
                "#FFFFFF", "#F5FCC2", # Use manual color values
                "#E0ED87", "#CCDE57",
                "#B3C732", "#94A813",
                "#718200"
        )) +
        scale_alpha_manual(values = c(
                0.1, 0.15, 0.2, # Use manual alpha values
                0.3, 0.4, 0.6,
                0.8, 1
        )) +
        scale_size_identity() + # Set size values to the actual values of carat
        xlim(0, 2.5) + # Limit x-axis
        theme(panel.background = element_rect(fill = "#7FB2B8")) + # Change background color
        theme(legend.key = element_rect(fill = "#7FB2B8")) # Change legend background color


  1. De La Salle University, Manila, Philippines

