Introduction to ggplot2

Dennis Murphy, Ph.D.
November 4, 2014

This is the short form of the presentation, which focuses primarily on the examples in the accompanying script file. See the GitHub repo for the long-form presentation and the extended set of examples and exercises.

What is ggplot2?

  • An R package that produces static, 2D publication-quality graphics
  • An implementation of a theory of graphics developed by Leland Wilkinson in The Grammar of Graphics
  • ggplot2 implements a layered grammar of graphics
  • The concept of a layer is central to ggplot2

Ways to create ggplots

Two ways to produce a ggplot:

  • qplot()
  • ggplot()

We will only be concerned with the latter. For documentation on qplot(), see http://ggplot2.org/book/qplot.pdf

Components of the grammar of graphics

  • data
  • aesthetics (visual properties of a graphic)
  • geoms (geometric objects)
  • stats (statistical transformations)
  • faceting (panel plots by groups)
  • scales (positional, non-positional)

Components (cont.)

  • coordinate systems
  • annotation
  • positional adjustments
  • theming system (control of non-data aspects of a graphic)

Notes on aesthetics:

  • Categorized as positional (x- or y-variables) or non-positional
  • Can be mapped through aes() function or set to a single value
  • Mapped non-positional aesthetics generate legend or colorbar guides
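
The difference between mapping and setting an aesthetic can be sketched as follows. This is a minimal illustration, assuming the built-in mtcars data frame; the object names p_mapped and p_set are made up for the example.

```r
library(ggplot2)

# Mapped: colour is tied to a variable inside aes(), so each level of
# cyl gets its own colour and a legend guide is generated.
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(colour = factor(cyl)))

# Set: colour and size are constants given outside aes(); every point
# looks the same and no legend is produced.
p_set <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = "blue", size = 3)
```

Printing p_mapped or p_set at the console draws the plot.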

Layers

ggplot2 is defined as a “layered grammar of graphics”, which implies that layers are central to the package.

A layer consists of the following components of the grammar:

  • data
  • aesthetic mapping (of one or more variables to aesthetics)
  • stat
  • geom
  • position adjustment

Layers (cont.)

Sometimes, a component is implicit; for example, the typical default position adjustment is identity, which means no adjustment to the computed position.

Usually, one stat or geom specification is sufficient because each stat has a default geom and vice versa.

One can specify different data frames for different layers (USEFUL!!)
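
A short sketch of these points, assuming the built-in mtcars data frame. The summary data frame cyl_means and the plot object names are hypothetical, introduced only for illustration.

```r
library(ggplot2)

# geom_bar() uses a counting stat by default, and stat_count() uses
# geom_bar by default, so these two plots should be equivalent.
p_geom <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p_stat <- ggplot(mtcars, aes(x = factor(cyl))) + stat_count()

# A second data frame holding the mean wt and mpg for each value of cyl.
cyl_means <- aggregate(cbind(wt, mpg) ~ cyl, data = mtcars, FUN = mean)

# Layer 1 inherits mtcars from ggplot(); layer 2 supplies its own data
# frame but still inherits the x and y mappings.
p_two <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_point(data = cyl_means, colour = "red", size = 4)
```

Neither geom_point() call names a position adjustment, so both use the implicit identity position.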

General syntax of a layer

layer(geom, geom_params, stat, stat_params, data, mapping, position)

It is possible to declare ggplot2 layers explicitly with layer(), but this is rarely done in practice. The stat_xxx(), geom_xxx() and aes() functions do all of the work that layer() could, with fewer keystrokes.

A ggplot is typically generated by “adding” layers with the + operator. Each call to a geom_xxx() or stat_xxx() function generates a new graphical layer.
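
As a rough illustration of building a plot by adding layers, the sketch below starts from a base object with no layers and adds one geom layer and one stat layer. It assumes the built-in mtcars data frame; the choice of a linear fit is arbitrary.

```r
library(ggplot2)

# Base object: data and mappings only, no layers yet.
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Layer 1: the raw observations, drawn as points.
p <- p + geom_point()

# Layer 2: a fitted regression line computed by a stat function.
p <- p + stat_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Each + returns a new plot object, so intermediate versions can be saved and reused as the starting point for different plots.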

Base layer

Defined by invoking ggplot(). When its arguments are specified, the call

  • establishes the primary data frame
  • establishes the mapped aesthetics that are common to all subsequent layers

Examples:

  • ggplot(data = DF, mapping = aes(x, y, color = grp))
  • ggplot(data = DF)
  • ggplot()
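
The three forms above might be used as in the following sketch. DF here is a small made-up data frame with columns x, y and grp, constructed only so that the example runs.

```r
library(ggplot2)

# Hypothetical data frame for illustration.
DF <- data.frame(
  x   = 1:6,
  y   = c(2, 4, 3, 6, 5, 7),
  grp = rep(c("a", "b"), each = 3)
)

# Data and mappings in the base layer: both later layers inherit them.
p1 <- ggplot(data = DF, mapping = aes(x, y, color = grp)) +
  geom_point() +
  geom_line()

# Data only: each layer must supply its own mapping.
p2 <- ggplot(data = DF) +
  geom_point(aes(x, y))

# Empty base layer: each layer supplies both data and mapping.
p3 <- ggplot() +
  geom_point(data = DF, mapping = aes(x, y))
```

The empty form is convenient when every layer draws from a different data frame.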

ggplot2 syntax: general form

The primary functions in ggplot2 have the general form component_type(). Some examples:

  • geom_point()
  • stat_contour()
  • coord_polar()
  • position_jitter()
  • annotation_raster()
  • facet_wrap()
  • scale_y_continuous()

Transformations in ggplot2

Several types of transformations take place when constructing a ggplot, in the following order:

  • scale transformation (data units to physical units)
  • statistical transformation (stat functions)
  • coordinate transformation (coord functions)

Coordinate transformations are capable of changing the shape of geoms as well as the shape of positional axes.
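
The sketch below tries to make the ordering concrete, assuming the built-in mtcars data frame. In the first plot the scale transformation is applied before the stat, so the regression line is fit to log10(wt). In the second, the stat counts the cases first and the coordinate system then changes the shape of the bars when the plot is drawn.

```r
library(ggplot2)

# Scale transformation first: x is converted to log10 units, then the
# stat fits a straight line to the transformed values.
p_scale <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_x_log10()

# Coordinate transformation last: the bars are counted as usual, then
# polar coordinates bend each rectangle into a wedge.
p_polar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar(width = 1) +
  coord_polar()
```

The counts behind p_polar are the same as for an ordinary bar chart; only the drawn shape differs.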

Theme element functions

Each (complete) theme function defines about 35 properties, many of which are controlled by one of the following theme element functions:

  • element_text()
  • element_line()
  • element_rect()

The properties of a theme element can be set or redefined by one of the above element_*() functions. To erase the properties of a theme element, use the special function element_blank().
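
A minimal sketch of the element functions in use, assuming the built-in mtcars data frame. The particular colours and the object name my_theme are arbitrary choices for the example.

```r
library(ggplot2)

# Each theme element is modified with the element function that
# matches its type (text, line or rectangle).
my_theme <- theme(
  axis.title       = element_text(face = "bold", colour = "grey20"),
  panel.grid.major = element_line(colour = "grey80"),
  panel.background = element_rect(fill = "white", colour = "black"),
  panel.grid.minor = element_blank()   # remove minor grid lines entirely
)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  my_theme
```

Storing the theme() call in an object makes it easy to reuse the same settings across several plots.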

Inheritance and relative sizing

The theming system was overhauled in version 0.9.2. Two important new features were introduced in the revised system:

  • inheritance of theme properties
  • relative sizing of theme elements

Both of these features simplify code writing when defining a new theme function or modifying an existing theme function.
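
The two features can be sketched as below. The example modifies theme_grey(), and the colour and size values are arbitrary. calc_element() is used only to show what the x-axis title ends up with once inheritance is resolved.

```r
library(ggplot2)

my_theme <- theme_grey(base_size = 12) +
  theme(
    # Inheritance: text elements that do not set their own colour
    # take it from the root text element.
    text = element_text(colour = "navy"),
    # Relative sizing: 1.5 times the size inherited from the parent.
    axis.title = element_text(size = rel(1.5))
  )

# Resolve the inheritance chain for the x-axis title.
title_x <- calc_element("axis.title.x", my_theme)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  my_theme
```

Elements that set their own colour in the base theme, such as the axis tick labels in theme_grey(), keep that colour rather than inheriting the new one.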

Resources: Books

  • Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis.
  • Chang, W. (2013). R Graphics Cookbook.
  • Murrell, P. (2011). R Graphics (2nd ed.).
  • Wilkinson, L. (2005). The Grammar of Graphics.

Resources: URLs

Resources: help groups