All contents are licensed under CC BY-NC-ND 4.0.

1 Store graphics

File formats:

  • pdf(): ‘portable document format’
  • jpeg(): ‘joint photographic experts group’
  • tiff(): ‘tagged image file format’
  • png(): ‘portable network graphics’

Options:

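  • width: width of the graphic (in inches for pdf(); in pixels by default for jpeg(), tiff() and png())
  • height: height of the graphic (in inches for pdf(); in pixels by default for jpeg(), tiff() and png())
  • onefile: logical value (should several graphics be stored as separate pages in one file?)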

Usage:
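The usage example is not preserved here. A minimal sketch of the typical pattern (open a device, plot, close the device) could look as follows; the output path out_file is a hypothetical name and points to a temporary file.

```r
# Open a PDF device; width and height are given in inches.
out_file <- tempfile(fileext = ".pdf")   # hypothetical output path
pdf(out_file, width = 7, height = 5, onefile = TRUE)

plot(1:10, main = "First page")                            # first graphic
hist(c(1, 2, 2, 3, 3, 3, 4, 4, 5), main = "Second page")   # second page of the same file

dev.off()   # close the device, otherwise the file stays incomplete
```

The same pattern should work with jpeg(), tiff() and png(), which write one graphic per file.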

2 Generic plot-function plot(x, y, type, ...)

The type argument:

type         Plot element
type = "p"   Points (default value), scatter plot
type = "l"   Connecting line
type = "b"   Both (points and connecting lines), but not on top of each other
type = "o"   On top of each other (overplotted): points with connecting lines
type = "n"   Nothing, e.g. if you first create a grid with grid()
type = "s"   Step function
...          See also ?plot

2.1 Frequently used arguments with plot()

Argument                      Plot element
axes                          Should axes be drawn?
las = 1                       All tick labels horizontal
xlim, ylim                    Limits of the axes
xlab, ylab                    Labels of the axes
bty                           Type of box around the plot window
cex                           Size factor of the plot symbols
cex.axis, cex.lab, cex.main   Size factor of the tick labels, axis labels and main heading
col                           Color of the displayed data (see section on colors)
lty                           Line style (integer)
lwd                           Line width (real value, \(\geq 0\))
main                          Main heading
pch                           Symbol for points (integer)

2.2 Example plot()

(We use par(...) and colorspace::… here, but don’t be distracted; we will treat them later. For the moment: par(...) manipulates the arrangement of the plot on the ‘piece of paper’ that we have to draw on, and colorspace::... just helps us to find ‘good’(!) colors …)
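The original example is not preserved here. The following is only a sketch of what such an example might look like: it uses par() for the layout and, as a stand-in for colorspace, the base R function hcl.colors() so that it runs without extra packages. The data and the chosen palette are assumptions.

```r
# Four values of `type`, drawn on a 2 x 2 layout (made-up data).
x <- 1:10
y <- c(2, 4, 3, 6, 5, 8, 7, 9, 8, 10)
types <- c("p", "l", "b", "o")
cols <- hcl.colors(length(types), palette = "Dark 3")   # base R >= 3.6.0

old_par <- par(mfrow = c(2, 2), las = 1)   # 2 rows and 2 columns of plots
for (i in seq_along(types)) {
  plot(x, y, type = types[i], col = cols[i], pch = 16, lwd = 2,
       bty = "n", xlab = "x", ylab = "y",
       main = paste0('type = "', types[i], '"'))
}
par(old_par)                               # restore the previous settings
```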

3 Graphic-‘modules’

The remaining examples for type = "n" and type = "s" follow in the next examples …

First: Functions that help us add something to a graphic

Function    Plot element
axis()      Adds an axis
lines()     Adds a line between points
points()    Adds points
curve()     Draws the curve of a function or expression over an interval
abline()    Adds a straight line (horizontal, vertical, or by slope and y-intercept)
grid()      Adds a grid (defined by tick marks)
legend()    Adds a legend (see the next section)
polygon()   Adds a filled polygon
text()      Adds text
mtext()     Adds text in the plot margins
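As an illustration of how these functions can be combined, here is a sketch that starts from an empty plot (type = "n") and adds one element after the other, including a step function (type = "s"). The data and all coordinates are made up.

```r
x <- 1:10
y <- c(2, 4, 3, 6, 5, 8, 7, 9, 8, 10)

plot(x, y, type = "n", axes = FALSE, xlab = "x", ylab = "y")   # empty plot region
grid()                                     # grid at the default tick positions
axis(1)                                    # x-axis
axis(2, las = 1)                           # y-axis with horizontal tick labels
points(x, y, pch = 16)                     # data points
lines(x, y, type = "s", col = "gray40")    # step function through the points
fit <- lm(y ~ x)
abline(fit, col = "red", lwd = 2)          # straight line from intercept and slope
abline(h = mean(y), lty = 2)               # horizontal line at the mean of y
curve(sqrt(10 * x), add = TRUE, col = "blue")   # curve of a function of x
polygon(c(2, 4, 4, 2), c(8, 8, 10, 10), col = "lightblue", border = NA)
text(3, 9, "text()")                       # text inside the plot region
mtext("mtext() in the margin", side = 3)   # text in the top margin
```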

4 legend()

  • Adds explanation for plot elements.
  • Position either by x- and y-coordinates, or by a keyword such as "topleft", "bottomleft", "topright" or "bottomright".
  • Optionally with a bounding box (argument bty).
  • Argument legend: vector with explanations.
  • Further arguments define colors, plot symbols, line widths, …
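A small sketch of the points above, with made-up curves: the legend is positioned by keyword, the entries are given as a character vector, and colors, line types and symbols are matched to the plot elements.

```r
x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x), type = "l", col = "blue", lwd = 2, ylab = "f(x)")
points(x, cos(x), pch = 1, col = "red")

lg <- legend("topright",
             legend = c("sin(x)", "cos(x)"),   # vector with explanations
             col = c("blue", "red"),
             lty = c(1, NA),                   # line only for the first entry
             pch = c(NA, 1),                   # symbol only for the second entry
             lwd = 2,
             bty = "n")                        # no box around the legend
```

legend() invisibly returns the coordinates of the legend box and its text, which can be handy for fine-tuning the position.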

5 Further plot types

5.1 boxplot()

A box plot shows:

  • The median as a thick horizontal line,
  • the first (\(Q_1\)) and third quartile (\(Q_3\)) as lower and upper box limits,
  • ‘fences’ calculated by: \[ \text{upper fence limit} = \min\left(\max(x),Q_3+1.5\cdot\text{IQR}\right), \] and \[ \text{lower fence limit} = \max\left(\min(x),Q_1-1.5\cdot\text{IQR}\right), \] with interquartile range \(\text{IQR} = \vert Q_3-Q_1 \vert\) (R draws each whisker at the most extreme observation that still lies within these limits), as well as
  • Points outside the fences.

Use with argument x as a variable or formula:
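The original example is not preserved here; a sketch with simulated data (the data frame d and its columns are assumptions) shows both ways of calling boxplot().

```r
set.seed(1)
d <- data.frame(
  value = c(rnorm(30, mean = 0), rnorm(30, mean = 1)),
  group = rep(c("A", "B"), each = 30)
)

boxplot(d$value, main = "x as a variable")                       # one box
b <- boxplot(value ~ group, data = d, main = "x as a formula")   # one box per group

# Per group: lower whisker, lower box limit, median, upper box limit, upper whisker
b$stats
```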

5.2 stripchart()

Strip charts can be helpful additions to box plots, especially with small samples:

“stripchart produces one dimensional scatter plots […] of the given data. These plots are a good alternative to boxplots when sample sizes are small.” (Quote taken from ?stripchart)

  • The argument method specifies by which method superimposed points should be made distinguishable, in particular method = "jitter" or method = "stack".
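A sketch with a small simulated sample (data and layout are assumptions): the jittered points are drawn on top of the box plots, and a second call shows method = "stack".

```r
set.seed(1)
d <- data.frame(
  value = round(rnorm(24, mean = 5), 1),   # rounding creates some ties
  group = rep(c("A", "B", "C"), each = 8)
)

# Box plots with the individual observations added on top
boxplot(value ~ group, data = d, border = "gray50", ylim = range(d$value))
stripchart(value ~ group, data = d, method = "jitter",
           vertical = TRUE, pch = 16, add = TRUE)

# Stand-alone strip chart in which identical values are stacked
stripchart(value ~ group, data = d, method = "stack", pch = 16)
```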

5.3 hist()

A histogram divides the value range of the sample into intervals (equidistant by default) and shows the absolute frequency of the observations in each interval as the height of a bar. With freq = FALSE the bars show densities instead, so that their areas sum to 1; the histogram then provides a rough estimate of the probability density function.

  • The argument breaks defines the values of the interval limits or the number of intervals.

Usage:
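The original example is not preserved here. A sketch with simulated data shows breaks given as a number and as explicit interval limits.

```r
set.seed(1)
x <- rnorm(200)

# breaks as a number: R treats it as a suggestion and picks 'pretty' limits
h1 <- hist(x, breaks = 20, main = "About 20 intervals")

# breaks as explicit interval limits, drawn on the density scale
lims <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
h2 <- hist(x, breaks = lims, freq = FALSE, main = "Intervals of width 0.5")

# On the density scale the bar areas sum to 1
sum(h2$density * diff(h2$breaks))
```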

5.4 density()

  • density() provides a continuous estimate of the probability density function.
  • A kernel function is centered at each observation; its width (the bandwidth, argument bw) is estimated from the data, and the estimate at any point is the weighted sum of all kernel functions evaluated there (by default every observation has the same weight).
  • Where a kernel function overlaps regions in which the underlying quantity is not defined (e.g. negative values of a strictly positive quantity), positive density estimates can arise as artifacts where the correct value would be \(0\).
  • density() only returns information about the calculated estimate; plotting is a separate step.
  • A kernel density estimate is a statistical model with a few assumptions, but pretends to be just a simple descriptive graphic.

Usage:
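The original example is not preserved here. A sketch with simulated, strictly positive data illustrates the two-step workflow (estimate first, plot afterwards) and the artifact below zero mentioned above.

```r
set.seed(1)
x <- rexp(200)        # strictly positive data

d <- density(x)       # estimation only, nothing is drawn yet
d$bw                  # bandwidth that was chosen from the data

hist(x, freq = FALSE, breaks = 20, main = "Histogram and kernel density estimate")
lines(d, lwd = 2, col = "blue")   # plotting is a separate step
rug(x)                            # positions of the observations

# The estimate is positive below 0, although the data cannot be negative
any(d$x < 0 & d$y > 0)
```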

7 Colors

  • Colors are changed by the argument col, e.g. col = "name".
  • The function colors() returns the names of the predefined standard colors.
  • The function palette() returns the color palette that is used when col is specified by a numeric value.
  • rgb() generates colors by mixing red, green and blue components (with optional transparency through the argument alpha), but mixing several colors for usage in one graphic by hand is not recommended (Zeileis, Hornik, and Murrell 2009).
  • Therefore, I mostly use the very powerful colorspace (Zeileis et al. 2020) and viridis (Garnier 2018) packages.
  • viridis helps to find colors that remain distinguishable under most types of color blindness and that keep maximum contrast when colored graphics are printed in gray scale.
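A sketch of the base R functions named above. As a stand-in for the colorspace and viridis packages it uses hcl.colors() from base R (version 3.6.0 or later), so that it runs without extra packages; that substitution is an assumption of this example, not the author's workflow.

```r
head(colors())    # names of the first predefined colors
palette()         # colors used for col = 1, 2, ...

semi_red <- rgb(1, 0, 0, alpha = 0.5)        # red with 50 % transparency
pal <- hcl.colors(5, palette = "viridis")    # five colors of a viridis-like palette

plot(1:5, rep(1, 5), pch = 16, cex = 4, col = pal,
     axes = FALSE, xlab = "", ylab = "")
points(3, 1, pch = 16, cex = 8, col = semi_red)   # transparent point on top
```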

8 Mathematical notation in graphics

  • R offers limited possibilities for mathematical notation in graphics.
  • Syntax similar to LaTeX
  • The formulation is passed as an argument to the expression() function.
  • For an overview of the (im)possibilities see ?plotmath

Command       Meaning
frac(a, b)    Fraction
x[i]          Subscript
alpha, beta   Greek letters
sqrt(a)       Square root
...           See ?plotmath
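A sketch that uses the commands from the table in axis labels, a title and a text annotation; the plotted curve is made up.

```r
x <- seq(0.1, 4, length.out = 100)

plot(x, sqrt(x), type = "l",
     xlab = expression(x[i]),                 # subscript
     ylab = expression(sqrt(x[i])),           # square root
     main = expression(paste("Greek letters: ", alpha, ", ", beta)))

lab <- expression(frac(a, b))                 # fraction
text(3, 0.8, labels = lab, cex = 1.5)
```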

9 lattice Graphics for grouped / clustered data

  • library("lattice") (Sarkar 2008)
  • Plotting functions for grouped data.
  • Lattice offers a much more convenient segmentation of the graphic device compared to ‘by hand’ par(mfrow = c(i, j)), or layout().

Function      Graphic type
xyplot        Scatter plot
bwplot        Box plot
barchart      Bar plot
contourplot   Contour lines (‘3D’)
levelplot     Filled contour lines
histogram     Histogram
densityplot   Kernel density estimate

Usage:

  • Plot of x against y,
  • grouped (individual plot windows) by g,
  • returns a trellis object (not a base plot),
  • no ‘target variable’ y for densityplot, bwplot and histogram.
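The original usage example is not preserved here. A sketch with simulated data, in which the variable names x, y and g follow the description above and everything else is an assumption:

```r
library("lattice")

set.seed(1)
d <- data.frame(x = rnorm(90),
                g = factor(rep(c("a", "b", "c"), each = 30)))
d$y <- 2 * d$x + rnorm(90)

p1 <- xyplot(y ~ x | g, data = d)      # one panel per level of g
p2 <- densityplot(~ x | g, data = d)   # no 'target variable' on the left-hand side

class(p1)    # a trellis object, nothing has been drawn yet
print(p1)    # printing the object draws the plot
print(p2)
```

In an interactive session the objects are printed automatically; inside functions, loops or scripts an explicit print() is needed.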

10 ggplot2

In the last couple of years, the use of ggplot2 (Wickham 2016) instead of base R commands for creating graphics has steadily increased among R users. I still have the impression that base R gives me more flexibility in what the resulting plot may look like, but thanks to its modularity, clear structure, and intuitiveness, command chains for making a graphic with ggplot2 often come more naturally and are less labour-intensive than in base R. \(\rightarrow\) So it might be advisable to feel at home in both worlds?!

In order to set up a graphic with ggplot2, you usually start by calling ggplot(), where you supply a data frame (note that ggplot2 is very much centered on having everything organized in a data frame, which is good, of course!) and an aesthetic mapping using aes():

From here on, you add modules (layers, scales, faceting specifications, coordinate systems, …; a great overview is given in the official cheat sheet) using +:

… and you keep going, module by module:
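The original code chunks are not preserved here. The three steps described above might look like the following sketch; the data frame d and the chosen layers are assumptions, and the guard only skips the example when ggplot2 is not installed.

```r
p <- NULL
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library("ggplot2")

  # Made-up data, organized in a data frame
  d <- data.frame(x = 1:20,
                  y = (1:20) + rep(c(-1, 1), times = 10),
                  g = rep(c("a", "b"), each = 10))

  # Step 1: data frame and aesthetic mapping
  p <- ggplot(d, aes(x = x, y = y, colour = g))

  # Step 2: add a layer using +
  p <- p + geom_point()

  # Step 3: keep going, module by module
  p <- p + facet_wrap(~ g) + labs(x = "x", y = "y") + theme_minimal()

  print(p)   # printing the object draws the plot
}
```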

References

Garnier, Simon. 2018. viridis: Default Color Maps from ‘Matplotlib’. https://CRAN.R-project.org/package=viridis.

Sarkar, Deepayan. 2008. Lattice: Multivariate Data Visualization with R. New York: Springer. http://lmdvr.r-forge.r-project.org.

Wickham, Hadley. 2011. “The Split-Apply-Combine Strategy for Data Analysis.” Journal of Statistical Software 40 (1): 1–29. http://www.jstatsoft.org/v40/i01/.

Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

Zeileis, Achim, Jason C. Fisher, Kurt Hornik, Ross Ihaka, Claire D. McWhite, Paul Murrell, Reto Stauffer, and Claus O. Wilke. 2020. “colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes.” Journal of Statistical Software 96 (1): 1–49. https://doi.org/10.18637/jss.v096.i01.

Zeileis, Achim, Kurt Hornik, and Paul Murrell. 2009. “Escaping RGBland: Selecting Colors for Statistical Graphics.” Computational Statistics & Data Analysis 53 (9): 3259–70. https://doi.org/10.1016/j.csda.2008.11.033.


  1. Private webpage: uncertaintree.github.io