- Updates
- Introduction
- Minimal line plot
- Range-frame (or quartile-frame) scatterplot
- Dot-dash (or rug) scatterplot
- Minimal boxplot
- Minimal barchart
- Slopegraph
- Sparklines
- Stem-and-leaf display
- Discussion

Updates

16th of February 2016: (1) Added stem-and-leaf displays; (2) added sparklines in *ggplot2* (thanks to Wouter Van Der Bijl for answering my StackOverflow question); (3) returned to the all-in-one-page format and simplified the sparklines; (4) split range-frame and dot-dash plots into separate sections for clarity; (5) overlong lines in the *ggplot2* version of *dot-dash plots* are back, because recent updates to *ggplot2* broke the previous solution; (6) changed the data set for the slopegraph so that it matches across graphical systems.

17th of October 2015: Thank you for the kind words regarding this project! Major changes in this update: (1) added the first two methods to create sparklines using *Base Graphics* and *Lattice*; (2) the document has become too large to keep on one page. It’s now split into two separate pages/chapters: *Chapter 1: Line plot, Boxplot, Barchart and Slopegraph* and *Chapter 2: Sparklines*; (3) corrected overlong lines produced in the y-axis when making *dot-dash plots* with `panel.rug()` for *Lattice* (thanks to Josh O’Brien) and with `geom_rug()` for *ggplot2* (thanks to BondedDust); (4) added this list of updates to keep a reasonable track of changes.

28th of July 2015: *Tufte in R* is live!

Introduction

The idea behind *Tufte in R* is to use *R* - the most powerful open-source statistical programming language - to replicate excellent visualisation practices developed by Edward Tufte. It’s not a novel approach - there are plenty of excellent *R* functions and related packages written by people who have much more expertise in programming than I do. I simply collect those resources in one place in an accessible and replicable format, adding a few bits of my own coding discoveries.

Each visualisation is provided in three graphical systems used in *R*: base graphics, lattice and ggplot2. For example data I mainly use basic data sets easily accessible within *R*. Occasionally I use data from the package `psych` developed by William Revelle, the package `MASS` developed by Brian Ripley with colleagues, and various custom data I link via my *Gist* profile.

This page was produced in *RMarkdown* using Michael Sachs’s `tuftehandout`, but with a modified CSS inspired by Dave Liepmann’s *Tufte CSS*. It’s best if you view this page on a desktop computer rather than a mobile device.

You need the most recent version of *R* installed on your computer. You also need a basic understanding of *R*; there are some great online tutorials to get you started. I also recommend *RStudio* as an integrated development environment for *R*.

I use resources from a number of *R* packages. You can install all of them at once from the *R* console using the command below:

```
install.packages(c("CarletonStats", "devtools", "fmsb", "ggplot2", "ggthemes",
"latticeExtra", "MASS", "PerformanceAnalytics", "psych",
"plyr", "proto", "RCurl", "reshape", "reshape2"))
```
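
If some of these packages are already on your machine, you may prefer to install only the ones that are missing. Below is a minimal sketch, assuming the same package list as in the command above; `missing_packages()` is a hypothetical helper defined here for illustration, not part of any package.

```r
# Same package list as in the install command above
pkgs <- c("CarletonStats", "devtools", "fmsb", "ggplot2", "ggthemes",
          "latticeExtra", "MASS", "PerformanceAnalytics", "psych",
          "plyr", "proto", "RCurl", "reshape", "reshape2")

# Hypothetical helper (an assumption, not from any package):
# returns the names in `pkgs` that are not found in the current library paths
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(pkgs)

# Install only in an interactive session, so that sourcing or knitting
# this code never triggers downloads by itself
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Printing `to_install` before installing shows which packages would be downloaded.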

Minimal line plot

We start by plotting the most basic graph from page 65 of *The Visual Display of Quantitative Information* - a minimal line plot. This one is important because it illustrates the most elemental principle - that of minimalism with reduced ‘non-data ink’. As Tufte explains, the ‘data-ink’ ratio (data-ink divided by the total ink used to print the graphic) should equal ‘1 - proportion of graphic that can be erased without loss of data-information’. The primary challenge is therefore to modify the default graphs produced with *R* so that we remove as much ‘non-data ink’ as possible. As you will soon see, this is done by subtracting from and deconstructing existing *R* graphs.
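
Written out as a formula, Tufte's definition of the ratio reads roughly as follows (the layout is mine; the wording of the terms follows Tufte):

```latex
\[
\text{data-ink ratio}
  = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}
  = 1 - \text{proportion of the graphic that can be erased without loss of data-information}
\]
```

A ratio close to 1 means that almost every mark on the page carries data, which is the target for the plots that follow.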

```
x <- 1967:1977
y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5)
# No file name is given, so the plot is written to Rplots.pdf
pdf(width=10, height=6)
# Points joined by lines, with default axes and axis titles suppressed
plot(y ~ x, axes=F, xlab="", ylab="", pch=16, type="b")
# Tick labels only: no axis lines and no tick marks
axis(1, at=x, label=x, tick=F, family="serif")
axis(2, at=seq(1,6,1), label=sprintf("$%s", seq(300,400,20)), tick=F, las=2, family="serif")
# Two dashed reference lines, labelled "5%" below
abline(h=6,lty=2)
abline(h=5,lty=2)
text(max(x), min(y)*2.5,"Per capita\nbudget expenditures\nin constant dollars", adj=1,
family="serif")
text(max(x), max(y)/1.08, labels="5%", family="serif")
dev.off()
```

```
library(lattice)
x <- 1967:1977
y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5)
xyplot(y~x, xlab="", ylab="", pch=16, col=1, border = "transparent", type="o",
abline=list(h = c(max(y),max(y)-1), lty = 2),
scales=list(x=list(at=x,labels=x, fontfamily="serif", cex=1),
y=list(at=seq(1,6,1), fontfamily="serif", cex=1,
label=sprintf("$%s",seq(300,400,20)))),
par.settings = list(axis.line = list(col = "transparent"), dot.line=list(lwd=0)),
axis = function(side, line.col = "black", ...) {
if(side %in% c("left","bottom")) {axis.default(side = side, line.col = "black", ...)}})
ltext(current.panel.limits()$xlim[2]/1.1, adj=1, fontfamily="serif",
current.panel.limits()$ylim[1]/1.3, cex=1,
"Per capita\nbudget expenditures\nin constant dollars")
ltext(current.panel.limits()$xlim[2]/1.1, adj=1, fontfamily="serif",
current.panel.limits()$ylim[1]/5.5, cex=1, "5%")
```

```
library(ggplot2)
library(ggthemes)
x <- 1967:1977
y <- c(0.5,1.8,4.6,5.3,5.3,5.7,5.4,5,5.5,6,5)
d <- data.frame(x, y)
ggplot(d, aes(x,y)) + geom_line() + geom_point(size=3) + theme_tufte(base_size = 15) +
theme(axis.title=element_blank()) + geom_hline(yintercept = c(5,6), lty=2) +
scale_y_continuous(breaks=seq(1, 6, 1), label=sprintf("$%s",seq(300,400,20))) +
scale_x_continuous(breaks=x,label=x) +
annotate("text", x = c(1977,1977.2), y = c(1.5,5.5), adj=1, family="serif",
label = c("Per capita\nbudget expenditures\nin constant dollars", "5%"))
```

Range-frame (or quartile-frame) scatterplot

```
x <- mtcars$wt
y <- mtcars$mpg
plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, family="serif",
xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel")
axis(1,at=summary(x),labels=round(summary(x),1), tick=F, family="serif")
axis(2,at=summary(y),labels=round(summary(y),1), tick=F, las=2, family="serif")
```
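
A note on the tick positions used above: `summary()` of a numeric vector returns six values, including the mean, so the axes carry six labels. If you want a strict quartile-frame, one possible variation (a sketch of mine, not part of the original examples) is to take the five quartile positions from `quantile()` instead:

```r
x <- mtcars$wt
y <- mtcars$mpg

# summary() gives minimum, 1st quartile, median, mean, 3rd quartile, maximum
ticks_summary <- summary(x)

# quantile() with default settings gives the 0%, 25%, 50%, 75% and 100% points
ticks_x <- quantile(x)
ticks_y <- quantile(y)

plot(x, y, main="", axes=FALSE, pch=16, cex=0.8, family="serif",
     xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel")
# Label the axes at the quartile positions only, without axis lines or tick marks
axis(1, at=ticks_x, labels=round(ticks_x, 1), tick=FALSE, family="serif")
axis(2, at=ticks_y, labels=round(ticks_y, 1), tick=FALSE, las=2, family="serif")
```

Whether to keep the mean among the labels is a matter of taste; dropping it leaves the labels at the extremes, the quartiles and the median.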

```
library(devtools)
source_url("https://raw.githubusercontent.com/sjmurdoch/fancyaxis/master/fancyaxis.R")
x <- mtcars$wt
y <- mtcars$mpg
plot(x, y, main="", axes=FALSE, pch=16, cex=0.8,
xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel")
fancyaxis(1, summary(x), digits=1)
fancyaxis(2, summary(y), digits=1)
```

```
library(lattice)
x <- mtcars$wt
y <- mtcars$mpg
xyplot(y ~ x, mtcars, col=1, pch=16, fontfamily="serif",
xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel",
par.settings = list(axis.line = list(col="transparent"),
par.xlab.text=list(fontfamily="serif"),
par.ylab.text=list(fontfamily="serif")),
scales = list(x=list(at=summary(mtcars$wt),labels=round(summary(mtcars$wt),1),
fontfamily="serif"),
y=list(at=summary(mtcars$mpg),labels=round(summary(mtcars$mpg),1),
fontfamily="serif")),
axis = function(side, line.col = "black", ...) {
if(side %in% c("left","bottom")) {axis.default(side = side, line.col = "black", ...)}})
```

```
library(ggplot2)
library(ggthemes)
ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_rangeframe() + theme_tufte() +
xlab("Car weight (lb/1000)") + ylab("Miles per gallon of fuel") +
theme(axis.title.x = element_text(vjust=-0.5), axis.title.y = element_text(vjust=1.5))
```

Dot-dash (or rug) scatterplot

```
library(devtools)
source_url("https://raw.githubusercontent.com/sjmurdoch/fancyaxis/master/fancyaxis.R")
x <- mtcars$wt
y <- mtcars$mpg
plot(x, y, main="", axes=FALSE, pch=16, cex=0.8,
xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel",
xlim=c(min(x)-0.2, max(x)+0.2),
ylim=c(min(y)-1.5, max(y)+1.5))
axis(1, tick=F)
axis(2, tick=F, las=2)
minimalrug(x, side=1, line=-0.8)
minimalrug(y, side=2, line=-0.8)
```

```
library(lattice)
x <- mtcars$wt
y <- mtcars$mpg
xyplot(y ~ x, xlab="Car weight (lb/1000)", ylab="Miles per gallon of fuel",
par.settings = list(axis.line = list(col="transparent")),
panel = function(x, y,...) {
panel.xyplot(x, y, col=1, pch=16)
panel.rug(x, y, col=1, x.units = rep("snpc", 2), y.units = rep("snpc", 2), ...)})
```

```
library(ggplot2)
library(ggthemes)
ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_rug() + theme_tufte(ticks=F) +
xlab("Car weight (lb/1000)") + ylab("Miles per gallon of fuel") +
theme(axis.title.x = element_text(vjust=-0.5), axis.title.y = element_text(vjust=1))
```

Minimal boxplot

```
x <- quakes$mag
y <- quakes$stations
boxplot(y ~ x, main = "", axes = FALSE, xlab=" ", ylab=" ",
pars = list(boxcol = "transparent", medlty = "blank", medpch=16, whisklty = c(1, 1),
medcex = 0.7, outcex = 0, staplelty = "blank"))
axis(1, at=1:length(unique(x)), label=sort(unique(x)), tick=F, family="serif")
axis(2, las=2, tick=F, family="serif")
text(min(x)/3, max(y)/1.1, pos = 4, family="serif",
"Number of stations \nreporting Richter Magnitude\nof Fiji earthquakes (n=1000)")
```

Alternatively, with `chart.Boxplot` from package `PerformanceAnalytics`:

```
library(PerformanceAnalytics)
library(psych)
d <- msq[,80:84]
chart.Boxplot(d, main = "", xlab="average personality rating (based on n=3896)", ylab="",
element.color = "transparent", as.Tufte=TRUE)
```

```
library(lattice)
x <- quakes$mag
y <- quakes$stations
bwplot(y ~ x, horizontal=F, xlab="", ylab="", do.out = FALSE, box.ratio = 0,
scales=list(x=list(labels=sort(unique(x)), fontfamily="serif"),
y=list(fontfamily="serif")),
par.settings = list(axis.line = list(col = "transparent"), box.umbrella=list(lty=1, col= 1),
box.dot=list(col= 1), box.rectangle = list(col= c("transparent"))))
ltext(current.panel.limits()$xlim[1]+250, adj=1,
current.panel.limits()$ylim[2]+50, fontfamily="serif",
"Number of stations \nreporting Richter Magnitude\nof Fiji earthquakes (n=1000)")
```