The Book of R


A First Course in Programming and Statistics
by Tilman M. Davies
July 2016, 832 pp.

“You must see this epic work...a game changer.”
Kirk Borne, Principal Data Scientist at Booz Allen Hamilton

“Extremely well written with excellent explanations and examples, this book fully accomplishes the goal of providing the reader with both the programming and statistical skills required to become proficient with this language. I am nothing short of amazed at the consistent quality and clarity of the text and the utility of the exercises.”

The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis.

You’ll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. You’ll even learn how to create impressive data visualizations with R’s basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package.

Dozens of hands-on exercises (with downloadable solutions) take you from theory to practice, as you learn:

  • The fundamentals of programming in R, including how to write data frames, create functions, and use variables, statements, and loops
  • Statistical concepts like exploratory data analysis, probabilities, hypothesis tests, and regression modeling, and how to execute them in R
  • How to access R’s thousands of functions, libraries, and data sets
  • How to draw valid and useful conclusions from your data
  • How to create publication-quality graphics of your results

Combining detailed explanations with real-world examples and exercises, this book will provide you with a solid understanding of both statistics and the depth of R’s functionality. Make The Book of R your doorway into the growing world of data analysis.

Author Bio 

Tilman M. Davies is a senior lecturer at the University of Otago in New Zealand, where he teaches statistics and R at all university levels. He has been programming in R for 10 years and uses it in all of his courses. In 2017, Davies received Otago's Early Career Award for Distinction in Research.

Table of contents 


1 Getting Started
2 Numerics, Arithmetic, Assignment, and Vectors
3 Matrices and Arrays
4 Non-numeric Values
5 Lists and Data Frames
6 Special Values, Classes, and Coercion
7 Basic Plotting
8 Reading and Writing Files

9 Calling Functions
10 Conditions and Loops
11 Writing Functions
12 Exceptions, Timings, and Visibility

13 Elementary Statistics
14 Basic Data Visualization
15 Probability
16 Common Probability Distributions

17 Sampling Distributions and Confidence
18 Hypothesis Testing
19 Analysis of Variance
20 Simple Linear Regression
21 Multiple Linear Regression
22 Linear Model Selection and Diagnostics

23 Advanced Plot Customization
24 Going Further with the Grammar of Graphics
25 Defining Colors and Plotting in Higher Dimensions
26 Interactive 3D Plots

A Installing R and Contributed Packages
B Working with RStudio




“I’ve been looking for a book like this for some time. It fills some holes in my course content that my own book doesn’t address.”

“I recommend this book to both beginners, as a good introduction to basic statistics and R, and to intermediate users as a desktop reference to assist in performing day-to-day analysis.”
One R Tip a Day

“Overall, The Book of R is an excellent reference for novice data analysts and for students being introduced to statistical programming tools.”
Harry J. Foxwell, ACM's Computing Reviews

“The book is therefore addressing two audiences with different needs – coders who might need help with understanding statistical concepts and statisticians of one breed or another who want to learn how to code. Satisfying both groups is a big ask, but Tilman Davies pulls it off.”
Network Security Newsletter

“Davies' book is perhaps the most comprehensive explanation of the core R language in print, and an excellent introduction to using R for statistical programming.”
Oliver Keyes, Sociotechnical Systems researcher


Page 95: in Exercise 5.1, in entry b.iii., "num" should read as "nums"

Pages 101-102: The answer for Exercise 5.2 (a) has been updated in the resources file. In the call to data.frame, the correct use should have stringsAsFactors=F.
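
As a minimal sketch of what this argument changes, the snippet below builds a small data frame from made-up values (the column names and data are hypothetical, not the exercise's). With stringsAsFactors=FALSE, the shorthand for which is F, character columns stay as character vectors instead of being converted to factors. Since R 4.0.0 this is also the default behavior.

```r
# Hypothetical data for illustration only; not the data from Exercise 5.2.
people <- data.frame(
  name = c("Ann", "Bo", "Cy"),
  age  = c(30, 25, 41),
  stringsAsFactors = FALSE   # keep 'name' as character, not factor
)

# Inspect how the 'name' column is stored.
name_class <- class(people$name)
print(name_class)
```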

Page 135: The paragraph at the bottom of the page should read:
...This line of code adds two separate horizontal lines, one at y = -5 and the other at y = 5, using h= c(-5,5).
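
A minimal sketch of the call described here, using made-up data rather than the book's example: passing a vector to the h argument of abline draws one horizontal line per element in a single call.

```r
# Hypothetical data for illustration only; not the plot from page 135.
x <- -10:10
y <- x

# Set up an empty plotting region that spans the range of the data.
plot(x, y, type = "n")

# y-positions of the two horizontal lines.
h_lines <- c(-5, 5)

# One call adds both lines, at y = -5 and y = 5.
abline(h = h_lines, lty = 2)
```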

Page 155: The website address within the code lines should read as:
R> dia.url <- ""

Page 206: The website address within the first code line at the bottom of the page should read as:
R> dia.url <- ""

Page 213: Exercise 10.6 (a) (i) should read: ...That is, produce the same vector as loop1.result in the text.
Exercise 10.6 (a) (ii) should read: Obtain the same result as loop2.result, the example...

Page 214: In Exercise 10.6 (c) (ii), the code listing line that reads:
matlist1 <- list(matrix(1:4,2,2),matrix(2:5,2,2)

should read:

matlist1 <- list(matrix(1:4,2,2),matrix(2:5,2,2)

Page 238: Exercise 11.3 (b) (ii) should read: 12 factorial is 479,001,600.
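
The corrected value can be checked quickly with base R. This is only a verification sketch using the built-in factorial and prod functions, not the solution the exercise asks you to write.

```r
# Built-in factorial of 12.
twelve_fact <- factorial(12)

# Equivalent product of the integers 1 through 12.
twelve_prod <- prod(1:12)

# Print without scientific notation to compare against 479,001,600.
print(format(twelve_fact, big.mark = ",", scientific = FALSE))
```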

Page 280: In Equation 13.6, the index of the sum should be i = 1 rather than i - 1.

Page 307: The answer for Exercise 14.1 (i) has been updated in the resources file. It should appear as follows:

magquan <- quantile(quakes$mag,c(1/3,2/3))
magfac <- cut(quakes$mag,breaks=c(min(quakes$mag),magquan[1],
magquan[2],max(quakes$mag)), include.lowest=TRUE)
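
One effect of include.lowest=TRUE in the listing above can be seen with the sketch below, which uses the built-in quakes data set. The variable names breaks, with_lowest, and without_lowest are introduced here for illustration. Because cut uses intervals that are open on the left by default, observations equal to the smallest break are left unclassified (NA) unless include.lowest=TRUE is set.

```r
# Tertile break points for earthquake magnitude, as in the listing above.
magquan <- quantile(quakes$mag, c(1/3, 2/3))
breaks  <- c(min(quakes$mag), magquan[1], magquan[2], max(quakes$mag))

# With include.lowest=TRUE, the minimum magnitude falls in the first interval.
with_lowest <- cut(quakes$mag, breaks = breaks, include.lowest = TRUE)

# Without it, observations equal to the minimum are not assigned a level.
without_lowest <- cut(quakes$mag, breaks = breaks)

# Compare the number of unclassified observations in each case.
print(sum(is.na(with_lowest)))
print(sum(is.na(without_lowest)))
```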

Page 317: In the final line of Equation 15.4, a power of 2 should appear after the first set of parentheses.

Page 325: Equation 15.8 should read:

Page 352: In Figure 16-7, the legend should read: (mean - 1 sd)

Page 363: The Note at the end of Section 16.2.5 should read:

...However, most p- and q-functions in R include an optional logical argument, lower.tail, which defaults to TRUE. Therefore, an alternative is to set lower.tail=FALSE in any relevant function call...
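
As a small sketch of the alternative this Note describes, the snippet below computes an upper-tail probability for the standard normal distribution in two ways. The value 1.96 and the variable names are chosen here for illustration.

```r
# Upper-tail probability P(Z > 1.96), computed as a complement ...
upper_by_complement <- 1 - pnorm(1.96)

# ... and directly, by switching off the default lower.tail=TRUE.
upper_by_argument <- pnorm(1.96, lower.tail = FALSE)

# The two approaches agree up to floating-point tolerance.
print(all.equal(upper_by_complement, upper_by_argument))

# The same argument is available in q-functions: this is the value
# with probability 0.025 to its right.
q_upper <- qnorm(0.025, lower.tail = FALSE)
```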

Page 489: In the final code listing in Section 21.2.3, the line that reads:

R> BETA.HAT <- solve(t(X)

should instead read:

R> BETA.HAT <- solve(t(X)%*%X)%*%t(X)%*%Y

The correct line appears in the resources file, Davies_Part4_Source_Code.R.
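
To illustrate what the corrected line computes, here is a sketch on simulated data (the data and the names x1, x2, and fit are assumptions made for this example, not the book's). The matrix expression is the least-squares estimate of the regression coefficients, so it should match the coefficients reported by lm.

```r
# Simulated data for illustration only; not the data used in Section 21.2.3.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
Y  <- 2 + 1.5 * x1 - 0.5 * x2 + rnorm(n)

# Design matrix: a column of ones for the intercept, then the predictors.
X <- cbind(1, x1, x2)

# The corrected line from the erratum above.
BETA.HAT <- solve(t(X) %*% X) %*% t(X) %*% Y

# Compare against the coefficients from a fitted linear model.
fit <- lm(Y ~ x1 + x2)
print(all.equal(as.numeric(BETA.HAT), as.numeric(coef(fit))))
```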

Page 530: In Equation 22.1, the last line should read:
HA: At least one of the βj ≠ 0 (for j = p + 1, ..., q)

Page 568: The website address within the first line of code in the middle of the page should read:
R> dia.url <- ""

Page 590: In Exercise 23.1 (c), the line that reads:
After you open the device and setting the layout, the plot margins...
should read:
After you open the device and set the layout, the plot margins...

Page 606: In Exercise 23.2, the website address within the first line of code at the top of the page should read:
R> dia.url <- ""

Page 619: In later versions of ggplot2, the commands shown left-align the title instead of centering it. To center the title, add theme(plot.title = element_text(hjust = 0.5)) to the code.

Page 637: The last line of code at the top of the page that reads:
R> axis(2,at=4:1,labels=c("peryel.colors"...
should read:
R> axis(2,at=4:1,labels=c("puryel.colors"...

Page 671: In Section 25.5.2, install.package("spatstat") should be install.packages("spatstat").

Page 698: In Exercise 26.1 (b), the first line in the legend of the figure should read: Male RH.

Page 718: The last line of the code in the middle of the page that reads:
should read:

Page 719: The last line of code in the middle of the page that reads:
should read:
And the last line of code at the bottom of the page that reads:
1.771214e-05 2.964305e-05 4.249407e-05 9.543976e-05
should read:
2.260448e-05 3.692132e-05 6.086846e-05 1.258873e-04