In this lesson we go through the steps of building a working environment. Specifically, you will install R and the RStudio IDE. You will also install some R packages needed to get started.
We have a wide range of experience levels in this course, so this lesson is devoted to the basics of setting up your working environment and laying down a good foundation for your projects. I would like to give everyone a chance to get everything installed and to get comfortable with these new tools before we dive in. I realize that some of you may have a lot of this done already. Let’s begin.
In this lesson you will install the basic tools needed for the course, namely R, RStudio, and a few R packages. At its simplest, R is like the engine of a car while RStudio is like the dashboard. More precisely, R is the programming language that runs computations, while RStudio is an integrated development environment (IDE) that provides an interface with many convenient features and tools. Just as having access to a speedometer, rear view mirrors, and a navigation system makes driving much easier, using RStudio’s interface makes using R much easier as well.
As I mentioned, our goals for this course are to build data-driven web products. Each lesson will be built around adding functionality to your project. You will start with something simple and, hopefully, by the end have something awesome. Some of you may have more experience with some of these tools than I do. That’s great—you can help me and help your classmates. If you think something I say is wrong, please tell me. I would rather do something the right way than be right about the wrong way I am doing something.
The other thing I want to mention is that this is not a web development course. As you go through this lesson it may feel like that is where we are heading, but I promise it is not. Markdown is built on top of HTML and CSS, so you will encounter these languages from time to time. I am interested in functional sites that contain useful content, with maybe a little bit of flair. I have found that even a superficial knowledge of web development and design goes a long way in building good R Markdown sites.
It is important to follow the next steps in order. You must install R before RStudio or any R packages.
R is both a programming language and an environment for statistical computing and graphics. At some point, you will need to embrace a programming environment to analyze your data and summarize your findings using figures, tables, etc. R is certainly not the only way to do this; however, I believe this environment offers a valuable suite of tools for your scientific needs. The benefits of R include: a) it is free and open source; b) its capabilities are extended through user-created packages; c) it has a huge community of users (which means it is well supported); and d) it is powerful and flexible. We also need R to run RStudio.
If you already have R installed, please make sure it is a relatively up-to-date version. If it is not, please consider reinstalling R.
Download and install R by going to https://cloud.r-project.org/.
Windows: click on Download R for Windows, then click on base, then click on Download R X.X.X for Windows.
Mac: click on Download R for (Mac) OS X, then under Latest release: click on R-X.X.X.pkg, where R-X.X.X is the version number. For example, the latest version of R as of April 27, 2022 is R-4.1.3. Note for Mac users: your choice will depend on whether your computer has an Intel or M1 chip.
Linux: click on Download R for Linux and choose your distribution for more information on installing R for your setup.
Once R is installed, test the install. Open R and type sessionInfo(). If you don’t get an error, then the install is good.
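Beyond confirming that sessionInfo() runs, you can also check which version of R you ended up with. Here is a minimal sketch using base R only. The 4.1.0 threshold is an assumption chosen for illustration, not an official course requirement.

```r
# Print the full version string of the running R installation
print(R.version.string)

# Compare the running version against a minimum version.
# NOTE: "4.1.0" is an illustrative threshold, not an official requirement.
min_version <- "4.1.0"
is_recent <- getRversion() >= min_version

if (is_recent) {
  message("R is at least version ", min_version)
} else {
  message("R is older than ", min_version, "; consider reinstalling")
}
```

If the second message appears, that is a good hint that a reinstall is worth the few minutes it takes.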
For more information, please have a look at these instructions for Installing R and RStudio.
RStudio is an integrated development environment (IDE) for the R language. Take a moment to familiarize yourself with the idea of an IDE. RStudio includes a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging, and workspace management. We will use RStudio to create our web and interactive data products.
Again, if you already have RStudio installed, please make sure it is a relatively up-to-date version. If it is not, please consider reinstalling RStudio.
Note that the current version of RStudio requires macOS 10.15+ (64-bit); however, I am running macOS 10.14+, so I had to install an older version. If need be, you can find older versions of RStudio here.
For more information, please have a look at these instructions for Installing R and RStudio.
R packages extend the functionality of R by providing additional functions, data, and documentation. Packages are written by a worldwide community of R users and can be downloaded for free from numerous online repositories. A repository, or repo, is simply a place where packages are located on the web. The three most popular repositories for R packages are:
The Comprehensive R Archive Network, or CRAN: the official R package repository. It is a network of ftp and web servers maintained by the R community around the world. The R Foundation coordinates CRAN, and for a package to be published here, it needs to pass several tests that ensure the package follows CRAN policies.
Bioconductor: this is a topic-specific repository, intended for open source bioinformatics software. As with CRAN, Bioconductor has its own submission and review processes, and its community is very active.
GitHub: although this is not R specific, GitHub is probably the most popular repository for open source projects. Its popularity comes from the unlimited space for open source, the integration with git (version control software), and the ease of sharing and collaborating with others. But be aware that there is no review process associated with GitHub package repos.
On CRAN alone there are over 18,900 packages available and an additional 2,083 packages on Bioconductor. You will need a few packages to get going with your web project and you will install more as the need arises.
Before you install any packages, check to see if the package is already installed. You have a few options. First, open R. If you want to see all the packages you have installed you can run:
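A minimal sketch of one way to do this, using base R's installed.packages() function. The variable name pkgs is only illustrative.

```r
# installed.packages() returns a matrix with one row per installed package
pkgs <- installed.packages()

# Number of installed packages
nrow(pkgs)

# Names and versions of the first few installed packages
head(pkgs[, c("Package", "Version")])
```

The row names of this matrix are the package names, which is what makes the membership check further down work.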
What about checking for a specific package, like tmap? tmap is an actively maintained open-source R library for drawing thematic maps.
"tmap" %in% rownames(installed.packages())
[1] FALSE
If the output of this command is TRUE, you are good to go. If it says FALSE then the package is not installed.
Or you can use the library command, which loads/attaches installed packages.
library(tmap)
If you do not get an error that means the package is already installed. However, if you see something like…
Error in library(tmap) : there is no package called ‘tmap’
…then the package needs to be installed.
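Because library() stops with an error when a package is missing, it can be handy to have a check that simply returns TRUE or FALSE. Here is a minimal sketch using base R's requireNamespace(). The helper name is_installed() is hypothetical and not part of R itself.

```r
# Return TRUE if a package is installed, FALSE otherwise, without
# attaching it to the search path and without raising an error.
# NOTE: is_installed() is a hypothetical helper, not part of base R.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

is_installed("stats")   # stats ships with every R installation
is_installed("tmap")    # result depends on your setup
```

Note that requireNamespace() loads the package's namespace if it is found, but does not attach it, so you would still call library() before using the package's functions by name.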
I only use R (instead of RStudio) to install all of my packages. I have found this prevents any conflicts with installation location. Otherwise, I almost exclusively work in RStudio when creating web products.
Let’s run through a quick example. I have 303 R packages installed on my computer but I do not have tmap. It turns out that a stable version of tmap is available on CRAN and a development version on GitHub. First, here is how to install the stable version of the tmap package.
install.packages("tmap")
If all goes well, the beginning of the output should look something like this.
Installing package into ‘/Users/scottjj/Library/R/4.1/library’
(as ‘lib’ is unspecified)
also installing the dependencies ‘wk’, ‘geometries’, ‘jsonify’, ‘rapidjsonr’,
‘sfheaders’, ‘lwgeom’, ‘dichromat’, ‘s2’, ‘geojsonsf’, ‘tmaptools’, ‘sf’,
‘stars’, ‘units’, ‘widgetframe’, ‘leafsync’, ‘leafem’
In my case, the command is also installing around 15 dependencies, that is, additional packages that tmap needs but that are not currently installed.
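If you want to avoid reinstalling packages you already have, you can first work out which ones are missing. Here is a minimal sketch. The helper find_missing() and the example package vector are assumptions made for illustration, not the course's package list.

```r
# Return the subset of `wanted` that is not yet installed.
# NOTE: find_missing() is a hypothetical helper, not part of base R.
find_missing <- function(wanted) {
  setdiff(wanted, rownames(installed.packages()))
}

# Illustrative list of packages; adjust to your own needs
missing_pkgs <- find_missing(c("tmap", "remotes"))
print(missing_pkgs)

# Uncomment to install whatever is missing (requires an internet connection)
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

The install line is left commented out so that running the sketch only reports what is missing and does not change your library.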
If instead you want to install the development version of tmap, you need to install the remotes package and then run the install_github command.
install.packages("remotes")
library(remotes)
install_github("r-tmap/tmaptools")
install_github("r-tmap/tmap")
Sweet. Let’s see what this package can do. Here is a little toy example of an interactive thematic map made with tmap.
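The sketch below shows roughly what such a toy example might look like. It assumes tmap is installed and uses the World dataset that ships with tmap. It is not necessarily the example used to build the original map.

```r
# Only run the example if tmap is available
has_tmap <- requireNamespace("tmap", quietly = TRUE)

if (has_tmap) {
  library(tmap)

  # "view" mode produces an interactive map; "plot" produces a static one
  tmap_mode("view")

  # World is an example dataset bundled with tmap
  data("World", package = "tmap")

  # Colour each country by the values in its HPI column
  world_map <- tm_shape(World) + tm_polygons("HPI")
  world_map
}
```

Switching tmap_mode("view") to tmap_mode("plot") should give a static version of the same map, which is useful for documents that are not rendered to HTML.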
Once R and RStudio are installed you can install some essential packages. Remember, to install a package from CRAN, call the install.packages() command with the package name in quotes "", like so: install.packages("PACKAGE_NAME"). The core tidyverse packages are loaded with library(tidyverse). Other packages in tidyverse need to be loaded separately with their own call to library().
Note. If you are trying to install a package and get an error message containing something like
Timeout of 60 seconds was reached
you need to do the following:
First run getOption("timeout") to see what the timeout option is. The default is 60 seconds. If that is what you see, change it to something larger by running options(timeout = seconds) where seconds is an integer value.
This will only change the timeout for the current session. Once you quit R the setting will revert to 60 seconds.
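As a concrete illustration, here is a minimal sketch of checking and raising the timeout for the current session. The value of 300 seconds is arbitrary and chosen only for illustration.

```r
# Check the current download timeout (in seconds)
getOption("timeout")

# Raise it for this session. options() returns the previous values,
# so they can be restored later.
# NOTE: 300 seconds is an arbitrary illustrative value.
old_opts <- options(timeout = 300)
getOption("timeout")

# Restore the previous setting when you are done
# options(old_opts)
```

If you find yourself doing this often, one option is to put the options(timeout = 300) line in your .Rprofile file, which R reads at startup.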
If you have problems with any of these steps, we will help.
Once you are all set up, open RStudio and start poking around. You can personalize the appearance of the IDE by going to RStudio > Preferences. You should also download this cheat sheet that contains a lot of additional information on the RStudio IDE.
I wanted to leave you with a few final thoughts. I made my first commit to GitHub a little over three years ago with my first web product, a single page R Markdown HTML document. I am a microbial ecologist, and I was collaborating with four reef ecologists on a project about fish guts and microbes. Anyway, they were having a hard time understanding the microbial data, or I was doing a poor job of explaining it. So I made an HTML page outlining every step of my analysis, including all the code and results. My collaborators were so excited that they started sending me additional material to add to the page.
As the amount of material accumulated, the single page turned into a project website. I made a GitHub repo for the site and used GitHub Pages to distribute the site over the web. I did my best to document everything we did in that study. When we submitted the paper, I included the project website in the Data accessibility section so people could find information not included in the main paper or supplementary material. During the revision process, I was able to quickly address reviewer comments because I could pull material directly from the website without having to dig around in the depths of my computer.
The site is nothing fancy but the entire project is now archived on the web. I learned by doing and that is my goal for you in this module. I also use R Markdown for presentations, my CV, and professional letters. Along the way, I have gained a lot of experience using other languages and tools like HTML, CSS, and Hugo for my projects.
And that’s it for this lesson.
The source code for this page can be accessed on GitHub by clicking this link.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at https://github.com/stri-mcgill-neo/2022/, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".