R Basics

Explaining output on slides

In slides, a command (we’ll also call it code or a code chunk) will look like this:

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

The output of the code appears directly after the command.

These slides were made in R using knitr and R Markdown (covered later today when we discuss reproducible research).

R variables

A few reminders:

  • You can create variables from within the R environment and from files on your computer

  • Use “<-” to assign values to a variable name

  • Variable names are case-sensitive, i.e. X and x are different

x <- 2
x
[1] 2
x * 4
[1] 8
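The case-sensitivity reminder can be checked directly. The following is a minimal sketch; the values and the name y_hypothetical are made up for illustration.

```r
# Assign two different values to names that differ only by case
x <- 2
X <- 10

# R treats x and X as separate variables
x       # the lowercase variable
X       # the uppercase variable
x == X  # FALSE, because 2 is not equal to 10

# exists() checks whether a name is defined;
# y_hypothetical is a made-up name that was never assigned
exists("x")
exists("y_hypothetical")
```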

Help

For any function, you can write ?FUNCTION_NAME, or help("FUNCTION_NAME") to look at the help file:

?dir
help("dir")

Packages

Not all packages are available by default.

install.packages("tidyverse")
library(tidyverse)
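A package only needs to be installed once per machine, but library() has to be run in every new R session. One possible way to avoid reinstalling each time is sketched below; ensure_package is a hypothetical helper written for this example, not a function from R or the tidyverse.

```r
# Hypothetical helper (not part of any package): install a package only
# if it is missing, then report whether it is available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call does not trigger an install
ensure_package("stats")

# For the package used in these slides you would run:
# ensure_package("tidyverse")
# library(tidyverse)
```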

Commenting in Scripts

Commenting your code is super important. You should be able to go back to your code years after writing it and figure out exactly what the script is doing. Commenting helps you do this. Comments are also handy for notes!
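In R, a comment starts with # and runs to the end of the line. Here is a short sketch of a commented script; the temperature value is made up for illustration.

```r
# Everything after a # on a line is a comment and is ignored by R

# Convert a temperature from Fahrenheit to Celsius
temp_f <- 68                      # made-up example value
temp_c <- (temp_f - 32) * 5 / 9   # standard conversion formula
temp_c

# NOTE: comments are also handy for notes to yourself
```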

Data Input

Outline

  • Part 0: A little bit of set up!
  • Part 1: reading in manually (point and click)
  • Part 2: reading in directly & working directories
  • Part 3: checking data & multiple file formats

Part 0: Setup - R Project

New R Project

Let’s make an R Project so we can stay organized in the next steps.

Click the new R Project button at the top left of RStudio:

The New R Project button is highlighted.

New R Project

In the New Project Wizard, click “New Directory”:

In the New Project Wizard, the 'New Directory' option is highlighted.

New R Project

Click “New Project”:

In the New Project Wizard, the 'New Project' option is highlighted.

New R Project

Type in a name for your new folder.

Store it somewhere easy to find, such as your Desktop:

In the New Project Wizard, the new project has been given a name and is going to be stored in the Desktop directory. The 'Create Project' button is highlighted.

New R Project

You now have a new R Project folder on your Desktop!

Make sure you add any scripts or data files to this folder as we go through today’s lesson. This will make sure R is able to “find” your files.

An arrow points to the newly created R Project folder.

Why Projects?

R Projects are a super helpful feature of RStudio. They help you:

  • Stay organized. R Projects help in organizing your work into self-contained directories (folders), where all related scripts, data, and outputs are stored together. This organization simplifies file management and makes it easier to locate and manage files associated with your analysis or project.

  • Find the right files. When you open an R Project, RStudio automatically sets the working directory to the project’s directory. This is where RStudio “looks” for files. Because it’s always the Project folder, it can help avoid common issues with file paths.

  • Be more reproducible. You can share the entire project directory with others, and they can replicate your environment and analysis without much hassle.
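To see where R is currently "looking" for files, you can inspect the working directory. This is a rough sketch; the folder name "data" and file name "my_file.csv" are hypothetical and only show how a relative path is built.

```r
# Show the current working directory; inside an R Project this is the project folder
getwd()

# List the files R can "see" there
list.files()

# Build a path relative to the working directory.
# "data" and "my_file.csv" are hypothetical names used only for illustration.
relative_path <- file.path("data", "my_file.csv")
relative_path

# file.exists() reports whether R can find a file at that path
file.exists(relative_path)
```

Because the path is relative to the project folder, the same code should work for anyone who opens the shared project, wherever the folder sits on their computer.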

Part 1: Getting data into R (manual/point and click)

Data Input

  • ‘Reading in’ data is the first step of any real project/analysis
  • R can read almost any file format, especially via add-on packages
  • We are going to focus on a few common file types first
    • comma separated (e.g. ‘.csv’)
    • tab delimited (e.g. ‘.txt’)
    • Microsoft Excel (e.g. ‘.xlsx’)
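As a preview of what reading these file types looks like in code, here is a minimal sketch. It assumes the readr package (part of the tidyverse) is installed; the example data and temporary files are made up so the code does not depend on any file on your computer.

```r
library(readr)

# Made-up example data, written to temporary files so the sketch is self-contained
example_data <- data.frame(city = c("Seattle", "Phoenix"), sightings = c(3, 5))
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")
write_csv(example_data, csv_path)   # comma separated
write_tsv(example_data, txt_path)   # tab delimited

# Read each file back with the matching readr function
from_csv <- read_csv(csv_path)
from_txt <- read_tsv(txt_path)

# read_delim() is the general form: you name the delimiter yourself
from_txt_general <- read_delim(txt_path, delim = "\t")

from_csv

# Excel files need the readxl package; the file name here is hypothetical:
# library(readxl)
# from_excel <- read_excel("my_file.xlsx")
```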

Data Input

UFO dataset:

“This dataset contains over 80,000 reports of UFO sightings over the last century. Inspiration includes What areas of the country are most likely to have UFO sightings? Are there any trends in UFO sightings over time? Do they tend to be clustered or seasonal? Do clusters of UFO sightings correlate with landmarks, such as airports or government research centers? What are the most common UFO descriptions?”

Data Input: Dataset Location

Import Dataset

What Just Happened?

  • You see a preview of the data in the top left pane.
  • You see a new object called ufo_data_complete in your environment pane (top right). The table button opens the data for you to view.
  • R ran some code in the console (bottom left).
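The code that appears in the console typically has the shape sketched below: load readr, read the file into an object, then open the viewer. This is an approximation, not a copy of the console output. The stand-in file and its columns (city, shape) are made up so the sketch runs on its own; in the lesson, the path points to the UFO dataset instead.

```r
library(readr)

# Stand-in file with made-up rows; in the lesson the path points at the UFO dataset
stand_in_path <- tempfile(fileext = ".csv")
writeLines(c("city,shape", "seattle,disk", "phoenix,light"), stand_in_path)

# Read the file into an object that shows up in the environment pane
ufo_data_complete <- read_csv(stand_in_path)

# Open the data viewer (interactive RStudio sessions only)
# View(ufo_data_complete)

# Non-interactive alternative: print the first rows
head(ufo_data_complete)
```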

Browsing for Data on Your Machine (not URL)

Watch the process (recap)

Gif showing the process of importing a dataset via readr.

Example 2: Delimiters

Example 3: Excel

Manual Import: Pros and Cons

Pros: easy!!

Cons: obscures some of what’s happening, and others will have difficulty reproducing your steps

Summary & Lab