Data I/O Lab, Part 2

In this lab you can use the interactive console to explore, but please record your commands here. Remember that anything you type here can be “sent” to the console with Cmd-Enter (macOS) or Ctrl-Enter (Windows/Linux), but only inside the {r} areas.

library(tidyverse)
library(readxl)

Read in the Charm City Circulator dataset from “https://sisbid.github.io/Data-Wrangling/data/Charm_City_Circulator_Ridership.csv” using read_csv, and call it circ.

circ <- read_csv("https://sisbid.github.io/Data-Wrangling/data/Charm_City_Circulator_Ridership.csv")

## Rows: 1146 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): day, date
## dbl (13): orangeBoardings, orangeAlightings, orangeAverage, purpleBoardings,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
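
The message above suggests specifying the column types yourself. A minimal sketch of what that could look like follows. It assumes readr 2.0 or later (for literal data via I()), and it uses a small hypothetical stand-in, sample_csv, built from the first two rows shown in this lab rather than the full file. Parsing date as a Date with the "%m/%d/%Y" format is an assumption based on values such as "01/11/2010".

```r
library(readr)

# Hypothetical two-row stand-in for the ridership file (values taken from the
# output shown in this lab); I() tells read_csv to treat the string as data.
sample_csv <- I("day,date,orangeBoardings\nMonday,01/11/2010,877\nTuesday,01/12/2010,777\n")

# Supplying col_types silences the column specification message and lets us
# parse `date` as a Date instead of leaving it as character.
circ_sample <- read_csv(
  sample_csv,
  col_types = cols(
    day = col_character(),
    date = col_date(format = "%m/%d/%Y"),
    orangeBoardings = col_double()
  )
)

class(circ_sample$date)
```

If you only want to quiet the message and keep the guessed types, passing show_col_types = FALSE to read_csv is the lighter option.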

Use a few functions to check out the data (e.g., head(), glimpse(), str(), etc.).

str(circ)

## spc_tbl_ [1,146 × 15] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ day             : chr [1:1146] "Monday" "Tuesday" "Wednesday" "Thursday" ...
##  $ date            : chr [1:1146] "01/11/2010" "01/12/2010" "01/13/2010" "01/14/2010" ...
##  $ orangeBoardings : num [1:1146] 877 777 1203 1194 1645 ...
##  $ orangeAlightings: num [1:1146] 1027 815 1220 1233 1643 ...
##  $ orangeAverage   : num [1:1146] 952 796 1212 1214 1644 ...
##  $ purpleBoardings : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ purpleAlightings: num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ purpleAverage   : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ greenBoardings  : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ greenAlightings : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ greenAverage    : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ bannerBoardings : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ bannerAlightings: num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ bannerAverage   : num [1:1146] NA NA NA NA NA NA NA NA NA NA ...
##  $ daily           : num [1:1146] 952 796 1212 1214 1644 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   day = col_character(),
##   ..   date = col_character(),
##   ..   orangeBoardings = col_double(),
##   ..   orangeAlightings = col_double(),
##   ..   orangeAverage = col_double(),
##   ..   purpleBoardings = col_double(),
##   ..   purpleAlightings = col_double(),
##   ..   purpleAverage = col_double(),
##   ..   greenBoardings = col_double(),
##   ..   greenAlightings = col_double(),
##   ..   greenAverage = col_double(),
##   ..   bannerBoardings = col_double(),
##   ..   bannerAlightings = col_double(),
##   ..   bannerAverage = col_double(),
##   ..   daily = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>

head(circ)

## # A tibble: 6 × 15
##   day       date  orangeBoardings orangeAlightings orangeAverage purpleBoardings
##   <chr>     <chr>           <dbl>            <dbl>         <dbl>           <dbl>
## 1 Monday    01/1…             877             1027          952               NA
## 2 Tuesday   01/1…             777              815          796               NA
## 3 Wednesday 01/1…            1203             1220         1212.              NA
## 4 Thursday  01/1…            1194             1233         1214.              NA
## 5 Friday    01/1…            1645             1643         1644               NA
## 6 Saturday  01/1…            1457             1524         1490.              NA
## # ℹ 9 more variables: purpleAlightings <dbl>, purpleAverage <dbl>,
## #   greenBoardings <dbl>, greenAlightings <dbl>, greenAverage <dbl>,
## #   bannerBoardings <dbl>, bannerAlightings <dbl>, bannerAverage <dbl>,
## #   daily <dbl>

glimpse(circ)

## Rows: 1,146
## Columns: 15
## $ day              <chr> "Monday", "Tuesday", "Wednesday", "Thursday", "Friday…
## $ date             <chr> "01/11/2010", "01/12/2010", "01/13/2010", "01/14/2010…
## $ orangeBoardings  <dbl> 877, 777, 1203, 1194, 1645, 1457, 839, 999, 1023, 137…
## $ orangeAlightings <dbl> 1027, 815, 1220, 1233, 1643, 1524, 938, 1000, 1047, 1…
## $ orangeAverage    <dbl> 952.0, 796.0, 1211.5, 1213.5, 1644.0, 1490.5, 888.5, …
## $ purpleBoardings  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ purpleAlightings <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ purpleAverage    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ greenBoardings   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ greenAlightings  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ greenAverage     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ bannerBoardings  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ bannerAlightings <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ bannerAverage    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ daily            <dbl> 952.0, 796.0, 1211.5, 1213.5, 1644.0, 1490.5, 888.5, …

nrow(circ)

## [1] 1146

Write out circ to a file called “Circulator.csv” using write_csv.

write_csv(circ, "Circulator.csv")

Write out circ to a file called “Circulator.rds” using write_rds.

write_rds(circ, "Circulator.rds")
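
One way to check that the files were written as expected is to read them back in. The sketch below is illustrative only: demo is a hypothetical two-row stand-in for circ, and the files go to temporary paths so nothing in your project folder is overwritten.

```r
library(readr)

# Hypothetical stand-in for circ so the sketch runs on its own.
demo <- tibble::tibble(day = c("Monday", "Tuesday"), daily = c(952, 796))

# Write to temporary files rather than the project folder.
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")
write_csv(demo, csv_path)
write_rds(demo, rds_path)

# Read both files back in.
from_csv <- read_csv(csv_path, show_col_types = FALSE)
from_rds <- read_rds(rds_path)

# Compare the column types recovered from each format.
sapply(from_csv, class)
sapply(from_rds, class)
```

The .rds file stores the R object itself, so column types come back exactly as they were saved, whereas the .csv file is plain text and read_csv has to guess the types again.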

Download the Excel data for the next two questions. Make sure it’s in your project folder.

curl::curl_download("https://sisbid.github.io/Data-Wrangling/data/iris/iris_q6.xlsx", "iris_q6.xlsx")

Read in sheet 1 of the iris_q6.xlsx dataset into the iris_q6_1 R object. How many rows are in the dataset?

iris_q6_1 <- read_excel("iris_q6.xlsx", sheet = 1)
dim(iris_q6_1)

## [1] 1 2

Read in sheet 2 of the iris_q6.xlsx dataset into the iris_q6_2 R object. How many rows are in the dataset?

iris_q6_2 <- read_excel("iris_q6.xlsx", sheet = 2)
dim(iris_q6_2)

## [1] 150   5
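
When a workbook has several sheets, it can help to list their names before choosing one. The sketch below uses excel_sheets() from readxl. To stay self-contained it points at datasets.xlsx, an example workbook that ships with readxl, rather than iris_q6.xlsx; the same calls should apply to the lab file once it has been downloaded.

```r
library(readxl)

# Path to an example workbook bundled with readxl (not the lab file).
path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook.
excel_sheets(path)

# A sheet can be selected by name as well as by position.
iris_sheet <- read_excel(path, sheet = "iris")

# nrow() answers "how many rows?" directly; dim() returns rows and columns.
nrow(iris_sheet)
```

Selecting a sheet by name tends to be more robust than selecting by position, since it still works if the sheets are reordered.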