Manipulating data with dplyr
The dplyr package for R is described as providing a grammar of data manipulation. It gathers the functions you would expect to need for wrangling a data frame into one package. We will use dplyr against the baseball player statistics we used earlier in this chapter.
We read in the player data and show the first few rows:
players <- read.csv(file="Documents/baseball.csv", header=TRUE, sep=",")
head(players)
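If the baseball file is not at hand, the same read-and-inspect step can be tried on inline text. This is a minimal sketch: the column names and values below are invented for illustration and are not taken from baseball.csv.

```r
# Hypothetical stand-in for baseball.csv; these columns and values are invented.
csv_text <- "name,team,hits
A,X,10
B,Y,20
C,X,15"

# read.csv() already defaults to header = TRUE and sep = ",",
# so those arguments do not need to be spelled out here.
sample_players <- read.csv(text = csv_text)

# head() shows the first rows; n limits how many are displayed.
head(sample_players, n = 2)

# str() lists each column with its type, a quick check that the data parsed as expected.
str(sample_players)
```

Passing text instead of file keeps the example self-contained; with a real file, the call shown earlier applies unchanged.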
We will be using the dplyr package, so we need to load it into our notebook:
library(dplyr)
Converting a data frame to a dplyr table
The dplyr package has functions to convert your data object into a dplyr table (a tibble). A dplyr table is still a data frame underneath, but it prints in a compact format, showing only the first few rows and the columns that fit on the screen. The other dplyr functions can operate directly on the table as well.
We can convert our data frame to a table using:
playerst <- tbl_df(players)
playerst
This results in a very similar display pattern:
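As a self-contained sketch of the same conversion, the example below uses a small invented data frame (sample_players and its columns are hypothetical). It also uses as_tibble(), which newer dplyr releases recommend in place of the deprecated tbl_df(); the code above should still run on versions that keep tbl_df(), possibly with a deprecation warning.

```r
library(dplyr)

# Hypothetical stand-in for the players data frame; columns and values are invented.
sample_players <- data.frame(
  name = c("A", "B", "C"),
  hits = c(10, 20, 15)
)

# as_tibble() is re-exported by dplyr and replaces tbl_df(),
# which has been deprecated since dplyr 1.0.0.
sample_tbl <- as_tibble(sample_players)

# The result is still a data frame; the extra classes change how it prints.
class(sample_tbl)
is.data.frame(sample_tbl)

# Printing shows the dimensions and the type of each column above the rows.
sample_tbl
```

Because the table remains a data frame, base R functions such as head() and summary() continue to work on it alongside the dplyr functions.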
Getting a quick overview of the data value ranges
Another...