Introducing dplyr
According to the dplyr documentation at http://dplyr.tidyverse.org/, dplyr is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges, as follows:
mutate(): Adds new variables that are functions of existing variables
select(): Picks variables based on their names
filter(): Picks cases based on their values
summarize(): Reduces multiple values down to a single summary
arrange(): Changes the ordering of the rows
group_by(): Allows you to perform any operation by group
While each of the verbs corresponds to a particular function in dplyr, a verb can be thought of more generally as a particular action that transforms the data in a certain way.
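As a quick preview of how these verbs read in practice, here is a minimal sketch. It assumes the built-in mtcars data set that ships with R; the data set, the derived column name kpl, and the cutoff of 20 miles per gallon are my own choices for illustration and are not taken from the dplyr documentation.

```r
library(dplyr)

# mtcars is a built-in R data set, used here only for illustration.
efficient_cars <- mtcars %>%
  mutate(kpl = mpg * 0.425144) %>%  # mutate(): add kilometres per litre, derived from mpg
  select(cyl, mpg, kpl) %>%         # select(): keep only these columns, by name
  filter(mpg > 20) %>%              # filter(): keep rows where mpg exceeds 20
  arrange(desc(mpg))                # arrange(): order rows from highest to lowest mpg

# group_by() followed by summarize(): one summary row per cylinder count
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg), n_cars = n())

print(head(efficient_cars))
print(mpg_by_cyl)
```

Each step takes a data frame and returns a data frame, which is why the verbs can be chained together with the %>% pipe.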
Note
In addition to the verbs listed here, dplyr also provides functionality for merging (or joining) data from different sources, though I won't be covering these features here.
In the following sections, I will demonstrate each of these functions individually...