Determining all of the subset groups
Since we have looked at only parts of the file (via the head() or tail() functions), we do not know how many categories there are or how they differ in terms of health care coverage. So we will start by looking at some of the groupings.
In previous chapters, we have used sql() and the aggregate() function to group data. For this example, we will use the dplyr package. One advantage of the dplyr package is that it can be used with pipe syntax (the %>% operator), which allows the result of one function to be passed to the next function without intermediate assignments:
library(dplyr)
>
> Attaching package: 'dplyr'
> The following objects are masked from 'package:stats':
>
>     filter, lag
> The following objects are masked from 'package:base':
>
>     intersect, setdiff, setequal, union
# str(x)
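To make the pipe syntax concrete, here is a minimal sketch of grouping and averaging with dplyr. It runs on a small made-up data frame: the object names (toy, toy.by.cat), the column names (cat, insured, population), and the values are all assumptions for illustration and are not the chapter's dataset or its exact code.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  cat        = c("A", "A", "B", "B"),
  insured    = c(10, 20, 30, 50),
  population = c(100, 200, 300, 500)
)

# Each step passes its result to the next via %>%,
# so no intermediate objects need to be assigned
toy.by.cat <- toy %>%
  group_by(cat) %>%                      # one group per category
  summarise(
    avg.insured    = mean(insured),      # average number insured per group
    avg.population = mean(population)    # average total population per group
  )

print(toy.by.cat)
```

The result has one row per category. The same chain could be written as nested calls, but the piped form reads in the order the steps are applied.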
The by.cat object will show the average number insured and the average total population for each category. Remember, this data is also grouped...