Subsetting the columns
For this exercise, we will use a restricted set of columns from the CSV file. We can either select the specific columns from the dataframe just read in (if we read in the whole file), or reread the CSV file using the colClasses parameter to read only the columns that are required. The second method is often preferable when you are reading a large file. Here, it instructs read.csv to retain only the first three and the last two columns, and to ignore the columns priemp through govmilitary.
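The two approaches can be compared on a small example. The following is a minimal sketch, not the chapter's own code: the toy data frame, its column names, and the temporary file are hypothetical stand-ins for the real file. In colClasses, an NA entry lets read.csv infer the column type, while the string "NULL" tells it to skip that column entirely.

```r
# Hypothetical toy data standing in for the real CSV file
demo <- data.frame(
  Year  = 2010:2012,
  Group = c("a", "b", "c"),
  Total = c(10, 20, 30),
  skip1 = 1:3,
  skip2 = 4:6,
  Rate  = c(0.1, 0.2, 0.3)
)
tmp <- tempfile(fileext = ".csv")
write.csv(demo, tmp, row.names = FALSE)

# Approach 1: read the whole file, then select columns by name
full    <- read.csv(tmp)
subset1 <- full[, c("Year", "Group", "Total", "Rate")]

# Approach 2: drop the unwanted columns while reading
# NA = infer the type, "NULL" = skip the column
subset2 <- read.csv(tmp, colClasses = c(NA, NA, NA, "NULL", "NULL", NA))

# Both approaches end up with the same set of columns
print(names(subset1))
print(names(subset2))
```

The colClasses vector needs one entry per column in the file, in file order, so the second approach assumes you already know the column layout.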
After rereading the file with a subset of the columns, we print a few records from the beginning and end of the file. We can do this using a combination of the rbind(), head(), and tail() functions. This shows all of the columns we will be using for this chapter, except for a few that we will derive in the next section:
x <- read.csv("hihist2bedit.csv", colClasses = c(NA, NA, NA, NA, rep("NULL", 7)))
rbind(head(x), tail(x))
> Year Year.1 Total.People...
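As an illustration of the rbind(head(), tail()) idiom on its own, here is a small sketch using a hypothetical 20-row data frame (not the chapter's data). By default, head() and tail() each return six rows, so the combined preview has twelve rows and keeps the original row names.

```r
# Hypothetical data frame with 20 rows
df <- data.frame(id = 1:20, value = (1:20)^2)

# Stack the first six and the last six rows into one preview
preview <- rbind(head(df), tail(df))
print(preview)

# Number of rows in the preview
print(nrow(preview))
```

If the data frame has fewer than twelve rows, the head and tail overlap and some rows appear twice in the preview; passing a smaller n to head() and tail() avoids this.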