CSV (comma-separated values) is one of the most common file formats for tabular data. In this section, we will walk through the process of reading a CSV file and adjusting the dataset to arrive at some conclusions about the data. The data I am using is from the Heating System Choice in California Houses dataset, found at https://vincentarelbundock.github.io/Rdatasets/datasets.html:
# read in the CSV file as available on the site
heating <- read.csv(file="Documents/heating.csv", header=TRUE, sep=",")
# make sure the data is laid out the way we expect
head(heating)
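If the heating file is not at hand, the same read-and-inspect pattern can be tried on a small stand-in file. The sketch below is only an illustration: the temporary file, the data frame name `sample_df`, and the columns `id`, `choice`, and `income` with their values are all made up for the example and are not taken from the real dataset.

```r
# Write a tiny stand-in CSV to a temporary file.
# The column names and values are invented for illustration only.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,choice,income",
             "1,gc,7",
             "2,gc,5",
             "3,er,4"), tmp)

# Read it the same way as the heating file:
# the first row holds the column names and fields are comma-separated.
sample_df <- read.csv(file = tmp, header = TRUE, sep = ",")

# Check the layout: first rows, dimensions, and column types.
head(sample_df)
dim(sample_df)
str(sample_df)
```

Looking at `dim()` and `str()` alongside `head()` is a quick way to confirm that the header row was picked up and that numeric columns were not read in as text.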
The data appears to be as expected; however, a number of the columns have abbreviated names and are somewhat duplicated. Let us rename the columns of interest so that they are more readable and remove the extras we are not going to use:
# change the column names to be more readable
colnames(heating)[colnames(heating)=="...