Exploring and understanding data
After collecting data and loading it into R's data structures, the next step in the machine learning process involves examining the data in detail. It is during this step that you will begin to explore the data's features and examples and discover the peculiarities that make your data unique. The better you understand your data, the better you will be able to match a machine learning model to your learning problem.
The best way to learn the process of data exploration is with an example. In this section, we will explore the usedcars.csv dataset, which contains actual data about used cars recently advertised for sale on a popular U.S. website.
Tip
The usedcars.csv dataset is available for download on the Packt Publishing support page for this book. If you are following along with the examples, be sure that this file has been downloaded and saved to your R working directory.
Since the dataset is stored in CSV form, we can use the read.csv() function...
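As a minimal sketch of that loading step, the snippet below reads the file into a data frame and prints its structure. It assumes usedcars.csv is already in the R working directory. The variable names csv_path and usedcars, the stringsAsFactors = FALSE argument, and the str() call are illustrative choices, not necessarily the ones used in the rest of this section.

```r
# Assumption: usedcars.csv has been saved to the current R working directory.
csv_path <- "usedcars.csv"

if (file.exists(csv_path)) {
  # stringsAsFactors = FALSE keeps text columns as character vectors
  # instead of converting them to factors (an illustrative choice).
  usedcars <- read.csv(csv_path, stringsAsFactors = FALSE)

  # A first look at the result: column names, types, and a few values.
  str(usedcars)
} else {
  # Report where R looked so the file can be moved or the
  # working directory changed with setwd().
  message("usedcars.csv not found in ", getwd())
}
```

The file.exists() guard is there only so the sketch fails gracefully when the file is missing. In an interactive session, calling read.csv() directly works just as well.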