Introduction
In the previous chapters, we explored several R packages, such as dplyr, plyr, lubridate, and ggplot2, and discussed the basics of storing and processing data in R. We then applied the same ideas in Exploratory Data Analysis (EDA) to break data into smaller parts, extract insights, and explore other ways to understand the data better before venturing into advanced modeling techniques.
In this chapter, we go one step further and introduce machine learning ideas. While broadly laying the foundation for thinking about the various algorithms in machine learning, we will discuss supervised learning at length.
Supervised learning relies on data that has been labeled, typically by domain experts. To classify images of cats and dogs, for example, an algorithm must first see images labeled as cat or dog and then learn which features are associated with each label. Most enterprises with a good volume of historical data are the biggest beneficiaries of...