Summary
In this chapter, we covered high-level concepts in supervised machine learning in general and, more specifically, in classification.
Classification involves predicting a categorical variable, such as the presence or absence of cancer, predicting which football position a player might perform best in, predicting whether a customer will unsubscribe/churn in the next six months, or determining whether a social media post is likely to “go viral.”
In machine learning, we train models by providing training data to the model training process and selecting the machine learning algorithm and its hyperparameters to use. Pre-processing data may also be necessary to get data into a standardized form.
Once a model is trained, we can evaluate its performance by getting predictions for data we’ve reserved for testing purposes. This test data gives us an idea of whether our model accurately predicts values for data it hasn’t seen before.
Models are evaluated...