Chapter 6. Naïve Bayes Classifiers
So far, we have dealt with the processing and filtering of data, and with the discovery of features through unsupervised learning. Although these techniques are critical to understanding problems, trends, and outliers, they do not give data scientists the ability to train a model with known, expected outcomes, or labelled observations. Techniques that do train on labelled observations are collectively known as supervised learning, as described in the Taxonomy of machine learning algorithms section of Chapter 1, Getting Started. Supervised learning techniques are further categorized as generative or discriminative.
This chapter describes the most common and simplest of the generative classifiers: Naïve Bayes. As a reminder, generative classifiers are supervised learning algorithms that attempt to fit a joint probability distribution p(X, Y) of two events X and Y, which represent two sets of variables: the observed variables x and the hidden (or latent) variables y.
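To make the idea of fitting a joint distribution concrete, here is a minimal sketch in Python. It is not the Naïve Bayes algorithm itself, only an illustration of the generative principle with a single discrete feature. The toy data and the names `fit_joint` and `predict` are assumptions made for this example. The joint distribution p(x, y) is estimated by relative frequency, and an observation x is assigned the label y with the highest joint probability.

```python
from collections import Counter

# Hypothetical toy data: (observed feature x, label y) pairs.
observations = [
    ("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
    ("rainy", "stay"), ("rainy", "stay"), ("rainy", "play"),
    ("sunny", "play"), ("rainy", "stay"),
]


def fit_joint(pairs):
    """Estimate the joint distribution p(x, y) by relative frequency."""
    counts = Counter(pairs)
    total = len(pairs)
    return {xy: n / total for xy, n in counts.items()}


def predict(joint, x, labels):
    """Return the label y that maximizes p(x, y) for the observed x."""
    return max(labels, key=lambda y: joint.get((x, y), 0.0))


joint = fit_joint(observations)
labels = sorted({y for _, y in observations})

prediction_sunny = predict(joint, "sunny", labels)
prediction_rainy = predict(joint, "rainy", labels)
```

Because p(y | x) is proportional to p(x, y) for a fixed x, maximizing the joint probability over y selects the same label as maximizing the posterior. With many features, a full joint table of this kind becomes impractical to estimate, which is the motivation for the simplifying assumption behind Naïve Bayes.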
In this chapter, you will appreciate the simplicity...