Now, we will look at the Naive Bayes classifier. We will begin by looking at how the classifier works. Then, we will look at the linear separability assumptions that are used not only by the Naive Bayes algorithm, but also other important classifiers. Finally, we will train the classifier on the Titanic dataset.
The Naive Bayes classifier's big idea is to assume that the features used for prediction are independent of one other, but not of the class that we are trying to predict. To make a prediction, we estimate the likelihood that a data point would have the observed values for its features, given each possible class it belongs to. We predict the class that maximizes this likelihood.
Training consists of estimating the quantities that are used in forming these likelihoods. The independence assumption makes this task relatively painless. The Naive...