The Naive Bayes classifier is commonly used for classifying textual data. In the following sections, we are going to see its different flavors and learn how to configure their parameters. But to understand the Naive Bayes classifier, we first need to go through Bayes' theorem, which Thomas Bayes formulated in the 18th century.
The Bayes rule
When talking about classifiers, we can describe the probability of a certain sample belonging to a certain class using conditional probability, P(y|x). This is the probability of a sample belonging to class y given its features, x. The pipe sign (|) denotes conditional probability, that is, y given x. The Bayes rule expresses this conditional probability in terms of P(x|y), P(x), and P(y), using the following formula:
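P(y|x) = P(x|y) * P(y) / P(x)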
Usually, we ignore the denominator part of the equation, since P(x) is the same for every class and does not affect which class scores highest, and express the rule as a proportionality instead:
P(y|x) ∝ P(x|y) * P(y)
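To make the rule concrete, here is a minimal Python sketch that plugs numbers into the formula for a two-class problem. The class names ("spam" and "ham") and all probability values are hypothetical, chosen purely for illustration; they are not taken from any dataset discussed here.

```python
# Hypothetical priors P(y) and likelihoods P(x|y) for a two-class example;
# the numbers are invented purely to illustrate the Bayes rule.
p_y = {"spam": 0.4, "ham": 0.6}            # prior probability of each class
p_x_given_y = {"spam": 0.05, "ham": 0.01}  # probability of observing feature x given the class

# The denominator P(x) follows from the law of total probability:
# sum of P(x|y) * P(y) over all classes.
p_x = sum(p_x_given_y[c] * p_y[c] for c in p_y)

# Posterior P(y|x) via the Bayes rule.
posterior = {c: p_x_given_y[c] * p_y[c] / p_x for c in p_y}
print(posterior)  # {'spam': 0.769..., 'ham': 0.230...}
```

Note that if we only want to know which class is more likely, we can skip dividing by p_x: the ranking of the classes is already determined by P(x|y) * P(y), which is exactly why the proportional form above is so convenient.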