Naive Bayes
Naive Bayes is a classification algorithm based on Bayes' probability theorem and a hypothesis of conditional independence among the features. Given a set of m features, $x = (x_1, x_2, \dots, x_m)$, and a set of labels (classes) $C$, the probability of a sample having label $c \in C$, given its feature set, is expressed by Bayes' theorem:
$$P(c \mid x_1, \dots, x_m) = \frac{P(x_1, \dots, x_m \mid c)\,P(c)}{P(x_1, \dots, x_m)}$$
Here:

- $P(x_1, \dots, x_m \mid c)$ is called the likelihood distribution
- $P(c \mid x_1, \dots, x_m)$ is the posterior distribution
- $P(c)$ is the prior distribution
- $P(x_1, \dots, x_m)$ is called the evidence
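As a quick numeric sketch of how these four quantities combine, consider a hypothetical spam-filtering example (all the probabilities below are made-up numbers for illustration, not from the text):

```python
# Hypothetical example: P(spam | "offer") from assumed prior and likelihood.
prior = 0.3                          # P(spam): assumed prior of the spam class
likelihood = 0.8                     # P("offer" | spam): assumed likelihood
# Evidence P("offer") via total probability; 0.1 = assumed P("offer" | not spam)
evidence = 0.8 * 0.3 + 0.1 * 0.7

posterior = likelihood * prior / evidence   # Bayes' theorem: P(spam | "offer")
print(round(posterior, 3))                  # → 0.774
```

Note that the evidence is just a normalizing constant obtained by summing the numerator over all classes.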
The predicted class associated with the set of features will be the class $\hat{c}$ that maximizes the posterior probability:

$$\hat{c} = \arg\max_{c \in C} P(c \mid x_1, \dots, x_m) = \arg\max_{c \in C} \frac{P(x_1, \dots, x_m \mid c)\,P(c)}{P(x_1, \dots, x_m)}$$
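Since the evidence in the denominator is the same for every class, it can be ignored when comparing posteriors. A minimal sketch of this arg-max rule, with illustrative (assumed) probability values:

```python
# Pick the class with the largest numerator P(x | c) * P(c); the evidence
# P(x) is constant across classes, so it does not affect the arg max.
priors = {"spam": 0.3, "ham": 0.7}            # assumed P(c)
likelihoods = {"spam": 0.02, "ham": 0.001}    # assumed P(x | c) for one sample x

predicted = max(priors, key=lambda c: likelihoods[c] * priors[c])
print(predicted)   # → spam (0.3 * 0.02 = 0.006 beats 0.7 * 0.001 = 0.0007)
```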
However, the likelihood $P(x_1, \dots, x_m \mid c)$ cannot be estimated directly in practice, since it would require modeling the joint distribution of all the features. So, an assumption is needed.
Using the rule on conditional probability, $P(a, b) = P(a \mid b)\,P(b)$, we can write the numerator of the previous formula as follows:
$$P(x_1, \dots, x_m, c) = P(x_1 \mid x_2, \dots, x_m, c)\,P(x_2, \dots, x_m, c)$$

$$= P(x_1 \mid x_2, \dots, x_m, c)\,P(x_2 \mid x_3, \dots, x_m, c)\,P(x_3, \dots, x_m, c)$$

$$= P(x_1 \mid x_2, \dots, x_m, c) \cdots P(x_{m-1} \mid x_m, c)\,P(x_m \mid c)\,P(c)$$
We now use the assumption that each feature $x_i$ is conditionally independent of the others given $c$ (for example, to calculate the probability of $x_1$ given $c$, the knowledge of the label $c$ makes the knowledge of the other features redundant, so $P(x_1 \mid x_2, \dots, x_m, c) = P(x_1 \mid c)$):
$$P(x_i \mid x_{i+1}, \dots, x_m, c) = P(x_i \mid c)$$
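The practical payoff of this assumption is that only one-dimensional distributions $P(x_i \mid c)$ need to be estimated per class. A minimal sketch for binary features, with all probability tables being illustrative assumptions rather than values from the text:

```python
# Under conditional independence, P(x1, ..., xm | c) factorizes into a product
# of per-feature conditionals, so the classifier score is P(c) * prod_i P(x_i | c).
from math import prod

priors = {"spam": 0.3, "ham": 0.7}                 # assumed P(c)
# Assumed per-feature conditionals P(x_i = 1 | c) for three binary features.
cond = {"spam": [0.8, 0.6, 0.1], "ham": [0.1, 0.3, 0.5]}

x = [1, 1, 0]   # observed binary feature vector

def score(c):
    # P(c) * prod_i P(x_i | c); for x_i = 0 use the complement 1 - P(x_i = 1 | c)
    return priors[c] * prod(p if xi == 1 else 1 - p for p, xi in zip(cond[c], x))

predicted = max(priors, key=score)
print(predicted)   # → spam
```

In real implementations the per-class products are usually computed as sums of log-probabilities to avoid numeric underflow when m is large.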
Under this assumption, the probability...