The Naive Bayes algorithm makes use of the Bayes theorem, in order to classify classes and categories. The word naive was given to the algorithm because the algorithm assumes that all attributes are independent of one another. This is not actually possible, as every attribute/feature in a dataset is related to another attribute, in one way or another.
Despite being naive, the algorithm does well in actual practice. The formula for the Bayes theorem is as follows:
Bayes theorem formula
We can split the preceding algorithm into the following components:
- p(h|D): This is the probability of a hypothesis taking place, provided that we have a dataset. An example of this would be the probability of a fraudulent transaction taking place, provided that we had a dataset that consisted of fraudulent and non-fraudulent transactions.
- p(D|h): This is the probability...