scikit-learn provides Naive Bayes variants based on different probability distributions; the three covered here are Bernoulli, Multinomial, and Gaussian. The first is a binary distribution, useful when a feature is either present or absent. The second is a discrete distribution, used whenever a feature must be represented by a whole number (for example, in NLP, the frequency of a term). The third is a continuous distribution characterized by its mean and variance.
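As a rough illustration of how the three variants map onto feature types, the following sketch fits each classifier on a matching synthetic dataset. The data, the sizes, and the variable names (`X_binary`, `X_counts`, `X_real`, `pairs`) are assumptions made up for this example; only the three classes from `sklearn.naive_bayes` come from the library itself.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

rng = np.random.default_rng(0)
n_samples, n_features = 200, 5
y = rng.integers(0, 2, size=n_samples)  # synthetic binary labels (assumption)

# Synthetic feature matrices, one per kind of feature described above.
X_binary = rng.integers(0, 2, size=(n_samples, n_features))  # present/absent
X_counts = rng.poisson(3.0, size=(n_samples, n_features))    # whole-number counts
X_real = rng.normal(size=(n_samples, n_features))            # continuous values

# Pair each variant with the feature type it is designed for.
pairs = {
    "BernoulliNB": (BernoulliNB(), X_binary),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "GaussianNB": (GaussianNB(), X_real),
}

for name, (model, X) in pairs.items():
    model.fit(X, y)
    # predict_proba returns one column per class; each row sums to 1.
    proba = model.predict_proba(X[:3])
    print(name, proba.shape)
```

Because the labels here are random, the fitted models carry no predictive value; the point is only that all three share the same `fit`/`predict_proba` interface while expecting different kinds of input.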
Naive Bayes in scikit-learn
Bernoulli Naive Bayes
If X is a Bernoulli-distributed random variable, it can take only two possible values (for simplicity, let's call them 0 and 1), with the following probabilities:
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/3fe575a7-6e95-412f-82c4-1699a2381ecf.png)
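The distribution can also be checked numerically. The sketch below assumes the usual parameterization, in which p is the probability of the outcome 1 and q = 1 - p is the probability of the outcome 0; the helper `bernoulli_pmf` and the value p = 0.3 are hypothetical and chosen only for illustration.

```python
import numpy as np

def bernoulli_pmf(x, p):
    """Probability that a Bernoulli(p) variable takes the value x (0 or 1).

    Assumes p is the probability of the outcome 1, so q = 1 - p is the
    probability of the outcome 0.
    """
    if x not in (0, 1):
        raise ValueError("a Bernoulli variable can only be 0 or 1")
    # Compact form of the two-case definition: p**x * (1 - p)**(1 - x)
    return p ** x * (1.0 - p) ** (1 - x)

p = 0.3  # illustrative value, not taken from the text
rng = np.random.default_rng(0)

# A binomial draw with n=1 is a single Bernoulli trial.
samples = rng.binomial(n=1, p=p, size=10_000)

# The fraction of ones should be close to p.
empirical_p = samples.mean()
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p), empirical_p)
```

With enough samples, the empirical frequency of ones approaches p, which is how the per-feature probabilities of a Bernoulli model are typically estimated from binary data.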
In general...