Laplace estimator
In the previous calculation, all the counts were nonzero, so the calculation worked without trouble. In practice, however, a word may never have appeared in the past data for a specific category and then suddenly appear at a later stage. Its estimated probability for that category is then zero, and because the likelihood is a product of the per-word probabilities, that single zero drives the entire calculation to zero.
For example, if W3 had a value of 0 instead of 13 in the previous equation, the entire expression would evaluate to 0:
To avoid this situation, the Laplace estimator adds a small number to each of the counts in the frequency table, which ensures that each feature has a nonzero probability of occurring with each class. The Laplace estimator is usually set to 1, which makes the estimate behave as if each class-feature combination had been found in the data at least once:
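In general form, the smoothed estimate can be sketched as follows. The symbols here are notation assumed for illustration rather than taken from the original equation: count(W_i, C) is the number of times word W_i occurs with class C, N_C is the total count for class C, n is the number of words (features), and alpha is the Laplace estimator.

```latex
\[
P(W_i \mid C) \;=\; \frac{\mathrm{count}(W_i, C) + \alpha}{N_C + \alpha\, n}
\]
```

With alpha = 1 and n = 3 words, this adds 1 to each numerator and 3 to each denominator.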
Note
If you look at the equation carefully, 1 has been added to the numerator for each of the three words and, at the same time, 3 has been added to every denominator to compensate (three words, each incremented by 1).
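The effect of the estimator can be checked with a minimal Python sketch. The counts below are hypothetical values chosen for illustration, not the figures from the earlier example, and the helper name `likelihoods` is likewise an assumption. Fractions are used so the results are exact.

```python
from fractions import Fraction
from math import prod


def likelihoods(word_counts, class_total, alpha=0):
    """Return P(word | class) for each word, with additive smoothing.

    alpha is added to every count; alpha * (number of words) is added to
    the denominator, mirroring the "+1 / +3" adjustment for three words.
    """
    n_words = len(word_counts)
    return [
        Fraction(count + alpha, class_total + alpha * n_words)
        for count in word_counts
    ]


# Hypothetical counts of W1, W2, W3 within one class (illustrative only).
class_counts = [4, 10, 0]   # W3 never appeared with this class
class_total = 20            # hypothetical total for the class

raw = likelihoods(class_counts, class_total)               # no smoothing
smoothed = likelihoods(class_counts, class_total, alpha=1)  # Laplace = 1

raw_product = prod(raw)            # collapses to 0 because of W3
smoothed_product = prod(smoothed)  # stays strictly positive

print("unsmoothed:", raw, "product =", raw_product)
print("smoothed:  ", smoothed, "product =", smoothed_product)
```

Without smoothing, the zero count for W3 wipes out the whole product; with the estimator set to 1, every word keeps a small nonzero probability and the product remains usable for comparing classes.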