Summary
There is a reason why the Naïve Bayes model is the first supervised learning technique you learned: it is simple and robust. It should be the first technique that comes to mind when you consider creating a model from a labeled dataset, as long as the features are conditionally independent.
This chapter also introduced you to the basics of text mining as an application of Naïve Bayes.
Despite all its benefits, the Naïve Bayes classifier assumes that the features are conditionally independent, a limitation that cannot always be overcome. In the case of document classification, Naïve Bayes incorrectly assumes that terms are semantically independent: for example, the two entities age and date of birth are highly correlated. The discriminative classifiers described in the next few chapters attempt to address some of the disadvantages of Naïve Bayes [5:14].
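The effect of violating the conditional independence assumption can be illustrated with a minimal sketch. The function name posterior and all the probability values below are hypothetical and chosen only for illustration; they do not come from this chapter. The sketch treats date of birth as a perfect copy of age, so the same evidence is counted twice and the posterior becomes overconfident.

```python
def posterior(prior, likelihoods):
    """Naive Bayes posterior p(C=1 | x) for a binary class.

    prior: p(C=1)
    likelihoods: list of (p(x_i | C=1), p(x_i | C=0)) pairs, one per feature,
                 multiplied together as if the features were independent.
    """
    score_pos = prior
    score_neg = 1.0 - prior
    for p_pos, p_neg in likelihoods:
        score_pos *= p_pos
        score_neg *= p_neg
    return score_pos / (score_pos + score_neg)


# Hypothetical likelihoods for the observed age under each class.
age = (0.8, 0.4)

# Date of birth carries the same information as age, so in this sketch
# it contributes the same likelihoods a second time.
date_of_birth = age

once = posterior(0.5, [age])                   # evidence counted once
twice = posterior(0.5, [age, date_of_birth])   # same evidence counted twice

print(f"age only:            {once:.3f}")
print(f"age + date of birth: {twice:.3f}")
```

The second posterior is larger than the first even though the second feature adds no new information, which is the kind of distortion correlated features introduce.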
However, this chapter does not address temporal dependencies, sequence of events...