Naïve Bayes and text mining
The multinomial Naïve Bayes classifier is particularly well suited to text mining. Naïve Bayes is used to classify the following entities:
- E-mails as legitimate versus spam
- Business news stories
- Movie reviews and scoring
- Technical papers as per field of expertise
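To make the idea concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier that works on term counts, applied to the first item in the list (legitimate versus spam). The function names `train` and `predict`, the smoothing parameter `alpha`, and the four toy documents are all assumptions made up for illustration; they are not taken from the text or from any library.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """Fit log priors and smoothed log likelihoods from tokenized documents.

    alpha is an additive (Laplace) smoothing constant, assumed here to be 1.
    """
    vocab = {term for doc in docs for term in doc}
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)

    model = {}
    for label, n_docs in class_counts.items():
        # Total term count for the class, plus the smoothing mass
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            term: math.log((term_counts[label][term] + alpha) / total)
            for term in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Return the class with the highest log posterior score for a document."""
    scores = {}
    for label, (log_prior, log_like) in model.items():
        # Terms outside the training vocabulary are ignored
        scores[label] = log_prior + sum(
            log_like[term] for term in doc if term in log_like
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, for illustration only
docs = [
    "win cash prize now".split(),
    "cheap prize offer now".split(),
    "meeting agenda for monday".split(),
    "project meeting notes".split(),
]
labels = ["spam", "spam", "legitimate", "legitimate"]

model = train(docs, labels)
print(predict(model, "cash prize offer".split()))
```

The scores are summed in log space so that the product of many small likelihoods does not underflow. The additive smoothing keeps a term that never appears in a class from driving that class's probability to zero.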
This third use case consists of predicting the direction of a stock, Tesla Motors Inc. (ticker symbol: TSLA), given the financial news. The features are the frequencies of occurrence of specific terms related to the stock. It is unclear how quickly investors or traders react to the news, or what influence, if any, the news has on the value of the stock. Therefore, the delayed response time, as depicted in the following chart, should be a feature of the proposed model:
The market response delay feature would play a role in training only if the variance of the observations is significant. The distribution of the frequencies of the delay in the market response to any newsworthy article regarding TSLA shows that...