Part 2 – Enriching the data with sentiment and most relevant extracted entity
In this part, we enrich the Twitter data with sentiment information, for example, positive, negative, or neutral. We also want to extract the most relevant entity from each tweet, for example, a sport, an organization, or a location. This extra information will be analyzed and visualized by the real-time dashboard that we'll build in the next section. The algorithms used to extract sentiment and entities from unstructured text belong to a field of computer science and artificial intelligence called natural language processing (NLP). Plenty of tutorials on the web provide example algorithms for extracting sentiment. For example, you can find a comprehensive text analytics tutorial in the scikit-learn repository at https://github.com/scikit-learn/scikit-learn/blob/master/doc/tutorial/text_analytics/working_with_text_data.rst.
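To make the idea of enrichment concrete, here is a minimal sketch of a lexicon-based sentiment classifier. It is not the approach used by this application: the word lists, the function names classify_sentiment and enrich, and the assumption that a tweet is a dictionary with a text field are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny word lists (assumption for illustration only);
# a real sentiment lexicon would contain thousands of weighted entries.
POSITIVE_WORDS = {"good", "great", "love", "happy", "win"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "lose"}


def classify_sentiment(text):
    """Label text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    # Lowercase the text and keep only alphabetic tokens (and apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def enrich(tweet):
    """Return a copy of the tweet with an added 'sentiment' field.

    Assumes the tweet is a dict with a 'text' key (hypothetical schema).
    """
    enriched = dict(tweet)
    enriched["sentiment"] = classify_sentiment(tweet["text"])
    return enriched
```

Calling enrich({"text": "I love this great game"}) would return the same dictionary with a sentiment field added. Counting words in this way ignores negation, sarcasm, and context, which is one reason production systems rely on trained NLP models instead.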
However, for this sample application, we are not going to build our own NLP...