Twitter sentiment analysis application
As always, we start by defining the requirements for our MVP version:
- Connect to Twitter to get a stream of real-time tweets filtered by a query string provided by the user
- Enrich the tweets to add sentiment information and relevant entities extracted from the text
- Display a dashboard with various statistics about the data using live charts that are updated at specified intervals
- The system should be able to scale up to the volume of data that Twitter produces
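To make the first three requirements concrete, here is a minimal sketch of the filter, enrich, and aggregate steps running on a handful of in-memory sample tweets. Everything in it is an assumption made for illustration: the function names (`matches_query`, `enrich`, `sentiment_counts`), the sample data, and especially the tiny word-list scorer, which is only a stand-in for the external NLP service and does not connect to Twitter at all.

```python
from collections import Counter

# Hypothetical word lists: a toy stand-in for the external NLP service.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "awful"}


def matches_query(tweet_text, query):
    """Return True if the user-provided query string appears in the tweet."""
    return query.lower() in tweet_text.lower()


def enrich(tweet_text):
    """Attach a sentiment label to a tweet (assumed, simplified scoring)."""
    words = set(tweet_text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        sentiment = "positive"
    elif score < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {"text": tweet_text, "sentiment": sentiment}


def sentiment_counts(enriched_tweets):
    """Aggregate the statistics that a dashboard chart could display."""
    return Counter(t["sentiment"] for t in enriched_tweets)


# Made-up sample data standing in for the real-time stream.
sample_tweets = [
    "I love the new Python release",
    "Traffic is awful today",
    "Python packaging is bad",
    "Just learning Python",
]

filtered = [t for t in sample_tweets if matches_query(t, "python")]
enriched = [enrich(t) for t in filtered]
counts = sentiment_counts(enriched)
print(dict(counts))
```

In a real version, the list of sample tweets would be replaced by the live stream, and `enrich` would call the external service rather than a local word list; the aggregation step is what a live chart would poll at each refresh interval.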
The following diagram shows the first version of our application architecture:
For version 1, the application will be implemented entirely in a single Python Notebook and will call out to an external service for the NLP part. To be able to scale, we will certainly have to move some of the processing outside of the Notebook, but for development and testing, I found that being able to contain the whole application in a single Notebook significantly increases productivity...