Chapter 12. Bringing It All Together
While we have demonstrated many aspects of using Java to support data science tasks, these techniques still need to be combined and used in an integrated manner. It is one thing to use each technique in isolation and another to use them together cohesively. In this chapter, we will give you additional experience with these technologies and insight into how they can be used together.
Specifically, we will create a console-based application that analyzes tweets related to a user-defined topic. A console-based application lets us focus on data-science-specific technologies and avoids committing to a particular GUI technology that may not be relevant to your needs. It also provides a common base from which a GUI implementation can be created if needed.
The application performs and illustrates the following high-level tasks:
- Data acquisition
- Data cleaning, including:
  - Removing stop words
  - Cleaning the text
- Sentiment analysis
- Basic data statistic...