Chapter 3: Data Analysis on Streaming Data
Now that you have seen an introduction to streaming data and streaming use cases, as well as an introduction to streaming architecture, it is time to get to the core of this book: analytics and machine learning.
As you probably know, descriptive statistics and data analysis are the entry points into machine learning, but they are also often used as a standalone use case. In this chapter, you will first discover descriptive statistics from a traditional statistics viewpoint. Some parts of traditional statistics focus on making correct estimations of descriptive statistics when only part of the data is available.
In streaming, you will encounter such problems even more acutely than with batch data. Because data collection is continuous, your descriptive statistics keep changing with every new data point. This chapter will propose some solutions for dealing with this.
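To make the idea of statistics that change with every new data point concrete, here is a minimal sketch of an incrementally updated mean and sample variance, using Welford's online algorithm. The class name `RunningStats` and the sample values are assumptions made for illustration; this is one possible approach and not necessarily the one developed later in the chapter.

```python
class RunningStats:
    """Hypothetical helper: running mean and sample variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of data points seen so far
        self.mean = 0.0   # current estimate of the mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new data point into the statistics without storing the stream."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 until at least two points have arrived."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


# Illustrative values standing in for a stream of measurements.
stats = RunningStats()
for value in [10.0, 12.0, 11.0, 15.0]:
    stats.update(value)
    print(f"n={stats.n} mean={stats.mean:.3f} variance={stats.variance:.3f}")
```

Each call to `update` revises the estimates in constant time and memory, so the statistics are always current and the full history never needs to be kept.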
You will also build a data visualization...