Descriptive statistics on streaming data
Descriptive statistics are usually among the first topics covered in statistics and data analytics courses. They are measurements that data practitioners know well, because they let you summarize a dataset in a small set of indicators.
Why are descriptive statistics different on streaming data?
On regular datasets, almost any statistical software will compute descriptive statistics for you using well-known formulas. On streaming datasets, unfortunately, this is much less straightforward.
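A small sketch can make the contrast concrete. It is only an illustration, not a recommended implementation: the `observations` values are invented, and `running_means` is a hypothetical helper name. It uses the standard incremental update for the mean, so that the mean on a complete dataset can be compared with the sequence of means obtained when the same values arrive one at a time.

```python
import statistics

# Hypothetical observations, invented purely for illustration.
observations = [10.0, 12.0, 11.0, 30.0, 9.0]

# Regular dataset: all values are available, so the mean is one fixed number.
batch_mean = statistics.fmean(observations)


def running_means(stream):
    """Yield the mean of all values seen so far, after each new arrival."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: past values do not need to be stored.
        mean += (value - mean) / count
        yield mean


# Streaming setting: the "current" mean keeps changing as values arrive.
stream_means = list(running_means(observations))

print("mean on the full dataset:", batch_mean)
print("mean after each arrival: ", stream_means)
```

Every intermediate value in `stream_means` was a reasonable summary at the moment it was computed, yet it moved as soon as the next observation came in. Only after the last value has arrived does the running mean agree with the mean of the full dataset.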
The problem with applying descriptive statistics to streaming data is that the formulas are designed to compute fixed measurements over a complete dataset. With streaming data, new observations keep arriving, and each one may change the values you computed earlier. As long as you do not have all the data for a variable, you cannot be certain about the value of its statistics. In the following section, you will get an introduction to sampling theory, the domain that deals...