Assumptions and mathematical notations
Many stream machine learning techniques share some key assumptions, which we state explicitly here:
The number of features in the data is fixed.
The data has a small to medium number of features, or dimensions, typically in the hundreds.
The number of training examples can be very large or effectively unbounded, typically in the millions or billions.
The number of class labels in supervised learning, or of clusters, is small and finite, typically fewer than 10.
Normally, there is an upper bound on memory; that is, we cannot fit all the data in memory, so learning from data must take this into account, especially for lazy learners such as k-nearest neighbors.
Normally, there is an upper bound on the time taken to process each event or data instance, typically a few milliseconds.
The patterns or distributions in the data may evolve over time, a phenomenon known as concept drift.
Learning algorithms must converge to a solution in finite time.
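The memory and per-event time bounds above can be made concrete with a small sketch. The following illustrative Python class (the name `WindowKNN` and its `learn_one`/`predict_one` methods are our own, not from any particular library) keeps only the most recent examples in a fixed-size window, so memory stays bounded no matter how many events arrive, older examples age out as the distribution drifts, and each update is constant-time:

```python
from collections import deque
import math


class WindowKNN:
    """Bounded-memory k-nearest-neighbors over a data stream.

    Only the most recent `window` examples are kept, so memory use is
    fixed regardless of stream length, and stale examples are evicted
    automatically as the underlying distribution evolves.
    """

    def __init__(self, k=3, window=1000):
        self.k = k
        self.buffer = deque(maxlen=window)  # hard upper bound on memory

    def learn_one(self, x, y):
        # O(1) per-event update: append, evicting the oldest if full.
        self.buffer.append((x, y))

    def predict_one(self, x):
        # Majority vote among the k nearest stored examples
        # (Euclidean distance over the fixed feature vector).
        if not self.buffer:
            return None
        neighbors = sorted(
            self.buffer, key=lambda xy: math.dist(x, xy[0])
        )[: self.k]
        votes = {}
        for _, label in neighbors:
            votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get)


model = WindowKNN(k=3, window=500)
stream = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((1.0, 1.0), 1), ((0.9, 1.1), 1)]
for x, y in stream:
    model.learn_one(x, y)  # each event is processed exactly once
print(model.predict_one((0.95, 1.0)))
```

Processing each example once as it arrives, rather than storing the full dataset, is what distinguishes this from the usual batch formulation of k-nearest neighbors.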
Let Dt = {(xi, yi) : yi = f(xi)} be the given data available...