Scaling data for streaming
In the first part of this section, let's look at some solutions for scaling streaming data. Before going into the solutions, let's do a quick recap of what scaling is and how it works.
Introducing scaling
Numerical variables can be on any scale: some have very high average values, while others have very low ones. Some machine learning algorithms are not affected by the scale of a variable at all, whereas others can be strongly affected.
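To make the effect of scale concrete, the following sketch computes Euclidean distances between three hypothetical customers described by age (in years) and yearly income (in dollars). The customers, their values, and the helper name `euclidean_distance` are invented for illustration. Distance-based algorithms such as k-nearest neighbors rely on this kind of computation, so the variable with the larger scale tends to dominate the result.

```python
import math

# Hypothetical customers: (age in years, yearly income in dollars).
# The values are invented for illustration only.
customer_a = (25, 50_000)
customer_b = (60, 50_500)   # very different age, similar income
customer_c = (26, 52_000)   # similar age, somewhat different income


def euclidean_distance(p, q):
    """Return the Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


dist_ab = euclidean_distance(customer_a, customer_b)
dist_ac = euclidean_distance(customer_a, customer_c)

# Income differences (hundreds or thousands) swamp age differences (tens),
# so A appears closer to B than to C despite the 35-year age gap.
print(f"distance A-B: {dist_ab:.1f}")
print(f"distance A-C: {dist_ac:.1f}")
```

In this sketch, the age variable has almost no influence on which customer counts as the nearest neighbor, simply because income is expressed in much larger numbers.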
Scaling is the practice of taking a numerical variable and mapping its range, and potentially its standard deviation, to a pre-specified range. This lets machine learning algorithms, including those that are sensitive to scale, learn from the data without the scale of individual variables getting in the way.
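As a minimal sketch of this idea, the function below linearly maps a list of values into a chosen target range. The function name `rescale_to_range`, its parameters, and the sample values are hypothetical and used only for illustration. A linear mapping is one common way to rescale a variable, not the only one.

```python
def rescale_to_range(values, new_min=0.0, new_max=1.0):
    """Linearly map values so the smallest becomes new_min and the largest new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # All values are identical, so there is no spread to rescale.
        # Mapping everything to the lower bound is a convention assumed here.
        return [new_min for _ in values]
    span = old_max - old_min
    return [new_min + (v - old_min) / span * (new_max - new_min) for v in values]


incomes = [20_000, 35_000, 50_000, 80_000]  # invented sample values
scaled = rescale_to_range(incomes)
print(scaled)
```

Note that this version needs the complete list of values up front in order to find the minimum and maximum.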
Scaling with MinMaxScaler
To achieve this goal, a commonly used method is the Min-Max scaler. The Min-Max scaler takes an input variable in any range and rescales all of its values to fall within the range (0
to...