Scaling
In many instances, numerical features in a dataset vary greatly in scale compared with other features. For example, the square footage of a house is typically a number between 1,000 and 3,000, whereas the number of bedrooms is usually 2, 3, or 4. If we leave these values as they are, many algorithms will implicitly give more weight to the features with the larger scale. How can this issue be fixed?
Scaling is a way to solve this problem: after scaling, continuous features become comparable in range. Not all algorithms require scaled values (Random Forest comes to mind), but others, such as k-nearest neighbors or k-means, will produce meaningless results if the dataset is not scaled beforehand. We will now explore the two most common scaling methods.
Normalization (or min-max normalization) scales all values of a feature to a fixed range between 0 and 1. More formally, each value x of a feature is replaced by (x - min) / (max - min), where min and max are the smallest and largest values observed for that feature.
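As a concrete illustration, here is a minimal Python sketch of min-max normalization applied to the house example. The toy data values are hypothetical, and the comparison with scikit-learn's MinMaxScaler is just one common way to perform the same transformation.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: square footage and number of bedrooms,
# two features on very different scales.
X = np.array([
    [1000, 2],
    [1500, 3],
    [3000, 4],
], dtype=float)

# Manual min-max normalization: (x - min) / (max - min), applied per column.
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# The same result using scikit-learn's MinMaxScaler.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print(X_manual)  # both columns now lie in the range [0, 1]
print(X_scaled)  # matches the manual computation
```

After this transformation, square footage no longer dominates the bedroom count simply because its raw numbers are larger; both features contribute on the same 0-to-1 scale.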