Standardizing the features
Standardization is the process of centering the variable at 0 and standardizing the variance to 1. To standardize features, we subtract the mean from each observation and then divide the result by the standard deviation:
The result of the preceding transformation is called the z-score and represents how many standard deviations a given observation deviates from the mean. In this recipe, we will implement standardization with scikit-learn.
How to do it...
To begin, we will import the required packages, load the dataset, and prepare the train and test sets:
- Import the required Python packages, classes, and functions:
import matplotlib.pyplot as plt import pandas as pd from sklearn.datasets import fetch_california_housing from sklearn.model_selection import train_test_split from sklearn.preprocessing import StandardScaler
- Let’s load the California housing dataset from scikit-learn into a dataframe, and drop...