Performing mean normalization
In mean normalization, we center the variable at 0 and rescale it by its value range, so that the resulting values lie between -1 and 1. This procedure involves subtracting the mean from each observation and then dividing the result by the difference between the maximum and minimum values:
x_scaled = (x - mean(x)) / (max(x) - min(x))
In this recipe, we will implement mean normalization with pandas and then with scikit-learn.
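Before working with the real dataset, the formula can be illustrated on a toy example. The following is a minimal sketch, assuming two small made-up dataframes: the names train and test and the columns a and b are hypothetical and are not part of the California housing data. The mean and value range are learned from the train set only and then applied to both sets:

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
train = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 6.0], "b": [10.0, 20.0, 30.0, 40.0]}
)
test = pd.DataFrame({"a": [2.0, 4.0], "b": [15.0, 35.0]})

# Learn the parameters from the train set only.
means = train.mean(axis=0)
ranges = train.max(axis=0) - train.min(axis=0)

# Apply the same parameters to the train and test sets.
train_scaled = (train - means) / ranges
test_scaled = (test - means) / ranges

# The scaled train variables are centered at 0 and span a range of 1.
print(train_scaled.mean(axis=0))
print(train_scaled.max(axis=0) - train_scaled.min(axis=0))
```

Note that the test set is scaled with the train set parameters, so its mean will generally not be exactly 0 and its values can fall slightly outside the train set range.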
How to do it...
We’ll begin by importing the required libraries, loading the dataset, and preparing the train and test sets:
- Import pandas and the required scikit-learn functions:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
- Let’s load the California housing dataset from scikit-learn into a pandas dataframe:
X, y = fetch_california_housing(
    return_X_y=True, as_frame=True)
X.drop(labels=["Latitude", "Longitude"], axis=1, inplace...