So far, we have been evaluating our models on the test set. By now, it is clear why we do this; however, there is one point we have not discussed yet. Let's go back to the diamond prices problem. In this chapter, we built a simple multiple linear regression model and calculated some metrics on the test set. Let's say that we use the MAE to evaluate the model; when we calculated this metric, we got 733.67. Now let's repeat the same steps for model building:
- Train-test split
- Standardize the numeric features
- Model training
- Get predictions
- Evaluate the model using the same metric
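As a quick reminder of the metric used in the last step, the MAE is the average of the absolute differences between the actual and the predicted values. The following is a minimal sketch of that calculation in plain Python. The helper name `mae_by_hand` and the toy numbers are made up for illustration and are not taken from the diamond prices dataset. The `mean_absolute_error` function in scikit-learn computes the same quantity.

```python
def mae_by_hand(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted| (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    absolute_errors = [abs(actual - predicted) for actual, predicted in zip(y_true, y_pred)]
    return sum(absolute_errors) / len(absolute_errors)


## Toy values for illustration only; not the diamond prices data
toy_actual = [500.0, 1200.0, 3400.0, 800.0]
toy_predicted = [650.0, 1000.0, 3000.0, 900.0]

## Absolute errors are 150, 200, 400 and 100, so their mean is 212.5
toy_mae = mae_by_hand(toy_actual, toy_predicted)
print(toy_mae)
```

Because the MAE is expressed in the same units as the target, a value such as the 733.67 mentioned earlier can be read directly as an average error in price units.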
Here is the model-building code again:
## Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=2)
## Standardize the numeric features
scaler = StandardScaler()
scaler.fit(X_train[numerical_features])
X_train...