In the Data visualization section, we saw that some predictors have outliers. Outliers are values that are particularly extreme compared with the rest of the data. They are a problem because they tend to distort the results of data analysis, particularly descriptive statistics and correlations. They also have a large influence on a least-squares fit, because squaring the residuals magnifies the effect of these extreme data points. For these reasons, it may be necessary to remove these values first to improve the performance of the model.
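To make the distortion described above concrete, here is a minimal sketch in Python using only the standard library. The data are a hypothetical toy set invented for illustration (they are not the dataset used in this chapter), and the helper names `least_squares_slope` and `pearson_r` are likewise assumptions introduced here. The sketch replaces a single point with an extreme value and compares the mean, median, correlation, and least-squares slope before and after.

```python
from statistics import mean, median


def least_squares_slope(x, y):
    """Slope b of the ordinary least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: y is exactly 2*x, so the true slope is 2.
x = list(range(1, 11))
y_clean = [2 * xi for xi in x]

# Same data, but the last response (20) is replaced by an extreme value.
y_outlier = y_clean[:-1] + [100]

for label, y in (("clean", y_clean), ("with outlier", y_outlier)):
    print(
        f"{label:>12}: mean={mean(y):.2f}  median={median(y):.2f}  "
        f"r={pearson_r(x, y):.2f}  slope={least_squares_slope(x, y):.2f}"
    )
```

With this toy data, the single extreme point moves the mean from 11 to 19 while the median stays at 11, the correlation drops from 1 to about 0.67, and the fitted slope jumps from 2 to about 6.36. The median is unaffected because it depends only on the ordering of the values, whereas the mean, the correlation, and the least-squares slope all depend on their magnitudes.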
In some cases, you may be tempted to remove outliers that are influential or have an excessive impact on the summary measures you want to consider (such as the mean or the linear correlation coefficient). However, this way of proceeding isn't always prudent, unless the reasons for an abnormal observation...