Regression diagnostics
In the Useful residual plots subsection, we saw how outliers can be identified using the residual plots. If there are outliers, we need to ask the following questions:
Is the observation an outlier because of an anomalous value in one or more covariates?
Is the observation an outlier because of an extreme output value?
Is the observation an outlier because both the covariate and output values are extreme?
The distinction in the nature of an outlier is vital, as one needs to be sure of its type: both the techniques for identifying each type and their impact on the fitted model are different. If the outlier is due to the covariate value, the observation is called a leverage point, and if it is due to the y value, we call it an influential point. The rest of this section covers the statistical techniques for identifying each type of outlier.
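To make the first two questions concrete, here is a minimal sketch in Python with NumPy. It is not taken from the text: the data are made up, the helper names (design_matrix, hat_values, ols_residuals) are hypothetical, and the 2p/n cutoff is a common rule of thumb rather than something stated above. It assumes a simple linear model with an intercept and uses the diagonal of the hat matrix H = X (X'X)^{-1} X' as the measure of how unusual a covariate value is.

```python
import numpy as np

def design_matrix(x):
    """Design matrix with an intercept column and one covariate."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x])

def hat_values(x):
    """Diagonal of the hat matrix, computed from a thin QR factorization.

    If X = QR, then H = Q Q', so each h_ii is the squared norm of row i of Q.
    """
    Q, _ = np.linalg.qr(design_matrix(x))
    return np.sum(Q ** 2, axis=1)

def ols_residuals(x, y):
    """Ordinary least squares residuals y - X beta_hat."""
    X = design_matrix(x)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Hypothetical data: ten points lying exactly on y = 2 + 0.5 x.
x = np.arange(1.0, 11.0)
y = 2.0 + 0.5 * x

# Case A: one extra point with an extreme covariate value (still on the line).
x_a, y_a = np.append(x, 30.0), np.append(y, 2.0 + 0.5 * 30.0)
# Case B: one extra point with an ordinary covariate value but an extreme output.
x_b, y_b = np.append(x, 5.5), np.append(y, 20.0)

p = 2                      # number of fitted coefficients (intercept, slope)
cutoff = 2 * p / len(x_a)  # rule-of-thumb threshold, an assumption of this sketch

h_a, r_a = hat_values(x_a), ols_residuals(x_a, y_a)
h_b, r_b = hat_values(x_b), ols_residuals(x_b, y_b)

print("Case A: hat value of added point:", h_a[-1], "cutoff:", cutoff)
print("Case A: largest absolute residual:", np.max(np.abs(r_a)))
print("Case B: hat value of added point:", h_b[-1], "cutoff:", cutoff)
print("Case B: residual of added point:", r_b[-1])
```

In Case A the added point has a hat value well above the cutoff but an essentially zero residual, so a residual plot alone would not reveal it. In Case B the hat value is unremarkable and the anomaly shows up only in the residual. This is why the two kinds of outlier call for different identification techniques.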
Leverage points
As noted, a leverage point has an anomalous x value. The leverage points may be theoretically proved not to impact the...