Another R object produced by our linear regression is the error object. The error object is a vector computed by subtracting the predicted value of height from the actual height. These values are also known as residual errors, or simply residuals.
error <- women$height-prediction
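As a quick check on that definition, the sketch below rebuilds the error vector from scratch and compares it with the residuals that R stores in the model object. It assumes the regression was fit as height ~ weight on the built-in women data set (the predicted values shown in this section are consistent with that), and the name fit is a placeholder rather than the name used in the original script.

```r
# Assumption: the model was fit as height ~ weight; 'fit' is a placeholder name.
fit <- lm(height ~ weight, data = women)

# Predicted height for each of the original observations
prediction <- predict(fit)

# Residual = actual value minus predicted value
error <- women$height - prediction

# Compare with the residuals stored by lm(); names are dropped before comparing
all.equal(unname(error), unname(residuals(fit)))
```

If the two vectors agree, the manual subtraction and residuals() are interchangeable for this model.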
Since the error object is a vector, you cannot use the nrow() function to get its size. But you can use the length() function:
> length(error)
[1] 15
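The difference between the two functions can be seen directly. The following is a minimal sketch that rebuilds error from an assumed height ~ weight fit so that it runs on its own; NROW() is a base R alternative that is not used in the original script.

```r
# Assumption: 'error' is rebuilt here from a height ~ weight fit on the women data.
error <- women$height - predict(lm(height ~ weight, data = women))

nrow(error)    # NULL, because a plain vector has no dim attribute
length(error)  # number of elements in the vector
NROW(error)    # treats the vector as a one-column matrix and returns its length
```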
In all of the previous cases, the count is 15, so everything lines up. To see the raw data, the predictions, and the prediction errors side by side, we can use the cbind() function (column bind) to combine the three vectors and display them as a simple table.
At the console, enter the following cbind command:
> cbind(height=women$height,PredictedHeight=prediction,ErrorInPrediction=error)
   height PredictedHeight ErrorInPrediction
1      58        58.75712       -0.75711680
2      59        59.33162       -0.33161526
3      60        60.19336       -0.19336294
4      61        61.05511       -0.05511062
5      62        61.91686        0.08314170
6      63        62.77861        0.22139402
7      64        63.64035        0.35964634
8      65        64.50210        0.49789866
9      66        65.65110        0.34890175
10     67        66.51285        0.48715407
11     68        67.66184        0.33815716
12     69        68.81084        0.18916026
13     70        69.95984        0.04016335
14     71        71.39608       -0.39608278
15     72        72.83233       -0.83232892
From the preceding output, we can see that there are 15 predictions in total. If you compare the ErrorInPrediction column with the error plot shown previously, you can see that for this very simple model, the prediction errors are largest in magnitude at the extreme values of height (rows 1 and 15, at about -0.76 and -0.83).
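To find the worst predictions without scanning the table by eye, the rows can be sorted by the absolute size of the error. This is a sketch only: it assumes a height ~ weight fit, and the names fit, results, and worst are illustrative rather than taken from the original script.

```r
# Assumption: the model was fit as height ~ weight on the women data.
fit <- lm(height ~ weight, data = women)
prediction <- predict(fit)
error <- women$height - prediction

# Same three columns as the cbind() table, held in a data frame
results <- data.frame(height = women$height,
                      PredictedHeight = prediction,
                      ErrorInPrediction = error)

# Sort rows by the magnitude of the error, largest first
worst <- results[order(abs(results$ErrorInPrediction), decreasing = TRUE), ]
head(worst, 3)
```

Based on the values in the table above, the two extreme heights (72 and 58) should come first, followed by a mid-range observation.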
Just to verify that we have one prediction for each of our original observations, we will use the nrow() function to count the number of rows in the women data frame.
At the command prompt in the console area, enter the following command:
nrow(women)
The following should appear:
> nrow(women)
[1] 15
Refer back to the seventh line of code in the original script: plot(women$height,error) plots the actual height against the errors. It shows how far off each prediction was from the original value. You can see that the errors follow a non-random pattern: they are negative at both ends of the height range and positive in the middle.
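One way to make the pattern easier to see is to add a horizontal reference line at zero and to look at the sign of each error in order of height. The sketch below assumes a height ~ weight fit; the axis labels and the dashed line are additions for illustration and are not part of the original script.

```r
# Assumption: the model was fit as height ~ weight on the women data.
fit <- lm(height ~ weight, data = women)
error <- women$height - predict(fit)

# Same plot as in the script, with labeled axes
plot(women$height, error,
     xlab = "Actual height",
     ylab = "Prediction error (actual - predicted)")

# Dashed reference line at zero error
abline(h = 0, lty = 2)

# Sign of each error, in order of increasing height
sign(error)
```

Randomly scattered errors would switch sign irregularly; a long run of one sign in the middle with the opposite sign at both ends suggests that a straight line is missing some curvature in the data.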
After you are done, save the file using File | Save, navigate to the PracticalPredictiveAnalytics/R folder that was created, and name it Chapter1_LinearRegression.