Bootstrapping regression models
The US Crime dataset introduced in Chapter 1, Introduction to Ensemble Techniques, is an example of why the linear regression model might be a good fit. In this example, we are interested in understanding the crime rate (R) as a function of thirteen related variables such as average age, the southern state indicator, and so on. Mathematically, the linear regression model is as follows:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \epsilon$$

Here, $X_1, X_2, \ldots, X_p$ are the p-covariates, $\beta_0$ is the intercept term, $\beta_1, \beta_2, \ldots, \beta_p$ are the regression coefficients, and $\epsilon$ is the error term assumed to follow a normal distribution $N(0, \sigma^2)$. The covariates can be written in a vector form and the ith observation can be summarized as $(Y_i, \mathbf{x}_i)$, where $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$. The n observations $(Y_1, \mathbf{x}_1), (Y_2, \mathbf{x}_2), \ldots, (Y_n, \mathbf{x}_n)$ are assumed to be stochastically independent. The linear regression model has been detailed in many classical regression books; see Draper and Smith (1999), for instance. A recent book that details the implementation of the linear regression model in R is Ciaburro (2018). As the reader might have guessed...
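
To make the model assumptions concrete, the following is a minimal sketch in R that simulates data from a linear regression model with an intercept, a few regression coefficients, and independent normal errors, and then recovers the parameters with lm. Everything in it is an assumption made for illustration: the sample size, the number of covariates, the coefficient values, and the object names (X, beta0, beta, sigma, sim, fit) are hypothetical, and the data are simulated rather than taken from the US Crime dataset.

```r
# Simulated illustration only: the data, coefficient values, and object
# names below are hypothetical and are NOT the US Crime dataset.
set.seed(123)
n <- 200   # number of independent observations
p <- 3     # number of covariates

# Covariate matrix: row i holds the covariate vector of the ith observation
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("X", 1:p)

beta0 <- 2                  # intercept term
beta  <- c(1.5, -0.8, 0.3)  # regression coefficients
sigma <- 0.5                # standard deviation of the error term

# Response: intercept + linear combination of covariates + N(0, sigma^2) error
eps <- rnorm(n, mean = 0, sd = sigma)
Y <- as.vector(beta0 + X %*% beta + eps)

# Fit the linear regression model by least squares
sim <- data.frame(Y = Y, X)
fit <- lm(Y ~ ., data = sim)

coef(fit)           # estimates of the intercept and regression coefficients
summary(fit)$sigma  # estimate of the error standard deviation
```

With a sample of this size, the fitted coefficients should lie close to the values used in the simulation, which is a quick sanity check that the fitted model matches the assumed data-generating process.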