Bootstrapping regression models
The US Crime dataset introduced in Chapter 1, Introduction to Ensemble Techniques, is an example where the linear regression model might be a good fit. In this example, we are interested in understanding the crime rate (R) as a function of thirteen related variables such as average age, the southern state indicator, and so on. Mathematically, the linear regression model is as follows:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \epsilon$$
Here, $X_1, X_2, \ldots, X_p$ are the $p$ covariates, $\beta_0$ is the intercept term, $\beta_1, \beta_2, \ldots, \beta_p$ are the regression coefficients, and $\epsilon$ is the error term, assumed to follow a normal distribution $N(0, \sigma^2)$. The covariates can be written in vector form, and the $i$th observation can be summarized as $(Y_i, \mathbf{x}_i)$, where $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$. The $n$ observations $(Y_i, \mathbf{x}_i)$, $i = 1, 2, \ldots, n$, are assumed to be stochastically independent. The linear regression model has been detailed in many classical regression books; see Draper and Smith (1999), for instance. A recent book that details the implementation of the linear regression model in R is Ciaburro (2018). As the reader might have guessed...
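To make the linear regression model concrete, the following R snippet is a minimal sketch that simulates data from a model with two covariates and normal errors, and then fits it with lm(). The data are simulated and are not the US Crime dataset; the sample size, the coefficient values, the error standard deviation, and the names x1, x2, sim, and fit are all illustrative assumptions.

```r
# Minimal sketch on simulated data (not the US Crime dataset).
# All numeric values and object names below are illustrative assumptions.
set.seed(123)

n  <- 100                              # number of independent observations
x1 <- rnorm(n)                         # first covariate
x2 <- rnorm(n)                         # second covariate
eps <- rnorm(n, mean = 0, sd = 0.5)    # error term, N(0, sigma^2) with sigma = 0.5

# Response generated from the model with beta0 = 2, beta1 = 1.5, beta2 = -0.8
y <- 2 + 1.5 * x1 - 0.8 * x2 + eps
sim <- data.frame(y = y, x1 = x1, x2 = x2)

# Fit the linear regression model by ordinary least squares
fit <- lm(y ~ x1 + x2, data = sim)

coef(fit)           # estimates of the intercept and regression coefficients
summary(fit)$sigma  # residual standard error, an estimate of sigma
```

With independent observations and normal errors, the estimates should fall close to the values used in the simulation, although the exact numbers depend on the seed and the sample size.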