Simple linear regression
Before looking at real-world datasets, it helps to train a model on artificially generated data. In an artificial scenario such as this, we know the true output function beforehand, which is rarely the case with real-world data. The exercise shows how our model behaves in the ideal scenario, where all of its assumptions are fully satisfied, and it helps us visualize what a good linear fit looks like. We'll begin by simulating 100 observations of a simple linear regression model with a single input feature. In this model, the output y is a linear function of the input x1 plus normally distributed noise with mean 0 and standard deviation 2. The simulated observations are then stored in a data frame.
Here is the code for the simple linear regression model:
> set.seed(5427395)
> nObs = 100
> x1minrange = 5
> x1maxrange = 25
> x1 = runif(nObs, x1minrange, x1maxrange)
> e = rnorm(nObs, mean = 0, sd = 2.0)
> y = 1.67 * x1 - 2...
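To see why simulated data is useful, it can help to fit a model to it and compare the estimated coefficients with the ones used to generate the data. The following is a minimal sketch of that check, not the book's own code. The seed, input range, slope of 1.67, and noise standard deviation of 2.0 are taken from the snippet above. The intercept is cut off in the text, so the value -3 used here is a hypothetical placeholder, and the names true_slope, true_intercept, sim_df, and fit are likewise assumptions made for illustration.

```r
# Sketch: simulate a simple linear model, then recover its coefficients with lm().
set.seed(5427395)                      # same seed as the snippet above
n_obs <- 100
x1 <- runif(n_obs, min = 5, max = 25)  # single input feature, uniform on [5, 25]
e <- rnorm(n_obs, mean = 0, sd = 2.0)  # normally distributed noise term

true_slope <- 1.67
true_intercept <- -3                   # HYPOTHETICAL: the source's intercept is truncated

# Generate the output from the known linear function plus noise
y <- true_slope * x1 + true_intercept + e
sim_df <- data.frame(y = y, x1 = x1)

# Fit ordinary least squares and inspect the estimated coefficients
fit <- lm(y ~ x1, data = sim_df)
print(coef(fit))
```

Because the generating function is known, the estimates printed by coef(fit) can be compared directly with true_intercept and true_slope. With 100 observations and this noise level they should land close to those values, though not exactly on them.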