We now combine the aforementioned methods into a single prediction. Intuitively, this seems like a good idea, but how can we do it in practice? Perhaps the first thought that comes to mind is to average the predictions. This might give decent results, but there is no reason to think that all the predictors should be treated the same: one might be better than the others.
We can try a weighted average instead, multiplying each prediction by a weight before summing them up. How do we find the best weights, though? We learn them from the data, of course!
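To make this concrete, here is a minimal sketch of learning the weights by least squares on a held-out set. The dataset, the three base regressors, and the `ensemble_predict` helper are illustrative assumptions, not prescribed by the text; any fitted regressors would do.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data and base predictors, for illustration only.
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

models = [Ridge(), DecisionTreeRegressor(max_depth=4), KNeighborsRegressor()]
for m in models:
    m.fit(X_train, y_train)

# Stack each model's held-out predictions as a column of a matrix P.
P = np.column_stack([m.predict(X_val) for m in models])

# Learn the combination weights w by minimizing ||P @ w - y_val||^2.
w, *_ = np.linalg.lstsq(P, y_val, rcond=None)

def ensemble_predict(X_new):
    """Weighted average of the base predictions, using the learned weights."""
    preds = np.column_stack([m.predict(X_new) for m in models])
    return preds @ w
```

Note that the weights are fit on a validation set the base models never saw; fitting them on the training data would reward whichever model overfits the most.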
Ensemble learning:
We are using a general machine learning technique that is not limited to regression: ensemble learning. We learn an ensemble (that is, a set) of predictors and then combine them to obtain a single output. What is interesting is that we can see each prediction...