The bias, variance, and regularization properties
Bias, variance, and the closely related topic of regularization are fundamental concepts in machine learning.
Bias arises when a machine learning model is too 'simple' to capture the underlying pattern, producing predictions that are consistently off from the actual values (underfitting).
Variance arises when a model is too 'complex', producing predictions that are highly accurate on the data the model was trained on but generalize poorly to unseen/new data (overfitting).
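A minimal sketch of both failure modes, assuming scikit-learn and NumPy are available (the cubic ground-truth function and the polynomial degrees are illustrative choices, not from the text): a degree-1 model is too simple for a cubic signal (high bias), while a degree-15 model fits the noise in its 30 observations (high variance), scoring well on the data it was fit on but poorly on fresh samples from the same process.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

def sample(n):
    """Noisy observations of a cubic ground-truth function (an assumed example)."""
    x = rng.uniform(-3, 3, size=(n, 1))
    y = x[:, 0] ** 3 - 2 * x[:, 0] + rng.normal(scale=3.0, size=n)
    return x, y

x_fit, y_fit = sample(30)      # data the model gets to see
x_new, y_new = sample(200)     # unseen data from the same distribution

for degree in (1, 3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_fit, y_fit)
    err_seen = mean_squared_error(y_fit, model.predict(x_fit))
    err_new = mean_squared_error(y_new, model.predict(x_new))
    print(f"degree={degree:2d}  MSE on fitted data={err_seen:8.2f}  "
          f"MSE on new data={err_new:8.2f}")
```

The degree-1 model errs badly on both sets (bias), while the degree-15 model shows a small error on the fitted data and a much larger one on new data (variance); the degree-3 model, which matches the true signal, does well on both.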
Once users become familiar with the process of creating machine learning models, the process can seem quite simple: get the data, split it into a training set and a test set, create a model, apply the model to the test set, and the exercise is complete. Creating models is easy; creating a good model is a much more challenging task. But how can one test the quality of a model? And, perhaps more importantly, how does one go about building a 'good' model?
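For concreteness, the basic workflow just described might look like the following sketch, assuming scikit-learn; the diabetes dataset and plain linear regression here are placeholder choices, not a recommendation from the text.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)                 # get the data
X_train, X_test, y_train, y_test = train_test_split(  # training and test sets
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)      # create a model
print("R^2 on the test set:", model.score(X_test, y_test))  # apply it
```

A decent test-set score from a loop like this says little by itself about whether the model is 'good' in the bias/variance sense, which is the question the next paragraph turns to.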
The answer lies in a technique called regularization. It's arguably...