Before we understand about how the preceding techniques are useful, let's build a scenario, so that we understand the phenomenon of overfitting.
Scenario 1: A case of not generalizing on an unseen dataset
In this scenario, we will create a dataset, for which there is a clear linearly separable mapping between input and output. For example, whenever the independent variables are positive, the output is [1,0], and when the input variables are negative, the output is [0,1]:
To that dataset, we will add a small amount of noise (10% of the preceding dataset created) by adding some data points that follow the opposite of the preceding pattern, that is, when the input variables are positive, the output is [0,1], and the output is [1,0] when the input variables are negative:
Appending the datasets obtained by the preceding two steps gives us the...