Implementing a train-test split procedure
The main idea of splitting your data into two datasets is that you can train your model on one and then test its performance on data it has not seen. When a dataset is split into a training set and a testing set, the majority of the data goes to the training set and a smaller part is held back for testing.
The subset used to fit a model is known as the training dataset. It contains example inputs and outputs (I/Os), which are used to fit the model's parameters.
The test dataset, on the other hand, is not used for fitting. Its inputs are provided to the trained model, and the resulting predictions are compared to the expected values to assess the model's accuracy.
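The procedure described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the helper `train_test_split` below is written by hand for this example, the 80/20 ratio and the seed are assumptions, and the toy data (y = 2 * x) is made up. In practice, libraries such as scikit-learn provide a ready-made `train_test_split` function.

```python
import random


def train_test_split(inputs, outputs, test_fraction=0.2, seed=0):
    """Shuffle paired examples and split them into training and test sets.

    Hand-written helper for illustration; test_fraction and seed are
    assumed defaults, not values taken from the text.
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [inputs[i] for i in train_idx]
    y_train = [outputs[i] for i in train_idx]
    x_test = [inputs[i] for i in test_idx]
    y_test = [outputs[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Toy data, made up for this example: the output is always twice the input.
xs = list(range(10))
ys = [2 * x for x in xs]

x_train, x_test, y_train, y_test = train_test_split(xs, ys, test_fraction=0.2)

# Fit a one-parameter model y = w * x using the training set only
# (closed-form least squares for a line through the origin).
w = sum(x * y for x, y in zip(x_train, y_train)) / sum(x * x for x in x_train)

# Predict on the held-out inputs and compare with the expected outputs.
predictions = [w * x for x in x_test]
mean_abs_error = sum(abs(p - y) for p, y in zip(predictions, y_test)) / len(y_test)
```

The key point is that `w` is computed from `x_train` and `y_train` alone, so the error measured on `x_test` and `y_test` reflects how the model behaves on data it did not see during fitting.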
When to use a train-test split procedure
A train-test split evaluation procedure can be used for both classification and regression problems.
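The comparison between predictions and expected values differs between the two problem types. As a rough sketch, classification is often scored with accuracy (the fraction of correct labels) and regression with an error measure such as mean squared error. The metric choices, function names, and sample values below are assumptions made for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the expected labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between expected and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Classification: expected vs. predicted labels on a hypothetical test set.
class_true = ["spam", "ham", "spam", "ham"]
class_pred = ["spam", "ham", "ham", "ham"]
class_score = accuracy(class_true, class_pred)  # 3 of 4 labels match

# Regression: expected vs. predicted numeric values on a hypothetical test set.
reg_true = [3.0, 5.0, 2.0]
reg_pred = [2.0, 4.0, 4.0]
reg_error = mean_squared_error(reg_true, reg_pred)  # (1 + 1 + 4) / 3
```

In both cases the score is computed only on the test set, so the split procedure itself stays the same; only the comparison step changes.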
The dataset to be used should be large enough to represent the problem domain, covering every common case and enough uncommon cases....