In a controlled machine learning setting, we are often given one dataset to use for training and a separate set to use for testing. The idea is that you run the learning algorithm only on the training data; when it comes to seeing how good your model is, you feed the model the test data and observe the output. Competitions and hackathons typically release the test data but withhold the labels associated with it, because the winner is selected based on how well the model performs on the test data, and organizers don't want participants to cheat by looking at the test labels and making adjustments. When the test labels are withheld like this, we can create a validation dataset ourselves by setting aside a portion of the training data to serve as validation data.
The whole point of having separate sets, namely a validation or test dataset, is to measure performance on data that our model has never seen during training.
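The split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the only way to do it (libraries such as scikit-learn provide a ready-made `train_test_split`); the function name and the 20% fraction here are arbitrary choices for the example:

```python
import random

def train_val_split(data, val_fraction=0.2, seed=42):
    """Shuffle indices and carve off a validation slice from the training data."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_val = int(len(data) * val_fraction)
    val_idx = set(indices[:n_val])
    # Every example lands in exactly one of the two sets
    train = [x for i, x in enumerate(data) if i not in val_idx]
    val = [x for i, x in enumerate(data) if i in val_idx]
    return train, val

train, val = train_val_split(list(range(100)), val_fraction=0.2)
print(len(train), len(val))  # 80 20
```

Shuffling before splitting matters: if the training data is ordered (say, by class label), taking the last 20% directly would give a validation set that does not represent the full distribution.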