Pseudo-labeling
In competitions where the number of training examples can make a difference, pseudo-labeling can boost your scores by providing further examples taken from the test set. The idea is to add to your training set those test examples whose predictions you are confident about, using the predictions themselves as labels.
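To make the idea concrete, here is a minimal sketch for a binary classification problem. It assumes you already have predicted probabilities for the test set from a first-round model; the function name `add_pseudo_labels`, the `low`/`high` confidence thresholds, and the toy arrays are all hypothetical choices made for illustration, not something prescribed by any competition solution.

```python
import numpy as np

def add_pseudo_labels(X_train, y_train, X_test, test_proba, low=0.01, high=0.99):
    """Append confidently predicted test rows to the training set (binary case).

    The thresholds `low` and `high` are illustrative assumptions; tune them.
    """
    test_proba = np.asarray(test_proba, dtype=float)
    # Keep only test rows whose predicted probability is close to 0 or to 1.
    confident = (test_proba <= low) | (test_proba >= high)
    # The pseudo-label is the predicted class of each confident row.
    pseudo_y = (test_proba[confident] >= high).astype(int)
    X_aug = np.vstack([X_train, np.asarray(X_test)[confident]])
    y_aug = np.concatenate([y_train, pseudo_y])
    return X_aug, y_aug, confident

# Toy data: four labeled rows and three unlabeled test rows.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[-1.0], [1.5], [4.0]])
# Made-up probabilities standing in for a first-round model's test predictions.
test_proba = np.array([0.002, 0.55, 0.995])

X_aug, y_aug, mask = add_pseudo_labels(X_train, y_train, X_test, test_proba)
print(X_aug.shape, y_aug.tolist(), mask.tolist())
```

You would then retrain on `X_aug` and `y_aug`, and compare validation scores computed on the original labeled data only, so that the pseudo-labels never leak into your evaluation.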
First introduced in the Santander Customer Transaction Prediction competition by team Wizardry (read here: https://www.kaggle.com/c/santander-customer-transaction-prediction/discussion/89003), pseudo-labeling simply helps models refine their coefficients by making more data available, but it won’t always work. First of all, it is not necessary in some competitions: adding pseudo-labels won’t change the result, and it may even worsen it if the pseudo-labeled data carries added noise.
Unfortunately, you cannot know for sure beforehand whether or not pseudo-labeling will work in a competition (you have to test it empirically), though plotting...