Snooping on the leaderboard
As we previously described, in each competition Kaggle divides the test set into a public part, whose scores are displayed on the ongoing leaderboard, and a private part, which is used to calculate the final scores. The split is usually random (although in time series competitions it is based on time), and the entire test set is released without any indication of which examples are public and which are private.
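To make the public/private mechanics concrete, here is a minimal sketch in Python of a random split and the two resulting scores. It is only an illustration: the helper names, the 30% public fraction, the accuracy metric, and the toy data are all assumptions, not a description of how Kaggle actually implements the split.

```python
import random


def split_public_private(test_ids, public_fraction=0.3, seed=0):
    """Randomly assign test ids to a public and a private part.

    Hypothetical helper; the 30% public fraction is an assumed value.
    """
    rng = random.Random(seed)
    shuffled = list(test_ids)
    rng.shuffle(shuffled)
    n_public = int(len(shuffled) * public_fraction)
    return set(shuffled[:n_public]), set(shuffled[n_public:])


def accuracy(predictions, truth, ids):
    """Fraction of correct predictions over the given ids."""
    ids = list(ids)
    return sum(predictions[i] == truth[i] for i in ids) / len(ids)


# Toy data: 10 test rows with hidden labels and one participant's submission.
truth = {i: i % 2 for i in range(10)}
submission = {i: 0 for i in range(10)}  # naive model: predicts 0 everywhere

public_ids, private_ids = split_public_private(truth.keys())

# The submission covers every test row, but it is scored separately per part.
public_score = accuracy(submission, truth, public_ids)    # shown while the competition runs
private_score = accuracy(submission, truth, private_ids)  # used for the final standings
```

Because the participant never sees which rows fall into which part, the public score is only a sample-based estimate of the private one; on a small public part the two can differ noticeably.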
Recently, to prevent participants from scrutinizing the test data in certain competitions, Kaggle has even held back the test data, providing only a few examples of it and swapping in the real test set when the submission is made. These are called Code competitions because you do not submit the predictions themselves, but a Notebook containing the code that generates them.
Therefore, a submission derived from a model will cover the entire test set, but only the public part will immediately be scored, leaving...