Splitting your dataset for training and evaluation
To evaluate a model, you must divide your dataset into three subsets: a training set, a validation set, and a test set. During the training phase, AutoKeras will train your model with the training dataset, while using the validation dataset to evaluate its performance. Once you are ready, the final evaluation will be done using the test dataset.
Why you should split your dataset
Having a separate test dataset that is not used during training is really important to avoid information leaks.
As we mentioned previously, the validation set is used to tune the hyperparameters of your model based on the performance of the model, but some information about the validation data is filtered into the model. Due to this, you run the risk of ending up with a model that works artificially well with the validation data, because that's what you trained it for. However, the actual performance of the model is due to us using previously...