Test your knowledge
Use the full housing dataset from this book's GitHub repository (under Chapter13/data/housing_data_full.csv), then use PyCaret and/or another AutoML package to find the best ML model for the data. If it takes too long to run, it may help to first use recursive feature selection to trim down the number of features, or to sample down the data. Once the best model has been found, plot its learning curve to see whether we have enough data or should ideally collect more. You should see results similar to what we've seen in this chapter, although with the full dataset, you may find that the learning curve has flattened out.
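As a starting point, the feature-trimming and learning-curve steps can be sketched roughly as follows. This is a minimal sketch under several assumptions, not the chapter's solution: it uses scikit-learn directly rather than PyCaret, uses scikit-learn's RFE (recursive feature elimination) as one form of recursive feature selection, and substitutes synthetic data from make_regression for the housing CSV so that it runs on its own. The has_flattened helper and its tolerance are hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Assumption: synthetic data stands in for the housing dataset.
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5, noise=10.0, random_state=42
)

# Trim the feature set with recursive feature elimination,
# keeping 5 features (an arbitrary choice for this sketch).
selector = RFE(LinearRegression(), n_features_to_select=5)
X_reduced = selector.fit_transform(X, y)

# Score the model on increasing amounts of training data with 5-fold CV.
train_sizes, train_scores, val_scores = learning_curve(
    LinearRegression(),
    X_reduced,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="r2",
)
mean_val = val_scores.mean(axis=1)


def has_flattened(scores, tol=0.01):
    """Hypothetical helper: True if the last step improved the score by less than tol."""
    return bool(scores[-1] - scores[-2] < tol)


for size, score in zip(train_sizes, mean_val):
    print(f"{size} samples: mean validation R^2 = {score:.3f}")
print("Learning curve flattened:", has_flattened(mean_val))
```

If the validation score is still climbing at the largest training size, more data would likely help; if it has leveled off, collecting more is unlikely to improve the model much. The printed values can be passed to a plotting library to draw the curve.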