Embedding training code in Power Query
One of the easiest ways to train an ML model is to write the training code directly in Power Query, right after importing the dataset on which you want to build the model.
Training a model on a fairly large dataset typically takes a significant amount of time. Because the code is embedded in Power Query, it runs every time the data is refreshed, which can noticeably delay the moment the refreshed data becomes available. Therefore, the following applies:
IMPORTANT NOTE
This solution is recommended only if you are confident that the time required to complete the model training is acceptable for every data refresh.
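One way to check whether the training time is acceptable is to time the training step once, outside Power BI, before embedding it in Power Query. The following is only a minimal sketch: train_model is a hypothetical stand-in for your real training code, and MAX_ACCEPTABLE_SECONDS is an assumed time budget, not a value prescribed by Power BI.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return a (result, elapsed_seconds) tuple."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def train_model(rows):
    # Hypothetical placeholder for the real training step.
    # Replace the body with your actual training code.
    time.sleep(0.01)  # simulate some work
    return {"n_rows": len(rows)}


# Assumed budget: the longest delay you are willing to add to each refresh.
MAX_ACCEPTABLE_SECONDS = 60.0

model, elapsed = time_call(train_model, [1, 2, 3])
acceptable = elapsed <= MAX_ACCEPTABLE_SECONDS
print(f"Training took {elapsed:.2f} s; acceptable for refresh: {acceptable}")
```

If the measured time exceeds your budget, embedding the training code in Power Query is probably not the right choice for that dataset.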
Now let’s look at an example of how to write some training code with PyCaret.
Training and using ML models with PyCaret
Let’s take the Titanic disaster dataset to train an ML model. Specifically, we want to build a model that predicts whether a passenger survives (the Survived column) based on their attributes described...