Experimenting and modeling
In this section, we will train a regression model to predict trip duration from several features in the dataset, such as date, time, pickup and drop-off locations, and distance. To explore the Data Science experience in Fabric, we will train two versions of the model with different sets of hyperparameters and register each of them in the model registry. Along the way, we will log all the hyperparameters and evaluation metrics using the native MLflow integration in Fabric.
Note
MLflow is an open-source platform for managing the end-to-end ML life cycle. You can read more about it at https://mlflow.org/docs/latest/index.html.
The code that will be discussed in this section can be found in the Data Science – Model Training notebook. Please make sure you attach the lakehouse (nyctaxilake) you created in the Data and storage – creating...