Building and training a pipeline
As with models, we have to train a pipeline before it is added to the catalog. Pipeline training involves several steps:
- Create and name the pipeline object.
- Optionally, compute features from other GDS algorithms (such as graph algorithms, embeddings, or pre-processing).
- Define the feature set from the features added in the previous step, and/or any node property included in the projected graph.
- Select the ML models to be tested, along with their hyperparameters. Pipeline training runs every candidate model and selects the best one.
- Finally, train the model.
The following sub-sections detail each of these steps. The supporting notebook is Pipeline_Train_Predict, which can be found in the Chapter08 folder of the code bundle that comes with this book.
Creating the pipeline and choosing the features
In GDS, we can create three kinds of pipelines:
- Node classification: Each node gets assigned to one target...