Training on a compute cluster
In the previous section, we showed how to train your model on a compute instance. In this section, we show how to submit the training job to a compute cluster when it needs to scale out. AML lets you run your training code on different compute targets without changing the training script. To do so, you create an AML pipeline that handles the data processing, the model training, and the registration of the trained model, as explained in this section.
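To illustrate how the training script stays the same while only the compute target changes, here is a minimal sketch. It assumes the v1 Python SDK (the azureml-core package) and its ScriptRunConfig and Experiment classes. The JobSpec class, the two helper functions, and the names ./src, train.py, train-demo, my-instance, and cpu-cluster are hypothetical placeholders, not names taken from this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSpec:
    """Everything about a training run except where it executes (hypothetical helper)."""
    source_directory: str
    script: str
    experiment_name: str


def config_kwargs(spec: JobSpec, compute_target: str) -> dict:
    """Build the keyword arguments for ScriptRunConfig.

    Only `compute_target` depends on where the job runs; the script
    and its source directory are passed through unchanged.
    """
    return {
        "source_directory": spec.source_directory,
        "script": spec.script,
        "compute_target": compute_target,
    }


def submit(ws, spec: JobSpec, compute_target: str):
    """Submit the job to the named compute target and return the run.

    Assumes azureml-core is installed and `ws` is an azureml.core.Workspace.
    The import is deferred so this module loads even without the SDK.
    """
    from azureml.core import Experiment, ScriptRunConfig

    config = ScriptRunConfig(**config_kwargs(spec, compute_target))
    return Experiment(workspace=ws, name=spec.experiment_name).submit(config)


# Hypothetical usage, assuming a workspace config file is available:
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   spec = JobSpec("./src", "train.py", "train-demo")
#   submit(ws, spec, "my-instance")   # compute instance
#   submit(ws, spec, "cpu-cluster")   # compute cluster, same script
```

Keeping the job description separate from the target name makes the point visible: moving from a compute instance to a cluster changes one string, provided a compute target with that name already exists in the workspace.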
The following are the steps to train your model on a compute cluster:
- Go to https://ml.azure.com.
- Select your workspace name.
- On the left side of the workspace user interface, click Compute:
Figure 3.45 – Compute icon
- On the Compute screen, click on the Compute clusters tab and then click on + New, as shown in Figure 3.46:
Figure 3.46 – Creating a new compute cluster
-
...