Overview of a standard Amazon Machine Learning workflow
The Amazon Machine Learning service is available at https://console.aws.amazon.com/machinelearning/. The Amazon ML workflow closely follows a standard data science workflow, which consists of the following steps:
- Extract the data and clean it up. Make it available to the algorithm.
- Split the data into a training set and a validation set, typically with a 70/30 split and a similar distribution of the predictors in each part.
- Select the best model by training several models on the training dataset and comparing their performances on the validation dataset.
- Use the best model for predictions on new data.
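The steps above can be sketched in a few lines of plain Python. This is only an illustration of the generic workflow, not of how Amazon ML implements it: the synthetic height/weight data, the two candidate models, and helper names such as `split_data` and `rmse` are all assumptions made for the example.

```python
import random

def split_data(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_mean(train):
    """Baseline model: always predict the mean target of the training set."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Simple linear regression fitted by ordinary least squares."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def rmse(model, rows):
    """Root mean squared error of a model on a set of (x, y) rows."""
    return (sum((model(x) - y) ** 2 for x, y in rows) / len(rows)) ** 0.5

# Step 1: obtain clean data (synthetic here: height in cm, weight in kg).
rng = random.Random(42)
data = []
for _ in range(200):
    height = rng.uniform(150, 200)
    weight = 0.9 * height - 90 + rng.gauss(0, 3)
    data.append((height, weight))

# Step 2: 70/30 split.
train, valid = split_data(data)

# Step 3: train several models, compare them on the validation set.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}
scores = {name: rmse(model, valid) for name, model in candidates.items()}
best_name = min(scores, key=scores.get)
best_model = candidates[best_name]

# Step 4: use the best model on new data.
print(best_name, best_model(175.0))
```

The random shuffle before the split is what keeps the distribution of the predictor roughly the same in both parts; splitting sorted data sequentially would not.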
As shown in the following Amazon ML menu, the service is built around four objects:
![](https://static.packt-cdn.com/products/9781785883231/graphics/image_03_006.png)
- Datasource
- ML model
- Evaluation
- Prediction
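The same four objects can also be created programmatically. The following is a minimal sketch, assuming a client object that exposes the Amazon ML operations of the AWS SDK for Python (for example, `boto3.client("machinelearning")`). The function name `build_pipeline`, the ID suffixes, and all S3 locations are hypothetical and chosen only for illustration.

```python
import json

def build_pipeline(client, prefix, data_s3, schema_s3, output_s3):
    """Create a Datasource pair, an ML model, an Evaluation, and a
    batch Prediction, in that order. Returns the IDs that were used."""
    ids = {
        "train": prefix + "-ds-train",
        "valid": prefix + "-ds-valid",
        "model": prefix + "-model",
        "evaluation": prefix + "-eval",
        "prediction": prefix + "-batch",
    }

    def spec(begin, end):
        # DataRearrangement selects a percentage slice of the same file.
        split = {"splitting": {"percentBegin": begin, "percentEnd": end}}
        return {
            "DataLocationS3": data_s3,
            "DataSchemaLocationS3": schema_s3,
            "DataRearrangement": json.dumps(split),
        }

    # Datasource: 70% for training, 30% for validation.
    client.create_data_source_from_s3(
        DataSourceId=ids["train"], DataSpec=spec(0, 70), ComputeStatistics=True)
    client.create_data_source_from_s3(
        DataSourceId=ids["valid"], DataSpec=spec(70, 100), ComputeStatistics=True)

    # ML model: trained on the training datasource.
    client.create_ml_model(
        MLModelId=ids["model"], MLModelType="REGRESSION",
        TrainingDataSourceId=ids["train"])

    # Evaluation: scores the model on the validation datasource.
    client.create_evaluation(
        EvaluationId=ids["evaluation"], MLModelId=ids["model"],
        EvaluationDataSourceId=ids["valid"])

    # Prediction: the validation datasource stands in for new data here.
    client.create_batch_prediction(
        BatchPredictionId=ids["prediction"], MLModelId=ids["model"],
        BatchPredictionDataSourceId=ids["valid"], OutputUri=output_s3)

    return ids
```

With valid AWS credentials, this could be called as `build_pipeline(boto3.client("machinelearning"), "demo", "s3://my-bucket/data.csv", "s3://my-bucket/data.csv.schema", "s3://my-bucket/output/")`, where the bucket and file names are placeholders. In practice, the batch prediction would usually point at a separate datasource built from new, unlabeled data.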
The Datasource and ML model can also be configured and set up in the same flow by creating a new Datasource and ML model. Let us take a closer look at each of these steps.
The dataset
For the rest of the chapter, we will use the simple Predicting Weight by Height...