Transforming data with recipes
A crucial element of the data science workflow is feature engineering. Amazon ML offers certain data transformations via its data recipes. Note that although transformations are conceptually part of the ETL or data preparation phase of a predictive analytics workflow, in Amazon ML, data recipes are part of the model-building step and not of the initial datasource creation step. In this section, we start by reviewing the available data transformations in Amazon ML, and then we apply some of them to the Titanic dataset using the Titanic train set 11 variables datasource.
Managing variables
Recipes are JSON-structured scripts that contain the following three sections in the given order:
- Groups
- Assignments
- Outputs
An empty recipe, which instructs Amazon ML to take all the dataset variables into account for model training, is as follows:
{ "groups" : {}, "assignments" : { }, "outputs":["ALL_INPUTS"] }
The recipe does not transform the data in any way.
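As a minimal sketch, the same empty recipe can be assembled in Python and serialized to JSON before being pasted into the console or passed to an API call. The helper name build_empty_recipe is hypothetical and not part of any Amazon ML SDK; the sketch relies only on the standard json module and on Python dictionaries preserving insertion order (Python 3.7 and later), which keeps the three sections in the required order.

```python
import json


def build_empty_recipe():
    """Hypothetical helper (not an Amazon ML API): returns the empty recipe.

    The keys are inserted in the order the recipe format expects:
    groups, then assignments, then outputs.
    """
    return {
        "groups": {},                 # no variable groups defined
        "assignments": {},            # no intermediate variables defined
        "outputs": ["ALL_INPUTS"],    # pass every input variable to the model
    }


recipe = build_empty_recipe()

# Serialize to a JSON string; key order follows insertion order.
recipe_json = json.dumps(recipe, indent=2)
print(recipe_json)
```

Building the recipe as a dictionary rather than a hand-written string makes it easier to add groups, assignments, and outputs programmatically later while still producing valid JSON.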
Note
The...