Exploring Amazon SageMaker Processing
Collecting and labeling data samples is only the first step in preparing a dataset. You will very likely need to pre-process the dataset as well, for example, to do the following:
- Convert it to the input format expected by the machine learning algorithm you're using.
- Rescale or normalize numerical features.
- Engineer higher-level features, for example, by one-hot encoding categorical variables.
- Clean and tokenize text for natural language processing applications.
- And more!
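To make two of these steps concrete, here is a minimal sketch of min-max rescaling and one-hot encoding in plain Python. The toy dataset and the helper names `min_max_scale` and `one_hot_encode` are hypothetical and exist only for illustration. In a real project, you would more likely rely on a library such as pandas or scikit-learn.

```python
# Hypothetical toy dataset: each row is (age, color).
rows = [(20, "red"), (30, "green"), (40, "red")]


def min_max_scale(values):
    """Rescale numbers linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # Guard against a constant column, which would otherwise divide by zero.
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def one_hot_encode(values):
    """Map each category to a 0/1 vector, one position per distinct category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


ages = min_max_scale([age for age, _ in rows])
colors = one_hot_encode([color for _, color in rows])

# Concatenate the scaled numerical feature with the one-hot columns.
features = [[age] + color for age, color in zip(ages, colors)]
print(features)
```

Sorting the categories keeps the column order deterministic from one run to the next, which matters when the same encoding has to be applied to both the training set and the validation set.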
Once training is complete, you may want to run additional jobs to post-process the predicted data and to evaluate your model on different datasets.
In this section, you'll learn about Amazon SageMaker Processing, a SageMaker capability that helps you run batch jobs related to your machine learning project.
Discovering the Amazon SageMaker Processing API
The Amazon SageMaker Processing API is part of the SageMaker SDK...