Managed data processing with SageMaker Processing in R
In the recipe "Preparing the SageMaker Processing prerequisites using the AWS CLI", we set up a few prerequisites, including the dummy dataset for our SageMaker Processing job and the ECR repository that will store the custom container image we build in this recipe.
Now, we will create an R script, build a custom R container image, and use SageMaker Processing to run the R script inside a managed environment. This environment is automatically created and configured when the processing job is launched, and it is torn down once the job completes. If you are working on a requirement that is similar to one of the following, then this recipe is for you:
- Normalizing numerical features with the normalr package
- Text preprocessing with the tm (text mining) package
- Automated feature engineering with the dplyr package
- Performing post-training processing and evaluation steps
Once we have completed this recipe, we will have...