Storing the training data
You can use several AWS services to prepare data for machine learning, including Elastic MapReduce (EMR), Redshift, and Glue. After preprocessing the training data, store it in S3 in the format expected by the algorithm you are using. Table 6.1 lists the acceptable data formats per algorithm.
Data format | Algorithm |
 | Object detection algorithm, semantic segmentation |
 | Object detection algorithm |
 | Factorization machines, K-Means, KNN, latent Dirichlet allocation, linear learner, NTM, PCA, RCF, sequence-to-sequence |
 | BlazingText, DeepAR |
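As a rough illustration of storing preprocessed data in S3, the following minimal sketch serializes a few time series as JSON Lines, one of the input layouts DeepAR reads, and builds the S3 location for the file. The bucket name, prefix, file name, sample values, and all three helper functions are hypothetical. The upload helper assumes boto3 is installed and AWS credentials are configured, and it is not called here.

```python
import json
from typing import Iterable, Mapping


def to_json_lines(records: Iterable[Mapping]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def training_s3_uri(bucket: str, prefix: str, filename: str) -> str:
    """Build the S3 URI under which a training file would be stored."""
    parts = [p.strip("/") for p in (prefix, filename) if p.strip("/")]
    return f"s3://{bucket}/" + "/".join(parts)


def upload_training_file(local_path: str, bucket: str, key: str) -> None:
    """Upload a local file to S3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    boto3.client("s3").upload_file(local_path, bucket, key)


# Hypothetical time series: each record has a start timestamp and target values.
series = [
    {"start": "2024-01-01 00:00:00", "target": [5.0, 7.5, 6.1]},
    {"start": "2024-01-01 00:00:00", "target": [1.2, 0.9, 1.4]},
]

payload = to_json_lines(series)
uri = training_s3_uri("my-example-bucket", "deepar/train/", "train.json")
```

In practice, you would write `payload` to a local file and pass that path to `upload_training_file`, then point the training job's input channel at the resulting S3 prefix.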