Using other storage services
So far, we've used S3 to store training data. At large scale, throughput and latency can become bottlenecks, and it may be worth considering other storage services:
- Amazon Elastic File System (EFS): https://aws.amazon.com/efs
- Amazon FSx for Lustre: https://aws.amazon.com/fsx/lustre
Note
This section requires some AWS knowledge of VPCs, subnets, and security groups. If you're not familiar with these, I'd recommend reading the following:
https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Subnets.html
https://docs.aws.amazon.com/vpc/latest/userguide/VPC_SecurityGroups.html
Working with SageMaker and Amazon EFS
EFS is a managed storage service compatible with NFS v4. It lets you create file systems that can be mounted on EC2 instances and SageMaker instances. This is a convenient way to share data, and you can use it to scale I/O for large training jobs.
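To make this more concrete, here's a minimal sketch of how a training channel backed by EFS could be described. It uses plain Python dictionaries that follow the shape of the FileSystemDataSource and VpcConfig structures in the SageMaker CreateTrainingJob API. The helper names (efs_channel, vpc_config) are hypothetical, and the file system, subnet, and security group identifiers are placeholders, not real resources.

```python
from typing import Dict, List


def efs_channel(name: str, file_system_id: str, directory_path: str,
                read_only: bool = True) -> Dict:
    """Build one InputDataConfig entry that reads training data from EFS.

    Hypothetical helper: it only assembles a dictionary and makes no AWS call.
    """
    return {
        "ChannelName": name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemType": "EFS",
                # "ro" mounts the directory read-only, "rw" read-write
                "FileSystemAccessMode": "ro" if read_only else "rw",
                "DirectoryPath": directory_path,
            }
        },
    }


def vpc_config(subnet_ids: List[str], security_group_ids: List[str]) -> Dict:
    """Build the VpcConfig entry so that training instances can reach EFS."""
    return {"Subnets": subnet_ids, "SecurityGroupIds": security_group_ids}


# Placeholder identifiers, to be replaced with your own resources
channel = efs_channel("training", "fs-0123456789abcdef0", "/data/train")
network = vpc_config(["subnet-0123456789abcdef0"], ["sg-0123456789abcdef0"])
print(channel)
print(network)
```

These two dictionaries would be passed as part of the InputDataConfig and VpcConfig parameters of a training job. The job has to run inside a VPC with access to the file system, which is why some knowledge of subnets and security groups is needed here.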
By default, files are stored in the Standard class...