To get the most out of this book
As mentioned earlier, you should be comfortable with Python development to get the most out of your time with this book. The pages don’t spend much time on the software itself, but everything in the GitHub repository is written in Python. If you’re already using a few key AWS services, such as Amazon SageMaker, S3 buckets, ECR images, and FSx for Lustre, you will be able to apply what you learn here much faster. If you’re new to these, that’s OK; we’ll introduce each of them.
| AWS service or open-source software framework | What we’re using it for |
| --- | --- |
| Amazon SageMaker | Studio, notebook instances, training jobs, endpoints, pipelines |
| S3 buckets | Storing objects and retrieving metadata |
| Elastic Container Registry | Storing Docker images |
| FSx for Lustre | Storing large-scale data for model training loops |
| Python | General scripting, including managing and interacting with services, importing other packages, cleaning your data, defining your model training and evaluation loops, etc. |
| PyTorch and TensorFlow | Deep learning frameworks to define your neural networks |
| Hugging Face | Hub with more than 100,000 open-source pretrained models and countless extremely useful and reliable methods for NLP and increasingly CV |
| Pandas | Go-to library for data analysis |
| Docker | Open-source framework for building and managing containers |
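Before working through the examples, it may help to confirm which of the Python packages behind the table above are already installed in your environment. The following is a minimal sketch, not part of the book's repository: the `check_packages` helper is hypothetical, and the module names (`sagemaker`, `boto3`, `torch`, `tensorflow`, `transformers`, `pandas`) are assumptions about how these tools are usually imported.

```python
from importlib.util import find_spec

# Assumption: these are the usual import names for the tools listed in the table.
# Adjust the mapping to match the packages you actually plan to use.
PACKAGES = {
    "SageMaker Python SDK": "sagemaker",
    "boto3 (S3, ECR, FSx for Lustre)": "boto3",
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "Hugging Face Transformers": "transformers",
    "pandas": "pandas",
}


def check_packages(packages):
    """Map each label to True if its module can be located, without importing it."""
    return {label: find_spec(module) is not None for label, module in packages.items()}


if __name__ == "__main__":
    for label, installed in check_packages(PACKAGES).items():
        status = "installed" if installed else "missing"
        print(f"{label:35s} {status}")
```

Any package reported as missing can usually be installed with `pip`. Docker is a separate tool rather than a Python package, so it is not covered by this check.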
If you are using the digital version of this book, we advise you to access the code from the book’s GitHub repository (a link is available in the next section), step through the examples, and type the code yourself. Doing so will help you avoid any potential errors related to the copying and pasting of code.