Building a molecular property prediction model on SageMaker
In the previous chapters, we learned how to run SageMaker training and inference with built-in containers. In this chapter, we will see how to extend SageMaker to train with custom containers. We will write a Dockerfile that uses the SageMaker training toolkit to build a training container, use that container to train a molecular property prediction model, and review some results of the training job in a Jupyter notebook. We will also see how to test the container locally before submitting a training job that uses it. This handy SageMaker feature lets you validate that your training container works as expected and helps you debug errors if needed.
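As a preview of the local test mentioned above, here is a minimal sketch of how a custom container might be run in SageMaker local mode with the SageMaker Python SDK. It assumes Docker is running and the SDK is installed with its local extras. The image name, role ARN, hyperparameter, and data directory are hypothetical placeholders, not values from this chapter.

```python
def local_estimator_kwargs(image_uri, role_arn, hyperparameters=None):
    """Build keyword arguments for a SageMaker Estimator that runs in local mode."""
    return {
        "image_uri": image_uri,      # the custom training image built from the Dockerfile
        "role": role_arn,            # IAM role passed to the estimator
        "instance_count": 1,
        "instance_type": "local",    # "local" runs the container on this machine via Docker
        "hyperparameters": dict(hyperparameters or {}),
    }


def train_locally(kwargs, data_dir):
    """Start a local-mode training job. Requires Docker and the sagemaker SDK."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.estimator import Estimator

    estimator = Estimator(**kwargs)
    # In local mode, input channels can point at local files with a file:// URI.
    estimator.fit({"training": f"file://{data_dir}"})
    return estimator


# Hypothetical values, for illustration only.
kwargs = local_estimator_kwargs(
    image_uri="molprop-train:latest",
    role_arn="arn:aws:iam::111122223333:role/ExampleSageMakerRole",
    hyperparameters={"epochs": 5},
)
```

Calling `train_locally(kwargs, "/path/to/data")` would then start the container on the local machine. Changing `instance_type` to a managed instance type and pointing the channel at Amazon S3 is typically all that is needed to turn the same configuration into a regular training job.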
For this exercise, we will use a few custom libraries:
- RDKit: A collection of cheminformatics and machine-learning software written in C++ and Python...