SageMaker overview
Amazon SageMaker provides ML capabilities that cover the entire ML lifecycle, from initial experimentation to production deployment and ongoing monitoring. It serves several roles, including data scientists, data analysts, and MLOps engineers. The following diagram shows the key SageMaker features that support the end-to-end data science journey for each persona:
Figure 8.1: SageMaker capabilities
Within SageMaker, data scientists have access to a range of features and services that support different ML tasks. These include Studio notebooks for model building, Data Wrangler for visual data preparation, the Processing service for large-scale data processing and transformation, the Training service for model training, the Tuning service for hyperparameter tuning, and the Hosting service for model hosting. With these tools, data scientists can handle data preparation, model building and training, model tuning, and model integration testing.
Data analysts, by contrast, can use SageMaker Canvas, a model-building service with a visual interface that lets them train models with little to no coding. They can also use Studio notebooks for lightweight data analysis and processing.
MLOps engineers manage and govern the ML environment. They are responsible for automating ML workflows, for which they can use SageMaker Pipelines, Model Registry, and endpoint monitoring. They also configure the processing, training, and hosting infrastructure so that it supports both interactive use by data scientists and automated workflows.
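As a quick recap of the three personas described above, the mapping from role to SageMaker feature can be sketched as a small lookup table. This is only an illustrative summary of the text, not a SageMaker API: the names `PERSONA_FEATURES` and `personas_using` are hypothetical, and the feature lists contain only what this section mentions.

```python
# Persona-to-feature map transcribed from the overview in this section.
# PERSONA_FEATURES and personas_using are illustrative names, not part of any SDK.
PERSONA_FEATURES = {
    "data scientist": [
        "Studio notebooks",
        "Data Wrangler",
        "Processing",
        "Training",
        "Tuning",
        "Hosting",
    ],
    "data analyst": ["Canvas", "Studio notebooks"],
    "MLOps engineer": ["Pipelines", "Model Registry", "endpoint monitoring"],
}


def personas_using(feature: str) -> list[str]:
    """Return the personas whose feature list includes the given feature."""
    return [
        persona
        for persona, features in PERSONA_FEATURES.items()
        if feature in features
    ]


# Studio notebooks are shared by two personas in the text above.
print(personas_using("Studio notebooks"))  # ['data scientist', 'data analyst']
```

The lookup makes one point from the text easy to see: Studio notebooks appear under more than one persona, while most other features are associated with a single role.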
In this chapter, we focus on the data science environment tailored to data scientists. The following chapter covers the administration, governance, and automation of ML infrastructure.