Setting up Amazon SageMaker Studio
Amazon SageMaker Studio goes one step further in integrating the ML tools you need from experimentation to production. At its core is an integrated development environment based on Jupyter that makes it instantly familiar.
In addition, SageMaker Studio is integrated with other SageMaker capabilities, such as SageMaker Experiments to track and compare all jobs, SageMaker Autopilot to automatically create ML models, and more. Many operations can be performed in just a few clicks, without writing any code.
SageMaker Studio also further simplifies infrastructure management. You won't have to create notebook instances: SageMaker Studio provides you with compute environments that are readily available to run your notebooks.
Note:
This section requires basic knowledge of Amazon VPC and Amazon IAM. If you're not familiar with them at all, please read the following documentation: a) https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html b) https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html
Onboarding to Amazon SageMaker Studio
You can access SageMaker Studio using any of these three options:
- Use the quick start procedure: This is the easiest option for individual accounts, and we'll walk through it in the following paragraphs.
- Use AWS Single Sign-On (SSO): If your company has an SSO application set up, this is probably the best option. You can learn more about SSO onboarding at https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-sso-users.html. Please contact your IT administrator for details.
- Use Amazon IAM: If your company doesn't use SSO, this is probably the best option. You can learn more about IAM onboarding at https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-iam.html. Again, please contact your IT administrator for details.
Onboarding with the quick start procedure
Perform the following steps to access SageMaker Studio with the quick start procedure:
- First, open the AWS Console in one of the regions where Amazon SageMaker Studio is available, for example, https://us-east-2.console.aws.amazon.com/sagemaker/.
- As shown in the following screenshot, the left-hand vertical panel has a link to SageMaker Studio:
- Clicking on this link opens the onboarding screen, and you can see its first section in the next screenshot:
- Let's select Quick start. Then, we enter the username we'd like to use to log into SageMaker Studio, and we create a new IAM role as shown in the preceding screenshot. This opens the following screen:
The only decision we have to make here is whether we want to allow our notebooks to access specific Amazon S3 buckets. Let's select Any S3 bucket and click on Create role. This is the most flexible setting for development and testing, but we'd want to apply much stricter settings for production. Of course, we can edit this role later on in the IAM console, or create a new one.
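To give an idea of what "much stricter settings" could look like, here is a minimal sketch that builds an IAM policy document limited to a list of named buckets. The function name `s3_bucket_policy`, the bucket name `my-sagemaker-bucket`, and the exact set of allowed actions are assumptions made for illustration; they are not what the console generates, and a production policy would need to be reviewed against your own requirements.

```python
import json


def s3_bucket_policy(bucket_names):
    """Build an IAM policy document limited to the given S3 buckets.

    Hypothetical helper for illustration: the action list below is an
    assumption, not the policy created by the quick start procedure.
    """
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the buckets themselves.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arns,
            },
            {
                # Reading, writing, and deleting is granted on the objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": object_arns,
            },
        ],
    }


# "my-sagemaker-bucket" is a hypothetical bucket name.
policy = s3_bucket_policy(["my-sagemaker-bucket"])
print(json.dumps(policy, indent=2))
```

The printed JSON could then be pasted into the IAM console as an inline policy on the role, in place of the broader Any S3 bucket setting.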
- Once we've clicked on Create role, we're back to the previous screen, where we just have to click on Submit to launch the onboarding procedure. Depending on your account setup, you may get an extra screen asking you to select a VPC and a subnet. I'd recommend selecting any subnet in your default VPC.
- A few minutes later, SageMaker Studio is in service, as shown in the following screenshot. We could add extra users if we needed to, but for now, let's just click on Open Studio:
Don't worry if this takes a few more minutes, as SageMaker Studio needs to complete the first-run setup of your environment. As shown in the following screenshot, SageMaker Studio opens, and we see the familiar JupyterLab layout:
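For readers who prefer scripting, the choices made in the console (a name, an IAM role, a VPC, and a subnet) can be gathered into a single request. The following is a rough sketch, assuming that these choices map onto the `DomainName`, `AuthMode`, `DefaultUserSettings`, `VpcId`, and `SubnetIds` parameters of the `create_domain` API in boto3. The helper name `build_domain_request` and all the identifiers in the example are hypothetical, and the sketch only builds the request without sending it.

```python
def build_domain_request(domain_name, execution_role_arn, vpc_id, subnet_ids,
                         auth_mode="IAM"):
    """Collect the onboarding choices into keyword arguments.

    Hypothetical helper: the keys are assumed to match the parameters
    of the SageMaker create_domain API. Nothing is sent to AWS here.
    """
    if auth_mode not in ("IAM", "SSO"):
        raise ValueError("auth_mode must be 'IAM' or 'SSO'")
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    return {
        "DomainName": domain_name,
        "AuthMode": auth_mode,
        # The role that Studio users assume by default.
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
    }


# All identifiers below are placeholders, not real resources.
request = build_domain_request(
    domain_name="my-studio-domain",
    execution_role_arn="arn:aws:iam::123456789012:role/my-studio-role",
    vpc_id="vpc-0123456789abcdef0",
    subnet_ids=["subnet-0123456789abcdef0"],
)
print(request)

# With valid credentials and real identifiers, the request could be sent with:
#   import boto3
#   boto3.client("sagemaker").create_domain(**request)
```

Keeping the request construction separate from the API call makes it easy to inspect the settings before anything is created in your account.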
Note:
SageMaker Studio is a living thing. By the time you're reading this, some screens may have been updated. Also, you may notice small differences from one region to the next, as some features or instance types are not available everywhere.
- We can immediately create our first notebook. In the Launcher tab, let's select Data Science, and click on Notebook – Python 3.
- This opens a notebook, as is visible in the following screenshot. We first check that SDKs are readily available. If this is the first time you've launched the Data Science image, please wait for a couple of minutes for the environment to start:
- When we're done working with SageMaker Studio, all we have to do is close the browser tab. If we want to resume working, we just have to go back to the SageMaker console, and click on Open Studio.
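The SDK check mentioned in the steps above can be as simple as the following sketch, which could be pasted into the first notebook cell. It assumes that the SDKs of interest are the `boto3` and `sagemaker` packages; the helper name `sdk_versions` is hypothetical. Packages that are not installed are reported as missing instead of raising an error.

```python
import importlib.util
from importlib import metadata


def sdk_versions(packages=("boto3", "sagemaker")):
    """Return a dict mapping each package name to its installed version.

    The value is None when the package cannot be found, and "unknown"
    when it is importable but has no distribution metadata.
    """
    versions = {}
    for name in packages:
        if importlib.util.find_spec(name) is None:
            versions[name] = None
            continue
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "unknown"
    return versions


for name, version in sdk_versions().items():
    status = version if version is not None else "not installed"
    print(f"{name}: {status}")
```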
Now that we've completed this exercise, let's review what we learned in this chapter.