Launching and preparing the Cloud9 environment
In this recipe, we will launch and configure an AWS Cloud9 instance running an Ubuntu server. This will serve as the experimentation and simulation environment for the other recipes in this chapter. After that, we will resize the volume attached to the instance so that we can build container images later. This will ensure that we don't have to worry about disk space issues while we are working with Docker containers and container images. In the recipes that follow, we will prepare the file and directory structure that our train and serve scripts will expect when they run inside the custom container.
Important note
Why go through all this effort of preparing an experimentation environment? Once we have finished preparing the experimentation environment, we will be able to prepare, test, and update the custom scripts quickly, without having to use the fit() and deploy() functions from the SageMaker Python SDK during the initial stages of writing the script. With this approach, the feedback loop is much faster, and we will detect the issues in our script and container image before we even attempt using these with the SageMaker Python SDK during training and deployment.
Getting ready
Make sure you have permission to manage the AWS Cloud9 and EC2 resources if you're using an AWS IAM user with a custom URL. It is recommended to be signed in as an AWS IAM user instead of using the root account in most cases.
How to do it…
The steps in this recipe can be divided into three parts:
- Launching a Cloud9 environment
- Increasing the disk space of the environment
- Making sure that the volume configuration changes get reflected by rebooting the instance associated with the Cloud9 environment
We'll begin by launching the Cloud9 environment with the help of the following steps:
- Click Services on the navigation bar. A list of services will be shown in the menu. Under Developer Tools, look for Cloud9 and then click the link to navigate to the Cloud9 console:
In the preceding screenshot, we can see the services after clicking the Services link on the navigation bar.
- In the Cloud9 console, navigate to Your environments using the sidebar and click Create environment:
Here, we can see that the Create environment button is located near the top-right corner of the page.
- Specify the environment's name (for example, Cookbook Experimentation Environment) and, optionally, a description for your environment. Click Next step afterward:
Here, we have the Name environment form, where we can specify the name and description of our Cloud9 environment.
- Select the Create a new EC2 instance for environment (direct access) option under Environment type, t3.small under Instance type, and Ubuntu Server 18.04 LTS under Platform:
We can see the different configuration settings here. Feel free to choose a different instance type as needed.
- Under Cost-saving setting, select After one hour. Leave the other settings as-is and click Next step:
Here, we can see that we have selected a Cost-saving setting of After one hour. This means that after an hour of inactivity, the EC2 instance linked to the Cloud9 environment will be automatically turned off to save costs.
- Review the configuration you selected in the previous steps and then click Create environment:
After clicking the Create environment button, it may take a minute or so for the environment to be ready. Once the environment is ready, check the different sections of the IDE:
As you can see, we have the file tree on the left-hand side. At the bottom part of the screen, we have the Terminal, where we can run our Bash commands. The largest portion, at the center of the screen, is the Editor, where we can edit the files.
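The console steps above can also be approximated from the command line. The following is a minimal sketch rather than the method used in this recipe. It assumes that the AWS CLI is installed and configured, and that the ubuntu-18.04-x86_64 image alias is still offered in your Region. The create_cookbook_env function and the AWS override variable are hypothetical names introduced only for illustration:

```shell
# AWS defaults to the real CLI. Set AWS=echo to print the arguments
# instead of calling the service (a simple dry run).
AWS="${AWS:-aws}"

# create_cookbook_env NAME
# Hypothetical helper mirroring the console choices in this recipe:
# t3.small, Ubuntu Server 18.04 LTS, and stop after 60 idle minutes.
create_cookbook_env() {
  "$AWS" cloud9 create-environment-ec2 \
    --name "$1" \
    --instance-type t3.small \
    --image-id ubuntu-18.04-x86_64 \
    --automatic-stop-time-minutes 60
}

# Usage (not run automatically):
#   create_cookbook_env "Cookbook Experimentation Environment"
```

Running the helper with AWS=echo first is a cheap way to review the exact arguments before anything is created in the account.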
Now, we need to increase the disk space.
- Using the Terminal at the bottom section of the IDE, run the following command:
lsblk
With the lsblk command, we will get information about the available block devices, as shown in the following screenshot:
Here, we can see the results of the lsblk command. At this point, the root volume only has 10G of disk space (minus what is already in the volume).
- At the top left section of the screen, click AWS Cloud9. From the dropdown list, click Go To Your Dashboard:
This will open a new tab showing the Cloud9 dashboard.
- Navigate to the EC2 console using the search bar. Type ec2 in the search bar and click the EC2 service from the list of results:
Here, we can see that the search bar quickly gives us a list of search results after we have typed in ec2.
- In the EC2 console, click Instances (running) under Resources:
We should see the link we need to click under the Resources pane, as shown in the preceding screenshot.
- Select the EC2 instance corresponding to the Cloud9 environment we launched in the previous set of steps. It should contain aws-cloud9 and the name we specified while creating the environment. In the bottom pane showing the details, click the Storage tab to show Root device details and Block devices.
- Inside the Storage tab, scroll down to the bottom of the page to locate the volumes under Block devices:
Here, we can see the Storage tab showing Root device details and Block devices.
- You should see an attached volume with 10 GiB for the volume size. Click the link under Volume ID (for example, vol-0130f00a6cf349ab37). Take note that this Volume ID will be different for your volume:
You will be redirected to the Elastic Block Store Volumes page, which shows the details of the volume attached to your instance:
Here, we can see that the size of the volume is currently set to 10 GiB.
- Click Actions and then Modify Volume:
This is where we can find the Modify Volume option.
- Set Size to 100 and click Modify:
As you can see, we specified a new volume size of 100 GiB. This should be more than enough to help us get through this chapter and build our custom algorithm container image.
- Click Yes to confirm the volume modification action:
We should see a confirmation screen here after clicking Modify in the previous step.
- Click Close upon seeing the confirmation dialog:
Here, we can see a message stating Modify Volume Request Succeeded. At this point, the volume modification is still pending and we need to wait about 10-15 minutes for this to complete. Feel free to check out the How it works… section for this recipe while waiting.
- Click the refresh button (the two rotating arrows) so that the volume state will change to the correct state accordingly:
Clicking the refresh button will update State from in-use (green) to in-use – optimizing (yellow):
Here, we can see that the volume modification step has not been completed yet.
- After a few minutes, State of the volume will go back to in-use (green):
When we see what is shown in the preceding screenshot, we should celebrate as this means that the volume modification step has been completed!
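The same volume change can be sketched with the AWS CLI instead of the console. This is only an illustration under stated assumptions: the AWS CLI is installed and configured with EC2 permissions, and resize_volume, volume_modification_state, and wait_for_resize are hypothetical helper names that are not part of this recipe. The modification states polled here (modifying, optimizing, completed, failed) are the ones reported by the describe-volumes-modifications command:

```shell
# resize_volume VOLUME_ID SIZE_GIB
# Requests a new size for an EBS volume (the CLI counterpart of Modify Volume).
resize_volume() {
  aws ec2 modify-volume --volume-id "$1" --size "$2"
}

# volume_modification_state VOLUME_ID
# Prints the current modification state of the volume.
volume_modification_state() {
  aws ec2 describe-volumes-modifications \
    --volume-ids "$1" \
    --query 'VolumesModifications[0].ModificationState' \
    --output text
}

# wait_for_resize VOLUME_ID [POLL_SECONDS]
# Polls until the modification has completed (returns 0) or failed (returns 1).
wait_for_resize() {
  local state
  while true; do
    state="$(volume_modification_state "$1")"
    echo "state: $state"
    case "$state" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "${2:-15}"
  done
}

# Usage (not run automatically; replace the placeholder with your Volume ID):
#   resize_volume <your-volume-id> 100 && wait_for_resize <your-volume-id>
```

Polling from the Terminal plays the same role as clicking the refresh button until State returns to in-use (green).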
Now that the volume modification step has been completed, our next goal is to make sure that this change is reflected in our environment.
- Navigate back to the browser tab of the AWS Cloud9 IDE. In the Terminal, run lsblk:
lsblk
Running lsblk should yield the following output:
As you can see, while the size of the root volume, /dev/nvme0n1, reflects the new size, 100G, the size of the /dev/nvme0n1p1 partition reflects the original size, 10G.
There are multiple ways to grow the partition, but we will proceed by simply rebooting the EC2 instance so that the size of the /dev/nvme0n1p1 partition will reflect the size of the root volume, which is 100G.
- Navigate back to the EC2 Volumes page and select the EC2 volume attached to the Cloud9 instance. At the bottom portion of the screen showing the volume's details, locate the Attachment information value under the Description tab. Click the Attachment information link:
Clicking this link will redirect us to the EC2 Instances page. It will automatically select the EC2 instance of our Cloud9 environment:
The preceding screenshot shows the EC2 instance linked to our Cloud9 environment.
- Click Instance state at the top right of the screen and click Reboot instance:
This is where we can find the Reboot instance option.
- Navigate back to the browser tab showing the AWS Cloud9 environment IDE. It should take a minute or two to complete the reboot step:
We should see a screen similar to the preceding one.
- Once connected, run lsblk in the Terminal:
lsblk
We should get a set of results similar to what is shown in the following screenshot:
As we can see, the /dev/nvme0n1p1 partition now reflects the size of the root volume, which is 100G.
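Since lsblk reports the size of the partition rather than the size of the filesystem on it, a quick extra check with df can confirm that the space is actually usable. The following is a minimal sketch. The root_fs_size_gib and root_fs_at_least functions are hypothetical helpers, and the threshold of 90 in the usage comment is an arbitrary value chosen for illustration (filesystem overhead means the reported size is typically a little below the 100 GiB volume size):

```shell
# root_fs_size_gib
# Prints the total size of the filesystem mounted at /, in whole GiB
# (rounded down). df -P -k gives portable output in 1024-byte blocks.
root_fs_size_gib() {
  df -Pk / | awk 'NR == 2 { print int($2 / 1048576) }'
}

# root_fs_at_least MIN_GIB
# Succeeds when the root filesystem is at least MIN_GIB GiB in size.
root_fs_at_least() {
  [ "$(root_fs_size_gib)" -ge "$1" ]
}

# Usage after the reboot (not run automatically):
#   root_fs_at_least 90 && echo "root filesystem has been resized"
```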
That was a lot of setup work, but it will definitely be worth it, as you will see in the next few recipes in this chapter. Now, let's see how this works!
How it works…
In this recipe, we launched a Cloud9 environment where we will prepare the custom container image. Docker container images and the intermediate layers created while building them consume disk space, and the default 10 GiB root volume can fill up quickly. This is why we went through a few steps to increase the size of the volume attached to the EC2 instance of our Cloud9 environment. This recipe was composed of three parts: launching a new Cloud9 environment, modifying the attached volume, and rebooting the instance.
Launching a new Cloud9 environment involves using a CloudFormation template behind the scenes. This CloudFormation template is used as the blueprint when creating the EC2 instance:
Here, we have a CloudFormation stack that was successfully created. What's CloudFormation? AWS CloudFormation is a service that helps developers and DevOps professionals manage resources using templates written in JSON or YAML. These templates get converted into AWS resources using the CloudFormation service.
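To look at this stack without opening the CloudFormation console, the AWS CLI can be used. The following is a sketch under assumptions: the AWS CLI is installed and configured, and the stack created for the environment has a name starting with aws-cloud9-, which is inferred from the instance naming seen earlier and may differ in your account. The list_cloud9_stacks and cloud9_stack_ready functions are hypothetical helpers:

```shell
# list_cloud9_stacks
# Prints the name and status of every stack whose name starts with
# "aws-cloud9-" (an assumed naming convention, see the note above).
list_cloud9_stacks() {
  aws cloudformation describe-stacks \
    --query "Stacks[?starts_with(StackName, 'aws-cloud9-')].[StackName, StackStatus]" \
    --output text
}

# cloud9_stack_ready STACK_NAME
# Succeeds when the named stack reports CREATE_COMPLETE.
cloud9_stack_ready() {
  local status
  status="$(aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text)"
  [ "$status" = "CREATE_COMPLETE" ]
}

# Usage (not run automatically):
#   list_cloud9_stacks
```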
At this point, the EC2 instance should be running already and we can use the Cloud9 environment as well:
We should be able to see the preceding output once the Cloud9 environment is ready. If we were to use the environment right away, we would run into disk space issues as we will be working with Docker images, which take up a bit of space. To prevent these issues from happening later on, we modified the volume in this recipe and restarted the EC2 instance so that this volume modification gets reflected right away.
Important note
In this recipe, we took a shortcut and simply restarted the EC2 instance. If we were running a production environment, we should avoid having to reboot and follow this guide instead: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html.
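For reference, the approach in that guide boils down to extending the partition and then the filesystem. The following is a minimal sketch of that idea, not a replacement for the guide. It assumes an NVMe device layout like the one shown by lsblk in this recipe (/dev/nvme0n1 with partition /dev/nvme0n1p1), an ext4 root filesystem (an XFS filesystem would need xfs_growfs instead of resize2fs), and that the growpart utility is installed. The grow_root_without_reboot function is a hypothetical helper:

```shell
# grow_root_without_reboot DISK PARTITION_NUMBER
# Extends the partition to fill the disk, then grows the ext4 filesystem
# on it. The partition path is built using the NVMe "p<N>" naming scheme.
grow_root_without_reboot() {
  local disk="$1" part_num="$2"
  sudo growpart "$disk" "$part_num" &&
    sudo resize2fs "${disk}p${part_num}"
}

# Usage (not run automatically):
#   grow_root_without_reboot /dev/nvme0n1 1
```

Chaining the two commands with && means the filesystem is only resized if the partition was extended successfully.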
Note that we can also use a SageMaker Notebook instance that's been configured with root access enabled as a potential experimentation environment for our custom scripts and container images, before using them in SageMaker. The issue here is that a SageMaker Notebook instance reverts to how it was originally configured every time we stop and start it. Files outside the persistent /home/ec2-user/SageMaker directory, along with any packages we installed, are lost, which is not ideal.