This article is an excerpt from the book The Definitive Guide to Google Vertex AI by Jasmeet Bhatia and Kartik Chaudhary, a guide to accelerating your machine learning journey with Google Cloud Vertex AI and MLOps best practices.
Machine learning (ML) projects are inherently complex and require a very different development environment from regular software applications. When datasets are large, a data scientist may need several big data tools for quick wrangling or preprocessing, and a deep learning (DL) model might require several GPUs for fast training and experimentation. Additionally, dedicated compute resources are required for hosting models in production, and even more to scale them to the enterprise level. Acquiring such resources and tools is quite costly, and even if we manage to buy and set everything up, it takes significant effort and technical knowledge to bring them together into a project pipeline. Even then, risks around downtime and data security remain.
Nowadays, cloud-based solutions are very popular and take care of all the technical hassle, scaling, and security aspects for us. These solutions let ML developers focus more on project development and experimentation without worrying about infrastructure and other low-level things. As an artificial intelligence (AI)-first company, Google brings all the important resources required for ML project development under one umbrella called Vertex AI. In this chapter, we will learn about Vertex AI Workbench, a managed solution for Jupyter Notebook kernels that can help us bring our ML projects from prototype to production many times faster.
This chapter covers the following topics:

- Jupyter Notebook basics
- Vertex AI Workbench and its managed and user-managed notebook options
- Creating notebook instances from custom containers
- Scheduling one-time and recurring notebook executions
Jupyter Notebook is an open source web-based application for writing and sharing live code, documentation, visualizations, and so on. Jupyter Notebooks are very popular among ML practitioners because they make it possible to run code dynamically, collaborate easily, produce fast visualizations, and even serve as presentations. Most data scientists and ML practitioners prefer Jupyter Notebook as their primary tool for exploring, visualizing, and preprocessing data using powerful Python libraries such as pandas and NumPy. Jupyter Notebooks are particularly useful for exploratory data analysis (EDA) as they let us run small code blocks dynamically and draw quick plots to understand the data statistically. Notebooks can also be used for quick ML modeling experiments. Another advantage of Jupyter Notebooks is that they support Markdown cells; using Markdown, we can explain each code block inside the notebook and turn it into a tutorial. Finally, Jupyter Notebooks are widely used in the ML community for sharing and collaborating on projects on platforms such as GitHub and Kaggle.
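To illustrate the kind of quick exploration a notebook enables, here is a minimal EDA sketch; the sales.csv file and revenue column are hypothetical placeholders, and pandas and Matplotlib are assumed to be installed:

import pandas as pd

df = pd.read_csv("sales.csv")   # load a (hypothetical) dataset into a DataFrame
print(df.head())                # inspect the first few rows
print(df.describe())            # summary statistics for the numeric columns
df["revenue"].hist()            # quick histogram, rendered inline in the notebook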
The Jupyter Notebook application can be installed on a local system using a simple pip command (shown next). For quick experiments, we can also utilize web-based notebook kernels such as Colab and Kaggle, where everything is already set up and we can run Python code directly. As these kernels are public, we can't use them when our data is confidential; in that case, we have to install the Jupyter Notebook application on our own system.
We can install the Jupyter application on our local system by using the following pip command:
$ pip install jupyter
Once the application is installed, it can be launched through the terminal by typing the following command, and it will automatically open the Jupyter application in a browser tab:
$ jupyter notebook
If it doesn’t open the browser tab automatically, we can launch the application by typing the following URL: http://localhost:8888/tree. By default, the Jupyter server starts on port 8888, but if this port is unavailable, it finds the next available port. If we are interested in using a custom port, we can launch Jupyter by passing a custom port number.
Here is a terminal command for launching the Jupyter application on custom port number 9999:
$ jupyter notebook --port 9999
Note: In some cases, the Jupyter server may ask for a token (for example, when opening the URL manually in a non-default browser). In such cases, we can copy the URL from the terminal output, which includes the token. Alternatively, we can obtain the token by running the jupyter notebook list command in the terminal.
Once we are able to launch the application server in a browser, the Jupyter server looks something like this:
Figure 4.1 – Jupyter application server UI
Now, we can launch a Jupyter Notebook instance by clicking on the New button. It creates a new notebook and saves it in the same directory where we started the Jupyter Notebook from the terminal. We can now open that notebook in a new tab and start running scripts. The following screenshot shows an empty notebook:
Figure 4.2 – A Jupyter Notebook instance
As we can see in the previous screenshot, the web UI provides multiple options to manipulate notebooks, code, cells, kernels, and so on. A notebook cell can execute code or can be converted into a Markdown cell by changing its type from the drop-down menu. There are also options for exporting notebooks into different formats, such as HTML, PDF, Markdown, and LaTeX, for creating reports or presentations. Later in the book, we will work with notebooks extensively for data wrangling, modeling, and so on.
Now that we have some basic understanding of Jupyter Notebooks in general, let’s see how Vertex AI Workbench provides a more enriched experience of working with a Jupyter Notebook-based environment.
While working on an ML project, if we are running a Jupyter Notebook in a local environment, or using a web-based kernel such as Colab or Kaggle, we can perform quick experiments and get initial results from ML algorithms very fast. But we hit a wall when it comes to performing large-scale experiments, launching long-running jobs, hosting models, or monitoring them. Additionally, if the project's data requires more granular security and privacy controls (fine-grained control over who can view or access the data), local or Colab-like environments are not feasible. All these challenges can be solved by moving to the cloud. Vertex AI Workbench within Google Cloud is a JupyterLab-based environment that can be leveraged for all the development needs of a typical data science project. The JupyterLab environment is very similar to the Jupyter Notebook environment, so we will use these terms interchangeably throughout the book.
Vertex AI Workbench has options for creating managed notebook instances as well as user-managed notebook instances. User-managed notebook instances give more control to the user, while managed notebooks come with some key extra features. We will discuss these differences later in this section. Some key features of the Vertex AI Workbench notebook suite include the following:

- Prebuilt DL environments (TensorFlow Enterprise, PyTorch, JAX, and so on) with optional GPU accelerators
- Integrations with Google Cloud services such as Google Cloud Storage (GCS) and BigQuery for working with data without leaving the notebook
- Support for scheduling one-time or recurring notebook executions
- Enterprise-grade security controls, including fine-grained access management and VPC service controls
With this background, we can now start working with Jupyter Notebooks on Vertex AI Workbench. The next section provides basic guidelines for getting started with notebooks on Vertex AI.
Go to the Google Cloud console and open Vertex AI from the products menu on the left pane or by using the search bar at the top. Inside Vertex AI, click on Workbench, and it will open a page very similar to the one shown in Figure 4.3. More information is available in the official documentation (https://cloud.google.com/vertex-ai/docs/workbench/introduction).
Figure 4.3 – Vertex AI Workbench UI within the Google Cloud console
As we can see, Vertex AI Workbench is basically Jupyter Notebook as a service with the flexibility of working with managed as well as user-managed notebooks. User-managed notebooks are suitable for use cases where we need a more customized environment with relatively higher control. Another good thing about user-managed notebooks is that we can choose a suitable Docker container based on our development needs; these notebooks also let us change the type/size of the instance later on with a restart.
To choose the best Jupyter Notebook option for a particular project, it’s important to know about the common differences between the two solutions. Table 4.1 describes some common differences between fully managed and user-managed notebooks:
| Vertex AI managed notebooks | Vertex AI user-managed notebooks |
| --- | --- |
| Google-managed environment with integrations and features that provide an end-to-end notebook-based production environment without setting anything up by hand. | Heavily customizable VM instances (with prebuilt DL images) that are ideal for users who need a lot of control over the environment. |
| Scaling up and down (vCPUs and RAM) can be performed from within the notebook itself, without restarting the environment. | Changing the size/memory of an instance requires stopping it in the Workbench UI and restarting it every time. |
| Managed notebooks let us browse data in Google Cloud Storage (GCS) and BigQuery without leaving the Jupyter environment (through GCS and BigQuery integrations). | UI-level data browsing is not supported in user-managed notebooks. However, we can read the data using Python in a notebook cell and view it there. |
| Automated notebook runs are supported with one-time and recurring schedules. The executor runs scheduled tasks and saves results even when the instance is shut down. | Automated runs are not yet supported in a user-managed environment. |
| Less control over networking and security. | Option to implement desired networking and security features, including VPC service controls, on a per-need basis. |
| Limited control over the DL environment while setting up notebooks. | Multiple DL VM options to choose from during notebook creation. |
Table 4.1 – Differences between managed and user-managed notebook instances

Let's create a user-managed notebook to check the available options:
Figure 4.4 – Jupyter Notebook kernel configurations
As we can see in the preceding screenshot, user-managed notebook instances come with several customized image options to choose from. Along with supporting tools such as TensorFlow Enterprise, PyTorch, and JAX, they also let us decide whether we want to work with GPUs (which can be changed later as per our needs). These customized images come with all the useful libraries for the chosen framework pre-installed, while still providing the flexibility to install any third-party packages within the instance.
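Once an instance created from one of these images is up and running, a quick sanity check from a notebook cell can confirm the framework version and whether a GPU is visible. Here is a minimal sketch for a TensorFlow image:

import tensorflow as tf

print(tf.__version__)                            # framework version baked into the image
print(tf.config.list_physical_devices("GPU"))    # an empty list means no GPU is visible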
After choosing the appropriate image, we get more options to customize things such as notebook name, notebook region, operating system, environment, machine types, accelerators, and so on (see the following screenshot):
Figure 4.5 – Configuring a new user-managed Jupyter Notebook
Once we click on the CREATE button, it can take a couple of minutes to create a notebook instance. Once it is ready, we can launch the Jupyter instance in a browser tab using the link provided inside Workbench (see Figure 4.6). We also get the option to stop the notebook for some time when we are not using it (to reduce cost):
Figure 4.6 – A running Jupyter Notebook instance
This Jupyter instance can be accessed by all team members having access to Workbench, which helps in collaborating and sharing progress with other teammates. Once we click on OPEN JUPYTERLAB, it opens a familiar Jupyter environment in a new tab (see Figure 4.7):
Figure 4.7 – A user-managed JupyterLab instance in Vertex AI Workbench

A Google-managed JupyterLab instance looks very similar (see Figure 4.8):
Figure 4.8 – A Google-managed JupyterLab instance in Vertex AI Workbench
Now that we can access the notebook instance in the browser, we can launch a new Jupyter Notebook or terminal and get started on the project. After providing sufficient permissions to the service account, many useful Google Cloud services such as BigQuery, GCS, Dataflow, and so on can be accessed from the Jupyter Notebook itself using SDKs. This makes Vertex AI Workbench a one-stop tool for every ML development need.
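As a minimal sketch of such SDK access from a notebook cell (the google-cloud-bigquery and google-cloud-storage client libraries are assumed to be available, as they typically are on Workbench instances; the bucket name and object path are hypothetical, and the query uses a public BigQuery dataset):

from google.cloud import bigquery, storage

# Run a query against a public BigQuery dataset and load the result as a DataFrame
bq_client = bigquery.Client()
df = bq_client.query(
    "SELECT name, SUM(number) AS total "
    "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
    "GROUP BY name ORDER BY total DESC LIMIT 10"
).to_dataframe()
print(df)

# Download a file from a (hypothetical) GCS bucket to the instance's local disk
gcs_client = storage.Client()
gcs_client.bucket("my-example-bucket").blob("data/train.csv").download_to_filename("train.csv")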
Note: We should stop Vertex AI Workbench instances when we are not using them or don't plan to use them for a while. This prevents us from incurring unnecessary costs for idle instances.
In the next sections, we will learn how to create notebooks using custom containers and how to schedule notebooks with Vertex AI Workbench.
Vertex AI Workbench gives us the flexibility of creating notebook instances based on a custom container as well. The main advantage of a custom container-based notebook is that it lets us customize the notebook environment based on our specific needs. Suppose we want to work with a new TensorFlow version (or any other library) that is currently not available as a predefined kernel. We can create a custom Docker container with the required version and launch a Workbench instance using this container. Custom containers are supported by both managed and user-managed notebooks.
Here is how to launch a user-managed notebook instance using a custom container:
1. The first step is to create a custom container based on our requirements. Most of the time, a derivative container (a container based on an existing DL container image) is easy to set up. See the following example Dockerfile; here, we first pull an existing TensorFlow GPU image and then upgrade TensorFlow with pip:
FROM gcr.io/deeplearning-platform-release/tf-gpu:latest
RUN pip install --upgrade tensorflow
2. Next, build and push the container image to Container Registry so that it is accessible to the Google Compute Engine (GCE) service account. The following commands build and push the image:
export PROJECT=$(gcloud config list project --format "value(core.project)")
docker build . -f Dockerfile.example -t "gcr.io/${PROJECT}/tf-custom:latest"
docker push "gcr.io/${PROJECT}/tf-custom:latest"
Note that the service account should be provided with sufficient permissions to build and push the image to the container registry, and the respective APIs should be enabled.
3. Go to the User-managed notebooks page, click on the New Notebook button, and then select Customize. Provide a notebook name and select an appropriate Region and Zone value.
4. In the Environment field, select Custom Container.
5. In the Docker Container Image field, enter the address of the custom image; in our case, it would look like this:
gcr.io/${PROJECT}/tf-custom:latest
6. Make the remaining appropriate selections and click the Create button.
We are all set now. While launching the notebook, we can select the custom container as a kernel and start working in the custom environment.
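A quick way to verify that the kernel is indeed running our custom environment is to print the library version from a notebook cell, for example:

import tensorflow as tf

print(tf.__version__)   # should match the version installed in the custom container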
We can now successfully launch Vertex AI notebooks and also create custom container-based environments if required. In the next section, we will learn how to schedule notebook runs within Vertex AI.
Jupyter Notebook environments are great for initial experiments. But when it comes to launching long-running jobs, running multiple training trials with different input parameters (such as hyperparameter tuning jobs), or adding accelerators to training jobs, we usually copy our code into a Python file and launch experiments using custom Docker containers or managed pipelines such as Vertex AI Pipelines. To minimize this duplication of effort, Vertex AI managed notebook instances provide the functionality to schedule notebooks on an ad hoc or recurring basis. This feature executes a scheduled notebook cell by cell on Vertex AI, gives us the flexibility to seamlessly scale our processing power, and lets us choose suitable hardware for the task. Additionally, we can pass different input parameters for experimentation purposes.
Let's configure a notebook execution to check the various options it provides. Imagine we are building a toy application that takes two parameters, user_name and frequency, and, when executed, prints the user_name value as many times as frequency specifies. Now, let's launch a managed notebook and create our application, as follows:
Figure 4.9 – A simple Python application within Jupyter Notebook
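A minimal version of such an application could be just two notebook cells (the default values shown here are placeholders):

# Cell 1 – parameters (tagged as described next, so the executor can override them)
user_name = "vertex-user"
frequency = 5

# Cell 2 – print user_name as many times as frequency specifies
for _ in range(frequency):
    print(user_name)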
Next, put all the parameters into a single cell and click on the gear-like button at the top-right corner. Assign this cell the parameters tag. See the following screenshot:
Figure 4.10 – Tagging parameters within a Jupyter Notebook cell
Our toy application is now ready. Once we click on the Execute button in the toolbar, we get options for customizing the machine type, accelerators, environment (which can be a custom Docker container), and execution type (one-time or recurring). See the following screenshot:
Figure 4.11 – Configuring notebook execution for Python application
Next, let's change the parameters for our one-time execution by clicking on ADVANCED OPTIONS. Here, we can provide key-value pairs of parameter names and values. Check the following screenshot:
Figure 4.12 – Setting up parameters for one-time execution
Finally, click the SUBMIT button. It will then display the following dialog box:
Figure 4.13 – One-time execution scheduled
We have now successfully scheduled our notebook run with custom parameters on Vertex AI. We can find it under the EXECUTIONS section in the Vertex AI UI:
Figure 4.14 – Checking the EXECUTIONS section for executed notebook instances
We can now check the results by clicking on VIEW RESULT. Check the following screenshot for how it overrides the input parameters:
Figure 4.15 – Checking the results of the execution
Similarly, we can schedule large one-time or recurring experiments without moving our code out of the notebook and take advantage of the cloud platform’s scalability.
We just saw how easy it is to configure and schedule notebook runs within Vertex AI Workbench. This capability allows us to do seamless experiments while keeping our code in the notebook. This is also helpful in setting up recurring jobs in the development environment.
In this chapter, we learned about Vertex AI Workbench, a managed platform for launching the Jupyter Notebook application on Google Cloud. We talked about the benefits of having notebooks in a cloud-based environment as compared to a local environment. Having Jupyter Notebook in the cloud makes it perfect for collaboration, scaling, adding security, and launching long-running jobs. We also discussed additional features of Vertex AI Workbench that are pretty useful while working on different aspects of ML project development.
After reading this chapter, we should be able to successfully deploy, manage, and use Jupyter Notebooks on the Vertex AI platform for our ML development needs. Now that we understand the differences between managed and user-managed notebook instances, we should be in good shape to choose the best solution for our development needs. We should also be able to create custom Docker container-based notebooks if required. Most importantly, we should now be able to schedule notebook runs for recurring as well as one-time execution based on our requirements. Notebook scheduling is also quite useful for launching multiple model training experiments in parallel with different input parameters. With this background in Vertex AI Workbench, it will be easier to follow the code samples in the upcoming chapters.
Kartik is an Artificial Intelligence and Machine Learning professional with 6+ years of industry experience in developing and architecting large-scale AI/ML solutions built on advances in Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing. Kartik has filed nine patents at the intersection of Machine Learning, Healthcare, and Operations. He loves sharing knowledge, blogging, travel, and photography.
Jasmeet is a Machine Learning Architect with over 8 years of experience in Data Science and Machine Learning Engineering at Google and Microsoft, and 17 years of overall experience in Product Engineering and Technology Consulting at Deloitte, Disney, and Motorola. He has been involved in building technology solutions that solve complex business problems by utilizing information and data assets. He has built high-performing engineering teams and designed global-scale AI/Machine Learning, Data Science, and Advanced Analytics solutions for image recognition, natural language processing, sentiment analysis, and personalization.