GCP compute services
Previously, we looked at the GCE service and created our VM instances in the cloud. Now, let’s look at the whole GCP compute spectrum, which includes Google Compute Engine (GCE), Google Kubernetes Engine (GKE), Cloud Run, Google App Engine (GAE), and Cloud Functions, as shown in the following diagram:
Figure 1.3 – GCP compute services
The GCP compute spectrum provides a broad range of business use cases. Based on the business model, we can choose GCE, GKE, GAE, Cloud Run, or Cloud Functions to match the requirements. We will discuss each of them briefly in the next few sections.
GCE virtual machines
We discussed the concepts surrounding GCE and provisioned VMs using the cloud console and Cloud Shell. In this section, we will discuss GCP GCE VM images and the pricing model.
Compute Engine images provide the base operating environment for applications that run in Compute Engines (that is, VMs), and they are critical to ensuring that your application deploys and scales quickly and reliably. You can also use golden/trusted images to archive application versions for disaster recovery or rollback scenarios. GCE images are also crucial in security since they can be used to deploy all the VMs in a company.
GCE offers different pricing models for VMs: pay-as-you-go, preemptive, committed usage, and sole-tenant host.
Pay-as-you-go is for business cases that need to provision VMs on the fly. If the workload is foreseeable, we want to use committed usage for the discounted price. If the workload can be restarted, we want to further leverage the preemptive model and bid for the VM prices. If licenses tied to the host exist, the sole-tenant host type fits our needs. For more details about GCE VM pricing, check out https://cloud.google.com/compute/vm-instance-pricing.
Load balancers and managed instance groups
A single computer may be down due to hardware or software failures, and it also does not provide any scaling when computing power demands are changed along the timeline. To ensure high availability and scalability, GCP provides load balancers (LBs) and managed instance groups (MIGs). LBs and MIGs allow you to create homogeneous groups of instances so that load balancers can direct traffic to more than one VM instance. MIG also offers features such as auto-scaling and auto-healing. Auto-scaling lets you deal with spikes in traffic by configuring the appropriate minimum and maximum instances in the autoscaling policy and scaling the number of VM instances up or down based on specific signals, while auto-healing performs health checking and, if necessary, automatically recreates unhealthy instances.
Let’s look at an example to explain this idea:
Figure 1.4 – GCP load balancers and managed instance groups
As shown in the preceding diagram, www.zeebestbuy.com is a global e-commerce company. Every year, when Black Friday comes, their website is so heavily loaded that a single computer cannot accommodate the traffic – many more web servers (running on VM instances) are needed to distribute the traffic load. After Black Friday, the traffic goes back to normal, and not that many instances are needed. On the GCP platform, we use LBs and MIGs to solve this problem. As shown in the preceding diagram, we build three web servers globally (N. Virginia in the US, Singapore, and London in the UK), and GCP DNS can distribute the user traffic to these three locations based on the user’s browser location and the latency to the three sites. At each site, we set up an LB and a MIG: the desired capacity, as well as the minimum and maximum capacities, can be set appropriately based on the normal and peak traffic. When Black Friday comes, the LB and MIG work together to elastically launch new VM instances (web servers) to handle the increased traffic. After the Black Friday sale ends, they will stop/delete the VM instances to reflect the decreased traffic.
MIG uses a launch template, which is like a launch configuration, and specifies instance configuration information, including the ID of the VM image, the instance type, the scaling thresholds, and other parameters that are used to launch VM instances. LB uses health checks to monitor the instances. If an instance is not responding within the configured threshed times, new instances will be launched based on the launch template.
Containers and Google Kubernetes Engine
Just like the transformation from physical machines into VMs, the transformation from VMs into containers is revolutionary. Instead of launching a VM to run an application, we package the application into a standard unit that contains everything to run the application or service in the same way on different VMs. We build the package into a Docker image; a container is a running instance of a Docker image. While a hypervisor virtualizes the hardware into VMs, a Docker image virtualizes an operating system into application containers.
Due to loose coupling and modular portability, more and more applications are being containerized. Quickly, a question arose: how can all these containers/Docker images be managed? There is where Google Kubernetes Engine (GKE) comes in, a container management system developed by Google. A GKE cluster usually consists of at least one control plane and multiple worker machines called nodes, which work together to manage/orchestrate the containers. A Kubernetes Pod is a group of containers that are deployed together and work together to complete a task. For example, an app server pod contains three separate containers: the app server itself, a monitoring container, and a logging container. Working together, they form the application or service of a business use case.
Following the instructions at https://cloud.google.com/kubernetes-engine/docs/how-to/creating-a-zonal-cluster, you can create a GKE zonal cluster. Let’s pause here and use GCP Cloud Shell to create a GKE cluster.
GCP Cloud Run
GCP Cloud Run is a managed compute platform that enables you to run stateless containers that can be invoked via HTTP requests on either a fully managed environment or in your GKE cluster. Cloud Run is serverless, which means that all infrastructure management tasks are the responsibility of Google, leaving the user to focus on application development. With Cloud Run, you can build your applications in any language using whatever frameworks and tools you want, and then deploy them in seconds without having to manage the server infrastructure.
GCP Cloud Functions
Different from the GCE and GKE services, which deploy VMs or containers to run applications, respectively, Cloud Functions is a serverless compute service that allows you to submit your code (written in JavaScript, Python, Go, and so on). Google Cloud will run the code in the backend and deliver the results to you. You do not know and do not care about where the code was run – you are only charged for the time your code runs on GCP.
Leveraging Cloud Functions, a piece of code can be triggered within a few milliseconds based on certain events. For example, after an object is uploaded to a GCS bucket, a message can be generated and sent to GCP Pub/Sub, which will cause Cloud Functions to process the object. Cloud Functions can also be triggered based on HTTP endpoints you define or by events in Firebase mobile applications.
With Cloud Functions, Google takes care of the backend infrastructure for running the code and lets you focus on the code development only.