In recent years, digital transformation has changed the way business is done. Mobility, the Internet of Things, and cloud computing require agility, simplicity, and speed to meet market demands. However, traditional businesses and enterprises maintain separation between departments, especially between those responsible for developing new features and those responsible for maintaining application stability. DevOps methodologies break down these silos and create a continuous feedback loop between development and operational processes. The DevOps goal is to deliver services faster and on demand, and this can be achieved when development and operations teams work together without any barriers.
DevOps concepts and microservice architectures
The DevOps culture introduces important guidelines, also called CALMS, that should be adopted at every level:
- Culture: Trust, collaboration, respect, and common goals are the main pillars of DevOps culture.
- Automation: Everything should be automated, from building to application delivery.
- Lean: Always optimize processes and reduce waste as much as possible.
- Measurement: Measure everything for continuous improvement.
- Sharing: Share everything, from ideas to common problems.
DevOps culture starts with increasing velocity in software development and deployment. This Agile approach reduces the time between an application's design and its deployment. Thus, DevOps culture promotes the continuous integration, continuous delivery, and continuous deployment model (often referred to as CI/CD) over the traditional waterfall model, as shown in the following diagram:
Figure 1.22 – Traditional waterfall model versus CI/CD model
Continuous integration is the process of constantly merging new code into the code base. This allows software engineers and developers to integrate new features more quickly. Automated testing can also be inserted early in the process, making it easier to catch problems and bugs. Continuous delivery is the process of staging code for review and inspection before release; here, there is manual control over the deployment phase of a new feature. On the other hand, continuous deployment leverages automation to deploy new features to production once the code has been committed and tested.
To support the CI/CD model and adopt DevOps methodology, software engineers have moved from monolith to microservices application design. A microservice is a small piece of software that is independently developed, tested, and deployed as part of a larger application. Moreover, a microservice is stateless and loosely coupled with independent technology and programming languages from other microservices. Large applications built as collections of microservices that work together have the following benefits:
- High horizontal scalability: Microservices can be created as workload increases.
- High modularity: Microservices can be reused to build modular applications.
- High fault tolerance: Microservices can be restarted quickly in case of crashes. Workloads can also be distributed across multiple identical microservices to improve reliability.
- High integration with the CI/CD model: Microservices can fit the CI/CD model because they can be quickly and easily tested and deployed in production.
The best way to follow the microservices approach is to leverage virtualization technology, or better still, containerization. In the next section, we will show how containers compare to virtual machines and the main differences that make them ideal for implementing microservices.
Containerization versus virtualization
Since we introduced virtual machines at the beginning of this chapter, it is time to understand what a container is and how it differs from a virtual machine. Containers are portable software packages that are independent of the infrastructure they run on. They wrap an application together with all the dependencies it needs to run.
Containers fit very well into the microservice architecture because they are modular and they are easy to change and scale.
The main differences between containers and virtual machines are shown in the following diagram:
Figure 1.23 – Virtual machines versus containers
The major features of containers, compared to virtual machines, are as follows:
- Faster deployment: Deploying a container requires seconds rather than minutes.
- Less overhead: Containers do not include a full operating system; virtual machines do.
- Faster migration: Migrating one container from one host to another takes seconds instead of minutes.
- Faster restart: Restarting one container takes seconds rather than minutes.
Usually, containers apply when users want to run multiple instances of the same application. Containers share a single operating system kernel, and they are logically separated in terms of the runtime environment, filesystem, and so on. Virtual machines are logically separated operating systems running on the same general-purpose hardware. Both virtual machines and containers need to run on software that provides virtualization. For virtual machines, the hypervisor is responsible for virtualizing the hardware to let multiple operating systems run on the same machine. For containers, the container engine is responsible for virtualizing the operating system (binaries, libraries, filesystem, and so on) to let multiple applications run on the same OS.
It is clear from Figure 1.23 that containers have less overhead than virtual machines. They do not need to load an operating system when the workload requires new applications. Applications can be started in seconds, and their isolation is maintained as it would be with virtual machines. In addition, application agility is improved because applications can be created or destroyed dynamically as the workload requires. Moreover, containers reduce the number of resources needed to deploy a new application. It has been well demonstrated that running a containerized application consumes far fewer resources than running the same application on a virtual machine, because containers do not need to load an OS with dozens of idle processes.
One of the most popular platforms for developing, packaging, and deploying containers is Docker. It also includes Docker Engine, which is supported on several operating systems. With Docker, users can build container images and manage their distribution. Docker has several key concepts:
- Portability: Docker applications can be packaged in images. These can be built on a user's laptop and shipped unchanged to production.
- Version control: Each image is versioned with a tag that is assigned during the building process.
- Immutability: Docker images are immutable; once built, they cannot be changed. A restarted container is a fresh instance created from the same image, so changes made inside the previous container are not carried over.
- Distribution: Docker images can be maintained in repositories called registries. Images can be pushed to the registry when new images are available. They can be pulled to deploy new containers in production.
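As a minimal sketch of this distribution workflow, the following commands tag a locally built image and push it to a registry; the registry host and image name are hypothetical placeholders, not values from this chapter:

```bash
# Tag a locally built image for a (hypothetical) private registry
docker tag myapp:1.0 registry.example.com/team/myapp:1.0

# Push the image to the registry so other environments can use it
docker push registry.example.com/team/myapp:1.0

# On a production host, pull the image and start a new container from it
docker pull registry.example.com/team/myapp:1.0
docker run -d registry.example.com/team/myapp:1.0
```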
Using Docker, applications can be packed into containers using Dockerfiles, which describe how to build application images from source code. This process is consistent across different platforms and environments, thus greatly increasing portability. The main instructions contained in a Dockerfile are represented in the following diagram:
Figure 1.24 – Dockerfile example
The FROM instruction tells Docker Engine which base image this containerized application will start from. It is the first statement of every Dockerfile, and it allows users to build new images on top of existing ones. The COPY instruction copies the code and its library files into the container image. The RUN instruction executes commands while the image is being built. The WORKDIR instruction sets the working directory inside the container. The EXPOSE instruction indicates which port the container will use to provide its services. Finally, ENTRYPOINT starts the application when the container is launched.
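Figure 1.24 shows the book's own example; as an additional illustration, the following is a minimal Dockerfile sketch for a hypothetical Python web application that uses the instructions described above (the base image, file names, and port are assumptions, not values from the figure):

```Dockerfile
# Start from a public base image
FROM python:3.9-slim

# Set the working directory inside the image
WORKDIR /app

# Copy the dependency list and the application code into the image
COPY requirements.txt .
COPY app.py .

# Run commands at build time to install the dependencies
RUN pip install -r requirements.txt

# Document the port the application listens on
EXPOSE 8080

# Start the application when a container is launched from this image
ENTRYPOINT ["python", "app.py"]
```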
Important Note
The EXPOSE instruction does not publish the port; it works as a form of documentation. To actually publish the port, the user who runs the container should use the -p flag on docker run to publish and map one or more ports.
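For example, assuming the hypothetical image sketched earlier listens on port 8080, the port can be published and mapped to port 80 on the host as follows:

```bash
# Publish container port 8080 and map it to port 80 on the host
docker run -d -p 80:8080 myapp:1.0
```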
Once the Dockerfile is ready, you can build the container image using the docker build command. The build context must also include the application code and the library requirement files referenced by the Dockerfile. Additionally, it is good practice to tag the images you build so that you can identify the application version.
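A minimal sketch of the build step, assuming the Dockerfile, the application code, and the requirements file sit in the current directory (the image name and tag are illustrative):

```bash
# Build the image from the Dockerfile in the current directory (the build context)
# and tag it with the application version
docker build -t myapp:1.0 .
```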
Container orchestration with Google Kubernetes Engine
So far, we have learned that containerization helps adopt DevOps culture and minimize the gap between application development and deployment. However, when large and complex applications are composed of dozens of microservices, it becomes extremely difficult to coordinate and orchestrate them. It is important to know where containers are running, whether they are healthy, and how to scale when the workload increases. All these functions cannot be done manually; they need a dedicated system that automatically orchestrates all the tasks. Here is where Kubernetes comes in.
Kubernetes (K8s for short) is an open source orchestration tool (formerly an internal Google tool) that can automatically deploy, scale, and fail over containerized applications. It supports declarative configuration: administrators describe the desired state of the infrastructure, and K8s does everything it can to reach that state. In other words, Kubernetes maintains the state of the infrastructure as written in configuration files (also known as manifest files).
The main Kubernetes features can be listed as follows:
- Supports both stateless and stateful applications: On K8s, you can run applications that do not persist user sessions, such as web servers, as well as applications that store data persistently.
- Auto-scaling: K8s can scale containerized applications in and out based on resource utilization. This happens automatically and is controlled by the cluster itself. Administrators can declare autoscaling thresholds in manifest files (a minimal example follows this list).
- Portable: Administrators are free to move their workloads between on-premises clusters and public cloud providers with minimal effort.
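One common way to declare such thresholds is a HorizontalPodAutoscaler object. The following is a minimal sketch; the object name, the target Deployment (assumed to exist), and the thresholds are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  # The Deployment whose replica count will be scaled in and out
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # Add or remove Pods to keep average CPU utilization around 70%
          averageUtilization: 70
```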
K8s is composed of a cluster of several nodes. The node responsible for controlling the entire cluster is called the master node; at least one is needed to run the cluster. This is where Kubernetes stores the information regarding its objects and their desired states. The most common Kubernetes objects are as follows:
- Pod: This object is a logical structure that the container will run in.
- Deployment: This object describes how one application should be deployed into the K8s cluster. Here, the administrator can decide what container image to use for its application, the desired number of Pods running, and how to auto-scale.
- Service: This object describes how the application that's been deployed can be reached from other applications.
In Kubernetes, worker nodes are responsible for running containers. Containers cannot run on the Kubernetes cluster in their native format. They need to be wrapped into a logical structure known as a Pod. Kubernetes manages Pods, not containers. These Pods provide storage and networking functions for containers running within the Pod. They have one IP address that is used by containers to expose their services. It is good practice to have one container running in a Pod. Additionally, Pods can specify a set of volumes, which can be used as a storage system for containers. Pods can be grouped into namespaces. This provides environment isolation and increases cluster utilization.
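As a minimal sketch, a Deployment manifest such as the following declares the desired number of Pods and the container image to run; the names, labels, image, port, and replica count are illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  # Desired number of identical Pods
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      # Pod labels used later by the Service selector
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          # Hypothetical container image
          image: registry.example.com/team/myapp:1.0
          ports:
            - containerPort: 8080
```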
The Kubernetes architecture is shown in the following diagram:
Figure 1.25 – Kubernetes architecture – clusters and namespaces
In GCP, administrators can run managed Kubernetes clusters with Google Kubernetes Engine (GKE). GKE allows users to deploy Kubernetes clusters in minutes without worrying about installation problems. It has the following features:
- Node autoscaling: GKE can auto-scale worker nodes to support variable workloads.
- Load balancing: GKE can benefit from Google Load Balancing solutions for its workloads.
- Node pools: GKE can have one or more worker node pools with different Compute Engine instances.
- Automatic repair and upgrades: GKE can monitor and maintain healthy Compute Engine worker nodes and apply automatic updates.
- Cluster logging and monitoring: Google Cloud Operations lets administrators have full control over the state of the Kubernetes cluster and its running workloads.
- Regional cluster: GKE can run K8s clusters across multiple zones of one region. This gives you highly available K8s clusters with redundant masters and multiple worker nodes spread across zones.
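As a sketch, a regional GKE cluster with node autoscaling could be created with a command along these lines; the cluster name, region, and node counts are placeholders:

```bash
# Create a regional cluster: the control plane and worker nodes are spread
# across the region's zones (--num-nodes is the node count per zone)
gcloud container clusters create my-cluster \
    --region us-central1 \
    --num-nodes 1 \
    --enable-autoscaling --min-nodes 1 --max-nodes 3
```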
When it comes to networking with Kubernetes and GKE, it is important to remember the following definitions:
- Node IP: This is the IP address that a worker node gets when it starts. In GKE, this IP address is assigned from the VPC subnet that the cluster runs in. This address allows communication between the master node and the worker nodes of the K8s cluster.
- Pod IP: This is the IP address that's assigned to the Pod. This address is ephemeral and lives for as long as the Pod runs. By default, GKE allocates a /14 secondary network block for the entire set of Pods running in the cluster. More specifically, GKE allocates a /24 secondary IP address range for each worker node the cluster has.
- Cluster IP: This is the IP address that's given to a service. This address is stable for as long as the service is present on the cluster. By default, GKE allocates a secondary block of IP addresses to run all the services in the cluster.
The following diagram provides a better understanding of GKE IP addressing:
Figure 1.26 – IP addressing in the GKE cluster
Since Pods maintain a separate IP address space from worker nodes, they can communicate with each other in the same cluster without using any kind of network address translation. This is because GKE automatically configures the VPC subnet with an alias IP, which is an authorized secondary subnet in the region where the cluster is deployed.
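For example, a VPC-native (alias IP) cluster with explicitly sized secondary ranges for Pods and Services could be created along these lines; the cluster name, region, and netmask sizes are only illustrative:

```bash
# Create a VPC-native cluster (alias IP) and size the secondary ranges
# used for Pod IPs and Service ClusterIPs
gcloud container clusters create my-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --cluster-ipv4-cidr /14 \
    --services-ipv4-cidr /20
```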
In Kubernetes, Pods are ephemeral, and they might have a short life. K8s may create new Pods when a Deployment changes, or restart Pods in case of crashes or errors. Moreover, when traffic must be balanced across multiple Pods, a stable entry point is needed to direct traffic to them. Here, the Kubernetes Service comes in handy because it allocates a static IP address that refers to a collection of Pods. The link between a Service and its Pods is based on Pod labels and the Service selector, which binds the static IP address to the group of Pods whose labels match.
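A minimal Service sketch that binds a stable ClusterIP to the Pods labeled app: myapp from the earlier (hypothetical) Deployment example might look as follows:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-svc
spec:
  type: ClusterIP
  # The selector matches the Pod labels, so traffic is spread across those Pods
  selector:
    app: myapp
  ports:
    - port: 80          # Port exposed on the Service's ClusterIP
      targetPort: 8080  # Port the containers listen on
```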
When a Service is created, its ClusterIP is allocated statically and can be reached from any other application running within the cluster. However, most of the time, traffic comes from outside the cluster and cannot reach Services exposed only through a ClusterIP. GKE provides four types of load balancers that address this problem, as follows:
- External TCP/UDP load balancer: This is a layer 4 load balancer that manages traffic coming from both outside the cluster and outside the VPC.
- External HTTP(S) load balancer: This is a layer 7 load balancer that uses a dedicated URL forwarding rule to route the traffic to the application. This is also called Ingress.
- Internal TCP/UDP load balancer: This is a layer 4 load balancer that manages traffic coming from outside the cluster but internally to the VPC.
- Internal HTTP(S) load balancer: This is a layer 7 load balancer that uses a dedicated URL forwarding rule to route the intra-VPC traffic to the application. This is also called Ingress and it is applied to internal traffic.
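As an illustration of the Ingress option, the following sketch exposes a Service named myapp-svc (the hypothetical one from the earlier example) through an external HTTP(S) load balancer; the object name and path rule are placeholders, and in GKE the backing Service generally needs to be of type NodePort or use container-native load balancing (NEGs):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-svc
                port:
                  number: 80
```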
In this section, you learned about the basics of Kubernetes and its implementation in GCP, Google Kubernetes Engine. Since GKE is based on clusters of Compute Engine VMs, networking is a crucial part of making your Pods and Services run as you need them to.