Understanding containers
From the preceding comparison of monolithic and microservice architectures, you should now understand that the microservice architecture is the one that best combines agility and DevOps. We will discuss this architecture throughout the book because it is the one that Kubernetes manages well.
Now, we will move on to discuss how Docker, which is a container engine for Linux, is a good option for managing microservices. If you already know a lot about Docker, you can skip this section. Otherwise, I suggest that you read through it carefully.
Understanding why containers are good for microservices
Recall the two important aspects of the microservice architecture:
- Each microservice can have its own technical environment and dependencies.
- At the same time, it must be decoupled from the operating system it’s running on.
Let’s put the latter point aside for the moment and discuss the first one: two microservices of the same app can be developed in two different languages or be written in the same language but as two different versions. Now, let’s say that you want to deploy these two microservices on the same Linux machine. That would be a nightmare.
The reason is that you’ll have to install every required runtime version along with all the dependencies, and the two microservices might share overlapping dependencies or need conflicting versions of the same one. Additionally, all of this will live on the same host operating system. Now, imagine you want to remove one of these two microservices from the machine, deploy it on another server, and clean the former machine of all the dependencies used by that microservice. Of course, if you are a talented Linux engineer, you’ll succeed in doing this. However, for most people, the risk of conflicts between the dependencies is huge, and in the end, you might just make your app unavailable while running such a nightmarish infrastructure.
There is a solution to this: you could build a machine image for each microservice and then put each microservice on a dedicated virtual machine. In other words, you refrain from deploying multiple microservices on the same machine. However, with this approach, you will need as many machines as you have microservices. Of course, with the help of AWS or GCP, it’s going to be easy to bootstrap tons of servers, each of them tasked with running one and only one microservice, but it would be a huge waste of money not to share the computing power of each host among several microservices.
Comparably strong isolation is available in the container world, but not with the default container runtimes, because they don’t guarantee complete isolation between microservices. This is exactly where the Kata Containers runtime and the Confidential Containers project come into play. These technologies provide enhanced security and isolation for containerized applications. We’ll delve deeper into these container isolation concepts later in this book.
We will learn about how containers help with isolation in the next section.
Understanding the benefits of container isolation
Container engines such as Docker and Podman play a crucial role in managing microservices. Unlike virtual machines (VMs) that require a full guest operating system, containers are lightweight units that share the host machine’s Linux kernel. This makes them much faster to start and stop than VMs.
Container engines provide a user-friendly API to build, deploy, and manage containers. Container engines don’t introduce an additional layer of virtualization. Instead, they use the built-in capabilities of the Linux kernel for process isolation, security, and resource allocation. This efficient approach makes containerization a compelling solution for deploying microservices.
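To make the idea of kernel-level isolation a little more concrete, here is a minimal sketch in Python that lists the namespaces the current process belongs to. It assumes a Linux host with /proc mounted; namespaces are one of the kernel features that container engines rely on, and the function name list_namespaces is our own, not part of any container tool.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} for a process, or {} if unavailable.

    Assumption: a Linux host where /proc/<pid>/ns exists. On other systems,
    or for a process that does not exist, an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return namespaces  # not Linux, /proc not mounted, or no such process
    for entry in sorted(entries):
        try:
            # Each entry is a symlink whose target identifies the namespace.
            namespaces[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue  # skip entries we are not allowed to read
    return namespaces


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:10} {ident}")
```

If you run this script on the host and then again inside a container, you should generally see different identifiers for most namespace types, which is how two processes on the same kernel can each have their own view of the filesystem, network, and process tree.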
The following diagram shows how containers are different from virtual machines:
Figure 1.6: The difference between virtual machines and containers
Your microservices are going to be launched on top of this layer, not directly on the host system whose sole role will be to run your containers.
Since containers are isolated, you can run as many containers as you want and have them run applications written in different languages without any conflicts. Microservice relocation becomes as easy as stopping a running container and launching another one from the same image on another machine.
The usage of containers with microservices provides three main benefits:
- It reduces the footprint on the host system.
- It shares the host system among different microservices without conflicts.
- It removes the coupling between the microservice and the host system.
Once a microservice has been containerized, you can eliminate its coupling with the host operating system. The microservice will only depend on the container in which it will operate. Since a container is much lighter than a real full-featured Linux operating system, it will be easy to share and deploy on many different machines. Therefore, the container and your microservice will work on any machine that is running a container engine.
The following diagram shows a microservice architecture where each microservice is wrapped by a container:
Figure 1.7: A microservice application where all microservices are wrapped by a container; the life cycle of the app becomes tied to the container, and it is easy to deploy it on any machine that is running a container engine
Containers fit well with the DevOps methodology too. By developing locally in a container, which would later be built and deployed in production, you ensure you develop in the same environment as the one that will eventually run the application.
Container engines are capable of managing not only the life cycle of a container but also an entire ecosystem around containers. They can manage networks and the communication between different containers, and all these features respond particularly well to the properties of the microservice architecture that we mentioned earlier.
By using the cloud and containers together, you can build a very strong infrastructure to host your microservices. The cloud will give you as many machines as you want. You simply need to install a container engine on each of them, and you’ll be able to deploy multiple containerized microservices on each of these machines.
Container engines such as Docker or Podman are very nice tools on their own. However, you’ll discover that it’s hard to run them in production alone, just as they are.
Container engines excel in development environments because of their:
- Simplicity: Container engines are easy to install and use, allowing developers to quickly build, test, and run containerized applications.
- Flexibility: Developers can use container engines to experiment with different container configurations and explore the world of containerization.
- Isolation: Container engines ensure isolation between applications, preventing conflicts and simplifying debugging.
However, production environments have strict requirements. Container engines alone cannot address all of these needs:
- Scaling: Container engines (such as Docker or Podman) don’t provide built-in auto-scaling features to dynamically adapt container deployments based on resource utilization.
- Disaster Recovery: Container engines don’t provide comprehensive disaster recovery capabilities to ensure service availability in case of outages.
- Security: While container engines provide basic isolation, managing security policies for large-scale containerized deployments across multiple machines can be challenging.
- Standardization: Container engines require custom scripting or integrations for interacting with external systems, such as CI/CD pipelines or monitoring tools.
While container engines excel in development environments, production deployments demand a more robust approach. Kubernetes, a powerful container orchestration platform, tackles this challenge by providing a comprehensive suite of functionalities. It manages the entire container life cycle, from scheduling containers to run on available resources to scaling deployments up or down based on demand and distributing traffic for optimal performance (load balancing). Unlike custom scripting with container engines, Kubernetes provides a well-defined API for interacting with containerized applications, simplifying integration with other tools used in production environments. Beyond basic isolation, Kubernetes provides advanced security features such as role-based access control and network policies. This allows the efficient management of containerized workloads from multiple teams or projects on the same infrastructure, optimizing resource utilization and simplifying complex deployments.
Before we dive into the Kubernetes topics, let’s discuss the basics of containers and container engines in the next section.
Container engines
A container engine acts as the interface for end users and REST clients. It manages user inputs, downloads container images from container registries, extracts downloaded images onto the disk, transforms user or REST client data for interaction with container runtimes, prepares container mount points, and facilitates communication with container runtimes. In essence, container engines serve as the user-facing layer, streamlining image and container management, while the underlying container runtimes handle the intricate low-level details of container and image management.
Docker stands out as one of the most widely adopted container engines, but it’s important to note that various alternatives exist in the containerization landscape. Some notable ones are LXD, Rkt, CRI-O, and Podman.
At its core, Docker relies on the containerd container runtime, which oversees critical aspects of container management, including the container life cycle, image transfer and storage, execution, and supervision, as well as storage and network attachments. containerd, in turn, relies on components such as runc and hcsshim. runc is a command-line tool that facilitates creating and running containers in Linux, while hcsshim plays a crucial role in the creation and management of Windows containers.
It’s worth noting that containerd is typically not meant for direct end-user interaction. Instead, container engines, such as Docker, interact with the container runtime to facilitate the creation and management of containers. The essential role of runc is evident: it serves not only containerd but also Podman and CRI-O, and, indirectly, Docker itself.
The basics of containers
As we learned in the previous section, Docker is a well-known and widely used container engine. Let’s learn the basic terminology related to containers in general.
Container image
A container image is a kind of template used by container engines to launch containers. A container image is a self-contained, executable package that encapsulates an application and its dependencies. It includes everything needed to run the software, such as code, runtime, libraries, and system tools. Container images are created from a Dockerfile or Containerfile, which specifies the build steps. Container images are stored in image repositories and shared through container registries such as Docker Hub, making them a fundamental component of containerization.
Container
A container can be considered a running instance of a container image. Containers are like modular shipping containers for applications. They bundle an application’s code, dependencies, and runtime environment into a single, lightweight package. Containers run consistently across different environments because they include everything needed. Each container runs independently, preventing conflicts with other applications on the same system. Containers share the host operating system’s kernel, making them faster to start and stop than virtual machines.
Container registry
A container registry is a centralized repository for storing and sharing container images. It acts as a distribution mechanism, allowing users to push and pull images to and from the registry. Popular public registries include Docker Hub, Red Hat Quay, Amazon’s Elastic Container Registry (ECR), Azure Container Registry, Google Container Registry, and GitHub Container Registry. Organizations often use private registries to securely store and share custom images. Registries play a crucial role in the Docker ecosystem, facilitating collaboration and efficient management of containerized applications.
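An image name tells the container engine which registry to contact, which repository to look in, and which tag to fetch. The following Python sketch shows roughly how a short name such as node:18-alpine expands to a full reference. It is a simplified illustration of Docker-style defaults (docker.io as the default registry, library/ as the default namespace, and latest as the default tag); it deliberately ignores digests (@sha256:...), and the names ImageRef and parse_image_ref are hypothetical helpers, not part of any container tool.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split an image reference into registry, repository, and tag.

    Simplified sketch: digests are not handled, and no validation is done.
    """
    # The first path component names a registry only if it looks like a host.
    first, sep, rest = ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref

    # A tag, if present, follows the ':' in the last path component.
    head, slash, last = remainder.rpartition("/")
    name, colon, tag = last.partition(":")
    repository = f"{head}/{name}" if slash else name

    # Official images on Docker Hub live under the "library" namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = f"library/{repository}"

    return ImageRef(registry, repository, tag if colon else "latest")


if __name__ == "__main__":
    for ref in ("node:18-alpine", "docker.io/library/mariadb", "wordpress:latest"):
        print(ref, "->", parse_image_ref(ref))
```

Other engines may apply different defaults. Podman, for instance, can be configured with its own list of registries to search for short names, so treat the defaults above as an assumption that matches Docker's behavior.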
Dockerfile or Containerfile
A Dockerfile or Containerfile is a text document that contains a set of instructions for building a container image. It defines the base image, sets up the environment, copies the application code, installs the dependencies, and configures the runtime settings. Dockerfiles or Containerfiles provide a reproducible and automated way to create consistent images, enabling developers to version and share their application configurations.
A sample Dockerfile can be seen in the following code snippet:
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000
And, here’s a line-by-line explanation of the provided Dockerfile:
- # syntax=docker/dockerfile:1: This line defines the Dockerfile syntax version used to build the image. In this case, it specifies version 1 of the standard Dockerfile syntax.
- FROM node:18-alpine: This line defines the base image for your container. It instructs the container engine to use the official Node.js 18 image with the Alpine Linux base. This provides a lightweight and efficient foundation for your application.
- WORKDIR /app: This line sets the working directory within the container. Here, it specifies /app as the working directory. Subsequent instructions in the Dockerfile are executed relative to this directory.
- COPY . .: This line copies all files and directories from the build context (typically the directory where you have your Dockerfile) into the working directory (/app) defined in the previous step. This essentially copies your entire application codebase into the image.
- RUN yarn install --production: This line instructs the container engine to execute a command while building the image. In this case, it runs yarn install --production. This command uses the yarn package manager to install all production dependencies listed in your package.json file. The --production flag ensures that only production dependencies are installed, excluding development dependencies.
- CMD ["node", "src/index.js"]: This line defines the default command to be executed when the container starts. Here, it specifies an array with two elements: "node" and "src/index.js". This tells the container engine to run the Node.js interpreter (node) and execute the application’s entry point script (src/index.js) when the container starts up.
- EXPOSE 3000: This line documents that the container listens on port 3000. It doesn’t publish the port to the host machine; to do that, you map the port when running the container with the -p flag (e.g., docker run -p 3000:3000 my-image). Exposing port 3000 suggests your application is listening on this port for incoming connections.
IMPORTANT NOTE
To build the container image, you can use a supported container engine (such as Docker or Podman) or a container build tool, such as Buildah or kaniko.
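As a rough illustration of the build-and-run workflow for the sample Dockerfile, the following Python sketch assembles the command lines you might pass to Docker or Podman. It only builds and prints the argument lists and does not execute anything. The function names are hypothetical helpers, my-image is the example tag used above, and the --rm flag (remove the container when it exits) is our addition for convenience.

```python
import shutil
from typing import List, Optional


def find_engine() -> Optional[str]:
    """Return the first container engine CLI found on PATH, if any."""
    for candidate in ("docker", "podman"):
        if shutil.which(candidate):
            return candidate
    return None


def build_command(engine: str, tag: str, context: str = ".") -> List[str]:
    """Command that builds an image from the Dockerfile in `context`."""
    return [engine, "build", "-t", tag, context]


def run_command(engine: str, image: str, host_port: int, container_port: int) -> List[str]:
    """Command that starts a container and publishes one port to the host."""
    return [engine, "run", "--rm", "-p", f"{host_port}:{container_port}", image]


if __name__ == "__main__":
    # Fall back to "docker" purely for display if no engine is installed.
    engine = find_engine() or "docker"
    print(" ".join(build_command(engine, "my-image")))
    print(" ".join(run_command(engine, "my-image", 3000, 3000)))
```

Keeping the commands as argument lists, rather than a single shell string, makes it straightforward to hand them to subprocess.run later without worrying about shell quoting.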
Docker Compose or Podman Compose
Docker Compose is a tool for defining and running multi-container applications. It uses a YAML file to configure the services, networks, and volumes required for an application, allowing developers to define the entire application stack in a single file. Docker Compose or Podman Compose simplifies the orchestration of complex applications, making it easy to manage multiple containers as a single application stack.
The following compose.yaml file will spin up two containers for a WordPress application stack using a single docker compose or podman compose command:
# compose.yaml
services:
  db:
    image: docker.io/library/mariadb
    command: '--default-authentication-plugin=mysql_native_password'
    volumes:
      - db_data:/var/lib/mysql
    restart: always
    environment:
      - MYSQL_ROOT_PASSWORD=somewordpress
      - MYSQL_DATABASE=wordpress
      - MYSQL_USER=wordpress
      - MYSQL_PASSWORD=wordpress
    expose:
      - 3306
      - 33060
    networks:
      - wordpress
  wordpress:
    image: wordpress:latest
    ports:
      - 8081:80
    restart: always
    environment:
      - WORDPRESS_DB_HOST=db
      - WORDPRESS_DB_USER=wordpress
      - WORDPRESS_DB_PASSWORD=wordpress
      - WORDPRESS_DB_NAME=wordpress
    networks:
      - wordpress
volumes:
  db_data:
networks:
  wordpress: {}
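To see how the two services in this file relate to each other, here is a small Python sketch that models only the ports, expose, and networks keys of the file by hand. It reports which ports are published to the host and which services share a network; services on the same Compose network can reach each other by service name, which is why WORDPRESS_DB_HOST is simply db. The STACK dictionary and both function names are our own illustration, and only the short HOST:CONTAINER port form is handled.

```python
# Hand-copied subset of the compose.yaml file (ports, expose, networks only).
STACK = {
    "db": {"expose": [3306, 33060], "ports": [], "networks": ["wordpress"]},
    "wordpress": {"expose": [], "ports": ["8081:80"], "networks": ["wordpress"]},
}


def published_ports(stack):
    """Map each service to its (host_port, container_port) pairs."""
    result = {}
    for name, service in stack.items():
        pairs = []
        for mapping in service["ports"]:
            # Assumption: only the short "HOST:CONTAINER" form is used.
            host, container = mapping.split(":")
            pairs.append((int(host), int(container)))
        result[name] = pairs
    return result


def peers(stack, service):
    """Other services that share at least one network with `service`."""
    mine = set(stack[service]["networks"])
    return sorted(
        name
        for name, other in stack.items()
        if name != service and mine & set(other["networks"])
    )


if __name__ == "__main__":
    print(published_ports(STACK))
    print(peers(STACK, "wordpress"))
```

Under these assumptions, only the wordpress service is reachable from the host (on port 8081), while db is reachable only from containers attached to the wordpress network.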
In the next section, we will learn how Kubernetes can efficiently orchestrate all these container operations.