Comparing virtualization and containers
The following schema represents a couple of virtual guest nodes running on top of a physical host:
Figure 1.5 – Applications running on top of virtual guest nodes, running on top of a physical server
A physical server running its own operating system executes a hypervisor software layer to provide virtualization capabilities. A specific amount of hardware resources is virtualized and provisioned to these new virtual guest nodes. We should install new operating systems for these new hosts and after that, we will be able to run applications. Physical host resources are partitioned for guest hosts and both nodes are completely isolated from each other. Each virtual machine executes its own kernel and its operating system runs on top of the host. There is complete isolation between guests’ operating systems because the underlying host’s hypervisor software keeps them separated.
In this model, we require a lot of resources, even if we just need to run a couple of processes per virtual host. Starting and stopping virtual hosts will take time. Lots of non-required software and processes will probably run on our guest host and it will require some tuning to remove them.
As we have learned, the microservices model is based on the idea of applications running decoupled in different processes with complete functionality. Thus, running a complete operating system within just a couple of processes doesn’t seem like a good idea.
Although automation will help us, we need to maintain and configure those guest operating systems in terms of running the required processes and managing users, access rights, and network communications, among other things. System administrators maintain these hosts as if they were physical. Developers require their own copies to develop, test, and certify application components. Scaling up these virtual servers can be a problem because in most cases, increasing resources require a complete reboot to apply the changes.
Modern virtualization software provides API-based management, which enhances their usage and virtual node maintenance, but it is not enough for microservice environments. Elastic environments, where components should be able to scale up or down on demand, will not fit well in virtual machines.
Now, let’s review the following schema, which represents a set of containers running on physical and virtual hosts:
Figure 1.6 – A set of containers running on top of physical and virtual hosts
All containers in this schema share the same host kernel as they are just processes running on top of an operating system. In this case, we don’t care whether they run on a virtual or a physical host; we expect the same behavior. Instead of hypervisor software, we have a container runtime for running containers. Only a template filesystem and a set of defined resources are required for each container. To clarify, a complete operating system filesystem is not required – we just need the specific files required by our process to work. For example, if a process runs on a Linux kernel and is going to use some network capabilities, then the /etc/hosts
and /etc/nsswitch.conf
files would probably be required (along with some network libraries and their dependencies). The attack surface will be completely different than having a whole operating system full of binaries, libraries, and running services, regardless of whether the application uses them or not.
Containers are designed to run just one main process (and its threads or sub-processes) and this makes them lightweight. They can start and stop as fast as their main process does.
All the resources consumed by a container are related to the given process, which is great in terms of the allocation of hardware resources. We can calculate our application’s resource consumption by observing the load of all its microservices.
We define images as templates for running containers. These images contain all the files required by the container to work plus some meta-information providing its features, capabilities, and which commands or binaries will be used to start the process. Using images, we can ensure that all the containers created with one template will run the same. This eliminates infrastructure friction and helps developers prepare their applications to run in production. The configuration (and of course security information such as credentials) is the only thing that differs between the development, testing, certification, and production environments.
Software containers also improve application security because they run by default with limited privileges and allow only a set of system calls. They run anywhere; all we need is a container runtime to be able to create, share, and run containers.
Now that we know what containers are and the most important concepts involved, let’s try to understand how they fit into development processes.