Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

Benefits and Components of Docker

Save for later
  • 19 min read
  • 04 Nov 2016

article-image

In this article by Jaroslaw Krochmalski, author of Developing with Docker, at the beginning, Docker was created as an internal tool by a Platform as a Service company, called dotCloud. Later on, in March 2013, it was released as open source. Apart from the Docker Inc. team, which is the main sponsor, there are some other big names contributing to the tool –Red Hat, IBM, Microsoft, Google, and Cisco Systems, just to name a few. Software development today needs to be agile and react quickly to the changes. We use methodologies such as Scrum, estimate our work in story points and attend the daily stand-ups. But what about preparing our software for shipment and the deployment? Let's see how Docker fits into that scenario and can help us being agile.

(For more resources related to this topic, see here.)

In this article, we will cover the following topics:

  • The basic idea behind Docker
  • A difference between virtualization and containerization
  • Benefits of using Docker
  • Components available to install

We will begin with a basic idea behind this wonderful tool.

The basic idea

The basic idea behind Docker is to pack an application with all of its dependencies (let it be binaries, libraries, configuration files, scripts, jars, and so on) into a single, standardized unit for software development and deployment. Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, and system libraries—anything you can install on a server. This guarantees that it will always run in the same way, no matter what environment it will be deployed in. With Docker, you can build some Node.js or Java project (but you are of course not limited to those two) without having to install Node.js or Java on your host machine. Once you're done with it, you can just destroy the Docker image and it's as though nothing ever happened. It's not a programming language or a framework, rather think of it as about a tool that helps solving the common problems such as installing, distributing, and managing the software. It allows programmers and DevOps to build, ship, and run their code anywhere.

You can think that maybe Docker is a virtualization engine, but it's far from it as we will explain in a while.

Containerization versus virtualization

To fully understand what Docker really is, first we need to understand the difference between traditional virtualization and containerization. Let's compare those two technologies now.

Traditional virtualization

A traditional virtual machine, which represents the hardware-level virtualization, is basically a complete operating system running on top of the host operating system. There are two types of virtualization hypervisors: Type 1 and Type 2. Type 1 hypervisors provide server virtualization on bare metal hardware—there is no traditional, end user's operating system. Type 2 hypervisors, on the other hand, are commonly used as a desktop virtualization—you run the virtualization engine on top of your own operating system. There can be a lot of use cases that would make an advantage from using virtualization—the biggest asset is that you can run many virtual machines with totally different operating systems on a single host. Virtual machines are fully isolated, hence very secure. But nothing comes without a price. There are many drawbacks—they contain all the features that an operating system needs to have: device drivers, core system libraries, and so on. They are heavyweight, usually resource-hungry and not so easy to set up—virtual machines require full installation. They require more computing resources to execute. To successfully run an application on a virtual machine, the hypervisor needs to first import the virtual machine and then power it up, and this takes time. Furthermore, their performance gets substantially degraded. As a result, only a few virtual machines can be provisioned and made available to work on a single machine.

Containerization

The Docker software runs in an isolated environment called a Docker container. A Docker container is not a virtual machine in the popular sense. It represents operating system virtualization. While each virtual machine image runs on an independent guest OS, the Docker images run within the same operating system kernel. A container has its own filesystem and environment variables. It's self-sufficient. Because of the containers run within the same kernel, they utilize fewer system resources. The base container can be, and usually is, very lightweight. Worth knowing is that Docker containers are isolated not only from the underlying OS, but from each other as well. There is no overhead related to a classic virtualization hypervisor and a guest operating system. This allows achieving almost bare metal, near native performance. The boot time of a dockerized application is usually very fast due to the low overhead of containers. It is also possible to speed up the roll-out of hundreds of application containers in seconds and to reduce the time taken for provisioning your software.

As you can see, Docker is quite different from the traditional virtualization engines. Be aware that containers cannot substitute virtual machines for all use cases. A thoughtful evaluation is still required to determine what is best for your application. Both solutions have their advantages. On one hand we have the fully isolated, secure virtual machine with average performance and on the other hand, we have the containers that are missing some of the key features (such as total isolation), but are equipped with high performance that can be provisioned swiftly. Let's see what other benefits you will get when using Docker containerization.

Benefits of using Docker

When comparing the Docker containers with traditional virtual machines, we have mentioned some of its advantages. Let's summarize them now in more detail and add some more.

Speed and size

As we have said before, the first visible benefit of using Docker will be very satisfactory performance and short provisioning time. You can create or destroy containers quickly and easily. Docker shares only the Kernel, nothing less, nothing more. However, it reuses the image layers on which the specific image is built upon. Because of that, multiple versions of an application running in containers will be very lightweight. The result is faster deployment, easier migration, and nimble boot times.

Reproducible and portable builds

Using Docker enables you to deploy ready-to-run software, which is portable and extremely easy to distribute. Your containerized application simply runs within its container, there's no need for traditional installation. The key advantage of a Docker image is, it is bundled with all the dependency the containerized application needs to run. The lack of installation of dependencies process has a huge advantage. This eliminates problems such as software and library conflicts or even driver compatibility issues. Because of Docker's reproducible build environment, it's particularly well suited for testing, especially in your continuous integration flow. You can quickly boot up identical environments to run the tests. And because the container images are all identical each time, you can distribute the workload and run tests in parallel without a problem. Developers can run the same image on their machine that will be run in production later, which again has a huge advantage in testing. The use of Docker containers speeds up continuous integration. There are no more endless build-test-deploy cycles. Docker containers ensure that applications run identically in development, test, and production environments.

One of Docker's greatest features is the portability. Docker containers are portable – they can be run from anywhere: your local machine, a nearby or distant server, and private or public cloud. When speaking about the cloud, all major cloud computing providers, such as Amazon Web Services, Google's Compute Platform have perceived Docker's availability and now support it. Docker containers can be run inside an Amazon EC2 instance, Google Compute Engine instance provided that the host operating system supports Docker. A container running on an Amazon EC2 instance can easily be transferred to some other environment, achieving the same consistency and functionality. Docker works very well with various other IaaS (Infrastructure-as-a-Service) providers such as Microsoft's Azure, IBM SoftLayer, or OpenStack. This additional level of abstraction from your infrastructure layer is an indispensable feature. You can just develop your software without worrying about the platform it will be run later on. It's truly a write once run everywhere solution.

Immutable and agile infrastructure

Maintaining a truly idempotent configuration management code base can be tricky and a time-consuming process. The code grows over time and becomes more and more troublesome. That's why the idea of an immutable infrastructure becomes more and more popular nowadays. Containerization comes to the rescue. By using containers during the process of development and deployment of your applications, you can simplify the process. Having a lightweight Docker server that needs almost no configuration management, you manage your applications simply by deploying and redeploying containers to the server. And again, because the containers are very lightweight, it takes only seconds of your time.

As a starting point, you can download a prebuilt Docker image from the Docker Hub, which is like a repository of ready-to-use images. There are many choices of web servers, runtime platforms, databases, messaging servers and so on. It's like a real gold mine of software you can use for free to get a base foundation for your own project.

The effect of the immutability of Docker's images is the result of a way they are being created. Docker makes use of a special file called a Dockerfile. This file contains all the setup instructions on how to create an image, like must-have components, libraries, exposed shared directories, network configuration, and so on. An image can be very basic, containing nothing but the operating system foundations, or—something that is more common—containing a whole prebuilt technology stack which is ready to launch. You can create images by hand, but it can be an automated process also. Docker creates images in a layered fashion: every feature you include will be added as another layer in the base image. This is another serious speed boost compared to the traditional virtualization techniques.

Tools and APIs

Of course, Docker is not just a Dockerfile processor and the runtime engine. It's a complete package with wide selection of tools and APIs that are helpful during the developer's and DevOp's daily work. First of all, there's The Docker Toolbox, which is an installer to quickly and easily install and setup a Docker environment on your own machine. The Kinematic is desktop developer environment for using Docker on Windows and Mac OS X. Docker distribution also contains a whole bunch of command-line tools. Let's look at them now.

Tools overview

On Windows, depending on the Windows version you use, there are two choices. It can be either Docker for Windows if you are on Windows 10 or later, or Docker Toolbox for all earlier versions of Windows. The same applies to MacOS. The newest offering is Docker for Mac, which runs as a native Mac application and uses xhyve to virtualize the Docker Engine environment and Linux kernel. For earlier version of Mac that doesn't meet the Docker for Mac requirements you should pick the Docker Toolbox for Mac. The idea behind Docker Toolbox and Docker native applications are the same—to virtualize Linux kernel and Docker engine on top of your operating system, we will be using Docker Toolbox, as it is more universal; it will run in all Windows and MacOS versions. The installation package for Windows and Mac OS is wrapped in an executable called the Docker Toolbox. The package contains all the tools you need to begin working with Docker. Of course there are tons of additional third-party utilities compatible with Docker, some of them very useful. But for now, let's focus on the default toolset. Before we start the installation, let's look at the tools that the installer package contains to better understand what changes will be made to your system.

Docker Engine and Docker Engine client

Docker is a client-server application. It consists of the daemon that does the important job: builds and downloads images, starts and stops containers and so on. It exposes a REST API that specifies interfaces for interacting with the daemon and is being used for remote management. Docker Engine accepts Docker commands from the command line, such as docker to run the image, docker ps to list running containers, docker images to list images, and so on.

The Docker client is a command-line program that is being used to manage Docker hosts running Linux containers. It communicates with the Docker server using the REST API wrapper. You will interact with Docker by using the client to send commands to the server.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime

Docker Engine works only on Linux. If you want run Docker on Windows or Mac OS, or want to provision multiple Docker hosts on a network or in the Cloud, you will then need the Docker Machine.

Docker Machine

Docker Machine is a fairly new command-line tool created by Docker team to manage Docker servers you can deploy containers to. It deprecated the old way of installing Docker with the Boot2Docker utility. The Docker Machine eliminates the need to create virtual machines manually and install Docker before starting Docker containers on them. It handles the provisioning and installation process for you behind the scenes. In other words, it's a quick way to get a new virtual machine provisioned and ready to run Docker containers. This is an indispensable tool when developing PaaS (Platform as a Service) architecture. Docker Machine not only creates a new VM with the Docker Engine installed in it, but sets up the certificate files for authentication and then configures the Docker client to talk to it. For flexibility purposes, the Docker Machine introduces the concept of drivers. Using drivers, Docker is able to communicate with various virtualization software and cloud providers. In fact, when you install Docker for Windows or Mac OS, the default VirtualBox driver will be used. The following command will be executed behind the scenes:

docker-machine create --driver=virtualbox default

Another available driver is amazonec2 for Amazon Web Services. It can be used to install Docker on the Amazon's cloud—we will do it later in this article. There are a lot of drivers ready to be used, and more are coming all the time. The list of existing official drivers with their documentation is always available at the Docker Drivers website: https://docs.docker.com/machine/drivers. The list contains the following drivers at the moment:

  • Amazon Web Services
  • Microsoft Azure
  • Digital Ocean
  • Exoscale
  • Google Compute Engine
  • Generic
  • Microsoft Hyper-V
  • OpenStack
  • Rackspace
  • IBM Softlayer
  • Oracle VirtualBox
  • VMware vCloud Air
  • VMware Fusion
  • VMware vSphere

Apart from these, there is also a lot of third-party driver plugins available freely on the Internet sites such as GitHub. You can find additional drivers for different cloud providers and virtualization platforms, such as OVH Cloud or Parallels for Mac OS, for example, you are not limited to Amazon's AWS or Oracle's VirtualBox. As you can see, the choice is very broad.

If you cannot find a specific driver for your Cloud provider, try looking for it on the GitHub.

When installing the Docker Toolbox on Windows or Mac OS, Docker Machine will be selected by default. It's mandatory and currently the only way to run Docker on these operating systems. Installing the Docker Machine is not obligatory for Linux—there is no need to virtualize the Linux kernel there. However, if you want to deal with the Cloud providers or just want to have common runtime environment portable between Mac OS, Windows, and Linux, you can install Docker Machine for Linux as well. We will describe the process later in this article. Machine will be also used behind the scenes when using the graphical tool Kitematic, which we will present in a while.

After the installation process, Docker Machine will be available as a command-line tool: docker-machine.

Kitematic

Kitematic is the software tool you can use to run containers through a plain, yet robust graphical user interface. In 2015, Docker acquired the Kitematic team, expecting to attract many more developers and hoping to open up the containerization solution to more developers and a wider, more general public.

Kitematic is now included by default when installing Docker Toolbox on Mac OS and MS Windows. You can use it to comfortably search and fetch images you need from the Docker Hub. The tool can also be used to run your own app containers. Using the GUI, you can edit environment variables, map ports, configure volumes, study logs, and have command-line access to the containers. It is worth mentioning that you can seamlessly switch between Kitematic GUI and command-line interface to run and manage application containers. Kitematic is very convenient, however, if you want to have more control when dealing with the containers or just want to use scripting - the command line will be a better solution. In fact, Kitematic allows you to switch back and forth between the Docker CLI and the graphical interface. Any changes you make on the command-line interface will be directly reflected in Kitematic. The tool is simple to use, as you will see at the end of this article, when we are going to test our setup on Mac or Windows PC, we will be using the command-line interface for working with Docker.

Docker compose

Compose is a tool, executed from the command line as docker-compose. It replaces the old fig utility. It's used to define and run multicontainer Docker applications. Although it's very easy to imagine a multi-container application (like a web server in one container and a database in the other), it's not mandatory. So if you decide that your application will fit in a single Docker container, there will be no use for docker-compose. In real life, it's very likely that your application will span into multiple containers. With docker-compose, you use a compose file to configure your application's services, so they can be run together in an isolated environment. Then, using a single command, you create and start all the services from your configuration. When it comes to multicontainer applications, docker-compose is great for development and testing, as well as continuous integration workflows.

Oracle VirtualBox

Oracle VM VirtualBox is a free and open source hypervisor for x86 computers from Oracle. It will be installed by default when installing the Docker Toolbox. It supports the creation and management of virtual machines running Windows, Linux, BSD, OS/2, Solaris, and so on. In our case, the Docker Machine using VirtualBox driver, will use VirtualBox to create and boot a bitsy Linux distribution capable of the running docker-engine. It's worth mentioning that you can also run the teensy–weensy virtualized Linux straight from the VirtualBox itself. Every Docker Machine you have created using the docker-machine or Kitematic, will be visible and available to boot in the VirtualBox, when you run it directly, as shown in the following screenshot:

benefits-and-components-docker-img-0

You can start, stop, reset, change settings, and read logs in the same way as for other virtualized operating systems.

You can use VirtualBox in Windows or Mac for other purposes than Docker.

Git

Git is a distributed version control system that is widely used for software development and other version control tasks. It has emphasis on speed, data integrity, and support for distributed, non-linear workflows. Docker Machine and Docker client follows the pull/push model of Git for fetching the needed dependencies from the network. For example, if you decide to run the Docker image which is not present on your local machine, Docker will fetch this image from the Docker Hub. Docker doesn't internally use Git for any kind of resource versioning. It does, however, rely on hashing to uniquely identify the filesystem layers which is very similar to what Git does. Docker also takes initial inspiration in the notion of commits, pushes, and pulls. Git is also included in the Docker Toolbox installation package.

From a developer's perspective, there are tools especially useful in a programmer's daily job, be it IntelliJ IDEA Docker Integration Plugin for Java fans or Visual Studio 2015 Tools for Docker for those who prefer C#. They let you download and build Docker images, create and start containers, and carry out other related tasks straight from your favorite IDE.

Apart from the tools included in the Docker's distribution package (it will be Docker Toolbox for older versions of Windows or Docker for Windows and Docker for Mac), there are hundreds of third-party tools, such as Kubernetes and Helios (for Docker orchestration), Prometheus (for monitoring of statistics) or Swarm and Shipyard for managing clusters. As Docker captures higher attention, more and more Docker-related tools pop-up almost every week.

But these are not the only tools available for you. Additionally, Docker provides a set of APIs that can be very handy. One of them is the Remote API for the management of the images and containers. Using this API, you will be able to distribute your images to the runtime Docker engine. The container can be shifted to a different machine that runs Docker, and executed there without compatibility concerns. This may be especially useful when creating PaaS (Platform-as-a-Service) architectures. There's also the Stats API that will expose live resource usage information (such as CPU, memory, network I/O and block I/O) for your containers. This API endpoint can be used to create tools that show how your containers behave, for example, on a production system.

Summary

By now we understand the difference between the virtualization and containerization and also, I hope, we can see the advantages of using the latter. We also know what components are available for us to install and use. Let's begin our journey to the world of containers and go straight to the action by installing the software.

Resources for Article:


Further resources on this subject: