We can’t start talking about Docker without actually covering the ideas that make it such a powerful tool. A container, at the most basic level, is an isolated user-space environment for a given discrete set of functionality. In other words, it is a way to modularize a system (or a part of one) into pieces that are much easier to manage and maintain while often also being very resilient to failures.
In practice, this net gain is never free and requires some investment in the adoption and implementation of new tooling (such as Docker), but the change pays heavy dividends to the adopters in a drastic reduction of development, maintenance, and scaling costs over its lifetime.
At this point, you might ask this: how exactly are containers able to provide such huge benefits? To understand this, we first need to take a look at deployments before such tooling was available.
In the earlier days of deployments, the process for deploying a service would go something like this:
- Developer would write some code.
- Operations would deploy that code.
- If there were any problems in deployment, the operations team would tell the developer to fix something and we would go back to step 1.
A simplification of this process would look something like this:
dev machine => code => ops => bare-metal hosts
The developer would have to wait for the whole process to bounce back for them to try to write a fix anytime there was a problem. What is even worse, operations groups would often have to use various arcane forms of magic to ensure that the code that developers gave them can actually run on deployment machines, as differences in library versions, OS patches, and language compilers/interpreters were all high risk for failures and likely to spend a huge amount of time in this long cycle of break-patch-deploy attempts.
The next step in the evolution of deployments came to improve this workflow with the virtualization of bare-metal hosts as manual maintenance of a heterogeneous mix of machines and environments is a complete nightmare even when they were in single-digit counts. Early tools such as chroot came out in the late 70s but were later replaced (though not fully) with hypervisors such as Xen, KVM, Hyper-V, and a few others, which not only reduced the management complexity of larger systems, but also provided Ops and developers both with a deployment environment that was more consistent as well as more computationally dense:
dev machine => code => ops => n hosts * VM deployments per host
This helped out in the reduction of failures at the end of the pipeline, but the path from the developer to the deployment was still a risk as the VM environments could very easily get out of sync with the developers.
From here, if we really try to figure out how to make this system better, we can already see how Docker and other container technologies are the organic next step. By making the developers' sandbox environment as close as we can get to the one in production, a developer with an adequately functional container system can literally bypass the ops step, be sure that the code will work on the deployment environment, and prevent any lengthy rewrite cycles due to the overhead of multiple group interactions:
dev machine => container => n hosts * VM deployments per host
With Ops being needed primarily in the early stages of system setup, developers can now be empowered to take their code directly from the idea all the way to the user with the confidence that a majority of issues that they will find will be ones that they will be able to fix.
If you consider this the new model of deploying services, it is very reasonable to understand why we have DevOps roles nowadays, why there is such a buzz around Platform as a Service (PaaS) setups, and how so many tech giants can apply a change to a service used by millions at a time within 15 minutes with something as simple as git push origin by a developer without any other interactions with the system.
But the benefits don't stop there either! If you have many little containers everywhere and if you have increased or decreased demand for a service, you can add or eliminate a portion of your host machines, and if the container orchestration is properly done, there will be zero downtime and zero user-noticeable changes on scaling changes. This comes in extremely handy to providers of services that need to handle variable loads at different times--think of Netflix and their peak viewership times as an example. In most cases, these can also be automated on almost all cloud platforms (that is, AWS Auto Scaling Groups, Google Cluster Autoscaler, and Azure Autoscale) so that if some triggers occur or there are changes in resource consumption, the service will automatically scale up and down the number of hosts to handle the load. By automating all these processes, your PaaS can pretty much be a fire-and-forget flexible layer, on top of which developers can worry about things that really matter and not waste time with things such as trying to figure out whether some system library is installed on deployment hosts.
Now don't get me wrong; making one of these amazing PaaS services is not an easy task by any stretch of imagination, and the road is covered in countless hidden traps but if you want to be able to sleep soundly throughout the night without phone calls from angry customers, bosses, or coworkers, you must strive to be as close as you can to these ideal setups regardless of whether you are a developer or not.