What exactly is a Linux distribution?
Linux is the standard operating system for cloud workloads. However, there is not a single Linux operating system that goes by that name. Its ecosystem is quite complex. This comes from how it came to be originally.
Long before Linux was conceived by its creator, Linus Torvalds, there was Unix. Unix source code was – for legal reasons – licensed to anyone who bought it, thus making it very popular among many institutions. This included universities. The code, however, was not entirely free. This didn’t sit well with many people, who believed that software should be free – as in speech or beer – including source code. In the 1980s, a completely free and open implementation of Unix was born under the aegis of the GNU Project. Its goal was to develop an operating system that gave complete control over a computer to the user. The project was successful in that it was able to produce all the software required to run an operating system, except one thing – the kernel.
A kernel of the operating system is, in short, the core that operates hardware and manages the hardware and programs for the user.
In 1991, Finnish student Linus Torvalds famously announced his hobby kernel – Linux. He called it at that time “just a hobby – won’t be big and professional like GNU.” It wasn’t supposed to get big and popular. The rest is history. The Linux kernel became popular among developers and administrators and became the missing piece of the GNU Project. Linux is the kernel, the GNU tools are the so-called userland, and together they make the GNU/Linux operating system.
The preceding short story is important to us for two reasons:
- While the GNU userland and the Linux kernel is the most popular combination, you’ll see it is not the only one.
- Linux delivers a kernel and the GNU Project delivers userland tools, but they have to be somehow prepared for installation. Many people and teams had separate ideas on how to do it best. I will expand on this thought next.
The way that a team or a company delivers a GNU/Linux operating system to end users is called a distribution. It facilitates operating system installation, the means to manage software later on, and general notions on how an operating system and the running processes have to be managed.
What makes distributions different?
The open nature of Linux and the GNU Project made it possible for almost anyone to create their own distribution. One of the things that made new users dizzy was the sheer amount of Operating System (OS) versions they could use. The surefire way to start a holy war between Linux users is by asking which distribution is the best.
One of the ways we can group Linux distributions is the format in which they deliver the software (packages) and additional software used to install and remove that software (package managers). There are a number of them, but the two most prevalent are RPM (RPM Package Manager) and DEB packages. Packages are more than just an archive with binaries. They contain scripts that set the software up for use – creating directories, users, permissions, log rules, and a number of other things that we will explain in later chapters.
The RPM family of distributions starts with Red Hat Enterprise Linux (RHEL), created and maintained by the Red Hat company. Closely related is Fedora (a free community distribution sponsored by Red Hat). It also includes CentOS Linux (a free version of RHEL) and Rocky Linux (another free version of RHEL).
The DEB distributions include Debian (where the DEB packages originate from) – a technocracy community project. From the Debian distribution arose a number of distributions based on it, using most of its core components. Most notable is Ubuntu, a server and desktop distribution sponsored by Canonical.
There are also distributions that use packages with minimum automation, most notably Slackware, one of the oldest existing Linux distributions.
There are distributions that give the user a set of scripts that compile actual software on the hardware it will be used on – most notably, Gentoo.
Finally, there is a distribution that is actually a book with a set of instructions that a user can follow to build the whole OS by hand – the Linux From Scratch project.
Another way of grouping distributions is by their acceptance of closed software – software that limits the distribution of source code, binaries, or both. This can mean hardware drivers, such as the ones for NVIDIA graphic cards, and user software, such as movie codecs that allow you to play streamed media and DVD and Blu-Ray discs. Some distributions make it easy to install and use them, while some make it more difficult, arguing that we should strive for all software to be open source and free (as in speech and as in beer).
Yet another way to differentiate them is the security framework a given distribution uses. Two of the most notable ones are AppArmor, used mainly by Ubuntu, and SELinux (from the USA’s National Security Agency), used by, among others, Red Hat Enterprise Linux (and its derivatives) and Fedora Linux.
It’s also worth noting that while most Linux distributions use the GNU Project as the userland, the popular one in the cloud Alpine Linux distribution uses its own set of software, written especially with minimum size in mind.
Looking at how the distribution is developed, it can be community-driven (without any commercial entity being an owner of any part of the process and software – Debian being one prime example), commercial (wholly owned by a company – RHEL being one example and SuSE another), and all the mixes in between (Ubuntu and Fedora being examples of a commercially owned distribution with a large body of independent contributors).
Finally, a way we can group distributions is by how well they facilitate the cloud workload. Here, we can look at different aspects:
- The server side: How well a given distribution works as an underlying OS for our infrastructure, virtual machines, and containers
- The service side: How well a given distribution is suited to run our software as a container or a virtual machine
To make things even more confusing and amusing for new adopters, each distribution can have many variants (called flavors, variants, or spins, depending on the distribution lingo) that offer different sets of software or default configurations.
And to finally confuse you, dear reader, for use on a desktop or laptop, Linux offers the best it can give you – a choice. The number of graphical interfaces for the Linux OS can spin the head of even the most experienced user – KDE Plasma, GNOME, Cinnamon Desktop, MATE, Unity Desktop (not related to the Unity 3D game engine), and Xfce. The list is non-exhaustive, subjective, and very limited. They all differ in the ease of use, configurability, the amount of memory and CPU they use, and many other aspects.
The number of distributions is staggering – the go-to site that tracks Linux distributions (https://distrowatch.com/dwres.php?resource=popularity) lists 265 various Linux distributions on its distribution popularity page at the time of writing. The sheer number of them makes it necessary to limit the book to three of our choosing. For the most part, it doesn’t make a difference which one you choose for yourself, except maybe in licensing and subscription if you choose a commercial one. Each time the choice of distribution makes a difference, especially a technical one, we will point it out.
Choosing a distribution is more than just a pragmatic choice. The Linux community is deeply driven by ideals. For some people, they are the most important ideals on which they build their lives. Harsh words have been traded countless times over which text editor is better, based on their user interface, the license they are published with, or the quality of the source code. The same level of emotion is displayed toward the choice of software to run the WWW server or how to accept new contributions. This will inevitably lead to the way the Linux distribution is installed, what tools there are for configuration and maintenance, and how big the selection of software installed on it out of the box is.
Having said that, we have to mention that even though they have strong beliefs, the open-source community, the Linux community included, is a friendly bunch. In most cases, you’ll be able to find help or advice on online forums, and the chances are quite high that you will be able to meet them in person.
To choose your distribution, you need to pay attention to several factors:
- Is the software you wish to run supported on the distribution? Some commercial software limits the number of distributions it publishes packages for. It may be possible to run them on unsupported versions of Linux, but this may be tricky and prone to disruptions.
- Which versions of the software you intend to run are available? Sometimes, the distribution of choice doesn’t update your desired packages often enough. In the world of the cloud, software a few months old may already be outdated and lack important features or security updates.
- What is the licensing for the distribution? Is it free to use or does it require a subscription plan?
- What are your support options? For community-driven free distributions, your options are limited to friendly Linux gurus online and in the vicinity. For commercial offerings, you can pay for various support offerings. Depending on your needs and budget, you can find a mix of support options that will suit your requirements and financial reserves.
- What is your level of comfort with editing configuration files and running long and complex commands? Some distributions offer tools (both command-line and graphical) that make the configuration tasks easier and less error-prone. However, those tools are mostly distribution-specific, and you won’t find them anywhere else.
- How well are cloud-related tools supported on a given distribution? This can be the ease of installation, the recency of the software itself, or the number of steps to configure for use.
- How well is this distribution supported by the cloud of your choosing? This will mean how many cloud operators offer virtual machines with this distribution. How easy is it to obtain a container image with this distribution to run your software in it? How easy do we suspect it to be to build for this distribution and deploy on it?
- How well is it documented on the internet? This will not only include the documentation written by distribution maintainers but also various blog posts and documentation (mainly tutorials and so-called how-to documents) written by its users.
So far, you’ve learned what a Linux distribution is, how distributions differentiate from one another, and what criteria you can use to actually choose one as the core of the system you will manage.
In the next section, we will look deeper into each distribution to get to know the most popular ones better, giving you a first glimpse of how each one works and what to expect.