The CAP theorem and physical limitations of replication
You might wonder why theory is being covered in such a prominent place in a book that is supposed to be highly practical. Well, there is a very simple reason for that: some nice-looking marketing papers of some commercial database vendors might leave you with the impression that everything is possible and easy to do, without any serious limitation. This is not the case; there are physical limitations that every software vendor has to cope with. There is simply no way around the laws of nature, and shiny marketing cannot help overcome nature. The laws of physics and logic are the same for everybody, regardless of the power of someone's marketing department.
In this section, you will be taught the so-called CAP theorem. Understanding the basic ideas of this theorem is essential to avoid some requirements that cannot be turned into reality.
The CAP theorem was first described by Eric Brewer back in the year 2000. It has quickly developed into one of the most fundamental concepts in the database world. Especially with the rise of NoSQL database systems, Brewer's theorem (as the CAP theorem is often called) has become an important cornerstone of every distributed system.
Understanding the CAP theorem
Before we dig into the details, we need to discuss what CAP actually means. CAP is an abbreviation for the following three concepts:
- Consistency: This term indicates whether all the nodes in a cluster see the same data at the same time or not. A read-only node has to see all previously completed reads at any time.
- Availability: Reads and writes have to succeed all the time. In other words a node has to be available for users at any point of time.
- Partition tolerance: This means that the system will continue to work even if arbitrary messages are lost on the way. A network partition event occurs when a system is no longer accessible (think of a network connection failure). A different way of considering partition tolerance is to think of it as message passing. If an individual system can no longer send or receive messages from other systems, it means that it has been effectively partitioned out of the network. The guaranteed properties are maintained even when network failures prevent some machines from communicating with others.
Why are these three concepts relevant to normal users? Well, the bad news is that a replicated (or distributed) system can provide only two out of these three features at the same time.
Note
Keep in mind that only two out of the three promises can be fulfilled.
It is theoretically impossible to offer consistency, availability, and partition tolerance at the same time. As you will see later in this book, this can have a significant impact on system layouts that are safe and feasible to use. There is simply no such thing as the solution to all replication-related problems. When you are planning a large-scale system, you might have to come up with different concepts, depending on needs that are specific to your requirements.
Note
PostgreSQL, Oracle, DB2, and so on will provide you with CAp ("consistent" and "available"), while NoSQL systems, such as MongoDB and Cassandra, will provide you with cAP ("available" and "partition tolerant"). This is why NoSQL is often referred to as eventually consistent.
Consider a financial application. You really want to be consistent and partition tolerant. Keeping balances in sync is the highest priority.
Or consider an application collecting a log of weather data from some remote locations. If the data is a couple of minutes late, it is really no problem. In this case, you might want to go for cAP. Availability and partition tolerance might really be the most important things in this case.
Depending on the use, people have to decide what is really important and which attributes (consistency, availability, or partition tolerance) are crucial and which can be neglected.
Keep in mind there is no system which can fulfill all those wishes at the same time (neither open source nor paid software).
Understanding the limits of physics
The speed of light is not just a theoretical issue; it really does have an impact on your daily life. And more importantly, it has a serious implication when it comes to finding the right solution for your cluster.
We all know that there is some sort of cosmic speed limit called the speed of light. So why care? Well, let's do a simple mental experiment. Let's assume for a second that our database server is running at 3 GHz clock speed.
How far can light travel within one clock cycle of your CPU? If you do the math, you will figure out that light travels around 10 cm per clock cycle (in pure vacuum). We can safely assume that an electric signal inside a CPU will be very slow compared to pure light in vacuum. The core idea is, "10 cm in one clock cycle? Well, this is not much at all."
For the sake of our mental experiment, let's now consider various distances:
- Distance from one end of the CPU to the other
- Distance from your server to some other server next door
- Distance from your server in Central Europe to a server somewhere in China
Considering the size of a CPU core on a die, you can assume that you can send a signal (even if it is not traveling anywhere close to the speed of light) from one part of the CPU to some other part quite fast. It simply won't take 1 million clock cycles to add up two numbers that are already in your first level cache on your CPU.
But what happens if you have to send a signal from one server to some other server and back? You can safely assume that sending a signal from server A to server B next door takes a lot longer because the cable is simply a lot longer. In addition to that, network switches and other network components will add some latency as well.
Let's talk about the length of the cable here, and not about its bandwidth.
Sending a message (or a transaction) from Europe to China is, of course, many times more time-consuming than sending some data to a server next door. Again, the important thing here is that the amount of data is not as relevant as the so-called latency, consider the following criteria:
- Long-distance transmission: To explain the concept of latency, let's cover a very simple example. Let's assume you are a European and you are sending a letter to China. You will easily accept the fact that the size of your letter is not the limiting factor here. It makes absolutely no difference whether your letter is two or 20 pages long; the time it takes to reach the destination is basically the same. Also, it makes no difference whether you send one, two, or 10 letters at the same time. Given a reasonable numbers of letter, the size of the aircraft required (that is, the bandwidth) to ship the stuff to China is usually not the problem. However, the so-called round trip might very well be an issue. If you rely on the response to your letter from China to continue your work, you will soon find yourself waiting for a long time.
- Why latency matters: Latency is an important issue. If you send a chunk of data from Europe to China, you should avoid waiting for the response. But if you send a chunk of data from your server to a server in the same rack, you might be able to wait for the response, because your electronic signal will simply be fast enough to make it back in time.
Note
The basic problems of latency described in this section are not PostgreSQL-specific. The very same concepts and physical limitations apply to all types of databases and systems. As mentioned before, this fact is sometimes silently hidden and neglected in shiny commercial marketing papers. Nevertheless, the laws of physics will stand firm. This applies to both commercial and open source software.
The most important point you have to keep in mind here is that bandwidth is not always the magical fix to a performance problem in a replicated environment. In many setups, latency is at least as important as bandwidth.