Preventing split brain

Split brain is the scenario that occurs when more than one primary node is active in a PostgreSQL cluster simultaneously. In these circumstances, if any data was written to both nodes from the application, it becomes extremely difficult to rectify. Certainly, no cluster with such data corruption can be considered highly available!

In this recipe, we will further explore the concept, and how we might mitigate this problem.

Getting ready

An important concept necessary for preventing split brain scenarios is fencing, or isolation of a node from the application and database stack. Often, this is accomplished through STONITH (which stands for Shoot The Other Node In The Head). After accounting for situations where this is not possible, the old primary must invoke SMITH instead, or Shoot Myself In The Head. While it may sound extreme, for servers, this is really a temporary solution to prevent a more worrying complication.

Keep these terms in mind while we explore how they may affect our architecture.

How to do it...

Apply these steps when designing a cluster to help minimize or eliminate the risk of split brain:

If available, allocate STONITH hardware for nodes that may take the role of the Primary.
Consider situations where SMITH must be used instead, during network interruptions.
Ensure PostgreSQL does not start automatically following a system reboot.

How it works...

Modern servers are often equipped with hardware that enables remote administration. These components often allow network access to the boot process itself. In the case where PostgreSQL is installed on virtual servers, the hypervisor serves this role. Many companies commonly install Power Distribution Units (PDU) that can be remotely instructed to cut power to a server.

Whatever the approach, working with infrastructure or systems operations teams is likely necessary to gain access to interact with these devices. As an example, imagine we have a 2-node cluster consisting of a Primary and Standby, and a PDU is available for each. We could do something like this:

In this case, in the case of a failover and subsequent promotion, the Standby could instruct the PDU to cut power to the Primary to ensure that it wasn't possible for applications to be connected. But what about a scenario where it's far more common for network interruptions, such as between two data centers, as seen here:

In this scenario, the Standby would be unable to shut down the Primary node in the case of a failover. This is why it's important to consider SMITH approaches as well. In this case, the Primary should monitor the state of the Standby and Witness, and if both disconnect for a sufficient period, it should assume the Standby will be promoted. In this case, it would power itself down to prevent applications from interacting with it.

Finally, always disable the PostgreSQL start up mechanism on clusters equipped with high availability management software. That software should manage starting and stopping the service, and this will also prevent unintended events such as accidentally having two primary nodes active simply because a reboot started a previously failed PostgreSQL node.

Advanced recipes later in this book will adhere to the rule of disabling PostgreSQL on startup and provide exact instructions for doing so.

There's more...

The reason preventing split brain is so difficult is that it's not an easy problem to solve. This is why dedicated software for managing high availability exists. While these are not immune to the issue, they greatly reduce the potential of occurrence.

Pacemaker has components specifically for interacting with STONITH hardware. repmgr implements the concept of hook scripts for event notifications, and accounts for scenarios where the Primary is isolated from the remainder of the cluster as seen in the previous diagram. Patroni uses a sophisticated locking mechanism that only allows one primary node to be registered at once.

Don't try to invent a solution for an already solved problem when most of the work has already been done by companies dedicated to the cause.