Search icon CANCEL
Subscription
0
Cart icon
Cart
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
PostgreSQL 12 High Availability Cookbook - Third Edition

You're reading from  PostgreSQL 12 High Availability Cookbook - Third Edition

Product type Book
Published in Feb 2020
Publisher Packt
ISBN-13 9781838984854
Pages 734 pages
Edition 3rd Edition
Languages
Concepts
Author (1):
Shaun Thomas Shaun Thomas
Profile icon Shaun Thomas
Toc

Table of Contents (17) Chapters close

Preface 1. Architectural Considerations 2. Hardware Planning 3. Minimizing Downtime 4. Proxy and Pooling Resources 5. Troubleshooting 6. Monitoring 7. PostgreSQL Replication 8. Backup Management 9. High Availability with repmgr 10. High Availability with Patroni 11. Low-Level Server Mirroring 12. High Availability via Pacemaker 13. High Availability with Multi-Master Replication 14. Data Distribution 15. Zero-downtime Upgrades 16. Other Books You May Enjoy

Introducing indirection

What happens to connections to a PostgreSQL system when the service must be shut down for maintenance, or the node itself experiences a hardware problem? Previous recipes have already recommended we integrate at least one data replica into our design, but how should we handle switching between these resources? A great way to achieve high availability is to make server maintenance or replacement as simple as possible.

The concept we'll be exploring in this recipe will be one of anticipating system outages, and even welcoming them, by incorporating proxy techniques into the design.

Getting ready

There are actually several methods for switching from one PostgreSQL node to another. However, when considering the node architecture as a whole, we need to know the four major techniques to handle node indirection:

  1. Domain name reassignment
  2. Virtual IP address
  3. Session multiplexing software
  4. Software or hardware load balancer

In real terms, these are all basically the same thing: a proxy for our PostgreSQL primary node. Keep this in mind as we consider how they may affect our architecture. It would also be a good idea to have some diagram software ready to describe how communication flows through the cluster.

How to do it...

Integrating a proxy into a PostgreSQL cluster is generally simple if we consider these steps in the design phase:

  1. Assign a proxy to the primary node.
  2. Redirect all communication to the primary node through the proxy.
  3. If the proxy requires dedicated hardware or software, designate two to account for failures.

How it works...

These rules are simple, but that's one of the reasons they're often overlooked. Always communicate with the Primary node through at least one proxy.

Even if this is merely an abstract network name, or an ephemeral IP address, doing so prevents problems that could occur, as seen in the following diagram:

What happens when the Primary PostgreSQL node is offline and the cluster is now being managed by the Standby? We have to reconfigure—and possibly restartany and all applications that connect directly to it. With one simple change, we can avoid that concern, as seen here:

By following the second guideline, all traffic is directed through the Proxy, thus ensuring that either the Primary or Standby will stay online and remain accessible without further invasive changes. Now, we can switch the active primary node, perform maintenance, or even replace nodes entirely, and the application stack will only see the proxy.

We've encountered clusters that do not follow these two guidelines. Sometimes, applications will actually communicate directly with the primary node as assigned by the inventory reference number. This means any time the infrastructure team or vendor needs to reassign or rename nodes, the application becomes unusable for a short period of time.

Sometimes, hardware load balancers are utilized to redirect application traffic to PostgreSQL. On other occasions, this is done with connection multiplexing software such as PgBouncer or HAProxy. In these cases the proxy is not simply a permanent network name or IP address that is associated with the PostgreSQL cluster, but a piece of hardware. This means that a software or hardware failure could also affect the proxy itself.

In this case, we recommend using two proxies, as seen here:

This is especially useful in microarchitectures, which may consist of dozens or even hundreds of different application servers. Each may target a different proxy such that a failure of either only affects the application servers assigned to it.

There's more...

Given that applications must always access PostgreSQL exclusively through the Proxy, we always recommend assigning a reference hostname that is as permanent as possible. This may fit with the company naming scheme, and should always be documented. PostgreSQL nodes may come and go and, in extreme cases, the cluster itself can be swapped for a replacement, but the Proxy is (or should be) forever.

Physical proxy nodes themselves are not immune to maintenance or failure. Thus, it may be necessary to contact the network team to assign a CNAME or other fixture that can remain static even as the proxy hardware fluctuates.

See also

If you want to learn more about how proxies work, check out this resource: https://whatis.techtarget.com/definition/proxy-server

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime