Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
PostgreSQL 12 High Availability Cookbook

You're reading from   PostgreSQL 12 High Availability Cookbook Over 100 recipes to design a highly available server with the advanced features of PostgreSQL 12

Arrow left icon
Product type Paperback
Published in Feb 2020
Publisher Packt
ISBN-13 9781838984854
Length 734 pages
Edition 3rd Edition
Languages
Concepts
Arrow right icon
Author (1):
Arrow left icon
Shaun Thomas Shaun Thomas
Author Profile Icon Shaun Thomas
Shaun Thomas
Arrow right icon
View More author details
Toc

Table of Contents (17) Chapters Close

Preface 1. Architectural Considerations 2. Hardware Planning FREE CHAPTER 3. Minimizing Downtime 4. Proxy and Pooling Resources 5. Troubleshooting 6. Monitoring 7. PostgreSQL Replication 8. Backup Management 9. High Availability with repmgr 10. High Availability with Patroni 11. Low-Level Server Mirroring 12. High Availability via Pacemaker 13. High Availability with Multi-Master Replication 14. Data Distribution 15. Zero-downtime Upgrades 16. Other Books You May Enjoy

Having enough backups

Database backups are a crucial component to any architecture, and should be considered a required part of the central design. The only real question in most cases is: how many backups? All highly available clusters account for relevant backup copies, lest the cluster itself is lost.

In this recipe, we'll cover one simple set of rules to provide an answer.

Getting ready

This is very important, so write it down if necessary. Put it in company documentation if possible.

How to do it...

When considering how many backups to allocate, follow the 3-2-1 backup rule, which consists of these elements:

  1. Keep at least three copies of your data.
  2. Store two copies on different devices.
  3. Keep at least one copy offsite.

Take note that many things may qualify as a copy, including PostgreSQL replicas, and the original data itself.

How it works...

Notice how we don't really recommend a specific backup method, how it should be stored, or how filesystem-based features might contribute. All of those things are implementation details and don't matter as much as the rules themselves.

Consider the first rule: keep at least three copies of your data. Since our PostgreSQL instance is the first copy, we need two more. What might these two be? Could we use one replica and a backup? Maybe we could use two replicas? Perhaps, but let's examine the remaining rules first.

Imagine we've produced a PostgreSQL backup, and it's stored on our primary server and the same filesystem as the database instance. What happens if that storage device is damaged or destroyed? Now, we've lost the database and the backup.

That naturally leads to the second rule: store two copies on different devices. It's fine to retain a copy on the local PostgreSQL server and even the same physical storage device, provided we store a copy of the backup on a device that won't be lost simultaneously. Store another copy of the backup on a second physical device. This can be a separate set of storage drives, a SAN, a shared NFS filesystem, or anything else, so long as it's separate from the database itself.

Be wary of relying on shared infrastructure for following the second rule. If we have two separate LUNs from the same SAN mounted on our database server for PostgreSQL and the backup, this means nothing if the SAN itself is lost. If possible, try to ensure that the backup is actually on a physically distinct device.

So, to fulfill the second rule, we merely need to ensure that the second copy of our PostgreSQL instance is on another device. This is most easily done by creating a replica on another server, or a VM hosted on a different hypervisor. That's two copies of the data.

Finally there's rule three: keep at least one copy offsite. This is the third copy of our data, and it's best to place it somewhere that's immune from a catastrophic failure of the data center itself. In limited circumstances, it may be safe enough to place the backup on a server in another rack of the same data center, but why take the risk?

There are ample cloud providers, vault systems, and cheap storage services that can fill the role of hosting the third and final copy of our data. If we have our own second data center, that's an easy choice of venue. If not, it's important to select, allocate, and designate some tertiary location that won't be lost if the worst happens. This third data copy is an insurance policy, and it doesn't even have to be easily available. As long as we can obtain the backup upon request, that fits the minimum requirement.

There's more...

There's one important corollary here. PostgreSQL replicas tend to immediately reflect every change made to the primary node. What happens if someone accidentally drops a table? In this case, the 3-2-1 rule by itself is not sufficient. Relying on replicas alone means we've permanently lost this data.

Thus, we strongly recommend the following additional rule: At least one copy must be a true backup.

Databases such as PostgreSQL are equipped with Point-In-Time-Recovery (PITR), which allows the user to start with any past backup and apply changes until it reaches a specific point where recovery is stopped. This lets us recover a backup to the point before a table was damaged or removed, capture the desired contents, and reintroduce them into the original database. This can only be done with a real binary-level backup, and there are multiple tools dedicated to performing this task. This book even contains a chapter dedicated to Backup Management.

See also

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime