Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
PostgreSQL 12 High Availability Cookbook
PostgreSQL 12 High Availability Cookbook

PostgreSQL 12 High Availability Cookbook: Over 100 recipes to design a highly available server with the advanced features of PostgreSQL 12 , Third Edition

Arrow left icon
Profile Icon Shaun Thomas
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.5 (2 Ratings)
Paperback Feb 2020 734 pages 3rd Edition
eBook
$24.99 $35.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
Arrow left icon
Profile Icon Shaun Thomas
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Full star icon Half star icon 4.5 (2 Ratings)
Paperback Feb 2020 734 pages 3rd Edition
eBook
$24.99 $35.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
eBook
$24.99 $35.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

PostgreSQL 12 High Availability Cookbook

Hardware Planning

What does high availability mean? In the context of what we're trying to build, it means we want our database to start and remain online for as long as possible. A critical component of this is the hardware that hosts the database itself. No matter how perfect a machine and its parts may be, the failure, of or unexpected behavior from, any element can result in an outage.

So how do we avoid these unwanted outages? We expect them. We must start by assuming hardware can and will fail, and at the worst possible moment. If we start with that in mind, it becomes much easier to make decisions regarding the composition of each server we are building.

Make no mistake! Much of this planning will rely on worksheets, caveats, and compromise. Some of our choices will have several expensive options, and we will have to weigh the benefits offered against our total cost...

Planning for redundancy

Redundancy means having a spare, but a spare for what? Everything. Every single part, from the motherboard to the chassis, power supply to network cable, disk space to throughput, should have at least one piece of excess equipment or capacity available for immediate use.

The intent of this recipe is to consider as many of these elements as we can imagine before committing to a final inventory purchase.

Getting ready

Fire up your favorite spreadsheet program; we'll be using it to keep track of all the parts that go into the server, and any capacity concerns. If you don't have one, OpenOffice and LibreOffice are good free alternatives for building these spreadsheets, as is Google Sheets. The...

Having enough IOPS

IOPS (stands for Input/Output Operations Per Second) describes how many operations a device can perform per second before it should be considered saturated. If a device is saturated, further requests must wait until the device has spare bandwidth. A server overwhelmed with requests can amount to seconds, minutes, or even hours of delayed results.

Depending on application timeout settings and user patience, a device with low IOPS appears as a bottleneck that reduces both system responsiveness and the perception of quality. A database with insufficient IOPS to service queries in a timely manner is unavailable for all intents and purposes. It doesn't matter if PostgreSQL is still online and serving requests in this scenario, as its availability has already suffered.

In this recipe, we will attempt to account for future storage and throughput needs based on...

Sizing storage

Capacity planning for a database server involves a lot of variables. We must account for table count, user activity, compliance storage requirements, indexes, object bloat, maintenance, archival, and more. We may even need to consider application features that do not yet exist. New functionality often brings additional tables, extra storage standards, and increased archival needs. Planning done now may have little relevance to future usage.

So how do we produce functional estimates for disk space with so many uncertain or fluctuating elements? We primarily want to avoid a scenario where we lack sufficient capacity to continue operating. Exhausting disk space results in ignored queries at best, and a completely frozen and difficult to repair database at worst. Neither are the ingredients of a highly available environment.

In this recipe, we will explore a possible...

Investing in a RAID

A Redundant Array of Independent (or Inexpensive) Disks (RAID) often requires a separate controller card for management. The primary purpose of a RAID is to combine several physical devices into a single logical unit for the sake of redundancy and performance.

This is especially relevant to our interests. Carnegie Mellon University published a study in 2007 on hard drive failure rates. They found that hard drives fail at a rate of about 3% per year. Furthermore, they found that drive type and interface contributed little to disk longevity, and that hard drives do not reflect a tendency to fail early as was commonly accepted. These findings were largely corroborated by a parallel study released the same year by Google.

What does this mean? For our purposes in building a highly available server, it means hard drives should be looked at with great disdain. Larger...

Picking a processor

In selecting a CPU for our server, we have much to consider. At the time of writing, the current trend among processors in every space—including mobile—is toward multiple cores per chip. CPU manufacturers have found that providing a large number of smaller processing units spreads workload horizontally for better overall scalability.

As users of PostgreSQL, this benefits us tremendously. PostgreSQL is based on processes instead of threads. This means each connected client is assigned to a process that can use a CPU core when available. The host operating system can perform such allocations without any input from the database software. Motherboards have limited space, so we need more cores on the same limited real estate, which means more simultaneously active database clients.

Once again, our discussion veers toward capacity planning for a three...

Allocating enough memory

The primary focus when selecting memory for a highly available system is stability. It's no accident that most, if not all, server-class RAM is of the error-correcting variety. There are a few other things to consider that may not appear obvious at first glance.

Due to the multi-core nature of our CPUs, the amount of addressable memory may depend on the core count. In addition, speed, latency, and parity are all considerations. We also must include the number of channels reported by each CPU; failing to match this with an equal count of memory sticks can drastically degrade performance.

This recipe will help ensure our server remains fast and stable by considering memory options.

Getting ready

...

Exploring nimble networking

The network card enables the database server to exchange data with the outside world. This includes far more than web servers, spreadsheets, loading jobs, application servers, and other data consumers. The database server is part of a large continuum of activity, much of which will center around maintenance, management, and even filesystem availability.

Little of this other traffic involves PostgreSQL directly. Much happens in the background regardless of the database and its current workload. Yet even one mishandled network packet across an otherwise normal driver can render the entire server invisible to the outside world or, in extreme cases, even lead to a system panic and subsequent shutdown. On a busy database server, network cards can handle several terabytes of traffic on a daily basis; the margin of error for such a critical piece of hardware...

Managing motherboards

We have been working up to this for quite some time. None of our storage, memory, CPU, or network matters if we have nothing to plug all of it into.

This could have been a long section dedicated to properly weighing the pros and cons of selecting a motherboard manufacturer for maximum stability. It turns out that most server vendors have already done all the hard work in that regard. In fact, a few vendors even disclose many details about the motherboard in their servers outside the model documentation. We can't really read hundreds of pages of documentation about every potential server we would like to consider, so what is the alternative?

No matter where we decide to purchase our server, vendors will not sell—or even present—incompatible choices. If we approached this chapter as intended, we already have a long list of parts, counts...

Selecting a chassis

To round out our hardware selection phase, it's time to decide just what kind of case to order from our server vendor. This is the final protective element that hosts the motherboard, drives, and power supplies necessary to keep everything running. And like always, we place a heavy emphasis on redundancy.

For the purposes of this section, we will concentrate primarily on 1U and 2U rack-mounted servers. Why not 4U or larger? Our goal is to obtain at least two of everything, with similar or matching specifications in every possible scenario. The idea is to scale horizontally in order to more easily replace a failed component or server. As the size of the chassis increases, its cost, complexity, and resource consumption also rise. In this delicate balancing act, it's safer to err toward two smaller systems with respectable capabilities, than one giant...

Saddling up to a SAN

Those familiar with the industry may have encountered NAS as well. How exactly is that different, and how is it relevant to us?

It's subtle but important. While both introduce networked storage, only a SAN grants direct block-level access, as if the allocation were raw, unformatted disk space. NAS systems operate one level higher, providing a fully formatted filesystem such as NFS or CIFS. This means our PostgreSQL database does not have direct control over the filesystem; locks, flushes, allocation, and read cache management are all controlled by a remote server.

When building a highly available server, raw I/O and synchronization messages are very important, and NFS is more for sharing storage than extending the storage capabilities of a server. What must we consider when deciding upon utilizing a SAN, and when should we do this instead of adopting...

Tallying up

Now it's time to get serious. We have discussed all the components that go into a stable server for several pages, and have strongly suggested obtaining multiple spares for each. Well, that applies to the server itself. Not only does this mean having a spare idle server in case of a catastrophic failure, but it means having an online server as well.

Determining excess server inventory isn't quite that simple, but it's fairly close. This is where the project starts to get expensive, but high availability is never cheap; the company itself might depend on it.

Unlike the process we used in Chapter 1, Architectural Considerations, this recipe will focus on server hardware redundancy rather than cluster design. In many ways, this recipe can help augment architectural considerations.

...

Protecting your eggs

Have we ever implied that mere server inventory was sufficient for high availability? The place where our servers live—the data centeralso incorporates several redundancies. Extra network lines, separate power sources, multiple generators, air conditioning and ventilation—everything a server might require—are all part of most data center guarantees.

Yet some have joked that a common backhoe is the natural enemy of the internet. There is more truth to that statement than its apparent lack of gravitas might suggest. Data centers are geographically insecure. Inclement weather, natural disasters, disrupted network backbones, power outages, and of course, accidentally damaged trunk lines (from an errant backhoe?) and simple human error can all remove a data center from the grid. When a data center vanishes from the internet, our servers...

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Newly updated edition, covering the latest PostgreSQL 12 features with hands-on industry-driven recipes
  • Create a PostgreSQL cluster that stays online even when disaster strikes
  • Learn how to avoid costly downtime and data loss that can ruin your business

Description

Databases are nothing without the data they store. In the event of an outage or technical catastrophe, immediate recovery is essential. This updated edition ensures that you will learn the important concepts related to node architecture design, as well as techniques such as using repmgr for failover automation. From cluster layout and hardware selection to software stacks and horizontal scalability, this PostgreSQL cookbook will help you build a PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. You’ll start by understanding how to plan a PostgreSQL database architecture that is resistant to outages and scalable, as it is the scaffolding on which everything rests. With the bedrock established, you'll cover the topics that PostgreSQL database administrators need to know to manage a highly available cluster. This includes configuration, troubleshooting, monitoring and alerting, backups through proxies, failover automation, and other considerations that are essential for a healthy PostgreSQL cluster. Later, you’ll learn to use multi-master replication to maximize server availability. Later chapters will guide you through managing major version upgrades without downtime. By the end of this book, you’ll have learned how to build an efficient and adaptive PostgreSQL 12 database cluster.

Who is this book for?

This book is for Postgres administrators and developers who are looking to build and maintain a highly reliable PostgreSQL cluster. Although knowledge of the new features of PostgreSQL 12 is not required, a basic understanding of PostgreSQL administration is expected.

What you will learn

  • Understand how to protect data with PostgreSQL replication tools
  • Focus on hardware planning to ensure that your database runs efficiently
  • Reduce database resource contention with connection pooling
  • Monitor and visualize cluster activity with Nagios and the TIG (Telegraf, In?uxDB, Grafana) stack
  • Construct a robust software stack that can detect and avert outages
  • Use multi-master to achieve an enduring PostgreSQL cluster

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Feb 25, 2020
Length: 734 pages
Edition : 3rd
Language : English
ISBN-13 : 9781838984854
Vendor :
PostgreSQL Global Development Group
Category :
Languages :
Concepts :
Tools :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Feb 25, 2020
Length: 734 pages
Edition : 3rd
Language : English
ISBN-13 : 9781838984854
Vendor :
PostgreSQL Global Development Group
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 141.97
Mastering PostgreSQL 13
$43.99
Learn PostgreSQL
$48.99
PostgreSQL 12 High Availability Cookbook
$48.99
Total $ 141.97 Stars icon

Table of Contents

16 Chapters
Architectural Considerations Chevron down icon Chevron up icon
Hardware Planning Chevron down icon Chevron up icon
Minimizing Downtime Chevron down icon Chevron up icon
Proxy and Pooling Resources Chevron down icon Chevron up icon
Troubleshooting Chevron down icon Chevron up icon
Monitoring Chevron down icon Chevron up icon
PostgreSQL Replication Chevron down icon Chevron up icon
Backup Management Chevron down icon Chevron up icon
High Availability with repmgr Chevron down icon Chevron up icon
High Availability with Patroni Chevron down icon Chevron up icon
Low-Level Server Mirroring Chevron down icon Chevron up icon
High Availability via Pacemaker Chevron down icon Chevron up icon
High Availability with Multi-Master Replication Chevron down icon Chevron up icon
Data Distribution Chevron down icon Chevron up icon
Zero-downtime Upgrades Chevron down icon Chevron up icon
Other Books You May Enjoy Chevron down icon Chevron up icon

Customer reviews

Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.5
(2 Ratings)
5 star 50%
4 star 50%
3 star 0%
2 star 0%
1 star 0%
Katie L Kabe Jun 12, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Very thorough and well written.
Amazon Verified review Amazon
raghuraman Oct 16, 2020
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
Some of the pages are not formatted properly, how can this be rectified, please advise
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.