Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Mastering PostgreSQL 17

You're reading from   Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Arrow left icon
Product type Paperback
Published in Dec 2024
Publisher Packt
ISBN-13 9781836205975
Length 474 pages
Edition 6th Edition
Languages
Concepts
Arrow right icon
Author (1):
Arrow left icon
Hans-Jürgen Schönig Hans-Jürgen Schönig
Author Profile Icon Hans-Jürgen Schönig
Hans-Jürgen Schönig
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Chapter 1: What is New in PostgreSQL 17 2. Chapter 2: Understanding Transactions and Locking FREE CHAPTER 3. Chapter 3: Making Use of Indexes 4. Chapter 4: Handling Advanced SQL 5. Chapter 5: Log Files and System Statistics 6. Chapter 6: Optimizing Queries for Good Performance 7. Chapter 7: Writing Stored Procedures 8. Chapter 8: Managing PostgreSQL Security 9. Chapter 9: Handling Backup and Recovery 10. Chapter 10: Making Sense of Backups and Replication 11. Chapter 11: Deciding on Useful Extensions 12. Chapter 12: Troubleshooting PostgreSQL 13. Chapter 13: Migrating to PostgreSQL 14. Index 15. Other Books You May Enjoy

Making use of new replication and backup add-ons

As you’ve worked your way through some of the new developer-related features, you’re now ready to address the new version’s powerful set of advanced features related to database administration. In this section, we’ll delve into the more complex world of database management, exploring topics that are new to PostgreSQL 17.

More powerful pg_dump, again

pg_dump is the single most well-known tool to run a basic backup in PostgreSQL. It is a command-line utility that comes with PostgreSQL, used for backing up a PostgreSQL database or extracting its schema and data in a format suitable for loading into another PostgreSQL database. The main question is: after 38 years of development, what might have been added to this tool that is not already there? Well, the answer is that you can now define a file that configures what you want to dump and what you want to ignore. By adding the --filter option, we can feed a file containing all our desired rules.

Handling incremental base backups

Talking about backups in general, pg_basebackup has also been extended. PostgreSQL 17 supports the idea of incremental base backups. Why is that important? Often, we might want to use a simple backup policy such as “Take a base backup every night and keep it for 7 days.” The problem is that if your database is large (say, 50 TB) but static (virtually no changes), you will waste a lot of space just to store the backup, which can, of course, lead to serious cost considerations. Incremental base backup addresses this issue:

summarize_wal = on
wal_summary_keep_time = '7d'

A new process called summarizer was added to PostgreSQL. It will keep track of all those blocks that have been changed and help pg_basebackup to only copy those blocks that have indeed been touched, which reduces the amount of space needed for the backups to drop significantly.

Here is how it works:

pg_basebackup -h source_server.com \
  -D /data/full --checkpoint=fast
...
pg_basebackup -h source_server.com \
  --checkpoint=fast \
  --incremental=/data/full/backup_manifest \
     -D /backup/incremental

The secret to success is the backup manifest that is needed to run the incremental backup. It contains all the necessary information to tell the tooling what has to be done.

After running those two commands, we have a full backup as well as an incremental one. The question now is: how can we combine those things together and turn them into something usable? The following command shows how this works:

$ pg_combinebackup --help
pg_combinebackup reconstructs full backups from incrementals.
Usage:
  pg_combinebackup [OPTION]... DIRECTORY...
Options:
  -d, --debug               generate lots of debugging output
  -n, --dry-run             do not actually do anything
  -N, --no-sync             do not wait for changes to be written 
                            safely to disk
  -o, --output              output directory
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                            relocate tablespace in OLDDIR to NEWDIR
      --clone               clone (reflink) instead of copying files
      --copy-file-range     copy using copy_file_range() syscall
      --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE
                            use algorithm for manifest checksums
      --no-manifest         suppress generation of backup manifest
      --sync-method=METHOD  set method for syncing files to disk
  -V, --version             output version information, then exit
  -?, --help                show this help, then exit

pg_combinebackup does exactly what we want. It creates the desired set of files that are then needed for recovery. Given our example, we could use the following instruction to combine our full backup with our incremental backup:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental

What is noteworthy here is that this process works for one base backup and exactly one incremental backup. However, in real life, we might have to apply a set of incremental backups to reach the desired state. In this case, we can simply list all those incremental ones one after the other, as shown in the next listing:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental \
  /backup/incremental2 \
  /backup/incremental3

Simply list all the incremental backups to produce the desired state.

Logical replication upgraded

In PostgreSQL, there are two types of replication: physical (binary) and logical (text) replication. While binary replication is ideal for all kinds of backup, logical replication has become more and more widespread in heterogeneous environments to achieve cross-cloud portability.

The trouble is that publications and subscriptions (the backbone of logical replication) were lost during pg_upgrades prior to PostgreSQL 17. This has now changed and has significantly eased the burden.

Adding pg_createsubscriber

In the new release, we can all enjoy a new command-line tool called pg_createsubscriber. What is the purpose of this new tool? When people decide to use logical replication, the initial sync phase can take quite a while – especially when the database instance is large. pg_createsubscriber has been designed to help solve this problem. It converts a physical standby (binary replication) and turns it into a logical standby by wiring all the publications, subscriptions, and so on for you. For each database, a replication set will be created and automatically configured. The command has to be executed on the target system.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image