You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Product type Paperback

Published in Dec 2024

Publisher Packt

ISBN-13 9781836205975

Length 474 pages

Edition 6th Edition

Languages

SQL

Tools

PostgreSQL

Concepts

Databases

Author (1):

Hans-Jürgen Schönig

View More author details

Table of Contents (16) Chapters

Preface

1. Chapter 1: What is New in PostgreSQL 17

2. Chapter 2: Understanding Transactions and Locking FREE CHAPTER

3. Chapter 3: Making Use of Indexes

4. Chapter 4: Handling Advanced SQL

5. Chapter 5: Log Files and System Statistics

6. Chapter 6: Optimizing Queries for Good Performance

7. Chapter 7: Writing Stored Procedures

8. Chapter 8: Managing PostgreSQL Security

9. Chapter 9: Handling Backup and Recovery

10. Chapter 10: Making Sense of Backups and Replication

11. Chapter 11: Deciding on Useful Extensions

12. Chapter 12: Troubleshooting PostgreSQL

13. Chapter 13: Migrating to PostgreSQL

14. Index

Why subscribe?

15. Other Books You May Enjoy

Making use of new replication and backup add-ons

As you’ve worked your way through some of the new developer-related features, you’re now ready to address the new version’s powerful set of advanced features related to database administration. In this section, we’ll delve into the more complex world of database management, exploring topics that are new to PostgreSQL 17.

More powerful pg_dump, again

pg_dump is the single most well-known tool to run a basic backup in PostgreSQL. It is a command-line utility that comes with PostgreSQL, used for backing up a PostgreSQL database or extracting its schema and data in a format suitable for loading into another PostgreSQL database. The main question is: after 38 years of development, what might have been added to this tool that is not already there? Well, the answer is that you can now define a file that configures what you want to dump and what you want to ignore. By adding the --filter option, we can feed a file containing all our desired rules.

Handling incremental base backups

Talking about backups in general, pg_basebackup has also been extended. PostgreSQL 17 supports the idea of incremental base backups. Why is that important? Often, we might want to use a simple backup policy such as “Take a base backup every night and keep it for 7 days.” The problem is that if your database is large (say, 50 TB) but static (virtually no changes), you will waste a lot of space just to store the backup, which can, of course, lead to serious cost considerations. Incremental base backup addresses this issue:

summarize_wal = on
wal_summary_keep_time = '7d'

A new process called summarizer was added to PostgreSQL. It will keep track of all those blocks that have been changed and help pg_basebackup to only copy those blocks that have indeed been touched, which reduces the amount of space needed for the backups to drop significantly.

Here is how it works:

pg_basebackup -h source_server.com \
  -D /data/full --checkpoint=fast
...
pg_basebackup -h source_server.com \
  --checkpoint=fast \
  --incremental=/data/full/backup_manifest \
     -D /backup/incremental

The secret to success is the backup manifest that is needed to run the incremental backup. It contains all the necessary information to tell the tooling what has to be done.

After running those two commands, we have a full backup as well as an incremental one. The question now is: how can we combine those things together and turn them into something usable? The following command shows how this works:

$ pg_combinebackup --help
pg_combinebackup reconstructs full backups from incrementals.
Usage:
  pg_combinebackup [OPTION]... DIRECTORY...
Options:
  -d, --debug               generate lots of debugging output
  -n, --dry-run             do not actually do anything
  -N, --no-sync             do not wait for changes to be written 
                            safely to disk
  -o, --output              output directory
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                            relocate tablespace in OLDDIR to NEWDIR
      --clone               clone (reflink) instead of copying files
      --copy-file-range     copy using copy_file_range() syscall
      --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE
                            use algorithm for manifest checksums
      --no-manifest         suppress generation of backup manifest
      --sync-method=METHOD  set method for syncing files to disk
  -V, --version             output version information, then exit
  -?, --help                show this help, then exit

pg_combinebackup does exactly what we want. It creates the desired set of files that are then needed for recovery. Given our example, we could use the following instruction to combine our full backup with our incremental backup:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental

What is noteworthy here is that this process works for one base backup and exactly one incremental backup. However, in real life, we might have to apply a set of incremental backups to reach the desired state. In this case, we can simply list all those incremental ones one after the other, as shown in the next listing:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental \
  /backup/incremental2 \
  /backup/incremental3

Simply list all the incremental backups to produce the desired state.

Logical replication upgraded

In PostgreSQL, there are two types of replication: physical (binary) and logical (text) replication. While binary replication is ideal for all kinds of backup, logical replication has become more and more widespread in heterogeneous environments to achieve cross-cloud portability.

The trouble is that publications and subscriptions (the backbone of logical replication) were lost during pg_upgrades prior to PostgreSQL 17. This has now changed and has significantly eased the burden.

Adding pg_createsubscriber

In the new release, we can all enjoy a new command-line tool called pg_createsubscriber. What is the purpose of this new tool? When people decide to use logical replication, the initial sync phase can take quite a while – especially when the database instance is large. pg_createsubscriber has been designed to help solve this problem. It converts a physical standby (binary replication) and turns it into a logical standby by wiring all the publications, subscriptions, and so on for you. For each database, a replication set will be created and automatically configured. The command has to be executed on the target system.

The rest of the chapter is locked

You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Table of Contents (16) Chapters

Making use of new replication and backup add-ons

More powerful pg_dump, again

Handling incremental base backups

Logical replication upgraded

Adding pg_createsubscriber

Authors (1)

Personalised recommendations for you

You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Table of Contents (16) Chapters

Making use of new replication and backup add-ons

More powerful pg_dump, again

Handling incremental base backups

Logical replication upgraded

Adding pg_createsubscriber

Unlock this book and the full library FREE for 7 days

Authors (1)

Personalised recommendations for you