Mastering PostgreSQL 17: Elevate your database skills with advanced deployment, optimization, and security strategies, Sixth Edition

Hans-Jürgen Schönig

What is New in PostgreSQL 17

PostgreSQL has come a long way since its origins in the POSTGRES project in 1986. Today, it is one of the most widely used open source databases in the world. In this chapter, you will be introduced to the most important and shiny features of PostgreSQL 17. Of course, the list of new features is long; therefore, this chapter will focus on those that are expected to be most relevant to the majority of users.

We will cover the following topics in this chapter:

  • Understanding DBA and administration features
  • Using SQL and developer features
  • Making use of new replication and backup add-ons
  • Considering breaking changes in PostgreSQL 17

By the end of this chapter, you will know about these shiny new features and understand how to use the new functionality in practice.

Understanding DBA and administration features

In every release, a comprehensive set of features is added to make the lives of database administrators (DBAs) easier and more effective. The same is true for PostgreSQL. So, let us take a look and dive into the new features of PostgreSQL 17.

Terminating long transactions

PostgreSQL supports various features to limit the duration of statements, the maximum time a query will wait for locks, and a lot more. However, there is one feature that has been requested by customers for a long time and has finally made it into this release of PostgreSQL – the ability to limit the duration of a transaction. In the past, PostgreSQL supported an instance-wide configuration variable called old_snapshot_threshold. The idea behind this variable was to limit the maximum length of a transaction to avoid table bloat as well as some other issues that are dangerous for the server. However, old_snapshot_threshold could only be set per instance and not in a more fine-grained way. Thus, the community has removed this feature and introduced a new variable called transaction_timeout, which can be set per session.

The default value of this new setting is “unlimited” (0):

test=# SHOW transaction_timeout ;
 transaction_timeout
---------------------
 0
(1 row)

However, if you want to limit the duration of your transaction, you can simply set a value inside your session. The following command sets the configuration variable to 5 seconds (5000 milliseconds):

test=# SET transaction_timeout TO 5000;
SET
test=# BEGIN;
BEGIN
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=#

After 5 seconds, our transaction is terminated and PostgreSQL closes the connection entirely. This is the easiest way to eliminate unnecessarily long transactions and prevent table bloat.
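
Because transaction_timeout is an ordinary configuration parameter, it can also be set persistently rather than per session – for example (the role name is made up), for a single role:

test=# ALTER ROLE web_app SET transaction_timeout TO '30s';
ALTER ROLE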

However, PostgreSQL 17 has more to offer.

Improved event triggers

Event triggers are an important feature and were introduced in PostgreSQL many years ago. What is the main idea behind an event trigger? Imagine somebody changes your data structure by creating a table, an index, or some other kind of object. An event trigger allows us to react to those events and execute code as needed.

In PostgreSQL 17, some functionality has been added to this mechanism. First of all, we now have a configuration variable that looks as follows:

test=# SHOW event_triggers;
 event_triggers
----------------
 on
(1 row)

When this variable is enabled, event triggers will fire for all applicable statements. The important part here is that only superusers can change this setting.

In PostgreSQL 17, it is also possible to create an event trigger on REINDEX. However, this is not as critical as the next new feature that has to be discussed – the ability to trigger on login. Now, what is a login trigger? It calls a function (fires) whenever a new connection to the database is established. Needless to say, this is an incredibly powerful footgun that can cause serious issues because many things can go wrong.

But before we discuss running a trigger during LOGIN, it makes sense to take a look at a simple example and understand how things work in general. The most basic example is to write a trigger that is supposed to track login attempts in a table. To do that, we can create a simple table:

login_trigger=# CREATE TABLE user_logins (
    id serial,
    who text
);
CREATE TABLE

In PostgreSQL, a trigger will always launch a function. Therefore, the first step is to come up with the function we want to run:

CREATE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_logins (who)
    VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

What we see here is that the return value of this function is event_trigger – it is a special data type specifically designed for this purpose. The rest is plain and simple PL/pgSQL code that does not require a return value.

Finally, we can define the trigger itself:

CREATE EVENT TRIGGER on_login_event
ON login
EXECUTE FUNCTION on_login_proc();

Note that the event we are interested in is login. The rest is like a normal event trigger that calls a function of our choice. In the next step, we can already enable the trigger:

login_trigger=# ALTER EVENT TRIGGER on_login_event ENABLE ALWAYS;
ALTER EVENT TRIGGER

Congratulations, the trigger has been deployed successfully. Let us log out and reconnect to the database. After we have reestablished the connection, we can already see the content of our audit table:

login_trigger=# SELECT * FROM user_logins;
 id | who
----+-----
  1 | hs
(1 row)

This looks pretty successful but why did we just call this type of trigger a footgun? Let us modify the function and see what happens:

CREATE OR REPLACE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_lo (w) VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

The function is now broken – it references a table that does not exist. The consequences are nothing short of a total failure:

linux$ psql login_trigger
psql: error: connection to server
  on socket "/tmp/.s.PGSQL.5432"
  failed: FATAL:  relation "user_lo" does not exist
LINE 1: INSERT INTO user_lo (w) VALUES (SESSION_USER)
QUERY:  INSERT INTO user_lo (w) VALUES (SESSION_USER)
CONTEXT:  PL/pgSQL function on_login_proc() line 3 at SQL statement

The entire authentication process will fail. That is important because small mistakes can lead to large-scale outages. It is therefore highly recommended to think twice before deploying event triggers to handle login attempts. What is important to understand is that PostgreSQL does exactly what it has been designed to do – it just does so in a sensitive area that can cause a lot of trouble.

The problem is that once you have installed a broken login trigger and want to get rid of it, you are in for a hard time. First, you have to shut down the database and then start it in single-user mode (https://www.postgresql.org/docs/17/app-postgres.html). There, you can drop the event trigger, because event triggers are disabled in single-user mode – you can therefore log in without those functions being fired.
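
As a rough sketch (the data directory path is a placeholder), the recovery could look like this: stop the server, start a single-user backend, and drop the offending trigger:

linux$ pg_ctl stop -D /path/to/data
linux$ postgres --single -D /path/to/data login_trigger
backend> DROP EVENT TRIGGER on_login_event;

Pressing Ctrl + D ends the single-user session; after that, the server can be started normally again.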

Inspecting wait events in PostgreSQL

PostgreSQL provides a shiny new system view called pg_wait_events. For many years, wait events have been an integral part of PostgreSQL, allowing us to monitor and inspect all kinds of performance problems. However, in real life, DBAs often had to switch between the database and the documentation to figure out which types of wait events actually exist.

pg_wait_events puts an end to this type of problem and provides an easy way to understand which events are there and what they mean. Here is an example:

test=# \x
Expanded display is on.
test=# SELECT *
FROM pg_wait_events
WHERE name = 'DataFileFlush';
-[ RECORD 1 ]----------------------------------------
type        | IO
name        | DataFileFlush
description | Waiting for a relation data file to
              reach durable storage

What we can see here is that DataFileFlush means that we are waiting for the operating system to write the data to the physical storage device.

The beauty is that this new view provides a comprehensive overview that might surprise many people:

test=# SELECT  type, count(*)
FROM    pg_wait_events
GROUP BY ROLLUP (1)
ORDER BY 1;
   type    | count
-----------+-------
 Activity  |    16
 BufferPin |     1
 Client    |     9
 Extension |     1
 IO        |    77
 IPC       |    57
 LWLock    |    81
 Lock      |    12
 Timeout   |    10
           |   264
(10 rows)

Yes, this is true: PostgreSQL knows a grand total of 264 different types of events, which is a huge number.

Digging into checkpoints and background writing

The background writer and the checkpoint process have been around for many years. In the latest release of PostgreSQL, the system statistics related to those processes have been changed. First of all, a couple of fields have been removed from the pg_stat_bgwriter system view:

test=# \d pg_stat_bgwriter
     View "pg_catalog.pg_stat_bgwriter"
      Column      |           Type   ...
------------------+-------------------------
 buffers_clean    | bigint
 maxwritten_clean | bigint
 buffers_alloc    | bigint
 stats_reset      | timestamp with time zone

The view is way more compact now because a great deal of this information has been moved to a new system view that is defined as follows:

test=# \d pg_stat_checkpointer
                     View "pg_catalog.pg_stat_checkpointer"
       Column        |           Type ...
---------------------+--------------------------
 num_timed           | bigint
 num_requested       | bigint
 restartpoints_timed | bigint
 restartpoints_req   | bigint
 restartpoints_done  | bigint
 write_time          | double precision
 sync_time           | double precision
 buffers_written     | bigint
 stats_reset         | timestamp with time zone

The pg_stat_checkpointer view contains most of the information previously found in pg_stat_bgwriter. Therefore, it is necessary to adjust your monitoring queries to reflect those changes.
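
For instance, a monitoring query that used to read checkpoint counters from pg_stat_bgwriter would now look at pg_stat_checkpointer instead – a minimal sketch based on the columns shown above:

test=# SELECT num_timed, num_requested, buffers_written,
              write_time, sync_time
FROM pg_stat_checkpointer;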

Improving pg_stat_statements

The pg_stat_statements module is an extension for PostgreSQL that is shipped as part of the contrib package. To me, it has always been the gold standard for performance analysis; therefore, I am happy to see even more changes made to improve the data provided by this view. There are various interesting ones worth mentioning:

  • CALL statements now support parameters as placeholders
  • Allows placeholders for savepoint and 2PC-related commands
  • Tracks DEALLOCATE statements
  • Adds support for local block I/O statistics
  • Adds more details to JIT statistics
  • Adds an optional argument to pg_stat_statements_reset()

Overall, these improvements will make pg_stat_statements more compact and thus make it easier to find relevant information.
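
As a point of reference (this assumes the extension has been added to shared_preload_libraries and created with CREATE EXTENSION pg_stat_statements), the view is typically queried like this to find the most expensive statements:

test=# SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;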

Adding permissions for maintenance tasks

In the past, maintenance was often done by the superuser. The goal is to avoid this. Therefore, more and more permissions and roles have been added over the years to reduce the need for the superuser.

PostgreSQL 17 has added the MAINTAIN permission, which allows the grantee to execute various important maintenance tasks, such as the following:

  • VACUUM and ANALYZE
  • CLUSTER
  • REINDEX
  • REFRESH MATERIALIZED VIEW
  • LOCK TABLE

The feature works as follows:

test=# CREATE USER joe;
CREATE ROLE
test=# GRANT MAINTAIN ON …

Tab completion reveals the full power of this new feature – the number of options is quite large (a complete GRANT example follows the list):

  • ALL FUNCTIONS IN SCHEMA
  • DATABASE
  • TABLE
  • ALL PROCEDURES IN SCHEMA
  • DOMAIN
  • LANGUAGE
  • ROUTINE
  • TABLESPACE
  • ALL ROUTINES IN SCHEMA
  • FOREIGN DATA WRAPPER
  • LARGE OBJECT
  • SCHEMA
  • TYPE
  • ALL SEQUENCES IN SCHEMA
  • FOREIGN SERVER
  • PARAMETER
  • SEQUENCE
  • ALL TABLES IN SCHEMA
  • FUNCTION
  • PROCEDURE
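
To complete the truncated GRANT from above, here is a minimal sketch (the table name is made up; joe is the role created earlier):

test=# GRANT MAINTAIN ON TABLE some_table TO joe;
GRANT
test=# GRANT MAINTAIN ON ALL TABLES IN SCHEMA public TO joe;
GRANT

Afterward, joe can run commands such as VACUUM, ANALYZE, or REINDEX on those tables without needing superuser rights.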

After dealing with DBA-related functionalities, we can now turn our attention to some more developer-oriented functionalities.

Using SQL and developer features

In this section, we will discuss some of the most desired developer and SQL features that have made it into PostgreSQL 17.

Teaching COPY error handling

Let us start with my personal favorite: COPY is finally able to handle errors in a reasonably good way. Many people have been frustrated by the old behavior: standard PostgreSQL simply stops COPY when it hits an error. It is good to see that this vital functionality has made it into the official release of PostgreSQL 17.

The main question arising is: how can we make use of this feature? Here is a simple command to create a sample table:

test=# CREATE TABLE t_data (
    id    int,
    data  text
);
CREATE TABLE

The goal is to import the following dataset into the table and make sure those errors are handled properly:

1  hans
2  paul
abc  joe
4  jane
def  james
5  laura

What we see here is that the data is definitely wrong. The following listing proves this beyond doubt:

test=# COPY t_data FROM '/tmp/file.txt';
ERROR:  invalid input syntax for type integer: "abc"
CONTEXT:  COPY t_data, line 3, column id: "abc"

Fortunately, we can handle this kind of problem in PostgreSQL 17 and simply ignore the error:

test=# COPY t_data FROM '/tmp/file.txt'
WITH (ON_ERROR 'ignore');
NOTICE:  2 rows were skipped due to data type incompatibility
COPY 4
test=# SELECT * FROM t_data ;
 id | data
----+-------
  1 | hans
  2 | paul
  4 | jane
  5 | laura
(4 rows)

PostgreSQL will import the data and simply skip over invalid rows. A NOTICE message will indicate how many rows have been skipped. As of version 17, two ON_ERROR settings are supported: stop and ignore. In the future, it is likely that more options will be available.

Splitting and merging partitions

In recent years, there has not been a single version of PostgreSQL that did not provide relevant improvements to partitioning as a whole. The same holds true for the new release. This time, the development team has been working on splitting and merging partitions, which has been a frequent requirement over the years.

The following listing shows how a simple table including a partition can be created:

CREATE TABLE t_timeseries (
    id serial,
    d date,
    payload text
) PARTITION BY RANGE (d);
CREATE TABLE t_timeseries_2024
PARTITION OF t_timeseries
FOR VALUES FROM ('2024-01-01')
TO ('2025-01-01');

Here is a typical example one would encounter in real life. We have some kind of time series and we are using range partitions to split the data into smaller chunks for various reasons (faster cleanup, scalability, and so on). However, individual partitions often grow too large and have to be broken up into smaller pieces. This is when PostgreSQL can do some of the heavy lifting for us:

ALTER TABLE t_timeseries
    SPLIT PARTITION t_timeseries_2024
    INTO (
        PARTITION t_timeseries_2024_h1
            FOR VALUES FROM ('2024-01-01') TO ('2024-07-01'),
        PARTITION t_timeseries_2024_h2
            FOR VALUES FROM ('2024-07-01') TO ('2025-01-01')
    );

What we do here is take our partition and split it into two new chunks that contain roughly half of the data. Note that this is a single command that takes care of this operation.

While splitting partitions into various pieces might be by far the more common operation, it is also possible to reverse this decision and unify various partitions into a single entity. The way to do that is to use the ALTER TABLE … MERGE PARTITIONS … command, which is just as easy to use as the SPLIT command we have just seen:

ALTER TABLE t_timeseries
MERGE PARTITIONS (
    t_timeseries_2024_h1, 
    t_timeseries_2024_h2
)
INTO t_timeseries_2024;

All we have to do here is to tell PostgreSQL which partitions are supposed to form the new entity and let the database engine do its magic.

Turning numbers into binary and octal values

One of the lesser-known features that made it into PostgreSQL 17 is the ability to convert numbers to a binary or octal representation. Two overloaded functions have been added – to_bin and to_oct:

test=# \df *to_bin*
   List of functions
   Schema   |  Name  | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+------
 pg_catalog | to_bin | text             | bigint              | func
 pg_catalog | to_bin | text             | integer             | func
(2 rows)
test=# \df *to_oct*
   List of functions
   Schema   |  Name  | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+------
 pg_catalog | to_oct | text             | bigint              | func
 pg_catalog | to_oct | text             | integer             | func
(2 rows)

Both functions can be called with 32- or 64-bit integer values. The following listing shows an example of those functions in action:

test=# SELECT to_bin(4711), to_oct(4711);
    to_bin     | to_oct
---------------+--------
 1001001100111 | 11147
(1 row)

Improving MERGE even more

MERGE has been around for a couple of releases now. In SQL, the MERGE command is used to merge data from a source table into a target table. This command is useful when you need to update or insert rows based on a common column between the two tables.

The new release of PostgreSQL has also introduced another feature, namely, WHEN NOT MATCHED BY SOURCE THEN. This additional syntax allows us to define the behavior even better and adds some flexibility.

Here is how it works:

CREATE TABLE t_demo (
    a int PRIMARY KEY, 
    b int
);
INSERT INTO t_demo
VALUES (1, 4711), 
       (2, 5822), 
       (3, 6933);
CREATE TABLE t_source (
    a int PRIMARY KEY, 
    b int
);
INSERT INTO t_source
VALUES (2, 6822), 
       (3, 6933), 
       (4, 1252);
MERGE INTO t_demo AS t1
USING t_source AS t2 
ON t1.a = t2.a
WHEN MATCHED THEN
    UPDATE SET b = t1.b * 100
WHEN NOT MATCHED THEN
    INSERT (a, b) VALUES (t2.a, t2.b)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
RETURNING t1.*, t2.*;

This MERGE statement will return the following data:

 a |   b    | a |  b
---+--------+---+------
 1 |   4711 |   |
 2 | 582200 | 2 | 6822
 3 | 693300 | 3 | 6933
 4 |   1252 | 4 | 1252
(4 rows)

RETURNING * can be really useful to debug the statement as a whole. The same is true in our example: a = 1 is present in the original table but not in the source table, and the row is therefore deleted. In the case of a = 2 and a = 3, we get a full match and, therefore, the UPDATE branch executes. a = 4 is only present in the t_source table and is therefore inserted into the t_demo table.

The following table shows what we can expect to find after the MERGE operation:

 a |   b
---+--------
 2 | 582200
 3 | 693300
 4 |   1252
(3 rows)

As you can see, all three cases defined in the MERGE statement have been executed successfully. The question is: which row was touched by which rule? We can modify the RETURNING clause a bit:

RETURNING merge_action(), t1.*, t2.*

In this case, PostgreSQL will provide us with even more information, as we can see in the following listing:

 merge_action | a |   b    | a |  b
--------------+---+--------+---+------
 DELETE       | 1 |   4711 |   |
 UPDATE       | 2 | 582200 | 2 | 6822
 UPDATE       | 3 | 693300 | 3 | 6933
 INSERT       | 4 |   1252 | 4 | 1252
(4 rows)

Additional JSON functionality

The number of JSON-related functionalities has skyrocketed over the years. The same is true for version 17, which provides many more features that will make JSON easier to use and more powerful overall. The first thing that caught my attention was the fact that more standard-compliant functions have been added – namely, JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE().
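
As a minimal sketch of these new standard-compliant functions (the JSON literals are made up):

test=# SELECT JSON_EXISTS('{"a": 1}'::jsonb, '$.a') AS has_a,
              JSON_VALUE('{"a": 1}'::jsonb, '$.a') AS a_value,
              JSON_QUERY('{"a": [1, 2]}'::jsonb, '$.a') AS a_array;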

What is also noteworthy is JSON_TABLE, which allows us to turn a JSON document into a tabular format in one go. This is pretty similar to what XMLTABLE does.

The syntax might look as follows:

SELECT jt.*
FROM customers,
     JSON_TABLE (
         js, '$.favorites[*]' COLUMNS (
             id FOR ORDINALITY,
             country text PATH '$.country',
             branch text PATH '$.industry[*].branch' WITH WRAPPER,
             ceo text PATH '$.company[*].ceo' WITH WRAPPER
         )
     ) AS jt;

What this does is address various elements in the JSON document and return them in a tabular format that we can actually and safely read.
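
The query assumes a customers table with a jsonb column called js. A hedged, minimal definition that the statement above could run against might look like this (the sample document is made up):

test=# CREATE TABLE customers (
    id serial PRIMARY KEY,
    js jsonb
);
CREATE TABLE
test=# INSERT INTO customers (js) VALUES
('{"favorites": [{"country": "Austria",
                  "industry": [{"branch": "IT"}],
                  "company":  [{"ceo": "Some Name"}]}]}');
INSERT 0 1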

Creating BRIN indexes in parallel

Technically, this is not a developer feature, but given the fact that performance topics are often hard to categorize, I decided to include it here. BRIN indexes are often used in data warehouses to quickly filter data without carrying the overhead of full-blown B-tree indexes. Creating B-trees has long been possible using more than one CPU. However, in PostgreSQL 17, it is now possible to create BRIN indexes in parallel as well, which can greatly speed up the process.
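
As a quick, hedged sketch (reusing the t_timeseries table from the partitioning example; the index name is made up), a BRIN index is created like any other index, and the parallel build is governed by max_parallel_maintenance_workers:

test=# SET max_parallel_maintenance_workers TO 4;
SET
test=# CREATE INDEX idx_timeseries_brin ON t_timeseries USING brin (d);
CREATE INDEX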

Making use of new replication and backup add-ons

Having worked your way through some of the new developer-related features, you are now ready to look at the new version's powerful additions related to replication and backups. In this section, we'll delve into the more complex world of database management, exploring topics that are new to PostgreSQL 17.

More powerful pg_dump, again

pg_dump is the single most well-known tool for running a basic backup in PostgreSQL. It is a command-line utility that comes with PostgreSQL, used for backing up a PostgreSQL database or extracting its schema and data in a format suitable for loading into another PostgreSQL database. The main question is: after 38 years of development, what might have been added to this tool that is not already there? Well, the answer is that you can now define a file that configures what you want to dump and what you want to ignore. By adding the --filter option, we can feed pg_dump a file containing all our desired rules.
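
To illustrate (the file name and patterns are made up; see the pg_dump documentation for the exact filter file grammar), each line of the filter file contains one rule, and the file is passed via --filter:

linux$ cat /tmp/dump_rules.txt
include table public.important_table
exclude table_data public.log_*
linux$ pg_dump --filter=/tmp/dump_rules.txt -f /tmp/mydump.sql test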

Handling incremental base backups

Talking about backups in general, pg_basebackup has also been extended. PostgreSQL 17 supports the idea of incremental base backups. Why is that important? Often, we might want to use a simple backup policy such as “Take a base backup every night and keep it for 7 days.” The problem is that if your database is large (say, 50 TB) but static (virtually no changes), you will waste a lot of space just to store the backups, which can, of course, lead to serious cost considerations. Incremental base backups address this issue. To make use of them, WAL summarization has to be enabled on the server:

summarize_wal = on
wal_summary_keep_time = '7d'

A new process called the WAL summarizer has been added to PostgreSQL. It keeps track of which blocks have been changed and helps pg_basebackup copy only those blocks that have indeed been touched, which causes the amount of space needed for backups to drop significantly.

Here is how it works:

pg_basebackup -h source_server.com \
  -D /data/full --checkpoint=fast
...
pg_basebackup -h source_server.com \
  --checkpoint=fast \
  --incremental=/data/full/backup_manifest \
     -D /backup/incremental

The secret to success is the backup manifest that is needed to run the incremental backup. It contains all the necessary information to tell the tooling what has to be done.

After running those two commands, we have a full backup as well as an incremental one. The question now is: how can we combine those things together and turn them into something usable? The following command shows how this works:

$ pg_combinebackup --help
pg_combinebackup reconstructs full backups from incrementals.
Usage:
  pg_combinebackup [OPTION]... DIRECTORY...
Options:
  -d, --debug               generate lots of debugging output
  -n, --dry-run             do not actually do anything
  -N, --no-sync             do not wait for changes to be written 
                            safely to disk
  -o, --output              output directory
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                            relocate tablespace in OLDDIR to NEWDIR
      --clone               clone (reflink) instead of copying files
      --copy-file-range     copy using copy_file_range() syscall
      --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE
                            use algorithm for manifest checksums
      --no-manifest         suppress generation of backup manifest
      --sync-method=METHOD  set method for syncing files to disk
  -V, --version             output version information, then exit
  -?, --help                show this help, then exit

pg_combinebackup does exactly what we want. It creates the desired set of files that are then needed for recovery. Given our example, we could use the following instruction to combine our full backup with our incremental backup:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental

What is noteworthy here is that this process works for one base backup and exactly one incremental backup. However, in real life, we might have to apply a set of incremental backups to reach the desired state. In this case, we can simply list all those incremental ones one after the other, as shown in the next listing:

pg_combinebackup -o /data/combined \
  /data/full \
  /backup/incremental \
  /backup/incremental2 \
  /backup/incremental3

Simply list all the incremental backups to produce the desired state.

Logical replication upgraded

In PostgreSQL, there are two types of replication: physical (binary) and logical (text) replication. While binary replication is ideal for all kinds of backup, logical replication has become more and more widespread in heterogeneous environments to achieve cross-cloud portability.

The trouble was that publications and subscriptions (the backbone of logical replication) were lost during pg_upgrade runs prior to PostgreSQL 17. This has now changed and significantly eases the burden of upgrading.

Adding pg_createsubscriber

In the new release, we can all enjoy a new command-line tool called pg_createsubscriber. What is the purpose of this new tool? When people decide to use logical replication, the initial sync phase can take quite a while – especially when the database instance is large. pg_createsubscriber has been designed to help solve this problem. It takes a physical standby (binary replication) and turns it into a logical replica by wiring up all the publications, subscriptions, and so on for you. For each database, a replication set will be created and configured automatically. The command has to be executed on the target system.
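
The following is only a rough sketch (directories, connection strings, and database names are placeholders; consult pg_createsubscriber --help for the full option list). The tool is run on the physical standby that is to be converted:

linux$ pg_createsubscriber \
    -D /data/standby \
    -d test \
    -P "host=primary.example.com port=5432 dbname=test"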

Considering breaking changes in PostgreSQL 17

PostgreSQL tries to keep the user interface as stable as possible. However, once in a while, breaking changes are necessary. This is, of course, also true for the current release.

Let us take a look at some of those changes. The first thing that has happened is that support for AIX has been dropped. This somehow makes sense because neither anybody here at CYBERTEC nor any fellow PostgreSQL consultant I know has seen an AIX deployment in years.

The next thing is that --disable-thread-safety and the custom MSVC build system have been dropped. None of these changes will hurt typical users.

What is more important is that some tooling has been removed from the contrib section. The module I am referring to is adminpack, which has not been widely used anyway. The same is true for the “snapshot too old” feature, which has been replaced with a far better implementation (transaction_timeout).

Finally, search_path is now fully secured during maintenance operations, which means that maintenance scripts should use fully qualified object names.

Summary

The new release has countless new features, and it is close to impossible to mention them all. During the development cycle, well over 2,000 commits were made and thousands of things were improved.

Some of the key features, such as fault-tolerant COPY and improved partitioning, have long been awaited and finally made it into the core. Things such as incremental base backups will significantly reduce the cost of large-scale PostgreSQL deployments. And other features simply lead to a way better user experience.

Therefore, relax, lean back, and enjoy the brand-new release of PostgreSQL, which will be covered in this book in great detail.

In Chapter 2, Understanding Transactions and Locking, we will discuss important concepts such as transactions and locking, which form a core component of every relational database system.

Key benefits

  • Optimize queries and performance for PostgreSQL installations
  • Secure databases with advanced access controls and encryption
  • Master replication, backups, and disaster recovery strategies
  • Purchase of the print or Kindle book includes a free PDF eBook

Description

Starting with new features introduced in PostgreSQL 17, the sixth edition of this book provides comprehensive insights into advanced database management, helping you elevate your PostgreSQL skills to an expert level. Written by Hans-Jürgen Schönig, a PostgreSQL expert with over 25 years of experience and the CEO of CYBERTEC PostgreSQL International GmbH, this guide distills real-world expertise from supporting countless global customers. It guides you through crucial aspects of professional database management, including performance tuning, replication, indexing, and security strategies. You’ll learn how to handle complex queries, optimize execution plans, and enhance user interactions with advanced SQL features such as window functions and JSON support. Hans equips you with practical approaches for managing database locks, transactions, and stored procedures to ensure peak performance. With real-world examples and expert solutions, you'll also explore replication techniques for high availability, along with troubleshooting methods to detect and resolve bottlenecks, preparing you to tackle everyday challenges in database administration. By the end of the book, you'll be ready to deploy, secure, and maintain PostgreSQL databases efficiently in production environments.

Who is this book for?

This book is for database administrators, PostgreSQL developers, and IT professionals who want to implement advanced functionalities and tackle complex administrative tasks using PostgreSQL 17. A foundational understanding of PostgreSQL and core database concepts is essential, along with familiarity with SQL. Prior experience in database administration will enhance your ability to leverage the advanced techniques presented in this book.

What you will learn

  • Deploy and manage PostgreSQL in production environments
  • Improve database throughput and ensure speedy responses from PostgreSQL
  • Utilize indexes, partitions, and full-text search
  • Handle transactions, locking, and concurrency issues
  • Secure PostgreSQL with encryption and access controls
  • Implement replication for high availability
  • Get to grips with handling redundancy and data safety
  • Fix the most common issues and real-world problems faced by PostgreSQL users

Product Details

Publication date: Dec 27, 2024
Length: 474 pages
Edition: 6th
Language: English
ISBN-13: 9781836205975

Table of Contents

Chapter 1: What is New in PostgreSQL 17
Chapter 2: Understanding Transactions and Locking
Chapter 3: Making Use of Indexes
Chapter 4: Handling Advanced SQL
Chapter 5: Log Files and System Statistics
Chapter 6: Optimizing Queries for Good Performance
Chapter 7: Writing Stored Procedures
Chapter 8: Managing PostgreSQL Security
Chapter 9: Handling Backup and Recovery
Chapter 10: Making Sense of Backups and Replication
Chapter 11: Deciding on Useful Extensions
Chapter 12: Troubleshooting PostgreSQL
Chapter 13: Migrating to PostgreSQL
Index
Other Books You May Enjoy
