Understanding DBA and administration features
In every release, a comprehensive set of features is added to make the lives of database administrators (DBAs) easier and more effective. The same is true for PostgreSQL. So, let us take a look and dive into the new features of PostgreSQL 17.
Terminating long transactions
PostgreSQL supports various features to limit the duration of statements, the maximum time a query will wait for locks, and a lot more. However, there is one feature that has been requested by customers for some time and has finally made it into the next release of PostgreSQL – the ability to limit the duration of a transaction. In the past, PostgreSQL supported an instance-wide configuration variable called old_snapshot_threshold
. The idea of this variable was to limit the maximum length of a transaction to avoid table bloat as well as some other issues that are dangerous for the server. However, the old_snapshot_threshold
variable could only be set per instance and not in a more fine-grained way. Thus, the community has removed this feature and introduced a new variable called transaction_timeout
, which can be set per session.
The default value of this new setting is “unlimited” (0
):
test=# SHOW transaction_timeout ; transaction_timeout --------------------- 0 (1 row)
However, if you want to limit the duration of your transaction, you can simply set a value inside your session. The following command sets the configuration variable to 5 seconds (5000
milliseconds):
test=# SET transaction_timeout TO 5000; SET test=# BEGIN; BEGIN test=*# SELECT now(); now ------------------------------ 2024-06-21 19:37:35.81715+00 (1 row) test=*# SELECT now(); now ------------------------------ 2024-06-21 19:37:35.81715+00 (1 row) test=*# SELECT now(); FATAL: terminating connection due to transaction timeout server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. test=#
After 5 seconds, our transaction will terminate and PostgreSQL closes the connection entirely. It is the easiest way to eliminate unnecessarily long transactions and prevent table bloat.
However, PostgreSQL 17 has more to offer.
Improved event triggers
Event triggers are an important feature and were introduced in PostgreSQL many years ago. What is the main idea behind an event trigger? Imagine somebody changes your data structure by creating a table, an index, or some other kind of object. An event trigger allows us to react to those events and execute code as needed.
In PostgreSQL, some functionality has been added. First of all, we now have a configuration variable that looks as follows:
test=# SHOW event_triggers; event_triggers ---------------- on (1 row)
When this one is enabled, event triggers will fire for all applicable statements. The important part here is that only superusers can change this value to a different setting.
In PostgreSQL, there is also the possibility to create an event trigger on REINDEX
. However, this is not as critical as the next new feature that has to be discussed – the ability to trigger LOGIN
. Now, what is a LOGIN
trigger? It basically calls a function (or fires) when a new connection to the database is established. Needless to say, this is an incredibly powerful footgun and can cause serious issues because many things can go wrong.
But before we discuss running a trigger during LOGIN
, it makes sense to take a look at a simple example and understand how things work in general. The most basic example is to write a trigger that is supposed to track login attempts in a table. To do that, we can create a simple table:
login_trigger=# CREATE TABLE user_logins ( id serial, who text ); CREATE TABLE
In PostgreSQL, a trigger will always launch a function. Therefore, the first step is to come up with the function we want to run:
CREATE FUNCTION on_login_proc() RETURNS event_trigger AS $$ BEGIN INSERT INTO user_logins (who) VALUES (SESSION_USER); RAISE NOTICE 'You are welcome!'; END; $$ LANGUAGE plpgsql;
What we see here is that the return value of this function is event_trigger
– it is a special data type specifically designed for this purpose. The rest is plain and simple PL/pgSQL code that does not require a return value.
Finally, we can define the trigger itself:
CREATE EVENT TRIGGER on_login_event ON ddl_command_start EXECUTE FUNCTION on_login_proc();
Note that the event we are interested in is login
. The rest is like a normal event trigger that calls a function of our choice. In the next step, we can already enable the trigger:
login_trigger=# ALTER EVENT TRIGGER on_login_trigger ENABLE ALWAYS; ALTER EVENT TRIGGER
Congratulations, the trigger has been deployed successfully. Let us log out and reconnect to the database. After we have reestablished the new connection, we can already see the content of our audit table:
login_trigger=# SELECT * FROM user_logins; id | who ----+----- 1 | hs (1 row)
This looks pretty successful but why did we just call this type of trigger a footgun? Let us modify the function and see what happens:
CREATE OR REPLACE FUNCTION on_login_proc() RETURNS event_trigger AS $$ BEGIN INSERT INTO user_lo (w) VALUES (SESSION_USER); RAISE NOTICE 'You are welcome!'; END; $$ LANGUAGE plpgsql;
The function is essentially buggy. The consequences are nothing short of a total failure:
linux$ psql login_trigger psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: relation "user_lo" does not exist LINE 1: INSERT INTO user_lo (w) VALUES (SESSION_USER) QUERY: INSERT INTO user_lo (w) VALUES (SESSION_USER) CONTEXT: PL/pgSQL function on_login_proc() line 3 at SQL statement
The entire authentication process will fail. That is important because small mistakes can lead to large-scale outages. It is therefore highly recommended to think twice before deploying event triggers to handle login attempts. What is important to understand is that PostgreSQL does exactly what it has been designed to do – it just does so in a sensitive area that can cause a lot of trouble.
The problem is that when you have installed a broken event trigger and you want to get rid of it, you have a hard time ahead. First of all, you have to shut down the database and then you have to start it in single-user mode (https://www.postgresql.org/docs/17/app-postgres.html). In single-user mode, you can then drop the event trigger because, in single-user mode, event triggers are actually disabled – you can therefore log in without those functions being fired.
Inspecting wait events in PostgreSQL
PostgreSQL provides a shiny new system view called pg_wait_events
. For many years, wait events have been an integral feature of PostgreSQL, and allowed us to monitor and inspect all kinds of performance problems. However, in real life, DBAs often had to switch between the database and the documentation to figure out which type of wait events actually do exist.
pg_wait_events
puts an end to this type of problem and provides an easy way to understand which events are there and what they mean. Here is an example:
test=# \x Expanded display is on. test=# SELECT * FROM pg_wait_events WHERE name = 'DataFileFlush'; -[ RECORD1 ]----------------------------------------- type | IO name | DataFileFlush description | Waiting for a relation data file to reach durable storage
What we can see here is that DataFileFlush
means that we are waiting for the operating system to write the data to your physical storage device.
The beauty is that this new view provides a comprehensive overview that might surprise many people:
test=# SELECT type, count(*) FROM pg_wait_events GROUP BY ROLLUP (1) ORDER BY 1; type | count -----------+------- Activity | 16 BufferPin | 1 Client | 9 Extension | 1 IO | 77 IPC | 57 LWLock | 81 Lock | 12 Timeout | 10 | 264 (10 rows)
Yes, this is true: PostgreSQL knows a grand total of 264 different types of events, which is a huge number.
Digging into checkpoints and background writing
The background writer and the checkpoint process have been around for many years. In the latest release of PostgreSQL, the system statistics related to those processes have been changed. First of all, a couple of fields have been removed from the pg_stat_bgwriter
system view:
test=# \d pg_stat_bgwriter View "pg_catalog.pg_stat_bgwriter" Column | Type ... ------------------+------------------------- buffers_clean | bigint maxwritten_clean | bigint buffers_alloc | bigint stats_reset | timestamp with time zone
The view is way more compact now because a great deal of this information has been moved to a new system view that is defined as follows:
test=# \d pg_stat_checkpointer View "pg_catalog.pg_stat_checkpointer" Column | Type ... ---------------------+-------------------------- num_timed | bigint num_requested | bigint restartpoints_timed | bigint restartpoints_req | bigint restartpoints_done | bigint write_time | double precision sync_time | double precision buffers_written | bigint stats_reset | timestamp with time zone
The pg_stat_checkpointer
view contains most of the information previously found in pg_stat_bgwriter
. Therefore, it is necessary to adjust your monitoring queries to reflect those changes.
Improving pg_stat_statements
The pg_stat_statements
module is an extension for PostgreSQL that is shipped as part of the contrib
package. To me, it has always been the gold standard for performance analysis; therefore, I am happy to see even more changes made to improve the data provided by this view. There are various interesting ones worth mentioning:
CALL
statements now support parameters as placeholders- Allows placeholders for savepoint and 2PC-related commands
- Tracks
DEALLOCATE
statements - Adds support for local block I/O statistics
- Adds more details to JIT statistics
- Adds an optional argument to
pg_stat_statements_reset()
Overall, these improvements will make pg_stat_statements
more compact and thus make it easier to find relevant information.
Adding permissions for maintenance tasks
In the past, maintenance was often done by the superuser. The goal is to avoid this. Therefore, more and more permissions and roles have been added over the years to reduce the need for the superuser.
PostgreSQL 17 has added the MAINTAIN
permission to a couple of commands, which allows us to execute various important tasks such as the following:
VACUUM
andANALYZE
CLUSTER
REINDEX
REFRESH
MATERIALIZED VIEW
LOCK TABLE
The feature works as follows:
test=# CREATE USER joe; CREATE ROLE test=# GRANT MAINTAIN ON …
The tab completion will reveal the full power of this new feature. The number of options you have are quite numerous:
ALL FUNCTIONS IN
SCHEMA DATABASE
TABLE
ALL PROCEDURES
IN SCHEMA
DOMAIN
LANGUAGE
ROUTINE
TABLESPACE
ALL ROUTINES
IN SCHEMA
FOREIGN
DATA WRAPPER
LARGE OBJECT
SCHEMA
TYPE
ALL SEQUENCES
IN SCHEMA
FOREIGN SERVER
PARAMETER
SEQUENCE
ALL TABLES
IN SCHEMA
FUNCTION
PROCEDURE
After dealing with DBA-related functionalities, we can now turn our attention to some more developer-oriented functionalities.