You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Product type Paperback

Published in Dec 2024

Publisher Packt

ISBN-13 9781836205975

Length 474 pages

Edition 6th Edition

Languages

SQL

Tools

PostgreSQL

Concepts

Databases

Author (1):

Hans-Jürgen Schönig

View More author details

Table of Contents (16) Chapters

Preface

1. Chapter 1: What is New in PostgreSQL 17

2. Chapter 2: Understanding Transactions and Locking FREE CHAPTER

3. Chapter 3: Making Use of Indexes

4. Chapter 4: Handling Advanced SQL

5. Chapter 5: Log Files and System Statistics

6. Chapter 6: Optimizing Queries for Good Performance

7. Chapter 7: Writing Stored Procedures

8. Chapter 8: Managing PostgreSQL Security

9. Chapter 9: Handling Backup and Recovery

10. Chapter 10: Making Sense of Backups and Replication

11. Chapter 11: Deciding on Useful Extensions

12. Chapter 12: Troubleshooting PostgreSQL

13. Chapter 13: Migrating to PostgreSQL

14. Index

Why subscribe?

15. Other Books You May Enjoy

Understanding DBA and administration features

In every release, a comprehensive set of features is added to make the lives of database administrators (DBAs) easier and more effective. The same is true for PostgreSQL. So, let us take a look and dive into the new features of PostgreSQL 17.

Terminating long transactions

PostgreSQL supports various features to limit the duration of statements, the maximum time a query will wait for locks, and a lot more. However, there is one feature that has been requested by customers for some time and has finally made it into the next release of PostgreSQL – the ability to limit the duration of a transaction. In the past, PostgreSQL supported an instance-wide configuration variable called old_snapshot_threshold. The idea of this variable was to limit the maximum length of a transaction to avoid table bloat as well as some other issues that are dangerous for the server. However, the old_snapshot_threshold variable could only be set per instance and not in a more fine-grained way. Thus, the community has removed this feature and introduced a new variable called transaction_timeout, which can be set per session.

The default value of this new setting is “unlimited” (0):

test=# SHOW transaction_timeout ;
 transaction_timeout
---------------------
 0
(1 row)

However, if you want to limit the duration of your transaction, you can simply set a value inside your session. The following command sets the configuration variable to 5 seconds (5000 milliseconds):

test=# SET transaction_timeout TO 5000;
SET
test=# BEGIN;
BEGIN
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=#

After 5 seconds, our transaction will terminate and PostgreSQL closes the connection entirely. It is the easiest way to eliminate unnecessarily long transactions and prevent table bloat.

However, PostgreSQL 17 has more to offer.

Improved event triggers

Event triggers are an important feature and were introduced in PostgreSQL many years ago. What is the main idea behind an event trigger? Imagine somebody changes your data structure by creating a table, an index, or some other kind of object. An event trigger allows us to react to those events and execute code as needed.

In PostgreSQL, some functionality has been added. First of all, we now have a configuration variable that looks as follows:

test=# SHOW event_triggers;
 event_triggers
----------------
 on
(1 row)

When this one is enabled, event triggers will fire for all applicable statements. The important part here is that only superusers can change this value to a different setting.

In PostgreSQL, there is also the possibility to create an event trigger on REINDEX. However, this is not as critical as the next new feature that has to be discussed – the ability to trigger LOGIN. Now, what is a LOGIN trigger? It basically calls a function (or fires) when a new connection to the database is established. Needless to say, this is an incredibly powerful footgun and can cause serious issues because many things can go wrong.

But before we discuss running a trigger during LOGIN, it makes sense to take a look at a simple example and understand how things work in general. The most basic example is to write a trigger that is supposed to track login attempts in a table. To do that, we can create a simple table:

login_trigger=# CREATE TABLE user_logins (
    id serial,
    who text
);
CREATE TABLE

In PostgreSQL, a trigger will always launch a function. Therefore, the first step is to come up with the function we want to run:

CREATE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_logins (who)
    VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

What we see here is that the return value of this function is event_trigger – it is a special data type specifically designed for this purpose. The rest is plain and simple PL/pgSQL code that does not require a return value.

Finally, we can define the trigger itself:

CREATE EVENT TRIGGER on_login_event
ON ddl_command_start
EXECUTE FUNCTION on_login_proc();

Note that the event we are interested in is login. The rest is like a normal event trigger that calls a function of our choice. In the next step, we can already enable the trigger:

login_trigger=# ALTER EVENT TRIGGER on_login_trigger ENABLE ALWAYS;
ALTER EVENT TRIGGER

Congratulations, the trigger has been deployed successfully. Let us log out and reconnect to the database. After we have reestablished the new connection, we can already see the content of our audit table:

login_trigger=# SELECT * FROM user_logins;
 id | who
----+-----
  1 | hs
(1 row)

This looks pretty successful but why did we just call this type of trigger a footgun? Let us modify the function and see what happens:

CREATE OR REPLACE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_lo (w) VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

The function is essentially buggy. The consequences are nothing short of a total failure:

linux$ psql login_trigger
psql: error: connection to server
  on socket "/tmp/.s.PGSQL.5432"
  failed: FATAL:  relation "user_lo" does not exist
LINE 1: INSERT INTO user_lo (w) VALUES (SESSION_USER)
QUERY:  INSERT INTO user_lo (w) VALUES (SESSION_USER)
CONTEXT:  PL/pgSQL function on_login_proc() line 3 at SQL statement

The entire authentication process will fail. That is important because small mistakes can lead to large-scale outages. It is therefore highly recommended to think twice before deploying event triggers to handle login attempts. What is important to understand is that PostgreSQL does exactly what it has been designed to do – it just does so in a sensitive area that can cause a lot of trouble.

The problem is that when you have installed a broken event trigger and you want to get rid of it, you have a hard time ahead. First of all, you have to shut down the database and then you have to start it in single-user mode (https://www.postgresql.org/docs/17/app-postgres.html). In single-user mode, you can then drop the event trigger because, in single-user mode, event triggers are actually disabled – you can therefore log in without those functions being fired.

Inspecting wait events in PostgreSQL

PostgreSQL provides a shiny new system view called pg_wait_events. For many years, wait events have been an integral feature of PostgreSQL, and allowed us to monitor and inspect all kinds of performance problems. However, in real life, DBAs often had to switch between the database and the documentation to figure out which type of wait events actually do exist.

pg_wait_events puts an end to this type of problem and provides an easy way to understand which events are there and what they mean. Here is an example:

test=# \x
Expanded display is on.
test=# SELECT *
FROM pg_wait_events
WHERE name = 'DataFileFlush';
-[ RECORD1 ]-----------------------------------------
type        | IO
name        | DataFileFlush
description | Waiting for a relation data file to
              reach durable storage

What we can see here is that DataFileFlush means that we are waiting for the operating system to write the data to your physical storage device.

The beauty is that this new view provides a comprehensive overview that might surprise many people:

test=# SELECT  type, count(*)
FROM    pg_wait_events
GROUP BY ROLLUP (1)
ORDER BY 1;
   type    | count
-----------+-------
 Activity  |    16
 BufferPin |     1
 Client    |     9
 Extension |     1
 IO        |    77
 IPC       |    57
 LWLock    |    81
 Lock      |    12
 Timeout   |    10
           |   264
(10 rows)

Yes, this is true: PostgreSQL knows a grand total of 264 different types of events, which is a huge number.

Digging into checkpoints and background writing

The background writer and the checkpoint process have been around for many years. In the latest release of PostgreSQL, the system statistics related to those processes have been changed. First of all, a couple of fields have been removed from the pg_stat_bgwriter system view:

test=# \d pg_stat_bgwriter
     View "pg_catalog.pg_stat_bgwriter"
      Column      |           Type   ...
------------------+-------------------------
 buffers_clean    | bigint
 maxwritten_clean | bigint
 buffers_alloc    | bigint
 stats_reset      | timestamp with time zone

The view is way more compact now because a great deal of this information has been moved to a new system view that is defined as follows:

test=# \d pg_stat_checkpointer
                     View "pg_catalog.pg_stat_checkpointer"
       Column        |           Type ...
---------------------+--------------------------
 num_timed           | bigint
 num_requested       | bigint
 restartpoints_timed | bigint
 restartpoints_req   | bigint
 restartpoints_done  | bigint
 write_time          | double precision
 sync_time           | double precision
 buffers_written     | bigint
 stats_reset         | timestamp with time zone

The pg_stat_checkpointer view contains most of the information previously found in pg_stat_bgwriter. Therefore, it is necessary to adjust your monitoring queries to reflect those changes.

Improving pg_stat_statements

The pg_stat_statements module is an extension for PostgreSQL that is shipped as part of the contrib package. To me, it has always been the gold standard for performance analysis; therefore, I am happy to see even more changes made to improve the data provided by this view. There are various interesting ones worth mentioning:

CALL statements now support parameters as placeholders
Allows placeholders for savepoint and 2PC-related commands
Tracks DEALLOCATE statements
Adds support for local block I/O statistics
Adds more details to JIT statistics
Adds an optional argument to pg_stat_statements_reset()

Overall, these improvements will make pg_stat_statements more compact and thus make it easier to find relevant information.

Adding permissions for maintenance tasks

In the past, maintenance was often done by the superuser. The goal is to avoid this. Therefore, more and more permissions and roles have been added over the years to reduce the need for the superuser.

PostgreSQL 17 has added the MAINTAIN permission to a couple of commands, which allows us to execute various important tasks such as the following:

VACUUM and ANALYZE
CLUSTER
REINDEX
REFRESH MATERIALIZED VIEW
LOCK TABLE

The feature works as follows:

test=# CREATE USER joe;
CREATE ROLE
test=# GRANT MAINTAIN ON …

The tab completion will reveal the full power of this new feature. The number of options you have are quite numerous:

ALL FUNCTIONS IN SCHEMA DATABASE
TABLE
ALL PROCEDURES IN SCHEMA
DOMAIN
LANGUAGE
ROUTINE
TABLESPACE
ALL ROUTINES IN SCHEMA
FOREIGN DATA WRAPPER
LARGE OBJECT
SCHEMA
TYPE
ALL SEQUENCES IN SCHEMA
FOREIGN SERVER
PARAMETER
SEQUENCE
ALL TABLES IN SCHEMA
FUNCTION
PROCEDURE

After dealing with DBA-related functionalities, we can now turn our attention to some more developer-oriented functionalities.

The rest of the chapter is locked

You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Table of Contents (16) Chapters

Understanding DBA and administration features

Terminating long transactions

Improved event triggers

Inspecting wait events in PostgreSQL

Digging into checkpoints and background writing

Improving pg_stat_statements

Adding permissions for maintenance tasks

Authors (1)

Personalised recommendations for you

You're reading from Mastering PostgreSQL 17 Elevate your database skills with advanced deployment, optimization, and security strategies

Table of Contents (16) Chapters

Understanding DBA and administration features

Terminating long transactions

Improved event triggers

Inspecting wait events in PostgreSQL

Digging into checkpoints and background writing

Improving pg_stat_statements

Adding permissions for maintenance tasks

Unlock this book and the full library FREE for 7 days

Authors (1)

Personalised recommendations for you