Understanding DBA and administration features

With every release, a comprehensive set of features is added to make the lives of database administrators (DBAs) easier and more effective. The same is true for PostgreSQL 17. So, let us take a look and dive into its new features.

Terminating long transactions

PostgreSQL supports various features to limit the duration of statements, the maximum time a query will wait for locks, and a lot more. However, there is one feature that has been requested by customers for some time and has finally made it into PostgreSQL 17 – the ability to limit the duration of a transaction. In the past, PostgreSQL supported an instance-wide configuration variable called old_snapshot_threshold. The idea of this variable was to limit how long old snapshots – and thus long-running transactions – could hold back cleanup, in order to avoid table bloat as well as some other issues that are dangerous for the server. However, old_snapshot_threshold could only be set per instance and not in a more fine-grained way. Thus, the community has removed this feature and introduced a new variable called transaction_timeout, which can be set per session.

The default value of this new setting is “unlimited” (0):

test=# SHOW transaction_timeout ;
 transaction_timeout
---------------------
 0
(1 row)

However, if you want to limit the duration of your transaction, you can simply set a value inside your session. The following command sets the configuration variable to 5 seconds (5000 milliseconds):

test=# SET transaction_timeout TO 5000;
SET
test=# BEGIN;
BEGIN
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
             now
------------------------------
 2024-06-21 19:37:35.81715+00
(1 row)
test=*# SELECT now();
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
test=#

After 5 seconds, our transaction is terminated and PostgreSQL closes the connection entirely. This is the easiest way to eliminate unnecessarily long transactions and prevent table bloat.
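
Because transaction_timeout is a regular configuration parameter, it does not have to be set manually in every session. A minimal sketch, assuming a role named app_user already exists, is to configure it per role so that every new session of that role picks it up automatically:

test=# ALTER ROLE app_user SET transaction_timeout = '30s';
ALTER ROLE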

However, PostgreSQL 17 has more to offer.

Improved event triggers

Event triggers are an important feature and were introduced in PostgreSQL many years ago. What is the main idea behind an event trigger? Imagine somebody changes your data structure by creating a table, an index, or some other kind of object. An event trigger allows us to react to those events and execute code as needed.
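
To illustrate the general mechanism before we turn to what is new, here is a minimal sketch of a classic DDL event trigger (the names log_ddl and track_ddl are made up for this example) that simply reports every completed DDL command:

CREATE FUNCTION log_ddl()
RETURNS event_trigger AS
$$
BEGIN
    -- TG_TAG contains the command tag, e.g. CREATE TABLE
    RAISE NOTICE 'DDL command executed: %', tg_tag;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER track_ddl
ON ddl_command_end
EXECUTE FUNCTION log_ddl();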

In PostgreSQL 17, some functionality has been added. First of all, there is now a configuration variable that looks as follows:

test=# SHOW event_triggers;
 event_triggers
----------------
 on
(1 row)

When this variable is enabled, event triggers fire for all applicable statements. The important part here is that only superusers can change this setting.
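
In practical terms, this means a superuser can switch off event trigger firing for the current session, which can be handy while repairing a misbehaving trigger. A minimal sketch:

test=# SET event_triggers TO off;
SET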

In PostgreSQL 17, event triggers can also fire on REINDEX. However, this is not as important as the next new feature that has to be discussed – the ability to trigger on login. Now, what is a login trigger? It fires (that is, calls a function) whenever a new connection to the database is established. Needless to say, this is an incredibly powerful feature but also a footgun that can cause serious issues because many things can go wrong.

But before we discuss running a trigger during LOGIN, it makes sense to take a look at a simple example and understand how things work in general. The most basic example is to write a trigger that is supposed to track login attempts in a table. To do that, we can create a simple table:

login_trigger=# CREATE TABLE user_logins (
    id serial,
    who text
);
CREATE TABLE

In PostgreSQL, a trigger will always launch a function. Therefore, the first step is to come up with the function we want to run:

CREATE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_logins (who)
    VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

What we see here is that the return value of this function is event_trigger – it is a special data type specifically designed for this purpose. The rest is plain and simple PL/pgSQL code that does not require a return value.

Finally, we can define the trigger itself:

CREATE EVENT TRIGGER on_login_event
ON login
EXECUTE FUNCTION on_login_proc();

Note that the event we are interested in is login. The rest is like a normal event trigger that calls a function of our choice. In the next step, we can enable the trigger:

login_trigger=# ALTER EVENT TRIGGER on_login_event ENABLE ALWAYS;
ALTER EVENT TRIGGER

Congratulations, the trigger has been deployed successfully. Let us log out and reconnect to the database. After we have reestablished the connection, we can already see the content of our audit table:

login_trigger=# SELECT * FROM user_logins;
 id | who
----+-----
  1 | hs
(1 row)

This looks pretty successful, but why did we call this type of trigger a footgun earlier? Let us modify the function and see what happens:

CREATE OR REPLACE FUNCTION on_login_proc()
RETURNS event_trigger AS
$$
BEGIN
    INSERT INTO user_lo (w) VALUES (SESSION_USER);
    RAISE NOTICE 'You are welcome!';
END;
$$ LANGUAGE plpgsql;

The function is now deliberately buggy – it inserts into a table that does not exist. The consequences are nothing short of a total failure:

linux$ psql login_trigger
psql: error: connection to server
  on socket "/tmp/.s.PGSQL.5432"
  failed: FATAL:  relation "user_lo" does not exist
LINE 1: INSERT INTO user_lo (w) VALUES (SESSION_USER)
QUERY:  INSERT INTO user_lo (w) VALUES (SESSION_USER)
CONTEXT:  PL/pgSQL function on_login_proc() line 3 at SQL statement

The entire authentication process will fail. That is important because small mistakes can lead to large-scale outages. It is therefore highly recommended to think twice before deploying event triggers to handle login attempts. What is important to understand is that PostgreSQL does exactly what it has been designed to do – it just does so in a sensitive area that can cause a lot of trouble.

The problem is that once you have installed a broken login trigger, getting rid of it is hard. First of all, you have to shut down the database and then start it in single-user mode (https://www.postgresql.org/docs/17/app-postgres.html). In single-user mode, event triggers are disabled, so you can connect without those functions being fired and simply drop the trigger.
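
A rough sketch of this recovery procedure might look as follows (the data directory path is merely a placeholder; the trigger name matches our example):

linux$ pg_ctl stop -D /path/to/data
linux$ postgres --single -D /path/to/data login_trigger
backend> DROP EVENT TRIGGER on_login_event;

Exit single-user mode with Ctrl + D and restart the server normally afterward.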

Inspecting wait events in PostgreSQL

PostgreSQL 17 provides a shiny new system view called pg_wait_events. For many years, wait events have been an integral feature of PostgreSQL, allowing us to monitor and inspect all kinds of performance problems. However, in real life, DBAs often had to switch between the database and the documentation to figure out which types of wait events actually exist.

pg_wait_events puts an end to this type of problem and provides an easy way to understand which events are there and what they mean. Here is an example:

test=# \x
Expanded display is on.
test=# SELECT *
FROM pg_wait_events
WHERE name = 'DataFileFlush';
-[ RECORD 1 ]----------------------------------------------------------
type        | IO
name        | DataFileFlush
description | Waiting for a relation data file to reach durable storage

What we can see here is that DataFileFlush means that we are waiting for the operating system to write the data to the physical storage device.
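
Because pg_stat_activity exposes the very same identifiers as wait_event_type and wait_event, the descriptions can be joined directly to live sessions. A minimal sketch of such a query:

test=# SELECT a.pid, a.wait_event_type, a.wait_event, w.description
FROM pg_stat_activity AS a
JOIN pg_wait_events AS w
  ON a.wait_event_type = w.type AND a.wait_event = w.name
WHERE a.wait_event IS NOT NULL;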

The beauty is that this new view provides a comprehensive overview that might surprise many people:

test=# SELECT  type, count(*)
FROM    pg_wait_events
GROUP BY ROLLUP (1)
ORDER BY 1;
   type    | count
-----------+-------
 Activity  |    16
 BufferPin |     1
 Client    |     9
 Extension |     1
 IO        |    77
 IPC       |    57
 LWLock    |    81
 Lock      |    12
 Timeout   |    10
           |   264
(10 rows)

Yes, this is true: PostgreSQL knows a grand total of 264 different types of events, which is a huge number.

Digging into checkpoints and background writing

The background writer and the checkpoint process have been around for many years. In PostgreSQL 17, the system statistics related to those processes have been changed. First of all, a couple of fields have been removed from the pg_stat_bgwriter system view:

test=# \d pg_stat_bgwriter
     View "pg_catalog.pg_stat_bgwriter"
      Column      |           Type   ...
------------------+-------------------------
 buffers_clean    | bigint
 maxwritten_clean | bigint
 buffers_alloc    | bigint
 stats_reset      | timestamp with time zone

The view is way more compact now because a great deal of this information has been moved to a new system view that is defined as follows:

test=# \d pg_stat_checkpointer
                     View "pg_catalog.pg_stat_checkpointer"
       Column        |           Type ...
---------------------+--------------------------
 num_timed           | bigint
 num_requested       | bigint
 restartpoints_timed | bigint
 restartpoints_req   | bigint
 restartpoints_done  | bigint
 write_time          | double precision
 sync_time           | double precision
 buffers_written     | bigint
 stats_reset         | timestamp with time zone

The pg_stat_checkpointer view contains most of the information previously found in pg_stat_bgwriter. Therefore, it is necessary to adjust your monitoring queries to reflect those changes.
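
For instance, a query that used to read checkpoint counters from pg_stat_bgwriter now has to read them from the new view. A minimal sketch of such a monitoring query:

test=# SELECT num_timed, num_requested, write_time, sync_time, buffers_written
FROM pg_stat_checkpointer;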

Improving pg_stat_statements

The pg_stat_statements module is an extension for PostgreSQL that is shipped as part of the contrib package. To me, it has always been the gold standard for performance analysis; therefore, I am happy to see even more changes made to improve the data provided by this view. There are various interesting ones worth mentioning:

  • Supports placeholders for CALL statement parameters
  • Allows placeholders for savepoint and 2PC-related commands
  • Tracks DEALLOCATE statements
  • Adds support for local block I/O statistics
  • Adds more details to JIT statistics
  • Adds an optional argument to pg_stat_statements_reset()

Overall, these improvements will make pg_stat_statements more compact and thus make it easier to find relevant information.
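
To give an idea of how this data is typically consumed, here is a classic query that lists the most expensive statements (this assumes the extension has been installed with CREATE EXTENSION pg_stat_statements):

test=# SELECT query, calls, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;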

Adding permissions for maintenance tasks

In the past, maintenance was often done by the superuser. The goal is to avoid this, so more and more permissions and predefined roles have been added over the years to reduce the need for superuser access.

PostgreSQL 17 has added the MAINTAIN privilege, which allows the grantee to execute various important maintenance commands, such as the following:

  • VACUUM and ANALYZE
  • CLUSTER
  • REINDEX
  • REFRESH MATERIALIZED VIEW
  • LOCK TABLE

The feature works as follows:

test=# CREATE USER joe;
CREATE ROLE
test=# GRANT MAINTAIN ON …

Tab completion reveals the full power of this new feature – the list of possible targets is quite long (a concrete example follows the list):

  • ALL FUNCTIONS IN SCHEMA
  • DATABASE
  • TABLE
  • ALL PROCEDURES IN SCHEMA
  • DOMAIN
  • LANGUAGE
  • ROUTINE
  • TABLESPACE
  • ALL ROUTINES IN SCHEMA
  • FOREIGN DATA WRAPPER
  • LARGE OBJECT
  • SCHEMA
  • TYPE
  • ALL SEQUENCES IN SCHEMA
  • FOREIGN SERVER
  • PARAMETER
  • SEQUENCE
  • ALL TABLES IN SCHEMA
  • FUNCTION
  • PROCEDURE
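
Here is a minimal sketch of the most common case – granting MAINTAIN on a single table (t_demo is merely an assumed example table):

test=# GRANT MAINTAIN ON TABLE t_demo TO joe;
GRANT

If you want to hand out these rights globally, PostgreSQL 17 also ships a predefined role called pg_maintain, which grants the same operations on all relations.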

After dealing with DBA-related functionality, we can now turn our attention to some more developer-oriented features.
