Working with PostgreSQL transactions
PostgreSQL provides you with highly advanced transaction machinery that offers countless features to developers and administrators alike. In this section, we will look at the basic concept of transactions. The first important thing to know is that, in PostgreSQL, everything is a transaction. If you send a simple query to the server, it is already a transaction. Here is an example:
test=# SELECT now(), now(); now | now -------------------------------+------------------------------- 2024-05-24 12:59:33.594603+02 | 2024-05-24 12:59:33.594603+02 (1 row)
In this case, the SELECT
statement will be a separate transaction. If the same command is executed again, different timestamps will be returned.
Tip
Keep in mind that the now()
function will return the transaction time. The SELECT
statement will, therefore, always return two identical timestamps. If you want the real time, consider using clock_timestamp()
instead of now()
.
If more than one statement has to be a part of the same transaction, the BEGIN
statement must be used, as follows:
test=# \h BEGIN Command: BEGIN Description: start a transaction block Syntax: BEGIN [ WORK | TRANSACTION ] [ transaction_mode [, ...] ] where transaction_mode is one of: ISOLATION LEVEL { SERIALIZABLE | REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED } READ WRITE | READ ONLY [ NOT ] DEFERRABLE URL: https://www.postgresql.org/docs/17/sql-begin.html
The BEGIN
statement will ensure that more than one command is packed into a transaction. Here is how it works:
test=# BEGIN; BEGIN test=*# SELECT now(); now ------------------------------- 2024-05-24 13:00:39.864604+02 (1 row) test=*# SELECT now(); now ------------------------------- 2024-05-24 13:00:39.864604+02 (1 row) test=*# COMMIT; COMMIT
The important point here is that both timestamps will be identical. As we mentioned earlier, we are talking about transaction time.
To end the transaction, COMMIT
can be used:
test=# \h COMMIT Command: COMMIT Description: commit the current transaction Syntax: COMMIT [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ] URL: https://www.postgresql.org/docs/17/sql-commit.html
There are a few syntax elements here. You can just use COMMIT
, COMMIT WORK
, or COMMIT TRANSACTION
. All three commands have the same meaning. If this is not enough, there’s more – the END
command is identical to COMMIT
and can be used interchangeably:
test=# \h END Command: END Description: commit the current transaction Syntax: END [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ] URL: https://www.postgresql.org/docs/17/sql-end.html
As you can see, the END
clause is the same as the COMMIT
clause from a feature point of view.
ROLLBACK
is the counterpart of COMMIT
. Instead of successfully ending a transaction, it will simply stop the transaction without ever making things visible to other transactions, as shown in the following code:
test=# \h ROLLBACK Command: ROLLBACK Description: abort the current transaction Syntax: ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ] URL: https://www.postgresql.org/docs/17/sql-rollback.html
Some applications use ABORT
instead of ROLLBACK
(those two commands are interchangeable in PostgreSQL). The meaning is the same. What is really useful in PostgreSQL is the idea of transaction chains. COMMIT AND CHAIN
will help you to achieve exactly that:
test=# SHOW transaction_read_only; transaction_read_only ----------------------- Off (1 row) test=# BEGIN TRANSACTION READ ONLY ; BEGIN test=*# SELECT 1; ?column? ---------- 1 (1 row) test=*# COMMIT AND CHAIN; COMMIT test=*# SHOW transaction_read_only; transaction_read_only ----------------------- On (1 row) test=*# SELECT 1; ?column? ---------- 1 (1 row) test=*# COMMIT AND NO CHAIN; COMMIT test=# SHOW transaction_read_only; transaction_read_only ----------------------- Off (1 row) test=# COMMIT; WARNING: there is no transaction in progress COMMIT
Let’s go through this example step by step:
- Display the content of the
transaction_read_only
setting. It isOff
because, by default, we are in read/write mode. - Start a read-only transaction using
BEGIN
. This will automatically adjust thetransaction_read_only
variable. - Commit the transaction using
AND CHAIN
, and then PostgreSQL will automatically start a new transaction featuring the same properties as the previous transaction.
In our example, we will also be in read-only mode, just like the transaction before. There is no need to explicitly open a new transaction and set whatever values again, which can dramatically reduce the number of round trips between the application and the server. In the case of high latency systems, saving on commands can really make a difference. If a transaction is committed normally (= NO CHAIN
), the read-only attribute of the transaction will be gone.
Handling errors inside a transaction
In this section, we will dig deeper into error handling and learn how to handle problems inside a database transaction. It is not always the case that transactions are correct from beginning to end. Things might just go wrong for whatever reason. However, in PostgreSQL, only error-free transactions can be committed. The following listing shows a failing transaction, which errors out due to a division by
zero
error:
test=# BEGIN; BEGIN test=*# SELECT 1; ?column? ---------- 1 (1 row) test=*# SELECT 1 / 0; ERROR: division by zero test=!# SELECT 1; ERROR: current transaction is aborted, commands ignored until end of transaction block test=!# SELECT 1; ERROR: current transaction is aborted, commands ignored until end of transaction block test=!# COMMIT; ROLLBACK
Note that division by zero
did not work out.
Note
In any proper database, an instruction similar to this will instantly error out and make the statement fail.
It is important to point out that PostgreSQL will error out. After an error has occurred, no more instructions will be accepted, even if those instructions are semantically and syntactically correct. It is still possible to issue COMMIT
. However, PostgreSQL will roll back the transaction because it is the only correct thing to be done at that point.
Making use of SAVEPOINT
In professional applications, it can be pretty hard to write reasonably long transactions without ever encountering a single error. To solve this problem, users can utilize something called SAVEPOINT
, as follows:
test=# \h SAVEPOINT Command: SAVEPOINT Description: define a new savepoint within the current transaction Syntax: SAVEPOINT savepoint_name URL: https://www.postgresql.org/docs/17/sql-savepoint.html
As the name indicates, a savepoint is a safe place inside a transaction that the application can return to if things go terribly wrong. Here is an example:
test=# BEGIN; BEGIN test=*# SELECT 1; ?column? ---------- 1 (1 row) test=*# SAVEPOINT a; SAVEPOINT test=*# SELECT 2 / 0; ERROR: division by zero test=!# SELECT 2; ERROR: current transaction is aborted, commands ignored until end of transaction block test=!# ROLLBACK TO SAVEPOINT a; ROLLBACK test=*# SELECT 3; ?column? ---------- 3 (1 row) test=*# COMMIT; COMMIT
After the first SELECT
clause, I decided to create a savepoint to make sure that the application can always return to this point inside the transaction. As you can see, the savepoint has a name, which is referred to later.
After returning to the savepoint called a
, the transaction can proceed normally. The code has jumped back to before the error, so everything is fine.
The number of savepoints inside a transaction is practically unlimited. We have seen customers with over 250,000 savepoints in a single operation. PostgreSQL can easily handle this.
If you want to remove a savepoint from inside a transaction, there’s the RELEASE
SAVEPOINT
command:
test=# \h RELEASE Command: RELEASE SAVEPOINT Description: destroy a previously defined savepoint Syntax: RELEASE [ SAVEPOINT ] savepoint_name URL: https://www.postgresql.org/docs/17/sql-release-savepoint.html
Many people ask what will happen if you try to reach a savepoint after a transaction has ended. The answer is that the life of a savepoint ends as soon as the transaction ends. In other words, there is no way to return to a certain point in time after the transactions have been completed.
Transactional DDLs
PostgreSQL has a very nice feature that is unfortunately not present in many commercial database systems. In PostgreSQL, it is possible to run DDLs (commands that change the data structure) inside a transaction block. In a typical commercial system, a DDL will implicitly commit the current transaction by default. This does not occur in PostgreSQL.
Apart from some minor exceptions (DROP DATABASE
, CREATE TABLESPACE
, DROP TABLESPACE
, and so on), all DDLs in PostgreSQL are transactional, which is a huge advantage and a real benefit to end users.
Here is an example:
test=# BEGIN; BEGIN test=*# CREATE TABLE t_test (id int); CREATE TABLE test=*# ALTER TABLE t_test ALTER COLUMN id TYPE int8; ALTER TABLE test=*# \d t_test Table "public.t_test" Column | Type | Collation | Nullable | Default --------+--------+-----------+----------+--------- id | bigint | | | test=*# ROLLBACK; ROLLBACK test=# \d t_test Did not find any relation named "t_test".
In this example, a table has been created and modified, and the entire transaction has been aborted. As you can see, there is no implicit COMMIT
command or any other strange behavior. PostgreSQL simply acts as expected.
Transactional DDLs are especially important if you want to deploy software. Just imagine running a content management system (CMS). If a new version is released, you’ll want to upgrade. Running the old version would still be okay; running the new version would also be okay, but you really don’t want a mixture of old and new. Therefore, deploying an upgrade in a single transaction is highly beneficial, as it upgrades an atomic operation.
Note
To facilitate good software practices, we can include several separately coded modules from our source control system into a single deployment transaction.
After dealing with transaction and error handling in general, it is important to focus our attention more on locking and concurrency.