Understanding transaction isolation levels
Up until now, you have seen how to handle locking, as well as some basic concurrency. In this section, you will learn about transaction isolation. To me, this is one of the most neglected topics in modern software development. Only a small fraction of software developers are actually aware of this issue, which in turn leads to mind-boggling bugs.
Here is an example of what can happen:
Transaction 1 | Transaction 2
[The statements in this table were lost in extraction. Two of its rows are labeled "The user will see".]
Table 2.8 – Transactional visibility
Most users would actually expect the first transaction to always return 300, regardless of the second transaction. However, this isn’t true. By default, PostgreSQL runs in the READ COMMITTED transaction isolation mode. This means that every statement inside a transaction will get a new snapshot of the data, which will be constant throughout the query.
Note
A SQL statement will operate on the same snapshot and will ignore changes by concurrent transactions while it is running.
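The behavior can be reproduced with two concurrent psql sessions. What follows is a minimal sketch rather than the original example: the table name t_account, its balance column, and the starting rows are assumptions, chosen so that the first sum comes out as 300. It also assumes that default_transaction_isolation has not been changed from its default.

```sql
-- Setup (run once). t_account and its contents are hypothetical.
CREATE TABLE t_account (id serial PRIMARY KEY, balance numeric);
INSERT INTO t_account (balance) VALUES (100), (200);

-- Session 1: open a transaction at the default level (READ COMMITTED)
BEGIN;
SELECT sum(balance) FROM t_account;   -- returns 300

-- Session 2: add a row and commit while session 1 is still open
BEGIN;
INSERT INTO t_account (balance) VALUES (100);
COMMIT;

-- Session 1: this statement gets a fresh snapshot and sees the new row
SELECT sum(balance) FROM t_account;   -- returns 400
COMMIT;
```

The two SELECT statements in session 1 belong to the same transaction, yet they return different results because each statement takes its own snapshot.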
If you want to avoid this, you can use TRANSACTION ISOLATION LEVEL REPEATABLE READ. In this transaction isolation level, a transaction will use the same snapshot throughout the entire transaction. Here’s what will happen:
Transaction 1 | Transaction 2
[The statements in this table were lost in extraction. One of its rows is labeled "The user will see".]
Table 2.9 – Managing REPEATABLE READ transactions
As we’ve outlined, the first transaction will freeze its snapshot of the data and provide us with constant results throughout the entire transaction. This feature is especially important if you want to run reports. The first and last pages of a report should always be consistent and operate on the same data. Therefore, repeatable read is key to consistent reports.
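A frozen snapshot of this kind can be observed with two psql sessions. This is a sketch under assumptions: the table t_account and its rows are hypothetical and serve only as an illustration.

```sql
-- Setup (run once). t_account and its contents are hypothetical.
DROP TABLE IF EXISTS t_account;
CREATE TABLE t_account (id serial PRIMARY KEY, balance numeric);
INSERT INTO t_account (balance) VALUES (100), (200);

-- Session 1: request REPEATABLE READ explicitly
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT sum(balance) FROM t_account;   -- returns 300; the snapshot is taken here

-- Session 2: insert a row outside an explicit transaction (autocommit)
INSERT INTO t_account (balance) VALUES (100);

-- Session 1: still operating on the snapshot taken by the first query
SELECT sum(balance) FROM t_account;   -- returns 300 again
COMMIT;

-- Session 1: a new transaction sees the committed row
SELECT sum(balance) FROM t_account;   -- returns 400
```

Note that the snapshot is taken by the first query in the transaction, not by BEGIN itself, so changes committed between BEGIN and that first query remain visible.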
Note that isolation-related errors won’t always pop up instantly. Sometimes, trouble is noticed years after an application has been moved to production.
Note
Repeatable read is not more expensive than read committed. There is no need to worry about performance penalties. For normal online transaction processing (OLTP), read committed has various advantages because changes can be seen much earlier and the odds of unexpected errors are usually lower.
Considering serializable snapshot isolation transactions
On top of read committed and repeatable read, PostgreSQL offers Serializable Snapshot Isolation (SSI) transactions. So, overall, PostgreSQL supports three isolation levels (read committed, repeatable read, and serializable). Note that Read Uncommitted (which still happens to be the default in some commercial databases) is not supported; if you try to start a read uncommitted transaction, PostgreSQL will silently map it to read committed. Let’s get back to the serializable isolation level.
Note
If you want to know more about this isolation level, consider checking out https://wiki.postgresql.org/wiki/Serializable.
The idea behind serializable isolation is simple; if a transaction is known to work correctly when there is only a single user, it will also work in the case of concurrency when this isolation level is chosen. However, users have to be prepared; transactions may fail (by design) and error out. In addition to this, a performance penalty has to be paid.
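To make the "fail by design" part concrete, here is a sketch of a write skew that serializable isolation is expected to reject. The table t_shift, its columns, and the rule that at least one doctor should stay on call are hypothetical and serve only as an illustration.

```sql
-- Setup (run once). t_shift and its contents are hypothetical.
CREATE TABLE t_shift (doctor text PRIMARY KEY, on_call boolean);
INSERT INTO t_shift VALUES ('alice', true), ('bob', true);

-- Session 1: sees two doctors on call, so it takes alice off call
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM t_shift WHERE on_call;
UPDATE t_shift SET on_call = false WHERE doctor = 'alice';

-- Session 2: also sees two doctors on call, so it takes bob off call
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM t_shift WHERE on_call;
UPDATE t_shift SET on_call = false WHERE doctor = 'bob';
COMMIT;   -- the first transaction to commit succeeds

-- Session 1: no serial order of the two transactions yields this outcome,
-- so this transaction is expected to fail with SQLSTATE 40001
-- (serialization_failure)
COMMIT;
```

Under repeatable read, both transactions would commit and nobody would be left on call. Under serializable, the application has to catch SQLSTATE 40001 and retry the entire transaction from the start, which is the price to pay for this guarantee.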
Note
Consider using serializable isolation only when you have a decent understanding of what is going on inside the database engine.