Using SQL and developer features
In this section, we will discuss some of the most desired developer and SQL features that have made it into PostgreSQL 17.
Teaching COPY error handling
Let us start with my personal favorite: COPY is finally able to handle errors in a reasonably good way. Many people were frustrated by the old error-handling behavior: up to now, PostgreSQL would stop COPY as soon as it hit a single bad row. It is good to see that this vital functionality has made it into the official release of PostgreSQL.
The obvious question is: how can we make use of this feature? Here is a simple command to create a sample table:
test=# CREATE TABLE t_data (
           id   int,
           data text
       );
CREATE TABLE
The goal is to import the following dataset into the table and make sure those errors are handled properly:
1	hans
2	paul
abc	joe
4	jane
def	james
5	laura
What we see here is that the data is clearly broken: two of the rows contain values in the id column that are not valid integers. The following listing proves this beyond doubt:
test=# COPY t_data FROM '/tmp/file.txt';
ERROR:  invalid input syntax for type integer: "abc"
CONTEXT:  COPY t_data, line 3, column id: "abc"
Fortunately, we can handle this kind of problem in PostgreSQL 17 and simply ignore the error:
test=# COPY t_data FROM '/tmp/file.txt' WITH (ON_ERROR 'ignore');
NOTICE:  2 rows were skipped due to data type incompatibility
COPY 4
test=# SELECT * FROM t_data;
 id | data
----+-------
  1 | hans
  2 | paul
  4 | jane
  5 | laura
(4 rows)
PostgreSQL will import the data and simply skip over the invalid rows. A NOTICE message indicates how many rows have been skipped. As of version 17, two ON_ERROR settings are supported: stop (the default) and ignore. It is likely that more options will become available in future releases.
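If you also want to see which rows were rejected, ON_ERROR can be combined with the LOG_VERBOSITY option, which was added in the same release. Set to 'verbose', it makes COPY emit an additional NOTICE for every skipped row. A minimal sketch, reusing the file from above:

-- one NOTICE per skipped row, plus the usual summary at the end
COPY t_data FROM '/tmp/file.txt'
    WITH (ON_ERROR 'ignore', LOG_VERBOSITY 'verbose');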
Splitting and merging partitions
In recent years, there has not been a single PostgreSQL release that did not provide relevant improvements to partitioning, and the same holds true for the new version. This time, the development team has been working on splitting and merging partitions, which has been a frequently requested capability over the years.
The following listing shows how a simple table including a partition can be created:
CREATE TABLE t_timeseries (
    id      serial,
    d       date,
    payload text
) PARTITION BY RANGE (d);

CREATE TABLE t_timeseries_2024
    PARTITION OF t_timeseries
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
Here is a typical example one would encounter in real life: we have some kind of time series and we are using range partitioning to split the data into smaller chunks for various reasons (faster cleanup, scalability, and so on). However, individual partitions often grow too large and have to be broken up into smaller pieces. This is where PostgreSQL can now do some of the heavy lifting for us:
ALTER TABLE t_timeseries
    SPLIT PARTITION t_timeseries_2024 INTO (
        PARTITION t_timeseries_2024_h1
            FOR VALUES FROM ('2024-01-01') TO ('2024-07-01'),
        PARTITION t_timeseries_2024_h2
            FOR VALUES FROM ('2024-07-01') TO ('2025-01-01')
    );
What we do here is take our partition and split it into two new chunks that each contain roughly half of the data. Note that a single command takes care of the entire operation.
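To convince ourselves that rows are routed to the correct chunk after the split, we can inspect the tableoid system column. Here is a minimal sketch; the two sample rows are invented purely for illustration:

INSERT INTO t_timeseries (d, payload)
VALUES ('2024-03-15', 'a'), ('2024-09-01', 'b');

-- tableoid reveals which partition each row physically lives in
SELECT tableoid::regclass AS partition, *
FROM t_timeseries;

The first row should show up in t_timeseries_2024_h1 and the second one in t_timeseries_2024_h2.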
While splitting partitions into smaller pieces might be by far the more common operation, it is also possible to reverse the decision and unify several partitions into a single entity. The way to do that is the ALTER TABLE … MERGE PARTITIONS … command, which is just as easy to use as the SPLIT command we have observed and tested before:
ALTER TABLE t_timeseries
    MERGE PARTITIONS (
        t_timeseries_2024_h1,
        t_timeseries_2024_h2
    ) INTO t_timeseries_2024;
All we have to do here is to tell PostgreSQL which partitions are supposed to form the new entity and let the database engine do its magic.
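If you want to double-check the resulting layout, the pg_partition_tree() function (available since PostgreSQL 12) lists all partitions of a table. A quick sanity check might look like this:

-- show the partition hierarchy after the merge
SELECT relid, parentrelid, isleaf, level
FROM pg_partition_tree('t_timeseries');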
Turning numbers into binary and octal values
One of the lesser-known features that made it into PostgreSQL is the ability to convert numbers to binary and octal representations, respectively. Two overloaded functions have been added – to_bin and to_oct:
test=# \df *to_bin*
                             List of functions
   Schema   |  Name  | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+------
 pg_catalog | to_bin | text             | bigint              | func
 pg_catalog | to_bin | text             | integer             | func
(2 rows)

test=# \df *to_oct*
                             List of functions
   Schema   |  Name  | Result data type | Argument data types | Type
------------+--------+------------------+---------------------+------
 pg_catalog | to_oct | text             | bigint              | func
 pg_catalog | to_oct | text             | integer             | func
(2 rows)
Both functions can be called with 32- or 64-bit integer values. The following listing shows an example of those functions in action:
test=# SELECT to_bin(4711), to_oct(4711);
    to_bin     | to_oct
---------------+--------
 1001001100111 | 11147
(1 row)
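As a side note, negative values are rendered in two's complement notation, just as to_hex() has always done. A quick sanity check, not part of the original listing, that you can run yourself:

-- for a 32-bit integer, this yields 32 one-bits and the octal equivalent
SELECT to_bin(-1), to_oct(-1);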
Improving MERGE even more
MERGE has been around for a couple of releases now. In SQL, the MERGE command is used to merge data from a source table into a target table. It is useful when you need to update, insert, or delete rows based on a common column between the two tables.
The new release of PostgreSQL introduces yet another feature, namely WHEN NOT MATCHED BY SOURCE THEN. This additional syntax allows us to define the desired behavior even more precisely and adds some flexibility.
Here is how it works:
CREATE TABLE t_demo (
    a int PRIMARY KEY,
    b int
);
INSERT INTO t_demo VALUES (1, 4711), (2, 5822), (3, 6933);

CREATE TABLE t_source (
    a int PRIMARY KEY,
    b int
);
INSERT INTO t_source VALUES (2, 6822), (3, 6933), (4, 1252);

MERGE INTO t_demo AS t1
USING t_source AS t2 ON t1.a = t2.a
WHEN MATCHED THEN
    UPDATE SET b = t1.b * 100
WHEN NOT MATCHED THEN
    INSERT (a, b) VALUES (t2.a, t2.b)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
RETURNING t1.*, t2.*;
This MERGE statement will return the following data:
 a |   b    | a |  b
---+--------+---+------
 1 |   4711 |   |
 2 | 582200 | 2 | 6822
 3 | 693300 | 3 | 6933
 4 |   1252 | 4 | 1252
(4 rows)
RETURNING * can be really useful for debugging the statement as a whole. The same is true in my example: a = 1 is present in the target table but not in the source table, so the row is deleted. In the case of a = 2 and a = 3, we have a full match and, therefore, the UPDATE branch executes. a = 4 is only present in the t_source table and is therefore inserted into the t_demo table.
The following table shows what we can expect to find in t_demo after the MERGE operation:
 a |   b
---+--------
 2 | 582200
 3 | 693300
 4 |   1252
(3 rows)
As you can see, all three cases defined in the MERGE statement have been executed successfully. The question is: which row was touched by which rule? To find out, we can modify the RETURNING clause a bit:
RETURNING merge_action(), t1.*, t2.*
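Embedded in the full statement, and assuming t_demo and t_source have been reset to their initial contents (the previous run already modified the data), this reads as follows:

MERGE INTO t_demo AS t1
USING t_source AS t2 ON t1.a = t2.a
WHEN MATCHED THEN
    UPDATE SET b = t1.b * 100
WHEN NOT MATCHED THEN
    INSERT (a, b) VALUES (t2.a, t2.b)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
RETURNING merge_action(), t1.*, t2.*;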
In this case, PostgreSQL will provide us with even more information, as we can see in the following listing:
 merge_action | a |   b    | a |  b
--------------+---+--------+---+------
 DELETE       | 1 |   4711 |   |
 UPDATE       | 2 | 582200 | 2 | 6822
 UPDATE       | 3 | 693300 | 3 | 6933
 INSERT       | 4 |   1252 | 4 | 1252
(4 rows)
Additional JSON functionality
The number of JSON-related features has skyrocketed over the years. The same is true for version 17, which provides many more capabilities that will make JSON easier to use and more powerful overall. The first thing that caught my attention was the fact that more standard-compliant functions have been added – namely, JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE().
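Here is a minimal sketch of the three functions at work; the document and the paths are made up purely for illustration:

SELECT JSON_EXISTS(jsonb '{"a": [1, 2, 3]}', '$.a[2]'),  -- does the element exist?
       JSON_VALUE(jsonb '{"a": [1, 2, 3]}',
                  '$.a[1]' RETURNING int),               -- extract a scalar value
       JSON_QUERY(jsonb '{"a": [1, 2, 3]}', '$.a');      -- extract a JSON fragment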
What is also noteworthy is JSON_TABLE, which allows us to turn a JSON document into a tabular format in one go. This is pretty similar to what XMLTABLE does.
The syntax might look as follows:
SELECT jt.*
FROM customers,
     JSON_TABLE (
         js, '$.favorites[*]'
         COLUMNS (
             id FOR ORDINALITY,
             country text PATH '$.country',
             branch  text PATH '$.industry[*].branch' WITH WRAPPER,
             ceo     text PATH '$.company[*].ceo' WITH WRAPPER
         )
     ) AS jt;
What this does is address various elements in the JSON document and return them in a tabular format that we can actually read safely and conveniently.
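Note that the customers table and its js column are not defined anywhere in this section. A matching toy setup, invented here purely so that the query above has something to operate on, could look like this:

CREATE TABLE customers (
    id serial PRIMARY KEY,
    js jsonb
);

-- one customer with a single favorites entry matching the paths above
INSERT INTO customers (js) VALUES (
    '{"favorites": [
        {"country":  "Austria",
         "industry": [{"branch": "consulting"}],
         "company":  [{"ceo": "Jane Doe"}]}
    ]}'
);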
Creating BRIN indexes in parallel
Technically, this is not a developer feature, but given that performance topics are often hard to categorize, I decided to include it here. BRIN indexes are often used in data warehouses to quickly filter data without carrying the overhead of full-blown B-tree indexes. Creating B-tree indexes using more than one CPU core has long been possible. As of PostgreSQL 17, it is now also possible to build BRIN indexes in parallel, which can greatly speed up the process.
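Here is a minimal sketch of what this looks like in practice, reusing the t_timeseries table from earlier in this section. Note that the parameter only raises the ceiling for parallel workers; PostgreSQL still decides how many workers are actually used based on the table size:

-- allow up to 4 parallel workers for maintenance operations
SET max_parallel_maintenance_workers = 4;

-- on a sufficiently large table, the BRIN build can now run in parallel
CREATE INDEX idx_t_timeseries_brin_d
    ON t_timeseries USING brin (d);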