Search icon CANCEL
Subscription
0
Cart icon
Cart
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Mastering PostgreSQL 9.6
Mastering PostgreSQL 9.6

Mastering PostgreSQL 9.6: A comprehensive guide for PostgreSQL 9.6 developers and administrators

By Hans-Jürgen Schönig
₹800 per month
Book May 2017 416 pages 1st Edition
eBook
₹3,634.99 ₹800.00
Print
₹4,542.99
Subscription
₹800 Monthly
eBook
₹3,634.99 ₹800.00
Print
₹4,542.99
Subscription
₹800 Monthly

What do you get with a Packt Subscription?

Free for first 7 days. ₹800 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Mastering PostgreSQL 9.6

PostgreSQL Overview

PostgreSQL is one of the world's most advanced open source database systems and it has many features widely used by developers and system administrators alike. In this book, many of those cool features will be covered and discussed in great detail.

In this chapter, you will be introduced to PostgreSQL and the cool new features available in PostgreSQL 9.6 and beyond. All relevant new functionality will be covered in detail. Given the sheer number of changes made to the code and given the size of the PostgreSQL project, this list of features is of course not complete, so I tried to focus on the most important aspects relevant to most people.

The features outlined in this chapter will be split into the following categories:

  • Database administration
  • SQL and developer-related
  • Backup, recovery, and replication
  • Performance-related topics

What is new in PostgreSQL 9.6?

PostgreSQL 9.6 was released in late 2016 and is the last version that will still be following the old numbering scheme PostgreSQL has been using for more than a decade now. From PostgreSQL 10.0 onward, a new version numbering system will be in place. From 10.0 on, major releases will happen way more frequently.

Understanding new database administration functions

PostgreSQL 9.6 has many new features that can help the administrator to reduce work and make systems more robust.

One of those features is the idle_in_transaction_session_timeout function.

Killing idle sessions

In PostgreSQL, a session or a transaction can basically live almost forever. In some cases, this has been a problem because transactions were kept open for too long. Usually, this was due to a bug. The trouble is this: insanely long transactions can cause cleanup problems and table bloat can occur. The uncontrolled growth of a table (table bloat) naturally leads to performance problems and unhappy end users.

Starting with PostgreSQL 9.6, it is possible to limit the duration a database connection is allowed to spend inside a transaction without performing real work. Here is how it works:

test=# SET idle_in_transaction_session_timeout TO 2500; 
SET
test=# BEGIN;
BEGIN
test=# SELECT 1;
?column?
----------
1
(1 row)

test=# SELECT 1;
FATAL: terminating connection due to idle-in-transaction timeout

Administrators and developers can set a timeout, which is 2.5 seconds in my example. As soon as a transaction is idle for too long, the connection will be terminated automatically by the server. Nasty side effects of long idle transactions can be prevented easily by adjusting this parameter.

Finding more detailed information in pg_stat_activity

The pg_stat_activity function is a system view that has been around for many years. It basically contains a list of active connections. In older versions of PostgreSQL, administrators could see that a query is waiting for somebody else—however, it was not possible to figure out why and for whom. This has changed in 9.6. Two columns have been added:

test=# \d pg_stat_activity  
View "pg_catalog.pg_stat_activity"
Column | Type | Modifiers
------------------+--------------------------+-----------
...
wait_event_type | text |
wait_event | text |
...

In addition to this extension, a new procedure has been added, which shows who caused whom to wait:

test=# SELECT * FROM pg_blocking_pids(4711);
pg_blocking_pids
------------------
{3435}
(1 row)

When the function is called, it will return a list of blocking PIDs.

Tracking vaccum progress

For many years, people have asked for a progress tracker for vacuum. Finally, PostgreSQL 9.6 makes this wish come true by introducing a new system view. Here is how it works:

postgres=# SELECT * FROM pg_stat_progress_vacuum ;

­[ RECORD 1 ]­­­­­­+­­­­­­­­­­­­­­

pid | 29546
datid | 67535
datname | test
relid | 16402
phase | scanning heap
heap_blks_total | 6827
heap_blks_scanned | 77
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples | 154
num_dead_tuples | 0

PostgreSQL will provide detailed information about ongoing vacuum processes so that people can track the progress of this vital operation.

Improving vacuum speed

PostgreSQL 9.6 not only provides you with deeper insights into what vacuum does at the moment, it will also speed up the process in general. From PostgreSQL 9.6 onward, PostgreSQL will keep track of all frozen pages and avoid vacuuming those pages.

Tables that are mostly read-only will massively benefit from this change, as vacuum load is drastically reduced.

Digging into new SQL and developer-related functions

One of the most promising new features of PostgreSQL is the ability to perform phrase searching. Up to 9.5 it was only possible to search for words—phrase searching was very hard to do. 9.6 nicely removes this limitation. Here is an example of how it works:

test=# SELECT phraseto_tsquery('Under pressure') @@                   to_tsvector('Something was under some sort of pressure');

?column?

----------

f

(1 row)

test=# SELECT phraseto_tsquery('Under pressure') @@ to_tsvector('Under pressure by David Bowie hit number 1 again');

?column?

----------

t

(1 row)

The first query returns false because the words we are looking for do not occur in the desired order. In the second example, true is returned because there really is a proper match.

However, there is more: in 9.6 it is possible to check whether words show up in a certain order. In the following example, we want a word to be between united and nations:

test=# SELECT tsquery('united <2> nations') @@                     to_tsvector('are we really united, happy nations?');

?column?

----------

t

(1 row)

test=# SELECT tsquery('united <2> nations') @@ to_tsvector('are we really at united nations?');

?column?

----------

f

(1 row)

The second example returns false as there is no word between united and nations.

Using new backup and replication functionality

PostgreSQL 9.6 has also seen improvements in the area of backup and recovery.

Streamlining wal_level and monitoring

The wal_level setting has always been a bit hard to understand for many people. Many were struggling with the difference between the archive and hot_standby settings. To remove this confusion altogether, both settings have been replaced with the easier-to-understand replica setting, which does the same as hot_standby.

In addition to that, the monitoring of replicated setups has been simplified. Prior to 9.6, there was only the pg_stat_replication view, which could be queried on the master to supervise the flow of data to the slave. Now it is also possible to monitor the flow of data on the slaves, by consulting the pg_stat_wal_receiver function. It is basically the slave-side mirror of the pg_stat_replication function and helps to determine the state of replication.

Using multiple synchronous standby servers

PostgreSQL has been able to perform synchronous replication for quite a while already. In PostgreSQL, it is possible to have more than just one synchronous server from 9.6 onward. Earlier, only one server had to acknowledge a commit. Now it is possible to have an entire group of servers that has to confirm a commit. This is especially important if you want to improve reliability in case of multi-node error.

The syntax to use this new feature is simple:

synchronous_standby_names = '3 (server1, server2, server3, server4) 

However, there is more to synchronous replication in PostgreSQL 9.6. Previously, PostgreSQL ensured (synchronous_commit = on) that the transaction log has reached the slave. However, this did not mean that data was actually visible. Consider an example: somebody adds a user to the master, instantly connects to the slave, and checks for the user. While the transaction log was guaranteed to be on the slave, it was not necessarily guaranteed that the data inside the log was already visible to the end user (due to replication conflicts and so on). By setting synchronous_commit = 'remote_apply', it is now possible to query the slave directly after a commit on the master, without having to worry that data might not be visible yet. The remote_apply value is slower than the on value but it allows to write more advanced applications.

Understanding performance-related features

Just like every release of PostgreSQL, there are numerous performance improvements, which can help to speedup applications. In this section, I want to focus on the most important and most powerful ones. Of course, there are many more small improvements than listed here.

Improving relation extensions

For many years PostgreSQL has extended a table (or an index) block by block. In the case of a single writer process, this was usually fine. However, in cases of high-concurrency writing, writing a block at a time was a source of contention and suboptimal performance. From 9.6 onward, PostgreSQL started to extend tables by multiple blocks at a time. The number of blocks added at a time is 20 times the number of waiting processes.

Checkpoint sorting and kernel interaction

When PostgreSQL writes changes to disk during a checkpoint, it now does so in a more orderly way to ensure that writes are more sequential than earlier. This is done by sorting blocks before sending them too. Random writes will be dramatically reduced this way, which in turn leads to higher throughput on most hardware.

Sorted checkpoints are not the only scalability thing to make it into 9.6. There are also new kernel write-back configuration options: what does this mean? In case of large caches, it could take quite a long time to write all changes out. This used to be especially nasty on systems with hundreds of gigabytes of memory because fairly intense I/O storms could happen. Of course, the operating system, level behavior of Linux could be changed using the /proc/sys/vm/dirty_background_ratio command. However, only a handful of consultants and system administrators actually knew how to do that and why. The checkpoint_flush_after, bgwriter_flush_after, and backend_flush_after functions can be used now to control the flush behavior. In general, the rule is to flush earlier. Still, as the feature is new, people are still gathering experience on how to use those settings in the most efficient way possible.

Using more advanced foreign data wrappers

Foreign data wrappers have been around for many years. Starting with PostgreSQL 9.6, the optimizer can use foreign tables way more efficiently. This includes join push down (joins can now already be performed remotely) and order push down (sorting can now happen remotely). Distributing data inside a cluster is now way more efficient due to faster remote operations.

Introducing parallel queries

Traditionally, a query had to run on a single CPU. While this was just fine in the OLTP world, it started to be a problem for analytical applications, which were bound to the speed provided by a single core. With PostgreSQL 9.6, parallel queries were introduced. Of course, implementing parallel queries was hard and so a lot of infrastructure has already been implemented over the years. All this infrastructure is now available to provide the end user with parallel sequential scans. The idea is to make many CPUs work on complicated WHERE conditions during a sequential scan. Version 9.6 also allowed for parallel aggregates and parallel joins. Of course, there is a lot of work left, but we are already looking at a major leap forward.

To control parallelism, there are two essential settings:

test=# SHOW max_worker_processes; 
max_worker_processes
----------------------
8
(1 row)

test=# SHOW max_parallel_workers_per_gather ;
max_parallel_workers_per_gather
---------------------------------
2
(1 row)

The first one limits the overall number of worker processes available. The second one controls the number of workers allowed per gather node.

A gather node is a new thing you will see in an execution plan. It is in charge of unifying results coming from parallel subprocesses.

In addition to those fundamental settings, there are a couple of new optimizer parameters to adjust the cost of parallel queries.

Adding snapshot too old

Those of you using Oracle would be aware of the following error message: snapshot too old. In Oracle, this message indicates that a transaction has been too long, so it has to be aborted. In PostgreSQL, transactions can run almost infinitely. However, long transactions can still be a problem, so the snapshot too old error has been added as a feature to 9.6, which allows transactions to be aborted after a certain amount of time.

The idea behind that is to prevent table bloat and to make sure that end users are aware of the fact that they might be about to do something stupid.

Summary

In PostgreSQL 9.6 and 10.0, a lot of functionality has been added, which allows people to run even more professional applications even more faster and more efficiently. As far as PostgreSQL 10.0 is concerned, the exact new features are not fully defined yet; some things are already known and are outlined in this chapter.

Left arrow icon Right arrow icon

Key benefits

  • Your one-stop guide to mastering the advanced concepts in PostgreSQL with ease
  • Master query optimization, replication, and high availability with PostgreSQL
  • Extend the functionalities of PostgreSQL to suit your organizational needs with minimum effort

Description

PostgreSQL is an open source database used for handling large datasets (Big Data) and as a JSON document database. It also has applications in the software and web domains. This book will enable you to build better PostgreSQL applications and administer databases more efficiently. We begin by explaining the advanced database design concepts in PostgreSQL 9.6, along with indexing and query optimization. You will also see how to work with event triggers and perform concurrent transactions and table partitioning, along with exploring SQL and server tuning. We will walk you through implementing advanced administrative tasks such as server maintenance and monitoring, replication, recovery and high availability, and much more. You will understand the common and not-so-common troubleshooting problems and how you can overcome them. By the end of this book, you will have an expert-level command of the advanced database functionalities and will be able to implement advanced administrative tasks with PostgreSQL.

What you will learn

[*]Get to grips with the advanced features of PostgreSQL 9.6 and handle advanced SQL [*]Make use of the indexing features in PostgreSQL and fine-tune the performance of your queries [*]Work with the stored procedures and manage backup and recovery [*]Master the replication and failover techniques [*]Troubleshoot your PostgreSQL instance for solutions to the common and not-so-common problems [*]Learn how to migrate your database from MySQL and Oracle to PostgreSQL without any hassle

Product Details

Country selected

Publication date : May 30, 2017
Length 416 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781783555352
Category :

What do you get with a Packt Subscription?

Free for first 7 days. ₹800 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details


Publication date : May 30, 2017
Length 416 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781783555352
Category :

Table of Contents

14 Chapters
Preface Chevron down icon Chevron up icon
1. PostgreSQL Overview Chevron down icon Chevron up icon
2. Understanding Transactions and Locking Chevron down icon Chevron up icon
3. Making Use of Indexes Chevron down icon Chevron up icon
4. Handling Advanced SQL Chevron down icon Chevron up icon
5. Log Files and System Statistics Chevron down icon Chevron up icon
6. Optimizing Queries for Good Performance Chevron down icon Chevron up icon
7. Writing Stored Procedures Chevron down icon Chevron up icon
8. Managing PostgreSQL Security Chevron down icon Chevron up icon
9. Handling Backup and Recovery Chevron down icon Chevron up icon
10. Making Sense of Backups and Replication Chevron down icon Chevron up icon
11. Deciding on Useful Extensions Chevron down icon Chevron up icon
12. Troubleshooting PostgreSQL Chevron down icon Chevron up icon
13. Migrating to PostgreSQL Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Empty star icon Empty star icon Empty star icon Empty star icon Empty star icon 0
(0 Ratings)
5 star 0%
4 star 0%
3 star 0%
2 star 0%
1 star 0%
Top Reviews
No reviews found
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.