Search icon CANCEL
Subscription
0
Cart icon
Cart
Close icon
You have no products in your basket yet
Save more on your purchases!
Savings automatically calculated. No voucher code required
Arrow left icon
All Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletters
Free Learning
Arrow right icon
Mex$179.99 | ALL EBOOKS & VIDEOS
Save more on purchases! Buy 2 and save 10%, Buy 3 and save 15%, Buy 5 and save 20%
Mastering PostgreSQL 9.6
Mastering PostgreSQL 9.6

Mastering PostgreSQL 9.6: A comprehensive guide for PostgreSQL 9.6 developers and administrators

By Hans-Jürgen Schönig
Mex$1,000.99 Mex$179.99
Book May 2017 416 pages 1st Edition
eBook
Mex$1,000.99 Mex$179.99
Print
Mex$1,251.99
Subscription
Free Trial
eBook
Mex$1,000.99 Mex$179.99
Print
Mex$1,251.99
Subscription
Free Trial

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now
Table of content icon View table of contents Preview book icon Preview Book

Mastering PostgreSQL 9.6

PostgreSQL Overview

PostgreSQL is one of the world's most advanced open source database systems and it has many features widely used by developers and system administrators alike. In this book, many of those cool features will be covered and discussed in great detail.

In this chapter, you will be introduced to PostgreSQL and the cool new features available in PostgreSQL 9.6 and beyond. All relevant new functionality will be covered in detail. Given the sheer number of changes made to the code and given the size of the PostgreSQL project, this list of features is of course not complete, so I tried to focus on the most important aspects relevant to most people.

The features outlined in this chapter will be split into the following categories:

  • Database administration
  • SQL and developer-related
  • Backup, recovery, and replication
  • Performance-related topics

What is new in PostgreSQL 9.6?

PostgreSQL 9.6 was released in late 2016 and is the last version that will still be following the old numbering scheme PostgreSQL has been using for more than a decade now. From PostgreSQL 10.0 onward, a new version numbering system will be in place. From 10.0 on, major releases will happen way more frequently.

Understanding new database administration functions

PostgreSQL 9.6 has many new features that can help the administrator to reduce work and make systems more robust.

One of those features is the idle_in_transaction_session_timeout function.

Killing idle sessions

In PostgreSQL, a session or a transaction can basically live almost forever. In some cases, this has been a problem because transactions were kept open for too long. Usually, this was due to a bug. The trouble is this: insanely long transactions can cause cleanup problems and table bloat can occur. The uncontrolled growth of a table (table bloat) naturally leads to performance problems and unhappy end users.

Starting with PostgreSQL 9.6, it is possible to limit the duration a database connection is allowed to spend inside a transaction without performing real work. Here is how it works:

test=# SET idle_in_transaction_session_timeout TO 2500; 
SET
test=# BEGIN;
BEGIN
test=# SELECT 1;
?column?
----------
1
(1 row)

test=# SELECT 1;
FATAL: terminating connection due to idle-in-transaction timeout

Administrators and developers can set a timeout, which is 2.5 seconds in my example. As soon as a transaction is idle for too long, the connection will be terminated automatically by the server. Nasty side effects of long idle transactions can be prevented easily by adjusting this parameter.

Finding more detailed information in pg_stat_activity

The pg_stat_activity function is a system view that has been around for many years. It basically contains a list of active connections. In older versions of PostgreSQL, administrators could see that a query is waiting for somebody else—however, it was not possible to figure out why and for whom. This has changed in 9.6. Two columns have been added:

test=# \d pg_stat_activity  
View "pg_catalog.pg_stat_activity"
Column | Type | Modifiers
------------------+--------------------------+-----------
...
wait_event_type | text |
wait_event | text |
...

In addition to this extension, a new procedure has been added, which shows who caused whom to wait:

test=# SELECT * FROM pg_blocking_pids(4711);
pg_blocking_pids
------------------
{3435}
(1 row)

When the function is called, it will return a list of blocking PIDs.

Tracking vaccum progress

For many years, people have asked for a progress tracker for vacuum. Finally, PostgreSQL 9.6 makes this wish come true by introducing a new system view. Here is how it works:

postgres=# SELECT * FROM pg_stat_progress_vacuum ;

­[ RECORD 1 ]­­­­­­+­­­­­­­­­­­­­­

pid | 29546
datid | 67535
datname | test
relid | 16402
phase | scanning heap
heap_blks_total | 6827
heap_blks_scanned | 77
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples | 154
num_dead_tuples | 0

PostgreSQL will provide detailed information about ongoing vacuum processes so that people can track the progress of this vital operation.

Improving vacuum speed

PostgreSQL 9.6 not only provides you with deeper insights into what vacuum does at the moment, it will also speed up the process in general. From PostgreSQL 9.6 onward, PostgreSQL will keep track of all frozen pages and avoid vacuuming those pages.

Tables that are mostly read-only will massively benefit from this change, as vacuum load is drastically reduced.

Digging into new SQL and developer-related functions

One of the most promising new features of PostgreSQL is the ability to perform phrase searching. Up to 9.5 it was only possible to search for words—phrase searching was very hard to do. 9.6 nicely removes this limitation. Here is an example of how it works:

test=# SELECT phraseto_tsquery('Under pressure') @@                   to_tsvector('Something was under some sort of pressure');

?column?

----------

f

(1 row)

test=# SELECT phraseto_tsquery('Under pressure') @@ to_tsvector('Under pressure by David Bowie hit number 1 again');

?column?

----------

t

(1 row)

The first query returns false because the words we are looking for do not occur in the desired order. In the second example, true is returned because there really is a proper match.

However, there is more: in 9.6 it is possible to check whether words show up in a certain order. In the following example, we want a word to be between united and nations:

test=# SELECT tsquery('united <2> nations') @@                     to_tsvector('are we really united, happy nations?');

?column?

----------

t

(1 row)

test=# SELECT tsquery('united <2> nations') @@ to_tsvector('are we really at united nations?');

?column?

----------

f

(1 row)

The second example returns false as there is no word between united and nations.

Using new backup and replication functionality

PostgreSQL 9.6 has also seen improvements in the area of backup and recovery.

Streamlining wal_level and monitoring

The wal_level setting has always been a bit hard to understand for many people. Many were struggling with the difference between the archive and hot_standby settings. To remove this confusion altogether, both settings have been replaced with the easier-to-understand replica setting, which does the same as hot_standby.

In addition to that, the monitoring of replicated setups has been simplified. Prior to 9.6, there was only the pg_stat_replication view, which could be queried on the master to supervise the flow of data to the slave. Now it is also possible to monitor the flow of data on the slaves, by consulting the pg_stat_wal_receiver function. It is basically the slave-side mirror of the pg_stat_replication function and helps to determine the state of replication.

Using multiple synchronous standby servers

PostgreSQL has been able to perform synchronous replication for quite a while already. In PostgreSQL, it is possible to have more than just one synchronous server from 9.6 onward. Earlier, only one server had to acknowledge a commit. Now it is possible to have an entire group of servers that has to confirm a commit. This is especially important if you want to improve reliability in case of multi-node error.

The syntax to use this new feature is simple:

synchronous_standby_names = '3 (server1, server2, server3, server4) 

However, there is more to synchronous replication in PostgreSQL 9.6. Previously, PostgreSQL ensured (synchronous_commit = on) that the transaction log has reached the slave. However, this did not mean that data was actually visible. Consider an example: somebody adds a user to the master, instantly connects to the slave, and checks for the user. While the transaction log was guaranteed to be on the slave, it was not necessarily guaranteed that the data inside the log was already visible to the end user (due to replication conflicts and so on). By setting synchronous_commit = 'remote_apply', it is now possible to query the slave directly after a commit on the master, without having to worry that data might not be visible yet. The remote_apply value is slower than the on value but it allows to write more advanced applications.

Understanding performance-related features

Just like every release of PostgreSQL, there are numerous performance improvements, which can help to speedup applications. In this section, I want to focus on the most important and most powerful ones. Of course, there are many more small improvements than listed here.

Improving relation extensions

For many years PostgreSQL has extended a table (or an index) block by block. In the case of a single writer process, this was usually fine. However, in cases of high-concurrency writing, writing a block at a time was a source of contention and suboptimal performance. From 9.6 onward, PostgreSQL started to extend tables by multiple blocks at a time. The number of blocks added at a time is 20 times the number of waiting processes.

Checkpoint sorting and kernel interaction

When PostgreSQL writes changes to disk during a checkpoint, it now does so in a more orderly way to ensure that writes are more sequential than earlier. This is done by sorting blocks before sending them too. Random writes will be dramatically reduced this way, which in turn leads to higher throughput on most hardware.

Sorted checkpoints are not the only scalability thing to make it into 9.6. There are also new kernel write-back configuration options: what does this mean? In case of large caches, it could take quite a long time to write all changes out. This used to be especially nasty on systems with hundreds of gigabytes of memory because fairly intense I/O storms could happen. Of course, the operating system, level behavior of Linux could be changed using the /proc/sys/vm/dirty_background_ratio command. However, only a handful of consultants and system administrators actually knew how to do that and why. The checkpoint_flush_after, bgwriter_flush_after, and backend_flush_after functions can be used now to control the flush behavior. In general, the rule is to flush earlier. Still, as the feature is new, people are still gathering experience on how to use those settings in the most efficient way possible.

Using more advanced foreign data wrappers

Foreign data wrappers have been around for many years. Starting with PostgreSQL 9.6, the optimizer can use foreign tables way more efficiently. This includes join push down (joins can now already be performed remotely) and order push down (sorting can now happen remotely). Distributing data inside a cluster is now way more efficient due to faster remote operations.

Introducing parallel queries

Traditionally, a query had to run on a single CPU. While this was just fine in the OLTP world, it started to be a problem for analytical applications, which were bound to the speed provided by a single core. With PostgreSQL 9.6, parallel queries were introduced. Of course, implementing parallel queries was hard and so a lot of infrastructure has already been implemented over the years. All this infrastructure is now available to provide the end user with parallel sequential scans. The idea is to make many CPUs work on complicated WHERE conditions during a sequential scan. Version 9.6 also allowed for parallel aggregates and parallel joins. Of course, there is a lot of work left, but we are already looking at a major leap forward.

To control parallelism, there are two essential settings:

test=# SHOW max_worker_processes; 
max_worker_processes
----------------------
8
(1 row)

test=# SHOW max_parallel_workers_per_gather ;
max_parallel_workers_per_gather
---------------------------------
2
(1 row)

The first one limits the overall number of worker processes available. The second one controls the number of workers allowed per gather node.

A gather node is a new thing you will see in an execution plan. It is in charge of unifying results coming from parallel subprocesses.

In addition to those fundamental settings, there are a couple of new optimizer parameters to adjust the cost of parallel queries.

Adding snapshot too old

Those of you using Oracle would be aware of the following error message: snapshot too old. In Oracle, this message indicates that a transaction has been too long, so it has to be aborted. In PostgreSQL, transactions can run almost infinitely. However, long transactions can still be a problem, so the snapshot too old error has been added as a feature to 9.6, which allows transactions to be aborted after a certain amount of time.

The idea behind that is to prevent table bloat and to make sure that end users are aware of the fact that they might be about to do something stupid.

Summary

In PostgreSQL 9.6 and 10.0, a lot of functionality has been added, which allows people to run even more professional applications even more faster and more efficiently. As far as PostgreSQL 10.0 is concerned, the exact new features are not fully defined yet; some things are already known and are outlined in this chapter.

Left arrow icon Right arrow icon

Key benefits

  • Your one-stop guide to mastering the advanced concepts in PostgreSQL with ease
  • Master query optimization, replication, and high availability with PostgreSQL
  • Extend the functionalities of PostgreSQL to suit your organizational needs with minimum effort

Description

PostgreSQL is an open source database used for handling large datasets (Big Data) and as a JSON document database. It also has applications in the software and web domains. This book will enable you to build better PostgreSQL applications and administer databases more efficiently. We begin by explaining the advanced database design concepts in PostgreSQL 9.6, along with indexing and query optimization. You will also see how to work with event triggers and perform concurrent transactions and table partitioning, along with exploring SQL and server tuning. We will walk you through implementing advanced administrative tasks such as server maintenance and monitoring, replication, recovery and high availability, and much more. You will understand the common and not-so-common troubleshooting problems and how you can overcome them. By the end of this book, you will have an expert-level command of the advanced database functionalities and will be able to implement advanced administrative tasks with PostgreSQL.

What you will learn

[*]Get to grips with the advanced features of PostgreSQL 9.6 and handle advanced SQL [*]Make use of the indexing features in PostgreSQL and fine-tune the performance of your queries [*]Work with the stored procedures and manage backup and recovery [*]Master the replication and failover techniques [*]Troubleshoot your PostgreSQL instance for solutions to the common and not-so-common problems [*]Learn how to migrate your database from MySQL and Oracle to PostgreSQL without any hassle

Product Details

Country selected

Publication date : May 30, 2017
Length 416 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781783555352
Category :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Buy Now

Product Details


Publication date : May 30, 2017
Length 416 pages
Edition : 1st Edition
Language : English
ISBN-13 : 9781783555352
Category :

Table of Contents

14 Chapters
Preface Chevron down icon Chevron up icon
1. PostgreSQL Overview Chevron down icon Chevron up icon
2. Understanding Transactions and Locking Chevron down icon Chevron up icon
3. Making Use of Indexes Chevron down icon Chevron up icon
4. Handling Advanced SQL Chevron down icon Chevron up icon
5. Log Files and System Statistics Chevron down icon Chevron up icon
6. Optimizing Queries for Good Performance Chevron down icon Chevron up icon
7. Writing Stored Procedures Chevron down icon Chevron up icon
8. Managing PostgreSQL Security Chevron down icon Chevron up icon
9. Handling Backup and Recovery Chevron down icon Chevron up icon
10. Making Sense of Backups and Replication Chevron down icon Chevron up icon
11. Deciding on Useful Extensions Chevron down icon Chevron up icon
12. Troubleshooting PostgreSQL Chevron down icon Chevron up icon
13. Migrating to PostgreSQL Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Empty star icon Empty star icon Empty star icon Empty star icon Empty star icon 0
(0 Ratings)
5 star 0%
4 star 0%
3 star 0%
2 star 0%
1 star 0%
Top Reviews
No reviews found
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.