You're reading from PostgreSQL High Performance Cookbook Mastering query optimization, database monitoring, and performance-tuning for PostgreSQL

Product type Paperback

Published in Mar 2017

Publisher Packt

ISBN-13 9781785284335

Length 360 pages

Edition 1st Edition

Languages

SQL

Tools

PostgreSQL

Concepts

High Performance Programming

Authors (2):

Chitij Chauhan

Dinesh Kumar

View More author details

Table of Contents (13) Chapters

Preface

1. Database Benchmarking FREE CHAPTER

2. Server Configuration and Control

3. Device Optimization

4. Monitoring Server Performance

5. Connection Pooling and Database Partitioning

6. High Availability and Replication

7. Working with Third-Party Replication Management Utilities

8. Database Monitoring and Performance

9. Vacuum Internals

10. Data Migration from Other Databases to PostgreSQL and Upgrading the PostgreSQL Cluster

11. Query Optimization

12. Database Indexing

Configuring pgbench

In this recipe, we will be discussing how to configure the pgbench to perform various test cases.

Getting ready

By default, PostgreSQL provides a tool, pgbench, which performs a default test suite based on TPC-B, which simulates the live database load on the servers. Using this tool, we can estimate the tps (transactions per second) capacity of the server, by conducting a dedicated read or read/write test cases. Before performing the pgbench test cases on the server, we need to fine-tune the PostgreSQL parameters and make them ready to fully utilize the server resources. Also, it's good practice to run pgbench from a remote machine, where the network latency is trivial among the nodes.

How to do it...

As aforementioned, pgbench simulates a TPC-B-like workload on the servers, by executing three update statements, followed by SELECT and INSERT statements into different pre-defined pgbench tables, and if we want to use those pre-defined tables, then we would need to initiate pgbench using the -i or --initialize options. Otherwise, we can write a customized SQL script.

To get effective results from pgbench, we need to fine-tune the PostgreSQL server with the following parameters:

Parameter	Description
`shared_buffers`	This is the amount of memory for database operations
`huge_pages`	This improves the OS-level memory management
`work_mem`	This is the amount of memory for each backend for data operations such as sort, join, and so on
`autovacuum_work_mem`	This is the amount of memory for postgres internal process autovacuum
`max_worker_processes`	This is the maximum number of worker processes for the database
`max_parallel_workers_per_gather`	This is the number of worker processes to consider for a gather node type
`max_connections`	This is the number of database connections
`backend_flush_after`	Do fsync, after this many bytes have been flushed to disk by each user process
`checkpoint_completion_target`	This is used to set I/O usage ratio during the checkpoint
`archive_mode`	This is used to determine whether the database should run in the archive mode
`log_lock_waits`	This is used to log useful information about concurrent database locks
`log_temp_files`	This is used to log information about the database's temporary files
`random_page_cost`	This is used to set random page cost value for the index scans

Note

You can also find other parameters at the following URL, which are also important before conducting any benchmarking on the database server: https://www.postgresql.org/docs/9.6/static/runtime-config.html.

Another good practice to get good performance is to keep the transaction logs (pg_xlog) in another mount point, and also have unique tablespaces for tables and indexes. While performing the pgbench testing with predefined tables, we can specify these unique tablespaces using the --index-tablespace and --tablespace options.

How it works...

As we discussed earlier, pgbench is a TPC-B benchmarking tool for PostgreSQL, which simulates the live transactions load on the database server by collecting the required metrics such as tps, latency, and so on. Using pgbench, we can also increase the database size by choosing the test scale factor while using predefined tables. If you wanted to test multiple concurrent connections to the database and wanted to use the pooling mechanism, then it's good practice to configure the pgbouner/pgpool on the local database node to reuse the connections.