Configuring pgbench
In this recipe, we will discuss how to configure pgbench to perform various test cases.
Getting ready
By default, PostgreSQL provides a tool, pgbench, which performs a default test suite based on TPC-B, simulating the live database load on the servers. Using this tool, we can estimate the tps (transactions per second) capacity of the server by conducting dedicated read or read/write test cases. Before performing the pgbench test cases on the server, we need to fine-tune the PostgreSQL parameters so that the server resources are fully utilized. Also, it's good practice to run pgbench from a remote machine, where the network latency between the nodes is negligible.
How to do it...
As aforementioned, pgbench simulates a TPC-B-like workload on the servers by executing three UPDATE statements, followed by SELECT and INSERT statements, on different predefined pgbench tables. If we want to use those predefined tables, then we need to initialize pgbench using the -i or --initialize option. Otherwise, we can write a customized SQL script.
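As a sketch of that workflow, the following commands initialize the predefined tables and then run read-only and read/write tests. The database name benchdb, the scale factor, and the client/thread/duration values are illustrative assumptions, not values prescribed by this recipe:

```shell
# Example pgbench session; "benchdb" and the numbers below are
# illustrative and should be adapted to your environment.
DB=benchdb
SCALE=50

if command -v pgbench >/dev/null 2>&1; then
    # Create and populate the predefined tables: pgbench_accounts,
    # pgbench_branches, pgbench_tellers, and pgbench_history
    pgbench -i -s "$SCALE" "$DB"

    # Dedicated read test: SELECT-only (-S), 10 clients, 2 threads, 60s
    pgbench -c 10 -j 2 -T 60 -S "$DB"

    # Read/write test: the default TPC-B-like transaction mix
    pgbench -c 10 -j 2 -T 60 "$DB"
fi
```

The scale factor (-s) controls the size of the generated data set, so larger values produce a bigger database for the same test.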
To get effective results from pgbench, we need to fine-tune the PostgreSQL server with the following parameters:
Parameter | Description
shared_buffers | This is the amount of memory for database operations
huge_pages | This improves the OS-level memory management
work_mem | This is the amount of memory for each backend for data operations such as sort, join, and so on
autovacuum_work_mem | This is the amount of memory for the postgres internal autovacuum process
max_worker_processes | This is the maximum number of worker processes for the database
max_parallel_workers_per_gather | This is the number of worker processes to consider for a gather node type
max_connections | This is the number of database connections
backend_flush_after | Do fsync after this many bytes have been flushed to disk by each user process
checkpoint_completion_target | This is used to set the I/O usage ratio during the checkpoint
archive_mode | This is used to determine whether the database should run in archive mode
log_lock_waits | This is used to log useful information about concurrent database locks
log_temp_files | This is used to log information about the database's temporary files
random_page_cost | This is used to set the random page cost value for index scans
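These parameters are set in postgresql.conf. The values below are only illustrative starting points (assumed here for a machine with roughly 16 GB of RAM), not recommendations from this recipe; they must be tuned to your own hardware and workload:

```
# postgresql.conf -- example values only; tune for your hardware
shared_buffers = 4GB
huge_pages = try
work_mem = 64MB
autovacuum_work_mem = 256MB
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_connections = 200
backend_flush_after = 512kB
checkpoint_completion_target = 0.9
archive_mode = off
log_lock_waits = on
log_temp_files = 0
random_page_cost = 4.0
```

Remember that changes to some of these parameters (for example, shared_buffers) require a server restart, while others only need a configuration reload.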
Note
You can also find other parameters, which are also important before conducting any benchmarking on the database server, at the following URL: https://www.postgresql.org/docs/9.6/static/runtime-config.html.
Another good practice for getting good performance is to keep the transaction logs (pg_xlog) on another mount point, and also to have dedicated tablespaces for tables and indexes. While performing pgbench tests with the predefined tables, we can specify these tablespaces using the --tablespace and --index-tablespace options.
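A minimal sketch of that initialization follows; the tablespace names ts_data and ts_index, their locations, and the database name are assumptions and must already exist before pgbench is run:

```shell
# Assumes the tablespaces were created beforehand, for example:
#   CREATE TABLESPACE ts_data  LOCATION '/mnt/data_ts';
#   CREATE TABLESPACE ts_index LOCATION '/mnt/index_ts';
DB=benchdb

if command -v pgbench >/dev/null 2>&1; then
    # Place the predefined tables and their indexes on separate tablespaces
    pgbench -i -s 50 \
        --tablespace=ts_data \
        --index-tablespace=ts_index \
        "$DB"
fi
```

Placing tables and indexes on separate mount points spreads the I/O across devices, which is the point of using these two options together.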
How it works...
As we discussed earlier, pgbench is a TPC-B benchmarking tool for PostgreSQL, which simulates a live transaction load on the database server while collecting the required metrics, such as tps, latency, and so on. Using pgbench, we can also increase the database size by choosing the test scale factor while using the predefined tables. If you want to test multiple concurrent connections to the database and want to use a pooling mechanism, then it's good practice to configure pgbouncer/pgpool on the local database node to reuse the connections.
Note
For more features and options with the pgbench tool, visit https://www.postgresql.org/docs/9.6/static/pgbench.html.