Streaming replication (Advanced)
The last topic we are going to cover is a newer method of ongoing binary recovery over a network connection known as streaming replication. A backup maintained this way is always a working copy of the database, and it requires no third-party tools such as rsync. Because it is always online, it is a perfect candidate for disaster recovery scenarios rather than mere backup availability.
Getting ready
Please refer to the Getting a basic export (Simple) recipe to bootstrap our database. Before we start the server to create the sample database, we need to change a few settings in postgresql.conf using the following code:
wal_level = hot_standby
max_wal_senders = 5
Then we must allow a user to connect from the backup server to the replication database. For this example, assume the backup server has the IP address 192.168.5.5. We allow this by adding the following line to pg_hba.conf:
host replication replication 192.168.5.5/32 trust
Then start the main database and use pgbench to initialize our sample data. Afterwards, we should create the replication user and give it the ability to use system replication, as shown in the following command:
$> psql -c "create user replication with replication;"
How to do it...
The pg_basebackup backup step, which we will perform on the backup server, also acts as a restore. The data on the backup server will be almost ready to run. The following are the steps for streaming replication:
Assuming that the main database resides at 192.168.5.1, execute the following command from the backup database server:
$> pg_basebackup -D /db/pgdata -h 192.168.5.1 -U replication
Next, create a recovery.conf file on the backup server with the following content:
standby_mode = on
trigger_file = '/tmp/promote_db'
primary_conninfo = 'host=192.168.5.1 port=5432 user=replication'
Next, it's common practice to modify postgresql.conf to allow online read-only queries. Let's do that using the following code snippet:
hot_standby = on
Then start the backup copy of the database:
$> pg_ctl -D /db/pgdata start
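As a quick sanity check, we can ask the backup server directly whether it is running in recovery mode. The following psql command, run on the backup server, should return t; this is merely a suggested verification, not a required step:
$> psql -c "SELECT pg_is_in_recovery();"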
How it works...
What we have just created here is the default type of backup replication known as asynchronous streaming replication. It took a lot of extra preparation on the main database server to get this working.
We started by changing the wal_level setting to hot_standby so the backup server can run in read-only mode. It is also important that we set max_wal_senders to a non-zero value so the backup server can connect to the main database and request transaction logs directly from the source. Because of this, we no longer need rsync at all, or even PostgreSQL archive_mode. Transferring files between both servers is only necessary as a safe fallback in case the backup server cannot connect to the main database for some reason.
The replication database is actually a pseudo-database that does not exist. The line we placed in pg_hba.conf tells PostgreSQL that the replication user can connect to this pseudo-database from the backup server. Oddly enough, using all to indicate all databases does not work here. To use replication, not only does a user with the replication permission need to exist, but that user must be explicitly allowed to connect to the replication database. We admit that this might be somewhat confusing.
With all of this preparation out of the way, we revisit the pg_basebackup tool once more. Because the replication user can connect directly to the main database, it can actually create a copy of the database as well. In fact, that is the primary purpose of the pg_basebackup tool. Instead of copying transaction log files, it requests primary database files and saves them as a mirror of the source database.
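Depending on the PostgreSQL version, pg_basebackup also accepts a few convenience options. For example, adding -P for a progress report and -x to include the transaction logs needed to start the copy might look like the following, though the available flags vary by release:
$> pg_basebackup -D /db/pgdata -h 192.168.5.1 -U replication -P -x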
The next step is to create a recovery.conf file to control how the backup copy acts. In our case, we want the usual standby_mode, a basic trigger_file to allow local promotion, and the real magic of streaming replication, that is, primary_conninfo. This is where we designate the connection string to the main database server. If everything works properly, the backup server will report the following in its logs after we start it:
LOG: streaming replication successfully connected to primary
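We can also verify the connection from the main database server itself. The pg_stat_replication system view lists every connected streaming client, so a simple query such as the following, run on the main server, should show one row for our backup server:
SELECT client_addr, state FROM pg_stat_replication;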
There's more...
Asynchronous replication is not the only option available for a backup server. We can also operate synchronously, or use the backup copy of the database to produce backup files for long-term storage.
Synchronous backup
Asynchronous backup is good for most cases. However, for mission-critical data that absolutely must exist on at least two servers before being saved at all, there is another option. Synchronous replication modifies the main database server in such a way that no transactions will commit at all unless at least one backup server also receives the data. To use this, we need to add a setting to the postgresql.conf file on our main server as shown in the following code:
synchronous_standby_names = 'backup_server'
We also need to append an application_name entry to the end of the primary_conninfo setting to name our backup server:
primary_conninfo = '... application_name=backup_server'
Once we restart both the main and backup databases, commits on the main server will wait for acknowledgment from the backup server, giving us the increased durability we wanted.
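One way to verify that the change took effect is to check the sync_state column of pg_stat_replication on the main server; our named standby should be listed as sync. This is only a suggested check:
SELECT application_name, sync_state FROM pg_stat_replication;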
Synchronous backup caveats
Though synchronous backup is more durable since all transactions must pass through at least two systems, PostgreSQL is very serious about honoring that data guarantee. If the backup server disconnects from the main server, either through server maintenance or network disruption, the main database will actually stop processing transactions entirely until the backup server returns to normal operation.
The only way to prevent this behavior is to temporarily disable synchronous replication in postgresql.conf:
synchronous_standby_names = ''
Then tell the server to reload:
$> pg_ctl -D /db/pgdata reload
This should be done before maintenance on the backup server, or if a network outage is detected between the two servers.
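Once the maintenance or outage is over, simply restore the original value and reload again, as in this sketch:
synchronous_standby_names = 'backup_server'
$> pg_ctl -D /db/pgdata reload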
We also need to consider the effect of network latency. Because transactions must be acknowledged by two servers, locks are held longer on the main server. For non-critical data in a synchronous replication setup, clients should disable synchronous mode. This can be done with a basic PostgreSQL client command. The following sample update would not use synchronous replication:
SET synchronous_commit TO false;
UPDATE my_table SET modified_date = NOW() WHERE row_val LIKE '%change%';
The synchronous_commit setting can be used anywhere normal SQL is allowed. This is only a temporary change, and it only applies to the current client connection. To reactivate synchronous commit in the same connection, it needs to be explicitly re-enabled using the following code:
SET synchronous_commit TO true;
Finally, we should note that synchronous replication only guarantees that transactions have been received by the backup server, not that they have been processed. Synchronous mode is really just a shallow extension of asynchronous mode and, as a consequence, retains many of the same limitations. One of those limits is that the backup server can only note that it has the transaction, not the progress of applying it.
In practice, backup servers usually apply transactions much faster than the main server because they do not generally carry the same transaction overhead. We just do not want to be unpleasantly surprised by an unexpected race condition if the two servers get out of sync.
Streaming backups
Probably the best use for a hot-standby or streaming backup is the ability to use it as the primary backup source. In all likelihood, the primary database is much busier than the backup or disaster recovery server. There are just a few things we need to remember:
The database is only available for reading.
As a consequence, pg_basebackup cannot be used. This restriction is removed in PostgreSQL 9.2 and above.
Using pg_start_backup or pg_stop_backup is also not allowed. Again, PostgreSQL 9.2 and above make this possible.
This means tar and rsync style backups are good for older installs, with or without filesystem snapshots. pg_dumpall and pg_dump are also perfectly valid, since they only obtain read locks on the database while they operate.
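For instance, a pg_dump taken on the backup server might look like the following; the database name sample and the output path are ours for illustration only:
$> pg_dump -Fc sample > /backup/sample.dump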
Any of the previously mentioned backup methods that do not involve pg_basebackup can be used with very little modification. The easiest way is to simply stop the backup copy, since it is likely non-critical. Since the backup server is probably under less duress, parallel compression can utilize more available CPUs, and disk bandwidth saturation carries less operational risk.
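As a rough sketch, assuming a parallel compressor such as pigz is installed, a filesystem-level backup of the stopped copy might look like this; the paths and tool choice are ours, not requirements:
$> pg_ctl -D /db/pgdata stop
$> tar -cf - /db/pgdata | pigz > /backup/pgdata.tar.gz
$> pg_ctl -D /db/pgdata start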
The ultimate backup method is to have an entirely separate backup server to act in a disaster recovery role. Not only can it fill in for the main database in case of an outage, but low priority work and maintenance procedures such as regular backups can be offloaded without risking the primary system. Streaming replication makes this both easy and convenient.