Restoring a binary backup (Simple)
We have now discussed several different methods for obtaining a binary backup. What do we actually do with these binary backup files? We'll cover a simple restore since there is no supplied utility that performs this task.
Getting ready
Please refer to the Getting a basic export (Simple) recipe to bootstrap our sample database. Then, we want a very simple backup to restore. We should stop the database to avoid any extra work, using the following commands:
$> pg_ctl -D /db/pgdata stop -m fast
$> tar -C /db/pgdata --exclude=pg_xlog -cz \
   -f /db/backup/backup.tar.gz .
$> tar -C /db/pgdata/pg_xlog --exclude=archive_status \
   -czf /db/backup/xlog.tar.gz .
Once these files are safely stored, erase our sample database using the following command:
$> rm -Rf /db/pgdata
How to do it...
Restoring a binary backup is generally easy if we run a few commands as the postgres user, as shown in the following list:
First, create a directory and extract the backup file to that location:
$> mkdir -m 700 /db/pgdata
$> tar -C /db/pgdata -xzf /db/backup/backup.tar.gz
Then, create a directory for the archived logs we are going to restore. We also need the pg_xlog directory, which PostgreSQL expects to exist for the database to start properly. Once these directories are created, uncompress the transaction logs we backed up:
$> mkdir -m 700 /db/pgdata/{archived,pg_xlog}
$> tar -C /db/pgdata/archived -xzf /db/backup/xlog.tar.gz
Finally, create a file named recovery.conf in the /db/pgdata directory to contain the following:
restore_command = 'cp /db/pgdata/archived/%f "%p"'
Then start the database as usual. It will recover for a period of time and become available:
$> pg_ctl -D /db/pgdata start
How it works...
We cheat a little here, creating a simple backup that requires a recovery.conf file to restore. All recovery is controlled through this file, so it is important to know how it works.
Afterwards, we simply create the basic directory for the database as it existed before, and extract the backup file there. When a database is in recovery, it copies old transaction logs and processes them, ensuring the database properly reflects all the necessary data.
The real trick here is the recovery.conf file, which plays a central role in almost all binary recovery. We used one of the simplest settings to copy transaction logs to their required destination directory, and before this is all over, we will learn much more. For now, know that in the restore_command line, %f is the name of a file PostgreSQL needs, and %p is where the file should reside.
This means we could use any equivalent Unix command instead of our simple copy. For instance, we could use rsync to copy the file from a remote storage location instead, or even invoke a custom script, as sketched below.
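As a quick illustration, restore_command can point at a wrapper script rather than a bare command. The following is a minimal sketch, where /db/scripts/fetch_wal.sh is a hypothetical helper of our own making; the only contract PostgreSQL imposes is that the command exits with a non-zero status when a requested file does not exist:
restore_command = '/db/scripts/fetch_wal.sh %f "%p"'
#!/bin/sh
# fetch_wal.sh -- hypothetical restore helper for our example layout.
# $1 is the file name PostgreSQL requested (%f); $2 is the full path
# where PostgreSQL wants the file placed (%p).
SRC="/db/pgdata/archived/$1"

# Exiting non-zero tells PostgreSQL the file is unavailable, which is
# how recovery knows it has consumed all archived transaction logs.
[ -f "$SRC" ] || exit 1

cp "$SRC" "$2"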
There's more...
Restoring a simple backup is effectively just extraction and recovery. We don't even need a bootstrap like with a SQL restore. However, there is still a little more to know.
No extract for rsync backups
If we use rsync to copy files from another server instead, we have an exact copy of the database as if it were never stopped. In this case, there is no backup file to extract, and no other files to restore. We don't really need a recovery.conf file for the database to start either.
Tip
Do not exclude the pg_xlog directory in your rsync command if you plan to do this, as PostgreSQL requires those files to start the database. It is also good practice to only synchronize data while the server is stopped, so that the data files are consistent.
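For instance, a full copy from the original server might look like the following sketch, where primary_server is a placeholder for whichever host holds the stopped database; note the absence of any --exclude=pg_xlog switch:
$> rsync -a postgres@primary_server:/db/pgdata/ /db/pgdata/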
Remote data copies
The copy command can be anything we want. It is good practice to have a vault server that acts as a backup target. Let's set up a recovery.conf that copies data from a remote server instead:
restore_command = 'rsync -a postgres@backup_server::archive/%f "%p"'
This assumes that we have rsync installed, and have an equivalent rsyncd.conf on the backup server. We could also use scp instead for a secure SSH-based copy, or we could execute our own pre-written script.
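For reference, a minimal rsyncd.conf module on the backup server might resemble the following sketch; the module name must match the ::archive portion of restore_command, while the path and ownership values are assumptions based on our example layout:
# /etc/rsyncd.conf on the backup server (sketch)
[archive]
    path = /db/archive
    read only = yes
    uid = postgres
    gid = postgres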
Note
Be careful with scp. It copies files over without an intermediate temporary name, so the recovering database may try to read a file while it is still being transferred. A safer option is to use rsync's --rsh=ssh switch to use ssh as a transport mechanism only. This way, your data is still protected by ssh, without the risk of the database reading a half-transferred file.
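Such a recovery.conf entry might look like the following sketch; the single colon tells rsync to use a remote shell rather than the rsync daemon, and the /db/archive path is an assumption carried over from the earlier example:
restore_command = 'rsync -a --rsh=ssh postgres@backup_server:/db/archive/%f "%p"'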
Again, this is just one possible alternative available. A good environment makes use of several techniques as appropriate.