Getting a basic export (Simple)
We will start with pg_dumpall, the most basic PostgreSQL backup tool. This single command-line utility can export the entire database instance at once. We want to start with this particular command because it preserves important information such as users, roles, and passwords. Later, we will use it only to obtain this important metadata.
Getting ready
Before we begin backing up our database, we should have a database! Since we have installed both PostgreSQL and the Contrib tools, we should have everything we need to get started. To make things easier, we will export a single environment variable to run all commands as the postgres user. This user owns the database instance in default installs, and exporting this variable lets you act as the postgres user for all future examples. Then we can use the createdb utility to create an empty database for our backup experimentation. The pgbench utility will be our source of data, as shown in the following code snippet, since backing up an empty database is hard to verify upon restore:
$> export PGUSER=postgres
$> createdb sample
$> pgbench -i -s 50 sample
Now we have a database named sample with several tables full of generated data. Since pgbench inserts 100,000 rows per unit of scale into its largest table, a scale of 50 provides a table with five million rows that we can use to gauge the backup processing time required. We can also verify the restored database by checking for the existence of the generated tables and their content. If this scale is too large, feel free to use a smaller one to follow along.
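To confirm that the data was generated, one quick check is to count the rows in the largest table pgbench created. This sketch assumes a pgbench version that prefixes its tables with pgbench_, which has been the convention for some time:
$> psql sample -c "SELECT count(*) FROM pgbench_accounts"
At a scale of 50, the count should come back as five million.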
Tip
The sample database will be the basis for all subsequent data export examples. Unless otherwise specified, each example starts with a fresh copy of this database. Again, you can use the suggested scale size here, or choose your own.
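If you need to return to that fresh state between examples, one simple approach is to drop and rebuild the sample database before re-running the data generator:
$> dropdb sample
$> createdb sample
$> pgbench -i -s 50 sample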
How to do it...
Creating a backup this way requires a single command, as follows:
Make sure you have opened a terminal session as a local user on your Linux system, and type the following command:
$> pg_dumpall -f backup.sql
How it works...
The pg_dumpall utility produces what should be a full copy of all database objects, including users, schemas, and data, as a single very large SQL file. Our example directed the SQL output to a file named backup.sql, but any name is valid, so long as we can remember it later.
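Because the output is plain SQL, you can inspect it with ordinary text tools before trusting it. For example, searching for the statements that recreate roles, databases, and table contents is a quick sanity check:
$> grep -E '^CREATE (ROLE|DATABASE)' backup.sql
$> grep -c '^COPY ' backup.sql
The first command lists the roles and databases the file will recreate, while the second counts the COPY blocks that reload table data.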
There's more...
Though the venerable --help command-line switch always lists the full capabilities available to us, the most important of these deserve further discussion.
Restoring the export
Before we get much further, we should quickly explain how to restore the SQL file you just produced. Our other recipes are more complex and require separate sections, but restoring a pg_dumpall export is very easy. The psql command is used for running SQL files, and since this is just such a file, you can run it directly against the database. Try the following:
$> psql -f backup.sql postgres
The -f switch tells psql which file to execute, while the trailing postgres names the database to connect to; it is merely a placeholder here. The psql command expects a database name, so we have provided a simple default. Your backup will still restore properly, for example, creating and filling the sample database. This is because the backup also contains database creation commands, as well as commands to switch database targets so that all the data goes where it should. As we said, this is the easiest backup method PostgreSQL has.
Exporting global objects
Though the SQL export itself is perfectly valid for restore, many administrators prefer to use the pg_dumpall export only to obtain globally stored objects such as users, roles, and passwords, and to use other tools for things such as tables and other data. To get this global data alone, the -g switch tells pg_dumpall that this is all we want. Type the following command to get only global objects:
$> pg_dumpall -g -f globals.sql
We will use the previous command frequently whenever we need just the global objects.
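As with the full export, the result is plain SQL, so it is easy to confirm what was captured. For instance, listing the role creation statements shows every user the file will restore:
$> grep 'CREATE ROLE' globals.sql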
Compressed backups
Unfortunately, pg_dumpall cannot compress its output directly; it is a very basic tool. If we have an extremely large database, other UNIX commands will also be necessary. For example, the following command pipes the dump through gzip as it is produced, greatly reducing its size on disk:
$> pg_dumpall | gzip > backup.sql.gz
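Restoring such an export works the same way in reverse: decompress the stream and feed it directly into psql, with no need to unpack the file on disk first:
$> gunzip -c backup.sql.gz | psql postgres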
Naming backups
Note that in all of our examples thus far, we have named the backup rather poorly. It is a better practice to use the -f switch to provide a name that follows a specific naming scheme. Backup files should always include at least one contextual clue, the date on which the backup was taken, and possibly the time. The following is a better example:
$> pg_dumpall -f production_2013-02-15.sql
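Rather than typing the date by hand, we can let the shell fill it in. The following variant uses the standard date utility, so the file name always reflects when the backup actually ran:
$> pg_dumpall -f production_$(date +%Y-%m-%d).sql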