Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Instant PostgreSQL Backup and Restore How-to

You're reading from   Instant PostgreSQL Backup and Restore How-to A step-by-step guide to backing up and restoring your database using safe, efficient, and proven recipes

Arrow left icon
Product type Paperback
Published in Mar 2013
Publisher Packt
ISBN-13 9781782169109
Length 54 pages
Edition 1st Edition
Languages
Arrow right icon
Toc

Taking snapshots (Advanced)


Filesystem snapshots can provide a better-integrity guarantee for PostgreSQL backups, as files on a snapshot do not change during the backup. In Linux systems, the Linux Volume Manager (LVM) provides this capability. On BSD or specially altered Linux systems, ZFS is also available. On expensive SAN systems, this is often a built-in function. Due to its availability, we will cover LVM in this recipe.

Getting ready

Let's assume we have a 500 GB raw block device provided as /dev/sdb. This can be from an internal RAID or SAN management's LUN storage, or even just an internal disk. The following commands will make the /db path available to initialize as a new database:

$> pvcreate /dev/sdb
$> vgcreate db_group /dev/sdb
$> lvcreate -L 450G -n db_volume db_group
$> mkfs -t ext4 /dev/db_group/db_volume
$> mount /dev/db_group/db_volume /db

Please refer to the Getting a basic export (Simple) recipe on preparing a sample database.

How to do it...

With LVM in place, there are several new commands available to manage the volume itself. Let's back up the entire contents of the /db directory within a database snapshot using the following steps:

  1. Start by creating a new snapshot volume:

    $> lvcreate -s -L 40G -n db_snap db_group/db_volume
    
  2. Next, mount the directory so tar can copy the database files:

    $> mkdir /mnt/snap
    $> mount /dev/db_group/db_snap /mnt/snap
    
  3. Now, perform a tar backup:

    $> tar -C /mnt/snap -c -z -f /backup/db_backup.tar.gz .
    
  4. Finally, unmount and destroy the snapshot volume:

    $> umount /mnt/snap
    $> lvremove -f db_group/db_snap
    

How it works...

The process itself is basically a wrapper for whatever backup method we want to choose. During the setup, we create a physical volume that represents the actual device. On top of that goes the volume group, which allows us to group several devices or volumes together. Lastly comes the logical volume itself. We only allocate 450 GB of the total size, so we can use up to 50 GB for snapshots.

When a snapshot volume is created (lvcreate -s), it is tied to the named volume, and uses space in the same volume group. In this case, we used 40 GB of the available 50 GB. While the snapshot volume exists, any changed blocks on the source volume are written to this space instead. This means our example will allow up to 40 GB of data to change before the filesystem notices that the space is exhausted and it automatically deallocates the volume.

When we mount the snapshot volume, we actually see an unchanged copy of the /db directory, mounted at /mnt/snap instead. Our running applications see /db normally, so they can keep operating without interruption. This is why we direct the backup tar command to the snapshot mount and back up everything inside. This data is frozen in time, as if the database had simply stopped operating.

When the backup is finished, we simply unmount and remove the volume. This prevents possible problems from LVM destroying a full snapshot, and lets us reuse the space in subsequent backup operations.

This kind of backup doesn't need the pg_start_backup or pg_stop_backup commands, because no new transaction logs can be written in the snapshot, and nothing can be archived. The files we backed up can be extracted elsewhere and started as a separate database as if it had crashed while operating.

There's more...

It's easy to see that snapshots are very powerful, but there are several elements to keep in mind while using them.

Proper snapshot sizing

Why did we choose 40 GB for our snapshot volume? In our case, this was just an illustrative amount. In a real scenario, we would want to carefully calculate suitable snapshot sizes. This might simply be multiplying the average data turnover by the usual back up duration, and doubling that amount. We might also back up during a less active time of day (such as midnight) with a semi-large snapshot.

The point to remember is that the snapshot supplies all necessary space for any changed blocks on any file in the source volume, or any new files generated, such as transaction logs. The easiest way to see how much of this space is used, is to call some other LVM commands using the following command:

$> lvdisplay db_group/db_snap

The section entitled Allocated to snapshot shows how much of the volume is being used. Creating and monitoring a temporary snapshot is a good way to gauge the rate of data turnover before scheduling a real snapshot backup.

In our case, we reserved 10 percent of the total volume size, which is a good starting point. It is always possible to return excess space to the database volume with lvextend and resize2fs. To return 10 GB of space to the db_volume volume, we would adjust it while it is unmounted using the following commands:

$> umount /db
$> lvextend -L +10G db_group/db_volume
$> resize2fs /dev/db_group/db_volume

It is worth mentioning that ZFS has features that negate a lot of the work we performed previously. Unfortunately, properly setting up a good ZFS filesystem is a non-trivial exercise. We highly recommend obtaining a book that outlines how to best use this filesystem if you plan on utilizing this approach.

Snapshot backup targets

It may be tempting to back up the database to the source volume, as it's likely very large. This inclination must be ignored, however! Remember that any block written in the source volume comes from the snapshot. For very large databases, this could require hundreds of GB of space for the backup alone, ignoring any changes of the database itself.

Best practice suggests directing a backup to another device entirely. Larger systems commonly have a separate backup mount such as a NAS or SAN device where backups can reside in a vault area before being committed to tape for long-term storage. Not only is this safer for a snapshot, but much faster as well, since the backup is not stealing read or write throughput from the source volume.

The assumption through all of these examples is that /backup is a source volume. This is a good idea for all backup types, but is especially necessary for snapshot backups to avoid IO contention and snapshot destruction.

Each tablespace is likely to be held on a separate volume, and each would need a separate snapshot command. Even if this is done virtually simultaneously, there is a slight risk of a race condition between the different times for which each of the snapshots were frozen, leaving the data files backed up in an undetermined state. Any tables knocked out of sync due to this should be saved by replaying archive logs during recovery, but be wary of the limitations in this approach.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image