Backing up your environment (Medium)
There are different types of backup. When you think of backing up your environment, it helps to think of how you want to restore it. Do you want to restore it quickly? Do you want to restore it from the bare bones? Do you want to invest more or less time doing the actual restoration process? Do you want to be granular about what you restore?
Getting ready
Debian has different software for all those types of backups. You should select the one that you feel comfortable with, rather than looking for a feature-by-feature replica of what you had in your old environments or simply adopting whatever everyone else on the Internet is using.
For a web server, you usually have two options: bare metal backup and restore or just web server filesystem backup and restore. The former is usually more comprehensive and complex to set up and maintain, and the latter is easier to get running but covers fewer recovery scenarios.
In any case, you should also add special considerations for backing up your databases and caches, as they usually store information in memory that needs to be flushed to the disk. In some cases like in Postgres (http://stackoverflow.com/questions/1216660/see-and-clear-postgres-caches-buffers), it is not easy to flush cache manually; you need to stop the database altogether.
In general, it's best to back up a stopped system. That's why it's important to have a scalability strategy for your application so it can keep running while you back up masters.
For the bare metal recovery solution, we'll use Bacula. We will also share some tips on how to use rsync for a web backup scenario. rsync might make sense here because you probably don't have a large datacenter with lots of different servers and operating systems; most likely, you have Debian running a web application, with any number of similar machines running as slaves.
Bacula is distributed, so you can have a director, a storage daemon (the server where backups are stored) and several file daemons (the clients to be backed up); for the purpose of this guide, we'll assume the director and the storage daemon run on the same server. Also, the director can use different backends for metadata storage. It could make sense to use the same database as your web application to potentially re-use existing DBA skills. We'll use MySQL.
Besides /var/www and /var/lib/mysql or /var/lib/postgresql, you'd usually want to back up several critical folders such as /etc, which contains the configuration for your setup. /var may also be a good idea, especially if you're using other caches or software with variable data. The rest of your system, particularly /usr and /lib, is usually not modified and comes prepackaged in Debian packages; /tmp is volatile (it is cleared with each restart) and /dev is autogenerated.
Of course, if you're going with a bare metal strategy, then you recover your setup from the ground up. More on that in the next recipe, Restoring your environment.
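As a rough aid for deciding what to include, the following sketch prints which of the directories discussed above exist on a host and how much space they use. The list of paths is only an assumption based on this section; adjust it to your own layout.

```shell
#!/bin/sh
# Candidate paths taken from the discussion above (an assumption, not a rule).
BACKUP_PATHS="/etc /var/www /var/lib/mysql /var/lib/postgresql"

# Print every candidate path that exists on this host, one per line.
existing_paths() {
    for p in $BACKUP_PATHS; do
        [ -e "$p" ] && printf '%s\n' "$p"
    done
    return 0
}

# Show the disk usage of each existing path; unreadable paths are skipped.
existing_paths | while read -r p; do
    du -sh "$p" 2>/dev/null || true
done
```

Run it as root if you want sizes for directories such as /var/lib/mysql, which are usually not readable by ordinary users.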
How to do it…
In this section, you will install and configure Bacula Director, Storage Daemon and Console, with a MySQL backend:
On your backup server, install all Bacula components using apt-get:
sudo apt-get install bacula bacula-director-mysql bacula-sd-mysql bacula-console
Open /etc/bacula/bacula-director.conf with sudo editor (on some Debian releases, this file is named /etc/bacula/bacula-dir.conf).
Browse to the Client {} groups. The first one is the server itself, and the second one is commented out. Uncomment the second one and change the following directives:
Name: It should match the name in /etc/bacula/bacula-fd.conf on the client
Address: It is the IP address or FQDN of the client
Password: It should match the one on the client, or you can make your own
Also, in /etc/bacula/bacula-sd.conf, make sure an IP address or FQDN is used, and that the name under Device | Archive Device and the password match the ones in bacula-director.conf.
Restart Bacula using the sudo service bacula-director restart and sudo service bacula-sd restart commands.
On your client server (your web application server), install the Bacula file daemon using the following command:
sudo apt-get install bacula-fd
Open the /etc/bacula/bacula-fd.conf configuration file to make Bacula listen on your internal backup address and to set the hostname and password of the allowed director:
sudo editor /etc/bacula/bacula-fd.conf
Browse to the Director {} group, change Name to the name of the director (found under Director/Name in /etc/bacula/bacula-director.conf on the server) and take note of the password (it needs to match the one on the server).
Browse to the FileDaemon {} group and change FDAddress to a non-loopback IP address where the director can reach you.
Issue a sudo service bacula-fd restart command.
To test it on the director, run:
sudo bconsole
status
Type in 3 (for client).
Type in 2 (usually your client will be #2).
Bacula will show the status of the client. You should see no error messages and no jobs running.
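To illustrate how the names and passwords from the steps above have to line up, here is a minimal sketch of both sides. The names web01-fd and backup-dir, the address 192.0.2.10, the catalog name MyCatalog and the placeholder password are assumptions for illustration only; use the values from your own installation.

```
# On the backup server, in the director configuration
Client {
  Name = web01-fd               # hypothetical; must equal FileDaemon/Name on the client
  Address = 192.0.2.10          # hypothetical IP address or FQDN of the client
  FDPort = 9102
  Catalog = MyCatalog           # assumed catalog name
  Password = "CHANGE_ME"        # must equal the Director password on the client
}

# On the client, in /etc/bacula/bacula-fd.conf
Director {
  Name = backup-dir             # hypothetical; must equal Director/Name on the server
  Password = "CHANGE_ME"
}

FileDaemon {
  Name = web01-fd
  FDport = 9102
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  FDAddress = 192.0.2.10        # non-loopback address the director can reach
}
```

If the status check fails to connect, a mismatch between these names or passwords is a likely first thing to check.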
Bacula uses FileSets, which are lists of files and folders to back up; Schedules, which define when to run backups; and Jobs, JobDefs, and JobSets (groups of Jobs). We are going to create a simple FileSet, a simple JobDefs and a simple Job. You can copy and paste from the existing content of /etc/bacula/bacula-director.conf. The content of the JobDefs should look as shown in the following screenshot:
The Job type, on the other hand, should reference the JobDefs type, as illustrated here:
And finally the FileSet type, which is referenced from the JobDefs type, should look like this:
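A minimal sketch of how these resources can reference each other is given here. All resource names (WebDefaults, BackupWeb01, WebFiles, WeeklyWeb, web01-fd) are hypothetical, and the Storage, Pool and Messages names are assumed to exist already in your director configuration, as they do in the sample file shipped with the package.

```
JobDefs {
  Name = "WebDefaults"          # hypothetical name
  Type = Backup
  Level = Incremental
  Client = web01-fd             # must match a Client resource
  FileSet = "WebFiles"
  Schedule = "WeeklyWeb"
  Storage = File                # assumed to be defined already
  Pool = File                   # assumed to be defined already
  Messages = Standard           # assumed to be defined already
}

Job {
  Name = "BackupWeb01"
  JobDefs = "WebDefaults"       # inherits every directive above
}

FileSet {
  Name = "WebFiles"
  Include {
    Options {
      signature = MD5
    }
    File = /var/www
  }
}

Schedule {
  Name = "WeeklyWeb"
  Run = Level=Incremental sun at 23:05
}
```

Keeping the shared directives in the JobDefs means that a Job for another web server only needs its own Name and Client.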
This creates a weekly backup for web01 that incrementally brings the contents of /var/www to the server. After setting it up, you need to issue a sudo service bacula-director restart command. Now in bconsole, navigate to Status | Director where you should see:
You should also label the volume: in bconsole, run label, then pick a name for your volume (since you are using file storage, the name matters only for reference) and choose the File pool.
With Bacula, you can issue manual backups when necessary using bconsole. Just use run and select the job you created, then hit Yes. The output should look as in the following screenshot:
Your job should end soon, and you can check it by navigating to Status | Director in bconsole. An OK status (no errors) is pictured as shown in the following screenshot:
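The same bconsole actions can also be scripted instead of typed interactively. The sketch below only prints the commands by default; the job name BackupWeb01 and the volume name web-0001 are hypothetical, and the storage and pool are assumed to be called File.

```shell
#!/bin/sh
# Hypothetical names; replace them with the ones from your configuration.
JOB="BackupWeb01"
VOLUME="web-0001"

# Print the bconsole commands for labelling a volume, running the job
# without a confirmation prompt, and checking the director status.
bconsole_commands() {
    printf 'label storage=File volume=%s pool=File\n' "$VOLUME"
    printf 'run job=%s yes\n' "$JOB"
    printf 'status director\n'
}

# Send the commands to bconsole only when SEND=1 is set and bconsole is
# installed; otherwise just show what would be sent.
if [ "${SEND:-0}" = "1" ] && command -v bconsole >/dev/null 2>&1; then
    bconsole_commands | bconsole
else
    bconsole_commands
fi
```

On the backup server, you would typically run this as root with SEND=1 once the printed commands look right.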
For rsync, you will also need a storage server. You can initiate the backup from either side, and the good news is that the restore works the same way, just by swapping the source and destination on the rsync command line. Let's suppose you are initiating the backup from the client (web server):
rsync -avz /var/www user@backup:/var/backups/webapp
rsync -avz /var/lib/mysql user@backup:/var/backups/mysql
The -avz options are the most popular set of options passed to rsync. z enables compression and a enables the archive mode, which preserves useful things such as symlinks. v is verbose and shows filenames and the sent/received tally as well as the bandwidth used.
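To make the idea of inverting the command line concrete, the following sketch prints the backup command and the matching restore command for a directory. It assumes the user@backup login and the destination directories used above; the function names are made up for this example. Because the source path has no trailing slash, rsync creates a subdirectory with the same name (for example, www) under the destination, so the restore has to pull from that subdirectory.

```shell
#!/bin/sh
REMOTE="user@backup"   # storage server login used in the text

# $1 = local directory, $2 = remote directory that receives it
backup_cmd() {
    printf 'rsync -avz %s %s:%s\n' "$1" "$REMOTE" "$2"
}

# Same arguments as backup_cmd; source and destination are swapped.
restore_cmd() {
    printf 'rsync -avz %s:%s/%s %s/\n' \
        "$REMOTE" "$2" "$(basename "$1")" "$(dirname "$1")"
}

backup_cmd  /var/www /var/backups/webapp
restore_cmd /var/www /var/backups/webapp
```

Adding -n (dry run) to the printed commands lets you see what rsync would transfer before anything is written.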
There's more…
As mentioned before, you should be careful about data not written to disk. Here are some tips:
Stop your database using service mysql stop or service postgresql stop, or flush MySQL tables (http://dev.mysql.com/doc/refman/5.5/en/backup-methods.html) with FLUSH TABLES tbl_list WITH READ LOCK (remember to use UNLOCK TABLES after the backup) if your engine and application model support it
If your application does not handle database unavailability, you might have to stop your web server as well using the service apache2 stop command
Alternatively, use Bacula's application-specific scripts for MySQL (http://dev.mysql.com/doc/refman/5.5/en/backup-methods.html), which use full dumps (this may take a long time depending on your database size and uses a lot of disk I/O, which you'll consume either way since you're backing up a disk)
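The first two tips can be combined into a small wrapper that stops the services, copies the data, and starts everything again in reverse order. This is only a sketch under the assumptions of this recipe (MySQL, Apache and the user@backup storage server); by default it prints the commands instead of running them.

```shell
#!/bin/sh
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Stop the web server first so the application never sees a missing
# database, then stop MySQL, copy its files, and restart in reverse order.
cold_backup() {
    run service apache2 stop
    run service mysql stop
    run rsync -avz /var/lib/mysql user@backup:/var/backups/mysql
    run service mysql start
    run service apache2 start
}

cold_backup
```

Run it as root with DRY_RUN=0 once you are happy with the printed sequence.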
You should also check in bacula-director.conf where you want your files restored. Bacula will put a dummy path (something like /nonexistent/path/…), but you should put something like /var/backups/restore or something meaningful to you. We chose /bacula-restores.
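The restore location is set by the Where directive of the restore job in the director configuration. The following fragment is a sketch only: the job, client and FileSet names are hypothetical, and the Storage, Pool and Messages names are assumed to match resources you already have.

```
Job {
  Name = "RestoreFiles"         # hypothetical name
  Type = Restore
  Client = web01-fd             # hypothetical client name
  FileSet = "WebFiles"          # hypothetical FileSet name
  Storage = File                # assumed to be defined already
  Pool = File                   # assumed to be defined already
  Messages = Standard           # assumed to be defined already
  Where = /bacula-restores      # restored files are written under this path
}
```

Restoring into a separate directory like this lets you inspect the files before copying them over the live ones.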
We suggest that Debian users back up their installed package lists and their responses to debconf, the Debian configuration interface. You can use the following commands to prepare files that can later be backed up by Bacula or manually (debconf-get-selections is provided by the debconf-utils package):
debconf-get-selections > debconf.txt
dpkg --get-selections > packages.txt
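If you want to run these two commands regularly, a small wrapper such as the following sketch can write both files into one directory that your backup job includes. The directory name pkg-state and the helper name save_output are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical output directory; point it at a path covered by your backup.
OUT_DIR="${OUT_DIR:-./pkg-state}"
mkdir -p "$OUT_DIR"

# Usage: save_output FILE COMMAND [ARGS...]
# Writes the command's output to $OUT_DIR/FILE, or skips it with a
# message when the command is not installed.
save_output() {
    target="$1"
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$OUT_DIR/$target"
    else
        echo "skipping $target: $1 not found" >&2
    fi
}

save_output packages.txt dpkg --get-selections
save_output debconf.txt debconf-get-selections
```

Run it as root so that debconf-get-selections can read the full debconf database.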