Preparing your setup
Before we dig into the actual installation process and see how things can be put to work, it makes sense to talk a little bit about the PostgreSQL version numbers. Understanding the PostgreSQL versioning policy will give you valuable insights, especially with respect to your upgrade policy, downtime management, and so on.
Understanding the PostgreSQL version numbers
As you might have already seen, a PostgreSQL version number consists of three numbers separated by dots. The logic of the version number is as follows:
- Minor releases: 9.4.3, 9.4.2, 9.4.1
- Major releases: 9.4.0, 9.3.0, 9.2.0
- N.0.0 releases (super major): 9.0.0, 8.0.0, 7.0.0
The distinction between the preceding three types of releases is pretty important. Why is that? Well, if you happen to upgrade to a new minor release (say, from 9.4.1 to 9.4.3), all you have to do is stop the database and start the new binaries. There is no need to touch the data. In short, the amount of downtime needed is basically close to zero.
Note that a minor release only contains bug fixes, improvements in the documentation, and so on. It will never add new features, change the functionality, or remove existing stuff.
Tip
You can safely update to a more recent minor release to improve reliability. The risk involved is negligible.
In case of a major version change, you definitely have to plan things a little better because the upgrade is more complicated: the data has to be migrated (using pg_dump / pg_restore or pg_upgrade).
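To make the distinction concrete, here is a minimal sketch that classifies an upgrade between two version numbers under the three-part scheme described above. The upgrade_kind function is a hypothetical helper written for this illustration; it is not part of PostgreSQL.

```shell
# Classify an upgrade between two three-part version numbers (X.Y.Z).
# upgrade_kind is a hypothetical helper, not a PostgreSQL tool.
upgrade_kind() {
    old_branch=${1%.*}   # strip the last component: "9.4.1" -> "9.4"
    new_branch=${2%.*}   # "9.4.3" -> "9.4"
    if [ "$old_branch" = "$new_branch" ]; then
        echo "minor"     # same branch: swap the binaries and restart
    else
        echo "major"     # different branch: pg_dump / pg_restore or pg_upgrade
    fi
}

upgrade_kind 9.4.1 9.4.3
upgrade_kind 9.3.6 9.4.1
```

The sketch only compares the first two components, so it treats a jump such as 8.4.x to 9.0.x as a major upgrade as well.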
Choosing the right version
When I am training people, they ask me on a regular basis which version of PostgreSQL they should use. The answer to this question is simple: if you have the ability to decide freely, it is absolutely safe to use the latest stable release of PostgreSQL, even if it is a "zero" release (such as 9.4.0 or 9.3.0).
Installing binary packages
After this little introduction to PostgreSQL versioning, we can move forward and see how binary packages can be installed. Nowadays, most people use binary packages that are shipped with their preferred Linux distribution. These packages are tested, easy to use, and readily available.
In this chapter, we will show you how to install PostgreSQL on Debian or Ubuntu and on Red-Hat-based systems.
Installing PostgreSQL on Debian or Ubuntu
Let's focus on installing PostgreSQL on Debian or Ubuntu first. The key point here is that it is recommended to add the PostgreSQL repositories to Ubuntu. The reason is that many Linux distributions, including Ubuntu, ship very old and outdated versions of PostgreSQL in their standard setup. If you don't want to miss a couple of years of PostgreSQL development, adding the current repositories will be highly beneficial to you. The process of adding the repositories is as follows:
- Create a file called /etc/apt/sources.list.d/pgdg.list, and add a line for the PostgreSQL repository (the following steps can be done as the root user or by using sudo). Alternatively, /etc/apt/sources.list is a place to put the line:
deb http://apt.postgresql.org/pub/repos/apt/ YOUR_DEBIAN_VERSION_HERE-pgdg main
- So, in case of Wheezy, the following line will be useful:
deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main
- Once we add the repository, we can import the signing key:
# wget --quiet -O - \
  https://www.postgresql.org/media/keys/ACCC4CF8.asc | \
  apt-key add -
OK
- Once the key has been added, we can update our package information and install PostgreSQL:
apt-get update
- In our case, we will install PostgreSQL 9.4. Of course, you can also decide to use 9.3 or any other recent version you desire:
apt-get install "postgresql-9.4"
- All relevant packages will be downloaded automatically, and the system will instantly fire up PostgreSQL.
- Once all these steps have been performed, you are ready for action. You can try to connect to the database:
root@chantal:~# su - postgres
$ psql postgres
psql (9.4.1)
Type "help" for help.

postgres=#
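The repository line from the first step can also be generated rather than typed by hand. The following is a minimal sketch; pgdg_repo_line is a hypothetical helper name, and the URL is the one used in the steps above.

```shell
# Build the repository line for a given Debian/Ubuntu release codename.
# pgdg_repo_line is a hypothetical helper; the URL is taken from the steps above.
pgdg_repo_line() {
    echo "deb http://apt.postgresql.org/pub/repos/apt/ ${1}-pgdg main"
}

# Print the line for Wheezy.
pgdg_repo_line wheezy

# To write the file for the running system (as root), one could use the
# following, assuming the lsb_release tool is installed:
#   pgdg_repo_line "$(lsb_release -cs)" > /etc/apt/sources.list.d/pgdg.list
```

The call that writes the file is left as a comment on purpose, so that running the sketch does not change your system.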
Installing PostgreSQL on Red-Hat-based systems
The installation process on Red Hat-based distributions works in a pretty similar way. Many distributions use RPM packages. The following URL lists the distributions for which ready-to-use RPMs are currently available: http://yum.postgresql.org/repopackages.php.
The first thing to do is to install an RPM package containing all the repository information. Once this is done, we can easily fetch PostgreSQL RPMs from the repository and fire things up in almost no time.
In our example, we chose Fedora 20 as our distribution. To enable the repository, we can run the following command (as root):
yum install http://yum.postgresql.org/9.4/fedora/fedora-20-x86_64/pgdg-fedora94-9.4-1.noarch.rpm
Once the repository has been added, we can install PostgreSQL by using the following commands:
yum install postgresql94-server postgresql94-contrib
/usr/pgsql-9.4/bin/postgresql94-setup initdb
systemctl enable postgresql-9.4.service
systemctl start postgresql-9.4.service
The first command (yum install) will fetch the packages from the repository and install them on your server. Once this is done, we can prepare a database instance and initialize it.
Finally, we enable the service and start it up. Our database server is now ready for action.
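The package, setup script, and service names above all embed the version number. The following sketch derives them from a version such as 9.4. The pgdg_rpm_names function is hypothetical, and the naming pattern is inferred from the 9.4 commands shown above; treat it as an assumption for any other version.

```shell
# Derive the Red Hat package, setup script, and service names for a
# major version such as "9.4". The pattern is inferred from the 9.4
# commands above; it is an assumption for other versions.
pgdg_rpm_names() {
    dotted=$1                                      # e.g. "9.4"
    compact=$(printf '%s' "$dotted" | tr -d '.')   # e.g. "94"
    echo "postgresql${compact}-server"
    echo "/usr/pgsql-${dotted}/bin/postgresql${compact}-setup"
    echo "postgresql-${dotted}.service"
}

pgdg_rpm_names 9.4
```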
Compiling PostgreSQL from source
So far, you've seen how to install binary packages. However, in some cases, you might want to compile PostgreSQL from source all by yourself. There are several reasons for this:
- SLAs: You might have to provide an old version, which is not available as a package anymore, to fulfill some SLAs.
- No packages available: On your favorite flavor of Linux, there is most likely always a package containing PostgreSQL available. However, what about AIX, Solaris, HP-UX, and others?
- Custom patches: Some people write custom patches to enhance PostgreSQL.
- Split directories: You might want to split the binary and library directories and make sure that PostgreSQL does not integrate tightly into the existing OS.
- Configure options: You might need custom configure options, for example, to enable dtrace.
How it works
Before we get started, we have to download the tarball from http://ftp.postgresql.org/pub/source/. There, you will find one directory per version of PostgreSQL. In our case, we have downloaded PostgreSQL 9.4.1, and we will use it throughout this chapter.
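If you prefer to fetch the tarball from the command line, the download URL can be assembled from the version number. This is a sketch under one assumption: that each version directory on the server is named with a leading v (for example, v9.4.1). Check the directory listing in your browser if in doubt. The source_tarball_url function is a hypothetical helper.

```shell
# Build the download URL for a given PostgreSQL source release.
# Assumption: the per-version directory is named "v<version>".
source_tarball_url() {
    echo "http://ftp.postgresql.org/pub/source/v${1}/postgresql-${1}.tar.gz"
}

source_tarball_url 9.4.1

# To download the file, one could then run:
#   wget "$(source_tarball_url 9.4.1)"
```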
The first thing we have to do is to extract the tar archive:
tar xvfz postgresql-9.4.1.tar.gz
This will create a directory containing the PostgreSQL source code. Once we have entered this directory, we can call configure, which will then check your system to see if all libraries you need are present. It generates vital parts of the build infrastructure.
Here is how it works:
./configure --prefix=/usr/local/pg941
In our example, we used the simplest of all configurations. We want to install the binaries to a directory called /usr/local/pg941. Note that this is not where the data will end up; it is where the executables will reside. If you don't define --prefix, the default installation path will be /usr/local/pgsql.
Of course, there is a lot more. Try running the following command:
./configure --help
If you run the preceding command, you will see that there are some more features that can be turned on (for example, --with-perl or --with-python) in case you are planning to write stored procedures in Perl or Python.
In some cases, you might find that your operating system lacks libraries needed to compile PostgreSQL properly. Some of the most common candidates are libreadline-dev and zlib1g-dev, as they are called on Debian and Ubuntu (of course, there are some more). These two libraries provide the command-line history in psql and support for compression, respectively. We highly recommend providing both libraries to PostgreSQL.
Tip
Keep in mind that the two previously mentioned libraries have slightly different names on different Linux distributions because every distribution uses its own naming conventions.
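Because package names differ, it can be easier to look for the header files themselves, which are called readline/readline.h and zlib.h regardless of the distribution. The following is a rough sketch only; have_header is a hypothetical helper, the two include directories are common defaults rather than a complete list, and configure remains the authoritative check.

```shell
# Report whether a header file exists in any of the given include directories.
# have_header is a hypothetical helper; configure performs the real check.
have_header() {
    header=$1
    shift
    for dir in "$@"; do
        [ -f "$dir/$header" ] && return 0
    done
    return 1
}

# Look for the readline and zlib headers in two common locations.
for h in readline/readline.h zlib.h; do
    if have_header "$h" /usr/include /usr/local/include; then
        echo "found:   $h"
    else
        echo "missing: $h"
    fi
done
```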
If you are compiling on a more exotic Unix operating system such as Solaris or AIX, we recommend checking the platform-specific notes in the documentation.
We can move forward and actually compile PostgreSQL, using the following commands:
make
make install
You just have to call make and then make install (the latter as root) and wait for a few minutes. In this case, we simply use one CPU core to build PostgreSQL. If you want to scale out the build process to many CPU cores, you can use -j, shown as follows:
make -j 8
The -j 8 option will tell make to do up to 8 things in parallel, if possible. Adding parallelism will usually speed up the build considerably. It is not uncommon to build PostgreSQL in 30 seconds or less if there are enough CPU cores on board.
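Instead of hardcoding the number 8, the job count can be taken from the machine itself. The following is a minimal sketch; build_jobs is a hypothetical helper that relies on the nproc utility or, failing that, on getconf, and falls back to a single job if neither reports a value.

```shell
# Print a job count for "make -j": the number of available CPU cores,
# falling back to 1 if it cannot be determined.
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif n=$(getconf _NPROCESSORS_ONLN 2>/dev/null); then
        echo "$n"
    else
        echo 1
    fi
}

# Show the command that would be used for the build.
echo "make -j $(build_jobs)"
```

Using one job per core is a common starting point, not a rule; on a machine that is busy with other work, a lower number may be the better choice.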
Installing the contrib packages
It is highly recommended to install the PostgreSQL contrib packages as well. Contrib is a set of additional modules that can be used for different purposes, such as creating database links from PostgreSQL to PostgreSQL or adding additional indexing functionality.
If you are installing PostgreSQL from binary packages, you can simply install one more package (for example, postgresql-contrib-9.4 on Debian or Ubuntu, or postgresql94-contrib on Red Hat-based systems). If you happen to install from source, you have to perform the following steps:
cd contrib
make
make install
Of course, you can also use the -j flag again to scale out to more than just one CPU. The make install command will need root permissions again (for example, via sudo).
Finalizing your installation
Once the binaries have been installed, we can move forward and finalize our installation. The following steps are to be carried out in order to finalize our installation:
- Creating and configuring a user to run PostgreSQL
- Creating a database instance
- Deploying the init scripts
If you have installed PostgreSQL from binary packages, the system will automatically create a user for PostgreSQL. If you happen to compile it yourself, you have to create the operating system user yourself too.
Depending on the operating system you are using, this works in a slightly different way. On Ubuntu, for instance, you can call adduser, and on Red Hat or CentOS, you can call useradd. I really recommend looking up the procedure to create a user in your operating system manual.
In general, it's best practice to create a user named postgres; however, any other non-root user will also do. I just recommend sticking to the standard to make life easier on the administration front.
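Before creating the user, it is worth checking whether it already exists. Here is a minimal sketch; user_exists is a hypothetical helper built on the standard id command, and the useradd call in the comment is only one possible way to create the account.

```shell
# Return success if an operating system user with the given name exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

if user_exists postgres; then
    echo "user postgres already exists"
else
    echo "user postgres is missing"
    # One way to create it on many Linux systems (as root):
    #   useradd -m postgres
fi
```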
Once the user has been created, it is, in general, a good idea to prepare your infrastructure for PostgreSQL. This implies adjusting your $PATH environment variable. On most Linux systems, this can be done in your .bash_profile or .bashrc file. Having your favorite PostgreSQL tools in your path will make life a lot easier.
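A snippet along the following lines could go into .bash_profile or .bashrc. It is a sketch that assumes the binaries were installed under /usr/local/pg941, as in our configure example; path_prepend is a hypothetical helper that avoids adding the same directory twice when the file is sourced repeatedly.

```shell
# Prepend a directory to PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Assumes the --prefix used in the configure example.
path_prepend /usr/local/pg941/bin
export PATH
```

After logging in again, calling psql or pg_ctl without a full path should pick up the binaries from this directory.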
Finally, we can add the init scripts to the system. In postgresql-9.4.1/contrib/start-scripts, you will find init scripts for Linux, Mac OS X, and FreeBSD. These scripts are a good starting point to make your init process work as expected.
Creating a database instance
Once we compile PostgreSQL and prepare ourselves to launch PostgreSQL, we can create a so-called PostgreSQL database instance. What is a database instance? Well, whenever you start PostgreSQL, you are actually firing up a database instance. So, the instance is really a central thing; it is that which contains all the databases, users, tablespaces, and so on.
In PostgreSQL, a database instance always resides in a database directory. In our example, we want to create the instance under /data:
mkdir /data
chown postgres:postgres /data
su - postgres
initdb -D /data -E unicode
First, we created the directory and assigned it to the postgres user. Then, we created the database instance. The important part here is that we explicitly stated (-E unicode) that we want UTF-8 to be the default character set in our system. If we don't explicitly tell the system what to use, it will check out the locale settings and use the Unix locale as the default for the instance. This might not be the desired configuration for your setup, so it is better to explicitly define the character set.
Also, instead of using -D here, we can set $PGDATA to tell PostgreSQL where the desired place for the database instance is going to be. There's also an initdb --help command that will reveal a handful of additional configuration options.
At this point, we won't go into all the configuration options as it is out of the scope of this book. However, we will point out some really useful flags, described as follows:
- -A: This defines the default authentication method of local connections. Many people use trust, md5, or peer for this option.
- -E and --locale: These define your desired character set and locale settings.
- -k: This setting makes PostgreSQL create data page checksums. It is highly recommended to use this setting for mission-critical data. The overhead of the page checksums is usually small, so you will get a lot more protection for your data at little cost.
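To see how these flags fit together, the following sketch assembles a complete initdb command line and prints it instead of running it. The initdb_command function is a hypothetical helper, and the directory, locale, and authentication method passed to it are example values, not recommendations. UTF8 is used as the encoding name here; it refers to the same encoding as -E unicode in our earlier example.

```shell
# Print (not run) an initdb command line that combines the flags above.
# Arguments: data directory, locale, authentication method.
initdb_command() {
    echo "initdb -D $1 -E UTF8 --locale=$2 -A $3 -k"
}

# Example values only; adjust them to your own setup.
initdb_command /data en_US.UTF-8 peer
```

Printing the command first makes it easy to review the options before creating the instance as the postgres user.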
Once we create our database instance, we can start our database server.
Firing up PostgreSQL
Firing up PostgreSQL is easy. If we used binary packages, we can use the /etc/init.d/postgresql start or service postgresql start command (as root or by using sudo).
Note that on some Linux distros, it might be necessary to add a version number to the service (for example, /etc/init.d/postgresql-9.4 start). On non-Linux systems, you have to check out your corresponding init routines.
In case you have not installed the start scripts, you can fire up PostgreSQL manually. Assuming that our database instance resides in /data, it works like this:
pg_ctl -D /data -l /dev/null start
In the preceding command, pg_ctl is the tool to control PostgreSQL, -D tells the system where to find the database instance, -l /dev/null tells our database server to send the log information to /dev/null, and start will simply make the instance fire up.
Note that we use -l /dev/null here for simplicity reasons; it discards all log output, so it is not suitable for production. In later chapters, you will learn how to set up proper logging using the PostgreSQL onboard infrastructure.
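Before starting the server manually, it can help to check that the directory really contains an instance. Every directory created by initdb contains a file called PG_VERSION, which the following sketch looks for. The looks_like_instance function is a hypothetical helper, and the log file path is only an example; the sketch prints the pg_ctl command rather than running it.

```shell
# Return success if the directory contains a PG_VERSION file,
# which initdb creates in every database instance.
looks_like_instance() {
    [ -f "$1/PG_VERSION" ]
}

if looks_like_instance /data; then
    # Example log file path; any location writable by postgres will do.
    echo "pg_ctl -D /data -l /data/startup.log start"
else
    echo "/data does not contain a database instance; run initdb first"
fi
```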
Installing PostgreSQL is as simple as that.