Performing simple dumps
Backups and data exports are important because, without them, your database is at risk in the event of crashes and storage-related issues. In PostgreSQL, there are two main methods of performing backups:
- Logical dumps (extracting a SQL script that represents your data)
- Transaction log shipping
The idea behind transaction log shipping is to archive binary changes made to the database. Most people claim that transaction log shipping is the only real way to create backups. However, in my opinion, this is not necessarily true.
Many people rely on pg_dump to simply extract a textual representation of the data. Interestingly, pg_dump is also the oldest method of creating a backup and has been around since the very early days of the PostgreSQL project (transaction log shipping was added much later). Every PostgreSQL administrator becomes familiar with pg_dump sooner or later, so it is important to know how it really works and what it does.
Running pg_dump
In the first part of this section, you will learn some basic things about pg_dump. The first thing we want to do is create a simple textual dump, as shown in the following code block:
[hs@linuxpc ~]$ pg_dump test > /tmp/dump.sql
This is the most simplistic backup you can imagine. Basically, pg_dump logs in to the local database instance, connects to a database called test, and starts to extract all of the data, which will then be sent to stdout and redirected to the file. The beauty here is that the standard output gives you all of the flexibility of a Unix system. You can easily compress the data using a pipe or do whatever else you want with it.
In some cases, you might want to run pg_dump as a different user. All PostgreSQL client programs support a consistent set of command-line parameters to pass user information. If you just want to set the user, use the -U flag, as follows:
[hs@linuxpc ~]$ pg_dump -U whatever_powerful_user test > /tmp/dump.sql
The following set of parameters can be found in all PostgreSQL client programs:
...
Connection options:
  -d, --dbname=DBNAME      database to dump
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before dump
...
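To illustrate, a dump taken over the network might combine several of these options; the hostname and user shown here are purely hypothetical:
[hs@linuxpc ~]$ pg_dump -h db1.example.com -p 5432 -U backup_user -d test > /tmp/dump.sql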
You can just pass the information you want to pg_dump, and if you have the right level of permission, PostgreSQL will fetch the data. The important thing here is to see how the program really works. Basically, pg_dump connects to the database and opens a large, repeatable read transaction that simply reads all of the data committed before the start of the transaction. Remember, a repeatable read ensures that PostgreSQL creates a consistent snapshot of the data, which does not change throughout the transaction. In other words, a dump is always consistent; no foreign keys will be violated. The output is a snapshot of the data as it was when the dump started. Consistency is a key factor here. It also implies that changes made to the data while the dump is running won’t make it into the backup.
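As a rough sketch of this behavior, you could start a dump in the background and change data while it runs; the table name used here is purely hypothetical:
[hs@linuxpc ~]$ pg_dump test > /tmp/dump.sql &
[hs@linuxpc ~]$ sleep 1   # give pg_dump a moment to open its snapshot
[hs@linuxpc ~]$ psql test -c "INSERT INTO t_demo VALUES (1)"
[hs@linuxpc ~]$ wait      # the freshly inserted row is not part of /tmp/dump.sql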
Important note
A dump simply reads everything; therefore, there is no separate permission required to dump data. As long as you can read it, you can back it up.
Also, note that the backup is in a textual format by default. This means that you can safely extract data from, say, Solaris, and move it to another CPU architecture. In the case of binary copies, this is clearly not possible, as the on-disk format depends on your CPU architecture.
Passing passwords and using the service file
If you take a close look at the connection parameters shown in the previous section, you will notice that there is no way to pass a password to pg_dump. You can enforce a password prompt, but you cannot pass the parameter to pg_dump using a command-line option.
The reason for this is simply that the password might show up in the process table and be visible to other people. The question now is this: if pg_hba.conf, which is on the server, enforces a password, how can the client program provide it?
There are various means of doing this. Here are three:
- Using environment variables
- Using .pgpass to pass connection information
- Using service files
In this section, we will learn about all three methods.
Using environment variables
One way to pass all kinds of parameters is to use environment variables. If information is not explicitly passed to pg_dump, it will look for the missing information in predefined environment variables. A list of all potential settings can be found at https://www.postgresql.org/docs/17/static/libpq-envars.html.
The following overview shows some of the environment variables that are commonly needed for backups:
- PGHOST: This tells the system which host to connect to
- PGPORT: This defines the TCP port to be used
- PGUSER: This tells a client program about the desired user
- PGPASSWORD: This contains the password to be used
- PGDATABASE: This is the name of the database to connect to
The advantage of these environment variables is that the password won’t show up in the process table. However, there’s more.
Consider the following example:
psql -U ... -h ... -p ... -d ...
Given that you are a system administrator, would you really want to type a long command such as this several times every day? If you are working with the very same host again and again, just set those environment variables and connect with plain SQL. The following listing shows how to connect to the database by using environment variables to control the desired behavior:
[hs@linuxpc ~]$ export PGHOST=localhost
[hs@linuxpc ~]$ export PGUSER=hs
[hs@linuxpc ~]$ export PGPASSWORD=abc
[hs@linuxpc ~]$ export PGPORT=5432
[hs@linuxpc ~]$ export PGDATABASE=test
[hs@linuxpc ~]$ psql
psql (17.0)
Type "help" for help.
As you can see, there are no command-line parameters anymore. Just type psql and you are in.
Important note
All applications based on the standard PostgreSQL C language client library (libpq) will understand these environment variables, so you can use them not only for psql and pg_dump but also for many other applications.
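For example, assuming the variables exported above are still set, a dump needs no connection flags at all:
[hs@linuxpc ~]$ pg_dump > /tmp/dump.sql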
Using .pgpass to pass connection information
A very common way to store login information is to use the .pgpass file. The idea is simple: put a file called .pgpass into your home directory and add your login information there. The following listing shows the basic format:
hostname:port:database:username:password
An example would be the following:
192.168.0.45:5432:mydb:xy:abc
PostgreSQL offers some nice additional functionality: most fields can contain *.
Here is an example:
*:*:*:xy:abc
The * character implies that on every host, on every port, for every database, the user called xy will use abc as the password. To make PostgreSQL use the .pgpass file, make sure that the right file permissions are in place. Without the following line, things won’t work properly:
chmod 0600 ~/.pgpass
chmod sets the file-level permissions. These are necessary to protect the file. Furthermore, .pgpass can also be used on a Windows system. In that case, the file can be found in the %APPDATA%\postgresql\pgpass.conf path.
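Putting it all together, a minimal setup on Linux might look like the following sketch; the credentials are, of course, just placeholders:
[hs@linuxpc ~]$ echo "localhost:5432:test:hs:abc" >> ~/.pgpass
[hs@linuxpc ~]$ chmod 0600 ~/.pgpass
[hs@linuxpc ~]$ pg_dump -h localhost -U hs test > /tmp/dump.sql   # no password prompt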
Using service files
However, .pgpass is not the only file you can use. You can also make use of service files. Here’s how they work: if you want to connect to the very same servers over and over again, you can create a .pg_service.conf file. It will hold all of the connection information you need.
Here is an example of a .pg_service.conf file:
Mac:~ hs$ cat .pg_service.conf
# a sample service
[hansservice]
host=localhost
port=5432
dbname=test
user=hs
password=abc

[paulservice]
host=192.168.0.45
port=5432
dbname=xyz
user=paul
password=cde
To connect to one of these services, just set the environment variable and connect as follows:
iMac:~ hs$ export PGSERVICE=hansservice
A connection can now be established without passing parameters to psql:
iMac:~ hs$ psql
psql (17.0)
Type "help" for help.
test=#
As you can see, the login works without additional command-line parameters. Alternatively, you can use the following command:
psql service=hansservice
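Since pg_dump also accepts a connection string as the database name, the same service entry can be used for backups; the target file is, again, just an example:
iMac:~ hs$ pg_dump service=hansservice > /tmp/dump.sql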
Now that we have learned how to pass passwords and connection information, let’s move on to learning how to extract subsets of data.
Extracting subsets of data
So far, we have seen how to dump an entire database. However, this may not be what we want to do. In many cases, we just want to extract a subset of tables or schemas. Fortunately, pg_dump can help us do that by providing several switches (a couple of example invocations follow the list):
- -a: This only dumps the data and does not dump the data structure
- -s: This dumps the data structure but skips the data
- -n: This only dumps a certain schema
- -N: This dumps everything but excludes certain schemas
- -t: This only dumps certain tables
- -T: This dumps everything but certain tables (this makes sense if you want to exclude logging tables and so on)
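For instance, the following sketches dump just the structure, a couple of tables, or everything except one schema; the table and schema names are purely hypothetical:
[hs@linuxpc ~]$ pg_dump -s test > /tmp/structure_only.sql
[hs@linuxpc ~]$ pg_dump -t t_customer -t t_order test > /tmp/two_tables.sql
[hs@linuxpc ~]$ pg_dump -N logging test > /tmp/without_logging_schema.sql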
Partial dumps can be very useful to speed things up considerably. Now that we have learned how to perform simple dumps, let’s learn how to handle various file formats.