Managing your cluster
A PostgreSQL cluster is a collection of several databases that all run under the very same PostgreSQL service or instance.
Managing a cluster means being able to start, stop, take control, and get information about the status of a PostgreSQL instance.
From an operating system point of view, PostgreSQL is a service that can be started, stopped, and, of course, monitored. As you saw in the previous chapter, when you install PostgreSQL, you usually also get a set of operating system-specific tools and scripts to integrate PostgreSQL with your operating system service management. Typically, you will find system service files or other operating system-specific tools, like pg_ctlcluster, which is shipped with Debian GNU/Linux and its derivatives.
PostgreSQL ships with a specific tool called pg_ctl, which helps in managing the cluster and the related running processes. This section introduces you to the basic usage of pg_ctl and to the processes that you can encounter in a running cluster. Whichever service management system your operating system uses, pg_ctl is always available to the PostgreSQL administrator in order to take control of a database instance.
pg_ctl
The pg_ctl command-line utility allows you to perform different actions on a cluster, mainly initialize, start, restart, stop, and so on. pg_ctl accepts the command to execute as the first argument, followed by other specific arguments. The main commands are as follows:
- start, stop, and restart execute the corresponding actions on the cluster.
- status reports the current status (running or not) of the cluster.
- initdb (or init for short) executes the initialization of the cluster in a new data directory; it refuses to run if the directory already contains data.
- reload causes the PostgreSQL server to reload the configuration, which is useful when you want to apply configuration changes.
- promote is used when the cluster is running as a replica server (namely, a standby node) and, from now on, must be detached from the original primary, becoming independent (replication will be explained in later chapters).
Generally speaking, pg_ctl interacts mainly with the postmaster (the first process launched within a cluster), which in turn "redirects" commands to other existing processes. For instance, when pg_ctl starts a server instance, it makes the postmaster process run, which in turn completes all the startup activities, including launching other utility processes (as briefly explained in the previous chapter). On the other hand, when pg_ctl stops a cluster, it issues a halt command to the postmaster, which in turn requires other active processes to exit, waiting for them to finish.
The postmaster process is just the very first PostgreSQL-related process launched within the instance; on some systems, there is a process named "postmaster," while on other operating systems, there are only processes named "postgres." Whatever its actual name, the first process launched is referred to as the postmaster. The name postmaster is just that: a name used to identify one process among the others (in particular, the first process launched within the cluster).
pg_ctl needs to know where the PGDATA is located, and this can be specified either by setting an environment variable named PGDATA or by passing it on the command line by means of the -D flag.
Changing the status of a cluster (for example, stopping it) is an action that not every user should be able to perform; usually, only an operating system administrator should be able to interact with services, including PostgreSQL.
PostgreSQL, in order to mitigate the side effects of privilege escalation, does not allow a cluster to be run by privileged users, such as root. Therefore, PostgreSQL is run by a "normal" user, usually named postgres. This unprivileged user owns the PGDATA directory and runs the postmaster process and, therefore, all the processes launched by the postmaster itself. pg_ctl must be run by the same unprivileged operating system user that is going to run the cluster.
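A script that wraps pg_ctl can check this rule up front. The following is a minimal sketch, not part of PostgreSQL: the function name is_privileged_uid is hypothetical, and it only encodes the convention that the superuser has a user ID equal to 0.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with PostgreSQL): succeeds when the
# given numeric user ID is the one of the superuser (UID 0).
is_privileged_uid() {
    [ "$1" -eq 0 ]
}

# Report whether the current user is suitable for running pg_ctl;
# PostgreSQL does not allow root to run a cluster.
if is_privileged_uid "$(id -u)"; then
    echo "running as root: switch to the unprivileged owner of PGDATA first"
else
    echo "running as an unprivileged user (UID $(id -u))"
fi
```

The function takes the user ID as an argument rather than calling id itself, so that it can be exercised with any value.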
If you are using the Docker image, PostgreSQL is already running as the main service, without any need for manual intervention. This means that issuing a stop or a restart command will shut the container down and, therefore, force you out of it.
The status command just queries the cluster to get information, so it is a safe starting point to understand what is happening:
$ pg_ctl status
pg_ctl: server is running (PID: 1)
/usr/lib/postgresql/16/bin/postgres
The command reports back that the server is running, with a Process Identifier (PID) equal to one (this number will be different on your machine). Moreover, the command reports the executable file used to launch the server: in the above example, /usr/lib/postgresql/16/bin/postgres.
If the server is not running for any reason, the pg_ctl command will report an appropriate message to indicate that it is unable to find a running instance of PostgreSQL:
$ pg_ctl status
pg_ctl: no server running
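In scripts, the exit status of the status command is often easier to use than its text output. According to the pg_ctl documentation, the command exits with 0 when the server is running, 3 when it is not running, and 4 when no accessible data directory is specified. The following minimal sketch translates those codes; the function name describe_status_code is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: translate the exit status of `pg_ctl status`
# into a short description.
describe_status_code() {
    case "$1" in
        0) echo "running" ;;
        3) echo "not running" ;;
        4) echo "no accessible data directory" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

# Typical use (requires pg_ctl on the PATH):
#   pg_ctl status >/dev/null 2>&1
#   describe_status_code $?
describe_status_code 3    # prints: not running
```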
In order to report the status of the cluster, pg_ctl needs to know where the database is storing its own data, that is, where the PGDATA is on disk. There are two ways to make pg_ctl aware of where the PGDATA is:
- Setting an environment variable named PGDATA, containing the path of the data directory
- Using the -D command-line flag to specify the path to the data directory
Almost every PostgreSQL cluster-related command looks for the location of PGDATA either in the PGDATA environment variable or in the -D command-line option.
In the previous examples, no PGDATA was specified on the command line because its value was assumed to be provided by an environment variable.
It is quite easy to verify this—for example, in the Docker container:
$ echo $PGDATA
/postgres/16/data
$ pg_ctl status
pg_ctl: server is running (PID: 1)
/usr/lib/postgresql/16/bin/postgres
If your setup does not include a PGDATA environment variable, you can always set it manually before launching pg_ctl or any other cluster-related command:
$ export PGDATA=/postgres/16/data
$ pg_ctl status
pg_ctl: server is running (PID: 1)
The command-line argument, specified with -D, always takes precedence over any PGDATA environment variable, so if you do not set the PGDATA variable, or you set it to a wrong value, but pass the right value on the command line, everything works fine:
$ export PGDATA=/postgres/data # wrong PGDATA!
$ pg_ctl status -D /postgres/16/data
pg_ctl: server is running (PID: 1)
/usr/lib/postgresql/16/bin/postgres "-D" "/postgres/16/data"
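The lookup order can be modeled in a few lines of shell. This is only an illustrative sketch of the precedence rule, not how pg_ctl is implemented: the function name resolve_pgdata is hypothetical, and it only understands the -D flag (either separated from or attached to its value).

```shell
#!/bin/sh
# Hypothetical helper modelling the lookup order described above:
# a -D argument wins, otherwise the PGDATA environment variable is used.
resolve_pgdata() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -D)
                # The path is the next argument.
                [ "$#" -ge 2 ] || return 1
                echo "$2"
                return 0 ;;
            -D?*)
                # The path is attached to the flag, as in -D/some/dir.
                echo "${1#-D}"
                return 0 ;;
        esac
        shift
    done
    # No -D on the command line: fall back to the environment.
    if [ -n "${PGDATA:-}" ]; then
        echo "$PGDATA"
    else
        return 1
    fi
}

PGDATA=/postgres/data                        # wrong PGDATA, as in the example
resolve_pgdata status -D /postgres/16/data   # prints /postgres/16/data
resolve_pgdata status                        # prints /postgres/data
```

When neither source provides a path, the function fails, mirroring the fact that pg_ctl cannot operate without knowing where the data directory is.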
The same rules about PGDATA and the optional -D argument hold for pretty much any "low-level" command that acts against a cluster. They also make clear that, with the same set of executables, you can run multiple instances of PostgreSQL on the same machine, as long as you keep the PGDATA directory of each one separate.
Do not use the same PGDATA directory for multiple versions of PostgreSQL. While it could be tempting, on your own test machine, to have a single PGDATA directory that can be used in turn by a PostgreSQL 16 and a PostgreSQL 15 instance, this will not work as expected and you risk losing all your data. Luckily, PostgreSQL is smart enough to see that PGDATA has been created and used by a different version and refuses to operate, but please be careful not to share the same PGDATA directory between different instances.
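The version check relies on a file named PG_VERSION, which is written into the data directory when the cluster is initialized and contains the major version number. The following minimal sketch reads it before you point a server at a directory; the function name pgdata_major_version is hypothetical, and the demonstration uses a throwaway directory rather than a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the major version recorded in a data
# directory, or fail when the directory has no PG_VERSION file.
pgdata_major_version() {
    if [ -r "$1/PG_VERSION" ]; then
        head -n 1 "$1/PG_VERSION"
    else
        echo "no PG_VERSION file in $1" >&2
        return 1
    fi
}

# Demonstration on a fake data directory, so no server is needed.
demo_dir=$(mktemp -d)
echo "16" > "$demo_dir/PG_VERSION"
pgdata_major_version "$demo_dir"    # prints 16
rm -rf "$demo_dir"
```

On a real installation, you would call the function as pgdata_major_version "$PGDATA" and compare the result with the version of the executables you are about to use.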
pg_ctl can be used to start and stop a cluster by means of appropriate commands. For example, you can start an instance with the start command (assuming a PGDATA environment variable has been set):
$ pg_ctl start
waiting for server to start....
[27765] LOG: starting PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.1.0, 64-bit
[27765] LOG: listening on IPv6 address "::1", port 5432
[27765] LOG: listening on IPv4 address "127.0.0.1", port 5432
[27765] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
[27768] LOG: database system was shut down at 2023-07-19 07:20:24 EST
[27765] LOG: database system is ready to accept connections
done
server started
The start, stop, and restart commands are not usable on the Docker images from this book's repository because such containers run PostgreSQL as the main process; therefore, stopping (or restarting) it will cause the container to exit. Similarly, there is no need to start the service because it is automatically started once the container starts.
The pg_ctl command launches the postmaster process, which prints out a few log lines before redirecting the logs to the appropriate log file. The server started message at the end confirms that the server has started. During the startup, the PID of the postmaster is reported within square brackets; in the above example, the postmaster is the operating system process number 27765.
Now, if you run pg_ctl again to check the server, you will see that it has been started:
$ pg_ctl status
pg_ctl: server is running (PID: 27765)
/usr/pgsql-16/bin/postgres
As you can see, the server is now running and pg_ctl shows the PID of the running postmaster (27765), as well as the executable command line (in this case, /usr/pgsql-16/bin/postgres).
Remember: The postmaster process is the first process ever started in the cluster. Both the backend processes and the postmaster are launched from the postgres executable, and the postmaster is just the root of all PostgreSQL processes, with the main aim of keeping all the other processes under control.
Now that the cluster is running, let's stop it. As you can imagine, stop is the command that instructs pg_ctl to shut the cluster down:
$ pg_ctl stop
waiting for server to shut down....
[27765] LOG: received fast shutdown request
[27765] LOG: aborting any active transactions
[27765] LOG: background worker "logical replication launcher" (PID 27771) exited with exit code 1
[27766] LOG: shutting down
[27766] LOG: checkpoint starting: shutdown immediate
[27766] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.035 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=237 kB; lsn=0/1529DC8, redo lsn=0/1529DC8
[27765] LOG: database system is shut down
done
server stopped
During a shutdown, the system prints a few messages to inform the administrator about what is happening, and as soon as the server stops, the message server stopped confirms that the cluster is no longer running.
Shutting down a cluster can be much more problematic than starting it, and for that reason, it is possible to pass extra arguments to the stop command in order to tell pg_ctl how to proceed. There are three ways of stopping a cluster:
- The smart mode means that the PostgreSQL cluster will gently wait for all the connected clients to disconnect and only then will it shut the cluster down.
- The fast mode will immediately disconnect every client and will shut down the server without having to wait.
- The immediate mode will abort every PostgreSQL process, including client connections, and shut down the cluster in a dirty way, meaning that the server will need some specific activity on the restart to clean up such dirty data (more on this in the next chapters).
In any case, once a stop command is issued, the server will not accept any new incoming connections from clients, and depending on the stop mode you have selected, existing connections will be terminated. The default stop mode, if none is specified, is fast, which forces an immediate disconnection of the clients but ensures data integrity.
If you want to change the stop mode, you can use the -m flag, specifying the mode name, as follows:
$ pg_ctl stop -m smart
waiting for server to shut down........................ done
server stopped
In the preceding example, the pg_ctl command will wait, printing a dot every second, until all the clients disconnect from the server. In the meantime, if you try to connect to the same cluster from another client, you will receive an error, because the server has entered the stopping procedure:
$ psql
psql: error: could not connect to server: FATAL: the database system is shutting down
It is possible to specify just the first letter of the stop mode instead of the whole word: s for smart, i for immediate, and f for fast.
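If you wrap pg_ctl in your own scripts, you may want to normalize the mode name before logging it or passing it on. The following minimal sketch does so; the function name expand_stop_mode is hypothetical, and pg_ctl itself accepts both the long and the short forms without any help.

```shell
#!/bin/sh
# Hypothetical helper: expand a one-letter stop mode into its full name
# and reject anything that is not a known mode.
expand_stop_mode() {
    case "$1" in
        s|smart)     echo "smart" ;;
        f|fast)      echo "fast" ;;
        i|immediate) echo "immediate" ;;
        *)
            echo "unknown stop mode: $1" >&2
            return 1 ;;
    esac
}

expand_stop_mode f    # prints fast
# Typical use (requires pg_ctl on the PATH):
#   pg_ctl stop -m "$(expand_stop_mode i)"
```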
PostgreSQL processes
You have already learned how the postmaster is the root of all PostgreSQL processes, but as explained in Chapter 1, Introduction to PostgreSQL, PostgreSQL will launch multiple different processes at startup. These processes are in charge of keeping the cluster operational and in good health. This section provides a glance at the main processes you can find in a running cluster, allowing you to recognize each of them and their respective purposes.
If you inspect a running cluster from the operating system point of view, you will see a bunch of processes tied to PostgreSQL:
$ pstree -p postgres
postgres(1)─┬─postgres(34)
            ├─postgres(35)
            ├─postgres(37)
            ├─postgres(38)
            └─postgres(39)
$ ps -C postgres -af
postgres 1 0 0 11:08 ? 00:00:00 postgres
postgres 34 1 0 11:08 ? 00:00:00 postgres: checkpointer
postgres 35 1 0 11:08 ? 00:00:00 postgres: background writer
postgres 37 1 0 11:08 ? 00:00:00 postgres: walwriter
postgres 38 1 0 11:08 ? 00:00:00 postgres: autovacuum launcher
postgres 39 1 0 11:08 ? 00:00:00 postgres: logical replication launcher
The PID numbers reported in these examples refer to the Docker container, where the first PostgreSQL process has a PID equal to 1. On other machines, you will get different PID numbers.
As you can see, the process with PID 1 is the one that spawns several other child processes; hence, it is the first and main PostgreSQL process launched and, as such, is usually called the postmaster.
The other processes are as follows:
- checkpointer is the process responsible for executing the checkpoints, which are points in time where the database ensures that all the data is actually stored persistently on the disk.
- background writer is responsible for helping to push the data out of the memory to permanent storage.
- walwriter is responsible for writing out the Write-Ahead Logs (WALs), the logs that are needed to ensure data reliability even in the case of a database crash.
- logical replication launcher is the process responsible for handling logical replication.
Depending on the exact configuration of the cluster, there could be other processes active:
- Background workers: These are processes that can be customized by the user to perform background tasks.
- WAL receiver and/or WAL sender: These are processes involved in receiving data from or sending data to another cluster in replication scenarios.
Many of the concepts and aims of the preceding process list will become clearer as you progress through the book’s chapters, but for now, it is sufficient that you know that PostgreSQL has a few other processes that are always active without any regard to incoming client connections.
When a client connects to your cluster, a new process is spawned: this process, named the backend process, is responsible for serving the client requests (meaning executing the queries and returning the results). You can see and count connections by inspecting the process list:
$ ps -C postgres -af
UID PID PPID C STIME TTY TIME CMD
postgres 1 0 0 11:08 ? 00:00:00 postgres
postgres 34 1 0 11:08 ? 00:00:00 postgres: checkpointer
postgres 35 1 0 11:08 ? 00:00:00 postgres: background writer
postgres 37 1 0 11:08 ? 00:00:00 postgres: walwriter
postgres 38 1 0 11:08 ? 00:00:00 postgres: autovacuum launcher
postgres 39 1 0 11:08 ? 00:00:00 postgres: logical replication launcher
postgres 40 1 0 04:35 ? 00:00:00 postgres: postgres postgres [local] idle
If you compare the preceding list with the previous one, you will see that there is another process with PID 40: this process is a backend process. In particular, this process represents a client connection to the database named postgres.
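Counting connections from the process list can be automated. The following is a rough sketch under stated assumptions: the function name count_backends is hypothetical, and the filter is a heuristic that treats every "postgres: " process title as a client backend unless it is one of the auxiliary processes listed in this section (so other helper processes, such as WAL senders, would be counted too).

```shell
#!/bin/sh
# Hypothetical helper: count client backends in `ps` output read from
# standard input, skipping the postmaster and the auxiliary processes.
count_backends() {
    awk '
        /postgres: / &&
        !/postgres: (checkpointer|background writer|walwriter|autovacuum launcher|logical replication launcher)/ { n++ }
        END { print n + 0 }
    '
}

# Demonstration on a few lines taken from the listing above.
count_backends <<'EOF'
postgres 1 0 0 11:08 ? 00:00:00 postgres
postgres 34 1 0 11:08 ? 00:00:00 postgres: checkpointer
postgres 39 1 0 11:08 ? 00:00:00 postgres: logical replication launcher
postgres 40 1 0 04:35 ? 00:00:00 postgres: postgres postgres [local] idle
EOF
```

On a live system, the same function can be fed directly, as in ps -C postgres -af | count_backends.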
PostgreSQL uses a multi-process approach to concurrency instead of a multi-thread approach. There are different reasons for this: most notably, the isolation and portability that a multi-process approach offers. Moreover, on modern hardware and software, forking a process is no longer such an expensive operation.
Therefore, once PostgreSQL is running, there is a tree of processes rooted at the postmaster. The aim of the postmaster is to spawn new processes when there is a need to handle new database connections, as well as to monitor all maintenance processes to ensure that the cluster is running fine.