So far, we have seen how to create or drop a database and how to list all the databases or schemas or tables. We discussed earlier that there is a base subdirectory inside the data directory of a PostgreSQL cluster that contains the databases and their objects. In this recipe, we are going to discuss some of the interesting details, such as the file system layout of a database and a table in PostgreSQL.
Getting ready
To identify a table or a database, we should have a running PostgreSQL cluster that we could connect using psql. Additionally, it requires access to the data directory on the database server (through the Postgres OS user) to walk through the locations and see the files in reality.
How to do it...
In the following steps, we will see how we could identify the directory specific to the database and how the table appears on the filesystem by locating it:
- We will create a database for test purposes:
$ psql -c "CREATE DATABASE testing"
CREATE DATABASE
- Get the oid value of the database from pg_database as seen in the following command:
$ psql -c "select oid, datname from pg_database"
oid | datname
-------+-----------
13878 | postgres
16384 | testing
1 | template1
13877 | template0
(4 rows)
- Locate the database directory using the oid value from the previous step:
$ ls -ld $PGDATA/base/16384
drwx------. 2 postgres postgres 8192 Sep 1 07:14 /var/lib/pgsql/13/data/base/16384
- Create a table in the newly created database:
$ psql -d testing -c "CREATE TABLE employee (id int)"
CREATE TABLE
- Get the oid value of the table created in the previous step from pg_class as seen in the following command:
$ psql -d testing -c "select oid, relname from pg_class where relname = 'employee'"
oid | relname
-------+----------
16385 | employee
(1 row)
- Get the location of the table on the file system using the pg_relation_filepath function:
$ psql -d testing -c "select pg_relation_filepath('employee')"
pg_relation_filepath
----------------------
base/16384/16385
(1 row)
How it works...
In order to understand the location of a database in the file system, we shall create a database as seen in Step 1. Now, we could use the catalog pg_database to get the oid of the database we created, as seen in Step 2.
As you can see in the log of Step 2, there is a unique identifier for each database. While there are three default databases that are automatically created, the fourth database we have created (testing) has got a unique oid (object identifier) as well. In the previous recipes, we discussed that the base directory contains all the databases of the cluster. So, a subdirectory with the same name as the oid of the database is created inside this base directory. This directory contains all the objects created inside the database.
What this means is that when you create a database in PostgreSQL, a subdirectory with the same name as the oid of the newly created database is created inside the base directory, as seen in Step 3.
Now, let's understand the file system layout of a table. We can see that in action by creating an example table in the testing database as seen in Step 4.
In order to identify the file system layout of the table, there is a function called pg_relation_filepath(). This function can be used to know the location of the table and the file that corresponds to this table. But before using this function, let's see if there is a unique identifier for this table using pg_class as seen in Step 5.
So, from the log, we can see that 16385 is the oid of the employee table we have just created. Now, let's run the function to know where this was table created, as seen in Step 6.
As we can see in the log of Step 6, a file with the name 16385 (the same as the oid of the employee table) is created inside the 16384 directory. The directory 16384 corresponds to the oid of the database we created. So, this indicates that a table in PostgreSQL is created as a file inside the database directory inside the base directory. This base directory is, of course, one of the subdirectories of the data directory.