DBAs may be asked to set up a test server and populate it with test data. Often, that server will be old hardware, possibly with smaller disk sizes. So, the subject of data sampling raises its head.
The purpose of sampling is to reduce the size of the dataset and improve the speed of later analysis. Some statisticians are so used to the idea of sampling that they may not even question whether its use is valid or whether it might cause further complications.
The SQL-standard way to perform sampling is to add the TABLESAMPLE clause to the SELECT statement. This is only available in PostgreSQL 9.5 and later, so we also describe an alternative version of this recipe that doesn't use TABLESAMPLE.
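As a rough illustration of the two approaches mentioned above, the following sketch samples about 5% of a table. The table name `mytable`, the 5% figure, and the seed value 42 are assumptions chosen for the example, not part of the recipe itself. The last query uses a `random()` filter as one possible way to sample on versions without TABLESAMPLE; it is not necessarily the alternative the recipe goes on to describe.

```sql
-- Hypothetical table name: mytable (an assumption for illustration only).

-- BERNOULLI scans the whole table and keeps each row with roughly 5% probability.
SELECT * FROM mytable TABLESAMPLE BERNOULLI (5);

-- SYSTEM selects whole disk blocks instead of individual rows, which is
-- usually faster but gives a more clustered sample. REPEATABLE fixes the
-- seed, so the same sample is returned as long as the table is unchanged.
SELECT * FROM mytable TABLESAMPLE SYSTEM (5) REPEATABLE (42);

-- One possible fallback without TABLESAMPLE (for example, before 9.5):
-- keep each row with roughly 5% probability. This still reads every row.
SELECT * FROM mytable WHERE random() < 0.05;
```

In every case the percentage is approximate: the number of rows returned varies from run to run unless a seed is fixed with REPEATABLE.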