The first part of our process is to decide whether or not we actually need a SAN at all. If the database will remain relatively small, capable of residing easily on local hard drives for several years, we don't need a SAN just yet.
While it might seem arbitrary, setting 3 TB as a cutoff for local storage comes with a few justifications. First, consider the local drives. Even if they were capable of saturating a 6 Gbps disk controller, 3 TB would require over an hour to transfer to another local storage device. Even if that weren't a bottleneck, there is still the network. With a 10 Gbps NIC and assuming no overhead, that's 40 minutes of transfer at full speed.
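For reference, here is the rough arithmetic behind those figures, ignoring all protocol overhead:

    3 TB over a 6 Gbps controller:  3,000 GB / 0.75 GB/s = 4,000 s, or roughly 67 minutes
    3 TB over a 10 Gbps NIC:        3,000 GB / 1.25 GB/s = 2,400 s, or exactly 40 minutes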
That directly affects the speed of backups, synchronization, emergency data restores, and any number of other critical operations. Some RAID cards also require special configuration when handling more than 4 TB of storage, a limit that 3 TB approaches uncomfortably closely if we ever need to extend. SAN devices, on the other hand, can perform storage snapshots for nearly instant data copies intended for other servers. If the other server also uses the same SAN, there's no transfer overhead at all.
Lastly, while RAID arrays can be extended while online, there is a limit imposed by how many local disks are available to our server, either directly in the chassis or through direct-attached storage extensions. If there is any risk we could ever reach that maximum, a SAN works to our advantage, since it imposes none of these inherent limitations.
If a SAN is available for testing, we're still not done. Depending on how the SAN or the storage allocation itself is configured, performance may not be sufficient, so we should verify the manufacturer's claims before committing all of our storage to it.
A very easy way to do this is with a basic pgbench test. The pgbench command is provided by the PostgreSQL software, and it can test various aspects of a server. For our uses, we want to focus on the disk storage. We start by creating a new pgbench database with createdb, so the pgbench command has somewhere to store its test data. The -i option to pgbench tells it to initialize new test data, and the -s option describes the scale of test data we want.
A scale of 4000 creates a database roughly 60 GB in size. Feel free to adjust this scale to be larger than the amount of available RAM, which guarantees that the server cannot cache all of the test data and taint our performance results by inflating the numbers.
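Assuming we name the test database pgbench (any name works, provided the same name is passed to every pgbench invocation), the initialization might look like this:

    createdb pgbench
    pgbench -i -s 4000 pgbench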
After initializing a new test database, we can use a Linux command to instruct the kernel to drop all of its cached data. This means none of our test data is in memory before we start the benchmark. Again, we don't want to inflate our results; otherwise, the SAN looks more capable than it really is.
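On most Linux systems, a sketch of this step (assuming root privileges and a kernel that exposes /proc/sys/vm/drop_caches) looks like the following; the sync call flushes dirty pages first so they can actually be evicted:

    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches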
The test itself comes from pgbench again, which is instructed to only read the test data with the -S option. Furthermore, we tell the benchmark to launch 24 clients with the -c parameter, and to run the test for ten full minutes (600 seconds) with the -T option. While we used 24 clients here, consider any count up to three times the number of available processor cores. The final -j flag launches two concurrent benchmark worker threads, so the pgbench client itself doesn't become CPU-bound and drag down the measured performance.
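Putting those options together, and again assuming the test database is named pgbench, the benchmark invocation might resemble:

    pgbench -S -c 24 -j 2 -T 600 pgbench

When the run completes, pgbench reports latency and transactions per second, which are the numbers to weigh against the vendor's claims.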
This process should reveal how capable the SAN is, and whether our production database will be safe and perform well while relying on remote storage.