Checkpoint overhead
If your database buffer cache is large, it's also possible a large quantity of data could be dirty when a checkpoint starts, causing a checkpoint I/O spike. When using a later PostgreSQL version that supports spread checkpoints, you can tune checkpoint frequency to reduce the average size of these spikes, and you'll also have the pg_stat_bgwriter
view to help you with that optimization.
But ultimately you should realize that every bit of data that's inside the database's shared buffer cache needs to be accounted for at checkpoint time, while the data in the OS cache does not. The flip side to this is that keeping data in shared_buffers
can reduce total write volume, when you modify the same data block more than once per checkpoint—which is common for index blocks in particular. Sizing to optimize has to consider both sides of this trade-off. You should aim for the most frequently used blocks ending up in database shared memory, with the...