Seek/scan cost and statistics parameters
Now, we will look at a few other parameters that can impact the query plan and performance.
The default_statistics_target
parameter tells PostgreSQL how much data should be sampled to populate tables that store metadata. The default value is 100
. PostgreSQL will consider (300*
value) pages. This means, with the default value of 100
, PostgreSQL will read 30,000 pages (or the entire table, if the table is not that big) to do random sampling of rows. From these samples, it will populate the pg_statistic
catalog. The type of data collected include the number of distinct non-null values, most common values, most common frequencies for the values, and so on. These values will be used by the planner to decide the execution plan. Details of columns in pg_statistic
are provided at http://www.postgresql.org/docs/current/static/catalog-pg-statistic.html.
If EXPLAIN ANALYZE
for a query shows a significant variation between the actual and estimates, we should consider...