PK chunking to improve performance
PK (primary key) chunking is a mechanism for extracting the entire contents of a Salesforce table, for example, as part of a backup routine. It works by adding ranges of record IDs (the primary key of every Salesforce entity) as a WHERE
clause filter, so that data is queried from the entity in batches rather than in a single enormous request.
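To illustrate the idea, the following sketch shows the shape of the queries that result once PK chunking is applied; the Account object, the field selected, and the ID boundary values are purely illustrative:

```sql
-- The single query submitted for the export
SELECT Name FROM Account

-- With PK chunking enabled, this is effectively executed as a series of
-- smaller queries, each bounded by a contiguous range of ordered record IDs
SELECT Name FROM Account WHERE Id >= '001300000000000' AND Id < '00130000000132G'
SELECT Name FROM Account WHERE Id >= '00130000000132G' AND Id < '00130000000264W'
-- ...continuing until the highest ID in the table is covered
```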
As a general rule, use PK chunking when exporting data from any Salesforce object with more than 10 million rows, or whenever queries against an object regularly time out.
Because PK chunking splits one big query into a series of smaller queries, each covering a range of ordered IDs, you can control how many records each of those queries returns by setting the chunk size. The default is 100,000 records per batch, and the maximum is 250,000. For a 10-million-row entity, a chunk size of 250,000 therefore yields 40 data batches (10,000,000 / 250,000 = 40).
In Chapter 7, Data Migration, we walked through a practical example of how PK chunking...