Bulkification – querying within loops
Another common mistake in Apex is querying for data within loops. This differs from the repeated querying discussed previously: here, a query with a (potentially) unique result is performed on each iteration of a loop. This is a particular problem within triggers.
When working with a trigger, you should always prepare your code to handle a batch of at least 200 records at once. Although the number of items passed into a single trigger context is capped at 200 for standard and custom objects, processes invoked via the Bulk API or in a batch Apex context may call a trigger multiple times within a transaction. This holds regardless of whether you believe the tool will only ever pass records to the trigger individually; all it takes is an enterprising administrator creating a flow that manipulates multiple records that fire your trigger, or a large volume of data being loaded, and you will have issues.
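As an illustration, the following anonymous Apex sketch (using made-up data purely for demonstration) inserts 450 contacts with a single DML statement; the platform splits these into trigger invocations of up to 200 records each, so our ContactTrigger runs three times within the one transaction:

    //insert 450 contacts in one DML statement; the platform chunks the batch
    //into trigger invocations of up to 200 records (here 200 + 200 + 50)
    List<Contact> contacts = new List<Contact>();
    for(Integer i = 0; i < 450; i++) {
        contacts.add(new Contact(LastName = 'Bulk Test ' + i));
    }
    insert contacts;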
Consider the following code block, in which we loop through each contact we have been given in a Contact trigger and query the related Account record to retrieve some information:
    trigger ContactTrigger on Contact (before insert, after insert) {
        switch on Trigger.operationType {
            when BEFORE_INSERT {
                for(Contact con : Trigger.new) {
                    Account acc = [SELECT UpsellOpportunity__c FROM Account WHERE Id = :con.AccountId];
                    con.Contact_for_Upsell__c = acc.UpsellOpportunity__c != 'No';
                }
            }
            when AFTER_INSERT {
                //after insert code
            }
        }
    }
This simple trigger will set the Contact_for_Upsell__c field to true if the account is marked as having any upsell opportunity.
There are a couple of fairly obvious problems with the way we are querying here. Firstly, this is not bulkified: if 200 records are passed into the trigger (in fact, anything over 100 records), we will break the governor limit for Salesforce Object Query Language (SOQL) queries and receive an exception that we cannot handle. Secondly, this setup is inefficient, as it may retrieve the same account record from the database more than once.
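If you want to watch the limit being consumed, the Limits class can report query usage; a debug statement such as the following, placed inside the loop purely for diagnosis, would show the count climbing by one per contact:

    //logs how many SOQL queries have been issued so far in this transaction,
    //against the limit for the current context
    System.debug('SOQL queries used: ' + Limits.getQueries() + ' of ' + Limits.getLimitQueries());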
A better way to manage this is to gather all of the account IDs in a set and then query once. Not only does this avoid the governor limit, but it also prevents us from querying for duplicate results. An updated version of the code is shown here:
    trigger ContactTrigger on Contact (before insert, after insert) {
        switch on Trigger.operationType {
            when BEFORE_INSERT {
                //gather the account IDs for the whole batch, without duplicates
                Set<Id> accountIds = new Set<Id>();
                for(Contact con : Trigger.new) {
                    accountIds.add(con.AccountId);
                }
                //a single query for the entire batch, keyed by account ID
                Map<Id, Account> accountMap = new Map<Id, Account>(
                    [SELECT UpsellOpportunity__c FROM Account WHERE Id IN :accountIds]
                );
                for(Contact con : Trigger.new) {
                    Account acc = accountMap.get(con.AccountId);
                    //contacts without a matching account are not marked for upsell
                    con.Contact_for_Upsell__c = acc != null && acc.UpsellOpportunity__c != 'No';
                }
            }
            when AFTER_INSERT {
                //after insert code
            }
        }
    }
In this code, we declare a Set<Id> called accountIds to hold the account ID for each contact without duplicates. We then query our Account records into a Map<Id, Account> so that, when looping through each contact a second time, we can set the value correctly.
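To prove that the trigger now behaves correctly at bulk volumes, a test along the following lines could be used. This is a sketch only: it assumes the UpsellOpportunity__c picklist on Account (with an illustrative value of 'Maybe') and the Contact_for_Upsell__c checkbox on Contact exist as described previously, and that no other automation interferes with the insert:

    @isTest
    private class ContactTriggerTest {

        @isTest
        static void setsUpsellFlagForTwoHundredContacts() {
            //an account whose upsell opportunity value is anything other than 'No'
            Account acc = new Account(Name = 'Bulk Test Account', UpsellOpportunity__c = 'Maybe');
            insert acc;

            List<Contact> contacts = new List<Contact>();
            for(Integer i = 0; i < 200; i++) {
                contacts.add(new Contact(LastName = 'Test ' + i, AccountId = acc.Id));
            }

            Test.startTest();
            //a single insert passes all 200 records through the trigger in one batch
            insert contacts;
            Test.stopTest();

            //every contact should be marked for upsell, with no limit exception thrown
            for(Contact con : [SELECT Contact_for_Upsell__c FROM Contact WHERE AccountId = :acc.Id]) {
                System.assertEquals(true, con.Contact_for_Upsell__c);
            }
        }
    }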
Some of you may now be wondering whether we have merely moved our performance issue from having too many queries to having multiple loops through the data. In Chapter 19, Performance Profiling, we will cover the use of big-O notation in detail when discussing scaling. However, to touch on the subject here, it is worth doing some rudimentary analysis. Looping through these records (a maximum of 200) is extremely quick on the central processing unit (CPU) and an inexpensive operation. It also scales linearly as the number of records within the trigger grows. In our original trigger, for each new record, we had the following:
- One loop iteration
- One query
Both resources scale linearly at a rate of 1x; that is, doubling the number of items doubles the resources being utilized, until we reach a point of failure with a governor limit (in this instance, SOQL queries). In our new trigger structure, we have the following for each record:
- Two loop iterations (one for each for loop)
- Zero additional queries
Our new resource usage scales linearly for loop iterations but is constant for queries, which are a more limited resource. As we will see later, this is the type of optimization we want within our code. It is, therefore, imperative that whenever we are looping through records and wish to query related data, we do so in a bulkified manner that, wherever possible, performs a single query for the entire loop.
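One way to keep to this rule as your automation grows is to move the bulkified query behind a small helper method that always accepts a collection. The sketch below shows one possible shape; the AccountSelector class and its method name are illustrative and not part of the trigger code above:

    public with sharing class AccountSelector {

        //returns the accounts for a batch of contacts, keyed by account ID,
        //using a single SOQL query regardless of how many contacts are passed in
        public static Map<Id, Account> forContacts(List<Contact> contacts) {
            Set<Id> accountIds = new Set<Id>();
            for(Contact con : contacts) {
                if(con.AccountId != null) {
                    accountIds.add(con.AccountId);
                }
            }
            return new Map<Id, Account>(
                [SELECT UpsellOpportunity__c FROM Account WHERE Id IN :accountIds]
            );
        }
    }

The trigger can then call AccountSelector.forContacts(Trigger.new) once per invocation, keeping the one-query-per-batch rule in a single, testable place.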
Bulkification – DML within loops
Similar to the issue of querying in loops is that of performing DML within loops. At the time of writing, the limit for DML statements is higher than that for SOQL queries, so this problem is unlikely to surface as early; however, it has the same root cause and the same solution.
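Rather than hard-coding the numbers, you can confirm the limits for your current context at runtime using the Limits class, as in this small anonymous Apex sketch:

    //log the per-transaction governor limits for SOQL queries and DML statements
    //in the current (synchronous) context
    System.debug('SOQL query limit: ' + Limits.getLimitQueries());
    System.debug('DML statement limit: ' + Limits.getLimitDmlStatements());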
Take the following code example, in which we are now in the after insert context for our trigger:
    trigger ContactTrigger on Contact (before insert, after insert) {
        switch on Trigger.operationType {
            when BEFORE_INSERT {
                //previous trigger code
            }
            when AFTER_INSERT {
                for(Contact con : Trigger.new) {
                    if(con.Contact_for_Upsell__c) {
                        Task t = new Task();
                        t.Subject = 'Discuss opportunities with new contact';
                        t.OwnerId = con.OwnerId;
                        t.WhoId = con.Id;
                        //one DML statement per qualifying contact – not bulkified
                        insert t;
                    }
                }
            }
        }
    }
Here, we are creating a Task record for the owner of any new contact that is marked for upsell, so that they can reach out and discuss potential opportunities. In our worst-case bulk scenario, we have 200 contacts that all have the Contact_for_Upsell__c checkbox checked. Each iteration then fires a DML statement, causing a governor limit exception on record 151. Again, using our rudimentary analysis, we can see that for each additional record on the trigger we have an additional DML statement, scaling linearly until we breach the limit.
Instead, whenever making DML statements (particularly in triggers), we should ensure that we are using the bulk format and passing lists of records to be manipulated into the statement. For example, the trigger code should be written as follows:
    trigger ContactTrigger on Contact (before insert, after insert) {
        switch on Trigger.operationType {
            when BEFORE_INSERT {
                //previous trigger code
            }
            when AFTER_INSERT {
                List<Task> tasks = new List<Task>();
                for(Contact con : Trigger.new) {
                    if(con.Contact_for_Upsell__c) {
                        Task t = new Task();
                        t.Subject = 'Discuss opportunities with new contact';
                        t.OwnerId = con.OwnerId;
                        t.WhoId = con.Id;
                        tasks.add(t);
                    }
                }
                //a single DML statement for the entire batch
                insert tasks;
            }
        }
    }
This new code uses a constant number of DML statements, one for the entire operation, and can happily scale up to 200 records.
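As with the query example, a bulk test can confirm this behaviour. The following sketch again assumes the custom fields described previously and an illustrative UpsellOpportunity__c value of 'Maybe'; it inserts 200 contacts that will be marked for upsell and checks that a task is created for each one without breaching the DML limit:

    @isTest
    private class ContactTriggerTaskTest {

        @isTest
        static void createsTasksForTwoHundredUpsellContacts() {
            Account acc = new Account(Name = 'Bulk Task Account', UpsellOpportunity__c = 'Maybe');
            insert acc;

            List<Contact> contacts = new List<Contact>();
            for(Integer i = 0; i < 200; i++) {
                contacts.add(new Contact(LastName = 'Upsell ' + i, AccountId = acc.Id));
            }

            Test.startTest();
            insert contacts;
            Test.stopTest();

            //one task per contact, created by a single DML statement in the trigger
            Set<Id> contactIds = new Map<Id, Contact>(contacts).keySet();
            System.assertEquals(200, [SELECT COUNT() FROM Task WHERE WhoId IN :contactIds]);
        }
    }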