☁️ Apex Batch Processing in Salesforce — A Deep Dive

As Salesforce developers, we eventually run into situations where the volume of data is too large to process in a single transaction. This is when governor limits come into play and force us to rethink our design.

One of the most powerful tools Salesforce gives us to deal with large datasets is Batch Apex.

In this post, I want to walk through some important concepts around Batch Apex that every developer should understand:

  • When to use Trigger vs Batch
  • How many records Batch Apex can process
  • What exactly a chunk (scope) is
  • Why you may hit the 50K query row limit
  • How to design Batch jobs correctly

🤔 When Should You Use a Trigger vs Batch?

One of the most common design questions is deciding whether a requirement should be implemented using a Trigger or a Batch class.

When a Trigger Makes Sense

Triggers are ideal when the logic must run immediately as part of the transaction.

Typical examples include:

  • Setting derived field values
  • Performing validations
  • Updating related records
  • Creating follow-up tasks automatically

Example:

trigger OpportunityTrigger on Opportunity (before insert, before update) {
    for (Opportunity opp : Trigger.new) {
        if (opp.Amount > 100000) {
            opp.Priority__c = 'High';
        }
    }
}

In these situations:

  • The user expects immediate processing
  • The number of records is usually small (≤ 200)
  • Everything runs in a single synchronous transaction

When Batch Apex Is the Right Choice

Batch Apex is designed for large-scale asynchronous processing.

Some examples I’ve seen in projects include:

  • Nightly data cleanup jobs
  • Recalculating values for large datasets
  • Data migrations or historical data updates
  • Reprocessing failed integration records
  • Updating millions of records after introducing new business logic

Example execution:

Database.executeBatch(new OpportunityUpdateBatch(), 200);

A simple rule I usually follow is:

If the process is not user-triggered and may involve thousands or millions of records, it is a good candidate for Batch Apex.
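For recurring work like the nightly cleanup jobs mentioned above, the batch is typically launched from a Schedulable class. A minimal sketch (the scheduler class name and cron expression are illustrative assumptions):

```apex
// Hypothetical scheduler that launches the batch on a schedule.
global class OpportunityUpdateScheduler implements Schedulable {
    global void execute(SchedulableContext sc) {
        Database.executeBatch(new OpportunityUpdateBatch(), 200);
    }
}
```

You would register it once, e.g. from Anonymous Apex, to run every night at 2 AM:

```apex
System.schedule('Nightly Opportunity Update', '0 0 2 * * ?',
                new OpportunityUpdateScheduler());
```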


📦 How Many Records Can Batch Apex Process?

One of the biggest advantages of Batch Apex is that it can handle very large datasets.

A Batch class implements the Database.Batchable interface.

Example structure:

global class OpportunityUpdateBatch implements Database.Batchable<SObject> {

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, StageName FROM Opportunity WHERE StageName = \'Prospecting\''
        );
    }

    global void execute(Database.BatchableContext bc, List<Opportunity> scope) {
        // Process records in this chunk
    }

    global void finish(Database.BatchableContext bc) {
        // Optional post-processing logic
    }
}

Important limits to keep in mind:

  • Maximum records via QueryLocator: 50 million
  • Maximum records via Iterable: 50,000
  • Maximum concurrent batch jobs: 5
  • Maximum jobs in the Flex Queue: 100

Because the records are processed in chunks, Salesforce can safely process millions of records asynchronously.
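If the data source cannot be expressed as a single SOQL query (for example, values assembled in Apex), start() can return an Iterable instead of a QueryLocator — but then the much lower 50,000-record ceiling applies. A sketch of the Iterable variant (class name and values are illustrative):

```apex
// Iterable-based batch: capped at 50,000 records, unlike a QueryLocator.
global class NameListBatch implements Database.Batchable<String> {

    global Iterable<String> start(Database.BatchableContext bc) {
        return new List<String>{ 'Acme', 'Globex', 'Initech' };
    }

    global void execute(Database.BatchableContext bc, List<String> scope) {
        // Process each value in this chunk
    }

    global void finish(Database.BatchableContext bc) {}
}
```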


🧩 What Is a Chunk (Scope)?

When a batch job starts, Salesforce splits the dataset returned by start() into smaller groups called chunks (or scopes).

Example:

Total records = 1,000,000
Chunk size = 200

Salesforce will call execute():

1,000,000 / 200 = 5,000 times

Each execute() call processes only one chunk of records.

Example execution:

Database.executeBatch(new OpportunityUpdateBatch(), 200);

Chunk size limits:

  • Default chunk size: 200
  • Minimum chunk size: 1
  • Maximum chunk size: 2,000

One important thing to remember:

Each chunk runs in its own transaction, which means governor limits reset for every chunk.

This is what allows Batch Apex to handle large volumes of data efficiently.
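One way to see this reset in action is to log the consumed limits inside execute(); the counters start fresh for every chunk. A diagnostic sketch, not production code:

```apex
global void execute(Database.BatchableContext bc, List<Opportunity> scope) {
    // These counters reset at the start of every chunk's transaction
    System.debug('Query rows used: ' + Limits.getQueryRows()
                 + ' of ' + Limits.getLimitQueryRows());
    System.debug('DML statements used: ' + Limits.getDmlStatements()
                 + ' of ' + Limits.getLimitDmlStatements());
    // ... actual processing logic ...
}
```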


🚨 The 50K Query Row Problem

One of the most common issues developers run into with Batch Apex is the 50,000 SOQL row limit per transaction.

This typically happens because of incorrect batch design.

Let’s look at an example.


❌ A Common Anti-Pattern

Suppose the batch starts by querying Opportunities:

global Database.QueryLocator start(Database.BatchableContext bc) {
    // AccountId must be selected here, since execute() reads it
    return Database.getQueryLocator(
        'SELECT Id, AccountId FROM Opportunity WHERE StageName = \'Closed Won\''
    );
}

Inside execute() we query related contacts from Accounts:

global void execute(Database.BatchableContext bc, List<Opportunity> scope) {
    Set<Id> accountIds = new Set<Id>();
    for (Opportunity opp : scope) {
        accountIds.add(opp.AccountId);
    }
    List<Contact> contacts = [
        SELECT Id, Email, AccountId
        FROM Contact
        WHERE AccountId IN :accountIds
    ];
}

At first glance, this looks fine.

But let’s do the math.

Chunk size = 200 opportunities, each on a distinct account
Each account has ~300 contacts

Result:

200 × 300 = 60,000 rows

Salesforce limit:

50,000 rows per transaction

This results in:

System.LimitException: Too many query rows: 50001

🔍 Root Cause

The batch is iterating over a parent object and then querying related child records inside execute() without accounting for the multiplication factor.

This can easily push the query result beyond the 50K row limit.


✅ Correct Design Approaches

Fix #1 — Batch on the Object You Are Actually Processing (Best Practice)

If your logic operates on Contacts, batch Contacts, not Opportunities.

global class ContactUpdateBatch implements Database.Batchable<SObject> {

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, Email FROM Contact'
        );
    }

    global void execute(Database.BatchableContext bc, List<Contact> scope) {
        List<Contact> contactsToUpdate = new List<Contact>();
        for (Contact con : scope) {
            con.Description = 'Processed by batch';
            contactsToUpdate.add(con);
        }
        update contactsToUpdate;
    }

    global void finish(Database.BatchableContext bc) {}
}

This approach avoids parent-child multiplication entirely.


Fix #2 — Reduce Chunk Size

If batching on the parent object is unavoidable, reduce the chunk size.

Example:

Average contacts per account = 250

Theoretical maximum chunk size:

50,000 / 250 = 200

To leave headroom for accounts with more contacts than average, run the batch with a chunk size well below that maximum:

Database.executeBatch(new OpportunityContactBatch(), 50);

However, this approach works only if the child record count is reasonably predictable.


Fix #3 — Use Aggregation for Rollups

For rollup calculations, it’s often more efficient to use an aggregate query.

Example:

SELECT AccountId, COUNT(Id)
FROM Contact
GROUP BY AccountId

This avoids retrieving thousands of child records unnecessarily.
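In Apex, the aggregate results come back as AggregateResult rows, which can be mapped onto the parent records. A sketch of a contact-count rollup (the Contact_Count__c field is a hypothetical custom field):

```apex
Map<Id, Account> accountsToUpdate = new Map<Id, Account>();
for (AggregateResult ar : [
        SELECT AccountId accId, COUNT(Id) cnt
        FROM Contact
        WHERE AccountId != null
        GROUP BY AccountId
]) {
    Id accId = (Id) ar.get('accId');
    Integer cnt = (Integer) ar.get('cnt');
    // Contact_Count__c is a hypothetical custom rollup field
    accountsToUpdate.put(accId, new Account(Id = accId, Contact_Count__c = cnt));
}
update accountsToUpdate.values();
```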


📊 Key Takeaways

  • Trigger: real-time processing within the transaction
  • Batch Apex: asynchronous processing of large datasets
  • Chunk / scope: the records passed to each execute() call
  • Default chunk size: 200
  • Maximum chunk size: 2,000
  • QueryLocator limit: 50 million records
  • SOQL row limit: 50,000 rows per transaction

One lesson I’ve learned while designing batch jobs is this:

If your batch is hitting governor limits, the problem is usually not the limit — it’s the design.
