Configuring Batch Size
Batch size is the number of data records processed together in a single operation, and it has a significant effect on how quickly a task completes. A larger batch size can improve efficiency by reducing data transfer and processing overhead. However, an excessively large batch size can strain system resources and degrade performance. The optimal batch size depends on the characteristics of the data being processed, including its volume, complexity, and variability. In Azure Data Factory (ADF), for instance, a Copy activity copies data from a source to a sink, where the sink is the destination.
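To make the trade-off concrete, the following is a minimal Python sketch, not tied to any Azure API. The helper names batched and count_round_trips are hypothetical and introduced only for illustration. It shows how a stream of records is grouped into batches and how the number of write operations shrinks as the batch size grows.

```python
from itertools import islice
from math import ceil


def batched(records, batch_size):
    """Yield successive lists holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        # Pull the next batch_size items; the last batch may be shorter.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_round_trips(total_records, batch_size):
    """Return how many batches are needed to send total_records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return ceil(total_records / batch_size)
```

With 10,000 records, a batch size of 100 needs 100 writes, while a batch size of 1,000 needs only 10. Each batch, however, must be held in memory in full before it is sent, which is where an oversized batch starts to cost more than it saves.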
Note
This section primarily focuses on the "Configure the batch size" concept of the DP-203: Data Engineering on Microsoft Azure exam.
When the sink is a relational database, one option that can be specified is Write batch size, as shown in Figure 5.26, taken from an ADF pipeline:
Figure 5.26 – Specifying...
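In the JSON definition of a pipeline, the Write batch size option corresponds to the writeBatchSize property of the Copy activity's sink. The following Python sketch builds a dictionary shaped like that sink section. It assumes an Azure SQL Database sink (type AzureSqlSink); the helper name build_sql_sink and the value 10000 are illustrative and not taken from the figure.

```python
import json


def build_sql_sink(write_batch_size):
    """Return a dict shaped like the sink section of a Copy activity."""
    if not isinstance(write_batch_size, int) or write_batch_size < 1:
        raise ValueError("write_batch_size must be a positive integer")
    return {
        # Assumption: the destination is Azure SQL Database.
        "type": "AzureSqlSink",
        # Number of rows inserted into the SQL table per batch.
        "writeBatchSize": write_batch_size,
    }


# Illustrative value only; tune it for the actual workload.
print(json.dumps(build_sql_sink(10000), indent=2))
```

If the property is omitted, the service chooses a batch size on its own based on the row size, so setting it explicitly is mainly useful when tuning a specific workload.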