Splitting Data
ADF provides multiple ways to split data in a pipeline. Splitting lets you partition data into smaller subsets for parallel processing or route data to different branches of the pipeline based on specific criteria, which improves workflow flexibility, performance, scalability, and resource use. The two most important data splitting techniques are Conditional Split and cloning (New branch).
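To make the difference between the two techniques concrete, the following is a minimal conceptual sketch in plain Python. It is not ADF code and does not call any ADF API; the function names `conditional_split` and `new_branch`, the stream names, and the sample rows are all hypothetical and chosen only for illustration. The sketch assumes first-match routing, with rows that match no condition going to a default stream.

```python
from copy import deepcopy
from typing import Callable, Dict, List

Row = Dict[str, object]


def conditional_split(
    rows: List[Row],
    conditions: Dict[str, Callable[[Row], bool]],
    default: str = "default",
) -> Dict[str, List[Row]]:
    """Route each row to the first stream whose condition it satisfies.

    Rows that satisfy no condition go to the default stream.
    (Hypothetical helper that mimics the idea of a conditional split.)
    """
    streams: Dict[str, List[Row]] = {name: [] for name in conditions}
    streams[default] = []
    for row in rows:
        for name, predicate in conditions.items():
            if predicate(row):
                streams[name].append(row)
                break  # first matching condition wins
        else:
            streams[default].append(row)
    return streams


def new_branch(rows: List[Row]) -> List[Row]:
    """Return an independent copy of the whole dataset for a separate flow.

    (Hypothetical helper that mimics the idea of cloning a stream.)
    """
    return deepcopy(rows)


# Illustrative sample data, not taken from the book.
rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 500},
    {"id": 3, "amount": 5000},
]

split = conditional_split(
    rows,
    {
        "small": lambda r: r["amount"] < 100,
        "medium": lambda r: r["amount"] < 1000,
    },
)
branch = new_branch(rows)

for name, stream in split.items():
    print(name, [r["id"] for r in stream])
print("branch", [r["id"] for r in branch])
```

In this sketch, each row ends up in exactly one stream of the split, whereas the branch always contains every row. Changes made to the branch do not affect the original rows because the copy is independent.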
Note
This section primarily focuses on the Split data concept of the DP-203: Data Engineering on Microsoft Azure exam.
Conditional Split routes rows to different output streams based on conditions, whereas the New branch option simply copies the entire dataset into a new execution flow. You have already seen an example of a Conditional Split in Figure 4.25. You will now create...