Recognizing Partitioning Needs in ADLS Gen2
As mentioned in the Benefits of Partitioning section, you can partition data to meet requirements such as performance, scalability, security, and operational overhead. However, there is another reason why you might end up partitioning your data: the I/O bandwidth limits that Azure imposes at the subscription and storage account levels. These limits apply to both Blob Storage and ADLS Gen2.
Note
This section primarily focuses on the "Identify when partitioning is needed in Azure Data Lake Storage Gen2" concept of the DP-203: Data Engineering on Microsoft Azure exam.
The rate at which you ingest data into an Azure Storage system is called the ingress rate, and the rate at which you move the data out of the Azure Storage system is called the egress rate.
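To make the idea concrete, here is a minimal Python sketch of how you might compare a measured ingress rate against a bandwidth limit to decide whether the data needs to be spread across more than one storage account. The function names and the 10 Gbps limit used in the example are hypothetical placeholders for illustration only, not actual Azure limits; always check the current published limits for your account type and region. The same arithmetic applies to egress.

```python
import math


def ingress_rate_gbps(bytes_ingested: int, seconds: float) -> float:
    """Return the average ingress rate in gigabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Convert bytes to bits, then to gigabits (10^9 bits) per second.
    return bytes_ingested * 8 / seconds / 1e9


def accounts_needed(required_gbps: float, limit_gbps: float) -> int:
    """Estimate how many storage accounts are needed to stay under a limit.

    limit_gbps is an assumed per-account bandwidth limit supplied by the
    caller; it is NOT looked up from Azure.
    """
    if limit_gbps <= 0:
        raise ValueError("limit_gbps must be positive")
    return max(1, math.ceil(required_gbps / limit_gbps))


if __name__ == "__main__":
    # Hypothetical workload: 125 GB ingested in 100 seconds.
    rate = ingress_rate_gbps(125_000_000_000, 100)
    # Hypothetical limit of 10 Gbps per account (placeholder value).
    print(rate, accounts_needed(25.0, 10.0))
```

If the estimate comes out greater than one, that is a signal that you should consider partitioning the data across multiple storage accounts so that no single account exceeds its bandwidth limit.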
Table 2.2 shows a snapshot of some of the limits enforced by Azure Blob Storage:
Resource |
Limit ... |