Bulk data processing
Although it is not strictly analytics, bulk data processing is crucial for managing large datasets effectively. It can involve truncating outdated data to reduce storage costs, or supporting schema evolution in NoSQL databases: as application access patterns change or new features are added, datasets may need new attributes, modifications to existing attributes, or removal of deprecated ones. Bulk processing is also often needed to backfill a new attribute across existing items, for example one that will serve as the key of a new secondary index.
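As a minimal sketch of the first case, truncating outdated items with Python and boto3 might look like the following. The table name Orders, the key attributes pk and sk, and the order_date attribute are hypothetical, and a real job would typically add parallel scans and error handling.

```python
import boto3
from boto3.dynamodb.conditions import Attr

# Hypothetical table and attribute names, for illustration only.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Orders")

def truncate_outdated_items(cutoff_date: str) -> None:
    """Delete every item whose order_date is before cutoff_date."""
    scan_kwargs = {
        # Note: a filtered scan still reads (and bills for) the whole table.
        "FilterExpression": Attr("order_date").lt(cutoff_date),
        "ProjectionExpression": "pk, sk",  # fetch only the key attributes
    }
    # batch_writer buffers deletes into BatchWriteItem calls and retries
    # unprocessed items automatically.
    with table.batch_writer() as batch:
        while True:
            response = table.scan(**scan_kwargs)
            for item in response["Items"]:
                batch.delete_item(Key={"pk": item["pk"], "sk": item["sk"]})
            if "LastEvaluatedKey" not in response:
                break
            scan_kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]

truncate_outdated_items("2022-01-01")
```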
To explore when bulk processing is needed for DynamoDB tables, consider these example use cases.
Use case 1 – creating a new Global Secondary Index (GSI)
Consider a scenario where, two years after launch, your application would benefit from a new GSI to support an additional feature. Suppose the sort key of this GSI needs to be a composite attribute, generated by concatenating two existing attributes such as status...
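A backfill job for such a GSI could look like the following sketch, again using boto3. The table and attribute names (Orders, pk, sk, status, order_date) and the new gsi_sk attribute are assumptions for illustration; the second source attribute is not named in the text above, so a date is assumed. Once every item carries gsi_sk, the GSI can be created with it as the sort key (or the GSI can be created first, with the backfilled writes propagating to it).

```python
import boto3

# Hypothetical names: table "Orders", key attributes pk/sk, and source
# attributes status and order_date; only status appears in the text above.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Orders")

def backfill_gsi_sort_key() -> None:
    """Write a composite attribute (e.g. ACTIVE#2023-08-01) to every item
    so it can serve as the sort key of the new GSI."""
    scan_kwargs = {}
    while True:
        response = table.scan(**scan_kwargs)
        for item in response["Items"]:
            composite = f"{item['status']}#{item['order_date']}"
            table.update_item(
                Key={"pk": item["pk"], "sk": item["sk"]},
                UpdateExpression="SET gsi_sk = :v",
                ExpressionAttributeValues={":v": composite},
            )
        if "LastEvaluatedKey" not in response:
            break
        scan_kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]

backfill_gsi_sort_key()
```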