Using change feed for change tracking
When processing high volumes of data in Azure Cosmos DB, you may want to react to every document that is inserted or updated. While it is possible to query each container at a fixed time interval, such a solution has several downsides:
- It is difficult to determine the correct interval.
- You query the data even when nothing in the dataset has changed.
- You do not know which record was added or altered, so you often need to query the whole container.
To address these issues, a feature called change feed was introduced to Azure Cosmos DB. It supports various business cases, including the following:
- Calling an HTTP endpoint with information about an added or updated document
- Processing data in a streaming fashion
- Migrating data with no or limited downtime
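To make the difference from interval-based polling concrete, here is a minimal sketch of the idea behind a change feed: an ordered log of changes that a consumer reads from a saved position. It is a toy in-memory model, not the Azure Cosmos DB SDK. The names `ToyContainer`, `read_changes`, and `process_changes` are hypothetical, and the integer continuation token is a simplifying assumption.

```python
class ToyContainer:
    """In-memory stand-in for a container. NOT the Azure Cosmos DB SDK."""

    def __init__(self):
        self.documents = {}    # current state, keyed by document id
        self._change_log = []  # ordered snapshots of every insert or update

    def upsert(self, doc):
        # Inserts and updates both append a snapshot to the change log.
        self.documents[doc["id"]] = dict(doc)
        self._change_log.append(dict(doc))

    def read_changes(self, continuation=0):
        """Return the changes recorded after `continuation` and a new token.

        The token is simply a position in the log (an assumption of this toy).
        """
        return self._change_log[continuation:], len(self._change_log)


def process_changes(container, handler, continuation=0):
    """Pass each new change to `handler` and return the updated token."""
    changes, token = container.read_changes(continuation)
    for doc in changes:
        handler(doc)  # for example, call an HTTP endpoint or write to a stream
    return token


container = ToyContainer()
container.upsert({"id": "1", "status": "new"})
container.upsert({"id": "2", "status": "new"})

seen = []
token = process_changes(container, seen.append)         # handles both inserts
token = process_changes(container, seen.append, token)  # no changes, no work
container.upsert({"id": "1", "status": "paid"})
token = process_changes(container, seen.append, token)  # handles only the update
```

Because the consumer keeps its position, it never rescans the whole container, and it knows exactly which documents were added or altered. This addresses the polling downsides listed earlier.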
Change feed can be easily integrated with multiple real-time processing tools such as Apache Spark or Apache Storm, giving...