Change Data Capture
Operational systems generally do not retain historical data for extended periods of time. It is therefore essential to maintain an exact replica of the transactional system's data in the data lake, along with its history. This has a few advantages, including a historical audit log of all your transactional data. This wealth of data can also help you unlock novel business use cases and discover data patterns that could take your business to the next level.
Maintaining an exact replica of a transactional system in the data lake means capturing every change made to every transaction in the source system and replicating it in the data lake. This process is generally called change data capture (CDC). CDC requires you not only to capture all new transactions and append them to the data lake, but also to capture any updates or deletes made to existing transactions in the source system. This is not an ordinary feat to achieve on data...
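To make the three kinds of change concrete, here is a minimal sketch in plain Python of applying a batch of change events to a replica while keeping the full history. It is an illustration only, not how any particular data lake engine implements CDC: the ChangeEvent class, its op, key, row, and seq fields, and the apply_changes function are all hypothetical names, and the replica is assumed to be a simple in-memory dictionary keyed by transaction ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change captured from the source system (hypothetical shape)."""
    op: str               # "insert", "update", or "delete"
    key: int              # primary key of the transaction in the source
    row: Optional[dict]   # new row contents; None for deletes
    seq: int              # order in which the change happened at the source


def apply_changes(replica: dict, history: list, events: list) -> dict:
    """Apply change events to the replica in source order.

    Every event is also appended to the history list, so the replica
    holds the current state while the history holds the audit log.
    """
    for event in sorted(events, key=lambda e: e.seq):
        if event.op not in ("insert", "update", "delete"):
            raise ValueError(f"unknown operation: {event.op}")
        history.append(event)
        if event.op == "delete":
            # Remove the row from the current state; history keeps the record.
            replica.pop(event.key, None)
        else:
            # Inserts add a new row; updates overwrite the existing one.
            replica[event.key] = event.row
    return replica


replica = {}
history = []
events = [
    ChangeEvent(op="insert", key=1, row={"amount": 100}, seq=1),
    ChangeEvent(op="insert", key=2, row={"amount": 250}, seq=2),
    ChangeEvent(op="update", key=1, row={"amount": 120}, seq=3),
    ChangeEvent(op="delete", key=2, row=None, seq=4),
]
apply_changes(replica, history, events)
print(replica)
print(len(history))
```

After the batch is applied, the replica contains only transaction 1 with its updated amount, while the history still records all four events, including the deleted transaction. Sorting by the assumed seq field matters because applying an update before its insert, or a delete before a later update, would leave the replica out of step with the source.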