Data contract publishing patterns
Data generators need to be able to publish their data easily and reliably to the interface they provide to their data consumers. That interface will typically be a table in a data warehouse or lakehouse such as Snowflake or Google BigQuery, or a topic in an event streaming platform such as Apache Kafka or Google Pub/Sub.
In this section, we’ll look at the different patterns they can use to publish their data to these systems, and the pros and cons of each.
Perhaps the key consideration is whether you need a transactional guarantee between the source system and the interface you're providing to the data consumer. That guarantee is what ensures consistency between the data in the service and the data used by your data consumers.
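To see why this guarantee matters, here is a minimal sketch of the dual-write problem, where the service writes to its own database and publishes to the consumer-facing interface as two separate, non-atomic steps. The in-memory stores and the `place_order` function are hypothetical stand-ins for a real database and event streaming platform, not part of any specific system described here.

```python
# Hypothetical stand-ins: a dict for the service's database and a list
# for the topic/table that data consumers read from.
orders_db = {}
published_events = []

def place_order(order_id, item, publish_fails=False):
    # Step 1: write the new record to the service's own database.
    orders_db[order_id] = item
    # Step 2: publish to the data consumers' interface. Without a
    # transactional guarantee spanning both steps, a failure here
    # leaves the two stores inconsistent.
    if publish_fails:
        raise RuntimeError("publish failed after the database write")
    published_events.append((order_id, item))

place_order(1, "book")                      # both writes succeed
try:
    place_order(2, "lamp", publish_fails=True)  # publish step fails
except RuntimeError:
    pass

# Order 2 now exists in the database but was never published,
# so consumers see a different view of the data than the service does.
assert 2 in orders_db
assert all(order_id != 2 for order_id, _ in published_events)
```

The patterns discussed in this section differ mainly in how (or whether) they close this gap between the two writes.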
Consider the scenario where you have a user of the system taking some action that results in a new record being written to the service's database – for example, placing an order. Writing to...