Schema evolution
Schema evolution is the practice of adapting to ongoing structural changes in data. As systems mature and gain functionality, their schemas inevitably change, so handling schema evolution is an essential requirement of modern pipelines.
It is customary to begin a project by defining base schemas for the tables that the pipelines use. By the time the pipelines reach production, there is a high likelihood that the schema of some incoming file or table has changed. But why is this such a big problem?
Important
A data engineer should never assume that the schema of incoming data will stay fixed. Instead, build pipelines so that they adjust automatically to this evolution.
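To make the idea of auto-adjusting concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a production implementation: the helper names `evolve_schema` and `conform`, the sample field names, and the policy of filling missing fields with `None` are all hypothetical choices made for this example.

```python
from typing import Any


def evolve_schema(known: dict[str, type], record: dict[str, Any]) -> dict[str, type]:
    """Return a copy of `known` extended with any new fields found in `record`.

    Hypothetical helper: new fields are appended and existing ones are never dropped.
    """
    evolved = dict(known)  # copy, so the base schema is left untouched
    for name, value in record.items():
        if name not in evolved:
            evolved[name] = type(value)
    return evolved


def conform(record: dict[str, Any], schema: dict[str, type]) -> dict[str, Any]:
    """Shape a record to the schema; fields the record lacks become None (assumed policy)."""
    return {name: record.get(name) for name in schema}


# Base schema agreed on at the start of the project (illustrative fields).
base_schema = {"id": int, "name": str}

old_record = {"id": 1, "name": "Ada"}
# A later record arrives with an extra "email" field.
new_record = {"id": 2, "name": "Grace", "email": "grace@example.com"}

schema = evolve_schema(base_schema, new_record)
rows = [conform(r, schema) for r in (old_record, new_record)]
```

The sketch only handles added fields. Renamed, removed, or retyped fields need an explicit policy of their own, and real pipelines usually delegate this work to the storage or processing engine rather than hand-rolling it.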
Let's discuss an example scenario to illustrate this point. Let's assume your pipelines have been deployed in production and that, for a while, you have been ingesting...