Digging into the problem of moving data between two systems
In Chapter 1, What Is Analytics Engineering?, we discussed how the process of extracting, transforming, and loading (ETL) data is changing, but understanding these steps is only part of ingesting data. Whenever you add new data to your data platform, whether that is sales data, currency exchange data, web analytics data, or video footage, you will have to make choices about the frequency, quality, reliability, and retention of that data, among other things. If you do not think ahead at the beginning, reality will catch up with you when your data provider makes a change, a pipeline accidentally runs twice, or the business requirements change. But why do we need to move data from one system to another in the first place?
You might consider it a given that you have to manipulate data a bit to make it fit your purpose. Maybe you’re used to pivoting tables in an Excel sheet, adjusting formulas...