Understanding the uses of ETL
In the most literal terms, ETL (extract, transform, load) refers to a procedure with three conceptual phases: it begins by reading data from a source system and ends by storing a derivative of that data in a target system. Between these deceptively simple steps sits the most important facet of ETL: the transformation from the source system's semantic and physical schema to the domain model expected by the target system. In this step, we are essentially integrating source and target systems that may represent data differently.
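To make the three phases concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the source records, their field names (cust_name, signup, balance_cents), the customer table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical source records: field names, date format, and units are
# illustrative assumptions, not taken from any real system.
SOURCE_ROWS = [
    {"cust_name": " Ada Lovelace ", "signup": "12/10/2021", "balance_cents": "1050"},
    {"cust_name": "Alan Turing", "signup": "23/06/2022", "balance_cents": "99"},
]


def extract(rows):
    """Extract: read raw records from the source (here, an in-memory list)."""
    yield from rows


def transform(record):
    """Transform: map one source record onto the target's domain model."""
    day, month, year = record["signup"].split("/")
    return (
        record["cust_name"].strip(),          # normalize stray whitespace
        f"{year}-{month}-{day}",              # DD/MM/YYYY -> ISO 8601
        int(record["balance_cents"]) / 100,   # cents as text -> currency units
    )


def load(conn, rows):
    """Load: store the transformed records in the target system."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(name TEXT, signup_date TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, source_rows):
    """Chain the three phases: extract, then transform, then load."""
    load(conn, (transform(r) for r in extract(source_rows)))


conn = sqlite3.connect(":memory:")  # stand-in for the target system
run_etl(conn, SOURCE_ROWS)
result = conn.execute(
    "SELECT name, signup_date, balance FROM customer ORDER BY name"
).fetchall()
```

Notice that extract and load are little more than plumbing, while transform carries the knowledge of how the two systems differ: the date format, the unit of the balance, and the shape of each record.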
Much of the academic literature on ETL traces its origin to the expansion of data warehousing concepts in the 1970s. It was a time when businesses rapidly adopted databases and found themselves with multiple data repositories, often using incompatible formats. Sound familiar? Fast forward to today, and not much has changed aside from the date. The ability to integrate data from siloed or incompatible systems continues to be...