Data integration challenges
Traditional analytical systems have typically relied on data integration tools to build pipelines that extract data from source systems, transform it, cleanse it, and finally load it into a data mart or data warehouse.
This data integration approach, also known as "schema on write," can lead to long development lead times because the target data model must be defined before the data movement pipeline can be completed. Meanwhile, physically copying data both multiplies storage costs through duplication and introduces data latency into the pipeline. Data movement and duplication also increase the data management burden of meeting security and compliance requirements, since multiple versions of the same data now exist.
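A minimal sketch of a schema-on-write pipeline helps make the lead-time point concrete: the target table must be defined before a single record can land. The example below is illustrative only, using Python's standard library with a toy inline CSV as the "source system" and an in-memory SQLite table standing in for the warehouse; the orders schema, column names, and cleansing rules are all assumptions, not part of the original text.

import csv
import io
import sqlite3

# Hypothetical source extract: in a real pipeline this would come
# from a source system, not an inline string.
RAW = """order_id,amount,region
1001, 19.99 ,us-east
1002,,eu-west
1003,42.50,us-east
"""

# Schema on write: the target model must exist up front,
# before any data movement can complete.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    amount   REAL NOT NULL,
    region   TEXT NOT NULL)""")

def transform(row):
    # Cleanse: trim whitespace, coerce types, and drop rows
    # that are missing a required amount.
    amount = row["amount"].strip()
    if not amount:
        return None
    return (int(row["order_id"]), float(amount), row["region"].strip())

# Extract, transform, and load only the rows that pass cleansing.
rows = [t for r in csv.DictReader(io.StringIO(RAW)) if (t := transform(r))]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
print(conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall())

Note that any change to the source feed (a new column, a renamed field) forces a change to the CREATE TABLE definition and the transform step before data can flow again, which is exactly where the long development lead times come from.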
This "data movement" pattern is also intrinsic to modern big data architectures. While some data integration velocity challenges have...