Hosting common data pipeline templates
After exploring the data mesh and finding the right data for their project, the next step for data product teams is to access that data directly or move it to their data product landing zone. Small or medium-sized datasets kept in databases or data lakes can sometimes be read straight into a data frame in a Python notebook by using a connection string. But for large datasets, and for data coming from on-premises legacy systems or enterprise resource planning (ERP) and customer relationship management (CRM) systems hosted outside the data mesh, you need pipelines.
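As a minimal sketch of that direct-access path, the snippet below reads a table from an Azure SQL database into a pandas data frame through a SQLAlchemy connection string. The server, database, user, table, and the SQL_PASSWORD environment variable are all placeholders; substitute whatever the data product publishes in the mesh catalog, and whatever authentication method your organization mandates.

```python
import os

import pandas as pd
from sqlalchemy import create_engine

# Assumption: the credential is supplied via an environment variable rather
# than hard-coded in the notebook.
password = os.environ["SQL_PASSWORD"]

# Placeholder server, database, and user -- replace with the values published
# for the data product you discovered in the mesh catalog.
engine = create_engine(
    f"mssql+pyodbc://analyst:{password}@dataproduct-sql.database.windows.net:1433/sales_db"
    "?driver=ODBC+Driver+18+for+SQL+Server"
)

# Pull a manageable slice of the table straight into a data frame for exploration.
df = pd.read_sql("SELECT TOP 1000 * FROM dbo.Orders", engine)
print(df.shape)
print(df.head())
```

This pattern only makes sense while the data fits comfortably in the notebook's memory; anything larger, or anything that has to be refreshed on a schedule, is pipeline territory.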
In Azure, these pipelines are typically built using Azure Data Factory. Just as the sources of these pipelines are common across data products, the storage where the data lands is also fairly standard: typically a data lake or a SQL database. If each data product team starts building pipelines to...
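One way to avoid every team authoring the same ingestion logic is to publish a parameterized pipeline template once in the landing zone and let teams reuse it. The sketch below is one possible shape of such a template using the azure-mgmt-datafactory Python SDK (assuming a recent SDK version together with azure-identity). The subscription, resource group, factory, pipeline, and dataset names are placeholders, and the two datasets referenced are assumed to already exist in the factory.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink,
    BlobSource,
    CopyActivity,
    DatasetReference,
    ParameterSpecification,
    PipelineResource,
)

# Placeholder identifiers -- substitute the landing zone's subscription,
# resource group, and Data Factory instance.
subscription_id = "<subscription-id>"
resource_group = "dp-landing-zone-rg"
factory_name = "dp-landing-zone-adf"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# A single copy step: read from a registered source dataset and land the data
# in the landing zone's lake storage dataset. Both datasets are assumed to be
# defined already in the factory.
copy_step = CopyActivity(
    name="CopySourceToLandingZone",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SourceSystemDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="LandingZoneLakeDataset")],
    source=BlobSource(),
    sink=BlobSink(),
)

# Publish the template as one pipeline with a parameter that consuming teams
# can override; in a real template the datasets would reference this parameter
# to pick the folder or table to ingest.
pipeline = PipelineResource(
    activities=[copy_step],
    parameters={"source_folder": ParameterSpecification(type="String")},
)

adf_client.pipelines.create_or_update(
    resource_group, factory_name, "IngestToLandingZoneTemplate", pipeline
)
```

Publishing the template centrally means the source and sink conventions are encoded once, and each team only supplies the parameters that differ for its data product.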