Managing the foundation for data lakes
Data engineers design, build, and manage the data pipelines, but the foundation of the data lake and data warehouse is the landing zone that is set up specifically for the data platform. Typically, landing zones in the cloud are operated by cloud engineers who take care of the compute, storage, and network resources.
Looking at the management of data platforms, we can distinguish various roles:
- Data architect or engineer: the architect and data engineer are often combined in one role. This role is responsible for the design, development, and deployment of the data pipelines. The engineer must have extensive knowledge of ETL or ELT principles and technologies, making sure that data from sources gets collected and transformed into usable datasets in data warehouses or other data products where the data can be further analyzed. The data also needs to be validated, so data validation is a required skill of the engineer too. In essence, the engineer makes sure that data that is ingested into warehouses...