Data as code – An intelligent pipeline
All the operationalization aspects described in the previous sections would otherwise have to be explicitly coded by DevOps, MLOps, and DataOps personas. A managed platform such as Databricks abstracts this complexity as part of its DLT offering, and the combination of these out-of-the-box capabilities gives rise to intelligent pipelines. There is a shift from a procedural to a declarative definition of a pipeline: as an end user, you specify the "what" of the data transformations and delegate the "how" to the underlying platform. This is especially useful for simplifying ETL development and the go-to-production process when pipelines must be democratized across multiple use cases for large, fast-moving data volumes such as IoT sensor data.
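To make the declarative model concrete, here is a minimal sketch using the DLT Python API. The table names and the storage path are hypothetical, and the `spark` session is assumed to be provided by the DLT runtime, as it is in a Databricks pipeline notebook:

```python
import dlt
from pyspark.sql.functions import col

# Declarative "what": define a raw table over landed IoT files.
# DLT owns the "how": orchestration, checkpointing, retries, scaling.
@dlt.table(comment="Raw IoT sensor readings ingested with Auto Loader")
def raw_sensor_readings():
    return (
        spark.readStream.format("cloudFiles")   # Auto Loader source
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/iot/")              # hypothetical path
    )

# A downstream table; referencing the upstream table via
# dlt.read_stream() is what lets DLT infer the dependency.
@dlt.table(comment="Validated readings with bad rows dropped")
@dlt.expect_or_drop("valid_reading", "temperature IS NOT NULL")
def clean_sensor_readings():
    return dlt.read_stream("raw_sensor_readings").where(
        col("device_id").isNotNull()
    )
```

Because `clean_sensor_readings` references `raw_sensor_readings` through `dlt.read_stream()`, the platform can derive the execution graph from the code itself rather than from an explicitly ordered job definition, which is the first differentiator listed below.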
These are the key differentiators:
- The ability to understand the dependencies of the transformations to generate the underlying DAG...