Data transformation best practices
As seen in previous chapters, analytics engineering embraces software engineering best practices to model, transform, test, deploy, and document data in a reusable way.
When it comes to writing transformation pipelines, SQL is the industry standard. Still, you might also want to use other languages, such as Python or Scala, depending on the tools you use for transformation.
The barrier to entry for writing SQL is low. Because SQL is declarative, a query describes the result you want rather than the steps needed to compute it, which makes it easy to read. Most data specialists already know SQL, so organizations find it easier to hire people who can work with SQL pipelines, an important factor in democratizing transformation capabilities.
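To make the declarative style concrete, here is a minimal sketch that runs a small SQL transformation through Python's standard-library sqlite3 module. The orders table, its columns, and the sample rows are hypothetical and exist only for illustration; a real pipeline would run similar SQL against your warehouse.

```python
import sqlite3

# In-memory database holding a tiny, made-up orders table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 20.0, "completed"),
        (1, 5.0, "cancelled"),
        (2, 12.5, "completed"),
        (1, 30.0, "completed"),
    ],
)

# The query states which result we want (total completed amount per customer);
# the database engine decides how to scan, filter, and aggregate the rows.
query = """
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    WHERE status = 'completed'
    GROUP BY customer_id
    ORDER BY customer_id
"""
totals = conn.execute(query).fetchall()
print(totals)
conn.close()
```

The query reads close to the business question it answers, with no loops or intermediate variables, which is a large part of why SQL is approachable for readers who are not software engineers.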
In this section, we will tackle SQL best practices for your transformation pipelines. We will also mention terminology specific to dbt and Databricks. In dbt, the SQL files in which developers write SELECT statements are called models. In Databricks, code is organized within...