Hands-on – creating data transformations with AWS Glue DataBrew
In Chapter 7, Transforming Data to Optimize for Analytics, we used AWS Glue Studio to create a data transformation job that combined multiple sources into a new table. Earlier in this chapter, we discussed how AWS Glue DataBrew is a popular service among data analysts, so we’ll now use Glue DataBrew to transform a dataset.
Differences between AWS Glue Studio and AWS Glue DataBrew
Both AWS Glue Studio and AWS Glue DataBrew provide a visual interface for designing transformations, and in many use cases, either tool could be used to achieve the same outcome. However, Glue Studio generates Spark code that can be further refined in a code editor and run in any compatible environment. Glue DataBrew does not generate code that can be further refined, although Glue DataBrew recipes can also be run from a Glue Studio job. Glue Studio has fewer built-in transforms, and the transforms it does...