Summary
In this chapter, you learned some best practices for dealing with real-world problems. First, you learned how to run highly selective queries on big fact tables by experimenting with the Glue partition indexing technique, which lets you query very large fact tables and retrieve data efficiently. Next, you learned how to deal with join performance issues between a large fact table and a small dimension table by using the broadcast mechanism to optimize the join operation.
After that, you learned how to deal with dimension tables when something goes wrong and you have no way to split the workload into smaller pieces. Here, you applied Glue bounded execution with Glue bookmarks to limit the number of files processed by incremental workloads. For the edge case where you read a large dimension table, you learned how to configure Glue jobs to use...