You're reading from AWS for Solutions Architects: Design your cloud infrastructure by implementing DevOps, containers, and Amazon Web Services

Product type Paperback

Published in Feb 2021

Publisher Packt

ISBN-13 9781789539233

Length 454 pages

Edition 1st Edition

Tools

AWS

Concepts

Cloud Computing

Table of Contents (20) Chapters

Preface

1. Section 1: Exploring AWS

2. Chapter 1: Understanding AWS Cloud Principles and Key Characteristics

3. Chapter 2: Leveraging the Cloud for Digital Transformation

4. Section 2: AWS Service Offerings and Use Cases

5. Chapter 3: Storage in AWS – Choosing the Right Tool for the Job

6. Chapter 4: Harnessing the Power of Cloud Computing

7. Chapter 5: Selecting the Right Database Service

8. Chapter 6: Amazon Athena – Combining the Simplicity of Files with the Power of SQL

9. Chapter 7: AWS Glue – Extracting, Transforming, and Loading Data the Simple Way

10. Chapter 8: Best Practices for Application Security, Identity, and Compliance

11. Section 3: Applying Architectural Patterns and Reference Architectures

12. Chapter 9: Serverless and Container Patterns

13. Chapter 10: Microservice and Event-Driven Architectures

14. Chapter 11: Domain-Driven Design

15. Chapter 12: Data Lake Patterns – Integrating Your Data across the Enterprise

16. Chapter 13: Availability, Reliability, and Scalability Patterns

17. Section 4: Hands-On Labs

18. Chapter 14: Hands-On Lab and Use Case

19. Other Books You May Enjoy

Summary

In this chapter, we introduced one of the most important services in the AWS stack: AWS Glue. We learned about the high-level components that make up AWS Glue, such as the AWS Glue console, the AWS Glue Data Catalog, AWS Glue crawlers, and AWS Glue code generators. We then learned how these components are connected and how they can be used together. Finally, we spent some time on recommended best practices for architecting and implementing AWS Glue.

Among those best practices, we reviewed how to choose the right worker type when launching an AWS Glue job and how to optimize file sizes when splitting files. We saw what can cause YARN to run out of memory and what can be done to avoid this problem. We learned how the Apache Spark UI can be used for troubleshooting. We also defined data partitioning and predicate pushdown and explained why they are important, along with other best practices and techniques.
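As a brief reminder of why partitioning and predicate pushdown matter, the following is a minimal sketch in plain Python rather than AWS Glue code. It assumes Hive-style partition folders (name=value) in the object keys; the keys, the helper names `parse_partitions` and `prune`, and the year/month layout are all hypothetical and chosen only for illustration. The point is that a filter on partition values can discard files before any of them is read.

```python
from typing import Callable, Dict, Iterable, List


def parse_partitions(key: str) -> Dict[str, str]:
    """Extract Hive-style partition values (name=value) from an object key."""
    partitions: Dict[str, str] = {}
    for segment in key.split("/"):
        if "=" in segment:
            name, value = segment.split("=", 1)
            partitions[name] = value
    return partitions


def prune(
    keys: Iterable[str],
    predicate: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Keep only the keys whose partition values satisfy the predicate.

    No file contents are touched here: the decision is made from the
    key alone, which is the idea behind pushing a predicate down to
    the partition level.
    """
    return [key for key in keys if predicate(parse_partitions(key))]


# Hypothetical object keys, partitioned by year and month.
KEYS = [
    "sales/year=2020/month=12/part-0000.parquet",
    "sales/year=2021/month=01/part-0000.parquet",
    "sales/year=2021/month=02/part-0000.parquet",
    "sales/year=2021/month=02/part-0001.parquet",
]

# Only the February 2021 files would need to be read.
selected = prune(KEYS, lambda p: p["year"] == "2021" and p["month"] == "02")
print(selected)
```

With this layout, a job that only needs February 2021 reads two of the four files. The saving grows with the number of partitions that the predicate excludes.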

In the next chapter, we will learn...