Test your knowledge
Before moving on to the next chapter, test your knowledge with the following questions:
- Assume you have multiple batch and streaming ETL workloads that use different transient EMR clusters for distributed processing. Your organization is looking for a persistent, centralized data catalog that gives the data governance team a unified view. Which is better suited: AWS Glue Data Catalog or Hive Metastore?
- Assume you have an on-premises Hadoop cluster that uses Apache Ranger for fine-grained access control. You are planning to migrate this cluster to Amazon EMR to take advantage of the security, reliability, and scaling capabilities of the cloud. You have configured custom TLS certificates for your Ranger server and plan to use them with EMR. How should you integrate the TLS certificates into EMR?
- Assume you are part of a large enterprise that has multiple departments, each with its own AWS account...