Chapter 8: Understanding Data Governance in Amazon EMR
In previous chapters, you learned how to secure an EMR cluster with IAM policies and data encryption, and how to configure security groups to control network traffic to and from your cluster.
In addition to EMR cluster-level security, you can enable data-level security: you build a centralized data catalog over your datasets and then define fine-grained permissions that control which users can access which databases, tables, or columns in that catalog. Securing your data is as important as securing your infrastructure. When you put security controls on your data, you also need to consider whether the data made available for consumption is in a useful format, with proper data quality checks in place.
That brings us to the focus of this chapter. We will dive deep into the following topics, which will help you implement data governance and granular permission management on your data catalog...