Sensitive data discovery with Amazon Macie
In the previous section, we saw how AWS Lake Formation helps with access control, which is a vital piece of data governance. When certain datasets contain confidential or sensitive data, you can use Lake Formation to grant access to only certain columns by tagging them accordingly and granting access via those tags.
The big assumption we made was that the data stewards of the data lake already know where all the confidential data resides, down to the S3 bucket and filename. In a large data lake implementation with many contributing source systems, finding sensitive data and classifying it accordingly is like finding a needle in a haystack.
Many use cases therefore require that data assets be classified and tagged so that permissions can be granted only to the personas who should have access to the data. Doing this also ensures that such sensitive data is tracked as it migrates...