Fine-grained access control using AWS Lake Formation
One of the biggest challenges in setting up and operating data lakes at scale is making sure all the data is secure. The challenge arises because data in a data lake is spread across multiple S3 buckets and is accessible through many cataloged tables. Setting up a unified permission model that defines who gets access to which portion of the data is not a trivial task. Imagine a very large data lake with thousands of databases and thousands of tables, with 10,000 users continuously trying to access the data. To complicate things further, new users are onboarded every day and new datasets are constantly added to the data lake. Without a robust mechanism to control fine-grained data access across all the datasets, the data lake would become a governance nightmare.
AWS Lake Formation
In a few of the previous chapters, we touched upon AWS Lake Formation as a service that helps in multiple aspects...