Data lake in AWS with Lake Formation
Lake Formation is a fully managed data lake service from AWS that enables data engineers and analysts to build a secure data lake. It provides an orchestration layer that combines AWS services such as S3, RDS, EMR, and Glue to ingest and clean data, with centralized, fine-grained data security management.
With Lake Formation, you can set up your data lake on Amazon S3 and start ingesting readily queryable data. As you add data sources, Lake Formation crawls them and moves the data into your Amazon S3 data lake. It can automatically lay out the data in Amazon S3 partitions, convert it into formats suited to faster analytics, such as Apache Parquet and ORC, and use machine learning to deduplicate and match records to improve data quality.
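As a rough illustration of the first setup step described above, the sketch below registers an S3 location with Lake Formation through the boto3 `lakeformation` client and its `register_resource` call. The bucket name, the prefix, and the helper names `build_register_request` and `register_data_lake_location` are assumptions made for this example. It also assumes the service-linked role is acceptable for your account, which may not hold in every setup. Crawling and format conversion are separate steps and are not shown.

```python
from typing import Any, Dict, Optional


def build_register_request(bucket: str, prefix: str = "") -> Dict[str, Any]:
    """Build the keyword arguments for Lake Formation's RegisterResource call.

    `bucket` and `prefix` are hypothetical inputs for this sketch.
    """
    if not bucket:
        raise ValueError("bucket must be a non-empty S3 bucket name")
    arn = f"arn:aws:s3:::{bucket}"
    cleaned = prefix.strip("/")  # tolerate leading or trailing slashes
    if cleaned:
        arn = f"{arn}/{cleaned}"
    # Assumption: let Lake Formation use its service-linked role to reach S3.
    return {"ResourceArn": arn, "UseServiceLinkedRole": True}


def register_data_lake_location(
    bucket: str, prefix: str = "", client: Optional[Any] = None
) -> str:
    """Register the S3 location with Lake Formation and return its ARN."""
    if client is None:
        # Imported lazily so the request builder works without boto3 installed.
        import boto3

        client = boto3.client("lakeformation")
    request = build_register_request(bucket, prefix)
    client.register_resource(**request)
    return request["ResourceArn"]
```

Calling `register_data_lake_location("example-data-lake", "raw")` with valid AWS credentials would register `arn:aws:s3:::example-data-lake/raw`. Keeping the request builder separate from the API call makes it possible to inspect the ARN before anything is sent to AWS.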
You can define the permissions for your data lake in one place, and they are enforced across all services that access the data, such as Amazon Redshift, Amazon Athena, and Amazon EMR. This reduces...