The Self-Taught Cloud Computing Engineer

Product type: Book
Published: Sep 2023
Publisher: Packt
ISBN-13: 9781805123705
Pages: 472
Edition: 1st
Author: Dr. Logan Song

Table of Contents (24 chapters)

Preface
Part 1: Learning about the Amazon Cloud
Chapter 1: Amazon EC2 and Compute Services
Chapter 2: Amazon Cloud Storage Services
Chapter 3: Amazon Networking Services
Chapter 4: Amazon Database Services
Chapter 5: Amazon Data Analytics Services
Chapter 6: Amazon Machine Learning Services
Chapter 7: Amazon Cloud Security Services
Part 2: Comprehending GCP Cloud Services
Chapter 8: Google Cloud Foundation Services
Chapter 9: Google Cloud’s Database and Big Data Services
Chapter 10: Google Cloud AI Services
Chapter 11: Google Cloud Security Services
Part 3: Mastering Azure Cloud Services
Chapter 12: Microsoft Azure Cloud Foundation Services
Chapter 13: Azure Cloud Database and Big Data Services
Chapter 14: Azure Cloud AI Services
Chapter 15: Azure Cloud Security Services
Part 4: Developing a Successful Cloud Career
Chapter 16: Achieving Cloud Certifications
Chapter 17: Building a Successful Cloud Computing Career
Index
Other Books You May Enjoy

Amazon EMR

Amazon EMR is a managed platform for running many big data tools for data processing. We will start by looking at the concepts of MapReduce and Hadoop.
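To make this concrete, here is a minimal sketch (not an example from this chapter) of how an EMR cluster might be launched programmatically with the AWS SDK for Python (boto3). The cluster name, release label, instance types, and IAM role names are illustrative assumptions; substitute values appropriate for your account.

import boto3

# Create an EMR client (assumes AWS credentials and a default region are configured)
emr = boto3.client("emr")

# Launch a small, illustrative cluster with Hadoop and Spark installed.
# All names, instance types, and role names below are placeholder assumptions.
response = emr.run_job_flow(
    Name="demo-emr-cluster",                      # hypothetical cluster name
    ReleaseLabel="emr-6.15.0",                    # example release label; check the current one
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",        # example instance types
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,                       # 1 primary + 2 core nodes
        "KeepJobFlowAliveWhenNoSteps": True,
        "TerminationProtected": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",            # default EMR roles; yours may differ
    ServiceRole="EMR_DefaultRole",
)

print("Cluster ID:", response["JobFlowId"])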

MapReduce and Hadoop

MapReduce and Hadoop are two related concepts in the field of distributed computing and big data processing.

The idea of MapReduce is “divide and conquer”: decompose a big dataset into smaller ones that can be processed in parallel on distributed computers. It was originally developed by Google for its search engine to handle the massive amounts of data generated by web crawling. The MapReduce programming model involves two functions: a map function that splits the dataset and processes the pieces in parallel, and a reduce function that aggregates the map outputs.
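As an illustration only (not an example from the book), the following is a minimal pure-Python sketch of the model applied to the classic word-count problem. The names documents, map_function, and reduce_function are assumptions made for this sketch.

from collections import defaultdict

# Hypothetical input: in a real cluster, each "document" chunk would live on a separate node.
documents = ["the cloud is big", "the cloud is elastic"]

def map_function(document):
    # Map phase: emit a (key, value) pair for every word in one chunk.
    return [(word, 1) for word in document.split()]

def reduce_function(word, counts):
    # Reduce phase: aggregate all values emitted for the same key.
    return word, sum(counts)

# In a real cluster the map calls run in parallel on different machines;
# here they run sequentially to keep the sketch short.
intermediate = defaultdict(list)
for doc in documents:
    for word, count in map_function(doc):
        intermediate[word].append(count)   # "shuffle" step: group values by key

results = [reduce_function(word, counts) for word, counts in intermediate.items()]
print(results)   # e.g. [('the', 2), ('cloud', 2), ('is', 2), ('big', 1), ('elastic', 1)]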

Hadoop is an open source software framework that implements the MapReduce model. Hadoop consists of two core components: Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed filesystem that can rapidly transfer data between...
