The Self-Taught Cloud Computing Engineer

A comprehensive professional study guide to AWS, Azure, and GCP

Product type: Paperback
Published: Sep 2023
Publisher: Packt
ISBN-13: 9781805123705
Length: 472 pages
Edition: 1st Edition
Author: Dr. Logan Song
Table of Contents

Preface
Part 1: Learning about the Amazon Cloud
Chapter 1: Amazon EC2 and Compute Services
Chapter 2: Amazon Cloud Storage Services
Chapter 3: Amazon Networking Services
Chapter 4: Amazon Database Services
Chapter 5: Amazon Data Analytics Services
Chapter 6: Amazon Machine Learning Services
Chapter 7: Amazon Cloud Security Services
Part 2: Comprehending GCP Cloud Services
Chapter 8: Google Cloud Foundation Services
Chapter 9: Google Cloud’s Database and Big Data Services
Chapter 10: Google Cloud AI Services
Chapter 11: Google Cloud Security Services
Part 3: Mastering Azure Cloud Services
Chapter 12: Microsoft Azure Cloud Foundation Services
Chapter 13: Azure Cloud Database and Big Data Services
Chapter 14: Azure Cloud AI Services
Chapter 15: Azure Cloud Security Services
Part 4: Developing a Successful Cloud Career
Chapter 16: Achieving Cloud Certifications
Chapter 17: Building a Successful Cloud Computing Career
Index
Other Books You May Enjoy

Amazon EMR

Amazon EMR (formerly Amazon Elastic MapReduce) is a managed platform for running big data frameworks such as Apache Hadoop and Apache Spark to process data. We will start by looking at the concepts of MapReduce and Hadoop.

MapReduce and Hadoop

MapReduce and Hadoop are two related concepts in the field of distributed computing and big data processing.

The idea of MapReduce is “divide and conquer”: decompose a big dataset into smaller ones that are processed in parallel on distributed computers. It was originally developed by Google for its search engine to handle the massive amounts of data generated by web crawling. The MapReduce programming model involves two functions: a map function that processes the partitioned datasets in parallel and a reduce function that aggregates the map outputs.
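To make the two functions concrete, here is a minimal word-count sketch in plain Python. It is an illustration of the programming model only, not Hadoop or Amazon EMR code: the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) and the sample input are hypothetical, and everything runs sequentially in one process, whereas a real framework would run the map and reduce calls on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_words(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the intermediate values by key so each key is reduced once."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: aggregate all values emitted for one key."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk stands in for one partition of the big dataset.
    # In a distributed setting, the map calls would run in parallel.
    mapped = chain.from_iterable(map_words(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical input: two small "partitions" of a larger text.
chunks = ["big data big tools", "data tools for big data"]
counts = word_count(chunks)
print(counts)
```

The grouping step between map and reduce is what lets each reduce call work on a single key independently, which is why the reduce work can also be spread across machines.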

Hadoop is an open-source software framework that implements the MapReduce model. Hadoop consists of two core components: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed filesystem that can rapidly transfer data between...
