Chapter 1. What It's All About
This book is about Hadoop, an open source framework for large-scale data processing. Before we get into the details of the technology and its use in later chapters, it is important to spend a little time exploring the trends that led to Hadoop's creation and its enormous success.
Hadoop was not created in a vacuum; it exists because of the explosion in the amount of data being created and consumed, and because this data deluge now reaches small startups as well as huge multinationals. At the same time, other trends have changed how software and systems are deployed, with cloud resources used alongside, or even in preference to, more traditional infrastructure.
This chapter explores some of these trends and explains in detail the specific problems Hadoop seeks to solve and the drivers that shaped its design.
In the rest of this chapter we shall:
Learn about the big data revolution
Understand what Hadoop is and how it can extract value from data
Look into cloud computing and understand what Amazon Web Services provides
See how powerful the combination of big data processing and cloud computing can be
Get an overview of the topics covered in the rest of this book
So let's get on with it!