Hadoop on Cloud
Hadoop is a distributed system that can run across thousands of nodes, and clusters of that size are already in production. In this book, we developed solutions on a single-node cluster. Such a setup is good for learning but not sufficient for a production environment. Setting up even a modest three- or five-node Hadoop cluster at home is rarely feasible because of the hardware cost. In a company, funding a five-node cluster means going through a budget approval process and then ordering the hardware, both of which can take a long time.
Hadoop on cloud offers a good alternative to running a multinode Hadoop setup in your own data center or on company premises. You can use Hadoop on cloud in two ways:
- Deploying Hadoop on cloud servers
- Using Hadoop as a Service
Deploying Hadoop on cloud servers
All cloud service providers let you provision standard Linux- or Windows-based servers by...