Installing Hadoop
There are several ways to install Hadoop. The most common ones are:
- Installing Hadoop from the source files available at https://hadoop.apache.org
- Installing using open source distributions from commercial vendors such as Cloudera and Hortonworks
In this exercise, we will install the Cloudera Distribution of Apache Hadoop (CDH), an integrated platform that bundles Hadoop with several related Apache projects. Cloudera is a popular commercial Hadoop vendor that, in addition to its own release of Hadoop, provides managed services for enterprise-scale Hadoop deployments. In our case, we'll install it as a sandbox image running in a VM environment.
Installing Oracle VirtualBox
A VM environment is essentially a packaged copy of an operating system, often with software preinstalled. The VM can be delivered as a single file, so users can replicate an entire machine by launching that file rather than reinstalling the OS and configuring it to mimic another system. The VM operates in a self...