A single-node Hadoop in the cloud
By now you should have a good understanding of the outcomes you can achieve by running MapReduce jobs in Hadoop or by using other Hadoop components. In this chapter, we will put theory into practice.
We will begin by creating a Linux-based virtual machine on Microsoft Azure with a pre-installed Hortonworks distribution of Hadoop. We opt for a pre-installed, ready-to-use Hadoop because this book is not about Hadoop per se, and we want you to start implementing MapReduce jobs in the R language as soon as possible.
Once you have your Hadoop virtual machine configured and prepared for Big Data crunching, we will present a simple word count example, initially carried out in Java. This example will serve as a point of comparison for a similar job run in R, and a preview sketch of it follows below.
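To give you a rough idea of what such a job looks like before we get there, the following is a minimal sketch of a word count written against the standard org.apache.hadoop.mapreduce API. It is not necessarily the exact listing developed later in the chapter; the class and variable names (WordCount, TokenizerMapper, IntSumReducer) are purely illustrative.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line into tokens and emits a (word, 1) pair per token
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums all counts emitted for a given word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: configures and submits the job; input and output paths
  // are taken from the command line
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a JAR, a job like this would typically be submitted from the command line with hadoop jar, passing the input and output HDFS paths as arguments.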
Finally, we will perform the word count task in the R language. Before that, however, we will guide you through some additional configuration steps and we will explain...