Hadoop and big data
In this section, we'll consider why Hadoop is a good choice for storing and accessing big data.
Imagine you want to process data, and a lot of it. In our previous example, we considered a scenario in which machine-generated web log files are being produced, and we want to use the information in those files to perform some analytics and produce some (hopefully) compelling data visualizations.
Using R worked here, but suppose we extend the scenario: we continue to receive web log files over time, and the size of those files keeps increasing. At that point, R on its own might no longer be a feasible answer.
Entering Hadoop
Hadoop (as the product documentation says) is not your average database. In fact, Hadoop can store all kinds of data from many servers, websites, and corporate vaults, as much as you might need or want to gather. In addition, Hadoop spreads your work across hundreds or thousands of processors and storage drives working in parallel all at the same...