As explained in earlier chapters, Storm is meant for real-time data processing. In most cases, however, you will also need to persist the processed data in a data store so that you can run batch analysis queries against it later. This section explains how to store the data processed by Storm in HBase.
Before moving on to the implementation, here is a brief overview of HBase. HBase is a NoSQL, multidimensional, sparse, horizontally scalable database modeled after Google's Bigtable. HBase is built on top of Hadoop: it stores its data in the Hadoop Distributed File System (HDFS) and integrates well with the MapReduce framework. Hadoop provides the following benefits to HBase:
- A distributed data store that runs on commodity hardware
- Fault tolerance
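To make the "multidimensional, sparse" description of HBase more concrete, the following is a minimal conceptual sketch in plain Java. It is not the HBase client API. It only models a table as nested sorted maps (row key, column family, column qualifier, timestamp, value), which is one common way to picture the data model. The class name `SparseTableSketch` and its methods are hypothetical and exist only for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Conceptual model only: this is NOT the HBase client API.
// All names here are hypothetical and chosen for illustration.
public class SparseTableSketch {

    // row key -> column family -> qualifier -> timestamp -> value
    private final NavigableMap<String,
            NavigableMap<String,
                    NavigableMap<String,
                            NavigableMap<Long, String>>>> rows = new TreeMap<>();

    // Stores one versioned cell. Only cells that are written take up space,
    // which is what "sparse" means in this picture.
    public void put(String row, String family, String qualifier,
                    long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(qualifier, q -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the value with the highest timestamp, or null if the cell
    // was never written.
    public String getLatest(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
                rows.get(row);
        if (families == null) {
            return null;
        }
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = families.get(family);
        if (qualifiers == null) {
            return null;
        }
        NavigableMap<Long, String> versions = qualifiers.get(qualifier);
        if (versions == null || versions.isEmpty()) {
            return null;
        }
        return versions.lastEntry().getValue();
    }

    // Counts distinct cells (row, family, qualifier), ignoring versions.
    public int cellCount() {
        int count = 0;
        for (NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families
                : rows.values()) {
            for (NavigableMap<String, NavigableMap<Long, String>> qualifiers
                    : families.values()) {
                count += qualifiers.size();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        // Two rows with different columns: neither row stores the other's column.
        table.put("user1", "stats", "clicks", 1L, "10");
        table.put("user1", "stats", "clicks", 2L, "12");
        table.put("user2", "profile", "country", 1L, "IN");
        System.out.println(table.getLatest("user1", "stats", "clicks"));
        System.out.println(table.getLatest("user2", "stats", "clicks"));
        System.out.println(table.cellCount());
    }
}
```

In this sketch, rows do not share a fixed set of columns, and a cell that was never written simply has no entry. Keeping the maps sorted mirrors the fact that HBase keeps row keys in sorted order, although a real table is distributed across many servers rather than held in one in-memory map.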
We will assume that you have HBase...