HBase with R
The Apache HBase database allows users to store and process non-relational data on top of HDFS. Inspired by Google's BigTable, HBase is an open source, distributed, consistent, and scalable database that provides real-time read and write access to massive amounts of data. It is in fact a columnar, or key-column-value, data store that has no default schema; instead, users can define the schema at any time.
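To make the key-column-value idea more concrete, the following base R sketch models a tiny store as nested lists. It is only a toy illustration of the data model, not an HBase client: the helper functions put_cell() and get_cell(), the row keys, the family:qualifier column names, and the cell values are all hypothetical and chosen purely for this example.

```r
# Toy in-memory model of a key-column-value store (NOT an HBase client).
# All row keys, column names, and values below are made up for illustration.
store <- list()

# Write one cell. A new "family:qualifier" column can be introduced at any
# time, because no schema is declared up front in this toy model.
put_cell <- function(store, row_key, column, value) {
  if (is.null(store[[row_key]])) {
    store[[row_key]] <- list()
  }
  store[[row_key]][[column]] <- value
  store
}

# Read one cell. Returns NULL when the row or the column is absent.
get_cell <- function(store, row_key, column) {
  row <- store[[row_key]]
  if (is.null(row)) NULL else row[[column]]
}

# Rows do not have to share the same set of columns.
store <- put_cell(store, "row-001", "transaction:price", "250000")
store <- put_cell(store, "row-001", "geography:town", "BRISTOL")
store <- put_cell(store, "row-002", "transaction:price", "180000")

price <- get_cell(store, "row-001", "transaction:price")
missing_cell <- get_cell(store, "row-002", "geography:town")
```

In this sketch, the first row holds two columns and the second only one, so a lookup of a column that was never written for a row simply returns NULL. Values are kept as character strings here as a simplification, since HBase itself stores cell contents as uninterpreted bytes.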
The following tutorial walks you through the essential steps for importing our previously used Land Registry Price Paid Data into the HBase store on a Microsoft Azure HDInsight cluster and then retrieving specific slices of the data using RStudio Server.
Azure HDInsight with HBase and RStudio Server
The process of launching a fully operational Microsoft Azure HDInsight cluster with the HBase database is very similar to the one described in Chapter 4, Hadoop and MapReduce Framework for R, where we guided you through the creation of a multi-node HDInsight Hadoop cluster...