Let's look at the internal mechanics of how reads and writes are executed within a RegionServer instance.
Reads and writes
The HBase write path
HDFS is an append-only file system, so how could a database that supports random record updates be built on top of it?
HBase is a log-structured merge-tree (LSM) database. In an LSM database, data is stored in a multilevel storage hierarchy, and data moves between levels in batches rather than one record at a time. Cassandra is another example of an LSM database.
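The LSM idea described above can be illustrated with a toy, in-memory sketch. This is not HBase's actual implementation. The class name TinyLSM and the flush_threshold parameter are hypothetical, chosen only for illustration. The sketch shows how random updates can be absorbed by a mutable in-memory level and then moved down in one batch as an immutable, sorted segment, which is the kind of write an append-only file system can accept.

```python
import bisect


class TinyLSM:
    """Toy LSM store: one mutable in-memory level plus immutable sorted segments."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}   # upper level: absorbs random updates in memory
        self.segments = []   # lower level: immutable sorted runs, oldest first

    def put(self, key, value):
        # An update never touches existing segments; it only changes the memtable.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Move the whole in-memory level down as one sorted batch."""
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer segments shadow older ones, so search newest first.
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def compact(self):
        """Merge all segments into one, keeping only the newest value per key."""
        merged = {}
        for segment in self.segments:  # oldest first, so later writes win
            merged.update(segment)
        self.segments = [sorted(merged.items())]


store = TinyLSM(flush_threshold=2)
store.put("row1", "a")
store.put("row2", "b")        # reaching the threshold triggers a batch flush
store.put("row1", "c")        # the "update" is just a newer entry in memory
print(store.get("row1"))      # the newest value wins over the flushed one
```

In this sketch an update never rewrites data that has already been flushed. The newer value simply shadows the older one until compact() merges the segments and discards the stale entry.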
When the HBase client issues a write for a key, it first queries ZooKeeper for the location of the RegionServer that hosts the META region. It then queries the META region to find out a table...