System trade-offs
In the introduction to this book, we discussed some of the principles behind the design of distributed systems and the inherent trade-offs we have to choose between when setting out to build one.
How does HBase make those trade-offs? Which aspects of its architecture are affected by these design choices, and what effect do they have on the set of use cases that HBase might be a fit for?
At this point, we already know that HBase range-partitions the key space, dividing it into key ranges that are assigned to different regions. The META table records these range assignments. This is different from Cassandra, which uses consistent hashing and has no central state store that captures where data is placed.
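To make the idea of range partitioning concrete, here is a minimal sketch of how a row key can be mapped to a region by searching a sorted list of region start keys. It is an illustration only, not HBase's actual lookup code: the names REGION_START_KEYS, REGION_SERVERS, and locate_region, along with the sample keys and server names, are hypothetical.

```python
import bisect
from typing import Tuple

# Hypothetical stand-in for the range assignments recorded in the META table:
# sorted region start keys and the server assumed to host each region.
# The first region starts at the empty key, so every row key falls in some region.
REGION_START_KEYS = ["", "g", "n", "t"]
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs1"]


def locate_region(row_key: str) -> Tuple[str, str]:
    """Return (region start key, server) for the region whose range holds row_key."""
    # bisect_right returns the position of the first start key that is strictly
    # greater than row_key; the region just before that position owns the key.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_START_KEYS[idx], REGION_SERVERS[idx]


if __name__ == "__main__":
    for key in ["apple", "g", "melon", "zebra"]:
        print(key, "->", locate_region(key))
```

Because the lookup depends on a table of ranges, some component has to store and serve that table, which is the role the META table plays. With consistent hashing, by contrast, placement is computed from a hash of the key, so no such central record of ranges is required.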
We already know that HBase is an LSM database, converting random writes into a stream of append operations. This allows it to achieve higher write throughput than conventional databases, and also makes layering it on top of HDFS possible...
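The way an LSM design turns random writes into appends can be sketched in a few lines. The toy class below is an assumption-laden illustration, not HBase's implementation: TinyLSM, its flush_threshold parameter, and the in-memory lists standing in for a write-ahead log and on-disk files are all hypothetical.

```python
class TinyLSM:
    """Toy LSM store: writes only append to a log and update an in-memory table."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.log = []        # stand-in for an append-only write-ahead log
        self.memtable = {}   # recent writes, held in memory
        self.segments = []   # immutable sorted segments, oldest first

    def put(self, key, value):
        # A write never modifies existing data in place: it is appended
        # to the log and recorded in the in-memory table.
        self.log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the whole memtable out as one sorted, immutable segment.
        # On a real system this would be a single sequential file write.
        if not self.memtable:
            return
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.log.clear()  # the flushed entries no longer need the log

    def get(self, key):
        # The newest data wins: check the memtable first,
        # then the segments from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None


if __name__ == "__main__":
    store = TinyLSM(flush_threshold=2)
    store.put("b", 1)
    store.put("a", 2)   # reaches the threshold and triggers a flush
    store.put("a", 3)   # the newer value shadows the flushed one
    print(store.get("a"), store.get("b"), len(store.segments))
```

Note that an update to an existing key does not rewrite the earlier segment. The new value is simply written later and shadows the old one on reads, which is why this style of storage can sit on top of a file system where files are written sequentially.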