Compaction
In HBase, each MemStore flush in a Region writes a new HFile for a Column Family, so the number of HFiles grows over time. A read may have to consult many of these files, which hurts read performance. To counter this, HBase performs compaction: it merges files to reduce their number and keep the data manageable. The compaction process chooses which StoreFiles to merge by running an algorithm called the compaction policy. There are two types of compaction: minor compactions and major compactions.
The Compaction policy
The compaction policy is the algorithm used to select the StoreFiles for merging. Two policies are available: ExploringCompactionPolicy and RatioBasedCompactionPolicy. To choose the policy, set the property hbase.hstore.defaultengine.compactionpolicy.class in hbase-site.xml. RatioBasedCompactionPolicy was the default policy prior to HBase 0.96 and is still available. ExploringCompactionPolicy...
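As an illustration of the property described above, a minimal hbase-site.xml fragment that selects RatioBasedCompactionPolicy might look like the following. The fully qualified class name is assumed to live in the org.apache.hadoop.hbase.regionserver.compactions package; verify it against the HBase version in use before relying on it.

```xml
<!-- Sketch only: selects the compaction policy used by the default store engine. -->
<!-- Assumption: the policy class lives in the package shown below. -->
<property>
  <name>hbase.hstore.defaultengine.compactionpolicy.class</name>
  <value>org.apache.hadoop.hbase.regionserver.compactions.RatioBasedCompactionPolicy</value>
</property>
```

The fragment belongs inside the top-level configuration element of hbase-site.xml, and a change to this file typically takes effect only after the RegionServers are restarted.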