HBase Compaction

As we know, the Memstore flushes its sorted data to disk as HFiles, so write operations generate many small HFiles over time. A large number of small files increases the number of disk seeks needed to serve a read. HBase compaction is the process in which HBase picks some of these files and merges them into fewer, larger ones.

There are two types of HBase compaction.

  1. Minor compaction
  2. Major compaction

Minor Compaction:

In minor compaction, several smaller HFiles are merged into fewer, bigger HFiles.
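The idea can be illustrated with a toy model. The sketch below is not HBase's actual implementation: it assumes each HFile is just a sorted Python list of `(row_key, value)` tuples, and the selection rule ("take the `max_files` smallest files") is a simplification chosen for illustration. The function name `minor_compact` and its `max_files` parameter are hypothetical.

```python
import heapq

def minor_compact(hfiles, max_files=3):
    """Toy minor compaction: merge the `max_files` smallest sorted files
    into one sorted file and leave the remaining files untouched."""
    # Assumption: pick the smallest files first (real HBase uses its own policy).
    by_size = sorted(hfiles, key=len)
    selected, untouched = by_size[:max_files], by_size[max_files:]
    # Each input is already sorted, so a k-way merge keeps the output sorted.
    merged = list(heapq.merge(*selected))
    return untouched + [merged]

# Four small "HFiles", each sorted by row key.
hfiles = [
    [("row1", "a"), ("row7", "g")],
    [("row3", "c")],
    [("row2", "b"), ("row9", "i")],
    [("row0", "z"), ("row4", "d"), ("row5", "e"), ("row8", "h")],
]

compacted = minor_compact(hfiles)
print(len(hfiles), "files before,", len(compacted), "files after")
```

Only some of the files are rewritten here, which is what keeps a minor compaction comparatively cheap.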

Major Compaction:

In major compaction, all the HFiles of a store (one column family of a region) are merged into a single, bigger HFile. A major compaction involves a lot of disk writes and I/O, so it is best scheduled when the load on the server is low, for example over the weekend. Because data that is already on disk gets written again, this effect is sometimes called write amplification.
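A minimal sketch of why this is expensive, under the same toy assumption that an HFile is a sorted list of `(row_key, value)` tuples. The function `major_compact` is a hypothetical name, and real HBase does more work during this step than a plain merge.

```python
import heapq

def major_compact(hfiles):
    """Toy major compaction: merge every sorted file into one sorted file."""
    return [list(heapq.merge(*hfiles))]

hfiles = [
    [("row1", "a"), ("row7", "g")],
    [("row3", "c")],
    [("row2", "b"), ("row9", "i")],
]

cells_before = sum(len(f) for f in hfiles)
compacted = major_compact(hfiles)
cells_rewritten = len(compacted[0])

# Every cell that was already on disk is written out again.
print(cells_before, "cells on disk,", cells_rewritten, "cells rewritten")
```

In this model every existing cell is rewritten, so the cost grows with the total amount of data rather than with the size of the most recent flushes.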

Compaction Tuning:

HBase allows users to control the compaction process. Compaction can be enabled or disabled with the following property.

 hbase.regionserver.compaction.enabled = true 

The following property specifies the time between major compactions, expressed in milliseconds.

 hbase.hregion.majorcompaction = 604800000 

Set it to ‘0’ to disable time-based automatic major compaction. The default interval is 7 days (604800000 ms).
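Since the interval is given in milliseconds, it is easy to get a digit wrong when writing it by hand. The small helper below is only a sketch: `major_compaction_interval_ms` and `render_property` are hypothetical names, and the output is assumed to be pasted into `hbase-site.xml` in the usual `<property>` form.

```python
from datetime import timedelta

def major_compaction_interval_ms(days):
    """Convert an interval in days to the millisecond value HBase expects."""
    return int(timedelta(days=days).total_seconds() * 1000)

def render_property(name, value):
    """Render one name/value pair as an hbase-site.xml <property> element."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )

weekly = major_compaction_interval_ms(7)  # 7 days in milliseconds
snippet = render_property("hbase.hregion.majorcompaction", weekly)
print(snippet)
```

Passing `0` days yields `0`, which matches the value used to turn off time-based major compaction.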
