HBase Introduction
Apache Hadoop is a popular tool for storing and processing data, but it cannot handle random reads and writes efficiently. HDFS files are also write-once: data can be appended, but an existing file cannot be edited in place. HBase was built to overcome these limitations, and its design is based on Google’s Bigtable.
HBase is a scalable, distributed database that runs on the Hadoop platform and supports real-time reads and writes. It uses HDFS as its storage layer, so its capacity is determined by the capacity of the Hadoop cluster. HBase is also highly fault-tolerant because it builds on Hadoop’s fault-tolerance mechanisms, such as HDFS data replication. Let’s look at the limitations of Hadoop to understand HBase better.
As we know, Hadoop is a great tool for processing huge volumes of data, but it is not a database. The table below sets a few limitations of Hadoop against the corresponding HBase features.
Hadoop | HBase |
Hadoop has no index for locating a particular record within a file. | HBase stores data sorted by row key and keeps an index of the data in memory. |
Hadoop doesn’t allow random access to data and must parse the entire file to extract a single piece of information. | HBase allows random access to data. |
High latency. Not suitable for real-time processing. | Low latency. Suitable for real-time processing. |
Not ACID compliant | ACID guarantees for single-row operations |
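
To make the random-access row of the table concrete, here is a minimal Python sketch. It is not HBase code and does not use the HBase API: the `records` list, the `scan_lookup` and `indexed_lookup` helpers, and the `user0900` key are all hypothetical names chosen for illustration. It only contrasts the two access patterns, reading a flat file from the start versus jumping to a record through a sorted key index.

```python
import bisect

# Toy data standing in for records in a file: (row_key, value) pairs.
# Keys are zero-padded so that string order matches numeric order.
records = [("user%04d" % i, "value-%d" % i) for i in range(1000)]


def scan_lookup(rows, key):
    """Flat-file style: read rows in order until the key turns up."""
    examined = 0
    for row_key, value in rows:
        examined += 1
        if row_key == key:
            return value, examined
    return None, examined


# Index style: the keys are kept sorted, so a binary search can locate
# the position of a key without touching most of the data.
sorted_keys = [row_key for row_key, _ in records]  # already sorted here


def indexed_lookup(rows, keys, key):
    """Sorted-key style: binary search on the keys, then one direct read."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return rows[pos][1]
    return None


value, examined = scan_lookup(records, "user0900")
print(value, "found after examining", examined, "rows")
print(indexed_lookup(records, sorted_keys, "user0900"))
```

The scan has to walk through every row ahead of the one it wants, while the sorted-key lookup needs only a handful of comparisons. A real HBase cluster is far more involved, but the same idea of sorted row keys plus an in-memory index is what makes low-latency random reads possible.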