HBase CRUD and Basic Commands

HBase CRUD Operations

General Commands

HBase provides shell commands for interacting with the database directly. The most commonly used ones are listed below.

status: This command displays cluster information and the health of the cluster.

hbase(main):> status
hbase(main):> status "detailed"

version: This will provide information about the version of HBase.

 hbase(main):> version 

whoami : This will list the current user.

 hbase(main):> whoami 

table_help : This shows help for the table-reference commands, where a table is assigned to a variable and then manipulated through it.

 hbase(main):009:> table_help 

Create

Let’s create an HBase table and insert data into it. Recall that a table's column families must be declared when the table is created.

Here we create two column families for the table ‘employee’. The first column family is ‘Personal info’ and the second is ‘Professional Info’.

create 'employee', 'Personal info', 'Professional Info'
0 row(s) in 1.4750 seconds

=> Hbase::Table - employee

Upon successful creation of the table, the shell reports 0 row(s) and returns a reference to the new table.

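Create a table with a namespace:
A namespace is a logical grouping of tables. In the commands below, ‘company_empinfo’ is the namespace and ‘employee’ is the table. The namespace must exist before a table can be created in it.

 create_namespace 'company_empinfo'
 create 'company_empinfo:employee', 'Personal info', 'Professional Info'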
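
To make the 'namespace:table' naming convention concrete, here is a minimal Python sketch that splits a qualified table name. The helper name split_table_name is hypothetical and not part of HBase. It assumes that a name without a namespace prefix falls into HBase's built-in ‘default’ namespace.

```python
def split_table_name(qualified):
    """Split 'namespace:table' into (namespace, table).

    Hypothetical helper for illustration only. A bare table name is
    assumed to live in the built-in 'default' namespace.
    """
    namespace, sep, table = qualified.partition(":")
    if not sep:
        # No colon present: the whole string is the table name.
        return ("default", qualified)
    return (namespace, table)


print(split_table_name("company_empinfo:employee"))
print(split_table_name("employee"))
```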

Create a table with versions:

By default, recent HBase releases keep only one version of each cell (the default was 3 before HBase 0.96). To keep more, set VERSIONS on each column family, for example when creating the table. The syntax is given below.

create 'tableName',{NAME=>"CF1",VERSIONS=>5},{NAME=>"CF2",VERSIONS=>5}
create 'bankdetails',{NAME=>"address",VERSIONS=>5}

Put:
The put command inserts a cell into an HBase table.

put 'employee', 1, 'Personal info:empId', 10
put 'employee', 1, 'Personal info:Name', 'Alex'
put 'employee', 1, 'Professional Info:Dept', 'IT'

In the above example, all cells written with row key 1 belong to a single row in HBase. To add more rows, use a different row key:

 
put 'employee', 2, 'Personal info:empId', 20
put 'employee', 2, 'Personal info:Name', 'Bob'
put 'employee', 2, 'Professional Info:Dept', 'Sales' 

As discussed earlier, the user can add any number of columns as part of the row.
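
The puts above can be pictured with a small Python sketch of the logical layout: each row key maps to a set of 'family:qualifier' columns. This is a toy model, not HBase's storage format or client API. The put function and the extra ‘Personal info:City’ column with its value are hypothetical, added only to show that rows need not share the same columns.

```python
from collections import defaultdict

# Toy stand-in for a table: row key -> {"family:qualifier": value}.
# It models only the logical layout, not HBase storage or its client API.
table = defaultdict(dict)


def put(row_key, column, value):
    """Store one cell. str() stands in for the bytes HBase actually stores."""
    table[str(row_key)][column] = str(value)


put(1, "Personal info:empId", 10)
put(1, "Personal info:Name", "Alex")
put(1, "Professional Info:Dept", "IT")
put(2, "Personal info:empId", 20)
put(2, "Personal info:Name", "Bob")
put(2, "Professional Info:Dept", "Sales")
# Hypothetical extra column that only row 2 has; rows need not share columns.
put(2, "Personal info:City", "Austin")

print(len(table))           # number of distinct row keys
print(sorted(table["1"]))   # columns present in row 1
print(sorted(table["2"]))   # columns present in row 2
```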

Read

The ‘get’ and ‘scan’ commands are used to read data from HBase. Let’s first discuss the ‘get’ operation.

get: The ‘get’ operation returns a single row from the HBase table. Given below is the syntax for the ‘get’ command, followed by an example.

get 'table Name', 'Row Key'
hbase(main):022:> get 'employee', 1
COLUMN                                           CELL
 Personal info:Name                              timestamp=1504600767520, value=Alex
 Personal info:empId                             timestamp=1504600767491, value=10
 Professional Info:Dept                          timestamp=1504600767540, value=IT
3 row(s) in 0.0250 seconds

To retrieve a specific column of a row:

Use the COLUMN option to read one column, or pass a list to read several columns of a row.

get 'table Name', 'Row Key', {COLUMN => 'column family:column'}
get 'table Name', 'Row Key', {COLUMN => ['c1', 'c2', 'c3']}
get 'employee', 1, {COLUMN => 'Personal info:empId'}
COLUMN                                           CELL
 Personal info:empId                             timestamp=1504600767491, value=10
1 row(s)

Note: Each cell carries a timestamp, which is set when the cell is written. Updating a value writes a new version with a newer timestamp. Older versions are kept up to the column family's VERSIONS limit, and by default only the value with the latest timestamp is displayed.

Get all versions of a column

The command below retrieves multiple versions of a cell. Here ‘VERSIONS => 3’ sets the maximum number of versions to return.

 
get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3} 
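
How timestamps and VERSIONS interact can be sketched in Python as a toy model of a single cell. The class VersionedCell and its methods are hypothetical names, not HBase API. The sketch trims old versions at write time, whereas HBase filters them on read and removes them physically during compaction. The timestamps and values are the ones shown for ‘Personal info:empId’ in this article's shell output.

```python
class VersionedCell:
    """Toy model of one HBase cell that keeps a bounded number of versions."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []  # (timestamp, value) pairs, newest first

    def put(self, value, timestamp):
        # A write at an existing timestamp replaces that version.
        kept = [tv for tv in self._versions if tv[0] != timestamp]
        kept.append((timestamp, value))
        kept.sort(key=lambda tv: tv[0], reverse=True)
        # Simplification: trim immediately. HBase filters on read and
        # physically removes excess versions later, during compaction.
        self._versions = kept[: self.max_versions]

    def get(self, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        return self._versions[:versions]


emp_id = VersionedCell(max_versions=5)
emp_id.put("10", timestamp=1504600767491)
emp_id.put("15", timestamp=1504606480934)

latest = emp_id.get()             # only the newest version
history = emp_id.get(versions=3)  # up to three versions, newest first
print(latest)
print(history)
```

With max_versions left at 1, the same two writes would leave only the newer value readable, which is why the version count matters when old values need to be inspected.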

scan:
The ‘scan’ command is used to retrieve multiple rows.
Select all:
The command below is an example of a basic scan over the entire table.

 scan 'Table Name' 
 hbase(main):074:> scan 'employee'
ROW                                              COLUMN+CELL
 1                                               column=Personal info:Name, timestamp=1504600767520, value=Alex
 1                                               column=Personal info:empId, timestamp=1504606480934, value=15
 1                                               column=Professional Info:Dept, timestamp=1504600767540, value=IT
 2                                               column=Personal info:Name, timestamp=1504600767588, value=Bob
 2                                               column=Personal info:empId, timestamp=1504600767568, value=20
 2                                               column=Professional Info:Dept, timestamp=1504600768266, value=Sales
2 row(s) in 0.0500 seconds 

Note: Rows are returned sorted by row key, in lexicographic byte order, and the columns within each row are sorted as well.
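
Because row keys are compared as bytes rather than as numbers, numeric-looking keys can sort in a surprising order. The Python sketch below illustrates this with made-up keys. Zero-padding is shown as one common workaround, not as something this article's table uses.

```python
# Hypothetical row keys, stored as bytes the way HBase stores them.
row_keys = ["1", "2", "10", "21", "3"]

# Lexicographic byte order: "10" sorts before "2".
byte_order = sorted(k.encode("utf-8") for k in row_keys)
print(byte_order)

# Zero-padding to a fixed width makes byte order match numeric order.
padded_order = sorted(k.zfill(4).encode("utf-8") for k in row_keys)
print(padded_order)
```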

Column Selection:

The command below scans a single column across all rows.

 
hbase(main):001:>scan 'employee' ,{COLUMNS => 'Personal info:Name' }
ROW                                              COLUMN+CELL
 1                                               column=Personal info:Name, timestamp=1504600767520, value=Alex
 2                                               column=Personal info:Name, timestamp=1504600767588, value=Bob
2 row(s) in 0.3660 seconds 

Limit Query:

The LIMIT option caps the number of rows returned by a scan. The command below returns only the first row.

 hbase(main):002:>scan 'employee' ,{COLUMNS => 'Personal info:Name',LIMIT =>1 }
ROW                                              COLUMN+CELL
 1                                               column=Personal info:Name, timestamp=1504600767520, value=Alex
1 row(s) in 0.0270 seconds 
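
The effect of COLUMNS and LIMIT can be sketched in Python over an in-memory copy of the ‘employee’ data. The scan function here is a hypothetical stand-in, not the HBase client API. It assumes rows are visited in sorted key order, rows with no matching column are skipped, and the limit counts rows rather than cells.

```python
employee = {
    "1": {"Personal info:Name": "Alex", "Personal info:empId": "15",
          "Professional Info:Dept": "IT"},
    "2": {"Personal info:Name": "Bob", "Personal info:empId": "20",
          "Professional Info:Dept": "Sales"},
}


def scan(table, columns=None, limit=None):
    """Return (row_key, cells) pairs in row-key order, filtered and capped."""
    results = []
    for row_key in sorted(table):
        if limit is not None and len(results) >= limit:
            break  # the limit counts rows, not cells
        cells = table[row_key]
        if columns is not None:
            cells = {c: v for c, v in cells.items() if c in columns}
        if cells:  # skip rows that have none of the requested columns
            results.append((row_key, cells))
    return results


for row_key, cells in scan(employee, columns=["Personal info:Name"], limit=1):
    print(row_key, cells)
```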

Update

HBase also uses the ‘put’ command to update a record. To change a column value, put the new value to the same row and column, and HBase stores it as a new version with the latest timestamp.

 put 'employee', 1, 'Personal info:empId', 30 

The old value is not overwritten in place. The put adds a new version, and queries show only the version with the latest timestamp. How many older versions remain readable depends on the column family's VERSIONS setting.

To check the older values of a cell, use the command below.

 get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3} 

Delete

The ‘delete’ command is used to delete individual cells of a record.

The syntax of the delete command in the HBase shell is given below, followed by an example.

 delete 'Table Name' ,'Row Key','Column Family:Column' 
 delete 'employee',1, 'Personal info:Name' 
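
The cell-level nature of ‘delete’ can be sketched in Python with a toy in-memory table. The delete function is a hypothetical stand-in. It removes data immediately, whereas HBase writes a delete marker and discards the underlying data later, during a major compaction. The sketch also assumes that a row with no remaining cells no longer appears in the table.

```python
employee = {
    "1": {"Personal info:Name": "Alex", "Personal info:empId": "10",
          "Professional Info:Dept": "IT"},
}


def delete(table, row_key, column):
    """Remove one cell; drop the row once it has no cells left."""
    row = table.get(row_key)
    if row is None:
        return
    row.pop(column, None)
    if not row:
        # A row exists only as long as it holds at least one cell.
        del table[row_key]


delete(employee, "1", "Personal info:Name")
remaining = sorted(employee["1"])  # the other two columns are untouched
print(remaining)

delete(employee, "1", "Personal info:empId")
delete(employee, "1", "Professional Info:Dept")
print("1" in employee)  # the row is gone once its last cell is deleted
```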

Drop Table:
To drop a table in HBase, you must first disable it. The shell returns an error if you try to drop a table that is still enabled. Disabling takes the table's regions offline, so the table stops serving reads and writes.

The command below disables the table.

 disable 'employee'

Once the table is disabled, drop it with the following command.

 drop 'employee' 

You can check whether a table exists with the ‘exists’ command. To bring a disabled table back online, use the ‘enable’ command.
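
The disable-before-drop rule can be sketched in Python as a small state machine. The names Catalog and TableNotDisabledError are hypothetical and only mimic the behaviour described above; they are not the HBase admin API.

```python
class TableNotDisabledError(Exception):
    """Raised when dropping a table that is still enabled (hypothetical)."""


class Catalog:
    """Toy registry that tracks whether each table is enabled."""

    def __init__(self):
        self._enabled = {}  # table name -> True if enabled

    def create(self, name):
        self._enabled[name] = True  # new tables start out enabled

    def exists(self, name):
        return name in self._enabled

    def disable(self, name):
        self._enabled[name] = False

    def enable(self, name):
        self._enabled[name] = True

    def drop(self, name):
        if self._enabled[name]:
            raise TableNotDisabledError(name)
        del self._enabled[name]


catalog = Catalog()
catalog.create("employee")
try:
    catalog.drop("employee")  # fails: the table is still enabled
except TableNotDisabledError:
    print("disable the table first")

catalog.disable("employee")
catalog.drop("employee")
print(catalog.exists("employee"))
```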
