HBase CRUD and Basic Commands
HBase CRUD Operations
General Commands
HBase provides a shell for interacting directly with the database. Below are a few of the most commonly used shell commands.
status: This command displays cluster information and health. Pass "detailed" for more detailed output.
hbase(main):> status
hbase(main):> status "detailed"
version: This will provide information about the version of HBase.
hbase(main):> version
whoami: This will show the current HBase user.
hbase(main):> whoami
table_help: This will display help for the table-reference shell commands.
hbase(main):009:> table_help
Create
Let’s create an HBase table and insert data into it. When creating a table, the user must define the required column families.
Here we create two column families for the table ‘employee’: the first is ‘Personal info’ and the second is ‘Professional Info’.
create 'employee', 'Personal info', 'Professional Info'
0 row(s) in 1.4750 seconds
=> Hbase::Table - employee
Upon successful creation of the table, the shell will return 0 rows.
Create a table with Namespace:
A namespace is a logical grouping of tables. In the command below, ‘company_empinfo’ is the namespace name; the namespace must already exist (it can be created with the ‘create_namespace’ command).
create 'company_empinfo:employee', 'Personal info', 'Professional Info'
Create a table with version:
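By default, recent HBase releases keep only one version of each cell, so users who need older values must specify the number of versions when creating the table. Given below is the syntax for creating an HBase table that keeps multiple versions, followed by an example.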
create 'tableName',{NAME=>"CF1",VERSIONS=>5},{NAME=>"CF2",VERSIONS=>5}
create 'bankdetails',{NAME=>"address",VERSIONS=>5}
Put:
The ‘put’ command inserts a cell value into an HBase table. Each ‘put’ writes one column of one row.
put 'employee', 1, 'Personal info:empId', 10
put 'employee', 1, 'Personal info:Name', 'Alex'
put 'employee', 1, 'Professional Info:Dept', 'IT'
In the above example, all cells written with Row Key 1 belong to a single row in HBase. To add more rows, use a different Row Key:
put 'employee', 2, 'Personal info:empId', 20
put 'employee', 2, 'Personal info:Name', 'Bob'
put 'employee', 2, 'Professional Info:Dept', 'Sales'
As discussed earlier, the user can add any number of columns as part of the row.
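To make the row model concrete, here is a minimal Python sketch of how the puts above group into rows. It is a toy in-memory stand-in, not the HBase client API; the `put` helper and the extra ‘Personal info:City’ column are assumptions made only for illustration.

```python
from collections import defaultdict

# Toy in-memory stand-in for an HBase table (illustration only, not the HBase API):
# row key -> {"family:qualifier": value}
table = defaultdict(dict)

def put(row_key, column, value):
    """Store one cell; cells that share a row key belong to the same row."""
    table[str(row_key)][column] = str(value)

put(1, "Personal info:empId", 10)
put(1, "Personal info:Name", "Alex")
put(1, "Professional Info:Dept", "IT")
put(2, "Personal info:empId", 20)
put(2, "Personal info:Name", "Bob")
put(2, "Professional Info:Dept", "Sales")

# Rows need not share the same columns: this hypothetical column exists in row 2 only.
put(2, "Personal info:City", "Pune")

print(len(table))          # number of distinct rows
print(sorted(table["1"]))  # columns present in row 1
print(sorted(table["2"]))  # columns present in row 2
```

Six puts on two row keys produce two rows, and adding a column to one row does not affect the other.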
Read
The ‘get’ and ‘scan’ commands are used to read data from HBase. Let’s first discuss the ‘get’ operation.
get: The ‘get’ operation returns a single row from the HBase table. Given below is the syntax for the ‘get’ command, followed by an example.
get 'table Name', 'Row Key'
hbase(main):022:> get 'employee', 1
COLUMN CELL
Personal info:Name timestamp=1504600767520, value=Alex
Personal info:empId timestamp=1504600767491, value=10
Professional Info:Dept timestamp=1504600767540, value=IT
3 row(s) in 0.0250 seconds
To retrieve a specific column of a row:
Use the following syntax to read one column, or a list of columns, of a row.
get 'table Name', 'Row Key', {COLUMN => 'column family:column'}
get 'table Name', 'Row Key', {COLUMN => ['c1', 'c2', 'c3']}
get 'employee', 1 ,{COLUMN => 'Personal info:empId'}
COLUMN CELL
Personal info:empId timestamp=1504600767491, value=10
1 row(s)
Note: Every cell carries a timestamp. Each time a cell value is updated, HBase writes a new version with a newer timestamp. Older versions are kept up to the column family’s VERSIONS setting, but by default only the value with the latest timestamp is displayed.
Get all versions of a column
The command below retrieves multiple versions of a cell. Here ‘VERSIONS => 3’ sets the maximum number of versions to be retrieved.
get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3}
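The versioning behaviour can be pictured with a small Python sketch. This is a simplified model under stated assumptions, not HBase code: `MAX_VERSIONS` stands in for the column family’s VERSIONS setting, timestamps come from a fake counter, and old versions are trimmed immediately on write, whereas a real cluster removes them later.

```python
import itertools

MAX_VERSIONS = 3             # assumed retention limit, like VERSIONS => 3
_clock = itertools.count(1)  # fake, strictly increasing timestamps

cells = {}  # (row_key, column) -> list of (timestamp, value), newest first

def put(row_key, column, value):
    """Write a new version of the cell and trim versions beyond the limit."""
    versions = cells.setdefault((row_key, column), [])
    versions.insert(0, (next(_clock), value))
    del versions[MAX_VERSIONS:]

def get(row_key, column, versions=1):
    """Return up to `versions` values of the cell, newest first."""
    return [value for _, value in cells.get((row_key, column), [])[:versions]]

for emp_id in (10, 15, 30, 40):
    put("1", "Personal info:empId", emp_id)

latest = get("1", "Personal info:empId")               # default: newest only
history = get("1", "Personal info:empId", versions=3)  # like VERSIONS => 3
print(latest, history)
```

After four writes with a limit of three, the oldest value is no longer retrievable, which is why the number of versions has to be chosen when the table is created.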
scan:
‘scan’ command is used to retrieve multiple rows.
Select all:
The command below shows the syntax for a full scan of a table, followed by an example on the ‘employee’ table.
scan 'Table Name'
hbase(main):074:> scan 'employee'
ROW COLUMN+CELL
1 column=Personal info:Name, timestamp=1504600767520, value=Alex
1 column=Personal info:empId, timestamp=1504606480934, value=15
1 column=Professional Info:Dept, timestamp=1504600767540, value=IT
2 column=Personal info:Name, timestamp=1504600767588, value=Bob
2 column=Personal info:empId, timestamp=1504600767568, value=20
2 column=Professional Info:Dept, timestamp=1504600768266, value=Sales
2 row(s) in 0.0500 seconds
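Note: Rows are returned sorted by Row Key, and within each row the cells are listed in column order. Row Keys are compared as bytes, so the ordering is lexicographic rather than numeric.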
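Because row keys are stored as bytes, their sort order is byte-wise rather than numeric. The short Python sketch below illustrates this with made-up keys; the zero-padding at the end is one common workaround and is shown here as an assumption, not as something the article prescribes.

```python
# Row keys as bytes (the values are illustrative, not from a real table).
row_keys = [b"2", b"10", b"1", b"100"]

# Python compares bytes lexicographically, which mirrors byte-wise ordering.
scan_order = sorted(row_keys)
print(scan_order)

# Zero-padding to a fixed width makes numeric ids sort in numeric order.
padded = sorted(key.rjust(5, b"0") for key in row_keys)
print(padded)
```

With unpadded keys, "10" and "100" sort before "2"; with fixed-width keys the byte order and the numeric order agree.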
Column Selection:
The command below restricts a scan to a particular column.
hbase(main):001:>scan 'employee' ,{COLUMNS => 'Personal info:Name' }
ROW COLUMN+CELL
1 column=Personal info:Name, timestamp=1504600767520, value=Alex
2 column=Personal info:Name, timestamp=1504600767588, value=Bob
2 row(s) in 0.3660 seconds
Limit Query:
The command below limits the number of rows returned by a scan. Here only the first row is returned for the selected column.
hbase(main):002:>scan 'employee' ,{COLUMNS => 'Personal info:Name',LIMIT =>1 }
ROW COLUMN+CELL
1 column=Personal info:Name, timestamp=1504600767520, value=Alex
1 row(s) in 0.0270 seconds
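The effect of COLUMNS and LIMIT on a scan can be sketched in Python as follows. This is a toy model, not the HBase client API: the `scan` function and its `columns` and `limit` parameters are hypothetical names chosen to mirror the shell options, and the table contents copy the article’s example.

```python
# Toy table: row key -> {column: value}; contents mirror the article's example.
table = {
    "2": {"Personal info:Name": "Bob", "Personal info:empId": "20"},
    "1": {"Personal info:Name": "Alex", "Personal info:empId": "15"},
}

def scan(table, columns=None, limit=None):
    """Yield (row_key, cells) in row-key order, optionally filtered and capped."""
    emitted = 0
    for row_key in sorted(table):
        cells = table[row_key]
        if columns is not None:
            cells = {c: v for c, v in cells.items() if c in columns}
        if not cells:
            continue  # a row with no matching columns is not returned
        if limit is not None and emitted >= limit:
            break     # stop once the requested number of rows was produced
        emitted += 1
        yield row_key, cells

names = list(scan(table, columns=["Personal info:Name"]))
first = list(scan(table, columns=["Personal info:Name"], limit=1))
print(names)
print(first)
```

The limit counts rows rather than cells, so a limited scan of several columns still returns every selected column of each row it includes.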
Update
HBase has no separate update command; the ‘put’ command is used instead. To update a column value, put the new value to the same row and column, and HBase stores it as a new version with the latest timestamp.
put 'employee', 1, 'Personal info:empId', 30
The old value is not overwritten in place. Queries show only the version with the latest timestamp, and how many older versions are retained depends on the column family’s VERSIONS setting.
To check the old values of a cell, use the command below.
get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3}
Delete
The ‘delete’ command is used to delete individual cells of a row.
Given below is the syntax of the ‘delete’ command in the HBase shell, followed by an example.
delete 'Table Name' ,'Row Key','Column Family:Column'
delete 'employee',1, 'Personal info:Name'
Drop Table:
To drop a table in HBase, you must first disable it. The shell returns an error if you try to drop a table that is still enabled. Disabling takes the table offline so that it no longer serves reads or writes.
The command below disables the table.
disable 'employee'
Once the table is disabled, the user can drop it using the command below.
drop 'employee'
You can check whether a table exists with the ‘exists’ command. To bring a disabled table back online, use the ‘enable’ command.
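The disable-before-drop rule can be summarised as a small state model. The Python sketch below is a toy, not HBase code: the `Catalog` class and the `TableNotDisabledError` name are hypothetical, and the methods only mimic the shell commands of the same names.

```python
class TableNotDisabledError(Exception):
    """Raised when drop is attempted on an enabled table (hypothetical name)."""

class Catalog:
    """Toy model of the table lifecycle: create, disable, enable, drop."""

    def __init__(self):
        self._enabled = {}  # table name -> True if the table is enabled

    def create(self, name):
        self._enabled[name] = True  # new tables start out enabled

    def exists(self, name):
        return name in self._enabled

    def disable(self, name):
        self._enabled[name] = False

    def enable(self, name):
        self._enabled[name] = True

    def drop(self, name):
        if self._enabled[name]:
            raise TableNotDisabledError(name)  # must disable first
        del self._enabled[name]

catalog = Catalog()
catalog.create("employee")

try:
    catalog.drop("employee")  # rejected: the table is still enabled
except TableNotDisabledError:
    dropped_while_enabled = False
else:
    dropped_while_enabled = True

catalog.disable("employee")
catalog.drop("employee")      # succeeds now that the table is disabled
print(dropped_while_enabled, catalog.exists("employee"))
```

The first drop is refused and the second succeeds, which matches the disable-then-drop sequence shown in the shell commands above.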