Impala Hbase Interaction

Summary of Hbase and Impala connections and interactions.

 

 

(1) Very simple example for interacting between Hbase and Impala.

 

 

  1. Configure Hbase as normal. (edit hbase-site.xml and start the daemons. Use “jps” to see daemons)

 

 

  1. Create a table in Hbase using the following command in Hbase shell (command: “hbase shell”):

 

——————————————————————————

create ‘a’,’ints’

enable ‘a’

——————————————————————————

 

there are two columns, one is the key_ID, another one is called “ints”.

 

 

  1. In Hive shell (command: “hive shell”), we type in the following command:

 

——————————————————————————

CREATE EXTERNAL TABLE a (

id int,

int_col int)

STORED BY ‘org.apache.hadoop.hive.hbase.HBaseStorageHandler’

WITH SERDEPROPERTIES (

“hbase.columns.mapping” =

“:key,ints:int_col”

)

TBLPROPERTIES(“hbase.table.name” = “a”);

——————————————————————————

 

 

This query create an external table in Hive from Hbase, and map the id to int, the “ints” column to int in Hive.

This approach regards the ID as int. We can also regard it as string.

Details please see:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/Impala/Installing-and-Using-Impala/ciiu_impala_hbase.html

 

 

  1. In the impala shell (command: “impala-shell”), we first invalidate the metadata to let it find the new created table.

 

——————————————————————————

INVALIDATE METADATA a;

——————————————————————————

 

Now we can use SQL queries in the impala shell.

 

Warning:

It is an external table so that when dropped by Impala or Hive, the original HBase table is not touched at all.

 

 

 

 

 

 

 

 

 

 

(2) More complex example for interacting between Hbase and Impala with multiple types.

 

For data types references in Impala:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_datatypes.html?scroll=float_unique_1

 

  1. Create a table in Hbase using the following command in Hbase shell (command: “hbase shell”):

 

——————————————————————————

create ‘hbasealltypessmall’, ‘bools’, ‘ints’, ‘floats’, ‘strings’

enable ‘hbasealltypessmall’

quit

——————————————————————————

 

 

  1. In Hive shell (command: “hive shell”), we type in the following command:

 

——————————————————————————

CREATE EXTERNAL TABLE hbasestringids (

id string,

bool_col boolean,

tinyint_col tinyint,

smallint_col smallint,

int_col int,

bigint_col bigint,

float_col float,

double_col double,

date_string_col string,

string_col string,

timestamp_col timestamp)

STORED BY ‘org.apache.hadoop.hive.hbase.HBaseStorageHandler’

WITH SERDEPROPERTIES (

“hbase.columns.mapping” =

“:key,bools:bool_col,ints:tinyint_col,ints:smallint_col,ints:int_col,ints:\

bigint_col,floats:float_col,floats:double_col,strings:date_string_col,\

strings:string_col,strings:timestamp_col”

)

TBLPROPERTIES(“hbase.table.name” = “hbasealltypessmall”);

——————————————————————————

 

This query regards the id as string, and map HBase 5 columns to Hive 11 columns.

For example, it maps tinyint_col, smallint_col, int_col and bigint_col to ints.

 

 

  1. In the impala shell (command: “impala-shell”), we first invalidate the metadata to let it find the new created table.

 

——————————————————————————

INVALIDATE METADATA a;

——————————————————————————

 

Now we can use SQL queries in the impala shell.

 

 

 

References and other materials:

 

http://mapredit.blogspot.com/2013/05/query-hbase-tables-with-impala.html

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_impala_hbase.html

http://yanbohappy.sinaapp.com/?tag=hbase

http://doc.mapr.com/display/MapR/Working+with+Impala

Leave a comment