hive - What is the syntax to create an external table partitioned on an HBase column?
I have a table in HBase that I'd like to represent as an external table in Hive.
So far I've been using:
CREATE EXTERNAL TABLE events(key STRING, day INT, source STRING, ip STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,c:date#b,c:source,c:ipaddress") TBLPROPERTIES ("hbase.table.name" = "eventtable");
However, queries aren't balanced across mappers, so I'm trying to partition on the IP address:
CREATE EXTERNAL TABLE events(key STRING, source STRING) PARTITIONED BY (ip STRING) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,c:date#b,c:source,c:ipaddress") TBLPROPERTIES ("hbase.table.name" = "eventtable");
But I receive an error about improper column mappings:
FAILED: Error in metadata: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: columns has 2 elements while hbase.columns.mapping has 3 elements (counting the key if implicit))
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
I've been looking around but can't find any documentation that indicates how to map between an HBase column and a Hive partitioning column.
I don't think you can easily partition an external table when the underlying storage is HBase.
Hive's partitioning strategy is built on the way the data is stored: each partition lives in its own folder (or other storage location). Because of that, partitioning on HBase (if it existed) would require using more tables or using HBase versions.
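To illustrate the folder-per-partition idea, here is a minimal sketch for a plain HDFS-backed external table. The table name logs, the column names, and the paths under /data/logs are hypothetical and not part of the question.

```sql
-- Hypothetical external table stored as files on HDFS (not HBase).
CREATE EXTERNAL TABLE logs (key STRING, source STRING)
PARTITIONED BY (day INT)
LOCATION '/data/logs';

-- Each partition is registered with its own directory.
-- The partition value comes from the directory, not from a column inside the files.
ALTER TABLE logs ADD PARTITION (day = 20120222)
LOCATION '/data/logs/day=20120222';

ALTER TABLE logs ADD PARTITION (day = 20120223)
LOCATION '/data/logs/day=20120223';
```

An HBase table has no such directory per value of ip, which is why the ip column cannot simply be moved into the PARTITIONED BY clause.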
I think this post will give you a better understanding of partitioning: http://blog.zhengdong.me/2012/02/22/hive-external-table-with-partitions
Also, at https://cwiki.apache.org/hive/hbaseintegration.html you can find that partitioning in HBase is left for the future.
If you want partitions, I'd recommend loading the data from the HBase-backed Hive table into an HDFS-backed Hive table, but that depends on your use case.
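The recommendation above could look roughly like the following sketch. It assumes the non-partitioned events table from the question already exists; the target table name events_by_ip is hypothetical.

```sql
-- Hypothetical native Hive table stored on HDFS, partitioned by ip.
CREATE TABLE events_by_ip (key STRING, day INT, source STRING)
PARTITIONED BY (ip STRING);

-- Let Hive create partitions from the values found in the data.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Copy the rows out of the HBase-backed table.
-- The partition column must be the last column in the SELECT list.
INSERT OVERWRITE TABLE events_by_ip PARTITION (ip)
SELECT key, day, source, ip
FROM events;
```

Note that a column with many distinct values, such as an IP address, can produce a very large number of partitions, so a coarser partition key may be more practical depending on the data.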
Regards, Dino