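Integrating Hive 1.2.1 with HBase 1.0.1.1
1. Installing Hive and HBase
2. How it works
Excerpted from 《Hbase企业应用开发实战》 (HBase Enterprise Application Development in Practice).
Hive has shipped an HBase integration since version 0.6.0, in hive-hbase-handler-0.6.0.jar. The implementation is built on Hive Storage Handlers, whose purpose is to let Hive access and manage data held in other systems through a modular, extensible mechanism. HBase is not the only possible target: other systems such as HyperTable, MongoDB, Cassandra and Google Spreadsheets can be integrated the same way.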
    /**
     * HiveStorageHandler defines a pluggable interface for adding
     * new storage handlers to Hive. A storage handler consists of
     * a bundle of the following:
     *
     *<ul>
     *<li>input format
     *<li>output format
     *<li>serde
     *<li>metadata hooks for keeping an external catalog in sync
     * with Hive's metastore
     *<li>rules for setting up the configuration properties on
     * map/reduce jobs which access tables stored by this handler
     *</ul>
     *
     * Storage handler classes are plugged in using the STORED BY 'classname'
     * clause in CREATE TABLE.
     */
    public interface HiveStorageHandler extends Configurable {
      /**
       * @return Class providing an implementation of {@link InputFormat}
       */
      public Class<? extends InputFormat> getInputFormatClass();

      /**
       * @return Class providing an implementation of {@link OutputFormat}
       */
      public Class<? extends OutputFormat> getOutputFormatClass();

      /**
       * @return Class providing an implementation of {@link SerDe}
       */
      public Class<? extends SerDe> getSerDeClass();
    }
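The Javadoc above says a handler is plugged in with the STORED BY 'classname' clause of CREATE TABLE. As a rough illustration, the sketch below assembles such a statement for the HBase handler. It assumes the handler class org.apache.hadoop.hive.hbase.HBaseStorageHandler and the hbase.columns.mapping and hbase.table.name properties described in Hive's HBase integration documentation. The class name HBaseTableDdl, the table names and the column family cf1 are hypothetical and only serve the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the DDL text only, it does not talk to Hive.
final class HBaseTableDdl {
    // Storage handler class shipped in hive-hbase-handler.
    static final String HANDLER = "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    /**
     * @param hiveTable      name of the Hive table to create
     * @param hiveColumns    Hive column name -> Hive type, in declaration order
     * @param columnsMapping value for hbase.columns.mapping, e.g. ":key,cf1:val"
     * @param hbaseTable     name of the backing HBase table
     */
    static String createTable(String hiveTable, Map<String, String> hiveColumns,
                              String columnsMapping, String hbaseTable) {
        StringJoiner cols = new StringJoiner(", ");
        for (Map.Entry<String, String> e : hiveColumns.entrySet()) {
            cols.add(e.getKey() + " " + e.getValue());
        }
        return "CREATE TABLE " + hiveTable + " (" + cols + ")\n"
                + "STORED BY '" + HANDLER + "'\n"
                + "WITH SERDEPROPERTIES (\"hbase.columns.mapping\" = \"" + columnsMapping + "\")\n"
                + "TBLPROPERTIES (\"hbase.table.name\" = \"" + hbaseTable + "\")";
    }

    public static void main(String[] args) {
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("key", "int");      // mapped to the HBase row key (":key")
        cols.put("value", "string"); // mapped to column cf1:val (assumed family)
        System.out.println(createTable("hbase_table_1", cols, ":key,cf1:val", "xyz"));
    }
}
```

The printed statement can be pasted into the Hive CLI once the handler jar is on Hive's classpath. The entries in hbase.columns.mapping correspond positionally to the Hive columns, so the two lists must have the same length.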
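Hive tables are either internal (managed) tables or external tables.
Name | Description |
Internal table | Managed through the Hive metastore; Hive is also responsible for storing the data |
External table | Data lives in an external directory; Hive is not responsible for storing the data |
Hive Storage Handlers add a second distinction: native and non-native tables.
Name | Description |
Native table | A table that Hive manages without going through a HiveStorageHandler |
Non-native table | A table that must be managed through a HiveStorageHandler |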
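The two distinctions are independent, which gives four kinds of table. The sketch below is only an illustration of that idea. It assumes, following the Hive storage handler documentation, that the EXTERNAL keyword decides internal (managed) versus external and that the presence of a STORED BY clause decides native versus non-native. The class HiveTableKinds is hypothetical.

```java
// Hypothetical helper that names a table kind from two features of its DDL.
final class HiveTableKinds {
    /**
     * @param externalKeyword true if the DDL is CREATE EXTERNAL TABLE
     * @param storedByClause  true if the DDL has a STORED BY 'classname' clause
     */
    static String describe(boolean externalKeyword, boolean storedByClause) {
        String storage = externalKeyword ? "external" : "managed";
        String access = storedByClause ? "non-native" : "native";
        return storage + " " + access;
    }

    public static void main(String[] args) {
        System.out.println("CREATE TABLE                       -> " + describe(false, false));
        System.out.println("CREATE EXTERNAL TABLE              -> " + describe(true, false));
        System.out.println("CREATE TABLE ... STORED BY         -> " + describe(false, true));
        System.out.println("CREATE EXTERNAL TABLE ... STORED BY -> " + describe(true, true));
    }
}
```

An HBase-backed Hive table is always non-native; whether it is also external depends on whether Hive or HBase owns the lifecycle of the underlying HBase table.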
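3. Rebuilding hive-hbase-handler
By default, hive-hbase-handler-1.2.1.jar is built against this HBase version:
<hbase.hadoop2.version>0.98.9-hadoop2</hbase.hadoop2.version>
To work with HBase 1.0.1.1 it has to be recompiled: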
    [root@hftclclw0001 opt]# wget http://www.carfab.com/apachesoftware/hive/hive-1.2.1/apache-hive-1.2.1-src.tar.gz
    [root@hftclclw0001 opt]# tar -zxvf apache-hive-1.2.1-src.tar.gz
    [root@hftclclw0001 opt]# cd apache-hive-1.2.1-src
    [root@hftclclw0001 apache-hive-1.2.1-src]# mvn clean package -DskipTests -Phadoop-2 -Dhbase.hadoop2.version=1.0.1.1
    [root@hftclclw0001 apache-hive-1.2.1-src]# cd hbase-handler/target
    [root@hftclclw0001 target]# ll
    total 236
    drwx------ 2 root root   4096 Jun 8 03:05 antrun
    drwx------ 4 root root   4096 Jun 8 03:05 classes
    drwx------ 3 root root   4096 Jun 8 03:05 generated-sources
    drwx------ 3 root root   4096 Jun 8 03:05 generated-test-sources
    -rw------- 1 root root 115950 Jun 8 03:05 hive-hbase-handler-1.2.1.jar
    -rw------- 1 root root  83567 Jun 8 03:05 hive-hbase-handler-1.2.1-tests.jar
    drwx------ 2 root root   4096 Jun 8 03:05 maven-archiver
    drwx------ 3 root root   4096 Jun 8 03:05 maven-shared-archive-resources
    drwx------ 4 root root   4096 Jun 8 03:05 test-classes
    drwx------ 3 root root   4096 Jun 8 03:05 tmp
    drwx------ 2 root root   4096 Jun 8 03:05 warehouse
Copy the rebuilt hive-hbase-handler-1.2.1.jar into the lib directory under hive_home, replacing the jar that is already there.
Start Hive.