Integrating Hive with Phoenix

Versions:

hbase-0.98.21-hadoop2-bin.tar.gz

phoenix-4.8.0-HBase-0.98-bin.tar.gz

apache-hive-1.2.1-bin.tar.gz

--------------------------------------------------

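First, Phoenix must be integrated with HBase.

Hive must also be integrated with HBase; see my earlier notes for that.

Copy the phoenix{core,queryserver,4.8.0-HBase-0.98,hive} jars into $hive/lib/.

Edit the configuration files as the official documentation requires:

> vim conf/hive-env.sh

> vim conf/hive-site.xml

Start Hive: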

> hive -hiveconf phoenix.zookeeper.quorum=hadoop01:2181

Creating an internal (managed) table

create table phoenix_table (
  s1 string,
  i1 int,
  f1 float,
  d1 double
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
  "phoenix.table.name" = "phoenix_table",
  "phoenix.zookeeper.quorum" = "hadoop01",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.zookeeper.client.port" = "2181",
  "phoenix.rowkeys" = "s1, i1",
  "phoenix.column.mapping" = "s1:s1, i1:i1, f1:f1, d1:d1",
  "phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'"
);

The table is created successfully. Checking Phoenix and HBase shows that the corresponding table now exists in both.

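Properties

  1. phoenix.table.name
    • The table name in Phoenix
    • Default: the same name as the Hive table
  2. phoenix.zookeeper.quorum
    • The ZooKeeper quorum address
    • Default: localhost
  3. phoenix.zookeeper.znode.parent
    • HBase's parent znode in ZooKeeper
    • Default: /hbase
  4. phoenix.zookeeper.client.port
    • The ZooKeeper client port
    • Default: 2181
  5. phoenix.rowkeys
    • The Phoenix row key columns, i.e. the HBase row key
    • Required
  6. phoenix.column.mapping
    • Column mapping between Hive and Phoenix.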

Inserting data

Load data from the Hive test table pokes:

> insert into table phoenix_table select bar,foo,12.3 as fl,22.2 as dl from pokes;

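The insert succeeds, and the rows can be queried from Hive.

The same rows can be queried from Phoenix.

Data can also be loaded through Phoenix; see the official documentation for details.

Note: Phoenix 4.8 treats adding the table keyword as a syntax error. I have not tried other versions, and I don't know why the official documentation doesn't mention this.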
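One reading of the note above is that Phoenix's own UPSERT statement takes the table name directly after INTO, with no table keyword. A minimal sketch, assuming the Hive-created table is visible in Phoenix under its unquoted name phoenix_table and that the statements are run in a Phoenix client such as sqlline.py (the row values are made up for illustration):

```sql
-- Run in the Phoenix client, not in Hive.
-- Note: no "table" keyword between INTO and the table name.
upsert into phoenix_table (s1, i1, f1, d1) values ('row1', 1, 1.5, 2.5);

-- Read the row back to confirm it was written.
select * from phoenix_table where s1 = 'row1';
```

If the table was created with a quoted, lower-case name, the unquoted reference would not resolve and the name would need double quotes.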

Creating an external table

For external tables, Hive works with an existing Phoenix table and manages only Hive metadata. Deleting an external table from Hive only deletes Hive metadata and keeps the Phoenix table.

First create the table in Phoenix:

phoenix> create table PHOENIX_TABLE_EXT(aa varchar not null primary key,bb varchar);

Then create the external table in Hive:

create external table phoenix_table_ext_1 (
  aa string,
  bb string
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
  "phoenix.table.name" = "phoenix_table_ext",
  "phoenix.zookeeper.quorum" = "hadoop01",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.zookeeper.client.port" = "2181",
  "phoenix.rowkeys" = "aa",
  "phoenix.column.mapping" = "aa:aa, bb:bb"
);

Creation succeeds, and inserts succeed as well.

The options below can be set in the Hive CLI.

Performance tuning

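Parameter | Default | Description
phoenix.upsert.batch.size | 1000 | Batch size for upserts.
[phoenix-table-name].disable.wal | false | Temporarily sets the table attribute DISABLE_WAL = true. Can be used to improve performance.
[phoenix-table-name].auto.flush | false | When the WAL is disabled and this value is true, the MemStore is flushed to an HFile.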
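As a rough illustration of how these write-side options might be applied for one Hive session, the sketch below sets them before the insert used earlier. It assumes the [phoenix-table-name] placeholder is replaced by the literal table name phoenix_table; the batch size of 5000 is an arbitrary illustrative value, not a recommendation.

```sql
-- Entered at the Hive CLI prompt; settings last for the current session only.
-- phoenix_table stands in for the [phoenix-table-name] placeholder (assumption).
set phoenix.upsert.batch.size=5000;
set phoenix_table.disable.wal=true;
set phoenix_table.auto.flush=true;

-- Same insert as in the "Inserting data" step.
insert into table phoenix_table
select bar, foo, 12.3 as fl, 22.2 as dl from pokes;
```

Disabling the WAL trades durability for speed, so it is best suited to loads that can be re-run if a region server fails.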

Querying data

Phoenix tables can be queried with HiveQL. With hive.fetch.task.conversion=more and hive.exec.parallel=true, a simple single-table query can be as fast as in the Phoenix CLI.

Parameter | Default | Description
hbase.scan.cache | 100 | Number of rows fetched per request.
hbase.scan.cacheblock | false | Whether to cache blocks.
split.by.stats | false | If true, mappers will use table statistics. One mapper per guide post.
[hive-table-name].reducer.count | 1 | Number of reducers. In Tez mode this affects only single-table queries. See Limitations.
[phoenix-table-name].query.hint | | Hint for the Phoenix query (like NO_INDEX).
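A minimal sketch of a query session using these settings, assuming the phoenix_table created earlier; the scan cache value of 500 is only an illustrative choice.

```sql
-- Entered at the Hive CLI prompt before querying.
set hive.fetch.task.conversion=more;
set hive.exec.parallel=true;

-- Illustrative value only; the documented default is 100.
set hbase.scan.cache=500;

-- A simple single-table query against the Phoenix-backed table.
select s1, i1, f1, d1 from phoenix_table limit 10;
```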

Problems encountered:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.client.Scan.isReversed()Z

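I initially used hbase-0.96.2-hadoop2, which cannot be integrated: the error above means that hbase-client-0.98.21-hadoop2.jar is required. Swapping in that jar fixed this error, but the following error then appeared: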

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:ERROR 103 (08004): Unable to establish connection.

So I switched HBase itself to version 0.98.21, and the problem went away.

---------

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.StringIndexOutOfBoundsException: String index out of range: -1

The cause is that the Hive column names do not match the Phoenix column names they are mapped to:

create table phoenix_table_3 (a string,b int) STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_3","phoenix.zookeeper.quorum" = "hadoop01","phoenix.zookeeper.znode.parent" = "/hbase","phoenix.zookeeper.client.port" = "2181","phoenix.rowkeys" = "a1","phoenix.column.mapping" = "a:a1, b:b1","phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'");

Using the same column names in the Hive table and the Phoenix table fixes it.
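As a sketch of that fix, assuming the mismatch between the Hive columns (a, b) and the names used in phoenix.rowkeys and phoenix.column.mapping (a1, b1) is what triggers the exception, the failing statement with matching names would look like this. Only the row key and mapping properties differ from the statement above.

```sql
create table phoenix_table_3 (
  a string,
  b int
)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES (
  "phoenix.table.name" = "phoenix_table_3",
  "phoenix.zookeeper.quorum" = "hadoop01",
  "phoenix.zookeeper.znode.parent" = "/hbase",
  "phoenix.zookeeper.client.port" = "2181",
  -- Row key and mapping now use the same names as the Hive columns.
  "phoenix.rowkeys" = "a",
  "phoenix.column.mapping" = "a:a, b:b",
  "phoenix.table.options" = "SALT_BUCKETS=10, DATA_BLOCK_ENCODING='DIFF'"
);
```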

----------

Creation succeeds and inserts succeed, but queries from Hive fail because column a1 cannot be found: the Phoenix column is named aa.

Failed with exception java.io.IOException:java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=A1

create external table phoenix_table_ext (a1 string,b1 string)STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler' TBLPROPERTIES ("phoenix.table.name" = "phoenix_table_ext","phoenix.zookeeper.quorum" = "hadoop01","phoenix.zookeeper.znode.parent" = "/hbase","phoenix.zookeeper.client.port" = "2181","phoenix.rowkeys" = "aa","phoenix.column.mapping" = "a1:aa, b1:bb");

Fix: same as above. Use the same column names in the Hive table as in the Phoenix table.
