Using Hive from Python

Preface

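HiveServer2 provides an interface that lets remote clients run Hive queries. It is implemented over Thrift RPC and also supports multi-user concurrency and authentication. Python users can currently connect to HiveServer2 through the pyhs2 module to run queries and fetch the results. The Python client below uses pyhs2.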

Installing the Python modules

  1. Install pip: https://pip.pypa.io/en/stable/installing/
  2. Install the dependencies
    • yum install cyrus-sasl-plain
    • yum install cyrus-sasl-devel
    • yum install python-devel
  3. pip install pyhs2
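
After the steps above, it can help to confirm that the modules really import before writing any client code. The following is a minimal sketch, not part of the original setup: missing_modules is a hypothetical helper name, and the assumption that pyhs2 relies on the sasl and thrift Python packages should be checked against your installed version.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Module is absent or failed to load (e.g. a missing C library)
            missing.append(name)
    return missing


if __name__ == '__main__':
    # Assumption: pyhs2 needs the 'sasl' and 'thrift' packages at import time
    print(missing_modules(['pyhs2', 'sasl', 'thrift']))
```

An empty list means everything imported. If sasl is reported missing, the cyrus-sasl and python-devel packages from step 2 are the first thing to re-check.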

Python client code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# hive util with hive server2
"""
@author: wyf
@create: 2016-06-29 16:55
"""
__author__ = 'wyf'
__version__ = '0.1'

import pyhs2
import sys

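# Python 2 only: make UTF-8 the default encoding for implicit str/unicode conversions
default_encoding = 'utf-8'
if sys.getdefaultencoding() != default_encoding:
    reload(sys)  # restores sys.setdefaultencoding, which site.py removes at startup
    sys.setdefaultencoding(default_encoding)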

class HiveClient:
    def __init__(self, db_host, user, password, database, port=10000, authMechanism="PLAIN"):
        """
        create connection to hive server2
        """
        self.conn = pyhs2.connect(host=db_host,
                                  port=port,
                                  authMechanism=authMechanism,
                                  user=user,
                                  password=password,
                                  database=database,
                                  )

    def query(self, sql):
        """
        run a SQL statement and return all fetched rows
        """
        with self.conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch()

    def close(self):
        """
        close connection
        """
        self.conn.close()


def main():
    """
    main process
    """
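    hive_client = None
    try:
        hive_client = HiveClient(db_host='192.168.1.13', port=10000, user='hive', password='hive',
                                 database='default', authMechanism='PLAIN')
        sql = 'select * from record limit 10'  # example SQL statement
        result = hive_client.query(sql)
    except Exception as e:
        # report any connection or query failure and exit with an error code
        print(str(e))
        sys.exit(1)
    finally:
        # always release the connection, even when the query fails
        if hive_client is not None:
            hive_client.close()
    # writeXlwt('test.xls', result) exported the rows to Excel, but that helper
    # is not included here, so print the rows instead
    for row in result:
        print(row)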

if __name__ == '__main__':  
    main()
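
The rows returned by query() can also be saved to a file. The following is a minimal sketch using only the standard library, assuming the result is a list of row lists; write_rows_csv is a hypothetical helper name, and CSV is used here in place of an .xls writer so that no extra package is needed.

```python
import csv
import sys


def write_rows_csv(path, rows):
    """Write an iterable of row sequences to `path` as CSV."""
    if sys.version_info[0] >= 3:
        # Python 3: text mode, let the csv module control line endings
        f = open(path, 'w', newline='', encoding='utf-8')
    else:
        # Python 2: the csv module expects a binary-mode file
        f = open(path, 'wb')
    with f:
        csv.writer(f).writerows(rows)


if __name__ == '__main__':
    # Made-up sample rows standing in for a real query result
    write_rows_csv('test.csv', [[1, 'alice'], [2, 'bob']])
```

In the client script this would be called as write_rows_csv('test.csv', result) once the query has succeeded.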

Result of querying Hive from Python
