The previous post covered setting up Hive 2.3.4, but that version no longer runs MapReduce jobs reliably. In this post I will walk through setting up Hive 1.2.2 from start to finish. Note up front: this build goes directly on top of the Hadoop environment from the previous post.
1、Download apache-hive-1.2.2-bin.tar.gz
2、Upload the Hive package to the NameNode server
3、Extract the Hive package
tar -zxvf apache-hive-1.2.2-bin.tar.gz -C /home/hadoop/apps/
4、Update the Hive entries in /etc/profile
#export HIVE_HOME=/home/hadoop/apps/apache-hive-2.3.4-bin
export HIVE_HOME=/home/hadoop/apps/apache-hive-1.2.2-bin
export PATH=${HIVE_HOME}/bin:$PATH
#After saving, run "source /etc/profile" to apply the change
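The HIVE_HOME swap can also be done with a single sed command instead of a hand edit. A minimal sketch, demonstrated on a scratch file named `profile` in the current directory so it can be run anywhere; on the real server point it at /etc/profile instead:

```shell
# Create a scratch stand-in for /etc/profile with the old 2.3.4 entry.
cat > profile <<'EOF'
export HIVE_HOME=/home/hadoop/apps/apache-hive-2.3.4-bin
export PATH=${HIVE_HOME}/bin:$PATH
EOF
# Rewrite the old Hive home directory to the 1.2.2 one in place.
sed -i 's#apache-hive-2.3.4-bin#apache-hive-1.2.2-bin#' profile
# Show the result.
grep HIVE_HOME profile
```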
5、Edit the Hive configuration file
cd /home/hadoop/apps/apache-hive-1.2.2-bin/conf/
cp hive-env.sh.template hive-env.sh
#Add the following three lines and save
vi hive-env.sh
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.9.1
export HIVE_CONF_DIR=/home/hadoop/apps/apache-hive-1.2.2-bin/conf
export HIVE_AUX_JARS_PATH=/home/hadoop/apps/apache-hive-1.2.2-bin/lib
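The three export lines can be appended in one shot with a heredoc rather than typed into vi. A sketch using the current directory as a stand-in for the conf directory so it runs anywhere; on the real machine set CONF_DIR to /home/hadoop/apps/apache-hive-1.2.2-bin/conf:

```shell
# CONF_DIR stands in for the real Hive conf directory in this sketch.
CONF_DIR=.
# Append the three required lines to hive-env.sh.
cat >> "$CONF_DIR/hive-env.sh" <<'EOF'
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.9.1
export HIVE_CONF_DIR=/home/hadoop/apps/apache-hive-1.2.2-bin/conf
export HIVE_AUX_JARS_PATH=/home/hadoop/apps/apache-hive-1.2.2-bin/lib
EOF
# Count the export lines that landed in the file.
grep -c '^export' "$CONF_DIR/hive-env.sh"
```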
6、Update the log4j logging configuration
cp hive-log4j.properties.template hive-log4j.properties
#Change the EventCounter appender class to org.apache.hadoop.log.metrics.EventCounter:
#log4j.appender.EventCounter=org.apache.hadoop.hive.shims.HiveEventCounter
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
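The class swap is a one-line sed. A sketch demonstrated on a scratch hive-log4j.properties in the current directory; run the same sed against the real file in the conf directory:

```shell
# Scratch file containing the old appender line.
printf 'log4j.appender.EventCounter=org.apache.hadoop.hive.shims.HiveEventCounter\n' > hive-log4j.properties
# Replace the deprecated Hive shims class with the Hadoop metrics one.
sed -i 's/org\.apache\.hadoop\.hive\.shims\.HiveEventCounter/org.apache.hadoop.log.metrics.EventCounter/' hive-log4j.properties
# Show the rewritten line.
cat hive-log4j.properties
```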
7、Configure MySQL as the Hive metastore database
vi hive-site.xml
#Write the following into hive-site.xml
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.29.131:3306/hivedb?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>
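The whole file can be generated from a heredoc instead of being typed into vi. Written to the current directory in this sketch; on the real machine write it into the Hive conf directory. The host, database name, user, and password are the values from this walkthrough, so substitute your own:

```shell
# Generate hive-site.xml with the four metastore connection properties.
cat > hive-site.xml <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.29.131:3306/hivedb?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>
EOF
# Quick sanity check: four property blocks were written.
grep -c '<property>' hive-site.xml
```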
8、Copy the JDBC driver jar into Hive's lib directory
cp ~/mysql-connector-java-5.1.28.jar $HIVE_HOME/lib/
9、Delete the files left in HDFS by the earlier Hive 2.3.4 install (this step is required)
hdfs dfs -rm -r /tmp/hive
hdfs dfs -rm -r /user/hive
10、Initialize Hive
[hadoop@centos-aaron-h1 bin]$ schematool -initSchema -dbType mysql
Metastore connection URL:	 jdbc:mysql://192.168.29.131:3306/hivedb?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 root
Starting metastore schema initialization to 1.2.0
Initialization script hive-schema-1.2.0.mysql.sql
Error: Duplicate key name 'PCS_STATS_IDX' (state=42000,code=1061)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
*** schemaTool failed ***
[hadoop@centos-aaron-h1 bin]$ schematool -initSchema -dbType mysql
Metastore connection URL:	 jdbc:mysql://192.168.29.131:3306/hivedb?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 root
Starting metastore schema initialization to 1.2.0
Initialization script hive-schema-1.2.0.mysql.sql
Initialization script completed
schemaTool completed
The first run failed with a duplicate-key error, most likely because the hivedb database still held metastore tables from the earlier 2.3.4 install; the error goes away once those leftover tables are removed (for example by dropping and recreating hivedb), after which the second run completes successfully.
11、Start Hive, create the database and table, and upload the data
#Run this only after the database and table have been created
hdfs dfs -put bbb_hive.txt /user/hive/warehouse/wcc_log.db/t_web_log01
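bbb_hive.txt is a plain comma-delimited file matching the two-column table (id int, name string); its five rows are the ones that appear in the query output below. A sketch creating it locally before the hdfs put:

```shell
# Write the five sample rows, one "id,name" pair per line.
cat > bbb_hive.txt <<'EOF'
1,张三
2,李四
3,王二
4,麻子
5,隔壁老王
EOF
# Confirm the row count.
wc -l < bbb_hive.txt
```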
[hadoop@centos-aaron-h1 bin]$ hive

Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-1.2.2-bin/conf/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.679 seconds, Fetched: 1 row(s)
hive> create database wcc_log;
OK
Time taken: 0.104 seconds
hive> use wcc_log;
OK
Time taken: 0.03 seconds
hive> create table t_web_log01(id int,name string)
    > row format delimited
    > fields terminated by ',';
OK
Time taken: 0.159 seconds
hive> select * from t_web_log01;
OK
1	张三
2	李四
3	王二
4	麻子
5	隔壁老王
Time taken: 0.274 seconds, Fetched: 5 row(s)
hive> select count(*) from t_web_log01;
Query ID = hadoop_20190121080409_dfb157d9-0a79-4784-9ea4-111d0ad4cc92
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1548024929599_0003, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1548024929599_0003/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1548024929599_0003
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2019-01-21 08:04:25,271 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1548024929599_0003 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive> select count(id) from t_web_log01;
Query ID = hadoop_20190121080455_b3eb8d25-2d10-46c6-b4f3-bfcdab904b92
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1548024929599_0004, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1548024929599_0004/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1548024929599_0004
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2019-01-21 08:05:09,771 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1548024929599_0004 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
The query fails with: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
12、Fix the error above
Check the YARN log: http://centos-aaron-h1:8088/proxy/application_1548024929599_0004/
Analysis: when Hive hands the job off to YARN on a remote node, some environment variables (here, the MapReduce classpath) can get lost;
Fix: add the following to mapred-site.xml, distribute the file to every node in the Hadoop cluster, and restart the cluster
<property>
  <name>mapreduce.application.classpath</name>
  <value>/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/*,
/home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce/lib/*</value>
</property>
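Before restarting the cluster it is worth sanity-checking that the two directories named in the classpath actually contain jars on each node. A sketch using scratch directories and made-up jar names so it runs anywhere; on a real node set BASE to /home/hadoop/apps/hadoop-2.9.1/share/hadoop/mapreduce and skip the mkdir/touch lines:

```shell
# BASE stands in for the real mapreduce share directory in this sketch.
BASE=./hadoop-2.9.1/share/hadoop/mapreduce
# Scratch stand-ins so the check has something to find (omit on a real node).
mkdir -p "$BASE/lib"
touch "$BASE/mapreduce-client.jar" "$BASE/lib/dep.jar"
# Report how many jars each classpath directory holds.
for d in "$BASE" "$BASE/lib"; do
  n=$(ls "$d"/*.jar 2>/dev/null | wc -l)
  echo "$d: $n jar(s)"
done
```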
13、Run the Hive query again: select count(id) from t_web_log01;
[hadoop@centos-aaron-h1 bin]$ hive

Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-1.2.2-bin/conf/hive-log4j.properties
hive> use wcc_log
    > ;
OK
Time taken: 0.487 seconds
hive> show tables;
OK
t_web_log01
Time taken: 0.219 seconds, Fetched: 1 row(s)
hive> select count(id) from t_web_log01;
Query ID = hadoop_20190121082042_c5392e1c-8db8-4329-bcdf-b0c332fcfe4f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1548029911300_0001, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1548029911300_0001/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1548029911300_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-01-21 08:21:05,410 Stage-1 map = 0%,  reduce = 0%
2019-01-21 08:21:14,072 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.38 sec
2019-01-21 08:21:21,290 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.32 sec
MapReduce Total cumulative CPU time: 3 seconds 320 msec
Ended Job = job_1548029911300_0001
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.32 sec   HDFS Read: 6642 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 320 msec
OK
5
Time taken: 40.218 seconds, Fetched: 1 row(s)
hive>
[hadoop@centos-aaron-h1 bin]$
As shown above, the query ran successfully and returned the result: 5 records.
A final word: that is everything for this post. If you found it helpful, please give it a like; if you are interested in my other posts on servers and big data, or in me as an author, please follow this blog, and feel free to reach out any time.