What can Hive do?
Why use Hive?
Hive compared with traditional databases
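Aspect | Hive | RDBMS |
Query language | HQL | SQL |
Data storage | HDFS | Raw device or local FS |
Execution | MapReduce | Executor |
Execution latency | High | Low |
Data scale | Large | Small |
Data type | All data (historical and online, for analysis) | Online data |
Redundancy | High | Low (normalized) |
... | ... | ... |
... | ... | ... |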
Hive architecture
Hive-related concepts
protected List<Operator<? extends Serializable>> childOperators;
protected List<Operator<? extends Serializable>> parentOperators;
protected boolean done; // initial value is false
The three Hive modes
1. Local Derby

This is the simplest storage option. It only requires the following configuration in hive-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>

Note: with Derby storage, running hive creates a derby file and a metastore_db directory in the current directory. The drawback of this option is that, within one directory, only one hive client can use the database at a time. A second client gets the following error:

hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Failed to start database 'metastore_db', see the next exception for details.
NestedThrowables:
java.sql.SQLException: Failed to start database 'metastore_db', see the next exception for details.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

2. Local MySQL

This option requires a MySQL server running on the local machine and the following configuration. The MySQL driver jar must also be copied into the $HIVE_HOME/lib directory.

# /opt/hive-1.2.1/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive_remote/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/hive_remote?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
</configuration>

Appendix:

Install MySQL:
yum install mysql-server -y

Start the service, then open the MySQL client:
service mysqld start
mysql

Change the MySQL privileges:
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123' WITH GRANT OPTION;
flush privileges;

Delete the extra rows that would interfere with these privileges (run inside the mysql database), then flush privileges again:
delete from user where Host != '%';

If the following error appears:

[ERROR] Terminal initialization failed; falling back to unsupported
java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected
    at jline.TerminalFactory.create(TerminalFactory.java:101)

the cause is that the jline version shipped with Hadoop does not match the jline version shipped with Hive.

3. Remote MySQL

3.1. Remote, all-in-one

This option requires a MySQL server running on a remote host, and the metastore service must be started on the Hive server.

Here a MySQL test server is used, with IP 192.168.1.214. Create a new hive_remote database on it with the latin1 character set.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.57.6:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.1.188:9083</value>
  </property>
</configuration>

Note: here the Hive server and client are both placed on the same machine. The server and client can also be separated.

3.2. Remote, split

Split the hive-site.xml configuration file into the following two parts.

- Server configuration file. Start the service with: hive --service metastore

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.57.6:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>

- Client configuration file. Start the client with: hive

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://slave2:9083</value>
  </property>
</configuration>
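
The configurations in this section differ only in a few properties, so it can help to check a hive-site.xml before starting Hive. The following is a minimal Python sketch, not part of Hive itself, that reads the name/value pairs and guesses which mode a file describes. The function names read_properties and metastore_mode and the returned labels are made up for this illustration. The sketch assumes that a non-empty hive.metastore.uris means a remote metastore, and that otherwise the JDBC URL decides. It deliberately ignores hive.metastore.local, since Hive 0.10 and later no longer honor that property.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return the <name>/<value> pairs of a hive-site.xml document as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def metastore_mode(props):
    """Guess the metastore mode from a property dict.

    Assumption of this sketch: a non-empty hive.metastore.uris means the
    client talks to a remote metastore service; otherwise the JDBC URL
    decides between embedded Derby and a directly attached MySQL.
    """
    if props.get("hive.metastore.uris"):
        return "remote"
    url = props.get("javax.jdo.option.ConnectionURL", "")
    if url.startswith("jdbc:derby:"):
        return "local-derby"
    if url.startswith("jdbc:mysql:"):
        return "local-mysql"
    return "unknown"


# Sample inputs copied from the configurations in this section.
CLIENT_XML = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://slave2:9083</value>
  </property>
</configuration>"""

DERBY_XML = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(metastore_mode(read_properties(CLIENT_XML)))
    print(metastore_mode(read_properties(DERBY_XML)))
```

With this rule, the server-side file from 3.2 is reported as local-mysql, which matches its role: the metastore service itself connects to MySQL directly, and only the clients go through thrift.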