Sqoop in Practice (Part 5)

Date: 2019-11-07

Tags: sqoop, in practice

1 Importing Data Directly into Hive (relational database to Hive)

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table tbl_place \
--hive-import

The source table must have a primary key! (Without one, Sqoop needs --split-by or a single mapper via -m 1.)
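If the table has no primary key, a common workaround is to name a split column or fall back to a single mapper. The sketch below is a dry run that only prints the command. The column name `id` is a hypothetical numeric column, not something taken from tbl_place.

```shell
# Print (dry run) a Sqoop import for a table without a primary key.
# ASSUMPTION: "id" is a hypothetical numeric column; substitute a real one.
import_without_pk() {
  split_col="$1"
  if [ -n "$split_col" ]; then
    split_opt="--split-by $split_col"   # split the work on this column
  else
    split_opt="-m 1"                    # no split column: single mapper
  fi
  echo "sqoop import" \
    "--connect jdbc:mysql://192.168.130.221/sqoop" \
    "--username root --password root" \
    "--table tbl_place --hive-import $split_opt"
}

import_without_pk id   # parallel import, split on the given column
import_without_pk ""   # sequential import with one mapper
```

Copy the printed command into the terminal once the options look right.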

A plain Sqoop import and a Hive import use different default import paths! In either case, the path must not already exist.

Success!

**For example, if you want to change the Hive type of column id to STRING and column price to DECIMAL, you can specify the following Sqoop parameters:

sqoop import \
...
--hive-import \
--map-column-hive id=STRING,price=DECIMAL

My own attempt:

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table tbl_place \
--hive-import \
--map-column-hive place_code=STRING

Success!

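**How an import into Hive works:

1 The data is first imported into a temporary location.

2 After that import succeeds, Sqoop generates two Hive queries:

3 one creating a table,

4 one loading the data from a temporary location.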

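**The temporary location can be specified with either --target-dir or --warehouse-dir. Do not use /user/hive/warehouse for it, though, because that can easily cause problems in the third stage.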
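The temporary location can be set explicitly. Here is a minimal dry-run sketch that prints the command instead of running it. The path /tmp/sqoop_staging/tbl_place is a hypothetical HDFS directory, assumed not to exist yet and to lie outside /user/hive/warehouse.

```shell
# Dry run: build the command as a string and print it instead of executing it.
# ASSUMPTION: /tmp/sqoop_staging/tbl_place is a hypothetical HDFS path that
# does not exist yet and is not under /user/hive/warehouse.
staging_dir="/tmp/sqoop_staging/tbl_place"

cmd="sqoop import \
  --connect jdbc:mysql://192.168.130.221/sqoop \
  --username root \
  --password root \
  --table tbl_place \
  --hive-import \
  --target-dir $staging_dir"

echo "$cmd"
```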

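**By default, if the existing Hive table already contains data, the imported data is appended to it.

To delete the existing data before importing, use the parameter --hive-overwrite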

My own attempt:

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table tbl_place \
--hive-import \
--hive-overwrite

Success!

3 Using Partitioned Hive Tables

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table tbl_place \
--hive-import \
--hive-partition-key place_code \
--hive-partition-value "2016-02-26"

This command fails with an error. I wonder whether it is because my Hive installation is not a cluster? Still to be resolved!

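**Sqoop requires the partition column to be of string type.

**Hive implements partitions as virtual columns; they are not part of the data itself.

**The value of the --hive-partition-value parameter cannot be a column name.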
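One thing worth checking here, offered as an assumption rather than a confirmed diagnosis: since the partition is a virtual column outside the data, the name passed to --hive-partition-key should not also be a column of the imported table. The dry-run sketch below uses a hypothetical partition column name `load_date` that is assumed not to exist in tbl_place. It only prints the command.

```shell
# Dry run: print the command instead of executing it.
# ASSUMPTION: "load_date" is a hypothetical partition column name that is NOT
# one of the columns of tbl_place; Hive keeps partition columns outside the data.
partition_key="load_date"
partition_value="2016-02-26"

cmd="sqoop import \
  --connect jdbc:mysql://192.168.130.221/sqoop \
  --username root \
  --password root \
  --table tbl_place \
  --hive-import \
  --hive-partition-key $partition_key \
  --hive-partition-value \"$partition_value\""

echo "$cmd"
```

Every row of this import lands in the single partition load_date=2016-02-26, because the partition value is a constant for the whole run.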

4 Replacing Special Delimiters During Hive Import (special characters)

**When the data contains Hive's delimiter characters and you want to strip them out:

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table cities \
--hive-import \
--hive-drop-import-delims

**You can also replace these delimiters instead:

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table cities \
--hive-import \
--hive-delims-replacement "SPECIAL"

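**The Hive delimiters are: \n, \t, and \01 (note that the Sqoop user guide lists \n, \r, and \01 as the characters these two options act on)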
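To see what the two options do to a single string value, here is a local simulation using tr and awk. It does not call Sqoop. The sample value and the helper names drop_delims and replace_delims are made up for illustration, and the character set follows the Sqoop user guide (\n, \r and \01).

```shell
# Local simulation only: mimics the effect on one string value, no Sqoop involved.
# Sample value containing a newline and a \01 (Ctrl-A) character.
sample=$(printf 'New\nYork\001City')

# Like --hive-drop-import-delims: delete \n, \r and \01.
drop_delims() {
  printf '%s' "$1" | tr -d '\n\r\001'
}

# Like --hive-delims-replacement: replace each of them with the given string.
replace_delims() {
  printf '%s' "$1" | awk -v rep="$2" '
    BEGIN { cr = sprintf("%c", 13); soh = sprintf("%c", 1) }
    {
      gsub(cr, rep); gsub(soh, rep)        # \r and \01 inside the line
      out = (NR == 1) ? $0 : out rep $0    # \n between lines
    }
    END { printf "%s", out }'
}

drop_delims "$sample"; echo               # prints NewYorkCity
replace_delims "$sample" "SPECIAL"; echo  # prints NewSPECIALYorkSPECIALCity
```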

5 Using the Correct NULL String in Hive

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table cities \
--hive-import \
--null-string '\\N' \
--null-non-string '\\N'

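**Hive represents NULL as \N by default.

**Sqoop writes NULL as the string null by default.

**The mismatch does not raise an exception, which is a nasty trap! The NULL values get imported, but they cannot be queried as NULL (e.g. with IS NULL), so they are useless.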

6 Importing Data into HBase

sqoop import \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table cities \
--hbase-table cities \
--column-family world

**To insert data into HBase there are three mandatory parameters:

1 Table name

2 Column family name

3 Row key

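**The row key can be set explicitly with the --hbase-row-key parameter.

**Sqoop does not create the HBase table automatically. To have it created, specify --hbase-create-table.

**Without that option, the table and the column family must already exist before the import command is run.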
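Here is a dry-run sketch that combines the three mandatory pieces with automatic table creation. The row key column `id` is a hypothetical column of the cities table. The command is only printed.

```shell
# Dry run: print the command instead of executing it.
# ASSUMPTION: "id" is a hypothetical key column of the cities table.
row_key="id"

cmd="sqoop import \
  --connect jdbc:mysql://192.168.130.221/sqoop \
  --username root \
  --password root \
  --table cities \
  --hbase-table cities \
  --column-family world \
  --hbase-row-key $row_key \
  --hbase-create-table"

echo "$cmd"
```

With --hbase-create-table, the import no longer depends on the table and column family having been created by hand beforehand.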

7 Importing All Rows into HBase

sqoop import \
-Dsqoop.hbase.add.row.key=true \
--connect jdbc:mysql://192.168.130.221/sqoop \
--username root \
--password root \
--table cities \
--hbase-table cities \
--column-family world

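**HBase does not allow NULL values.

**When serializing rows for HBase, Sqoop skips NULL values.

**sqoop.hbase.add.row.key instructs Sqoop to insert the row key column twice: once as the row identifier and once more in the data itself. Even if all the other columns are NULL, the column holding the row key is not, which allows the row to be inserted into HBase.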

8 Improving Performance When Importing into HBase

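**Before importing, create the HBase table with more regions.

hbase> create 'cities', 'world', {NUMREGIONS => 20, SPLITALGO => 'HexStringSplit'}

**By default, every new HBase table has only one region, which can be served by only one region server. This means that every new table is served by just one physical node.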
