Hive join basics

LEFT JOIN is shorthand for LEFT OUTER JOIN; a left join is an outer join by default.

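Inner Join

The Inner Join logical operator returns every row that satisfies the join of the first (top) input with the second (bottom) input. This gives the same result as a plain SELECT over multiple tables with the match condition applied, so it is not used very often.

An outer join also returns every row that satisfies the join of the first (top) input with the second (bottom) input. In addition, it returns any rows from the first input that have no matching row in the second input.

The key is that last sentence: an outer join returns more rows. So LEFT JOIN in the ordinary sense is simply LEFT OUTER JOIN.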
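To make the difference concrete, here is a minimal Python sketch of the two behaviours on in-memory rows. It is only an illustration of the semantics, not how Hive executes joins; the sample rows, the function names, and the choice to match on the first field are all assumptions made for this example.

```python
# Toy rows standing in for the two join inputs (hypothetical data).
left_rows = [("a", 1), ("b", 2)]
right_rows = [("a", 0)]


def inner_join(left, right):
    """Return one combined row per (left, right) pair whose first fields match."""
    return [l + r for l in left for r in right if l[0] == r[0]]


def left_outer_join(left, right, right_width=2):
    """Like inner_join, but keep unmatched left rows, padding the right side with None."""
    result = []
    for l in left:
        matches = [r for r in right if l[0] == r[0]]
        if matches:
            result.extend(l + r for r in matches)
        else:
            # No partner in the right input: the right-hand columns become NULL (None).
            result.append(l + (None,) * right_width)
    return result


print(inner_join(left_rows, right_rows))
print(left_outer_join(left_rows, right_rows))
```

The row for "b" has no partner in the right input, so only the left outer join returns it, with None standing in for NULL in the right-hand columns.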


CREATE TABLE t1(
  name string,
  age  int
)
PARTITIONED BY (hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ':'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;


CREATE TABLE t2(
  name string,
  sex  int
)
PARTITIONED BY (hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ':'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;


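Because the second column is an int but the value we loaded into it is a string, Hive displays NULL for it. The line in the data file:

b       c

In Hive:

b       NULL    2011111111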
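The effect can be imitated in a few lines of Python. This is only a rough sketch of the idea that a text field which cannot be read as an int ends up as NULL: the helper name read_int_column is hypothetical, and Python's int() is more permissive than Hive's parser in some cases (for example, it tolerates surrounding whitespace), so treat it as an approximation.

```python
def read_int_column(raw):
    """Approximate an int column reading a text field: unparsable text becomes None (NULL)."""
    try:
        return int(raw)
    except ValueError:
        return None


# One tab-separated line from the data file, plus the partition value from the output above.
line = "b\tc"
name, second = line.split("\t")
row = (name, read_int_column(second), "2011111111")
print(row)
```

The string "c" cannot be parsed as an int, so the second field of the row is None, which matches the NULL in the output above.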


LOAD DATA LOCAL INPATH '/opt/smc/xuanli/data/t1.txt' OVERWRITE INTO TABLE test.t1 partition (hour='2011111112');

LOAD DATA LOCAL INPATH '/opt/smc/xuanli/data/t2.txt' OVERWRITE INTO TABLE test.t2 partition (hour='2011111111');


select t1.*,t2.* from t1 left join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 right join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 INNER  join t2  on (t1.hour=t2.hour and t1.name=t2.name);

select t1.*,t2.* from t1 full join t2  on (t1.hour=t2.hour and t1.name=t2.name);

SELECT t1.name  FROM t1 LEFT SEMI JOIN t2  on (t1.hour=t2.hour and t1.name=t2.name);
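The LEFT SEMI JOIN in the last query behaves differently from the others: it returns only columns from t1, and each t1 row appears at most once however many t2 rows match it. A minimal Python sketch of that behaviour, assuming rows are (name, hour) tuples and using a hypothetical helper name:

```python
def left_semi_join(left, right):
    """Keep each left row that has at least one match in right; never duplicate it."""
    # Keys containing None are skipped, since NULL never compares equal in a join condition.
    right_keys = {r for r in right if None not in r}
    return [l for l in left if l in right_keys]


# Hypothetical sample rows: (name, hour).
t1_rows = [("a", "h1"), ("b", "h1")]
t2_rows = [("a", "h1"), ("a", "h1")]  # two matches for "a"

print(left_semi_join(t1_rows, t2_rows))
```

Although t2 holds two rows matching "a", the result contains that t1 row once, whereas an inner join would return it twice.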


Rows that exist in t1 but not in t2:

select t1.*,t2.* from t1 left join t2  on (t1.hour=t2.hour and t1.name=t2.name) where t1.name is not null and t2.name is null;

select t1.*,t3.* from t1 left join (select name,hour from test.t2) t3  on (t1.hour=t3.hour and t1.name=t3.name) where t1.name is not null and t3.name is null;
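Both queries use the same pattern: a left join followed by a filter on a NULL right-hand side, which keeps only the t1 rows that have no partner in t2. A small Python sketch of that logic, assuming rows are dicts with name and hour keys (the helper name and sample data are hypothetical):

```python
def rows_only_in_left(left, right):
    """Return left rows with a non-NULL name and no (hour, name) match in right."""
    # Right rows with a NULL key can never match, so leave them out of the lookup set.
    right_keys = {
        (r["hour"], r["name"])
        for r in right
        if r["hour"] is not None and r["name"] is not None
    }
    return [
        l for l in left
        if l["name"] is not None and (l["hour"], l["name"]) not in right_keys
    ]


# Hypothetical sample rows.
t1_rows = [
    {"name": "a", "hour": "h1"},
    {"name": "b", "hour": "h1"},
    {"name": None, "hour": "h1"},
]
t2_rows = [{"name": "a", "hour": "h1"}]

print(rows_only_in_left(t1_rows, t2_rows))
```

Only the "b" row survives: "a" has a match in t2, and the row with a NULL name is dropped by the t1.name is not null condition.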
