Hive 实现 IN 和 NOT IN 子句

来源:互联网 发布:2017院士 知乎 编辑:程序博客网 时间:2024/06/10 15:09

目前hive不支持 in或not in 中包含查询子句的语法,所以只能通过left join实现。

假设有一个用户浏览商品表skim,和一个用户购买商品表buy。如下

skim表

userId itemId time 001 342 2015-05-08 002 382 2015-05-09 003 458 2015-05-09 004 468 2015-05-09

buy表

userId itemId time 001 342 2015-05-07 002 382 2015-05-08 003 458 2015-05-09 005 325 2015-05-09

select * from skim left outer join buy
on skim.userId = buy .userId and skim.itemId = buy .itemId
结果如下:

skim.userId skim.itemId skim.time buy.userId buy.itemId buy.time 001 342 2015-05-08 001 342 2015-05-08 002 382 2015-05-09 002 382 2015-05-09 003 458 2015-05-09 003 458 2015-05-09 004 468 2015-05-09

IN

如果要查询在skim表中并且也在buy表中的信息,需要用in查询,hive sql如下:

select skim.userId , skim.itemId from skim left outer join buy
on skim.userId = buy .userId and skim.itemId = buy .itemId where buy .userId is not null;

结果如下:

userId itemId 001 342 002 382 003 458

NOT IN

如果要查询在skim表中并且不也在buy表中的信息,需要用not in查询,hive sql如下:

select skim.userId, skim.itemId from skim left outer join buy
on skim.userId=buy .userId and skim.itemId=buy .itemId where buy .userId is null;

结果如下:

userId itemId 004 468
0 0
原创粉丝点击