python调取hive 和sqoop之二
来源:互联网 发布:三菱编程实例100 编辑:程序博客网 时间:2024/05/20 00:12
#!/usr/bin/pythonimport time,osUSER="wilson.zhou"PWD="YCt452uz"URL="jdbc:postgresql://10.1.1.230:5432/xmo_dw"day = time.strftime('%Y-%m-%d',time.localtime(time.time() -24*60*60))[2::]sql = '''"insert overwrite directory '/shortdata/media_planner/{0}/'SELECT req.adx,req.city,req.d,req.url,req.adformat,req.size,count(req.bid),count(show.bid),count(click.bid),sum(win.price)FROM( SELECT concat_ws('','20',a.ds) d,b.adx,b.bid,b.city, parse_url(b.url,'HOST') url, CASE b.video WHEN 'false' THEN 'Banner' ELSE 'Rich Media' END adformat, concat_ws('x',c.w,c.h) SIZE FROM bs_rtbreq a LATERAL VIEW json_tuple(a.str, 'adx','bid','city','url','slots','video') b AS adx, bid,city,url,slots,video LATERAL VIEW json_tuple(substring(b.slots,2,length(b.slots)-2),'h','w') c AS h ,w WHERE a.ds='{0}' AND parse_url(b.url,'HOST') rlike '{1}' UNION ALL SELECT concat_ws('','20',ds) d, get_json_object(str,'$.adx') AS adx, get_json_object(str,'$.bid') AS bid, get_json_object(str,'$.city') AS city, parse_url(get_json_object(str,'$.url'),'HOST') AS url, CASE get_json_object(str,'$.video') WHEN 'false' THEN 'Banner' ELSE 'Rich Media' END adformat, concat_ws('x',get_json_object(str,'$.slots[0].w'),get_json_object(str,'$.slots[0].h')) SIZE FROM bs_rtbreq_tanx WHERE parse_url(get_json_object(str,'$.url'),'HOST') rlike '{1}' AND ds='{0}' ) reqLEFT JOIN ( SELECT a.ds,b.bid,b.price FROM bs_rtbwinner a LATERAL VIEW json_tuple(a.str, 'bid','price') b AS bid,price WHERE a.ds='{0}' ) win ON req.bid = win.bidLEFT JOIN ( SELECT a.ds,b.bid FROM bs_showup a LATERAL VIEW json_tuple(a.str,'bid') b AS bid WHERE a.ds='{0}' ) SHOW ON win.bid = SHOW.bidLEFT JOIN ( SELECT a.ds,b.bid FROM bs_click a LATERAL VIEW json_tuple(a.str, 'bid') b AS bid WHERE a.ds='{0}' ) click ON SHOW.bid = click.bidGROUP BY req.adx,req.city,req.d,req.url,req.adformat,req.SIZE"'''.format(day,'^[a-zA-Z0-9\-\.\_]+\.[a-zA-Z]{2,3}(/\S*)?\.?$','','','','','','')cmd = "hive -e "+sqlprint(cmd)os.system(cmd)print("hive execute end")print("sqoop execute start...")try: sline = "sqoop export --connect {0} --username {1} --password {2} --export-dir /shortdata/media_planner/{3} --table media_planner --fields-terminated-by '\001' --input-null-non-string '\\\N' --input-null-string '\\\N' --columns adx,city,ds,url,adformat,size,pv,impression,click,expense".format(URL,USER,PWD,day) print("sqoop cmd="+sline) os.system(sline)except: print("error line")print("all end")
还有一篇,一次性记下来:
0 0
- python调取hive 和sqoop之二
- 用python调用hive和sqoop
- 大数据基础(二)hadoop, mave, hbase, hive, sqoop在ubuntu 14.04.04下的安装和sqoop与hdfs,hive,mysql导入导出
- 简易串联,mysql,hive和sqoop
- python脚本 用sqoop把mysql数据导入hive
- Hadoop之Sqoop导出Hive表到Mysql
- Sqoop笔记二之命令使用
- Hadoop入门(三):Sqoop和Hive的使用
- hadoop日志【1】--hive服务和sqoop服务运行
- Hive部署(包括集成Hbase和Sqoop)
- 用sqoop实现mysql和hive数据互导
- sqoop:mysql和Hbase/Hive/Hdfs之间相互导入数据
- RDBMS和HDFS, HIVE, HBASE的迁移工具Sqoop
- Hive部署(包括集成Hbase和Sqoop)
- sqoop工具在hive和mysql之间互相导数据
- 安装部署(八) Hive+Sqoop安装部署和使用
- Sqoop进行MySQL和Hive间的导入导出操作
- 使用Sqoop从MySQL导入数据到Hive和HBase
- hibernate
- iic协议是什么?iic总线特点_iic总线工作原理
- Linux自动备份MySql数据库
- RequireJS以及AMD规范入门实践
- mysql中char与varchar的区别 哪种字段的查找效率高
- python调取hive 和sqoop之二
- 绝对的小抄:Photoshop的快捷键
- 一次MVVM+ReactiveCocoa实践
- iOS AVPlayer pop 释放内存 崩溃
- hibernate一对多双向注解
- <C语言超大数求和>实现思路及源码
- java 考试复习整理——JAVA类和类成员的修饰符
- 语法基本循环控制
- <10>Android HAL 硬件抽象层模块的加载过程