hive加载数据

来源:互联网 发布:中金公司 知乎 编辑:程序博客网 时间:2024/06/08 05:40

创建表时加载

create table newtable as select col1,col2 from oldtable;

创建表时指定数据位置

create table tablename() location ''root@ubuntu:~# cat student1.txt xiaoming    10xiaohong    12xiaoli  13xiaowang    11root@ubuntu:~# hadoop fs -mkdir /user/hive/warehouse/dataroot@ubuntu:~#  hadoop fs -put student1.txt /user/hive/warehouse/data/hive> create table student2(name string,age int) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile location '/user/hive/warehouse/data/';OKTime taken: 0.272 secondshive> select * from student2;OKxiaoming    10xiaohong    12xiaoli  13xiaowang    11Time taken: 0.45 seconds, Fetched: 4 row(s)

本地数据加载

load data local inpath 'localpath' [pverwrite] into table tablename;root@ubuntu:~# cat student2.txt lucy    21lily    22hive> load data local inpath '/root/student2.txt' into table student2;Copying data from file:/root/student2.txtCopying file: file:/root/student2.txtLoading data to table default.student2[Warning] could not update stats.OKTime taken: 2.43 secondshive> select * from student2;OKxiaoming    10xiaohong    12xiaoli  13xiaowang    11lucy    21lily    22Time taken: 0.066 seconds, Fetched: 6 row(s)

加载hdfs数据

load data inpath 'hdfspath' [overwrite] into table tablename;(移动数据)

分区表数据加载
本地数据加载

load data local inpath 'localpath' [pverwrite] into table tablename partition(pn='');

加载hdfs数据

load data inpath 'hdfspath' [overwrite] into table tablename partition(pn='');

由查询语句加载数据

insert [overwrite|into] table tablename partition(pn='') select col1,col2 from table where ....

分区表数据加载

root@ubuntu:~# cat student1.txt xiaoming    10xiaohong    12xiaoli  13xiaowang    11hive> load data local inpath '/root/student1.txt' into table partitionTable partition(dt='20160818');Copying data from file:/root/student1.txtCopying file: file:/root/student1.txtLoading data to table default.partitiontable partition (dt=20160818)Partition default.partitiontable{dt=20160818} stats: [numFiles=1, numRows=0, totalSize=46, rawDataSize=0][Warning] could not update stats.OKTime taken: 1.831 secondshive> select * from partitionTable;OKxiaoming    10  20160818xiaohong    12  20160818xiaoli  13  20160818xiaowang    11  20160818Time taken: 0.354 seconds, Fetched: 4 row(s)hive> show partitions partitionTable;OKdt=20160818Time taken: 4.091 seconds, Fetched: 1 row(s)

这里写图片描述

外部分区表加载别忘了

alter table tablename add partition(dt='xxxx');

hive数据加载注意事项
分隔符,单个字符,写多个的话只取第一个
数据类型对应问题
load数据, 字段类型不能相互转化是,查询返回null
select查询插入,字段类型不能互相转化时,插入数据为null

root@ubuntu:~# cat test.txt lucy#hellojack#rose13#3545 hive> create table testType(name int,age string) row format delimited fields terminated by '#' lines terminated by '\n' stored as textfile;OKTime taken: 2.088 secondshive> load data local inpath '/root/test.txt' into table testType;hive> select * from testType;OKNULL    helloNULL    rose13  3545 Time taken: 0.035 seconds, Fetched: 3 row(s)root@ubuntu:~# hadoop fs -cat /user/hive/warehouse/testtype/test.txtlucy#hellojack#rose13#3545 

select查询插入数据,字段值顺序要与表中字段顺序一致,名称可不一致,hive在加载数据时不做检查,查询是检查

外部分区表需要添加分区才能看到数据

0 0