Greenplum Cluster: gpfdist in Practice
Source: Internet  Editor: 程序博客网  Time: 2024/06/11 16:19
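gpfdist is Greenplum's parallel file server component. It lets the segments read external data in parallel, so loads can use the maximum available parallelism and bandwidth. A Greenplum cluster installation already includes gpfdist, but to run it on a separate server you must install it there as a standalone component by downloading and installing the loaders package.
1. Download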
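Download from https://network.pivotal.io/products/pivotal-gpdb#/releases/4540/file_groups/561 and choose the loaders package that matches your Greenplum Database release. The loaders package includes the gpfdist component.
[Figure: screenshot of the loaders download page]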
2. Installation
Prerequisite component (libyaml, built from source):
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
wget http://pyyaml.org/download/libyaml/yaml-0.1.7.tar.gz
tar -xvf yaml-0.1.7.tar.gz
cd yaml-0.1.7
./configure
make
make install
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(1) Unzip the package
unzip greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.zip
(2) Create the software directory
mkdir /data/greenplum
chown -R gpadmin:gpadmin /data/greenplum
(3) Run the installer
sh greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.bin -y
(4) Check the installed components; gpfdist and gpload should both be listed
[gpadmin@db_m2_slave1 ~]$ ll /data/greenplum/bin
total 756
drwxr-xr-x 4 gpadmin gpadmin 4096 May 10 2016 ext
-rwxr-xr-x 1 gpadmin gpadmin 663372 May 10 2016 gpfdist
-rwxr-xr-x 1 gpadmin gpadmin 311 May 10 2016 gpload
-rwxr-xr-x 1 gpadmin gpadmin 100338 May 10 2016 gpload.py
[gpadmin@db_m2_slave1 ~]$
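The same check can be scripted instead of read off the listing by eye. The following is a minimal sketch, not part of the loaders package: the function name check_loaders is hypothetical, and it only tests the two file names shown in the listing above.

```shell
# Hypothetical helper: verify that gpfdist and gpload exist and are
# executable under the given bin directory. Prints each missing name
# and returns non-zero if at least one is missing.
check_loaders() {
    bin_dir="$1"
    missing=0
    for name in gpfdist gpload; do
        if [ ! -x "$bin_dir/$name" ]; then
            echo "missing or not executable: $bin_dir/$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For the install directory used here, the call would be: check_loaders /data/greenplum/bin && echo "loaders OK"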
3. Usage
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Start command (-d sets the directory to serve, -p the HTTP port):
nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log &
Startup output:
[gpadmin@db_m2_slave1 ~]$ nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log &
[1] 27003
[gpadmin@db_m2_slave1 ~]$
[gpadmin@db_m2_slave1 ~]$ more /home/gpadmin/gpfdist.log
2017-05-12 14:10:31 27003 INFO Before opening listening sockets - following listening sockets are available:
2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090
2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090
2017-05-12 14:10:31 27003 INFO Trying to open listening socket:
2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090
2017-05-12 14:10:31 27003 INFO Opening listening socket succeeded
2017-05-12 14:10:31 27003 INFO Trying to open listening socket:
2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090
Serving HTTP on port 8090, directory /home/gpadmin
[gpadmin@db_m2_slave1 ~]$
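When gpfdist is started from a script, it can help to wait until the log shows the "Serving HTTP on port" line seen above before creating external tables. The sketch below assumes the log is written to a file as in the start command; the function name wait_for_gpfdist and the default 10 second limit are hypothetical choices.

```shell
# Hypothetical helper: poll a gpfdist log file until the
# "Serving HTTP on port" line appears, checking once per second.
# Returns 0 when the line is found, 1 after max_wait seconds.
wait_for_gpfdist() {
    log_file="$1"
    max_wait="${2:-10}"
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        if grep -q "Serving HTTP on port" "$log_file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Example call, using the log path from the start command above:
# wait_for_gpfdist /home/gpadmin/gpfdist.log 10 || echo "gpfdist did not report ready"
```

Matching on the log text is only as reliable as the log format; this relies on the line shown in the output above and may need adjusting for other gpfdist versions.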
4. Creating an external table through the gpfdist service
Create the test data: prepare two text files named t01.txt and t02.txt.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[gpadmin@db_m2_slave1 gpdextdata]$ pwd
/home/gpadmin/gpdextdata
[gpadmin@db_m2_slave1 gpdextdata]$ more t01.txt
1|aaa
2|zhangsan
[gpadmin@db_m2_slave1 gpdextdata]$ more t02.txt
3|wanger
4|mazi
[gpadmin@db_m2_slave1 gpdextdata]$
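To recreate the same two files without typing them by hand, something like the following could be used. It is a small sketch: the function name make_test_data is hypothetical, and the target directory is passed in because it has to sit under the directory gpfdist serves.

```shell
# Hypothetical helper: write the two pipe-delimited sample files
# (t01.txt and t02.txt) into the given directory, creating it if needed.
make_test_data() {
    data_dir="$1"
    mkdir -p "$data_dir" || return 1
    printf '%s\n' '1|aaa' '2|zhangsan' > "$data_dir/t01.txt"
    printf '%s\n' '3|wanger' '4|mazi' > "$data_dir/t02.txt"
}
```

Each file ends up with two rows in the id|name layout that the external table expects.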
On the Greenplum database, create an external table that points to the t01.txt and t02.txt files served by gpfdist. Run the following SQL statement in a psql session:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
create external table public.t01_ext_1 (
    id   integer,
    name varchar(128)
)
location (
    -- Alternative file lists, kept for reference:
    -- 'gpfdist://101.254.13.72:8090/gpextdata/test001.txt',
    -- 'gpfdist://101.254.3.72:8090/gpextdata/test002.txt'
    -- or a wildcard: 'gpfdist://101.254.13.72:8090/gpextdata/*.txt'
    'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',
    'gpfdist://101.254.13.72:8090/gpextdata/t02.txt'
)
format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')
-- Optional: encoding 'GB18030' log errors into public.test001_err segment reject limit 10 rows
;
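The path part of each gpfdist URL is resolved relative to the directory given to gpfdist with -d, so it has to match the real directory name on disk. Note that the shell listing earlier shows the directory as gpdextdata while the URLs use gpextdata; whichever spelling exists on the server is the one the URL must use. The sketch below makes the mapping explicit; the function name gpfdist_local_path is hypothetical and it only does string manipulation, it does not contact gpfdist.

```shell
# Hypothetical helper: print the local file a gpfdist URL refers to,
# given the directory that was passed to gpfdist with -d.
gpfdist_local_path() {
    root="${1%/}"              # strip one trailing slash from the -d directory
    url="$2"
    rest="${url#gpfdist://}"   # drop the scheme
    path="${rest#*/}"          # drop host:port up to the first slash
    echo "$root/$path"
}

# Example with the values used in this article:
# gpfdist_local_path /home/gpadmin/ gpfdist://101.254.13.72:8090/gpextdata/t01.txt
# prints /home/gpadmin/gpextdata/t01.txt
```

Checking that the printed path exists on the gpfdist host is a quick way to rule out a path mismatch when a query on the external table reports a missing file.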
Execution:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(1) The external table is created successfully and can be queried:
yueworld_db=# create external table public.t01_ext_1 (
yueworld_db(# id integer,
yueworld_db(# name varchar(128)
yueworld_db(# )
yueworld_db-# location (
yueworld_db(# 'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',
yueworld_db(# 'gpfdist://101.254.13.72:8090/gpextdata/t02.txt'
yueworld_db(# )
yueworld_db-# Format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')
yueworld_db-# ;
CREATE EXTERNAL TABLE
yueworld_db=#
yueworld_db=# select * from public.t01_ext_1;
id | name
----+------
1 | aaa
2 | zhangsan
3 | wanger
4 | mazi
(4 rows)
yueworld_db=#
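A common next step, not shown in the walkthrough above, is to copy the rows from the external table into a regular table. The sketch below only prints the SQL so it can be reviewed before it is run. The function name load_sql and the target table name are hypothetical, the column list is taken from the external table definition above, and distributing by id is an assumption rather than a recommendation.

```shell
# Hypothetical helper: print SQL that creates a regular table with the
# same two columns as the external table and copies all rows into it.
# Usage: load_sql <target_table> <external_table>
load_sql() {
    target_table="$1"
    ext_table="$2"
    cat <<EOF
create table $target_table (id integer, name varchar(128)) distributed by (id);
insert into $target_table select id, name from $ext_table;
EOF
}

# Example: review the SQL first, then pipe it to psql.
# load_sql public.t01 public.t01_ext_1
# load_sql public.t01 public.t01_ext_1 | psql -d yueworld_db
```

The insert reads the files through gpfdist at the moment it runs, so gpfdist must still be serving the directory at that point.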