Hadoop CDH4.4 Cluster Setup
Source: 程序博客网, 2024/06/11
After more than a year of PaaS development I have moved to big data; these are some installation notes kept for reference.
Cluster layout:
hadoop-001 10.168.204.55 NameNode, SecondaryNameNode, ResourceManager
hadoop-002 10.168.204.56 DataNode, NodeManager
hadoop-003 10.168.204.57 DataNode, NodeManager
hadoop-004 10.168.204.58 DataNode, NodeManager
Hadoop version: CDH4.4.0
CentOS version: 6.3
I. Preparation
1. JDK 1.7
http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.rpm
sudo rpm -ivh jdk-7u45-linux-x64.rpm
alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_45/bin/java 300
alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_45/bin/javac 300
alternatives --config java
2. Set the hostname
vim /etc/sysconfig/network  # set each server's hostname; takes effect after a reboot
Configure /etc/hosts:
192.168.204.55 hadoop-001
192.168.204.56 hadoop-002
192.168.204.57 hadoop-003
192.168.204.58 hadoop-004
3. Disable the firewall
service iptables status
service iptables stop
chkconfig iptables off
4. Disable SELinux
vim /etc/selinux/config  # set SELINUX=disabled
5. Create a hadoop user and make it a sudoer
adduser hadoop
passwd hadoop
sudo vim /etc/sudoers  # add: hadoop ALL=(ALL) ALL
6. Passwordless SSH
# as the hadoop user
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
Then scp authorized_keys to the other slave servers.
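The copy step can be scripted as a small loop; the hostnames come from the cluster layout above, and the `echo` makes it a dry run (remove it to actually copy):

```shell
#!/bin/sh
# Distribute the master's authorized_keys to each slave.
# Dry run: prints the scp commands; remove `echo` to execute them.
distribute_keys() {
  for host in "$@"; do
    echo scp ~/.ssh/authorized_keys "hadoop@${host}:~/.ssh/"
  done
}

distribute_keys hadoop-002 hadoop-003 hadoop-004
```

After copying, `ssh hadoop@hadoop-002 hostname` should log in without prompting for a password.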
II. Installation
1. Download the CDH4.4 tarball
mkdir cdh4.4.0
cd cdh4.4.0
wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.4.0.tar.gz
tar -xvzf hadoop-2.0.0-cdh4.4.0.tar.gz
2. Set environment variables
Edit /etc/profile or ~/.bashrc; ~/.bashrc is used here, but either works.
export JAVA_HOME=/usr/java/jdk1.7.0_45
export HADOOP_HOME=/home/hadoop/cdh4.4.0/hadoop-2.0.0-cdh4.4.0
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_LIB=$HADOOP_HOME/lib
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:/etc/haproxy/sbin/:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$HADOOP_LIB/native/libhadoop.so
libhadoop.so is actually needed later, when installing Impala.
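A quick sanity check that the native library is where `JAVA_LIBRARY_PATH` points (a minimal sketch; the path assumes the install layout used above):

```shell
#!/bin/sh
# Return success if libhadoop.so exists under the given lib/native directory.
has_native_lib() {
  [ -f "$1/libhadoop.so" ]
}

# Path assumes the layout above: ~/cdh4.4.0/hadoop-2.0.0-cdh4.4.0
if has_native_lib "$HOME/cdh4.4.0/hadoop-2.0.0-cdh4.4.0/lib/native"; then
  echo "libhadoop.so found"
else
  echo "libhadoop.so missing"
fi
```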
3. Configuration files
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-001:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>10080</value>
  </property>
  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>10080</value>
  </property>
  <!--
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
  -->
  <!-- OOZIE -->
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>hadoop-001</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>hadoop</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!--
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
  -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop-001:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>hadoop-001:50090</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <!-- for Impala
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn._PORT</value>
  </property>
  <property>
    <name>dfs.client.file-block-storage-locations.timeout</name>
    <value>3000</value>
  </property>
  <property>
    <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
    <value>true</value>
  </property>
  -->
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop-001:18025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop-001:18040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop-001:18030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop-001:18141</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop-001:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop-001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop-001:19888</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hadoop-001:8021</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/hadoop/mapred/local</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.child.env</name>
    <value>LD_LIBRARY_PATH=/usr/local/lib</value>
  </property>
  <!--
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>
  -->
</configuration>
4. Prepare the HDFS directories
sudo mkdir -p /hadoop/tmp /hadoop/mapred/system /hadoop/mapred/local /hadoop/name /hadoop/data
sudo chown -R hadoop:hadoop /hadoop
5. scp CDH4.4 to the slave nodes
scp -r cdh4.4.0/ hadoop-002:~/
scp -r cdh4.4.0/ hadoop-003:~/
scp -r cdh4.4.0/ hadoop-004:~/
III. Startup
1. Format the filesystem
# on the hadoop-001 master node
cd cdh4.4.0/hadoop-2.0.0-cdh4.4.0/bin
./hadoop namenode -format
2. Start the cluster
cd cdh4.4.0/hadoop-2.0.0-cdh4.4.0/sbin
./start-all.sh
Run jps to check that the expected processes are up.
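The check can be made a bit more systematic with a helper that compares jps output against the expected daemon list from the cluster layout at the top (a sketch; the daemon names are the standard HDFS/YARN ones):

```shell
#!/bin/sh
# check_daemons EXPECTED JPS_OUTPUT
# Prints OK/MISSING for each expected daemon name found in `jps` output.
check_daemons() {
  expected=$1    # space-separated daemon names
  jps_out=$2     # output of `jps`
  for proc in $expected; do
    if printf '%s\n' "$jps_out" | grep -qw "$proc"; then
      echo "$proc OK"
    else
      echo "$proc MISSING"
    fi
  done
}

# On the master: check_daemons "NameNode SecondaryNameNode ResourceManager" "$(jps)"
# On a slave:    check_daemons "DataNode NodeManager" "$(jps)"
```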
IV. Problems encountered
Weibo: http://weibo.com/kingjames3