Compiling the hadoop (1.1.2) eclipse plugin

-------------------------------------------------------------------
Download and install eclipse
-------------------------------------------------------------------
$sudo apt-get install eclipse
#this automatically installs eclipse-jdt; its jar files live in '/usr/share/eclipse/dropins/jdt/plugins' on ubuntu 12.10 and in '/usr/lib/eclipse/dropins/jdt/plugins' on ubuntu 12.04
#ubuntu 12.04 is used here
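
A quick sanity check, since the jdt path varies across ubuntu releases as noted above:
$ls /usr/lib/eclipse/dropins/jdt/plugins
#should list org.eclipse.jdt.*.jar files; if the directory is missing, check /usr/share/eclipse/dropins/jdt/plugins instead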

$sudo vim /etc/profile
Add:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
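
Reload the profile and verify the JDK path (a quick check, assuming openjdk-6 is installed at the path above):
$source /etc/profile
$echo $JAVA_HOME
$ls $JAVA_HOME/bin/javac
#ant needs javac from this JDK to build the plugin later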


-------------------------------------------------------------------
Download and install hadoop
-------------------------------------------------------------------
$sudo tar -xzf hadoop-1.1.2.tar.gz -C /usr/local
$sudo chown -R hadoop:hadoop /usr/local/hadoop-1.1.2
$sudo vim /etc/profile
Add:
export HADOOP_HOME=/usr/local/hadoop-1.1.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export CLASSPATH=$HADOOP_HOME/hadoop-core-1.1.2.jar:$HADOOP_HOME:$HADOOP_HOME/lib:$CLASSPATH
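
Reload the profile and check that the hadoop command resolves (a quick check, assuming the install path above):
$source /etc/profile
$hadoop version
#should report Hadoop 1.1.2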


-------------------------------------------------------------------
Compile the hadoop eclipse plugin
-------------------------------------------------------------------
(1)
$mkdir -p $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/hadoop-core-1.1.2.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/commons-cli-1.2.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/commons-httpclient-3.0.1.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/commons-configuration-1.6.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/commons-lang-2.4.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/jackson-mapper-asl-1.8.8.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
$cp $HADOOP_HOME/lib/jackson-core-asl-1.8.8.jar $HADOOP_HOME/build/contrib/eclipse-plugin/lib
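
A quick check that all seven jars are in place before building:
$ls $HADOOP_HOME/build/contrib/eclipse-plugin/lib
#should list hadoop-core-1.1.2.jar plus the six commons/jackson jars copied above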


(2)
Edit build.xml and MANIFEST.MF:
$cd $HADOOP_HOME/src/contrib/eclipse-plugin
$vim build.xml
Comment out the following two lines (wrap each <copy> element in <!-- ... -->):
    <copy file="${hadoop.root}/build/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/>
    <copy file="${hadoop.root}/build/ivy/lib/Hadoop/common/commons-cli-${commons-cli.version}.jar"  todir="${build.dir}/lib" verbose="true"/>

$vim META-INF/MANIFEST.MF
Change:
Bundle-ClassPath: classes/,
 lib/hadoop-core.jar
to (note that each continuation line in MANIFEST.MF must begin with a single space):
Bundle-ClassPath: classes/,
 lib/hadoop-core-1.1.2.jar,
 lib/commons-cli-1.2.jar,
 lib/commons-httpclient-3.0.1.jar,
 lib/commons-configuration-1.6.jar,
 lib/commons-lang-2.4.jar,
 lib/jackson-mapper-asl-1.8.8.jar,
 lib/jackson-core-asl-1.8.8.jar
(3)
#copy the eclipse-jdt jar files into /usr/lib/eclipse/plugins
$sudo cp -r /usr/lib/eclipse/dropins/jdt/plugins/*  /usr/lib/eclipse/plugins
$ant -Declipse.home=/usr/lib/eclipse -Dbuild.dir=/usr/local/hadoop-1.1.2/build/contrib/eclipse-plugin -Dversion=1.1.2 jar

#on success, the plugin jar appears under /usr/local/hadoop-1.1.2/build/contrib/eclipse-plugin.
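To double-check that the dependency jars were actually bundled into the plugin (assuming unzip is installed):
$cd /usr/local/hadoop-1.1.2/build/contrib/eclipse-plugin
$unzip -l hadoop-eclipse-plugin-1.1.2.jar | grep lib/
#should list the seven jars referenced in Bundle-ClassPath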

-------------------------------------------------------------------
Configure eclipse
-------------------------------------------------------------------
$sudo cp /usr/local/hadoop-1.1.2/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.1.2.jar /usr/lib/eclipse/plugins
$eclipse
#"DFS Locations" should now appear in the Project Explorer on the left.
#if it does not, reinstall eclipse and repeat the steps above


-------------------------------------------------------------------
Run a hadoop program in eclipse
-------------------------------------------------------------------

(1) Window -> Preferences -> Hadoop Map/Reduce: enter the hadoop installation directory

(2) Window -> Show View -> Other -> MapReduce Tools -> Map/Reduce Locations, then click OK

(3) Click the elephant icon on the right (New Hadoop Location) and configure as follows:

Location name: anything you like
Map/Reduce Master
--Host: localhost (the host running the JobTracker process)
--Port: 8021 (the RPC port on which the JobTracker accepts requests)
DFS Master
--Host: localhost (the host running the NameNode process)
--Port: 8020 (the RPC port on which the NameNode accepts requests)
User Name: hadoop (the Hadoop client user name)
SOCKS proxy: not used here (a SOCKS proxy can be configured if you cannot access the Map/Reduce location directly because your machine is not directly connected to it)


(4) Start the hadoop cluster

a. Pseudo-distributed deployment:
----hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
----core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020/</value>
  </property>
</configuration>
----mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
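
Note: hadoop.tmp.dir must be writable by the hadoop user (creating it up front avoids permission surprises), and the two ports here (8020 for the NameNode, 8021 for the JobTracker) must match the ones entered in the Eclipse location in step (3):
~$mkdir -p /home/hadoop/hadoop-tmp
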
b. Configure ssh:
~$ssh-keygen -t rsa -P 'hadoop' -f ~/.ssh/id_rsa_hadoop
#generates an rsa key pair with passphrase "hadoop": private key in id_rsa_hadoop, public key in id_rsa_hadoop.pub
~$cat ~/.ssh/id_rsa_hadoop.pub >> ~/.ssh/authorized_keys
#for a real cluster, copy authorized_keys to every node
~$ssh-agent bash
#start an ssh agent
~$ssh-add ~/.ssh/id_rsa_hadoop
#enter the rsa passphrase when prompted
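
Before starting the daemons, verify that passwordless login now works through the agent (start-all.sh uses ssh to launch them):
~$ssh localhost exit
#should return without prompting for a password; if it prompts, re-run ssh-add in the same shell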
c. Start the cluster:
~$hadoop namenode -format
~$start-all.sh
~$jps
if the output lists all five daemons like the following, the cluster is up:
24749 Jps
24608 TaskTracker
24272 SecondaryNameNode
23769 NameNode
24015 DataNode
24359 JobTracker
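
The web UIs give another quick health check; 50070 (NameNode) and 50030 (JobTracker) are the Hadoop 1.x defaults. Open them in a browser, or, assuming curl is installed:
~$curl -s http://localhost:50070/ >/dev/null && echo namenode ui up
~$curl -s http://localhost:50030/ >/dev/null && echo jobtracker ui up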

d. Run the wordcount example program
~$hadoop fs -mkdir /user/hadoop/wordcount/input
~$hadoop fs -copyFromLocal /home/hadoop/examples.desktop /user/hadoop/wordcount/input/
~$hadoop jar $HADOOP_HOME/hadoop-examples-1.1.2.jar wordcount /user/hadoop/wordcount/input /user/hadoop/wordcount/output

~$hadoop fs -cat /user/hadoop/wordcount/output/part*

#you should see the word counts
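
If the cat output is long, listing the directory first shows the part files; a _SUCCESS marker indicates the job completed:
~$hadoop fs -ls /user/hadoop/wordcount/output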

(5) Test the connection between eclipse and hadoop
In eclipse, select DFS Locations and press F5; the HDFS directory tree (including /user/hadoop/wordcount) should appear.
(6) Run wordcount in eclipse:
wordcount source: $HADOOP_HOME/src/examples/org/apache/hadoop/examples/WordCount.java

Create a new Map/Reduce project in eclipse (named MP-Wordcount here), then create a new java class under the project's src directory.

Copy in the wordcount source and replace "package org.apache.hadoop.examples;" with "package test;" (assuming the new class was created in package test).
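
Equivalently, from the shell (the workspace path is hypothetical; adjust to wherever eclipse put the project):
$sed -i 's/^package org\.apache\.hadoop\.examples;/package test;/' ~/workspace/MP-Wordcount/src/test/WordCount.java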

In the Project Explorer, right-click "WordCount.java" and choose "Run As" -> "Run on Hadoop".
This fails with:
Usage: wordcount <in> <out>
Right-click "WordCount.java" again, choose "Run As" -> "Run Configurations", and configure the java application: on the Arguments tab, set the program arguments to the input and output paths used earlier, i.e.
/user/hadoop/wordcount/input /user/hadoop/wordcount/output



点"Apply",再点"Run"
报错:
Output directory /user/hadoop/wordcount/output already exists

~$hadoop fs -rmr /user/hadoop/wordcount/output
#remove the old output, then run "Run As" -> "Run Configurations" -> "Run" again
~$hadoop fs -cat /user/hadoop/wordcount/output/part*
#the correct output should appear
