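Before installing, prepare four machines: bluejoe0, bluejoe4, bluejoe5, and bluejoe9.
bluejoe0 serves as the master; bluejoe4, bluejoe5, and bluejoe9 serve as slaves.
bluejoe0 is the namenode.
bluejoe9 is the secondary namenode.
bluejoe4, bluejoe5, and bluejoe9 are the datanodes.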
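Installing Hadoop
First, download Hadoop on bluejoe0:
wget http://mirrors.cnnic.cn/apache/hadoop/common/stable2/hadoop-2.5.2.tar.gz
Save the archive to /usr/local/ and extract it with tar;
then use ln to create the symlink /usr/local/hadoop pointing at the extracted directory;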
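The download, extract, and symlink steps above can be sketched as a single shell function. The URL and paths are the ones given above; the names `install_hadoop`, `run`, and `DRY_RUN` are illustrative assumptions, not part of Hadoop, and the mirror may no longer host this release.

```shell
#!/bin/sh
# Sketch only: install_hadoop, run, and DRY_RUN are hypothetical helpers.
HADOOP_VERSION=2.5.2
HADOOP_URL="http://mirrors.cnnic.cn/apache/hadoop/common/stable2/hadoop-${HADOOP_VERSION}.tar.gz"
PREFIX=/usr/local

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_hadoop() {
    # Download the tarball into $PREFIX.
    run wget -P "$PREFIX" "$HADOOP_URL" || return 1
    # Unpack it; this creates $PREFIX/hadoop-$HADOOP_VERSION.
    run tar -xzf "$PREFIX/hadoop-${HADOOP_VERSION}.tar.gz" -C "$PREFIX" || return 1
    # Point the version-independent path at the unpacked directory.
    run ln -s "$PREFIX/hadoop-${HADOOP_VERSION}" "$PREFIX/hadoop"
}
```

Setting `DRY_RUN=1` before calling `install_hadoop` prints the three commands without touching the machine, which is a cheap way to review them first.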
Configuring HDFS
Configure core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://bluejoe0:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hdfs/tmp</value>
  </property>
</configuration>
Configure hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/data/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/data/hdfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address</name>
    <value>bluejoe0:9000</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>bluejoe9:50090</value>
  </property>
</configuration>
Note that dfs.namenode.rpc-address must match the host and port given in fs.default.name.
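One way to catch a mismatch between the two settings before starting the cluster is a small check like the following. It is a sketch, not a Hadoop tool: `get_prop` and `check_namenode_address` are hypothetical helpers, and the parsing assumes each `<name>` line is immediately followed by its `<value>` line, as in the files above.

```shell
# Sketch only: get_prop and check_namenode_address are hypothetical helpers.
# Assumes each <name> line is directly followed by its <value> line.
get_prop() {
    # $1 = config file, $2 = property name
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$1"
}

check_namenode_address() {
    # $1 = directory holding core-site.xml and hdfs-site.xml
    fs=$(get_prop "$1/core-site.xml" fs.default.name)
    rpc=$(get_prop "$1/hdfs-site.xml" dfs.namenode.rpc-address)
    # fs.default.name carries an hdfs:// scheme; strip it before comparing.
    if [ "${fs#hdfs://}" = "$rpc" ]; then
        echo "OK: $rpc"
    else
        echo "MISMATCH: fs.default.name=$fs dfs.namenode.rpc-address=$rpc" >&2
        return 1
    fi
}
```

For the layout used here, the call would be `check_namenode_address /usr/local/hadoop/etc/hadoop`.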
Set JAVA_HOME in /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
Use scp to copy the hadoop directory to the other machines;
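The copy step might look like the loop below. This is a sketch under assumptions: passwordless SSH as root to each host (the log file names later in this post suggest the daemons run as root), and the hypothetical helpers `distribute_hadoop`, `run`, and `DRY_RUN`, which are not part of Hadoop.

```shell
# Sketch only: distribute_hadoop, run, and DRY_RUN are hypothetical helpers;
# passwordless SSH as root to each host is assumed.
WORKERS="bluejoe4 bluejoe5 bluejoe9"
HADOOP_DIR=/usr/local/hadoop-2.5.2

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute_hadoop() {
    for host in $WORKERS; do
        # Copy the unpacked release, then recreate the symlink remotely.
        run scp -r -q "$HADOOP_DIR" "root@${host}:/usr/local/" || return 1
        run ssh "root@${host}" ln -sfn "$HADOOP_DIR" /usr/local/hadoop || return 1
    done
}
```

Copying the versioned directory and recreating the symlink on each host keeps /usr/local/hadoop a link everywhere, instead of a second full copy.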
Configure the slaves file:

bluejoe4
bluejoe5
bluejoe9
Format the namenode:
hdfs namenode -format
Start HDFS:
./sbin/start-dfs.sh
You should see output like this:

Starting namenodes on [bluejoe0]
bluejoe0: starting namenode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-namenode-bluejoe0.out
bluejoe9: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe9.out
bluejoe4: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe4.out
bluejoe5: starting datanode, logging to /usr/local/hadoop-2.5.2/logs/hadoop-root-datanode-bluejoe5.out
Starting secondary namenodes [bluejoe9]
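To check this output against the slaves file, something like the following could be used. It is only a sketch: `missing_datanodes` is a hypothetical helper, and it relies on the "host: starting datanode" line format shown above, so it says nothing about whether a daemon stayed up afterwards.

```shell
# Sketch only: missing_datanodes is a hypothetical helper.
# Reads start-dfs.sh output on stdin; $1 is the slaves file.
# Prints every host from the slaves file that has no
# "starting datanode" line in the output.
missing_datanodes() {
    started=$(sed -n 's/^\([^:]*\): starting datanode.*/\1/p')
    while read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue
        printf '%s\n' "$started" | grep -qxF "$host" || echo "$host"
    done < "$1"
}
```

A possible use is to save the startup output with `./sbin/start-dfs.sh | tee /tmp/start-dfs.log` and then run `missing_datanodes etc/hadoop/slaves < /tmp/start-dfs.log`; empty output means every listed host reported a datanode start.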
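Next, you can open the web UI (http://bluejoe0:50070/), whose Datanodes page should list bluejoe4, bluejoe5, and bluejoe9.

At this point, the HDFS installation is complete!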
Configuring MapReduce
Edit yarn-site.xml:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>bluejoe0:8032</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>bluejoe0:8030</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>bluejoe0:8031</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>bluejoe0:8033</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>bluejoe0:8088</value>
  </property>
</configuration>
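This post configures only yarn-site.xml. In Hadoop 2.x, mapreduce.framework.name defaults to local, so without a mapred-site.xml that sets it to yarn, jobs typically run in the local runner rather than on the cluster. If the goal is to submit jobs to YARN, a minimal file could be written as sketched below; `write_mapred_site` is a hypothetical helper, and this file is an addition that the original steps do not show.

```shell
# Sketch only: write_mapred_site is a hypothetical helper; the original
# steps do not include a mapred-site.xml.
write_mapred_site() {
    # $1 = Hadoop configuration directory, e.g. /usr/local/hadoop/etc/hadoop
    cat > "$1/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}
```

Calling `write_mapred_site /usr/local/hadoop/etc/hadoop` on bluejoe0 before the files are copied to the other nodes keeps all machines consistent.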
Use scp to copy the configuration files above to the other nodes;
Start the MapReduce framework (YARN):
/usr/local/hadoop-2.5.2/sbin/start-yarn.sh
Open a browser and go to http://bluejoe0:8088 to see the ResourceManager web UI.

Run a test program:
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar pi 100 1000
Job Finished in 12.885 seconds
Estimated value of Pi is 3.14120000000000000000
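As a rough check on these numbers: the two arguments to pi are the number of map tasks (100) and the samples per map (1000), so the job evaluates 100 × 1000 = 100,000 points, and the estimate is 4 × (points inside the circle) / (total points). In the sketch below, `pi_estimate` is a hypothetical helper, and the inside count of 78530 is inferred from the printed estimate, not taken from the job's output.

```shell
# Sketch only: pi_estimate is a hypothetical helper that reproduces the
# final arithmetic of the estimate, not the job itself.
pi_estimate() {
    # $1 = number of maps, $2 = samples per map, $3 = points inside the circle
    awk -v m="$1" -v n="$2" -v inside="$3" \
        'BEGIN { printf "%.4f\n", 4 * inside / (m * n) }'
}

# 78530 is an inferred value: 3.1412 * 100000 / 4.
pi_estimate 100 1000 78530
```

With only 100,000 points the result is accurate to about three decimal places, so raising either argument trades run time for precision.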