Today we'll build a Hadoop 2.2.0 cluster hands-on. The environment is RedHat 6.2, a mainstream server operating system. All installation media used in this walkthrough came from the internet; please have them ready before you begin.
| Role     | Hostname | IP address    |
|----------|----------|---------------|
| Namenode | master   | 192.168.200.2 |
| Datanode | slave1   | 192.168.200.3 |
| Datanode | slave2   | 192.168.200.4 |
| Datanode | slave3   | 192.168.200.5 |
| Datanode | slave4   | 192.168.200.6 |

| Software | Version             |
|----------|---------------------|
| OS       | RedHat 6.2 (64-bit) |
| Hadoop   | 2.2.0               |
| JDK      | 1.7 (Linux x64)     |
Once the server roles are planned, install the operating system on each server and configure the network. (The OS installation steps are omitted here.)
(1) After the OS installation completes, stop the firewall and disable SELinux on every node.
service iptables stop
chkconfig iptables off
cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
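Editing /etc/selinux/config only takes effect after a reboot. As a small addition to the original steps, SELinux can also be switched off for the running session:
setenforce 0    # immediate, lasts until the next reboot
sestatus        # should now report permissive mode (or disabled after a reboot)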
(2) Copy the Hadoop and JDK packages to the servers.
[root@master home]# ls
jdk-7u67-linux-x64.rpm hadoop-2.2.0.tar.gz
(3) Set each server's hostname and network configuration (slave2 shown as the example):
cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave2
cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=f2:85:cd:9a:30:0d
NM_CONTROLLED=yes
ONBOOT=yes
IPADDR=192.168.200.4
BOOTPROTO=none
NETMASK=255.255.255.0
TYPE=Ethernet
GATEWAY=192.168.200.254
IPV6INIT=no
USERCTL=no
(4) Configure the /etc/hosts file on every server:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.2 master
192.168.200.3 slave1
192.168.200.4 slave2
192.168.200.5 slave3
192.168.200.6 slave4
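A quick sanity check (not part of the original write-up) is to ping every hostname from each node to confirm the entries resolve:
for h in master slave1 slave2 slave3 slave4; do ping -c 1 $h; done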
We normally don't run Hadoop as root, so create a dedicated user for day-to-day operation and administration. The master and slave nodes must all have the same user and group, i.e. create the hdtest user and group on every server in the cluster:
useradd hdtest
passwd hdtest
Copy hadoop-2.2.0.tar.gz into the hdtest user's home directory and change its ownership.
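A minimal sketch of that step, assuming the tarball was copied to /home as shown above:
cp /home/hadoop-2.2.0.tar.gz /home/hdtest/
chown hdtest:hdtest /home/hdtest/hadoop-2.2.0.tar.gz
su - hdtest -c "tar -xzf hadoop-2.2.0.tar.gz"    # unpacks to /home/hdtest/hadoop-2.2.0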
This walkthrough uses JDK 1.7: download the Linux build from the Oracle website, copy it to every server, and install it as root:
rpm -ivh jdk-7u67-linux-x64.rpm
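To confirm the installation, check the version (the RPM installs under /usr/java/jdk1.7.0_67):
/usr/java/jdk1.7.0_67/bin/java -version    # should report java version "1.7.0_67"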
Hadoop itself will be installed as the hdtest user, so hdtest's environment needs to be set up. Configure the environment variables on the master and on all slave nodes:
[root@master ~]# find / -name java
………………
/usr/java/jdk1.7.0_67/bin/java
……………………
[root@master home]# su - hdtest
[hdtest@master ~]$ cat .bash_profile
# .bash_profile
…………
PATH=$PATH:$HOME/bin
export PATH
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:./
export HADOOP_HOME=/home/hdtest/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin/
export JAVA_LIBRARY_PATH=/home/hdtest/hadoop-2.2.0/lib/native
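Reload the profile and verify the variables resolve; hadoop version will work once the tarball has been unpacked into /home/hdtest:
source ~/.bash_profile
echo $JAVA_HOME      # should print /usr/java/jdk1.7.0_67
hadoop version       # should report Hadoop 2.2.0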
Generate SSH keys as the hdtest user on all nodes:
[hdtest@master .ssh]$ ssh-keygen -t rsa
[hdtest@slave1 .ssh]$ ssh-keygen -t rsa
[hdtest@slave2 .ssh]$ ssh-keygen -t rsa
[hdtest@slave3 .ssh]$ ssh-keygen -t rsa
[hdtest@slave4 .ssh]$ ssh-keygen -t rsa
[hdtest@slave2 .ssh]$ ll
total 16
-rw------- 1 hdtest hdtest 1675 Sep 4 14:53 id_rsa
-rw-r--r-- 1 hdtest hdtest 395 Sep 4 14:53 id_rsa.pub
-rw-r--r-- 1 hdtest hdtest 783 Sep 4 14:58 known_hosts
Copy each node's public key to one machine (the master here) and merge them:
[hdtest@slave1 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave1.pub
[hdtest@slave2 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave2.pub
[hdtest@slave3 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave3.pub
[hdtest@slave4 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave4.pub
[hdtest@master .ssh]$ cat *.pub >>authorized_keys
Distribute the merged authorized_keys file from the master to every node:
scp authorized_keys slave1:/home/hdtest/.ssh/
scp authorized_keys slave2:/home/hdtest/.ssh/
scp authorized_keys slave3:/home/hdtest/.ssh/
scp authorized_keys slave4:/home/hdtest/.ssh/
Set the file permissions on all nodes:
[hdtest@master ~]$ chmod 700 .ssh/
[hdtest@master .ssh]$ chmod 600 authorized_keys
With the steps above complete, run a test:
[hdtest@master .ssh]$ ssh slave1
Last login: Thu Sep 4 15:58:39 2014 from master
[hdtest@slave1 ~]$ ssh slave3
Last login: Thu Sep 4 15:58:42 2014 from master
[hdtest@slave3 ~]$
Being able to ssh into each server without a password prompt shows the configuration is complete.
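As an optional convenience beyond the original steps, a loop run from each node should print every hostname without prompting:
for h in master slave1 slave2 slave3 slave4; do ssh $h hostname; done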
Before installing Hadoop, create a few directories:
[hdtest@master ~]$ pwd
/home/hdtest
mkdir dfs/name -p
mkdir dfs/data -p
mkdir mapred/local -p
mkdir mapred/system
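The same directories are needed on every node. With passwordless SSH in place, one way (a sketch, not from the original) is to create them remotely from the master:
for h in slave1 slave2 slave3 slave4; do ssh $h "mkdir -p dfs/name dfs/data mapred/local mapred/system"; done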
Every server needs the same configuration: configure one machine and simply copy the files to the others. On every node, core-site.xml and mapred-site.xml point at the master's hostname, because the master is the cluster's entry point.
[hdtest@master hadoop]$ pwd
/home/hdtest/hadoop-2.2.0/etc/hadoop
[hdtest@master hadoop]$ cat core-site.xml
<?xml version="1.0"encoding="UTF-8"?>
<?xml-stylesheettype="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the"License");
youmay not use this file except in compliance with the License.
Youmay obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS"BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
Seethe License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific propertyoverrides in this file. -->
<configuration>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
</configuration>
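Once this file is in place (and later distributed), you can confirm Hadoop resolves the value with hdfs getconf, which exists in 2.2.0; note that fs.default.name is the deprecated 1.x name of fs.defaultFS and still works, though a deprecation warning may be logged:
hdfs getconf -confKey fs.default.name    # should print hdfs://master:9000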
[hdtest@master hadoop]$ cat hdfs-site.xml
<?xml version="1.0"encoding="UTF-8"?>
<?xml-stylesheettype="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the"License");
youmay not use this file except in compliance with the License.
Youmay obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS"BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
Seethe License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific propertyoverrides in this file. -->
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hdtest/dfs/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hdtest/dfs/data</value>
<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.</description>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Number of replicas.</description>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
[hdtest@master hadoop]$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://master:9001</value>
<final>true</final>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>file:/home/hdtest/mapred/system</value>
<final>true</final>
</property>
<property>
<name>mapred.local.dir</name>
<value>file:/home/hdtest/mapred/local</value>
</property>
</configuration>
[hdtest@master hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8080</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8081</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8082</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- Site specific YARN configuration properties -->
</configuration>
Edit the hadoop-env.sh, yarn-env.sh, and mapred-env.sh files, setting the JDK path in each:
export JAVA_HOME=/usr/java/jdk1.7.0_67
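One way to script this across all three files (a sketch; appending a fresh export line works whether or not a commented-out JAVA_HOME line is already present):
cd /home/hdtest/hadoop-2.2.0/etc/hadoop
for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do echo 'export JAVA_HOME=/usr/java/jdk1.7.0_67' >> $f; done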
The masters and slaves files only need to be configured on the namenode machine. A namenode can double as a datanode, but the usual practice, followed here, is to give the namenode a dedicated machine that runs no datanode.
[hdtest@master hadoop]$ pwd
/home/hdtest/hadoop-2.2.0/etc/hadoop
[hdtest@master hadoop]$ cat masters
192.168.200.2
[hdtest@master hadoop]$ cat slaves
192.168.200.3
192.168.200.4
192.168.200.5
192.168.200.6
With the configuration done, distribute the hadoop directory to each slave node:
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave1:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave2:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave3:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave4:/home/hdtest/
Format the namenode by running the following on the master:
[hdtest@master bin]$ pwd
/home/hdtest/hadoop-2.2.0/bin
[hdtest@master bin]$ ./hadoop namenode -format
The format output should end with a message along the lines of "has been successfully formatted". When re-formatting, the system prompts:
Re-format filesystem in /home/hadoop/tmp/dfs/name ? (Y or N)
You must answer with an uppercase Y; a lowercase y is not reported as invalid input, but the format fails.
Start the Hadoop services with the following command; it only needs to be run on the namenode:
[hdtest@master sbin]$ pwd
/home/hdtest/hadoop-2.2.0/sbin
[hdtest@master sbin]$ ./start-all.sh    (run stop-all.sh to stop the services)
Output like the following indicates a successful start; you can also verify with the jps command.
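Roughly what jps should show on each node type (PIDs omitted; the exact daemon list can vary with configuration):
[hdtest@master ~]$ jps     # expect NameNode, SecondaryNameNode, ResourceManager
[hdtest@slave1 ~]$ jps     # expect DataNode, NodeManager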
[hdtest@master ~]$ netstat -ntpl
[hdtest@master sbin]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/hdtest/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/09/05 10:48:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 167811284992 (156.29 GB)
Present Capacity: 137947226112 (128.47 GB)
DFS Remaining: 137947127808 (128.47 GB)
DFS Used: 98304 (96 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)
Live datanodes:
Name: 192.168.200.5:50010 (slave3)
Hostname: slave3
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465213952 (6.95 GB)
DFS Remaining: 34487582720 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:23 CST 2014
Name: 192.168.200.3:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465467904 (6.95 GB)
DFS Remaining: 34487328768 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:24 CST 2014
Name: 192.168.200.6:50010 (slave4)
Hostname: slave4
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7467925504 (6.96 GB)
DFS Remaining: 34484871168 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.20%
Last contact: Fri Sep 05 10:48:24 CST 2014
Name: 192.168.200.4:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465451520 (6.95 GB)
DFS Remaining: 34487345152 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:22 CST 2014
Browse to http://192.168.200.2:50070/ for the HDFS NameNode web UI.
Browse to http://192.168.200.2:8088/cluster for the YARN ResourceManager UI.
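As a final smoke test (an extra step beyond the original), write a file into HDFS and list it:
[hdtest@master ~]$ hdfs dfs -mkdir /test
[hdtest@master ~]$ hdfs dfs -put /etc/hosts /test/
[hdtest@master ~]$ hdfs dfs -ls /test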