1.1. Server planning
Hadoop2 HA Cluster
Host | IP | NameNode | JournalNode | FailoverController | DataNode |
nn1 | 192.168.50.221 | Y | Y | Y | N |
nn2 | 192.168.50.222 | Y | Y | Y | N |
dn1 | 192.168.50.225 | N | Y | N | Y |
dn2 | 192.168.50.226 | N | N | N | Y |
dn3 | 192.168.50.227 | N | N | N | Y |
For ZooKeeper deployment, see the separate ZooKeeper article. For Hadoop HA, the ZooKeeper nodes must be deployed in advance.
1.2. Software versions
Linux: CentOS 2.6.32-431.el6.x86_64
Hadoop: 2.6.0 (upgrading to 2.7.1+ is recommended)
ZooKeeper: 3.4.6
JDK/JRE: 1.7.0_75 (64-bit)
2. Set the hostname
Step 1: temporary change (takes effect immediately, lost after a reboot)
#hostname nn1
Step 2: permanent change, so the hostname is not reset on the next reboot
Edit the hostname in /etc/sysconfig/network:
NETWORKING=yes
HOSTNAME=nn1
Step 3: add a DNS mapping so the hostname resolves directly to the machine's IP address
Edit the /etc/hosts file
and append a line at the end, e.g.
192.168.50.221 nn1
Step 4: reboot the machine
After the reboot, ping nn1
If the ping succeeds, the configuration is complete
Configure the remaining machines the same way
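The permanent hostname change above can be scripted. A minimal sketch, run here against a scratch copy of /etc/sysconfig/network so it needs no root access (on a real node, point FILE at the actual path and also run `hostname "$NEW_HOSTNAME"` for the temporary change; NEW_HOSTNAME=nn1 is an example value):

```shell
NEW_HOSTNAME=nn1                     # the host being configured (example)
FILE=$(mktemp)                       # scratch stand-in for /etc/sysconfig/network

# Simulate the stock file contents
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\n' > "$FILE"

# Permanent change: rewrite the HOSTNAME= line in place
sed -i "s/^HOSTNAME=.*/HOSTNAME=${NEW_HOSTNAME}/" "$FILE"

grep '^HOSTNAME=' "$FILE"            # prints HOSTNAME=nn1
```

Running the same sed against /etc/sysconfig/network on each node gives every machine its permanent name in one step.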
Disable the firewall: service iptables stop
Also disable the firewall's autostart service: chkconfig iptables off
Verify that autostart is disabled: chkconfig --list | grep iptables — if every runlevel shows off, it is fully disabled
Check the status: service iptables status
# service iptables status
Firewall is stopped.
3. Set up SSH
Edit /etc/ssh/sshd_config (as root)
#vim /etc/ssh/sshd_config
Uncomment the following lines:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
After editing, restart sshd: service sshd restart
Switch to the regular user for the remaining steps.
Generate the SSH keys; with them in place, Hadoop can log in without a password at startup.
a. On every machine run ssh-keygen -t rsa -P "", leaving the passphrase empty. Hadoop authenticates over SSH every time it starts, and a passphrase would have to be typed each time, which is a nuisance, so leave it empty and just press Enter. This produces two files in the ~/.ssh directory.
b. Then run ssh-copy-id userName@machineName between the machines; it quickly sends your public key to the remote host and appends it to authorized_keys automatically.
[root@nn1 ~]# ssh nn1
Once logged in, run exit to leave the SSH session.
Note: if you are still prompted for a password, it may be a permissions problem with authorized_keys; tighten the permissions with chmod 600 authorized_keys.
For any other machine, send it your public key the same way and you will be able to log in without a password.
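Steps (a) and the chmod fix above can be tried locally. A sketch using a throwaway directory in place of ~/.ssh (a real setup would generate into ~/.ssh and use ssh-copy-id toward the other hosts):

```shell
KEYDIR=$(mktemp -d)                        # scratch stand-in for ~/.ssh

# Step (a): RSA key pair with an empty passphrase
ssh-keygen -t rsa -P "" -f "$KEYDIR/id_rsa" -q

# What ssh-copy-id does on the remote side: append the public key
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"

# sshd rejects authorized_keys files with loose permissions; restrict to the owner
chmod 600 "$KEYDIR/authorized_keys"

stat -c %a "$KEYDIR/authorized_keys"       # prints 600
```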
See: 《Zookeeper部署文档.doc》
3.1. Initialize system configuration
3.1.1. Create an application account and group (optional; a dedicated user is recommended)
For security, create a separate account and group for each externally facing application; see online references for the details.
# Create the group
[root@nn2 ~]# groupadd bdata
# Create the user and assign it to the group
[root@nn2 ~]# useradd -g bdata bdata
# Set the password
[root@nn2 ~]# passwd bdata
Changing password for user bdata.
New password:
BAD PASSWORD: it does not contain enough DIFFERENT characters
BAD PASSWORD: is a palindrome
Retype new password:
passwd: all authentication tokens updated successfully.
3.1.2. Add the hosts mappings
#vim /etc/hosts
192.168.50.221 nn1
192.168.50.222 nn2
192.168.50.223 HMaster1
192.168.50.224 HMaster2
192.168.50.225 dn1
192.168.50.226 dn2
192.168.50.227 dn3
192.168.50.225 jn1
192.168.50.226 jn2
192.168.50.227 jn3
192.168.50.228 Zookeeper1
192.168.50.229 Zookeeper2
192.168.50.230 Zookeeper3
3.1.3. Install the JDK/JRE
Hadoop 2 is written in Java, so a JRE or JDK is required to run it. For easier testing, install the full JDK (in production a JRE is sufficient). JDK installation steps are omitted.
3.2. Install Hadoop 2
3.2.1. Single-node Hadoop 2 installation
Download:
http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
The official release is compiled on a 32-bit system and its native .so libraries are 32-bit, so deploying it on a 64-bit OS prints warnings; it is recommended to download the source package and build a 64-bit Hadoop 2 yourself.
Configure the environment variables
export HADOOP_HOME=/home/bdata/software/hadoop-2.6.0
export PATH=.:$PATH:$HADOOP_HOME/bin
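These exports normally go into the login user's ~/.bashrc (or /etc/profile). A quick way to verify them, sketched against a scratch file rather than the real profile:

```shell
PROFILE=$(mktemp)                          # scratch stand-in for ~/.bashrc

cat >> "$PROFILE" <<'EOF'
export HADOOP_HOME=/home/bdata/software/hadoop-2.6.0
export PATH=.:$PATH:$HADOOP_HOME/bin
EOF

# Source it and confirm both variables took effect
. "$PROFILE"
echo "$HADOOP_HOME"                        # prints /home/bdata/software/hadoop-2.6.0
echo ":$PATH:" | grep -q ":$HADOOP_HOME/bin:" && echo "PATH ok"
```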
Configure $HADOOP_HOME/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<!-- HDFS cluster (nameservice) name and storage path -->
<name>fs.defaultFS</name>
<value>hdfs://dfscluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/bdata/datadir/hadoopinfo/journal</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<!-- ZooKeeper quorum address list, comma-separated -->
<value>Zookeeper1:2181,Zookeeper2:2181,Zookeeper3:2181</value>
</property>
</configuration>
Configure $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<!-- HDFS replication factor -->
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<!-- HDFS cluster name; must match the name in core-site.xml -->
<name>dfs.nameservices</name>
<value>dfscluster</value>
</property>
<property>
<!--
dfs.ha.namenodes.[nameservice id]
List of NameNode IDs in the HDFS HA cluster
-->
<name>dfs.ha.namenodes.dfscluster</name>
<value>nn1,nn2</value>
</property>
<property>
<!--
NameNode RPC port
dfs.namenode.rpc-address.[nameservice id].[namenode id]
-->
<name>dfs.namenode.rpc-address.dfscluster.nn1</name>
<!-- hostname/IP -->
<value>nn1:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.dfscluster.nn2</name>
<value>nn2:9000</value>
</property>
<property>
<!--
NameNode HTTP port (web UI)
dfs.namenode.http-address.[nameservice id].[namenode id]
-->
<name>dfs.namenode.http-address.dfscluster.nn1</name>
<value>nn1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.dfscluster.nn2</name>
<value>nn2:50070</value>
</property>
<property>
<!-- Maximum idle time in milliseconds before the client drops its connection to the server; default 1000 ms. Raise it on an unreliable network. -->
<name>ipc.client.connection.maxidletime</name>
<value>1000</value>
</property>
<property>
<!-- Client connection retry count; the default is 10 retries at 1000 ms each. Raise it on an unreliable network. -->
<name>ipc.client.connect.max.retries</name>
<value>30</value>
</property>
<property>
<!--
QJM (Quorum Journal Manager) based Hadoop HA
JournalNode cluster list, entries separated by semicolons:
qjournal://[hostname/IP:port];[hostname/IP:port]/[nameservice id]
-->
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://jn1:8485;jn2:8485;jn3:8485/dfscluster</value>
</property>
<property>
<!-- Proxy provider through which clients fail over automatically when the active NameNode changes -->
<name>dfs.client.failover.proxy.provider.dfscluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<!--
During a failover the old active NameNode must be fenced (blocked from writing the edit log).
Two implementations are available:
1. Hadoop's built-in Java mechanism, sshfence (recommended)
2. A custom shell script
-->
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<!-- Path to the local private key (the one generated when setting up passwordless SSH) -->
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/bdata/.ssh/id_rsa</value>
</property>
<property>
<!-- sshfence connection timeout, in milliseconds -->
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>10000</value>
</property>
<property>
<!-- Enable automatic failover (true: enabled, false: disabled) -->
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<!-- Directory where the NameNode stores its namespace metadata -->
<name>dfs.namenode.name.dir</name>
<value>/home/bdata/datadir/hadoop/name</value>
</property>
<property>
<!-- Directory where the DataNode stores its data blocks -->
<name>dfs.datanode.data.dir</name>
<value>/home/bdata/datadir/hadoop/data</value>
</property>
</configuration>
Single-node ResourceManager configuration; the HA variant is given further below. If the resource manager is not used, this can be skipped.
#vim yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>nn1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>nn1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>nn1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>nn1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>nn1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>nn1:8001</value>
</property>
YARN HA configuration; if the resource manager is not used, this can be skipped.
#vim yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
<description>Enable YARN ResourceManager HA</description>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-ha</value>
<description>Name of the HA cluster; this is also the name shown in ZK</description>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
<description>Logical IDs of the RMs, referenced by the properties below</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>nn1</value>
<description>Host for rm1</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>nn2</value>
<description>Host for rm2</description>
</property>
<!-- Enable RM restart recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
<description>Enable RM recovery on restart; the default is false</description>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
<description>Class used for state storage; here ZooKeeper-backed storage is used</description>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>Zookeeper1:2181,Zookeeper2:2181,Zookeeper3:2181</value>
<description>RM state is stored in ZK and ZK elects the active RM, so the ZK quorum address is required here; it is the same quorum as ha.zookeeper.quorum in core-site.xml</description>
</property>
<!-- Communication addresses and ports; configure one group of properties per RM -->
<!-- RM1-->
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>nn1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>nn1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>nn1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>nn1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>nn1:8001</value>
<description>Web UI address; shows job status and other information</description>
</property>
<!-- RM2 -->
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>nn2:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>nn2:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>nn2:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>nn2:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>nn2:8001</value>
</property>
</configuration>
Add the data (slave) nodes:
#vim slaves
dn1
dn2
dn3
Edit hadoop-env.sh and set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.7.0_45
3.2.2. Cluster installation of Hadoop
Copy the configured hadoop directory to the other nodes (bdata is the application user):
scp -r hadoop bdata@nn2:/home/bdata/software/
scp -r hadoop bdata@dn1:/home/bdata/software/
scp -r hadoop bdata@dn2:/home/bdata/software/
scp -r hadoop bdata@dn3:/home/bdata/software/
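The copies above all follow one pattern, so they can be looped. A dry-run sketch that only prints the scp commands (drop the echo to actually copy; the host names and destination path follow the cluster plan in this guide):

```shell
HOSTS="nn2 dn1 dn2 dn3"          # target hosts from the cluster plan
SRC=hadoop                       # the configured hadoop directory
DEST=/home/bdata/software/

for h in $HOSTS; do
    # echo makes this a dry run; remove it to perform the copy
    echo "scp -r $SRC bdata@${h}:${DEST}"
done
```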
3.2.3. Start the Hadoop cluster
1. Start the ZooKeeper cluster (every ZK node must be started beforehand)
#zkServer.sh start
2. Format ZooKeeper (registers the cluster in ZooKeeper; only the first startup needs this step)
# hdfs zkfc -formatZK
3. Start all JournalNodes (on the first startup, start the daemon on each JournalNode host individually)
#hadoop-daemon.sh start journalnode    # run on every JournalNode host
4. Format the primary NameNode
#hdfs namenode -format    # run on the primary NameNode (make sure ZooKeeper is already running)
5. Start the first NameNode
#hadoop-daemon.sh start namenode    # run on the node that was just formatted (make sure the JournalNodes are already running)
6. Sync the primary NameNode's metadata to the standby
#hdfs namenode -bootstrapStandby    # run on the standby NameNode to pull the primary's metadata
7. Stop the HDFS cluster
#stop-dfs.sh
8. Start the HDFS cluster
#start-dfs.sh
After startup, check the corresponding logs to verify everything came up normally, or use jps to confirm the expected processes are running.
To start a single daemon:
hadoop-daemon.sh start journalnode    # start a JournalNode
hadoop-daemon.sh start namenode    # start a NameNode
hadoop-daemon.sh start datanode    # start a DataNode
hadoop-daemon.sh start zkfc    # start the DFSZKFailoverController
1. Stop the HDFS cluster
#stop-dfs.sh
2. Start the HDFS cluster
#start-dfs.sh
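The eight first-boot steps above are easy to get out of order, so they can be captured in one script. A dry-run sketch that only prints each command in sequence (swap the body of run for "$@" to really execute, and run each command on the host noted in the comment):

```shell
# First-time HA bootstrap order; 'run' only echoes, so this is a dry run
run() { echo "$@"; }                       # replace 'echo "$@"' with "$@" to execute

run zkServer.sh start                      # 1. on every ZooKeeper node
run hdfs zkfc -formatZK                    # 2. once, on a NameNode
run hadoop-daemon.sh start journalnode     # 3. on every JournalNode host
run hdfs namenode -format                  # 4. on the primary NameNode
run hadoop-daemon.sh start namenode        # 5. on the freshly formatted NameNode
run hdfs namenode -bootstrapStandby        # 6. on the standby NameNode
run start-dfs.sh                           # 7-8. restart the whole HDFS cluster
```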
1. Start YARN HA
The main script starts the daemons on all listed nodes, but the second (standby) RM must be started by hand:
#./start-yarn.sh
Start the standby RM manually (this node always has to be started by hand):
#./yarn-daemon.sh start resourcemanager
To start a single daemon:
RM node:
#./yarn-daemon.sh start resourcemanager
NM node:
#./yarn-daemon.sh start nodemanager
2. Stop YARN HA
The main script stops the daemons on all listed nodes, but the second (standby) RM must be stopped by hand:
#./stop-yarn.sh
Stop the standby RM manually:
#./yarn-daemon.sh stop resourcemanager
To stop a single daemon:
RM node:
#./yarn-daemon.sh stop resourcemanager
NM node:
#./yarn-daemon.sh stop nodemanager
3. List jobs
yarn application -list
4. Kill a job
yarn application -kill <Application ID>
5. Check a job's status
yarn application -status <Application ID>
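When many jobs are running, the IDs from yarn application -list can be extracted and fed back into -kill or -status. A sketch over sample output (the column layout here is an assumption for illustration; real IDs always have the application_<timestamp>_<sequence> form):

```shell
# Sample 'yarn application -list' output (layout assumed for illustration)
LIST_OUTPUT='Total number of applications:2
                Application-Id    Application-Name    State
application_1425016832973_0002    wordcount           RUNNING
application_1425016832973_0005    sort                ACCEPTED'

# Keep only the first column of rows that look like application IDs
IDS=$(printf '%s\n' "$LIST_OUTPUT" | awk '$1 ~ /^application_/ {print $1}')
printf '%s\n' "$IDS"
# e.g.: for id in $IDS; do yarn application -status "$id"; done
```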
3.3. Hadoop cluster troubleshooting
3.3.1. Start individual daemons
Various failures can take down individual nodes; restart the affected daemons with:
hadoop-daemon.sh start journalnode    # start a JournalNode
hadoop-daemon.sh start namenode    # start a NameNode
hadoop-daemon.sh start datanode    # start a DataNode
hadoop-daemon.sh start zkfc    # start the DFSZKFailoverController
3.3.2. Reformat / wipe HDFS data and metadata (use with extreme caution)
hdfs zkfc -formatZK
hadoop-daemon.sh start journalnode    # run on every JournalNode host
hdfs namenode -format    # run on the primary NameNode (make sure ZooKeeper is already running)
hadoop-daemon.sh start namenode    # start the primary NameNode (make sure the JournalNodes are already running)
hdfs namenode -bootstrapStandby    # run on the standby NameNode to sync the primary's metadata
hadoop-daemon.sh start namenode    # start the standby NameNode
hadoop-daemon.sh start datanode    # start the DataNodes