Hadoop HA (NameNode & ResourceManager) Configuration

 

Step 1: System initialization (a hedged command sketch follows the list)

  1. Configure hosts
  2. Disable firewalld
  3. Disable SELinux
  4. Adjust the system limits
  5. Create the regular user hadoop and set its password
  6. Set up passwordless SSH between hosts (do this as the regular user)
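A minimal shell sketch of these initialization steps, assuming CentOS 7 style tooling (systemd, firewalld) and the hostnames/IPs used elsewhere in this document; adjust both to your environment:

# /etc/hosts entries (IPs taken from later sections of this document; adjust as needed)
cat >> /etc/hosts <<'EOF'
172.10.10.137 hadoop01
172.10.10.138 hadoop02
172.10.10.139 hadoop03
EOF

# stop and disable firewalld
systemctl stop firewalld
systemctl disable firewalld

# disable SELinux (fully effective after a reboot)
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# raise open-file / process limits
cat >> /etc/security/limits.conf <<'EOF'
* soft nofile 65536
* hard nofile 65536
* soft nproc  65536
* hard nproc  65536
EOF

# create the hadoop user and set a password
useradd hadoop
passwd hadoop

# as the hadoop user: generate a key and copy it to every node (including this one)
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id hadoop@hadoop01
ssh-copy-id hadoop@hadoop02
ssh-copy-id hadoop@hadoop03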

Step 2: Install ZooKeeper

 

Because Hadoop HA depends on ZooKeeper, a ZooKeeper cluster must be set up first.

Installing ZooKeeper

1. Download zookeeper-3.4.10

Download on Windows:

https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz

Download on Linux:

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz

Upload the ZooKeeper archive to the server.

Extract the zookeeper-3.4.10.tar.gz file:

tar -zxf soft/zookeeper-3.4.10.tar.gz

 

Configure the ZooKeeper environment variables; this must be done on every server where ZooKeeper is installed.

vim ~/.bashrc
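The screenshot of the .bashrc entries is not reproduced here; a minimal sketch, assuming ZooKeeper is installed under /zywa/zookeeper-3.4.10 (the path used elsewhere in this document):

export ZOOKEEPER_HOME=/zywa/zookeeper-3.4.10
export PATH=$PATH:$ZOOKEEPER_HOME/bin

# reload the shell configuration
source ~/.bashrc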

 

2. Configure zoo.cfg

In the conf directory, copy the configuration template:

cp zoo_sample.cfg zoo.cfg

 

Edit the zoo.cfg configuration file:

vim zoo.cfg

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/zywa/zookeeper-3.4.10/data

dataLogDir=/zywa/zookeeper-3.4.10/logs

clientPort=2181

maxClientCnxns=600

server.1=hadoop01:2888:3888

server.2=hadoop02:2888:3888

server.3=hadoop03:2888:3888

Create the data and logs directories:

mkdir data logs

 

Create the unique ID for this node of the ZooKeeper cluster: in the data directory, create a myid file and write the node's unique ID (a number) into it.

echo "1" >> data/myid

3. Sync ZooKeeper to the other servers

scp -r zookeeper-3.4.10/ [email protected]:~

 

Change the value in the myid file on each of the other servers:

echo "2" > data/myid

 

echo "3" > data/myid

 

4. Start the ZooKeeper cluster

Run this on every server:

zkServer.sh start

 

Check the ZooKeeper cluster status:

zkServer.sh status

 

One leader

 

Two followers

 

 

Step 3: Set up the Hadoop cluster

Install Hadoop

Download hadoop-2.9.0

Download on Windows:

http://www.apache.org/dist/hadoop/core/hadoop-2.9.0/hadoop-2.9.0.tar.gz

Download on Linux:

wget http://www.apache.org/dist/hadoop/core/hadoop-2.9.0/hadoop-2.9.0.tar.gz

Upload the Hadoop archive to the server.

Extract the hadoop-2.9.0.tar.gz file:

tar -zxf soft/hadoop-2.9.0.tar.gz

 

Configure the Hadoop environment variables; this must be done on every server where Hadoop is installed.

vim ~/.bashrc
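The screenshot of the .bashrc entries is not reproduced here; a minimal sketch, assuming Hadoop is installed under /zywa/hadoop-2.9.0 (the path used elsewhere in this document):

export HADOOP_HOME=/zywa/hadoop-2.9.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# reload the shell configuration
source ~/.bashrc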

 

Create the four Hadoop directories: data, name, secondary, and tmp:

mkdir -p /zywa/hadoop-2.9.0/data/hadoop-repo

cd  /zywa/hadoop-2.9.0/data/hadoop-repo

mkdir data name secondary tmp

cd /zywa/hadoop-2.9.0/data/hadoop-repo/data

mkdir localdir1 localdir2

 

The Hadoop configuration files are under the etc/hadoop directory.

Edit the hadoop-env.sh configuration file:

vim hadoop-env.sh

 

 

Only these two settings are changed.
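The screenshot of the two modified lines is not reproduced here. Typically the two changes in hadoop-env.sh are JAVA_HOME and the configuration directory; a hedged example (both paths are assumptions, adjust them to your installation):

export JAVA_HOME=/usr/java/jdk1.8.0_181            # assumption: adjust to your actual JDK path
export HADOOP_CONF_DIR=/zywa/hadoop-2.9.0/etc/hadoop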

 

Edit the yarn-env.sh configuration file:

vim yarn-env.sh

 

Only this one setting is changed.
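The screenshot of the modified line is not reproduced here. Typically the single change in yarn-env.sh is JAVA_HOME; a hedged example (the JDK path is an assumption):

export JAVA_HOME=/usr/java/jdk1.8.0_181            # assumption: adjust to your actual JDK path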

Edit the hdfs-site.xml configuration file:

 

==========================================================>

<configuration>
    <!-- Logical name of the HA nameservice -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- The NameNodes that belong to the cluster -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>hadoop01,hadoop02</value>
    </property>
    <!-- RPC address of hadoop01 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.hadoop01</name>
        <value>hadoop01:9000</value>
    </property>
    <!-- RPC address of hadoop02 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.hadoop02</name>
        <value>hadoop02:9000</value>
    </property>
    <!-- HTTP address of hadoop01 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.hadoop01</name>
        <value>hadoop01:50070</value>
    </property>
    <!-- HTTP address of hadoop02 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.hadoop02</name>
        <value>hadoop02:50070</value>
    </property>
    <!-- Where the NameNode shared edits are stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/mycluster</value>
    </property>
    <!-- Fencing method: only one NameNode may serve clients at any given time -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- sshfence requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <!-- JournalNode storage directory -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/zywa/hadoop-2.9.0/data/jn</value>
    </property>
    <!-- Failover proxy provider that clients use to find the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/zywa/hadoop-2.9.0/data/hadoop-repo/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/zywa/hadoop-2.9.0/data/hadoop-repo/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>

<================================================================

Edit the core-site.xml configuration file:

vim core-site.xml

============================================================>

<configuration>
    <!-- The default filesystem points at the HA nameservice -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/zywa/hadoop-2.9.0/data/hadoop-repo/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
        <description>Allow the superuser hadoop to impersonate members of any group</description>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
        <description>Allow the superuser hadoop to impersonate users from any host</description>
    </property>
    <!-- Hue -->
    <property>
        <name>hadoop.proxyuser.hue.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hue.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.hosts</name>
        <value>*</value>
    </property>
    <!-- ZooKeeper quorum used for HA -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
    </property>
</configuration>

<==============================================================

Copy the mapred-site.xml configuration file from its template:

cp mapred-site.xml.template mapred-site.xml

 

After copying, edit the mapred-site.xml configuration file:

 

vim mapred-site.xml

===============================================================>

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.job.queuename</name>
        <value>default</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/mr-history/done</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/mr-history/tmp</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.recovery.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
        <value>/hadoop/mapreduce/jhs</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop01:19888</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>2048</value>
        <description>Memory limit for map tasks</description>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx2048m</value>
    </property>
    <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>200</value>
    </property>
    <property>
        <name>mapreduce.task.io.sort.factor</name>
        <value>20</value>
    </property>
    <property>
        <name>mapreduce.map.cpu.vcores</name>
        <value>2</value>
    </property>
    <property>
        <name>mapreduce.job.jvm.numtasks</name>
        <value>13</value>
    </property>
    <property>
        <name>mapreduce.input.fileinputformat.split.maxsize</name>
        <value>10485760</value>
    </property>
</configuration>

<================================================================

Edit the yarn-site.xml configuration file:

vim yarn-site.xml

=========================================================>

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop01:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop01:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop01:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop01:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop01:8088</value>
    </property>
    <!-- Number of virtual CPU cores available to containers; default is 8, recommended to match the number of physical cores -->
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>12</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>20480</value>
    </property>
    <!-- Minimum memory a task container can request -->
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
    </property>
    <!-- Maximum memory a task container can request -->
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>10240</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop01:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log.server.web-service.url</name>
        <value>http://hadoop01:8188/ws/v1/applicationhistory</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/user/container/logs</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <!-- Whether to check each task's virtual memory usage and kill tasks that exceed their allocation; default is true -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
</configuration>

<=========================================================

 

Edit the mapred-env.sh configuration file:

vim mapred-env.sh
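The screenshot of the change is not reproduced here. Typically the only change in mapred-env.sh is JAVA_HOME; a hedged example (the JDK path is an assumption):

export JAVA_HOME=/usr/java/jdk1.8.0_181            # assumption: adjust to your actual JDK path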

 

Edit the capacity-scheduler.xml configuration file (optional; the defaults can be kept, only one item is changed):

vim capacity-scheduler.xml

=========================================================>

  <property>

    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>

    <value>0.5</value>

    <description>

      Maximum percent of resources in the cluster which can be used to run application masters i.e. controls number of concurrent running applications.

    </description>

  </property>

<==========================================================

Configure the slaves file (this file lists which nodes act as DataNodes and NodeManagers):

vim slaves
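The file content is not shown in the original; assuming all three nodes run a DataNode and a NodeManager, the slaves file would contain:

hadoop01
hadoop02
hadoop03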

 

Sync Hadoop to the other nodes.

(Tip: before copying, you can delete the contents of share/doc in the Hadoop directory; it only contains documentation and is large enough to slow down the transfer: rm -rf share/doc)

scp -r hadoop-2.9.0/  [email protected]:/zywa/

scp -r hadoop-2.9.0/  [email protected]:/zywa/

Startup procedure

1) First, make sure ZooKeeper is already running (it must be started on all three nodes).

Start command: zkServer.sh start

2) Start the JournalNode on all three nodes (the JournalNodes synchronize the data of the two NameNodes):

hadoop-daemon.sh start journalnode

3) Initialize the NameNodes (format only one of them and bootstrap the other from it; if you format both, you have done it wrong!)

Format hadoop01: hdfs namenode -format

Start the freshly formatted NameNode: hadoop-daemon.sh start namenode

On the second machine, synchronize the NameNode metadata: hdfs namenode -bootstrapStandby

Start the NameNode on the second machine: hadoop-daemon.sh start namenode

4) Check the web UIs (at this point both NameNodes should be standby):

http://172.10.10.137:50070

http://172.10.10.138:50070

5) Then manually switch the NameNode state (you can also switch the second NameNode to active from the first machine; it is all one cluster):

$ hdfs haadmin -transitionToActive hadoop01     ## switch to active

$ hdfs haadmin -transitionToStandby hadoop01    ## switch to standby

Note: if the switch is refused, use hdfs haadmin -transitionToActive --forceactive hadoop01

You can also check the NameNode state directly from the command line: hdfs haadmin -getServiceState hadoop01

6) Now configure automatic failover (the complete configuration has already been given above).

First, shut down the entire cluster completely; everything must be stopped!

Automatic failover requires a znode named hadoop-ha in ZooKeeper. It is created automatically by the following command:

hdfs zkfc -formatZK

Then log in to the ZooKeeper client (zkCli.sh) and run "ls /"; you will see the new znode.

At this point the cluster should, in principle, be ready.

You can start HDFS directly with start-dfs.sh; by default it also starts the ZKFC, which is the automatic-failover process, and it runs on the two machines that host a NameNode.

Once everything is fully up, kill the active NameNode and the standby will become active; then start the NameNode you killed (it comes back in the standby state), kill the new active one, and the standby becomes active again. At that point HA automatic failover is working.
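A minimal command sketch of this failover test, run on the node hosting the active NameNode (locating the process via jps/kill is an assumption about how you do it; <namenode-pid> is a placeholder):

# find and kill the active NameNode process
jps | grep NameNode
kill -9 <namenode-pid>

# check which NameNode is now active
hdfs haadmin -getServiceState hadoop01
hdfs haadmin -getServiceState hadoop02

# restart the killed NameNode; it should come back as standby
hadoop-daemon.sh start namenode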

 

 

Start the JobHistory server:

mr-jobhistory-daemon.sh start historyserver

 

Started successfully.

 

Configure ResourceManager HA:

 

Edit the yarn-site.xml configuration file:

vim yarn-site.xml

=====================================================>

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <!-- Number of virtual CPU cores available to containers; default is 8, recommended to match the number of physical cores -->
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>6</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>20480</value>
    </property>
    <!-- Minimum memory a task container can request -->
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
    </property>
    <!-- Maximum memory a task container can request -->
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>10240</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/user/container/logs</value>
    </property>
    <!-- Whether to check each task's virtual memory usage and kill tasks that exceed their allocation; default is true -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Declare the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>rmcluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop01</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop02</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
    </property>
    <!-- Enable recovery so that running applications survive an RM failure; default is false -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- Store ResourceManager state in ZooKeeper (the default store is FileSystem based) -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

<=============================================================

Distribute the yarn-site.xml file to the other hosts:

$ scp yarn-site.xml [email protected]:`pwd`

$ scp yarn-site.xml [email protected]:`pwd`

 

# On hadoop01, run: start-yarn.sh

# On hadoop02, run: yarn-daemon.sh start resourcemanager

 

 

 

 

Check the web UI on port 8088.

When the ResourceManager on hadoop01 is active, visiting the hadoop02 ResourceManager web UI automatically redirects to hadoop01's page.

Test ResourceManager HA availability:

yarn rmadmin -getServiceState rm1    ## check the state of rm1

yarn rmadmin -getServiceState rm2    ## check the state of rm2

Then submit a job to YARN and, when it is about halfway done (for example when the map phase reaches 100%), kill -9 the active ResourceManager.

If the job still finishes normally and its result is correct, the ResourceManager failover succeeded and running jobs are not affected.
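A hedged example of such a test job, using the MapReduce examples jar bundled with Hadoop (the jar path below assumes the default Hadoop 2.9.0 layout under /zywa/hadoop-2.9.0):

# submit a long-enough pi job, then kill -9 the active ResourceManager while it runs
yarn jar /zywa/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar pi 10 1000

# after the failover, confirm which ResourceManager is active
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2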

After the environment is configured, verify active/standby NameNode failover:

1. At this point hadoop02 is the active node,

and hadoop01 is the standby node.

2. Next, kill the NameNode process on the active node.

Refresh hadoop02's web UI; it is no longer reachable.

Refresh hadoop01's web UI; its state has changed from standby to active.

At this point, active/standby failover works correctly.

Note: this setup has not been validated in production!

 

  1. Apache Hadoop HA NameNode failover error: java.lang.RuntimeException: Unable to fence NameNode at node01

(Error log: hadoop-hadoop-zkfc-hadoop01.log)

Symptom: after the fully distributed cluster is set up, the NameNodes do not fail over automatically.
After the NameNodes start, one is active and one is standby; when the active NameNode is killed with kill -9, the other one stays in the standby state.

Cause (from the log):
PATH=$PATH:/sbin:/usr/sbin fuser -v -k -n tcp 9000 via ssh: bash: fuser: command not found ... Unable to fence service by any configured method ... java.lang.RuntimeException: Unable to fence NameNode at master189/192.168.29.189:9000

The fuser program could not be found, so fencing could not be carried out. Install it with the following command; the psmisc package contains fuser:

yum install psmisc

After installation, the problem is resolved.

 

 

 

                                                                                                              Operations & Implementation Team

                                                                                                              Tuesday, September 10, 2019