Why the Hadoop balancer failed to even out DataNode storage: a problem analysis

Date: 2019-11-30


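A while ago, disk usage across the DataNodes in our Hadoop cluster had become badly unbalanced, so the cluster needed rebalancing (mainly because two machines added to the cluster later have much larger disks). I ran the following command: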

bin/start-balancer.sh -threshold 10

The log output was as follows:

Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
Mar 10, 2014 11:03:40 AM          0                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:41 AM          1                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:42 AM          2               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:43 AM          3               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:44 AM          4            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:45 AM          5            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:46 AM          6            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:47 AM          7            891.85 KB           614.49 GB              20 GB
Mar 10, 2014 11:03:48 AM          8            891.85 KB           614.49 GB              20 GB
No block has been moved for 5 iterations. Exiting...
Balancing took 10.023 seconds
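For context on the -threshold 10 argument: the balancer treats a DataNode as balanced when its utilization (used space divided by capacity) differs from the utilization of the whole cluster by no more than the threshold, in percentage points. The sketch below illustrates only that rule. The node names and sizes are made up for illustration, and the three-way classification is a simplification of my own, not the balancer's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BalancerThresholdSketch {

    /** Utilization as a percentage of capacity. */
    static double utilization(long used, long capacity) {
        return 100.0 * used / capacity;
    }

    /** Compare one node's utilization with the cluster-wide figure. */
    static String classify(double nodeUtil, double clusterUtil, double threshold) {
        if (nodeUtil > clusterUtil + threshold) {
            return "over-utilized";
        }
        if (nodeUtil < clusterUtil - threshold) {
            return "under-utilized";
        }
        return "balanced";
    }

    public static void main(String[] args) {
        // Hypothetical cluster: {used GB, capacity GB} per node.
        Map<String, long[]> nodes = new LinkedHashMap<>();
        nodes.put("old-1", new long[] {900, 1000});
        nodes.put("old-2", new long[] {850, 1000});
        nodes.put("new-1", new long[] {400, 4000});
        nodes.put("new-2", new long[] {350, 4000});

        long used = 0;
        long capacity = 0;
        for (long[] n : nodes.values()) {
            used += n[0];
            capacity += n[1];
        }
        double clusterUtil = utilization(used, capacity);
        double threshold = 10.0; // same value as -threshold 10

        System.out.printf("cluster utilization: %.2f%%%n", clusterUtil);
        for (Map.Entry<String, long[]> e : nodes.entrySet()) {
            double u = utilization(e.getValue()[0], e.getValue()[1]);
            System.out.printf("%s: %.2f%% -> %s%n",
                    e.getKey(), u, classify(u, clusterUtil, threshold));
        }
    }
}
```

With numbers like these, the small old nodes come out over-utilized and the large new nodes under-utilized, which is the kind of imbalance that makes the balancer report data left to move.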

Clearly the balancer had worked out how much data needed to move, yet it moved next to nothing. Why?

hadoop-mysql-balancer-master.log showed no Error or Warning entries, so the only option left was to read the source code.

It turns out the Hadoop balancer checks every block before moving it. The exact conditions are in the code below:

 /* Decide if it is OK to move the given block from source to target
   * A block is a good candidate if
   * 1. the block is not in the process of being moved/has not been moved;
   * 2. the block does not have a replica on the target;
   * 3. doing the move does not reduce the number of racks that the block has
   */

private boolean isGoodBlockCandidate(Source source, 
      BalancerDatanode target, BalancerBlock block) {
    // check if the block is moved or not
    if (movedBlocks.contains(block)) {
        return false;
    }
    if (block.isLocatedOnDatanode(target)) {
      return false;
    }

    boolean goodBlock = false;
    if (cluster.isOnSameRack(source.getDatanode(), target.getDatanode())) {
      // good if source and target are on the same rack
      goodBlock = true;
    } else {
      boolean notOnSameRack = true;
      synchronized (block) {
        for (BalancerDatanode loc : block.locations) {
          if (cluster.isOnSameRack(loc.datanode, target.datanode)) {
            notOnSameRack = false;
            break;
          }
        }
      }
      if (notOnSameRack) {
        // good if target is not on the same rack as any replica
        goodBlock = true;
      } else {
        // good if source is on the same rack as one of the replicas
        for (BalancerDatanode loc : block.locations) {
          if (loc != source && 
              cluster.isOnSameRack(loc.datanode, source.datanode)) {
            goodBlock = true;
            break;
          }
        }
      }
    }
    return goodBlock;
  }

Going through the three conditions above one by one to find out why the blocks were not moved:

(1) The block is not being moved and has not already been moved during this balancing run: satisfied.

(2) The block does not already have a replica on the target machine: to be verified.

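(3) Moving the block must not reduce the number of racks that hold a replica of it (what counts is the number of distinct racks, not the number of replicas: two replicas on the same rack count as one rack). The cluster was configured without a rack-awareness script, so by default every node sits on a single rack and this condition is satisfied.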

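So I went to the cluster to verify the second condition, and sure enough, many blocks already had a replica on the two machines that were added later. With the replicas already there, there was nothing left to move to them, so the balancer gave up and exited.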
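To make the diagnosis concrete, here is a minimal, self-contained sketch of the same three checks, using plain strings and sets in place of the balancer's own classes. The class name, method signature, and node names are all hypothetical. It is a simplified model of the logic quoted above, not the Hadoop implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CandidateCheckSketch {

    /** Simplified model of the three conditions; all names are illustrative. */
    static boolean isGoodCandidate(String source, String target,
                                   Set<String> replicaNodes,
                                   Map<String, String> rackOf,
                                   boolean alreadyMoved) {
        if (alreadyMoved) {
            return false; // condition 1: block was already moved in this run
        }
        if (replicaNodes.contains(target)) {
            return false; // condition 2: target already holds a replica
        }
        String sourceRack = rackOf.get(source);
        String targetRack = rackOf.get(target);
        if (sourceRack.equals(targetRack)) {
            return true; // same-rack move cannot reduce the rack count
        }
        boolean targetRackHasReplica = false;
        for (String node : replicaNodes) {
            if (rackOf.get(node).equals(targetRack)) {
                targetRackHasReplica = true;
                break;
            }
        }
        if (!targetRackHasReplica) {
            return true; // the move adds a new rack for this block
        }
        // Target rack is already covered: the move is safe only if another
        // replica stays behind on the source's rack.
        for (String node : replicaNodes) {
            if (!node.equals(source) && rackOf.get(node).equals(sourceRack)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // No rack-awareness script: every node is on the default rack.
        Map<String, String> rackOf = new HashMap<>();
        for (String n : Arrays.asList("old-1", "old-2", "new-1", "new-2")) {
            rackOf.put(n, "/default-rack");
        }
        // Hypothetical block with replicas on two old nodes and one new node.
        Set<String> replicas = new HashSet<>(Arrays.asList("old-1", "old-2", "new-1"));

        System.out.println("old-1 -> new-1: "
                + isGoodCandidate("old-1", "new-1", replicas, rackOf, false));
        System.out.println("old-1 -> new-2: "
                + isGoodCandidate("old-1", "new-2", replicas, rackOf, false));
    }
}
```

In this model a block whose replicas already sit on a new node is rejected as a candidate for that node, no matter how empty the node is. If most blocks on the full nodes look like that, the balancer runs out of candidates and exits, which matches the behaviour in the log.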

Solution:

1. Run the following command

bin/hadoop fs -setrep -R 2 /

This sets the replication factor of everything in the cluster uniformly to the value configured for dfs.replication in hdfs-site.xml:

<property>
<name>dfs.replication</name>
<value>2</value>
</property>

Then restart the Hadoop cluster, wait for Hadoop to finish checking its blocks, and run the balancer again. That resolves the problem.