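A while ago, disk usage across the datanodes in our Hadoop cluster had become badly unbalanced (mainly because two machines added to the cluster later have much larger disks), so the cluster needed rebalancing. I ran the following command: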
bin/start-balancer.sh -threshold 10
The log output was as follows:
Time Stamp                Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
Mar 10, 2014 11:03:40 AM           0                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:41 AM           1                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:42 AM           2               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:43 AM           3               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:44 AM           4            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:45 AM           5            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:46 AM           6            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:47 AM           7            891.85 KB           614.49 GB              20 GB
Mar 10, 2014 11:03:48 AM           8            891.85 KB           614.49 GB              20 GB
No block has been moved for 5 iterations. Exiting...
Balancing took 10.023 seconds
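The last two lines of the log suggest that the balancer gives up once several consecutive iterations move nothing. As a rough illustration only, the Java sketch below models that rule with a simple counter. The class name, the method, and the constant are hypothetical; the threshold of 5 is taken from the wording of the exit message, and the per-iteration amounts are assumed to be the differences between successive "Bytes Already Moved" values in the log. This is not the balancer's actual code.

```java
// Hypothetical sketch: stop once 5 consecutive iterations move nothing.
class NoProgressExitSketch {

    // Assumed from the exit message "No block has been moved for 5 iterations".
    static final int MAX_IDLE_ITERATIONS = 5;

    // Returns the index of the iteration after which the loop gives up,
    // or -1 if the idle limit is never reached.
    static int exitIteration(double[] movedPerIterationKb) {
        int idle = 0;
        for (int i = 0; i < movedPerIterationKb.length; i++) {
            if (movedPerIterationKb[i] > 0) {
                idle = 0; // progress was made, so reset the counter
            } else if (++idle >= MAX_IDLE_ITERATIONS) {
                return i; // too many idle iterations in a row
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed per-iteration amounts (KB), inferred from the log above.
        double[] moved = {0, 443, 0, 448.85, 0, 0, 0, 0, 0};
        System.out.println(exitIteration(moved)); // prints 8
    }
}
```

Under these assumptions the loop stops after iteration 8, which matches the last row of the log.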
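Clearly the balancer had worked out how much data needed to move, yet almost nothing was actually moved. Why?
hadoop-mysql-balancer-master.log contained no Error or Warning entries, so the only option left was to read the source code.
It turns out that the Hadoop balancer checks every block before moving it. The requirements are spelled out in the code below: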
/* Decide if it is OK to move the given block from source to target
 * A block is a good candidate if
 * 1. the block is not in the process of being moved/has not been moved;
 * 2. the block does not have a replica on the target;
 * 3. doing the move does not reduce the number of racks that the block has
 */
private boolean isGoodBlockCandidate(Source source,
    BalancerDatanode target, BalancerBlock block) {
  // check if the block is moved or not
  if (movedBlocks.contains(block)) {
    return false;
  }
  if (block.isLocatedOnDatanode(target)) {
    return false;
  }

  boolean goodBlock = false;
  if (cluster.isOnSameRack(source.getDatanode(), target.getDatanode())) {
    // good if source and target are on the same rack
    goodBlock = true;
  } else {
    boolean notOnSameRack = true;
    synchronized (block) {
      for (BalancerDatanode loc : block.locations) {
        if (cluster.isOnSameRack(loc.datanode, target.datanode)) {
          notOnSameRack = false;
          break;
        }
      }
    }
    if (notOnSameRack) {
      // good if target is target is not on the same rack as any replica
      goodBlock = true;
    } else {
      // good if source is on the same rack as on of the replicas
      for (BalancerDatanode loc : block.locations) {
        if (loc != source &&
            cluster.isOnSameRack(loc.datanode, source.datanode)) {
          goodBlock = true;
          break;
        }
      }
    }
  }
  return goodBlock;
}
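I checked the three requirements above one by one to find out why the blocks were not being moved:
(1) The block has not already been moved during this balancing run: satisfied.
(2) The block does not already have a replica on the target machine: to be verified.
(3) Moving the block does not reduce the number of racks that hold a replica of it (what counts is the number of distinct racks, not the total number of replicas). The cluster was configured without a rack-awareness script, so by default every node sits on the same single rack, and this condition is satisfied.
So I went to the cluster to verify the second condition, and sure enough, many blocks already had a replica on the two machines that were added later. There was nothing to move: the replicas were already there, so the balancer gave up and exited.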
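The three checks can be replayed on a toy model to see why such a block is rejected. The sketch below is a simplified stand-in, not Hadoop code: nodes and racks are plain strings, and the class, the helper singleRack, the node names dn1 to dn4, and the block ID are all made up for illustration. It assumes a cluster where every node is on one default rack, as described above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the three candidate checks.
class BalancerCandidateSketch {

    // Builds a node -> rack map that places every node on one default rack.
    static Map<String, String> singleRack(String... nodes) {
        Map<String, String> rackOf = new HashMap<>();
        for (String node : nodes) {
            rackOf.put(node, "/default-rack");
        }
        return rackOf;
    }

    static boolean isGoodCandidate(String source, String target,
                                   List<String> replicaNodes,
                                   Set<String> movedBlocks, String blockId,
                                   Map<String, String> rackOf) {
        // 1. reject blocks that were already moved in this run
        if (movedBlocks.contains(blockId)) {
            return false;
        }
        // 2. reject blocks that already have a replica on the target
        if (replicaNodes.contains(target)) {
            return false;
        }
        // 3. the move must not reduce the number of racks holding the block
        String sourceRack = rackOf.get(source);
        String targetRack = rackOf.get(target);
        if (sourceRack.equals(targetRack)) {
            return true; // same rack: the rack count cannot change
        }
        boolean targetRackHasReplica = false;
        for (String node : replicaNodes) {
            if (rackOf.get(node).equals(targetRack)) {
                targetRackHasReplica = true;
                break;
            }
        }
        if (!targetRackHasReplica) {
            return true; // the move adds a new rack
        }
        for (String node : replicaNodes) {
            if (!node.equals(source) && rackOf.get(node).equals(sourceRack)) {
                return true; // another replica stays on the source rack
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> rackOf = singleRack("dn1", "dn2", "dn3", "dn4");
        Set<String> moved = new HashSet<>();
        // The target dn4 already holds a replica: rejected by check 2.
        System.out.println(isGoodCandidate("dn1", "dn4",
            Arrays.asList("dn1", "dn4"), moved, "blk_1", rackOf)); // prints false
        // The target dn4 holds no replica: accepted.
        System.out.println(isGoodCandidate("dn1", "dn4",
            Arrays.asList("dn1", "dn2"), moved, "blk_1", rackOf)); // prints true
    }
}
```

In the first call the target already holds a replica, which is the situation found on the two new machines, so the block is never a candidate no matter how full the source node is.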
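Solution:
1. Run the following command
bin/hadoop fs -setrep -R 2 /
to set the replication factor of every block in the cluster to the value configured in your hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
Then restart the Hadoop cluster, wait for Hadoop to finish verifying its blocks, and run the balancer again. That resolves the problem.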