Our production cluster runs CDH5 in HA mode with two Namenode nodes. The Standby Namenode aborted abnormally because of errors related to the edits files, and during every startup attempt it kept reporting that various files could not be found.
At first we suspected the problem was confined to the Standby itself, so we tried bootstrapStandby to re-initialize the Standby node, but the problem persisted. Then, because I tried restarting the ZKFC (ZooKeeper Failover Controller) service, an automatic failover of the Active node was triggered; after that failover failed and we tried to switch back, the Active Namenode could no longer start either, failing with exactly the same error as the Standby, and the whole Hadoop cluster was down.
The situation was serious. After searching all over Google without finding anything useful, I had to turn to my boss. In the end, he came up with an idea: decompile the fsimage (metadata) file and the edits (edit log) files into text, look at what they actually contain, and figure out why loading the edits files throws the error.
That idea gave us a ray of hope, and in the end it let us recover the whole cluster.
Environment:
idc2-server1: namenode, journalnode, zkfc
idc2-server2: namenode, journalnode, zkfc
idc2-server3: journalnode, resourcemanager
The process in detail:
First, the following error appeared on the Standby Namenode, after which the process shut itself down abnormally:
2014-11-11 02:12:54,057 FATAL org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error encountered while tailing edits. Shutting down standby NN.
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
        ...
The key message is "Unknown error encountered while tailing edits. Shutting down standby NN."
We then tried to start the Standby Namenode service, which reported the following error:
2014-11-12 04:26:28,860 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://idc2-server10.heylinux.com:8480/getJournal?jid=idc2&segmentTxId=240823073&storageInfo=-55%3A1838233660%3A0%3ACID-d77ea84b-1b24-4bc2-ad27-7d2384d222d6' to transaction ID 240741256
2014-11-12 04:26:28,874 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015, replication=3, mtime=1415671845582, atime=1415670522749, blockSize=134217728, blocks=[], permissions=oozie:hdfs:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=240823292]
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
        ...
2014-11-12 04:26:32,641 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
It says the file "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015" cannot be found.
In fact, this is an unimportant temporary file that had already been deleted.
However, the log above reports "ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp", so the file is reported missing while the edits file is being loaded, specifically when the OP_CLOSE operation is replayed.
At first we suspected that only the fsimage or edits files on the Standby were broken, so we ran bootstrapStandby on the Standby node. This process automatically fetches the latest fsimage from the Active Namenode and downloads and replays the newer edits files from the Journalnode servers.
sudo -u hdfs hadoop namenode -bootstrapStandby
However, after the re-initialization, loading the edits files still hit exactly the same error.
Next, because I restarted the ZKFC (ZooKeeper Failover Controller) service, the Active Namenode automatically failed over to the Standby; since the Standby could not take over, the Active Namenode could not start normally either when we tried to switch back, with the same error as on the Standby.
At that point the whole Hadoop cluster was down. After searching all over Google without finding anything useful, I called my boss, who could not find a single useful result for the error either. He then tried grepping the path above in the edits files and found several related ones:
# cd /data1/dfs/nn/
# cp -rpa current current.backup.orig
# cd /data2/dfs/nn/
# cp -rpa current current.backup.orig
# cd /data1/dfs/nn/current
# grep attempt_1415171013961_37060_m_000015_0 *
Binary file edits_0000000000240687057-0000000000240698453 matches
Binary file edits_0000000000240823073-0000000000240838096 matches
Binary file edits_inprogress_0000000000244853266 matches
So we wondered whether these edits files, or the fsimage file, could give us some clues.
The following two documents describe Hadoop's built-in offline viewers that can decompile fsimage and edits files:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html
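The first link documents the Offline Image Viewer (hdfs oiv), which dumps an fsimage checkpoint into XML; the second documents the Offline Edits Viewer (hdfs oev) used below. As a minimal sketch of the fsimage side (the fsimage filename here is only a placeholder, not a file from our cluster):

hdfs oiv -i fsimage_0000000000240000000 -o fsimage_0000000000240000000.xml -p XML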
One passage in the edits viewer documentation gave us great hope:
In case there is some problem with hadoop cluster and the edits file is corrupted it is possible to save at least part of the edits file that is correct. This can be done by converting the binary edits to XML, edit it manually and then convert it back to binary.
From this we learned that an edits file can become corrupted, and that decompiling it, fixing it by hand, compiling it back to binary, and swapping it in can serve as a fix. So we copied out the two related edits files found above and decompiled them:
mkdir /tmp2/
cd /data1/dfs/nn
cp edits_0000000000240687057-0000000000240698453 /tmp2/
cp edits_0000000000240823073-0000000000240838096 /tmp2/
hdfs oev -i edits_0000000000240687057-0000000000240698453 -o edits_0000000000240687057-0000000000240698453.xml
hdfs oev -i edits_0000000000240823073-0000000000240838096 -o edits_0000000000240823073-0000000000240838096.xml
Decompilation produced two XML files. Searching them for "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015", we found the related OP_CLOSE and OP_DELETE records:
<RECORD>
  <OPCODE>OP_DELETE</OPCODE>
  <DATA>
    <TXID>240818498</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015</PATH>
    <TIMESTAMP>1415671972595</TIMESTAMP>
    <RPC_CLIENTID>4a38861d-3bee-40e6-abb6-d2b58f313781</RPC_CLIENTID>
    <RPC_CALLID>676</RPC_CALLID>
  </DATA>
</RECORD>
<RECORD>
  <OPCODE>OP_CLOSE</OPCODE>
  <DATA>
    <TXID>240823292</TXID>
    <LENGTH>0</LENGTH>
    <INODEID>0</INODEID>
    <PATH>/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015</PATH>
    <REPLICATION>3</REPLICATION>
    <MTIME>1415671845582</MTIME>
    <ATIME>1415670522749</ATIME>
    <BLOCKSIZE>134217728</BLOCKSIZE>
    <CLIENT_NAME></CLIENT_NAME>
    <CLIENT_MACHINE></CLIENT_MACHINE>
    <PERMISSION_STATUS>
      <USERNAME>oozie</USERNAME>
      <GROUPNAME>hdfs</GROUPNAME>
      <MODE>420</MODE>
    </PERMISSION_STATUS>
  </DATA>
</RECORD>
As you can see, for "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015" the OP_DELETE was recorded before the OP_CLOSE, so replaying the OP_CLOSE fails with "File does not exist".
So we tried replacing the OP_CLOSE record with something else, such as a harmless permission change on an existing file, while keeping TXID 240823292 so that the edits file stays consistent.
<RECORD>
  <OPCODE>OP_SET_PERMISSIONS</OPCODE>
  <DATA>
    <TXID>240823292</TXID>
    <SRC>/user/oozie-heylinux/.staging/job_1415171013961_37194</SRC>
    <MODE>504</MODE>
  </DATA>
</RECORD>
After the modification, we compiled the XML file back into binary format.
cd /tmp2/
cp edits_0000000000240823073-0000000000240838096.xml edits_0000000000240823073-0000000000240838096.xml.orig
vim edits_0000000000240823073-0000000000240838096.xml
hdfs oev -i edits_0000000000240823073-0000000000240838096.xml -o edits_0000000000240823073-0000000000240838096 -p binary
Then we copied the binary file back into the namenode metadata directories and synced it to the Journalnode servers:
cd /var/hadoop/data/dfs/jn/idc2prod/
cp -rpa current current.backup.1
cd /tmp2/
cp edits_0000000000240823073-0000000000240838096 /data1/dfs/nn/current/
cp edits_0000000000240823073-0000000000240838096 /data2/dfs/nn/current/
cp edits_0000000000240823073-0000000000240838096 /var/hadoop/data/dfs/jn/idc2prod/current/
scp edits_0000000000240823073-0000000000240838096 root@idc2-server2:/var/hadoop/data/dfs/jn/idc2prod/current/
scp edits_0000000000240823073-0000000000240838096 root@idc2-server3:/var/hadoop/data/dfs/jn/idc2prod/current/
Then we started the namenode service. The previous error was gone; in its place, the same error appeared for a different file.
2014-11-12 08:57:13,053 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://idc2-server1.heylinux.com:8480/getJournal?jid=idc2prod&segmentTxId=240823073&storageInfo=-55%3A1838233660%3A0%3ACID-d77ea84b-1b24-4bc2-ad27-7d2384d222d6' to transaction ID 240299210
2014-11-12 08:57:13,063 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018, replication=3, mtime=1415671845675, atime=1415670519537, blockSize=134217728, blocks=[], permissions=oozie:hdfs:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=240823337]
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
        ...
2014-11-12 08:57:16,847 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
        ...
From there it was a matter of repeating the steps above. Sometimes a pattern emerges and you can batch-replace the OP_CLOSE records for all the repeatedly failing files under the same directory (see the sketch below); more often the failing files are random, and you have to edit the XML, compile it back to binary, start the namenode, and make the next targeted fix, over and over until the Namenode finally starts successfully.
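To see which records in a decompiled edits file touch a given problem directory, a simple grep is enough. This is only a rough sketch: the attempt prefix is the one from our logs, the amount of context (-B 5) depends on how oev lays out each record, and the OPCODE/TXID/PATH lines it prints are what you then replace by hand as shown above.

cd /tmp2/
grep -B 5 'attempt_1415171013961_37060' edits_0000000000240823073-0000000000240838096.xml | egrep 'OPCODE|TXID|PATH'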
During the process we also ran into an error about OP_ADD_BLOCK; it was caused by some OP_UPDATE_BLOCK-related errors that appeared when the last edits file was being compiled back into binary.
I replaced the offending records in the same way as above, and only then did that edits file convert back into a binary file successfully.
The concrete fix is to locate the OP_ADD_BLOCK-related record from the "Encountered exception on operation AddBlockOp" message and replace it the same way.
2014-11-12 18:07:39,070 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/user/dong/data/dpp/classification/gender/vw-input/2014-10-30-research-with-no-confict-fix-bug-rerun/all_labelled/_temporary/1/_temporary/attempt_1415171013961_42350_m_001474_0/part-m-01474, penultimateBlock=NULL, lastBlock=blk_1109647729_35920089, RpcClientId=, RpcCallId=-2]
java.lang.IllegalStateException
Finally, once the Namenode started successfully, it reported many missing blocks. The fix is to use fsck to clean up these bad blocks.
hadoop fsck / -files -blocks -locations | tee -a fsck.out
Then collect the information on all the affected blocks from fsck.out and run "hadoop fsck / -move" to get rid of the bad blocks.
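As a rough sketch of pulling the affected paths out of fsck.out first (the MISSING/CORRUPT patterns are assumptions about the fsck output wording, so check them against your own fsck.out):

egrep 'MISSING|CORRUPT' fsck.out | awk -F: '{print $1}' | sort -u > bad_files.txt
hadoop fsck / -move

Note that -move relocates the corrupted files into /lost+found rather than deleting them outright, so they can still be removed afterwards once you have confirmed they are expendable.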
Finally, we left safemode, and life was good again.
hadoop dfsadmin -safemode leave