Fixing HBase Problems

Reposted from "HBase问题修复" (Fixing HBase Problems)

[HBase version 1.1.2]

[First check]

Run the command: hbase hbck -details "default:test_tony" > 20171227_hbck_test_tony 2>&1 &

Inspecting the log 20171227_hbck_test_tony reveals three kinds of errors: [1] First region should start with an empty key, [2] Region not deployed on any region server, [3] a hole in the region chain

Log excerpt:

[hbase@kmr-core1-001 ~]$ less 20171227_hbck_test_tony
... ...
ERROR: (region test_tony,P_4013488,1512319359517.c6892d77b1ea148f3e0642d9fdce68af.) First region should start with an empty key.  You need to  create a new region and regioninfo in HDFS to plug the hole.
'ksai:export_import_table_test': There is a hole in the region chain between P_802083 and P_901927D.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between P_A011FE2 and P_B00FAE.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.
---- Table 'test_tony': overlap groups
There are 0 overlap groups with 0 overlapping regions
ERROR: Found inconsistency in table test_tony
2017-12-27 16:27:47,925 INFO  [main] util.HBaseFsck: Computing mapping of all store files
... ...
ERROR: Region { meta => test_tony,P_2009AE6,1511517508282.1dc960c2cb30a897ec50bbb656e9faa9., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/1dc960c2cb30a897ec50bbb656e9faa9, deployed => , replicaId => 0 } not deployed on any region server.
ERROR: Region { meta => test_tony,P_F007E:,1511757663778.5534b6665b97db8d8a7851955ccd8dc5., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/5534b6665b97db8d8a7851955ccd8dc5, deployed => , replicaId => 0 } not deployed on any region server.
ERROR: Region { meta => test_tony,P_802083,1513409109894.57ff7ad1a791fb0f8c70f751108d55dd., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/57ff7ad1a791fb0f8c70f751108d55dd, deployed => , replicaId => 0 } not deployed on any region server.
ERROR: Region { meta => test_tony,,1511517508282.d0a9bbf43b36b122b7c7f4256f9cdba4., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/d0a9bbf43b36b122b7c7f4256f9cdba4, deployed => , replicaId => 0 } not deployed on any region server.
ERROR: Region { meta => test_tony,P_A011FE2,1511537471096.f5777d857532db7ac592e3b621c0372e., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/f5777d857532db7ac592e3b621c0372e, deployed => , replicaId => 0 } not deployed on any region server.
... ...
2017-12-27 16:27:49,637 INFO  [main] util.HBaseFsck: Finishing hbck
Summary:
Table hbase:meta is okay.
    Number of regions: 1
    Deployed on:  kmr-5b9c18fc-gn-7b3518df-core-1-005.ksc.com,16020,1514187996430
Table test_tony is inconsistent.
    Number of regions: 8
    Deployed on:  kmr-5b9c18fc-gn-7b3518df-core-1-001.ksc.com,16020,1514188220947 kmr-5b9c18fc-gn-7b3518df-core-1-003.ksc.com,16020,1514187978942 kmr-5b9c18fc-gn-7b3518df-core-1-004.ksc.com,16020,1514187984473 kmr-5b9c18fc-gn-7b3518df-core-1-005.ksc.com,16020,1514187996430 kmr-5b9c18fc-gn-7b3518df-core-1-006.ksc.com,16020,1514188010078 kmr-5b9c18fc-gn-7b3518df-core-1-008.ksc.com,16020,1514188032392
9 inconsistencies detected.
Status: INCONSISTENT
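With a long hbck log it can help to tally the error types before deciding which repair option to try. The following is a minimal sketch, assuming the log contains one message per line and uses the same message wording quoted in the excerpt above; the function name summarize_hbck_errors and the output labels are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch: count the three error categories seen in an hbck log.
# The patterns are substrings of the messages quoted in the excerpt;
# the function name and output labels are illustrative assumptions.
summarize_hbck_errors() {
    log_file="$1"
    # grep -c prints the number of matching lines (0 when none match).
    printf 'empty_start_key=%s\n' \
        "$(grep -c 'First region should start with an empty key' "$log_file")"
    printf 'not_deployed=%s\n' \
        "$(grep -c 'not deployed on any region server' "$log_file")"
    printf 'region_hole=%s\n' \
        "$(grep -c 'There is a hole in the region chain' "$log_file")"
}
```

For example, `summarize_hbck_errors 20171227_hbck_test_tony` would print one `name=count` line per category.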

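The process is stuck while moving a region to another region server. A likely cause is that the source region has not been assigned to any other region server. A possible fix: provided the region's HFiles still exist, manually force-assign the region.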
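As a hedged illustration of the manual force-assign idea, the sketch below only generates HBase shell `assign` commands from the "not deployed" errors in an hbck log; it does not run them. It assumes one error per line in the format shown in the log excerpt, and the function names list_undeployed_regions and print_assign_commands are hypothetical. The generated commands should be reviewed, and the region directories confirmed to exist in HDFS, before anything is executed in the HBase shell.

```shell
#!/bin/sh
# Sketch: turn "not deployed on any region server" errors into
# HBase shell assign commands for manual review.
# Function names are illustrative assumptions, not part of HBase.

# Print the full region name (the "meta =>" field) of each undeployed region.
list_undeployed_regions() {
    sed -n 's/.*Region { meta => \(.*\), hdfs => .*not deployed on any region server.*/\1/p' "$1" |
        LC_ALL=C sort -u
}

# Print one assign command per undeployed region; nothing is executed.
print_assign_commands() {
    list_undeployed_regions "$1" | while IFS= read -r region; do
        printf "assign '%s'\n" "$region"
    done
}
```

Keeping generation separate from execution is deliberate: a forced assign is an expert-level operation, so it seems safer to inspect the list first.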

[First repair attempt]

Note: disable the table before repairing it.

hbase(main):022:0> disable 'default:test_tony'
0 row(s) in 4.9250 seconds

Then run the command: hbase hbck -repair "default:test_tony" > 20171227_tried_repaired_test_tony 2>&1 &

Inspecting the log 20171227_tried_repaired_test_tony shows that two of the three errors above have been fixed, but the "Region not deployed on any region server" error has not.

Log excerpt:

ERROR: Region { meta => test_tony,P_2009AE6,1511517508282.1dc960c2cb30a897ec50bbb656e9faa9., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/1dc960c2cb30a897ec50bbb656e9faa9, deployed => , replicaId => 0 } not deployed on any region server.
ERROR: Region { meta => test_tony,P_F007E:,1511757663778.5534b6665b97db8d8a7851955ccd8dc5., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/5534b6665b97db8d8a7851955ccd8dc5, deployed => , replicaId => 0 } not deployed on any region server.
Trying to fix unassigned region...
Trying to fix unassigned region...
ERROR: Region { meta => test_tony,,1511517508282.d0a9bbf43b36b122b7c7f4256f9cdba4., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/d0a9bbf43b36b122b7c7f4256f9cdba4, deployed => , replicaId => 0 } not deployed on any region server.
Trying to fix unassigned region...
ERROR: Region { meta => test_tony,P_802083,1513409109894.57ff7ad1a791fb0f8c70f751108d55dd., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/57ff7ad1a791fb0f8c70f751108d55dd, deployed => , replicaId => 0 } not deployed on any region server.
Trying to fix unassigned region...
ERROR: Region { meta => test_tony,P_A011FE2,1511537471096.f5777d857532db7ac592e3b621c0372e., hdfs => hdfs://hdfs-ha/apps/hbase/data/data/default/test_tony/f5777d857532db7ac592e3b621c0372e, deployed => , replicaId => 0 } not deployed on any region server.
Trying to fix unassigned region...

[Second repair attempt]

Run the command: hbase hbck -fixMeta -fixAssignments "default:test_tony" > 20171227_2nd_tried_repaired_test_tony 2>&1 &

Inspecting the log 20171227_2nd_tried_repaired_test_tony shows that errors [1] and [3] from the list above are fixed, but [2] is still not fixed, and a new error has appeared: [4] Region failed to move out of transition within timeout XXXXXXms

Log excerpt:

2017-12-27 17:07:19,032 WARN  [hbasefsck-pool1-t42] util.HBaseFsck: Unable to complete check or repair the region 'test_tony,P_802083,1513409109894.57ff7ad1a791fb0f8c70f751108d55dd.'.
java.io.IOException: Region {ENCODED => 57ff7ad1a791fb0f8c70f751108d55dd, NAME => 'test_tony,P_802083,1513409109894.57ff7ad1a791fb0f8c70f751108d55dd.', STARTKEY => 'P_802083', ENDKEY => 'P_901927D'} failed to move out of transition within timeout 120000ms
        at org.apache.hadoop.hbase.util.HBaseFsckRepair.waitUntilAssigned(HBaseFsckRepair.java:149)
        at org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:2114)
        at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:2315)
        at org.apache.hadoop.hbase.util.HBaseFsck.access$1100(HBaseFsck.java:197)
        at org.apache.hadoop.hbase.util.HBaseFsck$CheckRegionConsistencyWorkItem.call(HBaseFsck.java:1887)
        at org.apache.hadoop.hbase.util.HBaseFsck$CheckRegionConsistencyWorkItem.call(HBaseFsck.java:1875)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2017-12-27 17:05:21,028 INFO  [hbasefsck-pool1-t28] util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {ENCODED => f5777d857532db7ac592e3b621c0372e, NAME => 'test_tony,P_A011FE2,1511537471096.f5777d857532db7ac592e3b621c0372e.', STARTKEY => 'P_A011FE2', ENDKEY => 'P_B00FAE'}
2017-12-27 17:05:22,028 INFO  [hbasefsck-pool1-t26] util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {ENCODED => d0a9bbf43b36b122b7c7f4256f9cdba4, NAME => 'test_tony,,1511517508282.d0a9bbf43b36b122b7c7f4256f9cdba4.', STARTKEY => '', ENDKEY => 'P_2009AE6'}

Error [4] and the "Region still in transition" (RIT) messages occur because the region has been unassigned by its original region server but has not yet been assigned to a new RS, leaving it without an owner.
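To see at a glance which regions are stuck in this ownerless state, the encoded region names can be pulled out of the hbck log. This is a minimal sketch, assuming each log entry sits on a single line and follows the two message formats shown in the excerpt above; the function name list_rit_regions is hypothetical.

```shell
#!/bin/sh
# Sketch: list the encoded names of regions that hbck reported as
# "still in transition" or as having timed out while in transition.
# The function name is an illustrative assumption.
list_rit_regions() {
    sed -n \
        -e 's/.*Region still in transition.*{ENCODED => \([0-9a-f]*\),.*/\1/p' \
        -e 's/.*Region {ENCODED => \([0-9a-f]*\),.*failed to move out of transition.*/\1/p' \
        "$1" | LC_ALL=C sort -u
}
```

Each printed value is an encoded region name, deduplicated, because hbck logs the same region repeatedly while it waits.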

[Third repair attempt]

On the Linux command line, run: hbase hbck -fixAssignments 'default:test_tony' > 20171227_toFix-_an_empty_key 2>&1 &

Inspecting the log 20171227_toFix-_an_empty_key shows that the repair succeeded!

Summary:
Table hbase:meta is okay.
    Number of regions: 1
    Deployed on:  kmr-5b9c18fc-gn-7b3518df-core-1-005.ksc.com,16020,1514187996430
Table test_tony is okay.
    Number of regions: 0
    Deployed on:
0 inconsistencies detected.
Status: OK

Check once more that the data is healthy:

hbase hbck -fixAssignments 'default:test_tony' > 20171227_After_fixAssgnmentOf_should_end_with_an_empty_key_FORtest_tony 2>&1 &

Inspecting the log 20171227_After_fixAssgnmentOf_should_end_with_an_empty_key_FORtest_tony shows Status: OK
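Because hbck was run in the background with its output redirected, the final status has to be read from the log file. The sketch below wraps that check in a function that a script can test; it assumes the summary lines look like the ones quoted above, and the function name hbck_report_ok is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if an hbck log ends with a healthy status.
# The function name is an illustrative assumption.
hbck_report_ok() {
    log_file="$1"
    # Show the inconsistency count line for the operator, if present.
    grep 'inconsistencies detected' "$log_file"
    # The exit status of the function is that of this final grep:
    # 0 when a "Status: OK" line exists, non-zero otherwise.
    grep -q 'Status: OK' "$log_file"
}
```

A caller could then gate the next step on it, for example `hbck_report_ok some_hbck_log && echo "safe to re-enable the table"`, where some_hbck_log stands for whichever log file is being checked.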

Once the repair has succeeded, re-enable the table:

hbase(main):028:0> enable 'default:test_tony'
0 row(s) in 2.3100 seconds

At this point, the repair is complete.

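Author: Tony_仔. Source: CSDN. Original: http://www.javashuo.com/article/p-rzcggimr-ec.html. Copyright notice: this is an original article by the blogger; please include a link to the original post when reposting.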