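I recently went through a rather painful disaster recovery. During a maintenance window the whole database had to be shut down, but shutdown immediate never finished, so in the end I had to use shutdown abort to force the database down. When I tried to open it again, the following ORA-00600 error appeared: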
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1158738710], [0], [1128825042], [], [], []
Wed Apr 20 21:36:40 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_5622.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1158738710], [0], [1128825042], [], [], []
Wed Apr 20 21:36:40 2011
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
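So shutdown abort is not to be used lightly; a hard lesson! There was nothing for it now but to grit my teeth and recover the database. At first Oracle reported that system01 needed media recovery, and a query showed that the SCN in the controlfile did not match the datafile headers. I tried recover database until cancel, which reported that recovery was complete, but alter database open resetlogs still failed with the same error!
Searching Metalink for [kclchkblk_4] turned up note [ID 275902.1], which describes exactly this situation: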
1) Error, ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block
is too high. The same reason caused the ORA-600[2662]s in the alert logs.
2) This issue is because the tempfiles may not get reinitialized during open
resetlogs.
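In other words, the SCN of the temporary tablespace falls out of step with the system SCN during resetlogs. The fix is to delete all the physical tempfiles while the database is in mount state, and then add tempfiles back once the database is open.
After doing this, the open failed with a new error: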
Wed Apr 20 22:34:54 2011
SMON: enabling cache recovery
Wed Apr 20 22:34:54 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_30165.trc:
ORA-00600: internal error code, arguments: [2662], [0], [1128985090], [0], [1158738710], [8388617], [], []
Wed Apr 20 22:34:58 2011
Errors in file /u01/app/oracle/admin/orcl/udump/orcl1_ora_30165.trc:
ORA-00600: internal error code, arguments: [2662], [0], [1128985090], [0], [1158738710], [8388617], [], []
Wed Apr 20 22:34:58 2011
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Wed Apr 20 22:34:58 2011
Errors in file /u01/app/oracle/admin/orcl/bdump/orcl1_lmon_30042.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 30165
ORA-600 [2662] is a common error after an incomplete recovery that used parameters such as _allow_resetlogs_corruption. The main cause is that the SCN of a data block is ahead of the database's current SCN: the current SCN is compared with the dependent SCN stored in a UGA variable, and if the current SCN is lower, the database raises ORA-600 [2662], as in the alert log above.
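To make that comparison concrete, here is a minimal Python sketch that pulls the bracketed arguments out of the ORA-00600 [2662] line from the alert log and compares the two SCN values. The argument layout used here (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, then a data block address) is the commonly described one for this error and is an assumption, not something stated in this post; the function name parse_ora600_2662 is made up for illustration.

```python
import re

def parse_ora600_2662(line):
    """Extract SCN values from an ORA-00600 [2662] message.

    Assumed layout (not confirmed by the post):
    [2662], [cur_wrap], [cur_base], [dep_wrap], [dep_base], [rdba]
    """
    # Collect the contents of every [...] group, including empty ones.
    args = re.findall(r"\[([^\]]*)\]", line)
    if not args or args[0] != "2662":
        raise ValueError("not an ORA-00600 [2662] line")
    cur_wrap, cur_base, dep_wrap, dep_base, rdba = (int(a) for a in args[1:6])
    # A full SCN is the wrap in the high 32 bits plus the base in the low 32 bits.
    current_scn = (cur_wrap << 32) + cur_base
    dependent_scn = (dep_wrap << 32) + dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        "gap": dependent_scn - current_scn,
        "rdba": rdba,
    }

line = ("ORA-00600: internal error code, arguments: [2662], [0], "
        "[1128985090], [0], [1158738710], [8388617], [], []")
info = parse_ora600_2662(line)
print(info["gap"])  # prints: 29753620
```

A positive gap means the current SCN is behind the dependent SCN, which is the condition described above.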
The usual fix for 2662 is to adjust the SCN with event 10015:
alter session set events '10015 trace name adjust_scn level x';
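Here x is the level. Each level advances the SCN by about 1 billion (1024*1024*1024 = 1073741824), and level 1 is usually enough, though it can be adjusted to fit the situation. In our case the error reports that 1128985090 is less than 1158738710. With level 1 the adjusted SCN would be 1073741824, which is still lower than the current SCN; an adjustment that is too small raises another ORA-600, this time [2256], so I used level 2.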
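The level arithmetic can be sketched in a few lines of Python. This assumes, as the figures above suggest, that level x corresponds to an SCN of x * 1024*1024*1024; the helper name min_adjust_scn_level is hypothetical and only illustrates the calculation.

```python
# SCN step per level, per the description above (an assumption about event 10015).
SCN_PER_LEVEL = 1024 * 1024 * 1024  # 1073741824

def min_adjust_scn_level(target_scn):
    """Smallest level whose adjusted SCN (level * 2**30) exceeds target_scn."""
    return target_scn // SCN_PER_LEVEL + 1

current_scn = 1128985090    # second SCN-related argument of ORA-600 [2662]
dependent_scn = 1158738710  # fourth argument of ORA-600 [2662]

# The new SCN must clear the larger of the two values.
level = min_adjust_scn_level(max(current_scn, dependent_scn))
print(level, level * SCN_PER_LEVEL)  # prints: 2 2147483648
```

Level 1 gives 1073741824, which is below both values, so the calculation lands on level 2, matching the choice made here.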
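Based on past experience with 8i/9i, the database should have opened at this point, but the open still failed with the same error, and querying V$database showed that the SCN had not changed either. Apparently the SCN adjustment had not taken effect, which made things a bit awkward.
A careful look at the generated trace file showed that an ORA-01031 insufficient privileges error had been raised before the 2662 error: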
Clearing ORA-1031 thrown by trace 'ADJUST_SCN'
----- Dump for trace 'ADJUST_SCN': -----
*** 2011-04-20 23:54:19.034
ksedmp: internal or fatal error
ORA-01031: insufficient privileges
Current SQL statement for this session:
alter database open
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
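So some kind of privilege problem really was causing the SCN adjustment to fail. This method was used all the time on 8i/9i and should not have been a problem, so I could only guess that Oracle 10g had changed something about event 10015. After asking around, including some friends and QQ groups, I finally learned from one friend about a parameter, _allow_error_simulation: event 10015 can adjust the SCN only when this parameter is set to true. Asking others for help is a good habit, but I am firmly against asking late at night!!!
I set this parameter in init.ora, used event 10015 again, and the database finally opened. After that it was an exp/imp rebuild, and the job wrapped up smoothly.
The lesson from this job: use shutdown abort with great care, and then some!!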