Oracle Security Warning Series: Adding the Wrong Raw Device Corrupts the Redo Log

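A friend's database recently failed and he asked me for help. Analysis of the alert log showed that the staff on site, who understood neither raw devices on AIX nor the Oracle database, had used OEM directly to create a new tablespace. That crashed the database and left it unable to start normally.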

Thread 1 advanced to log sequence 4395

  Current log# 1 seq# 4395 mem# 0: /dev/rorcl_redo01

Thu Jun 12 19:28:38 2014

/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' SIZE 2000M EXTENT MANAGEMENT

LOCAL SEGMENT SPACE MANAGEMENT  AUTO

ORA-1119 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04'

SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...

Thu Jun 12 19:36:23 2014

/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' SIZE 2000M EXTENT MANAGEMENT

LOCAL SEGMENT SPACE MANAGEMENT  AUTO

Thu Jun 12 19:43:56 2014

ORA-604 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03'

SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...

Thu Jun 12 19:48:11 2014

/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03' SIZE 2000M EXTENT

MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO

Thu Jun 12 19:48:11 2014

ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03'

 SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...

Thu Jun 12 19:48:20 2014

/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' SIZE 2000M EXTENT

MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO

ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04'

SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...

Fri Jun 13 00:50:37 2014

Trace dumping is performing id=[cdmp_20140613005032]

Fri Jun 13 00:50:40 2014

Reconfiguration started (old inc 4, new inc 6)

List of nodes:

 0

 Global Resource Directory frozen

 * dead instance detected - domain 0 invalid = TRUE

…………

Fri Jun 13 00:50:40 2014

Beginning instance recovery of 1 threads

Reconfiguration complete

Fri Jun 13 00:50:41 2014

 parallel recovery started with 7 processes

Fri Jun 13 00:50:43 2014

Started redo scan

Fri Jun 13 00:50:43 2014

Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:

ORA-00316: log 3 of thread 2, type 0 in header is not log file

ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'

Fri Jun 13 00:50:43 2014

Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:

ORA-00316: log 3 of thread 2, type 0 in header is not log file

ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'

SMON: terminating instance due to error 316

Fri Jun 13 00:50:43 2014

Errors in file /oracle/admin/orcl/bdump/orcl1_lgwr_335980.trc:

ORA-00316: log  of thread , type  in header is not log file

Instance terminated by SMON, pid = 213438

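From this log we can see that two mistakes were made while creating the tablespace through OEM:
1. The AIX naming convention for block devices and character devices was not understood.
2. The current redo log in use by node 2 was treated as an unused device and chosen as the datafile for the new tablespace.
Because the tablespace was created with the wrong file on the wrong device, node 2's current redo log (/dev/rorcl_redo03) was corrupted. The redo header is read first, so the first error the database reports is ORA-00316: log of thread , type in header is not log file. The log also shows why the naming matters: the character-device names /dev/rorcl_redo03 and /dev/rorcl_redo04 were rejected at once with ORA-1537 (file already part of database), while the block-device name /dev/orcl_redo03 presumably did not match any file name registered in the database, so that statement ran for about seven minutes before failing with ORA-604. Node 2 crashed first, and node 1 then started instance recovery. Since node 2's current redo log was already corrupted, the instance recovery could not complete and both nodes crashed. With the current redo log of one RAC node corrupted, the database cannot be opened normally.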
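The two mistakes above suggest a simple pre-check before handing any raw device to CREATE TABLESPACE. The following is only a minimal sketch, not a procedure from the original incident. It assumes the usual AIX convention that a logical volume NAME is exposed as the block device /dev/NAME and the character (raw) device /dev/rNAME. The function names `same_aix_volume` and `conflicting_files` are hypothetical, and `IN_USE` is example data taken from the redo paths in the log above; in practice the list would be collected from views such as v$logfile, v$datafile, v$tempfile and v$controlfile on every instance.

```python
import posixpath

# Example data only: redo log members that appear in the alert log above.
# In a real check this list would come from the database (assumption).
IN_USE = [
    "/dev/rorcl_redo01",
    "/dev/rorcl_redo03",
    "/dev/rorcl_redo04",
]


def same_aix_volume(path_a, path_b):
    """Return True if two AIX device paths may refer to the same logical volume.

    The comparison is deliberately conservative: names match if they are
    identical or if one is the other with an extra leading "r" (the raw
    device prefix). It never strips an "r", because a volume name may
    itself start with that letter.
    """
    if posixpath.dirname(path_a) != posixpath.dirname(path_b):
        return False
    a = posixpath.basename(path_a)
    b = posixpath.basename(path_b)
    return a == b or a == "r" + b or b == "r" + a


def conflicting_files(candidate, files_in_use):
    """List the in-use files that share a volume with the candidate device."""
    return [f for f in files_in_use if same_aix_volume(candidate, f)]


if __name__ == "__main__":
    for device in ("/dev/orcl_redo03", "/dev/rorcl_redo04"):
        hits = conflicting_files(device, IN_USE)
        if hits:
            print(device, "is already used by the database as", hits)
        else:
            print(device, "does not match any known database file")
```

A check like this flags /dev/orcl_redo03 even though its name differs from the registered member /dev/rorcl_redo03, which is exactly the case that slipped through in the log. An empty result does not prove a device is free; it only means that no known database file matches it.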
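If a backup exists, the database can be restored and recovered from it. Without a backup, the only option is to force the database open and salvage whatever data can be saved. I hope this does not turn into a major data-loss tragedy.
I am sharing this case as a warning: be careful with any operation on the raw devices of a database, and do not touch them if you are not sure what you are doing, because the consequences are severe.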

 

 
