联系:手机/微信(+86 17813235971) QQ(107644445)
标题:存储掉电强制拉库引起ORA-01555和ORA-01189/ORA-01190故障处理
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
机房存储突然掉电导致Oracle数据库访问存储异常,数据库报出大量的ORA-27072: File I/O error,Linux-x86_64 Error: 5: Input/output error,ORA-15081: failed to submit an I/O operation to a disk等错误,实例直接crash
Wed Aug 27 07:11:53 2025 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_m000_17596.trc: ORA-27072: File I/O error Linux-x86_64 Error: 5: Input/output error Additional information: 4 Additional information: 6297632 Additional information: -1 WARNING: Read Failed. group:1 disk:0 AU:3075 offset:16384 size:16384 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-00202: control file: '+DG/xff/controlfile/current.284.918834897' ORA-15081: failed to submit an I/O operation to a disk WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 284 in group [1.2747812198] from disk DG_0000 allocation unit 3075 reason error; if possible, will try another mirror side Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_m000_17596.trc: ORA-00202: control file: '+DG/xff/controlfile/current.284.918834897' ORA-15081: failed to submit an I/O operation to a disk Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-27061: waiting for async I/Os failed Linux-x86_64 Error: 5: Input/output error Additional information: -1 Additional information: 16384 WARNING: Write Failed. group:1 disk:0 AU:3080 offset:49152 size:16384 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-27061: waiting for async I/Os failed Linux-x86_64 Error: 5: Input/output error Additional information: -1 Additional information: 16384 WARNING: Write Failed. group:1 disk:0 AU:3075 offset:49152 size:16384 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 284 in group 1 on disk 0 allocation unit 3075 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-15080: synchronous I/O operation to a disk failed WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 283 in group 1 on disk 0 allocation unit 3080 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '+DG/xff/controlfile/current.283.918834897' ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '+DG/xff/controlfile/current.284.918834897' ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc: ORA-00221: error on write to control file ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '+DG/xff/controlfile/current.283.918834897' ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk ORA-00206: error in writing (block 3, # blocks 1) of control file ORA-00202: control file: '+DG/xff/controlfile/current.284.918834897' ORA-15081: failed to submit an I/O operation to a disk ORA-15081: failed to submit an I/O operation to a disk CKPT (ospid: 6165): terminating the instance due to error 221 Wed Aug 27 07:11:53 2025 ORA-1092 : opitsk aborting process
存储恢复之后,尝试open数据库报ORA-00333错误(该错误一般是由于redo写丢失导致)
Wed Aug 27 16:36:32 2025 ALTER DATABASE OPEN This instance was first to open Beginning crash recovery of 2 threads parallel recovery started with 31 processes Started redo scan Incomplete read from log member '+DG/xff/onlinelog/group_2.287.918834905'. Trying next member. Incomplete read from log member '+DG/xff/onlinelog/group_2.288.918834911'. Trying next member. Incomplete read from log member '+DG/xff/onlinelog/group_2.287.918834905'. Trying next member. Abort recovery for domain 0 Aborting crash recovery due to error 333 Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_10257.trc: ORA-00333: redo log read error block 1275904 count 5721 Abort recovery for domain 0 Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_10257.trc: ORA-00333: redo log read error block 1275904 count 5721 ORA-333 signalled during: ALTER DATABASE OPEN... 1 现场人员使用隐含参数,尝试直接拉库操作报ORA-00704 ORA-01555错误,导致拉库失败 1 Wed Aug 27 16:47:11 2025 ALTER DATABASE RECOVER database until cancel Media Recovery Start started logmerger process Parallel Media Recovery started with 32 slaves ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ... Wed Aug 27 16:47:56 2025 ALTER DATABASE RECOVER CONTINUE DEFAULT Media Recovery Log +DG Wed Aug 27 16:47:56 2025 Errors with log +DG Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc: ORA-00308: cannot open archived log '+DG' ORA-17503: ksfdopn:2 Failed to open file +DG ORA-15045: ASM file name '+DG' is not in reference form ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ... ALTER DATABASE RECOVER CANCEL Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc: ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '+DG/xff/datafile/system.279.918834827' Slave exiting with ORA-1547 exception Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc: ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '+DG/xff/datafile/system.279.918834827' ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ... Wed Aug 27 16:48:09 2025 alter database open resetlogs RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 25330611827626 Resetting resetlogs activation ID 3307041102 (0xc51d714e) Deleted Oracle managed file +DG/xff/onlinelog/group_1.285.918834899 Deleted Oracle managed file +DG/xff/onlinelog/group_1.286.918834901 Deleted Oracle managed file +DG/xff/onlinelog/group_2.287.918834905 Deleted Oracle managed file +DG/xff/onlinelog/group_2.288.918834911 Wed Aug 27 16:48:28 2025 Deleted Oracle managed file +DG/xff/onlinelog/group_3.289.918834917 Deleted Oracle managed file +DG/xff/onlinelog/group_3.290.918834923 Deleted Oracle managed file +DG/xff/onlinelog/group_4.293.918835035 Deleted Oracle managed file +DG/xff/onlinelog/group_4.294.918835037 Wed Aug 27 16:48:48 2025 Deleted Oracle managed file +DG/xff/onlinelog/group_5.295.918835041 Deleted Oracle managed file +DG/xff/onlinelog/group_5.296.918835047 Deleted Oracle managed file +DG/xff/onlinelog/group_6.297.918835055 Wed Aug 27 16:48:58 2025 Deleted Oracle managed file +DG/xff/onlinelog/group_6.298.918835061 Wed Aug 27 16:49:10 2025 Setting recovery target incarnation to 3 Wed Aug 27 16:49:10 2025 This instance was first to open Picked broadcast on commit scheme to generate SCNs Wed Aug 27 16:49:10 2025 Assigning activation ID 3598492411 (0xd67ca2fb) Thread 2 opened at log sequence 1 Current log# 4 seq# 1 mem# 0: +DG/xff/onlinelog/group_4.294.1210265317 Current log# 4 seq# 1 mem# 1: +DG/xff/onlinelog/group_4.293.1210265323 Successful open of redo thread 2 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Wed Aug 27 16:49:10 2025 SMON: enabling cache recovery Instance recovery: looking for dead threads Instance recovery: lock domain invalid but no dead threads ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x1709.be1eb3b1): select ctime, mtime, stime from obj$ where obj# = :1 Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_23787.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 20 with name "_SYSSMU20_1295954159$" too small Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_23787.trc: ORA-00704: bootstrap process failure ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-01555: snapshot too old: rollback segment number 20 with name "_SYSSMU20_1295954159$" too small Error 704 happened during db open, shutting down database USER (ospid: 23787): terminating the instance due to error 704 Instance terminated by USER, pid = 23787 ORA-1092 signalled during: alter database open resetlogs... opiodr aborting process unknown ospid (23787) as a result of ORA-1092
现场进行了一系列尝试操作,最后我接手数据库之时报错为:ORA-01190 ORA-01110,无法recover,也无法重建controlfile,陷入了死局
Completed: ALTER DATABASE MOUNT Sat Aug 30 10:03:20 2025 ALTER DATABASE OPEN This instance was first to open Abort recovery for domain 0 Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_6435.trc: ORA-01190: control file or data file 1 is from before the last RESETLOGS ORA-01110: data file 1: '+DG/xff/datafile/system0829.dbf' ORA-1190 signalled during: ALTER DATABASE OPEN...
Sat Aug 30 00:56:32 2025 NOTE: Loaded library: System SUCCESS: diskgroup DG was mounted Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_17302.trc: ORA-01189: file is from a different RESETLOGS than previous files ORA-01110: data file 2: '+DG/xff/datafile/sysaux.280.918834827' ORA-1503 signalled during: create controlfile reuse database xff noarchivelog noresetlogs
对于这种情况,通过Oracle recovery check脚本可以直接发现异常(WRONG RESETLOGS)
使用Oracle Recovery Tools小工具实现快速恢复

再尝试重建ctl成功

然后修改数据库scn信息,顺利open数据库

后续建议客户逻辑迁移该库