先offline数据文件,再resetlogs导致恢复复杂的故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:先offline数据文件,再resetlogs导致恢复复杂的故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

本来是一个简单的数据文件被误删除,然后通过底层恢复出来数据文件,再启动库就可以的事情,结果由于对oracle的不了解和自以为是,直接把丢失的文件不存在的情况下,offline文件,然后尝试resetlogs打开库,并且进行了各种尝试,结果使得问题比较麻烦.
故障之后现象
通过分析alert日志大概的主要错误,大概梳理故障情况

1. 启动数据库报control03.ctl丢失

Fri Apr 17 21:53:03 2026
MMNL started with pid=16, OS id=3613 
ORACLE_BASE from environment = /data/oracle
Fri Apr 17 21:53:08 2026
alter database mount
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/data/oracle/oradata/orcl/control03.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/data/oracle/oradata/orcl/control02.ctl'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-205 signalled during: alter database mount...

如果只是这个文件丢失(这里还没有看到其他数据文件丢失的报错),本身是一个非常简单的故障,直接修改control_files参数即可

2. 结果当时操作的人直接rectl

Fri Apr 17 21:57:01 2026
Successful mount of redo thread 1, with mount id 1758675116
Completed: CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/data/oracle/oradata/orcl/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/data/oracle/oradata/orcl/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/data/oracle/oradata/orcl/redo03.log'  SIZE 50M BLOCKSIZE 512
DATAFILE
  '/data/oracle/oradata/orcl/system01.dbf',
  '/data/oracle/oradata/orcl/sysaux01.dbf',
  '/data/oracle/oradata/orcl/undotbs01.dbf',
  '/data/oracle/oradata/orcl/users01.dbf'
CHARACTER SET ZHS16GBK

3.然后启动数据库报错

Fri Apr 17 22:02:43 2026
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Started redo scan
Completed redo scan
 read 39020 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 11590, block 2, scn 137806010
Recovery of Online Redo Log: Thread 1 Group 1 Seq 11590 Reading mem 0
  Mem# 0: /data/oracle/oradata/orcl/redo01.log
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 11590, block 78042, scn 137831847
 0 data blocks read, 0 data blocks written, 39020 redo k-bytes read
Fri Apr 17 22:02:44 2026
Thread 1 advanced to log sequence 11591 (thread open)
Thread 1 opened at log sequence 11591
  Current log# 2 seq# 11591 mem# 0: /data/oracle/oradata/orcl/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Apr 17 22:02:44 2026
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Dictionary check beginning
Tablespace 'TEMP' #3  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ERP_XXXX' #6  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ERP_AAAA' #7  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ABCD' #8  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ERP_BBBB' #9  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ERP_XXD' #10  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'ERP_12SF' #11  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'XXX14' #12  found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'P_ZY' #13 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #5 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00005' in the controlfile.
File #6 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00006' in the controlfile.
File #7 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00007' in the controlfile.
File #8 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00008' in the controlfile.
File #9 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00009' in the controlfile.
File #10  found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00010' in the controlfile.
File #11  found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00011' in the controlfile.
File #12 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00012' in the controlfile.

4.然后尝试resetlogs操作

Sat Apr 18 05:55:10 2026
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1758652862
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Sat Apr 18 05:55:14 2026
ALTER DATABASE OPEN RESETLOGS
ORA-1139 signalled during: ALTER DATABASE OPEN RESETLOGS...
Sat Apr 18 05:56:29 2026
Starting ORACLE instance (normal)
ALTER DATABASE RECOVER  DATABASE UNTIL CANCEL  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 4 slaves
Sat Apr 18 05:56:29 2026
Warning: Datafile 5 (/data/oracle/orcl/xxxx.dbf) is offline during full database recovery and will not be recovered
Warning: Datafile 6 (/data/oracle/orcl/xxxx.dbf) is offline during full database recovery and will not be recovered
Warning: Datafile 7 (/data/oracle/orcl/xxxx.dbf) is offline during full database recovery and will not be recovered
Warning: Datafile 8 (/data/oracle/orcl/xxxx.dbf) is offline during full database recovery and will not be recovered
Media Recovery Not Required
Completed: ALTER DATABASE RECOVER  DATABASE UNTIL CANCEL  
Sat Apr 18 05:57:45 2026
ALTER DATABASE OPEN RESETLOGS
RESETLOGS after complete recovery through change 137865786
Resetting resetlogs activation ID 1645665187 (0x6216dba3)
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2549.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 1 of thread 1 is not current copy
ORA-00312: online log 1 thread 1: '/data/oracle/oradata/orcl/redo01.log'
Sat Apr 18 05:57:45 2026
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_2554.trc:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/data/oracle/oradata/orcl/redo01.log'
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2549.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 2 of thread 1 is not current copy
ORA-00312: online log 2 thread 1: '/data/oracle/oradata/orcl/redo02.log'
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_2554.trc:
ORA-00316: log 2 of thread 1, type 0 in header is not log file
ORA-00312: online log 2 thread 1: '/data/oracle/oradata/orcl/redo02.log'
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_2554.trc:
ORA-00322: log 3 of thread 1 is not current copy
ORA-00312: online log 3 thread 1: '/data/oracle/oradata/orcl/redo03.log'
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2549.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 3 of thread 1 is not current copy
ORA-00312: online log 3 thread 1: '/data/oracle/oradata/orcl/redo03.log'
Sat Apr 18 05:57:46 2026
Setting recovery target incarnation to 2
Sat Apr 18 05:57:46 2026
Assigning activation ID 1758652862 (0x68d2e9be)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /data/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Apr 18 05:57:46 2026
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Dictionary check beginning
File #5 is offline, but is part of an online tablespace.
data file 5: '/data/oracle/oradata/orcl/xxxx.dbf'
File #6 is offline, but is part of an online tablespace.
data file 6: '/data/oracle/oradata/orcl/xxxx.dbf'
File #7 is offline, but is part of an online tablespace.
data file 7: '/data/oracle/oradata/orcl/xxxx.dbf'
File #8 is offline, but is part of an online tablespace.
data file 8: '/data/oracle/oradata/orcl/xxxx.dbf'

到这一步悲剧基本上已经发生,犯了一个在oracle恢复里面比较忌讳的事情,有数据文件offline的情况下,执行resetlogs操作,导致部分数据文件的resetlogs信息没有被及时更新,导致一套库里面,被offline的这个部分数据文件resetlogs信息小于其他online的数据文件的。

5. 后续其他操作各种报错

Completed: ALTER DATABASE   MOUNT
Sun Apr 19 08:13:02 2026
ALTER DATABASE DATAFILE 5 OFFLINE DROP
Sun Apr 19 08:13:02 2026
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbw0_9212.trc  (incident=67094):
ORA-00600: internal error code, arguments: [3600], [5], [14], [], [], [], [], [], [], [], [], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_67094/orcl_dbw0_9212_i67094.trc
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbw0_9212.trc:
ORA-00600: internal error code, arguments: [3600], [5], [14], [], [], [], [], [], [], [], [], []
DBW0 (ospid: 9212): terminating the instance due to error 471
Tue Apr 21 22:31:23 2026
Assigning activation ID 1758985759 (0x68d7fe1f)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /data/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Apr 21 22:31:23 2026
SMON: enabling cache recovery
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4951.trc  (incident=87950):
ORA-00600: internal error code, arguments: [2662], [0], [137890858], [0], [137891091], [12583056], []
Incident details in: /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_87950/orcl_ora_4951_i87950.trc
Errors in file /data/oracle/diag/rdbms/orcl/orcl/incident/incdir_87950/orcl_ora_4951_i87950.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo02.log'
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo02.log'
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/data/oracle/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [2662], [0], [137890858], [0], [137891091], [12583056], []
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4951.trc:
ORA-00600: internal error code, arguments: [2662], [0], [137890858], [0], [137891091], [12583056], []
Errors in file /data/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4951.trc:
ORA-00600: internal error code, arguments: [2662], [0], [137890858], [0], [137891091], [12583056], []
Error 600 happened during db open, shutting down database
USER (ospid: 4951): terminating the instance due to error 600

接手故障之后分析
使用obet工具直接快速的检查坏块情况和文件头信息,关于obet的介绍参考:
obet实现对数据文件坏块检测功能
Oracle数据块编辑工具( Oracle Block Editor Tool)-obet
dbv


dbv检测没任何坏块,比较好好的消息
reset

但是检测数据文件头信息,发现有三种类型的resetlogs的信息,证明进行了多次部分文件的情况下进行了resetlogs操作

恢复处理
1. 使用Oracle Recovery Tools工具修改 resetlogs 信息
由于大量reseltogs 信息不一致,先使用Oracle Recovery Tools修改scn等相关信息Oracle Recovery Tools恢复案例总结—202505(注意选择resetlogs scn最大的文件为参照文件)
orarec

2. 重建ctl,打开库
open-db

比较幸运直接打开成功(本来也就应该成功,因为客户本身之前丢失主要业务文件的时候多次打开过库)


3. 然后增加temp文件,并expdp导出数据,完成本次恢复工作