Controlfile sequence number in file header is different from the one in memory

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Controlfile sequence number in file header is different from the one in memory

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近一段时间遇到多个客户类似如下错误,由于某种原因导致rac两个节点oracle实例一段时间之后就crash,然后重启,系统非常不稳定,严重影响业务使用,一个节点alert日志有类似日志

Sat Oct 22 13:01:59 2022
Instance recovery: looking for dead threads
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
SMON (ospid: 31905): terminating the instance
Sat Oct 22 13:02:03 2022
System state dump requested by (instance=1, osid=31905 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file 
     /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_31863_20221022130203.trc
Sat Oct 22 13:02:04 2022
ORA-1092 : opitsk aborting process
Instance terminated by SMON, pid = 31905

另外的一个节点错误日志如下:

Sat Oct 22 13:02:41 2022
[24610] Successfully onlined Undo Tablespace 5.
Undo initialization finished serial:0 start:152928776 end:152930636 diff:1860 (18 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Sat Oct 22 13:03:10 2022
Error: Controlfile sequence number in file header is different from the one in memory
       Please check that the correct mount options are used if controlfile is located on NFS
USER (ospid: 24610): terminating the instance
Sat Oct 22 13:03:10 2022
System state dump requested by (instance=2, osid=24610), summary=[abnormal instance termination].
System State dumped to trace file 
         /u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_24486_20221022130310.trc
Dumping diagnostic data in directory=[cdmp_20221022130310], 
requested by (instance=2, osid=24610), summary=[abnormal instance termination].
Instance terminated by USER, pid = 24610

通过节点的日志基本上可以确定是由于Controlfile sequence number异常导致,官方也有类似的文档描述:
Instance Crashed With “Controlfile sequence number in file header is different from the onein memory” (Doc ID 2884958.1)

oracle drop tablespace 恢复最后一招

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:oracle drop tablespace 恢复最后一招

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户由于不太熟悉oracle数据库,加入错误的数据文件到一个业务表空间,然后经过一系列操作,最终结果是做了drop tablespace xxx including contents and datafiles操作,导致表空间被删除,而且该数据库未做任何备份和归档.通过检查操作系统和数据库alert日志,确认文件已经从os层面彻底删除
20221022211123


对于这种情况,如果立即保护现场,然后通过反删除软件进行恢复,运气好还可以恢复出来被删除的数据文件,然后再通过dul之类的工具恢复其中数据,这个客户库一直没有关闭,而且尝试各种工具恢复,解决均没有正常恢复出来被删除的几个数据文件.对于这种情况,正常os层面的方法肯定无法恢复了,尝试使用基于block层面技术进行扫描磁盘恢复,结果发现运气还不错,绝大部分block都被找到,参考类似恢复方法:Oracle 数据文件大小为0kb或者文件丢失恢复通过类似分析,找出来绝大多数没有覆盖的block,恢复出来被删除的含数据的file 18,20,21,并通过检测整体恢复效果如下
20221022212822

通过dul工具结合客户提供的表定语以及获取到大表id信息,相互关联,快速恢复客户绝大多数数据,最大限度挽回客户损失.
20221022213506

对于oracle 删除表空间之类的操作,我们可以做到block层面深入恢复,理论上只要你被删除的数据文件在磁盘上还有一个block没有被覆盖,我们都可以把里面的数据恢复出来,最大限度的减少因为这种误操作而引起的损失.如果有类似需求无法自行解决,可以联系我们进行最大限度、最快速度的抢救数据.
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

oracle open hang 等待cursor: pin S wait on X

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:oracle open hang 等待cursor: pin S wait on X

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户19.3数据库无法在open过程hang住
20221018151321


分析alert日志

2022-10-18T15:04:57.374918+08:00
db_recovery_file_dest_size of 102400 MB is 9.58% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
2022-10-18T15:09:55.535116+08:00
ORCLPDB(4):>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=40
ORCLPDB(4):System State dumped to trace file /data/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_p000_31225.trc
2022-10-18T15:19:33.374783+08:00
ORCLPDB(4):Undo initialization recovery: err:1013 start: 1911760 end: 2790176 diff: 878416 ms (878.4 seconds)
Pdb ORCLPDB hit error 1013 during open read write (1) and will be closed.

这里比较明显,cdb本身open正常,但是其中的ORCLPDB这个pdb无法open,从而显示hang的情况.查询数据库会话情况

SQL> select event,sql_id from v$session where wait_class#<>6;

EVENT								 SQL_ID
---------------------------------------------------------------- -------------
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh

EVENT								 SQL_ID
---------------------------------------------------------------- -------------
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
row cache lock							 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
SQL*Net message to client					 1dhc13tspcmys
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh

EVENT								 SQL_ID
---------------------------------------------------------------- -------------
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh
cursor: pin S wait on X 					 8vyjutx6hg3wh

33 rows selected.

SQL> select sql_text from v$sql where sql_id='8vyjutx6hg3wh';

SQL_TEXT
--------------------------------------------------------------------------------
update /*+ rule */ undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undo
sqn=:7,xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1

SQL> col machine for a30
SQL> /

   INST_ID	  SID PADDR	       SQL_ID	     EVENT			    MACHINE	 PROGRAM
---------- ---------- ---------------- ------------- ------------------------------ ------------ ------------------------
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)

   INST_ID	  SID PADDR	       SQL_ID	     EVENT			    MACHINE	 PROGRAM
---------- ---------- ---------------- ------------- ------------------------------ ------------ -----------------------
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 4517 00000001907FEC50 8vyjutx6hg3wh row cache lock		    xifenfei	 oracle@xifenfei (P000)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)

   INST_ID	  SID PADDR	       SQL_ID	     EVENT			    MACHINE	 PROGRAM
---------- ---------- ---------------- ------------- ------------------------------ ------------ -----------------------
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)
	 1	 5645 000000018095AF40 8vyjutx6hg3wh cursor: pin S wait on X	    xifenfei	 oracle@xifenfei (P002)

31 rows selected.

SQL> l
  1  select b.INST_ID,b.sid,b.paddr,b.sql_id,b.event,b.MACHINE,b.PROGRAM from gv$session a,gv$session b
  2* where a.event='cursor: pin S wait on X'  and a.FINAL_BLOCKING_INSTANCE=b.INST_ID and
  3* a.FINAL_BLOCKING_SESSION=b.sid
SQL> 

通过上述分析,可以确认是在open pdb的过程中被cursor: pin S wait on X等待事件阻塞,而被阻塞的sql是update /*+ rule */ undo$ set …………,基于这样的情况.大概率可以确认是由于bug导致.通过查询mos,确认和Bug 30931981 – Open Reset Logs Hangs With ‘row cache lock’ and ‘cursor: pin s wait for x’ Waits (Doc ID 30931981.8)类似.
20221018200418


不过由于客户的版本是19.3,没有对应的小patch发布.通过对相关恢复事务和恢复方式进行处理,在没有对数据库版本进行任何调整的情况下,顺利打开数据库以最快的速度恢复业务
20221018200716

ORA-600 ktubko_1 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 ktubko_1 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

oracle 12.2的rac库,pdb在open成功之后,没过多久会自动crash掉,主要报错ORA-600 ktubko_1

2022-10-08T16:00:17.874444+08:00
XFF(5):Endian type of dictionary set to little
2022-10-08T16:00:18.602483+08:00
XFF(5):[218515] Successfully onlined Undo Tablespace 26.
XFF(5):Undo initialization finished serial:0 start:73483625 end:73484200 diff:575 ms (0.6 seconds)
XFF(5):Database Characterset for XFF is ZHS16GBK
2022-10-08T16:00:19.340271+08:00
Buffer Cache Full DB Caching mode changing from FULL CACHING ENABLED to FULL CACHING DISABLED 
Full DB Caching disabled: DEFAULT_CACHE_SIZE should be at least 1394670 MBs bigger than current size. 
2022-10-08T16:00:21.308122+08:00
XFF(5):Opening pdb with no Resource Manager plan active
2022-10-08T16:00:22.655433+08:00
Pluggable database XFF opened read write
Completed:  ALTER PLUGGABLE DATABASE ALL OPEN
2022-10-08T16:00:36.419719+08:00
XFF(5):Setting Resource Manager plan SCHEDULER[0x4AC8]:DEFAULT_MAINTENANCE_PLAN via scheduler window
XFF(5):Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
2022-10-08T16:00:57.054295+08:00
XFF(5):minact-scn: got error during useg scan e:1555 usn:57
XFF(5):minact-scn: useg scan erroring out with error e:1555
2022-10-08T16:01:41.527943+08:00
Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc  (incident=737693) (PDBNAME=XFF):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
XFF(5):Incident details in: /u01/app/db/diag/rdbms/orcl/orcl1/incident/incdir_737693/orcl1_smon_218039_i737693.trc
XFF(5):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2022-10-08T16:01:41.530481+08:00
XFF(5):*****************************************************************
XFF(5):An internal routine has requested a dump of selected redo.
XFF(5):This usually happens following a specific internal error, when
XFF(5):analysis of the redo logs will help Oracle Support with the
XFF(5):diagnosis.
XFF(5):It is recommended that you retain all the redo logs generated (by
XFF(5):all the instances) during the past 12 hours, in case additional
XFF(5):redo dumps are required to help with the diagnosis.
XFF(5):*****************************************************************
2022-10-08T16:01:42.611317+08:00
XFF(5):*****************************************************************
XFF(5):An internal routine has requested a dump of selected redo.
XFF(5):This usually happens following a specific internal error, when
XFF(5):analysis of the redo logs will help Oracle Support with the
XFF(5):diagnosis.
XFF(5):It is recommended that you retain all the redo logs generated (by
XFF(5):all the instances) during the past 12 hours, in case additional
XFF(5):redo dumps are required to help with the diagnosis.
XFF(5):*****************************************************************
XFF(5):ORACLE Instance orcl1 (pid = 44) - Error 600 encountered while recovering transaction (12, 1) on object 50.
2022-10-08T16:01:42.611961+08:00
XFF(5):Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc:
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
2022-10-08T16:01:42.849438+08:00
Errors in file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_smon_218039.trc  (incident=737694) (PDBNAME=XFF):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
XFF(5):Incident details in: /u01/app/db/diag/rdbms/orcl/orcl1/incident/incdir_737694/orcl1_smon_218039_i737694.trc
…………
2022-10-08T16:01:55.212368+08:00
Instance Critical Process (pid: 44, ospid: 218039, SMON) died unexpectedly
PMON (ospid: 217933): terminating the instance due to error 474
2022-10-08T16:01:55.379857+08:00
System state dump requested by (instance=1, osid=217933 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/db/diag/rdbms/orcl/orcl1/trace/orcl1_diag_217966_20221008160155.trc
2022-10-08T16:01:56.417514+08:00
ORA-1092 : opitsk aborting process

因为有smon报的ORACLE Instance orcl1 (pid = 44) – Error 600 encountered while recovering transaction (12, 1) on object xxx这种比较明显错误,基本上可以定位是undo问题.对undo异常事务进行处理,数据库顺利open,并且稳定不再crash,然后对异常对象进行处理(当然也可以逻辑迁移)
20221008220645


在oracle 12.2到18.14的rac环境的cdb库中,如果节点sga大小不一致,而且有一个节点sga大于128G,就可能出现该问题,敬请注意
20221008220914

Bug 32347014: ORA-600[4506], ORA-600[KTUBKO_1] OCCUR AND INSTANCE CRASHES

存储强制拉lun导致数据库异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储强制拉lun导致数据库异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户那边反馈有几个lun异常,无法正常online
20221004165647


通过存储工程师强制拉起来lun(清除掉了cache),但是数据库无法正常mount

Fri Sep 30 17:22:57 BEIST 2022
ALTER DATABASE   MOUNT
Fri Sep 30 17:22:57 BEIST 2022
This instance was first to mount
Fri Sep 30 17:22:58 BEIST 2022
Starting background process ASMB
ASMB started with pid=25, OS id=12976304
Starting background process RBAL
RBAL started with pid=26, OS id=12779520
Fri Sep 30 17:23:02 BEIST 2022
SUCCESS: diskgroup DATA was mounted
Fri Sep 30 17:23:06 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/udump/xifenfei2_ora_14549110.trc:
ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [423012], [422765], [0x000000000], [], [], [], []
Fri Sep 30 17:23:07 BEIST 2022
ORA-600 signalled during: ALTER DATABASE   MOUNT...
Fri Sep 30 17:23:07 BEIST 2022
Trace dumping is performing id=[cdmp_20220930172307]
Fri Sep 30 17:23:09 BEIST 2022
Shutting down instance (abort)
License high water mark = 1
Instance terminated by USER, pid = 9175148

可以要求保护第一现场,把asm中的数据恢复到文件系统中,然后进行恢复,由于客户是10g的环境,无法直接使用asmcmd中的cp实现此项操作,数据库也没有mount成功(无法使用rman的copy),考虑使用oracle的amdu实现此项操作需求.在拷贝过程中报AMDU-00204报错

root@xifenfei2:/recover/amduo#./amdu -diskstring '/dev/rhdiskpower*' -extract data.298 -noreport
amdu_2022_10_01_14_25_31/
AMDU-00204: file not found; arguments: [3] [DATA]
LEM-00031: Error encountered in lempgmh after calling lmserr.

通过dbv校验拷贝出来的数据文件

racle@xifenfei2:/recover#dbv file=/recover/amduo/amdu_2022_10_01_14_25_31/DATA_298.f

DBVERIFY: Release 10.2.0.5.0 - Production on Sat Oct 1 14:36:50 2022

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = /recover/amduo/amdu_2022_10_01_14_33_26/DATA_298.f


DBVERIFY - Verification complete

Total Pages Examined         : 262144
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 262143
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 121293100 (0.121293100)

确认此报错(AMDU-00204 LEM-00031)对于拷贝出来的数据文件无直接影响,可以忽略,拷贝出来所有文件进行重建ctl,报ORA-01159错误

SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG  
  2      MAXLOGFILES 50  
    MAXLOGMEMBERS 5  
    MAXDATAFILES 100  
    MAXINSTANCES 8  
    MAXLOGHISTORY 226  
LOGFILE  
group 5 '/recover/df/DATA_262.f'  size 200M ,
group 2 '/recover/df/DATA_266.f'  size 200M ,
group 1 '/recover/df/DATA_267.f'  size 200M ,
group 3 '/recover/df/DATA_281.f'  size 200M ,
group 4 '/recover/df/DATA_282.f'  size 200M ,
group 6 '/recover/df/DATA_283.f'  size 200M
DATAFILE  
'/recover/df/DATA_295.f',
'/recover/df/DATA_298.f',
'/recover/df/DATA_272.f',
'/recover/df/DATA_273.f',
'/recover/df/DATA_296.f',
'/recover/df/DATA_274.f',
'/recover/df/DATA_276.f',
'/recover/df/DATA_277.f',
'/recover/df/DATA_275.f',
'/recover/df/DATA_279.f',
'/recover/df/DATA_278.f',
'/recover/df/DATA_288.f',
'/recover/df/DATA_269.f',
'/recover/df/DATA_300.f',
'/recover/df/DATA_264.f',
'/recover/df/DATA_287.f',
'/recover/df/DATA_280.f',
'/recover/df/DATA_286.f',
'/recover/df/DATA_268.f',
'/recover/df/DATA_285.f',
'/recover/df/DATA_297.f'
CHARACTER SET UTF8  
;  
 37  CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01159: file is not from same database as previous files - wrong database id
ORA-01110: data file 3: '/recover/df/DATA_288.f'

由于部分文件不是该库的,通过进一步分析,除掉不是该库的文件,重建ctl文件成功.尝试recover数据库,报大量ORA-07445错误,由于cache丢失redo损坏导致,此类操作可能导致数据文件损坏【恢复需要谨慎,最好对数据文件做一次备份】

Sat Oct 01 16:23:16 BEIST 2022
ALTER DATABASE RECOVER  database  
Media Recovery Start
 parallel recovery started with 15 processes
Sat Oct 01 16:23:16 BEIST 2022
Recovery of Online Redo Log: Thread 1 Group 5 Seq 2202 Reading mem 0
  Mem# 0: /recover/df/DATA_262.f
Sat Oct 01 16:23:16 BEIST 2022
Recovery of Online Redo Log: Thread 2 Group 6 Seq 2394 Reading mem 0
  Mem# 0: /recover/df/DATA_283.f
Sat Oct 01 16:23:28 BEIST 2022
Recovery of Online Redo Log: Thread 2 Group 3 Seq 2395 Reading mem 0
  Mem# 0: /recover/df/DATA_281.f
Sat Oct 01 16:23:34 BEIST 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2203 Reading mem 0
  Mem# 0: /recover/df/DATA_267.f
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p009_13107402.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p014_7929944.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV]
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p011_10092678.trc:
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw1_12189898.trc:
ORA-07445: exception encountered: core dump [kcbzdh+0324] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] 
Sat Oct 01 16:23:35 BEIST 2022
Hex dump of (file 6, block 10) in trace file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc
Corrupt block relative dba: 0x0180000a (file 6, block 10)
Bad header found during buffer corrupt after write
Data in bad block:
 type: 2 format: 1 rdba: 0x00000180
 last change scn: 0xa0c3.000a6eeb seq: 0x0 flg: 0x00
 spare1: 0x2 spare2: 0xa2 spare3: 0x3486
 consistency value in tail: 0x0000a0c3
 check value in block header: 0x204
 block checksum disabled
Reread of rdba: 0x0180000a (file 6, block 10) found different data
Sat Oct 01 16:23:35 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc:
ORA-07445: exception encountered: core dump [kcbbiop+01b8] [SIGSEGV] [Invalid permissions for mapped object] 
Sat Oct 01 16:23:36 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p014_7929944.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:36 BEIST 2022
Trace dumping is performing id=[cdmp_20221001162336]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p011_10092678.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p009_13107402.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzfc+00dc] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw1_12189898.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbzdh+0324] [SIGSEGV] [Address not mapped to object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_dbw0_13565988.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbiop+01b8] [SIGSEGV] [Invalid permissions for mapped object]
Sat Oct 01 16:23:37 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p005_14549014.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbbufaddr2hdr+00d8] [SIGSEGV] [Address not mapped to object] 
Sat Oct 01 16:23:38 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:39 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:39 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_pmon_14418166.trc:
ORA-00471: DBWR process terminated with error
Sat Oct 01 16:23:39 BEIST 2022
PMON: terminating instance due to error 471
Sat Oct 01 16:23:40 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/bdump/xifenfei2_p010_9502918.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state+027c] [SIGSEGV] 
ORA-00600: internal error code, arguments: [3020], [3], [31913], [2], [2395], [210418], [16], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 31913)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/recover/df/DATA_278.f'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Oct 01 16:23:46 BEIST 2022
Dump system state for local instance only
System State dumped to trace file /home/oracle/admin/xifenfei/bdump/xifenfei2_diag_15401212.trc
Sat Oct 01 16:23:46 BEIST 2022
Trace dumping is performing id=[cdmp_20221001162346]
Sat Oct 01 16:23:49 BEIST 2022
Instance terminated by PMON, pid = 14418166

绕过redo,直接强制启动库,报ORA-01092错误

QL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area      10737418240 bytes
Fixed Size                        2114208 bytes
Variable Size                  1560284512 bytes
Database Buffers               9160359936 bytes
Redo Buffers                     14659584 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

分析alert日志,确认是由于undo异常导致

Additional information: 3
Sat Oct 01 17:25:21 BEIST 2022
Setting recovery target incarnation to 2
Sat Oct 01 17:25:21 BEIST 2022
Assigning activation ID 1094862311 (0x414245e7)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /recover/df/DATA_267.f1
Successful open of redo thread 1
Sat Oct 01 17:25:21 BEIST 2022
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Oct 01 17:25:21 BEIST 2022
SMON: enabling cache recovery
Sat Oct 01 17:25:23 BEIST 2022
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0000.6ee0bde5):
Sat Oct 01 17:25:23 BEIST 2022
select ctime, mtime, stime from obj$ where obj# = :1
Sat Oct 01 17:25:23 BEIST 2022
Errors in file /home/oracle/admin/xifenfei/udump/xifenfei2_ora_19726450.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 8 with name "_SYSSMU8$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 19726450
ORA-1092 signalled during: alter database open resetlogs...

类似此类错误的解决方案,以前写过参考:
在数据库open过程中常遇到ORA-01555汇总
数据库open过程遭遇ORA-1555对应sql语句补充
Oracle Recovery Tools恢复—ORA-00704 ORA-01555故障
使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
解决该问题,数据库启动正常,逻辑导出数据,导入数据完成此次恢复任务,实现绝大部分数据恢复