记录一次oracle现场故障处理经过

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:记录一次oracle现场故障处理经过

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

近期到现场进行了一个数据库恢复,我在恢复之前该库先由于硬件进行恢复,然后由其他人对其进行了一系列数据库恢复,但是未恢复成功,客户希望我们到现场进行处理(因为网络原因无法远程).接手库之后,处理第一个问题,是客户在进行现场备份的时候(把linux数据拷贝到win的过程中)发现有几个文件拷贝异常,这个错误很可能是由于当初的硬件故障修复之后留下的后遗症(由于io设备错误,无法运行此项请求),通过工具进行拷贝,恢复出来
20210403210131


DUL> copy file from  /oradata2/xifenfeidata.dbf to /oradata2/xifenfeidata.dbf

starting copy datafile '/oradata1/xifenfeidata.dbf' to '/oradata2/xifenfeidata.dbf'
read data error from file '/oradata1/xifenfeidata.dbf'.error message:Input/output error
read block# error: 560171
read data error from file '/oradata1/xifenfeidata.dbf'.error message:Input/output error
read block# error: 560179
datafile copy completed with 2 block error.
[oracle@localhost ~]$ dbv file=/oradata2/xifenfeidata.dbf blocksize=16384

DBVERIFY: Release 11.2.0.3.0 - Production on Mon Mar 29 17:28:17 2021

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata2/xifenfeidata.dbf
Page 560171 is marked corrupt
Corrupt block relative dba: 0x3bc88c2b (file 239, block 560171)
Completely zero block found during dbv: 

Page 560179 is marked corrupt
Corrupt block relative dba: 0x3bc88c33 (file 239, block 560179)
Completely zero block found during dbv: 



DBVERIFY - Verification complete

Total Pages Examined         : 4194302
Total Pages Processed (Data) : 2230726
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 1936953
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 26618
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 3
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 304929867 (106.304929867)

修复完相关无法拷贝文件之后,启动数据库报控制文件异常

Mon Mar 29 15:03:38 2021
alter database mount
USER (ospid: 29044): terminating the instance
Mon Mar 29 15:03:42 2021
System state dump requested by (instance=1, osid=29044), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff/trace/xff_diag_28961.trc
Instance terminated by USER, pid = 29044

尝试重建ctl

[oracle@localhost ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Mar 29 17:40:17 2021

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1.7704E+10 bytes
Fixed Size                  2235568 bytes
Variable Size            2348811088 bytes
Database Buffers         1.5301E+10 bytes
Redo Buffers               52580352 bytes
SQL> @/tmp/ctl.sql
CREATE CONTROLFILE REUSE DATABASE xff NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 249: '/oradata/xff/system03.dbf'

初步判断是由于对方之前恢复导致部分文件resetlogs scn异常,通过bbed进行判断确认

BBED> set file 1
        FILE#           1

BBED> p kcvfhrls
struct kcvfhrls, 8 bytes                    @116     
   ub4 kscnbas                              @116      0x00000001
   ub2 kscnwrp                              @120      0x0000

BBED> set file 249
        FILE#           249

BBED> p kcvfhrls
struct kcvfhrls, 8 bytes                    @116     
   ub4 kscnbas                              @116      0x00000001
   ub2 kscnwrp                              @120      0x0000

通过bbed修改相关值,然后重建控制文件成功,尝试resetlogs库,报ORA-01248错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 234 was created in the future of incomplete recovery
ORA-01110: data file 234: '/oradata1/xifenfeidata5.DBF'

关于ORA-01248的错误解释

01248, 00000, "file %s was created in the future of incomplete recovery"
// *Cause:  Attempting to do a RESETLOGS open with a file entry in the
//          control file that was originally created after the UNTIL time 
//          of the incomplete recovery.
//          Allowing such an entry may hide the version of the file that 
//          is needed at this time.  The file number may be in use for 
//          a different file which would be lost if the RESETLOGS was allowed.
// *Action: If more recovery is desired then apply redo until the creation
//          time of the file is reached. If the file is not wanted and the
//          same file number is not in use at the stop time of the recovery,
//          then the file can be taken offline with the FOR DROP option.
//          Otherwise a different control file is needed to allow the RESETLOGS.
//          Another backup can be restored and recovered, or a control file can
//          be created via CREATE CONTROLFILE.

大概的意思是文件的创建时间大于文件当前的scn,通过查询确实如此

SQL> select file#,CREATION_CHANGE#,CREATION_TIME from v$datafile_header where file#=234;

           FILE# CREATION_CHANGE# CREATION_
---------------- ---------------- ---------
             234     419298664864 02-AUG-19

SQL> SELECT status,  
  2  to_char(checkpoint_change#,'9999999999999999') "SCN",
  3  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,
  4  count(*) ROW_NUM
  5  FROM v$datafile_header
  6  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  SCN               CHECKPOINT_TIME     FUZ          ROW_NUM
------- ----------------- ------------------- --- ----------------
ONLINE       417750848223 2021-02-23 23:50:46 YES                7
ONLINE       417750848223 2021-03-21 11:44:25 NO               396

通过对部分scn进行修改(比如减小创建时间的scn),然后尝试resetlogs库

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 5 with name
"_SYSSMU5_2708889888$" too small
Process ID: 3182
Session ID: 1 Serial number: 3

这个错误比较简单,参考以前的部分文章:在数据库open过程中常遇到ORA-01555汇总数据库open过程遭遇ORA-1555对应sql语句补充,处理之后,数据库open成功

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1.7704E+10 bytes
Fixed Size                  2235568 bytes
Variable Size            2348811088 bytes
Database Buffers         1.5301E+10 bytes
Redo Buffers               52580352 bytes
Database mounted.
SQL> alter database open;

Database altered.

本次数据库恢复基本上完成,已经最大限度恢复数据,导出数据到新库,完成恢复任务

ORA-600 16703直接把orachk备份表插入到tab$恢复

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 16703直接把orachk备份表插入到tab$恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有一个朋友和我说,他们数据库出现了以下错误ORA-600 16703 错误
20210324195416


他们是在虚拟化环境中,可以恢复到上一个快照点,但是主机启动之后,数据库依旧异常,让我们进行处理

C:\Users\Administrator>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Mar 24 17:04:01 2021

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ WRITE

SQL> select count(1) from tab$;

  COUNT(1)
----------
         0

很明显tab$已经被清空,数据库无法正常使用.因为库没有crash,尝试把备份的orachk表插入进来

SQL> insert into tab$ select * from ORACHKB514061BDCB10EBA9CF58F3;

6318 rows created.

SQL> commit;

Commit complete.

SQL> select 'DROP TRIGGER '||owner||'."'||TRIGGER_NAME||'";' from dba_triggers w
here TRIGGER_NAME like 'DBMS_%_INTERNAL% '
  2  union all
  3  select 'DROP PROCEDURE '||owner||'."'||a.object_name||'";' from dba_procedu
res a where a.object_name like 'DBMS_%_INTERNAL% '
  4  union all
  5  select 'drop '||object_type||' '||owner||'.'||object_name||';' from dba_obj
ects where object_name in('DBMS_SUPPORT_DBMONITOR','DBMS_SUPPORT_DBMONITORP');

'DROPTRIGGER'||OWNER||'."'||TRIGGER_NAME||'";'
--------------------------------------------------------------------------------

drop PROCEDURE SYS.DBMS_SUPPORT_DBMONITORP;
drop TRIGGER SYS.DBMS_SUPPORT_DBMONITOR;

SQL> drop PROCEDURE SYS.DBMS_SUPPORT_DBMONITORP;

Procedure dropped.

SQL> drop TRIGGER SYS.DBMS_SUPPORT_DBMONITOR;

Trigger dropped.

SQL> commit;

Commit complete.

SQL>

重启数据库,该故障恢复完成,数据完美恢复0丢失.

硬件恢复之后,数据库无法open故障恢复

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:硬件恢复之后,数据库无法open故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于硬件故障,客户恢复硬件之后,数据库无法正常启动,报ORA-00354 ORA-00353错误

/tmp/> sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Mar 1 17:10:30 2021

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover database;

ORA-00283: recovery session canceled due to errors
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 86088 change 16135545783340 time
02/23/2021 13:53:24
ORA-00312: online log 2 thread 1: '/oradata02/redo02b.log'
ORA-00312: online log 2 thread 1: '/oradata01/redo02a.log'

由于redo损坏,数据库无法继续正常恢复,通过屏蔽一致性,force open库

SQL> alter database open resetlogs;
alter database open resetlogs 
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [3756], [3571444619], [3756], [3648471803], [4194545]
Process ID: 5104
Session ID: 576 Serial number: 3

这个错误比较简单,是由于scn问题导致,修改数据库scn启动库

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], []
Process ID: 5536
Session ID: 576 Serial number: 1

这个错误比较明显,修改回滚段,尝试启动库

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 6033
Session ID: 576 Serial number: 3

数据库依旧无法正常open,alert日志报错如下

ARC3 started with pid=30, OS id=6078 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Exception[type:SIGSEGV Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x0,count:1]
Exception[type:SIGSEGV,Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x2,count:2]
Exception[type:SIGSEGV,Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x2,count:2]
Archived Log entry 2 added for thread 1 sequence 2 ID 0x506cafbb dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Mar 01 17:44:44 2021
PMON (ospid: 5993): terminating the instance due to error 397
Mon Mar 01 17:44:45 2021
System state dump requested by (instance=1, osid=5993 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/diag/rdbms/xff/xff/trace/xff_diag_6001.trc
Instance terminated by PMON, pid = 5993

通过其启动过程分析,发现数据库卡在如下对象:

PARSING IN CURSOR #11529215044940435280 len=148 dep=1 uid=0 oct=6 lid=0 tim=223080942765 
hv=3540833987 ad='c000000d67a42778' sqlid='5ansr7r9htpq3'
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
END OF STMT
PARSE #11529215044940435280:c=10000,e=8182,p=6,cr=55,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=223080942764
BINDS #11529215044940435280:
 Bind#0
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=c000000d5fd299aa  bln=32  avl=20  flg=09
  value="_SYSSMU1_3935275865$"
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e078  bln=24  avl=02  flg=05
  value=3
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e048  bln=24  avl=03  flg=05
  value=128
 Bind#3
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e010  bln=24  avl=02  flg=05
  value=5
 Bind#4
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dfe0  bln=24  avl=02  flg=05
  value=1
 Bind#5
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dfb0  bln=24  avl=04  flg=05
  value=28921
 Bind#6
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6df80  bln=24  avl=05  flg=05
  value=1245262
 Bind#7
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6df48  bln=24  avl=06  flg=05
  value=1217986655
 Bind#8
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc90  bln=24  avl=03  flg=05
  value=3621
 Bind#9
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc60  bln=24  avl=01  flg=05
  value=0
 Bind#10
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc30  bln=24  avl=02  flg=05
  value=2
 Bind#11
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc00  bln=24  avl=02  flg=05
  value=2
 Bind#12
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e0a8  bln=22  avl=02  flg=05
  value=1
WAIT #11529215044940435280: nam='db file sequential read' ela= 21 file#=1 block#=530 blocks=1 obj#=0 tim=223080944352
Incident 528204 created, dump file: /oracle/diag/rdbms/xff/xff/incident/incdir_528204/xff_ora_6593_i528204.trc
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []

至此基本上可以确认是由于出现回滚段异常,继续查看日志发现

Error 600 in redo application callback
Dump of change vector:
TYP:0 CLS:16 AFN:1 DBA:0x00400212 OBJ:4294967295 SCN:0x0ea6.f4f2da14 SEQ:1 OP:5.1 ENC:0 RBL:0
ktudb redo: siz: 320 spc: 5892 flg: 0x0012 seq: 0x0072 rec: 0x08
            xid:  0x0000.004.000000bc  
ktubl redo: slt: 4 rci: 0 opc: 11.1 [objn: 15 objd: 15 tsn: 0]
Undo type:  Regular undo        Begin trans    Last buffer split:  No 
Temp Object:  No 
Tablespace Undo:  No 
             0x00000000  prev ctl uba: 0x00400212.0072.07 
prev ctl max cmt scn:  0x0eac.d42963be  prev tx cmt scn:  0x0eac.d4296f48 
txn start scn:  0xffff.ffffffff  logon user: 0  prev brb: 4194446  prev bcl: 0 BuExt idx: 0 flg2: 0
KDO undo record:
KTB Redo 
op: 0x04  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0000.060.000000bb uba: 0x00400212.0072.04
                      flg: C---    lkc:  0     scn: 0x0eac.d9736b46
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x004000e1  hdba: 0x004000e0
itli: 4  ispac: 0  maxfr: 4863
tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [20]  5f 53 59 53 53 4d 55 31 5f 33 39 33 35 32 37 35 38 36 35 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 04
col  4: [ 3]  c2 02 1d
col  5: [ 6]  c5 0d 12 63 43 38
col  6: [ 3]  c2 25 16
col  7: [ 5]  c4 02 19 35 3f
col  8: [ 4]  c3 03 5a 16
col  9: [ 1]  80
col 10: [ 2]  c1 04
col 11: [ 2]  c1 03
col 16: [ 2]  c1 03
Block after image is corrupt: 
buffer tsn: 0 rdba: 0x00400212 (1/530)
scn: 0x0ea6.f4f2da14 seq: 0x01 flg: 0x04 tail: 0xda140201
frmt: 0x02 chkval: 0x9dd8 type: 0x02=KTU UNDO BLOCK

使用bbed对file 1 block 530进行处理

   struct ktuxcscn, 8 bytes                 @4148    
      ub4 kscnbas                           @4148     0xd42963be
      ub2 kscnwrp                           @4152     0x0eac
   struct ktuxcuba, 8 bytes                 @4156    
      ub4 kubadba                           @4156     0x00400212
      ub2 kubaseq                           @4160     0x0072
      ub1 kubarec                           @4162     0x07
   sb2 ktuxcflg                             @4164     1 (KTUXCFSK)
   ub2 ktuxcseq                             @4166     0x0072
   sb2 ktuxcnfb                             @4168     1
   ub4 ktuxcinc                             @4172     0x00000000
   sb2 ktuxcchd                             @4176     4
   sb2 ktuxcctl                             @4178     3
   ub2 ktuxcmgc                             @4180     0x8002
   ub4 ktuxcopt                             @4188     0x7ffffffe

数据库顺利open成功
20210301210425


后续建议客户逻辑导出数据,导入到新库

ORA-00742 ORA-00312 故障恢复

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:ORA-00742 ORA-00312 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

12.1.0.1的由于硬件故障,恢复文件之后,导致redo写丢失,数据库数据库无法正常启动
报错ORA-00742 ORA-00312

Mon Feb 22 17:07:48 2021
alter database open
Mon Feb 22 17:07:48 2021
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Mon Feb 22 17:07:48 2021
Started redo scan
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:51 2021
Aborting crash recovery due to error 742
Mon Feb 22 17:07:51 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4624.trc:
ORA-00742: 日志读取在线程 1 序列 4035 块 44165 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO03.LOG'
Mon Feb 22 17:07:51 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4624.trc:
ORA-00742: 日志读取在线程 1 序列 4035 块 44165 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO03.LOG'
ORA-742 signalled during: alter database open...

通过屏蔽一致性,强制resetlogs方式打开库报ORA-600 2662错误

Mon Feb 22 17:27:38 2021
Checker run found 17 new persistent data failures
alter database open resetlogs 
Mon Feb 22 17:27:54 2021
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 102879654
Resetting resetlogs activation ID 762781739 (0x2d77202b)
Mon Feb 22 17:27:59 2021
Setting recovery target incarnation to 4
Mon Feb 22 17:28:00 2021
Assigning activation ID 895702933 (0x35635795)
Starting background process TMON
Mon Feb 22 17:28:00 2021
TMON started with pid=26, OS id=4204 
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Successful open of redo thread 1
Mon Feb 22 17:28:00 2021
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Feb 22 17:28:00 2021
SMON: enabling cache recovery
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc  (incident=21657):
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\incident\incdir_21657\orcl12c_ora_4804_i21657.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Feb 22 17:28:06 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Mon Feb 22 17:28:06 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Mon Feb 22 17:28:06 2021
Error 600 happened during db open, shutting down database
USER (ospid: 4804): terminating the instance due to error 600

由于scn相差的不大,重启几次后,该问题解决,后续数据库启动报ORA-600 4193

Mon Feb 22 19:53:11 2021
Database Characterset is ZHS16GBK
Starting background process SMCO
Mon Feb 22 19:53:11 2021
SMCO started with pid=28, OS id=3236 
Mon Feb 22 19:53:15 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_smon_4460.trc:
ORA-01595: 释放区 (2) 回退段 (1) 时出错
ORA-00600: 内部错误代码, 参数: [4193], [15352], [18655], [], [], [], [], [], [], [], [], []
Mon Feb 22 19:53:18 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_1356.trc:
ORA-00600: 内部错误代码, 参数: [4193], [15352], [18655], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 1356): terminating the instance due to error 600
Mon Feb 22 19:53:21 2021
Instance terminated by USER, pid = 1356
ORA-1092 signalled during: ALTER DATABASE OPEN...

处理异常undo之后,数据库启动正常,完成数据库恢复

硬件故障导致ORA-600 2662错误处理

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:硬件故障导致ORA-600 2662错误处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

前几天恢复了一个40多T的CASE:ORA-00600: internal error code, arguments: [16513], [1403] 恢复,又一个近30T的库由于硬件故障,通过其他人一系列恢复之后,无法正常open,让我们提供技术支持:
故障最初原因是由于存储异常

Fri Feb 19 09:03:49 2021
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_3460.trc:
ORA-01114: 将块写入文件 849 时出现 IO 错误 (块 # 3871748)
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1167) 设备没有连接。
ORA-01114: 将块写入文件 849 时出现 IO 错误 (块 # 3871748)
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1167) 设备没有连接。

通过其他人一系列处理后,数据库报ORA-600 2662错误

Sat Feb 20 08:19:35 2021
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Feb 20 08:19:35 2021
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc(incident=1960181):
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc:
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5304.trc:
ORA-00600:internal error code,arguments:[2662],[4],[2185364344], [4],[2185453722],[893388032],[],[],[],[],[],[]
Error 600 happened during db open, shutting down database
USER (ospid: 5304): terminating the instance due to error 600
Instance terminated by USER, pid = 5304
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (5304) as a result of ORA-1092
Sat Feb 20 08:19:42 2021
ORA-1092 : opitsk aborting process

通过对scn处理,数据库顺利绕过该错误,然后报ORA-600 4194错误

Doing block recovery for file 213 block 4688
No block recovery was needed
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_7048.trc(incident=1984136):
ORA-00600: internal error code, arguments: [4194], [38.4.1381252], [0], [], [],[],[],[],[],[],[],[]
Sat Feb 20 10:50:45 2021
Doing block recovery for file 213 block 4688
No block recovery was needed
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_7048.trc:
ORA-00600: internal error code, arguments: [4194], [38.4.1381252], [0], [], [],[],[],[],[],[],[],[]
SMON (ospid: 7048): terminating the instance due to error 474
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_6652.trc(incident=1984185):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Sat Feb 20 10:50:52 2021
Instance terminated by SMON, pid = 7048

通过对异常事务进行处理,屏蔽smon进程进行回滚,数据库open成功,但是报ORA-600 4137错误

Sat Feb 20 10:53:46 2021
Sweep [inc][1992133]: completed
Stopping background process MMNL
Sat Feb 20 10:53:47 2021
Trace dumping is performing id=[cdmp_20210220105347]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc(incident=1992134):
ORA-00600: internal error code, arguments: [4137], [23.13.3094188], [0], [0], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 14) - Error 600 encountered while recovering transaction (23, 13).
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc:
ORA-00600: internal error code, arguments: [4137], [23.13.3094188], [0], [0], [], [], [], [], [], [], [], []
Sat Feb 20 10:53:47 2021
Sweep [inc2][1992133]: completed
Sat Feb 20 10:53:47 2021
Sweep [inc][1992134]: completed
Stopping background process MMON
Trace dumping is performing id=[cdmp_20210220105348]
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc(incident=1992135):
ORA-00600: internal error code, arguments: [4137], [38.4.1381252], [0], [0], [], [], [], [], [], [], [], []
Starting background process MMON
Starting background process MMNL
Sat Feb 20 10:53:48 2021
MMON started with pid=16, OS id=6448 
ALTER SYSTEM enable restricted session;
Sat Feb 20 10:53:48 2021
MMNL started with pid=36, OS id=6840 
ORACLE Instance xifenfei (pid = 14) - Error 600 encountered while recovering transaction (38, 4).
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_smon_6576.trc:
ORA-00600: internal error code, arguments: [4137], [38.4.1381252], [0], [0], [], [], [], [], [], [], [], []
Sat Feb 20 10:53:49 2021
Sweep [inc][1992135]: completed
Trace dumping is performing id=[cdmp_20210220105349]
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: alter database open

对异常回滚段进行处理,数据库后端启动正常,不再报明显ORA-错误.通过hcheck.sql检查字典正常

HCheck Version 07MAY18 on 20-FEB-2021 11:35:11
----------------------------------------------
Catalog Version 11.2.0.1.0 (1102000100)
db_name: JYJG

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp
Result
------------------------------ ... ---------- -- ---------- --------------
------
.- LobNotInObj                 ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- MissingOIDOnObjCol          ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- SourceNotInObj              ... 1102000100 <=  *All Rel* 02/20 11:35:11 PASS
.- IndIndparMismatch           ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- InvCorrAudit                ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- OversizedFiles              ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PoorDefaultStorage          ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PoorStorage                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- PartSubPartMismatch         ... 1102000100 <= 1102000100 02/20 11:35:12 PASS
.- TabPartCountMismatch        ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- OrphanedTabComPart          ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- MissingSum$                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- MissingDir$                 ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- DuplicateDataobj            ... 1102000100 <=  *All Rel* 02/20 11:35:12 PASS
.- ObjSynMissing               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjSeqMissing               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedUndo                ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndex               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndexPartition      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndexSubPartition   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTable               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTablePartition      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedTableSubPartition   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- MissingPartCol              ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedSeg$                ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- OrphanedIndPartObj#         ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- DuplicateBlockUse           ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- FetUet                      ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- Uet0Check                   ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ExtentlessSeg               ... 1102000100 <= 1102000100 02/20 11:35:13 PASS
.- SeglessUET                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadInd$                     ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadTab$                     ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadIcolDepCnt               ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjIndDobj                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- TrgAfterUpgrade             ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- ObjType0                    ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadOwner                    ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- StmtAuditOnCommit           ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadPublicObjects            ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadSegFreelist              ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- BadDepends                  ... 1102000100 <=  *All Rel* 02/20 11:35:13 PASS
.- CheckDual                   ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- ObjectNames                 ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- BadCboHiLo                  ... 1102000100 <=  *All Rel* 02/20 11:35:14 PASS
.- ChkIotTs                    ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- NoSegmentIndex              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- BadNextObject               ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- DroppedROTS                 ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- FilBlkZero                  ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- DbmsSchemaCopy              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- OrphanedObjError            ... 1102000100 >  1102000000 02/20 11:35:15 PASS
.- ObjNotLob                   ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- MaxControlfSeq              ... 1102000100 <=  *All Rel* 02/20 11:35:15 PASS
.- SegNotInDeferredStg         ... 1102000100 >  1102000000 02/20 11:35:18 PASS
.- SystemNotRfile1             ... 1102000100 >   902000000 02/20 11:35:18 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
.- OrphanTrigger               ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
.- ObjNotTrigger               ... 1102000100 <=  *All Rel* 02/20 11:35:19 PASS
---------------------------------------
20-FEB-2021 11:35:19  Elapsed: 8 secs
---------------------------------------
Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

虽然字典正常,但是由于数据库屏蔽了一致性,建议客户在条件允许的情况下,进行逻辑迁移,排除风险隐患.