ORACLE 8.1.7 数据库ORA-600 4000处理

Posted on 2014 年 08 月 30 日 by 惜分飞

在数据库的恢复过程中遇到ORA-600 4000错误挺多的，但是在oracle 8i(8.1.7)中遇到此类问题，还是第一次，做个记忆，供参考：
数据库故障起因：因为存储异常，导致当前redo损坏，并_allow_resetlogs_corruption参数尝试打开数据库

Media Recovery Log 
kcrrga: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup cont...
Wed Aug 20 23:01:43 2014
ALTER DATABASE RECOVER    CANCEL  
Media Recovery Cancelled
Completed: ALTER DATABASE RECOVER    CANCEL  
Wed Aug 20 23:01:50 2014
alter database open resetlogs

RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 262618871
Wed Aug 20 23:01:50 2014
Thread 1 opened at log sequence 1
  Current log# 3 seq# 1 mem# 0: F:\REDO01.LOG
Successful open of redo thread 1.
Wed Aug 20 23:01:50 2014
SMON: enabling cache recovery
Wed Aug 20 23:01:50 2014
Errors in file C:\oracle\admin\YCFD\udump\ORA00320.TRC:
ORA-00600: ??????????: [4000], [3], [], [], [], [], [], []

SMON: disabling cache recovery
Wed Aug 20 23:01:51 2014
ORA-704 signalled during: alter database open resetlogs

数据库遭遇ORA-600 4000错误，数据库无法打开，分析对应trace日志

Dump file C:\oracle\admin\YCFD\udump\ORA00320.TRC
Wed Aug 20 23:01:50 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: ycfd

Redo thread mounted by this instance: 1

Oracle process number: 8

Windows thread id: 320, image: ORACLE.EXE


*** SESSION ID:(7.1) 2014-08-20 23:01:50.838
*** 2014-08-20 23:01:50.838
ksedmp: internal or fatal error
ORA-00600: ??????????: [4000], [3], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
----- Call Stack Trace -----

这里可以看出来，是因为数据库在启动之时需要执行select ctime, mtime, stime from obj$ where obj# = :1语句，但是由于每种原因出现ORA-600 4000导致数据库无法正常启动，继续分析日志

lock header dump:  0x0040003e
 Object id on Block? Y
 seg/obj: 0x12  csc: 0x00.fb5c5c5  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0003.012.0002ae94    uba: 0x00801f5b.5389.11  --U-    1  fsc 0x0000.0fb5c5c6


SQL> select checkpoint_change# from v$database;

263570122

此处比较明显,通过xid可以知道三号回滚段中对应一个事务出现问题：
1. 该block为file 1 bock 62，object_id为 18（obj$)上有一个事务
2. 该事务的scn为263,570,886>database scn(263570122)导致该故障发生
3. 当数据库访问到file 1 block 62的时候，发现有一个事务，而该事务的scn大于数据库scn，从而出现ORA-600[4000]
解决该问题有几种方法
1. 修改block 62，人工提交该事务
2. 修改数据库scn，让数据库scn大于itl scn
解决block 62 事务问题后出现如下错误

Wed Aug 20 23:03:55 2014
SMON: enabling cache recovery
Wed Aug 20 23:03:55 2014
Dictionary check beginning
Dictionary check complete
Wed Aug 20 23:03:55 2014
SMON: enabling tx recovery
Wed Aug 20 23:03:56 2014
Errors in file C:\oracle\admin\YCFD\bdump\ycfdSMON.TRC:
ORA-00600: internal error code, arguments: [4193], [21173], [21181], [], [], [], [], []

Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: F:\REDO03.LOG
SMON: terminating instance due to error 600
Instance terminated by SMON, pid = 2468

数据库出现ORA-600 4193,这个是常见错误,因为redo记录和undo记录不匹配导致,可以直接使用_corrupted_rollback_segments/_offline_rollback_segments屏蔽回滚段跳过

Wed Aug 20 23:08:10 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
SMON: about to recover undo segment 1
SMON: mark undo segment 1 as needs recovery
SMON: about to recover undo segment 2
SMON: mark undo segment 2 as needs recovery
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
SMON: about to recover undo segment 4
SMON: mark undo segment 4 as needs recovery
SMON: about to recover undo segment 5
SMON: mark undo segment 5 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 7
SMON: mark undo segment 7 as needs recovery
SMON: about to recover undo segment 1
SMON: mark undo segment 1 as needs recovery
SMON: about to recover undo segment 2
SMON: mark undo segment 2 as needs recovery
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
SMON: about to recover undo segment 4
SMON: mark undo segment 4 as needs recovery
SMON: about to recover undo segment 5
SMON: mark undo segment 5 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 7
SMON: mark undo segment 7 as needs recovery
Wed Aug 20 23:08:15 2014
Completed: alter database open

其他类似文章：
ORA-600[4194]/[4193]解决
 通过bbed解决ORA-600 4000案例
 通过bbed解决ORA-00600[4000]案例
 记录一次ORA-600 4000数据库故障恢复

ORACLE REDO各种异常恢复

Posted on 2014 年 02 月 26 日 by 惜分飞

redo是oracle数据库比较核心文件,当redo异常之后,数据库无法正常启动,而且有丢失数据的风险,强烈建议条件允许redo多路镜像.redo文件异常的故障可以说是千奇百怪,但是总体上可以分为几类:
数据库归档/非归档模式下inactive redo异常
ORA-00316 ORA-00327

SQL> startup mount
ORACLE instance started.

Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 2 of thread 1, type  in header is not log file
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'

SQL> col member for a40
SQL> set lines 120
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;

   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1         15          3            665697 CURRENT          /u01/oracle/oradata/XFF/redo03.log
         1         14          2            645619 INACTIVE         /u01/oracle/oradata/XFF/redo02.log
         1         13          1            625540 INACTIVE         /u01/oracle/oradata/XFF/redo01.log

SQL> ALTER DATABASE CLEAR LOGFILE GROUP 2;

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00327: log 2 of thread 1, physical size  less than needed
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'

SQL> alter database drop logfile group 2;

Database altered.

SQL> alter database open;

Database altered.

SQL> alter database add logfile group 2 ('/u01/oracle/oradata/XFF/redo02.log') size 50M reuse;

Database altered.

正常关闭数据库current redo异常
ORA-00316 ORA-01623

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type  in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'

SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;

   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- ----------------------------------
         1         16          1            685918 CURRENT          /u01/oracle/oradata/XFF/redo01.log
         1         15          3            665697 INACTIVE         /u01/oracle/oradata/XFF/redo03.log
         1          0          2                 0 UNUSED           /u01/oracle/oradata/XFF/redo02.log

SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'


SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01623: log 1 is current log for instance XFF (thread 1) - cannot drop
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'

SQL> recover database until cancel;
Media recovery complete.
SQL> alter database open resetlogs;

Database altered.

数据库异常关闭current/active redo异常
ORA-00316 ORA-01624 ORA-01194

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'

SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;

   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1          8          2            686310 CURRENT          /u01/oracle/oradata/XFF/redo02.log
         1          7          1            686294 ACTIVE           /u01/oracle/oradata/XFF/redo01.log
         1          6          3            686289 INACTIVE         /u01/oracle/oradata/XFF/redo03.log

SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'


SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'


SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/XFF/redo01.log
ORA-00308: cannot open archived log '/u01/oracle/oradata/XFF/redo01.log'
ORA-27047: unable to read the header block of file
Additional information: 2


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'

SQL> alter system set "_allow_resetlogs_corruption"=true scope=spfile;

System altered.

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;

Database altered.

在这样的情况下,数据库异常关闭,current/active redo异常,通过使用隐含参数可能可以侥幸的恢复数据库,但是也可能导致数据丢失.这里因为是模拟情况,无业务所以在很多较为繁忙的业务系统中,使用隐含参数resetlogs过程中可能还会遇到如下很多常见的错误,进一步增加了恢复难度

current/active redo异常后附带其他错误
ORA-600[2662]

Wed Dec 07 13:02:49 2011
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216664):
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216664\hzyl_ora_3388_i216664.trc
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3388): terminating the instance due to error 600

ORA-00600[4000]

Thu Feb 28 19:29:10 2013
SMON: enabling cache recovery
Thu Feb 28 19:29:11 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []
Thu Feb 28 19:29:13 2013
Incremental checkpoint up to RBA [0x1.3.0], current log tail at RBA [0x1.3.0]
Thu Feb 28 19:29:13 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []

ORA-00704 ORA-00604 ORA-01555

Fri May  4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May  4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name "_SYSSMU40$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs...

current/active redo异常还可能报如下错误
redo文件损坏报错

Started redo scan
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc  (incident=214262):
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_214262\hzyl_ora_2960_i214262.trc
Aborting crash recovery due to error 368
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc:
ORA-00368: 重做日志块中的校验和错误
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
ORA-368 signalled during: ALTER DATABASE OPEN...

redo文件被其他实例占用报错

Wed May 16 17:03:11 2012
Started redo scan
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
ORA-305 signalled during: ALTER DATABASE OPEN...

存储整体异常

Mon Oct 17 09:35:09 2011
Errors in file /oracle/app/admin/orcl/bdump/orcl2_lgwr_348814.trc:
ORA-00340: IO error processing online log 4 of thread 2
ORA-00345: redo log write error block 6732 count 2
ORA-00312: online log 4 thread 2: '/dev/rredo21'
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 6: No such device or address
Additional information: -1
Additional information: 1024
Mon Oct 17 09:35:09 2011
LGWR: terminating instance due to error 340

存储IO异常

Fri Feb 21 08:44:42 2014
Thread 1 advanced to log sequence 591 (LGWR switch)
  Current log# 1 seq# 591 mem# 0: J:\ORADATA\ORCL\REDO01.LOG
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_10312.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'

使用_disable_logging参数

Sat May 14 23:16:49 2005
Errors in file d:\oracle\admin\rman\bdump\rman_arc0_736.trc:
ORA-16038: log 3 sequence# 72 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\RMAN\REDO03.LOG'

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com

姊妹篇
undo异常总结和恢复思路
 ORACLE丢失各种文件导致数据库不能OPEN恢复

记录一次ORA-600 4000数据库故障恢复

Posted on 2013 年 05 月 30 日 by 惜分飞

ORA-600[4000]错误
一朋友数据库因为当前redo丢失,在恢复的过程中启动报ORA-600[4000]错误

SMON: enabling cache recovery
Thu May 30 16:24:17 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1458370
ORA-1092 signalled during: alter database open resetlogs...

分析trace文件

*** 2013-05-30 16:24:17.979
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
--确定是obj$对象异常,通过某种手段找到obj$的objid和dataobjid均为16，对应16进制为12

Block header dump:  0x0040007a
 Object id on Block? Y
 seg/obj: 0x12  csc: 0xc1e.a329e76f  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0053.02a.000598bd  0x0d407e46.4f52.2f  --U-    1  fsc 0x0000.a329e772

这里比较明显obj$对象在rdba为0040007a的block上,scn为0c1e.a329e76f(13325725984623)且未提交的事务,这样的现象就决定了处理的特殊性(不是因为块延迟清理导致访问undo现象,该现象直接推进scn解决,而该情况不行)

数据文件头scn

SQL> SELECT DISTINCT CHECKPOINT# FROM V$DATAFILE_HEADER;

    CHECKPOINT_CHANGE#
-------------------------
           13324676536960

bbed查看文件头scn

   struct kcvfhckp, 160 bytes               @484     
      struct kcvcpscn, 8 bytes              @484     
         ub4 kscnbas                        @484      0x649c9a80
         ub2 kscnwrp                        @488      0x0c1e

这里看到的文件头scn也是为13324676536960(0c1e.649c9a80)和sql查询结果一致,也就是说数据库中的obj$的某个对象含有事务,且scn大于文件头scn(因为当前redo丢失,无法前滚,所以出现该情况),当数据库访问obj$的时候,为了事务的一致性,就需要访问undo(这里提示为83 回滚段),而undo异常,所以smon进程回滚失败,数据库无法正常启动

使用bbed提交事务

BBED> map
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122                                   Dba:0x0040007a
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 48 bytes                     @20      

 struct kdbh, 14 bytes                      @68      

 struct kdbt[1], 4 bytes                    @82      

 sb2 kdbr[108]                              @86      

 ub1 freespace[802]                         @302     

 ub1 rowdata[7084]                          @1104    

 ub4 tailchk                                @8188    


BBED> p ktbbh
struct ktbbh, 48 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0xa329e76f
      ub2 kscnwrp                           @32       0x0c1e
   b2 ktbbhict                              @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0053
         ub2 kxidslt                        @46       0x002a
         ub4 kxidsqn                        @48       0x000598bd
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x0d407e46
         ub2 kubaseq                        @56       0x4f52
         ub1 kubarec                        @58       0x2f
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)<--需要提交
      union _ktbitun, 2 bytes               @62      
         b2 _ktbitfsc                       @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0xa329e772

BBED> set count 32
        COUNT           32

BBED> set offset 60
        OFFSET          60

BBED> d
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 20010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb 

 <32 bytes per line>

BBED> m /x 8001
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 80010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb 

 <32 bytes per line>

BBED> sum apply
Check value for File 1, Block 122:
current = 0xafd6, required = 0xafd6

尝试open数据库ORA-600[2662]解决

Thu May 30 21:16:00 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532996],[3102],[2745973074],[4194397],[],[]
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:01 2013
Trace dumping is performing id=[cdmp_20130530211601]
Thu May 30 21:16:02 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:03 2013
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:05 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:08 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_pmon_958764.trc:
ORA-00474: SMON process terminated with error
Thu May 30 21:16:08 2013
PMON: terminating instance due to error 474

数据库在open过程中遇到大量ORA-00600[2662],这个是因为数据库中文件头的scn小于访问的数据块scn导致该问题,解决方法推荐scn,如果数据库的scn本身就很大(和时间理论scn较接近),推进过程中可能遇到如下错误,这个时候就需要选择合适的方法/合适的值来推进scn

SQL> startup pfile=/home/oracle/pfile force
ORACLE instance started.

Total System Global Area 5.5835E+10 bytes
Fixed Size                  2177056 bytes
Variable Size            3.2867E+10 bytes
Database Buffers         2.2951E+10 bytes
Redo Buffers               14598144 bytes
Database mounted.
ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified

后面的工作因为没有redo前滚,而且该库故障时有大量事务在跑,现在无法前滚,导致大量的undo回滚段异常,index和data不一致等故障,需要做的就是屏蔽undo seg,重建undo,重建库

重现ORA-600 4000异常

Posted on 2012 年 08 月 05 日 by 惜分飞

对于数据库恢复感兴趣的人,一定对于ORA-600[4000]这个著名的错误记忆犹新,这里通过试验重现ORA-600[4000]
查询数据库obj$.con$记录

[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 4 06:32:36 2011

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              92276136 bytes
Database Buffers          218103808 bytes
Redo Buffers                7168000 bytes
Database mounted.
Database opened.
SQL> select OBJ#,dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  2  dbms_rowid.rowid_block_number(rowid) block_num from obj$ 
  3  where name='CON$';

      OBJ#    REL_FNO  BLOCK_NUM
---------- ---------- ----------
        28          1        122

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

bbed修改数据块
根据催华的研究,当OBJ$中部分数据块中的csc和itl的scn都大于header scn的时候,数据库启动就会报ORA-600[4000]

[oracle@xifenfei ~]$ bbed listfile=list mode=edit password=blockedit

BBED: Release 2.0.0.0.0 - Limited Production on Fri Nov 4 06:47:09 2011

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /u01/oracle/oradata/XFF/system01.dbf                                 0
     2  /u01/oracle/oradata/XFF/undotbs01.dbf                                0
     3  /u01/oracle/oradata/XFF/sysaux01.dbf                                 0
     4  /u01/oracle/oradata/XFF/users01.dbf                                  0
     5  /u01/oracle/oradata/XFF/datfttuser.dbf                               0

BBED> set block 1     
        BLOCK#          1

BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1

BBED>  p kcvfhckp.kcvcpscn
struct kcvcpscn, 8 bytes                    @484     
   ub4 kscnbas                              @484      0x00210f97
   ub2 kscnwrp                              @488      0x0000

BBED> set block 122
        BLOCK#          122

BBED> p ktbbh
struct ktbbh, 48 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28         <==csc(SCN of the last block cleanout)
      ub4 kscnbas                           @28       0x0020770d
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0005  <==回滚段序号
         ub2 kxidslt                        @46       0x0029
         ub4 kxidsqn                        @48       0x0000029a
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x00802381
         ub2 kubaseq                        @56       0x01f9
         ub1 kubarec                        @58       0x03
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)
      union _ktbitun, 2 bytes               @62      
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x0020770e <==itl commit scn

BBED> set count 16
        COUNT           16

BBED> m /x 0d772010 offset 28
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 122              Offsets:   28 to  60           Dba:0x0040007a
------------------------------------------------------------------------
 0d772010 00000000 01000200 00000000 

 <32 bytes per line>

BBED> m /x 0e772010 offset 64
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 122              Offsets:   64 to  96           Dba:0x0040007a
------------------------------------------------------------------------
 0e772010 00016c00 ffffea00 53046903 

 <32 bytes per line>

BBED> p ktbbh
struct ktbbh, 48 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0x1020770d
      ub2 kscnwrp                           @32       0x0000
   sb2 ktbbhict                             @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0005
         ub2 kxidslt                        @46       0x0029
         ub4 kxidsqn                        @48       0x0000029a
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x00802381
         ub2 kubaseq                        @56       0x01f9
         ub1 kubarec                        @58       0x03
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)
      union _ktbitun, 2 bytes               @62      
         sb2 _ktbitfsc                      @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0x1020770e

BBED> sum apply
Check value for File 1, Block 122:
current = 0xc902, required = 0xc902

启动数据库

SQL> startup
ORACLE instance started.

Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              92276136 bytes
Database Buffers          218103808 bytes
Redo Buffers                7168000 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced

查看日志

Fri Nov  4 06:50:38 2011
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri Nov  4 06:50:38 2011
ALTER DATABASE OPEN
Fri Nov  4 06:50:38 2011
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=16, OS id=7048
Fri Nov  4 06:50:38 2011
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=7050
ARC1: STARTING ARCH PROCESSES
Fri Nov  4 06:50:38 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri Nov  4 06:50:38 2011
Thread 1 opened at log sequence 38
  Current log# 3 seq# 38 mem# 0: /u01/oracle/oradata/XFF/redo03.log
Successful open of redo thread 1
Fri Nov  4 06:50:38 2011
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov  4 06:50:38 2011
SMON: enabling cache recovery
Fri Nov  4 06:50:38 2011
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=18, OS id=7052
Fri Nov  4 06:50:38 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_7046.trc:
ORA-00600: internal error code, arguments: [4000], [5], [], [], [], [], [], []
Fri Nov  4 06:50:40 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_7046.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [5], [], [], [], [], [], []
Fri Nov  4 06:50:40 2011
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 7046
ORA-1092 signalled during: ALTER DATABASE OPEN...

在ORACL 11.2的版本中,同样的方法无法重新该错误,数据库依然能够正常打开,所以如果要测试的朋友请选择11G以下版本进行.