记录一次ORA-600 3004 恢复过程和处理思路

10.1.0.2数据库在启动的时候报ORA-00600[3004]错误

SQL> startup
ORACLE 例程已经启动。
Total System Global Area 1042254360 bytes
Fixed Size 743960 bytes
Variable Size 503316480 bytes
Database Buffers 536870912 bytes
Redo Buffers 1323008 bytes
数据库装载完毕。
ORA-00600: 内部错误代码,参数: [3004], [1], [0], [0], [], [], [], []

alert日志信息

Tue Jul 15 10:57:11 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Jul 15 10:57:12 2014
ALTER DATABASE OPEN
Tue Jul 15 10:57:12 2014
Beginning crash recovery of 1 threads
 attempting to start a parallel recovery with 3 processes
 parallel recovery started with 3 processes
Tue Jul 15 10:57:12 2014
Started first pass scan
Tue Jul 15 10:57:12 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []

Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p000_3780.trc:
ORA-10388: parallel query server interrupt (failure)

Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p002_3420.trc:
ORA-10388: parallel query server interrupt (failure)

Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p001_3784.trc:
ORA-10388: parallel query server interrupt (failure)

Tue Jul 15 10:57:14 2014
Aborting crash recovery due to error 600
Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []

ORA-600 signalled during: ALTER DATABASE OPEN...

根据老杨的blog,出现该问题,很可能是crontrolfile异常导致,所以这里可以考虑通过重建控制文件或者把当前控制文件当作备份控制文件来使用

Tue Jul 15 13:42:31 2014
ALTER DATABASE RECOVER  database using backup controlfile   
Tue Jul 15 13:42:31 2014
Media Recovery Start
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 6 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 7 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 8 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 9 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 10 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 11 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
 attempting to start a parallel recovery with 3 processes
 parallel recovery started with 3 processes
Starting datafile 1 with incarnation depth 0 in thread 1 sequence 794
Datafile 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSTEM01.DBF'
Starting datafile 2 with incarnation depth 0 in thread 1 sequence 794
Datafile 2: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\UNDOTBS01.DBF'
Starting datafile 3 with incarnation depth 0 in thread 1 sequence 794
Datafile 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSAUX01.DBF'
Starting datafile 4 with incarnation depth 0 in thread 1 sequence 794
Datafile 4: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\USERS01.DBF'
Starting datafile 5 with incarnation depth 0 in thread 1 sequence 794
Datafile 5: 'E:\ORADB\USER1_1.ORA'
Starting datafile 6 with incarnation depth 0 in thread 1 sequence 794
Datafile 6: 'E:\ORADB\USER1_2.ORA'
Starting datafile 7 with incarnation depth 0 in thread 1 sequence 794
Datafile 7: 'E:\ORADB\USER1_3.ORA'
Starting datafile 8 with incarnation depth 0 in thread 1 sequence 794
Datafile 8: 'E:\ORADB\USER1_4.ORA'
Starting datafile 9 with incarnation depth 0 in thread 1 sequence 794
Datafile 9: 'E:\ORADB\INDEX1_1.ORA'
Starting datafile 10 with incarnation depth 0 in thread 1 sequence 794
Datafile 10: 'E:\ORADB\INDEX1_2.ORA'
Starting datafile 11 with incarnation depth 0 in thread 1 sequence 794
Datafile 11: 'E:\ORADB\TEMP1_1.ORA'
Media Recovery Log 
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup cont...
Tue Jul 15 13:43:03 2014
ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG'  
Tue Jul 15 13:43:03 2014
Media Recovery Log D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Completed: ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\PRODU
Tue Jul 15 13:43:17 2014
alter database open resetlogs
RESETLOGS after complete recovery through change 3142208
Resetting resetlogs activation ID 1378452168 (0x522982c8)
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Flashback Database Disabled 
Tue Jul 15 13:43:19 2014
Assigning activation ID 1380750902 (0x524c9636)
Maximum redo generation record size = 120832 bytes
Maximum redo generation change vector size = 116476 bytes
Private_strands 7 at log switch
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Tue Jul 15 13:43:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Jul 15 13:43:19 2014
SMON: enabling cache recovery
Tue Jul 15 13:43:20 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []

Tue Jul 15 13:43:21 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []
Tue Jul 15 13:43:21 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600

这里出现ORA-600[2662]因为scn相差较小,直接重启几次,然后数据库出现如下启动正常但是出现ORA-01595和ORA-00600[4194]错误

Tue Jul 15 13:48:26 2014
Starting background process MMON
Starting background process MMNL
MMON started with pid=17, OS id=2896
MMNL started with pid=18, OS id=3828
Tue Jul 15 13:48:26 2014
Block recovery completed at rba 2.90.16, scn 0.3162303
Tue Jul 15 13:48:26 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_smon_3724.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [6], [], [], [], [], []

Tue Jul 15 13:48:26 2014
Completed: alter database open
Tue Jul 15 13:48:27 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_j000_2956.trc:
ORA-00600: internal error code, arguments: [4193], [1375], [1379], [], [], [], [], []

通过设置undo_management=manual,重建undo表空间解决,至此本数据库恢复完成,建议对其进行逻辑方式重建

误drop tablespace后使用flashback database闪回异常处理

有朋友夜间打电话请求技术支持,数据库表空间被删除,然后使用flashback database 无法正常恢复。通过分析alert日志发现,创建表空间(xifenfei 别名),发现已经存在,就删除了该表空间(其实是业务核心表空间,误删除了,是否是连接错了数据库?)

Sat Jul 05 17:10:06 2014
create tablespace XIFENFEI datafile 'D:\Oracle\oradata\orcl\HANDBB.DBF' 
size 50M autoextend on next 50M maxsize 1536M extent management local

Sat Jul 05 17:10:06 2014
ORA-1543 signalled during: create tablespace XIFENFEI datafile 'D:\Oracle\oradata\orcl\HANDBB.DBF' 
size 50M autoextend on next 50M maxsize 1536M extent management local
...
Sat Jul 05 17:10:59 2014
drop tablespace XIFENFEI

Sat Jul 05 17:10:59 2014
ORA-1549 signalled during: drop tablespace XIFENFEI
...
Sat Jul 05 17:11:05 2014
drop tablespace XIFENFEI

ORA-1549 signalled during: drop tablespace XIFENFEI
...
Sat Jul 05 17:11:24 2014
drop tablespace XIFENFEI including contents

Sat Jul 05 17:11:36 2014
Thread 1 advanced to log sequence 186895 (LGWR switch)
  Current log# 1 seq# 186895 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\REDO01.LOG
Sat Jul 05 17:11:36 2014
ARC3: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
Sat Jul 05 17:11:43 2014
LNS: Standby redo logfile selected for thread 1 sequence 186895 for destination LOG_ARCHIVE_DEST_4
Sat Jul 05 17:11:49 2014
LNS: Standby redo logfile selected for thread 1 sequence 186895 for destination LOG_ARCHIVE_DEST_2
Sat Jul 05 17:12:09 2014
Starting control autobackup
Control autobackup written to DISK device
	handle 'D:\FULLBACK\C-1342406147-20140705-00'
Completed: drop tablespace XIFENFEI including contents

通过这里可以发现删除表空间时间点为2014年7月5日17:12:09

闪回数据库到删除之前时间点

Sat Jul 05 18:16:54 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Jul 05 18:19:23 2014
FLASHBACK DATABASE TO TIMESTAMP TO_TIMESTAMP('2014-07-05 17:09:00','YYYY-MM-DD HH24:MI:SS')
Sat Jul 05 18:19:25 2014
Flashback Restore Start
Sat Jul 05 18:20:52 2014
--闪回时的控制文件中无表空间XIFENFEI信息(因为已经被删除),
--但是由于闪回的system 数据字典里面有相关文件信息,因此数据库在控制文件里面创建相关文件信息
Flashback: created tablespace #6: 'XIFENFEI' in the controlfile.
Flashback: created OFFLINE file 'UNNAMED00012' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI4.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00010' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI3.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00008' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI2.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00005' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback Restore Complete
Flashback Media Recovery Start
 parallel recovery started with 15 processes
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86891_0766797318.001
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86892_0766797318.001
Sat Jul 05 18:21:40 2014
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86893_0766797318.001
Sat Jul 05 18:21:47 2014
WARNING: inbound connection timed out (ORA-3136)
Sat Jul 05 18:22:11 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 186894 Reading mem 0
  Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\REDO03.LOG
Sat Jul 05 18:22:39 2014
Incomplete Recovery applied until change 9078991241
Flashback Media Recovery Complete
ORA-38795 signalled during: FLASHBACK DATABASE TO TIMESTAMP TO_TIMESTAMP('2014-07-05 17:09:00','YYYY-MM-DD HH24:MI:SS')...
Sat Jul 05 18:30:11 2014
ALTER DATABASE OPEN RESETLOGS
Sat Jul 05 18:30:11 2014
ORA-1245 signalled during: ALTER DATABASE OPEN RESETLOGS...

--重命名相关UNNAMExxxxx文件名到硬盘上被删除表空间文件
Sat Jul 05 18:39:31 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00005' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI.DBF'
Sat Jul 05 18:39:31 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00005' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI.DBF'
Sat Jul 05 18:39:47 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00008' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI2.DBF'
Sat Jul 05 18:39:47 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00008' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI2.DBF'
Sat Jul 05 18:39:59 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00010' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI3.DBF'
Sat Jul 05 18:39:59 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00010' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI3.DBF'
Sat Jul 05 18:40:12 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00012' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI4.DBF'
Sat Jul 05 18:40:12 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00012' 
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI4.DBF'
Sat Jul 05 18:41:25 2014
ALTER DATABASE OPEN RESETLOGS
Sat Jul 05 18:41:25 2014
ORA-1245 signalled during: ALTER DATABASE OPEN RESETLOGS...

到这里,可以看出来,因为数据库整体已经闪回,但是被drop 表空间的四个数据文件未被正常闪回,因此该四个文件的scn可能异常,通过数据库恢复检查脚本(Oracle Database Recovery Check)检查结果如下
datafile_scn
datafile_header_scn
这里很明显控制文件中的scn信息混乱不做过多参考,数据文件头信息看到只有ts# 6中的四个文件(就是被删除的表空间文件)scn过大,其他文件scn都处于正常状态(处于干净状态),到这里很明显,数据库闪回成功,但是被drop tablespace的数据文件未被闪回,因此该故障可以通过bbed修改四个文件头信息和其他文件相同即可使得数据库恢复正常
温馨提示:数据库操作需要慎重,备份重于一切

数据库恢复历史再次刷新到Oracle 7.3.2版本—redo异常恢复

有网友在QQ上找我,说Oracle 7.3的数据库,因为redo异常咨询我是否可以恢复
qq咨询


检查数据库得到以下信息

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle7 Workgroup Server Release 7.3.2.2.1 - Production Release
PL/SQL Release 2.3.2.2.0 - Production
CORE Version 3.5.2.0.0 - Production
TNS for 32-bit Windows: Version 2.3.2.1.0 - Production
NLSRTL Version 3.2.2.0.0 - Production
已选择 5 行

数据文件信息
qq
qq2


redo信息
qq3


跳过redo进行恢复,在resetlogs过程中报rbs表空间坏块,然后通过dul工具获得回滚段名称,然后使用隐含参数屏蔽掉

License high water mark = 2
Starting up ORACLE RDBMS Version: 7.3.2.2.1.
System parameters with non-default values:
  processes                = 800
  shared_pool_size         = 540000000
  control_files            = D:\ORANT\DATABASE\ctl1orcl.ora, D:\ORANT\DATABASE\ctl2orcl.ora
  compatible               = 7.3.0.0.0
  log_buffer               = 327680
  log_checkpoint_interval  = 1000000
  db_files                 = 40
  db_file_simultaneous_writes= 1280
  max_rollback_segments    = 12800
  _offline_rollback_segments= RB13, RB14, RB15, RB16, RB20
  _corrupted_rollback_segments= RB13, RB14, RB15, RB16, RB20
  sequence_cache_entries   = 100
  sequence_cache_hash_buckets= 100
  remote_login_passwordfile= SHARED
  mts_servers              = 0
  mts_max_servers          = 0
  mts_max_dispatchers      = 0
  audit_trail              = NONE
  sort_area_retained_size  = 65536
  sort_direct_writes       = AUTO
  db_name                  = oracle
  open_cursors             = 800
  text_enable              = TRUE
  snapshot_refresh_processes= 1
  background_dump_dest     = %RDBMS73%\trace
  user_dump_dest           = %RDBMS73%\trace

Mon Jun 16 16:46:57 2014

PMON started
Mon Jun 16 16:46:57 2014

DBWR started
Mon Jun 16 16:46:57 2014

LGWR started
Mon Jun 16 16:46:57 2014

RECO started
Mon Jun 16 16:46:57 2014

SNP0 started
Mon Jun 16 16:46:57 2014
alter database  mount exclusive
Mon Jun 16 16:46:58 2014
Successful mount of redo thread 1.
Mon Jun 16 16:46:58 2014
Completed: alter database  mount exclusive
Mon Jun 16 16:48:15 2014
alter database open
Mon Jun 16 16:48:16 2014
Beginning crash recovery of 1 threads
Crash recovery completed successfully
Mon Jun 16 16:48:17 2014
Thread 1 advanced to log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Thread 1 opened at log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Successful open of redo thread 1.
Mon Jun 16 16:48:18 2014
SMON: enabling cache recovery
Mon Jun 16 16:48:19 2014
Completed: alter database open
Mon Jun 16 16:48:20 2014
SMON: enabling tx recovery
SMON: about to recover undo segment 14
SMON: mark undo segment 14 as needs recovery
SMON: about to recover undo segment 15
SMON: mark undo segment 15 as needs recovery
SMON: about to recover undo segment 16
SMON: mark undo segment 16 as needs recovery
SMON: about to recover undo segment 17
SMON: mark undo segment 17 as needs recovery
SMON: about to recover undo segment 18
SMON: mark undo segment 18 as needs recovery
Mon Jun 16 16:48:20 2014
Errors in file D:\ORANT\RDBMS73\trace\orclSMON.TRC:
ORA-00600: internal error code, arguments: [4306], [21], [2], [], [], [], [], []

数据库在启动过程中出现ORA-00600[4306],导致smon异常。该错误是因为在数据库open过程中smon会清理临时段从而出现该错误,通过设置event跳过,数据库算整体打开,不过在恢复过程中还遇到了

Mon Jun 16 17:53:10 2014
Errors in file D:\ORANT\RDBMS73\trace\orclDBWR.TRC:
ORA-00600: internal error code, arguments: [3600], [3], [14], [], [], [], [], []


Mon Jun 16 18:05:12 2014
Errors in file D:\ORANT\RDBMS73\trace\ORA06880.TRC:
ORA-01578: ORACLE数据块有错(文件号12, 块号46644)
ORA-01110: 文件'12'没有联机
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []

ORA-00600[3600]是因为在offline 回滚段所在表空间锁出现的问题
ORA-00600[4194]是因为回滚段所在的表空间数据文件出现坏块所导致

记录一次ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00600[4000]/ORA-00600[4193]的数据库恢复

尝试recover database,遭遇ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00354错误

Media Recovery Log 
Recovery of Online Redo Log: Thread 1 Group 1 Seq 5645 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO01.LOG
Mon Jun 09 15:36:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []

Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-10562: Error occurred while applying redo to data block (file# 3, block# 20142)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 47841
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []

Mon Jun 09 15:36:13 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

Errors with log 
Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2357 change 25400286 time 06/06/2014 04:00:41
ORA-00334: archived log: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG'

Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []

Mon Jun 09 15:36:16 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 1490)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSTEM01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 203
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []

Media Recovery failed with error 12801
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...

因为数据库允许少量丢失数据,且redo文件发生损坏,直接使用隐含参数屏蔽redo前滚,尝试强制拉库,报ORA-00704,ORA-00600[4000]错误

Mon Jun 09 15:57:51 2014
SMON: enabling cache recovery
Mon Jun 09 15:57:51 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []

Mon Jun 09 15:57:52 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_pmon_9760.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_5244.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7096.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7924.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_708.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_7400.trc:
ORA-00704: bootstrap process failure

Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_9836.trc:
ORA-00704: bootstrap process failure

Instance terminated by USER, pid = 8664
ORA-1092 signalled during: alter database open resetlogs...

对数据库启动过程做10046,然后使用bbed修改scn绕过该错误,然后继续尝试打开数据库,报ORA-00604/ORA-00607/ORA-00600[4193]错误

Mon Jun 09 16:01:09 2014
SMON: enabling cache recovery
Mon Jun 09 16:01:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []

Mon Jun 09 16:01:10 2014
Doing block recovery for file 1 block 397
Block recovery range from rba 2.3.0 to scn 0.1073741830
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery stopped at EOT rba 2.5.16
Block recovery completed at rba 2.5.16, scn 0.1073741830
Doing block recovery for file 1 block 9
Block recovery range from rba 2.3.0 to scn 0.1073741829
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery completed at rba 2.5.16, scn 0.1073741830
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_9176.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7932.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7428.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_6936.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_404.trc:
ORA-00604: error occurred at recursive SQL level 

Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_7968.trc:
ORA-00604: error occurred at recursive SQL level 

Instance terminated by USER, pid = 7548
ORA-1092 signalled during: ALTER DATABASE OPEN...

该错误的原因是因为数据库在启动的过程中,会事先利用上次数据库运行过程中system undo segment header指向的block,而该block异常,所以出现该错误,使用bbed/dul之类的工具清除掉undo seg header 指向block指针,然后数据库启动会重新分配一个block,从而实现数据库正常启动.

记录一次ORA-600 kccpb_sanity_check_2和ORA-600 kcbgtcr_13 错误恢复

晚上朋友告诉我数据库不能open,请求技术支持,检查alert日志发现ORA-00600[kccpb_sanity_check_2]错误导致数据库无法正常mount

Fri Jun  6 23:36:08 2014
alter database mount
Fri Jun  6 23:36:08 2014
This instance was first to mount
Fri Jun  6 23:36:12 2014
Errors in file /on3000/oracle/admin/on3000/udump/on30001_ora_295198.trc:
ORA-00600:内部错误代码, 参数:[kccpb_sanity_check_2], [18045], [17928], [0x000000000], [], [], [], []
ORA-600 signalled during: alter database mount...

依次替换三个控制文件依然无法解决该问题。查询MOS得到解释为[435436.1]

ORA-600 [kccpb_sanity_check_2] indicates that the seq# of the last read block is
higher than the seq# of the control file header block. This is indication of
the lost write of the header block during commit of the previous cf transaction.

出现该故障的原因是因为写丢失导致,而解决该故障的方法有

1) restore a backup of a controlfile and recover

OR

2) recreate the controlfile

OR

3) restore the database from last good backup and recover

该数据库为无备份非归档数据库,因此只能重建控制文件来解决ORA-00600[kccpb_sanity_check_2]故障,通过重建控制文件数据库mount成功.但是在open的过程中又出现需要一个不存在的归档日志(数据库一个节点5月5日异常,另外一个节点5月23日异常,到6月6日我接手中间进行了N多操作,未细细分析原因).
Oracle Database Recovery Check Result检查结果
数据库SCN
1
控制文件中关于数据文件SCN
2
数据文件头SCN
3
REDO SCN
4
这里明显表示thread#缺少归档,导致恢复过程出现如下提示
5


最后没有办法只能使用_allow_resetlogs_corruption参数跳过redo,然后去open数据库,很不幸出现更加悲催的ORA-00704和ORA-00600[kcbgtcr_13]错误,导致数据库open失败

Sat Jun  7 00:28:58 2014
SMON: enabling cache recovery
Sat Jun  7 00:28:58 2014
Errors in file /on3000/oracle/admin/on3000/udump/on30001_ora_344084.trc:
ORA-00600: 内部错误代码, 参数: [kcbgtcr_13], [], [], [], [], [], [], []
Sat Jun  7 00:28:59 2014
Errors in file /on3000/oracle/admin/on3000/udump/on30001_ora_344084.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [kcbgtcr_13], [], [], [], [], [], [], []
Sat Jun  7 00:28:59 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 344084
ORA-1092 signalled during: alter database open...

对启动过程做10046发现

*** 2014-06-07 00:28:58.528
ksedmp: internal or fatal error
ORA-00600: 内部错误代码, 参数: [kcbgtcr_13], [], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000004 ? 10538629C ?
ksedmp+0290          bl       ksedst               104A2CDB0 ?
ksfdmp+0018          bl       03F2735C             
kgerinv+00dc         bl       _ptrgl               
kgeasnmierr+004c     bl       kgerinv              000000000 ? 10564B4CC ?
                                                   000000004 ? 70000006D3FD6F0 ?
                                                   000000000 ?
kcbassertbd+0074     bl       kgeasnmierr          110195490 ? 110486310 ?
                                                   10564BD54 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 11043DC90 ?
kcbgtcr+2a68         bl       kcbassertbd          FFFFFFFFFFE7D00 ?
                                                   FFFFFFFFFE7D28 ?
kturbk1+0258         bl       kcbgtcr              000000000 ? 104D23384 ?
                                                   104D2330C ? 000000000 ?
ktrgcm+1294          bl       kturbk1              F0000000F ? FFFFFFFFFFE84B8 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFE84D0 ? 000000000 ?
ktrgtc+0660          bl       ktrgcm               1104B7600 ?
kdsgrp+0094          bl       ktrgtc               0000006C8 ? 000000000 ?
                                                   FFFFFFFFFFE8D80 ?
                                                   28242043335E5162 ?
                                                   103A06D48 ? 70000006F6C87B8 ?
                                                   1051C3528 ?
kdsfbrcb+0298        bl       kdsgrp               1044726E4 ?
                                                   433000000000003B ?
                                                   FFFFFFFFFFE8E30 ?
qertbFetchByRowID+0  bl       03F27E18             
69c                                                
qerstFetch+00ec      bl       01F9482C             
opifch2+141c         bl       03F25E6C             
opifch+003c          bl       opifch2              FFFFFFFFFFEC3D8 ? 000000000 ?
                                                   FFFFFFFFFFEA360 ?
opiodr+0ae0          bl       _ptrgl               
rpidrus+01bc         bl       opiodr               5FFFEC840 ? 200000000 ?
                                                   FFFFFFFFFFEC850 ? 500000000 ?
skgmstack+00c8       bl       _ptrgl               
rpidru+0088          bl       skgmstack            11049A820 ? 110195490 ?
                                                   000000002 ? 000000000 ?
                                                   FFFFFFFFFFEC3E8 ?
rpiswu2+034c         bl       _ptrgl               
rpidrv+095c          bl       rpiswu2              70000006F6C7250 ? 1104851E8 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   1101B8C48 ? 000000000 ?
rpifch+0050          bl       rpidrv               5FFFEC840 ? 500000000 ?
                                                   FFFFFFFFFFEC850 ? 06CD75F70 ?
kqdpts+0134          bl       rpifch               11022A3E0 ?
kqrlfc+0258          bl       kqdpts               000000000 ?
kqlbplc+00b4         bl       03F28AF0             
kqlblfc+0204         bl       kqlbplc              10011A9A8 ?
adbdrv+1978          bl       01F944B4             
opiexe+2c08          bl       adbdrv               
opiosq0+19f0         bl       opiexe               FFFFFFFFFFF9550 ? 100000000 ?
                                                   FFFFFFFFFFF8F20 ?
kpooprx+0168         bl       opiosq0              3693644A8 ? 700000010003520 ?
                                                   700000069364428 ?
                                                   A4000110195490 ?
kpoal8+0400          bl       kpooprx              FFFFFFFFFFFB774 ?
                                                   FFFFFFFFFFFB500 ?
                                                   1B0000001B ? 100000001 ?
                                                   000000000 ? A40000000000A4 ?
                                                   000000000 ? 11039FA18 ?
opiodr+0ae0          bl       _ptrgl               
ttcpip+1020          bl       _ptrgl               
opitsk+1124          bl       01F971E8             
opiino+0990          bl       opitsk               1E00000000 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl               
opidrv+0484          bl       01F96034             
sou2o+0090           bl       opidrv               3C02D9A29C ? 4A006E298 ?
                                                   FFFFFFFFFFFF6B0 ?
opimai_real+01bc     bl       01F939B4             
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0070         bl       main                 000000000 ? 000000000 ?
 
--------------------- Binary Stack Dump ---------------------

--undo cr构造记录
Dump of buffer cache at level 1 for tsn=4, rdba=16777293
BH (700000035fb77b8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000357b4000
  set: 10 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035f969c8,70000006d3fd868] lru: [700000035fb7728,70000006d6acff0]
  ckptq: [NULL] fileq: [NULL] objq: [700000035fb7798,7000000693932c0]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5fb],[sfl: 0x0]
  flags:
BH (700000035f969c8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000353d6000
  set: 9 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035ff9398,700000035fb77b8] lru: [700000035f96938,70000006d6aca98]
  ckptq: [NULL] fileq: [NULL] objq: [700000035f969a8,700000069390250]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5fa],[sfl: 0x0]
  flags:
BH (700000035ff9398) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 700000035f70000
  set: 12 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035fd86b8,700000035f969c8] lru: [700000035ff9528,70000006d6adaa0]
  ckptq: [NULL] fileq: [NULL] objq: [7000000693973c8,7000000693973c8]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5f9],[sfl: 0x0]
  flags:
BH (700000035fd86b8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 700000035b94000
  set: 11 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035fb76a8,700000035ff9398] lru: [700000035fd8848,70000006d6ad548]
  ckptq: [NULL] fileq: [NULL] objq: [700000069396398,700000069396398]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5f8],[sfl: 0x0]
  flags:
BH (700000035fb76a8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000357b2000
  set: 10 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035f968b8,700000035fd86b8] lru: [700000035fb7948,700000035fb7838]
  ckptq: [NULL] fileq: [NULL] objq: [7000000693932c0,700000035fb78a8]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5f7],[sfl: 0x0]
  flags:
BH (700000035f968b8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000353d4000
  set: 9 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [70000006d3fd868,700000035fb76a8] lru: [700000035f96b58,700000035f96a48]
  ckptq: [NULL] fileq: [NULL] objq: [700000069390250,700000035f96ab8]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5f6],[sfl: 0x0]
  flags:
WAIT #5: nam='db file sequential read' ela= 135 file#=4 block#=77 blocks=1 obj#=-1 tim=1107709395109
on-disk scn: 0x0.19af5d47
BH (700000035fb77b8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000357b4000
  set: 10 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035f969c8,70000006d3fd868] lru: [700000035fb7728,70000006d6acff0]
  ckptq: [NULL] fileq: [NULL] objq: [700000035fb7798,7000000693932c0]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5fb],[sfl: 0x0]
  flags:
Dump of buffer cache at level 10 for tsn=4, rdba=16777293
BH (700000035fb77b8) file#: 4 rdba: 0x0100004d (4/77) class: 46 ba: 7000000357b4000
  set: 10 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
  dbwrid: 0 obj: -1 objn: 0 tsn: 4 afn: 4
  hash: [700000035f969c8,70000006d3fd868] lru: [700000035fb7728,70000006d6acff0]
  ckptq: [NULL] fileq: [NULL] objq: [700000035fb7798,7000000693932c0]
  st: CR md: NULL tch: 0
  cr: [scn: 0x0.1],[xid: 0x0.0.0],[uba: 0x0.0.0],[cls: 0x0.19afb5fb],[sfl: 0x0]
  flags:
  buffer tsn: 4 rdba: 0x0100004d (4/77)
  scn: 0x0000.19af5d47 seq: 0x01 flg: 0x04 tail: 0x5d470201
  frmt: 0x02 chkval: 0x6d2e type: 0x02=KTU UNDO BLOCK

--obj$ block dump记录
Block header dump:  0x0040007a
 Object id on Block? Y
 seg/obj: 0x12  csc: 0x00.19afb597  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000f.012.0002c79c  0x0100004d.6113.1c  --U-    1  fsc 0x0000.19afb598

这里可以知道,数据库在读取obj$的时候使用到了undo cr块的构造,由于某种原因导致构造cr块失败,从而出现ORA-00600[kcbgtcr_13]错误,而因为obj$又在bootstarp$里面,因此又出现ORA-704.所以解决该问题的方法就是让数据库在查询obj$表的时候不再进行cr块构造,比如使用bbed提交事务等方法解决.这里使用bbed提交事务(bbed模拟提交事务一之修改itl),数据库启动成功

SQL> startup
ORACLE 例程已经启动。

Total System Global Area 1610612736 bytes
Fixed Size                  2084400 bytes
Variable Size             973078992 bytes
Database Buffers          620756992 bytes
Redo Buffers               14692352 bytes
数据库装载完毕。
数据库已经打开。