ORA-600[KCLCHKBLK_4] – 专业Oracle数据库恢复，或许是您恢复数据的最后机会@phone:17813235971

-bash-3.2$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:23:35 2014 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production With the Partitioning, Real Application Clusters, Data Mining and Real Application Testing options SQL> recover database; ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf ORA-00280: change 1639063379 for thread 2 is in sequence #29919 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled. SQL> select thread#,group#,sequence# from v$log; THREAD# GROUP# SEQUENCE# ---------- ---------- ---------- 1 1 30261 1 2 30262 1 3 30258 1 4 30259 1 5 30260 2 6 29966 2 7 29967 2 8 29968 2 9 29969 2 10 29970 10 rows selected. SQL> select group#,member from v$logfile; GROUP# MEMBER ---------- -------------------------------------------------- 1 /dev/md/oradg/rdsk/d115 2 /dev/md/oradg/rdsk/d116 3 /dev/md/oradg/rdsk/d117 4 /dev/md/oradg/rdsk/d118 5 /dev/md/oradg/rdsk/d119 6 /dev/md/oradg/rdsk/d120 7 /dev/md/oradg/rdsk/d121 8 /dev/md/oradg/rdsk/d122 9 /dev/md/oradg/rdsk/d123 10 /dev/md/oradg/rdsk/d124 10 rows selected. SQL> recover database; ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf ORA-00280: change 1639063379 for thread 2 is in sequence #29919 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel Media recovery cancelled.

SQL> startup mount pfile='/tmp/pfile.txt'; ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. SQL> recover database until cancel; ORA-00279: change 1639063379 generated at 11/01/2014 12:58:20 needed for thread 1 ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch1_30262_790965041.dbf ORA-00280: change 1639063379 for thread 1 is in sequence #30262 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} cancel ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01194: file 1 needs more recovery to be consistent ORA-01110: data file 1: '/dev/md/oradg/rdsk/d105' ORA-01112: media recovery not started SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced

Mon Nov 03 17:37:11 CST 2014 Completed crash recovery at Thread 2: logseq 1, block 3, scn 1639083385 0 data blocks read, 0 data blocks written, 1 redo blocks read Picked broadcast on commit scheme to generate SCNs Mon Nov 03 17:37:12 CST 2014 Thread 2 advanced to log sequence 2 (thread open) Thread 2 opened at log sequence 2 Current log# 7 seq# 2 mem# 0: /dev/md/oradg/rdsk/d121 Successful open of redo thread 2 Mon Nov 03 17:37:12 CST 2014 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set Mon Nov 03 17:37:13 CST 2014 SMON: enabling cache recovery Mon Nov 03 17:37:13 CST 2014 Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], [] Mon Nov 03 17:37:14 CST 2014 Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc: ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], [] Mon Nov 03 17:37:14 CST 2014 Error 600 happened during db open, shutting down database USER: terminating instance due to error 600 Instance terminated by USER, pid = 15700 ORA-1092 signalled during: alter database open...

SQL> alter database tempfile '/dev/md/oradg/rdsk/d109' drop; Database altered. SQL> c/109/110 1* alter database tempfile '/dev/md/oradg/rdsk/d110' drop SQL> / Database altered. SQL> alter database open; alter database open * ERROR at line 1: ORA-01092: ORACLE instance terminated. Disconnection forced

-bash-3.2$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:41:17 2014 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to an idle instance. SQL> startup pfile='/tmp/pfile.txt' mount; ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. SQL> alter database open; Database altered. SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d109' size 8388608000 autoextend off; Tablespace altered. SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d110' size 8388608000 autoextend off; Tablespace altered. SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 2097152000 bytes Fixed Size 2053120 bytes Variable Size 1090522112 bytes Database Buffers 855638016 bytes Redo Buffers 148938752 bytes Database mounted. Database opened.

当前日志被覆盖导致错误
朋友的一客户在一套rac上包含了两个数据库,其其中一个库增加redo group时候,覆盖了另外一个库的redo,悲剧的是刚好是current redo

Wed May 16 17:03:05 2012
ALTER DATABASE OPEN
This instance was first to open
Wed May 16 17:03:09 2012
Beginning crash recovery of 2 threads
 parallel recovery started with 15 processes
Wed May 16 17:03:11 2012
Started redo scan
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
Abort recovery for domain 0
Wed May 16 17:03:11 2012
Aborting crash recovery due to error 305
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
ORA-305 signalled during: ALTER DATABASE OPEN...
Wed May 16 17:03:13 2012
Shutting down instance (abort)

使用_allow_resetlogs_corruption= TRUE进行恢复

Wed May 16 18:16:48 2012
SMON: enabling cache recovery
Wed May 16 18:16:48 2012
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Wed May 16 18:16:48 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], 
[2522], [18446744072024247666], [], [], []
Wed May 16 18:16:50 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], 
[2522], [18446744072024247666], [], [], []
Wed May 16 18:16:50 2012
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 2105454
ORA-1092 signalled during: alter database open resetlogs...

ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block is too high.
The same reason caused the ORA-600[2662]s in the alert logs.
因为是临时文件的scn太大的问题,那就比较好解决:
启动数据库到mount状态,查询出来相关temp file,然后drop掉.

ORA-00600[6856]

Wed May 16 20:25:16 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
ORACLE Instance odsdb1 (pid = 16) - Error 600 encountered while recovering transaction 
(10, 8） on object 7162533.
Wed May 16 20:25:16 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc:
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []

这里的4193和4194是比较熟悉的,根据这里的提示猜测6856也是和undo有关系
ORA-600[6856]SMON is trying to recover a dead transaction.
But the undo application runs into an internal error (trying to delete a row that is already deleted).
因为smon回滚的时候出现上面错误,解决方法是想办法终止回滚,使用event=”10513 trace name context forever, level 2″.

ORA-00600[4193]/ORA-00600[4194]

Wed May 16 20:25:17 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], []
Wed May 16 20:25:18 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], []
Wed May 16 20:25:56 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00600: internal error code, arguments: [4193], [22248], [22252], [], [], [], [], []

太常见错误,不再做说明,虽然使用event是的库open成功,因为部分回滚段有问题,该错误还是会出现(还是喜欢直接屏蔽回滚段)

ORA-00600[ktpridestroy2]

Wed May 16 20:36:26 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2101296.trc:
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], []

This error could be the result of a corruption and involves the parallel rollback that SMON enables each startup.
解决:fast_start_parallel_rollback=false

ORA-00600[kturacf1]/ORA-00600[kcbgcur_9]

Wed May 16 20:49:15 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_2007088.trc:
ORA-00600: internal error code, arguments: [kturacf1], [2097152], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcbgcur_9], [780140563], [4], [4294959056], [2097152], [], [], []

ORA-00600[kturacf1]错误未查询到原因
ORA-00600[kcbgcur_9]错误原因可能是:Buffers are pinned in a specific class order to prevent internal deadlocks.
因为这两个错误是job产生非致命错误,在这次的处理过程中可以忽略

ORA-00600[4097]

Wed May 16 21:05:05 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_1716282.trc:
ORA-12012: error on auto execute of job 6603
ORA-20001: ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
ORA-06512: at "EPBI.UP_SYSLOG_ONLINE_USER", line 141
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []

When an instance has a rollback segment offline and the instance crashes, or
the user does a shutdown abort, the rollback segment wrap number does not get
updated. If that segment is then dropped and recreated immediately after the
instance is restarted, the wrap number could be lower than existing wrap
numbers. This will cause the ORA-600[4097] to occur in subsequent
transactions using Rollback.
这个错误也是因为回滚段wrap number未被及时更新导致的异常.

总结这次恢复过程
1.因当前redo丢失,使用隐含参数打开库,然后出现ORA-600[KCLCHKBLK_4](这个错误比较少见,更何况这个库是10.2.0.4）
2.undo出现问题出现ORA-00600[6856]错误不是很多见
3.接下来都是一些列undo导致的问题,其实如果开始就直接使用隐含参数删除掉有问题回滚段,效果可能会比event好.
4.因为部分trace文件没有拿到,未做深入分析,文章列出整体恢复思路
5.本次恢复的数据库版本是10.2.0.4,根据mos描述,很多错误应不会发生,但是实际还是发生了,MOS的版本范围,不要太看重.
6.其实这篇文章的本质不是展示恢复过程,而是再一次提醒:操作数据库慎重,特别是一台机器上多套库.

专业Oracle数据库恢复技术支持

专业Oracle数据库恢复，或许是您恢复数据的最后机会@phone:17813235971

Tag Archives: ORA-600[KCLCHKBLK_4]

ORA-600 kclchkblk_4 故障恢复

rac redo log file被意外覆盖数据库恢复