ORA-600 kclchkblk_4 故障恢复

朋友和我说,他们数据库由于存储控制器异常,导致数据库无法正常启动.
数据库recover database 提示需要已经覆盖的redo

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:23:35 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, Data Mining and Real Application Testing options

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread
2
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL> select thread#,group#,sequence# from v$log;

   THREAD#     GROUP#  SEQUENCE#
---------- ---------- ----------
         1          1      30261
         1          2      30262
         1          3      30258
         1          4      30259
         1          5      30260
         2          6      29966
         2          7      29967
         2          8      29968
         2          9      29969
         2         10      29970

10 rows selected.

SQL> select group#,member from v$logfile;

    GROUP# MEMBER
---------- --------------------------------------------------
         1 /dev/md/oradg/rdsk/d115
         2 /dev/md/oradg/rdsk/d116
         3 /dev/md/oradg/rdsk/d117
         4 /dev/md/oradg/rdsk/d118
         5 /dev/md/oradg/rdsk/d119
         6 /dev/md/oradg/rdsk/d120
         7 /dev/md/oradg/rdsk/d121
         8 /dev/md/oradg/rdsk/d122
         9 /dev/md/oradg/rdsk/d123
        10 /dev/md/oradg/rdsk/d124

10 rows selected.

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2
ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

数据库没归档,redo覆盖,无法继续恢复,使用_allow_resetlogs_corruption屏蔽前滚,继续恢复

SQL> startup mount pfile='/tmp/pfile.txt';
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> recover database until cancel;          
ORA-00279: change 1639063379 generated at 11/01/2014 12:58:20 needed for thread
1
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch1_30262_790965041.dbf
ORA-00280: change 1639063379 for thread 1 is in sequence #30262


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dev/md/oradg/rdsk/d105'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

查看alert日志

Mon Nov 03 17:37:11 CST 2014
Completed crash recovery at
 Thread 2: logseq 1, block 3, scn 1639083385
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Picked broadcast on commit scheme to generate SCNs
Mon Nov 03 17:37:12 CST 2014
Thread 2 advanced to log sequence 2 (thread open)
Thread 2 opened at log sequence 2
  Current log# 7 seq# 2 mem# 0: /dev/md/oradg/rdsk/d121
Successful open of redo thread 2
Mon Nov 03 17:37:12 CST 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Nov 03 17:37:13 CST 2014
SMON: enabling cache recovery
Mon Nov 03 17:37:13 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 15700
ORA-1092 signalled during: alter database open...

参考:In 10.1.0.2: ORA-600 [kclchkblk_4] and ORA-600 [2662] After Recovery of Database (Doc ID 275902.1),删除tempfile继续打开数据库

SQL> alter database tempfile '/dev/md/oradg/rdsk/d109' drop;

Database altered.

SQL> c/109/110
  1* alter database tempfile '/dev/md/oradg/rdsk/d110' drop
SQL> /

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

数据库依然报ORA-00600kclchkblk_4]错误,因此解决该问题选择使用bbed修改文件头scn来完成,具体参考类似文章:使用bbed修复损坏datafile header

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:41:17 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.txt' mount; 
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d109' size 8388608000 autoextend off;

Tablespace altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d110' size 8388608000 autoextend off;

Tablespace altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
Database opened.

至此数据库恢复完成,建议逻辑方式重建数据库

记录一次Oracle 12C 文件头损坏恢复

接到第一个使用Oracle 12C作为生产库的恢复救援.有两个业务数据文件报文件头损坏,其他数据文件全部是9月份的一次备份,在当前的条件下,希望我们能够帮他们恢复出来业务文件中的数据
数据库版本信息

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

数据库故障
具体脚本请参考:数据库恢复检查脚本(Oracle Database Recovery Check)
控制文件信息
1
控制文件中关于数据文件信息
2
数据文件头信息
3
alert日志报错

Reading datafile '/app/oracle/oradata/freetouch/sales.dbf' for corruption at rdba: 0x00000001 (file 4, block 1)
Reread (file 4, block 1) found same corrupt data (no logical check)
Hex dump of (file 5, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc

Corrupt block relative dba: 0x00000001 (file 5, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

Reading datafile '/app/oracle/oradata/freetouch/drp_200200' for corruption at rdba: 0x00000001 (file 5, block 1)
Reread (file 5, block 1) found same corrupt data (no logical check)
Hex dump of (file 4, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc

Corrupt block relative dba: 0x00000001 (file 4, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

dbv 检测

[oracle@db trace]$ dbv file=/app/oracle/oradata/freetouch/drp_200200

DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 14:08:34 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/drp_200200


DBVERIFY - Verification complete

Total Pages Examined         : 194560
Total Pages Processed (Data) : 114596
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 26198
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 37787
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 15979
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 129603862 (0.129603862)

[oracle@db ~]$ dbv file=/app/oracle/oradata/freetouch/sales.dbf

DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 23:12:05 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/sales.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 655360
Total Pages Processed (Data) : 294938
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 233404
Total Pages Failing   (Index): 0
Total Pages Processed (Lob)  : 38
Total Pages Failing   (Lob)  : 0
Total Pages Processed (Other): 23252
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 103728
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 134665298 (0.134665298)

dbv检测结果无坏块,但是v$datafile_header和alert日志中报坏块,初步判断是由于该文件是bigfile,dbv未检测到文件头坏块,实际该该数据文件头损坏,其他block正常.所幸的是该库有9月份的rman备份(中间归档丢失),因此使用rman还原出来9月份的数据文件,然后使用dd拷贝两个 block(block 0和block 1)到异常文件.

[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 of=/tmp/odu/sales.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0176368 s, 929 kB/s
[root@db freetouch]# dd if=/tmp/odu/sales.2 of=/app/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 conv=notrunc 
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4281e-05 s, 255 MB/s
[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 of=/tmp/odu/drp_200200.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0185934 s, 881 kB/s
[root@db freetouch]# dd if=/tmp/odu/drp_200200.2 of=/app/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 conv=notrunc 
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4419e-05 s, 254 MB/s

尝试恢复数据库

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 4 belongs to an orphan incarnation
ORA-01110: data file 4: '/app/oracle/oradata/freetouch/sales.dbf'

使用bbed修改相关文件头,然后继续恢复
具体见:bbed解决ORA-01190类似方法处理

SQL> recover database using backup controlfile;
ORA-00279: change 129603904 generated at 11/02/2014 19:19:54 needed for thread
1
ORA-00289: suggestion :
/app/oracle/recovery_area/VALUENET/archivelog/2014_11_02/o1_mf_1_1_%u_.arc
ORA-00280: change 129603904 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel;
ORA-00308: cannot open archived log 'cancel;'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [129603911], [0],
[129603913], [29360256], [], [], [], [], [], []
Process ID: 19881
Session ID: 1 Serial number: 3

出现ORA-600[2662]错误,因为scn相差比较小,重启数据库机器,出现ORA-600[4194]错误

SQL> startup pfile='/tmp/pfile.txt' mount
ORACLE instance started.

Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/app/oracle/oradata/freetouch/system01.dbf'


SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件后继续恢复

SQL> alter database open resetlogs;    
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [46], [19], [], [], [], [],
[], [], [], [], []
Process ID: 20351
Session ID: 1 Serial number: 3

设置undo_management=MANUAL然后继续恢复

[oracle@db tmp]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Sun Nov 2 19:29:45 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.txt'
ORACLE instance started.

Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
Database opened.

这次的恢复也证明Oracle 12C确实有着越来越多的用户在使用.