Recovering from ORA-10562 Caused by a Storage Fault

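A friend's database hung for a moment after a storage change, then crashed outright and could not be restarted, so he asked for technical support.
The database reports ORA-00600 [2131]
The database cannot be mounted; recreating the control file resolves this.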

Mon Nov 30 20:35:38 2015
alter database mount
Mon Nov 30 20:35:38 2015
NOTE: Loaded library: System 
Mon Nov 30 20:35:38 2015
SUCCESS: diskgroup DATADG was mounted
Mon Nov 30 20:35:38 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATADG.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_26450.trc  (incident=3032256):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database mount...

Trying to recover the database

Mon Nov 30 20:45:53 2015
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 80 slaves
Mon Nov 30 20:45:56 2015
Recovery of Online Redo Log: Thread 2 Group 11 Seq 617 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo011.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 5410 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo04.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5411 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo05.log
Mon Nov 30 20:46:07 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 5412 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo06.log
Mon Nov 30 20:46:13 2015
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC] [PC:0x95FB502, kdxlin()+4088] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc  (incident=3032568):
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:17 2015
Sweep [inc][3032568]: completed
Sweep [inc2][3032568]: completed
Mon Nov 30 20:46:31 2015
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc:
ORA-10562: Error occurred while applying redo to data block (file# 2, block# 165054)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:31 2015
Recovery Slave PR13 previously exited with exception 10562
Mon Nov 30 20:46:33 2015
Checker run found 28 new persistent data failures
Mon Nov 30 20:46:35 2015
Media Recovery failed with error 448
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_30400.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

The log shows that during recovery the redo could not, for some reason, be applied to file 2 block 165054, so recover database failed.
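When an error stack like the one above has to be checked by hand across several trace files, it can help to pull out the file and block numbers mechanically. The following is only a minimal sketch: the helper name `failing_blocks` is made up for illustration, and the pattern only covers the ORA-10562 and ORA-10567 message shapes that appear in this post.

```python
import re

# Matches the two block-level recovery errors seen in this case.
# \s* tolerates the line wrap SQL*Plus inserts between "block#" and the number.
BLOCK_ERR = re.compile(
    r"ORA-(10562|10567):.*?\(file#\s*(\d+),\s*block#\s*(\d+)", re.DOTALL
)


def failing_blocks(text):
    """Return (error, file#, block#) tuples found in an error stack."""
    return [(f"ORA-{err}", int(f), int(b)) for err, f, b in BLOCK_ERR.findall(text)]


# Sample taken from the SQL*Plus output in this post (wrapped as printed).
stack = """ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
165054)
ORA-10564: tablespace SYSAUX"""

print(failing_blocks(stack))
```

The tuples can then be used to decide which datafile to recover separately.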

Recovering one datafile at a time

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 3,4,5,6,7;
Media recovery complete.
SQL> recover datafile 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28;              
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
165054)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []

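The error is the same as with recover database, so the only option is to let recovery skip this block and carry on. From experience, data object# 271 is not a core system object, so skipping it should not stop the database from starting.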

Skipping the corrupt blocks and continuing recovery

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [69793], [8458401], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 69793, file
offset is 571744256 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 272


SQL> recover  datafile 2 allow 1 corruption;
Media recovery complete.
SQL> alter database open;

Database altered.

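The first attempt hit ORA-600 [3020] on another block, so we skipped that block as well, and the database then opened without trouble. Don't forget to add a tempfile afterwards, since a recreated control file does not carry the tempfile entries.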
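The ORA-10567 message above gives both a block number (69793) and a byte offset (571744256) for the same block. The two are related by the database block size, which is handy when you need to locate the block in the file with an external tool. A small sketch, assuming an 8 KB block size (the post does not state it, but the numbers in the message are consistent with it):

```python
BLOCK_SIZE = 8192  # assumption: 8 KB db_block_size, not stated in the post


def block_to_offset(block_no, block_size=BLOCK_SIZE):
    """Byte offset of a block inside its datafile."""
    return block_no * block_size


def offset_to_block(offset, block_size=BLOCK_SIZE):
    """Block number that contains the given byte offset."""
    return offset // block_size


print(block_to_offset(69793))      # offset reported by ORA-10567
print(offset_to_block(571744256))  # block# reported by ORA-10567
```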

Dealing with the damaged objects

SQL> select object_name,object_type from dba_objects where data_object_id in(272,271);

OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
SMON_SCN_TIME_TIM_IDX
INDEX

SMON_SCN_TIME_SCN_IDX
INDEX

SQL> select index_name from dba_indexes where table_name='SMON_SCN_TIME';

INDEX_NAME
------------------------------
SMON_SCN_TIME_TIM_IDX
SMON_SCN_TIME_SCN_IDX

SQL> set pages 1000
SQL> set long 1000
SQL> Select dbms_metadata.get_ddl('TABLE','SMON_SCN_TIME','SYS') FROM DUAL ;

DBMS_METADATA.GET_DDL('TABLE','SMON_SCN_TIME','SYS')
--------------------------------------------------------------------------------

  CREATE TABLE "SYS"."SMON_SCN_TIME"
   (    "THREAD" NUMBER,
        "TIME_MP" NUMBER,
        "TIME_DP" DATE,
        "SCN_WRP" NUMBER,
        "SCN_BAS" NUMBER,
        "NUM_MAPPINGS" NUMBER,
        "TIM_SCN_MAP" RAW(1200),
        "SCN" NUMBER DEFAULT 0,
        "ORIG_THREAD" NUMBER DEFAULT 0           /* for downgrade */
   ) CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX" ("THREAD")

SQL> analyze table smon_scn_time validate structure cascade online;
analyze table smon_scn_time validate structure cascade online
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 165054)
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'

SQL> truncate CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX";

Cluster truncated.

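For more on handling SMON_SCN_TIME, see the post "Notes on Several SMON_SCN_TIME Issues". At this point the database recovery is essentially complete. We were very lucky: the recovery went cleanly, with zero data loss.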

Recovering ASM Disks That Were Formatted

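A reader asked for help: through an operator's carelessness, the ASM disks had been mapped to another machine and formatted as Windows NTFS file systems, which broke the ASM disk group and left the database unusable.
The ASM log reports ORA-27072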

Mon Nov 30 12:00:13 2015
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27070: async read/write failed
OSD-04008: WriteFile() failure, unable to write to file
O/S-Error: (OS 21) The device is not ready.
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xf0f0bbfb disk_path:\\.\ORCLDISKDATA0
	 AU:1 disk_offset(bytes):2093056 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:868
WARNING: disk 0.4042308603 (DATA_0000) not responding to heart beat
ERROR: too many offline disks in PST (grp 1)
WARNING: Disk DATA_0000 in mode 0x7f will be taken offline
Mon Nov 30 12:00:13 2015
NOTE: process 576:37952 initiating offline of disk 0.4042308603 (DATA_0000) with mask 0x7e in group 1
WARNING: Disk DATA_0000 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 0/0xf0f0bbfb, mode = 0x15
kfdp_updateDsk(): 5 
kfdp_updateDskBg(): 5 
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
ERROR: no PST quorum in group: required 1, found 0
WARNING: Disk DATA_0000 in mode 0x7f offline aborted
Mon Nov 30 12:00:14 2015
SQL> alter diskgroup DATA dismount force /* ASM SERVER */ 
NOTE: cache dismounting (not clean) group 1/0xBB404B03 (DATA) 
Mon Nov 30 12:00:14 2015
NOTE: halting all I/Os to diskgroup DATA
Mon Nov 30 12:00:14 2015
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=367.7265 last written ABA 367.7265
NOTE: cache dismounted group 1/0xBB404B03 (DATA) 
kfdp_dismount(): 6 
kfdp_dismountBg(): 6 
NOTE: De-assigning number (1,0) from disk (\\.\ORCLDISKDATA0)
NOTE: De-assigning number (1,1) from disk (\\.\ORCLDISKDATA1)
NOTE: De-assigning number (1,2) from disk (\\.\ORCLDISKDATA2)
NOTE: De-assigning number (1,3) from disk (\\.\ORCLDISKDATA3)
NOTE: De-assigning number (1,4) from disk (\\.\ORCLDISKDATA4)
NOTE: De-assigning number (1,5) from disk (\\.\ORCLDISKDATA5)
NOTE: De-assigning number (1,6) from disk (\\.\ORCLDISKDATA6)
NOTE: De-assigning number (1,7) from disk (\\.\ORCLDISKDATA7)
SUCCESS: diskgroup DATA was dismounted
NOTE: cache deleting context for group DATA 1/-1153414397
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA

The ASM log makes it clear that the ASM disks could no longer be accessed: ORA-27072 was raised and the disk group was force-dismounted.
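With this many repeated warnings it is easy to lose track of which disks actually failed. As a rough sketch (the function name `failed_disks` is hypothetical, and the sample keeps only one "IO Failed" line per disk from the log above), the disk numbers can be collected like this:

```python
import re

# One capture per "IO Failed" warning: group number, disk number, disk path.
IO_FAILED = re.compile(
    r"IO Failed\. group:(\d+) disk\(number\.incarnation\):(\d+)\.0x[0-9a-f]+ "
    r"disk_path:(\S+)"
)


def failed_disks(log_text):
    """Map disk number -> disk path for every 'IO Failed' warning."""
    return {int(num): path for _grp, num, path in IO_FAILED.findall(log_text)}


# Raw string so the Windows device paths keep their backslashes.
sample = r"""
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xf0f0bbfb disk_path:\\.\ORCLDISKDATA0
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
"""

failed = failed_disks(sample)
print(sorted(failed))                 # disk numbers with I/O failures
print(set(range(8)) - set(failed))    # disks 0..7 that logged no failure
```

Seven of the eight disk numbers show I/O failures, which is at least consistent with the finding below that 7 of the 8 disks were formatted. The log alone does not prove which disk is the untouched one.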

Analyzing the disks
[Figure: asm-disk1]
[Figure: asm-disk2]


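After talking with the customer, we established that drives I: through O: were originally ASM disks and had been formatted as NTFS. Together with the asmtool output, this shows that one ASM disk was not formatted: of the 8 disks in the disk group, 7 were formatted.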

Examining the disks with kfed

C:\Users\Administrator>kfed read '\\.\J:'
kfbh.endian:                        235 ; 0x000: 0xeb
kfbh.hard:                           82 ; 0x001: 0x52
kfbh.type:                          144 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         78 ; 0x003: 0x4e
kfbh.block.blk:               542328404 ; 0x004: T=0 NUMB=0x20534654
kfbh.block.obj:                 2105376 ; 0x008: TYPE=0x0 NUMB=0x2020
kfbh.check:                        2050 ; 0x00c: 0x00000802
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                    63488 ; 0x014: 0x0000f800
kfbh.spare1:                   16711743 ; 0x018: 0x00ff003f
kfbh.spare2:                       2048 ; 0x01c: 0x00000800
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=2
kfbh.endian:                         70 ; 0x000: 0x46
kfbh.hard:                           73 ; 0x001: 0x49
kfbh.type:                           76 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         69 ; 0x003: 0x45
kfbh.block.blk:                  196656 ; 0x004: T=0 NUMB=0x30030
kfbh.block.obj:                33563364 ; 0x008: TYPE=0x0 NUMB=0x22e4
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                    65537 ; 0x010: 0x00010001
kfbh.fcn.wrap:                    65592 ; 0x014: 0x00010038
kfbh.spare1:                        416 ; 0x018: 0x000001a0
kfbh.spare2:                       1024 ; 0x01c: 0x00000400
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=256
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:              2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                    17662471 ; 0x00c: 0x010d8207
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=510
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

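The analysis confirms that the backup copy of the ASM disk header was not overwritten, so in principle the disk group can be recovered from the backup block, which makes the recovery easier.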
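The "Unknown Enum" values kfed prints for blocks 0 and 2 are not random: kfed is interpreting foreign bytes as an ASM block header. Packing the printed integers back into bytes shows what is really on disk. This is only a sketch; it assumes kfed displayed the multi-byte fields as little-endian integers, and `header_bytes` is a name made up for illustration.

```python
import struct


def header_bytes(endian, hard, typ, datfmt, block_blk, block_obj):
    """Rebuild the first 12 bytes of a block from kfed's kfbh.* fields.

    Assumption: the 4-byte fields were read as little-endian integers,
    so packing them with '<' recovers the original on-disk byte order.
    """
    return struct.pack("<BBBBII", endian, hard, typ, datfmt, block_blk, block_obj)


# Values copied from the kfed output for block 0 and block 2 of J:
blk0 = header_bytes(235, 82, 144, 78, 542328404, 2105376)
blk2 = header_bytes(70, 73, 76, 69, 196656, 33563364)

print(blk0[3:11])  # OEM ID field of the boot sector
print(blk2[:4])    # record signature

# The surviving backup header still carries the ASM disk label.
label = struct.pack("<II", 1096040772, 54).rstrip(b"\x00")
print(label)
```

Block 0 decodes to the "NTFS" OEM ID of an NTFS boot sector and block 2 starts with "FILE", which looks like an NTFS file record. The two kfdhdb.driver.reserved values from block 510 decode to "DATA6", matching the provstr ORCLDISKDATA6.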

Repairing the disk header with kfed

C:\Users\Administrator> kfed repair '\\.\J:'
C:\Users\Administrator>kfed read '\\.\J:' 
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………
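As a side note on why block 510 was the one to look at: to the best of my knowledge, on 11g ASM keeps the backup copy of the disk header in the second-to-last block of allocation unit 1. The sketch below assumes a 1 MB AU and a 4 KB metadata block. Both figures are assumptions, although the ASM log above does show AU 1 starting at byte offset 1048576 with an io_size of 4096.

```python
AU_SIZE = 1048576   # assumption: 1 MB allocation unit
BLOCK_SIZE = 4096   # assumption: 4 KB ASM metadata block


def first_block_of_au(au_no, au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """kfed blkn of the first block in the given allocation unit."""
    return au_no * (au_size // block_size)


def backup_header_block(au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """kfed blkn of the header backup: second-to-last block of AU 1."""
    return first_block_of_au(2, au_size, block_size) - 2


print(first_block_of_au(1))    # start of AU 1, where kfed showed a PST block
print(backup_header_block())   # block read with blkn=510 above
```

With other AU sizes the backup block number changes accordingly, so the value should be recomputed rather than assumed to be 510.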

Confirming the ASM disk information
After giving all 7 formatted disks the same treatment, the tool shows the following disk information
[Figure: asm-disk3]


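Recovery
From the way NTFS lays out a file system, we know that even though the backup block of the ASM disk header is intact, quite a few AUs in the middle of each ASM disk will still have been destroyed
[Figure: ntfs]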


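In this situation it is not appropriate to copy the datafiles out directly with a tool: the metadata that records where each file's blocks live may itself have been overwritten, so the copied files come out damaged. (We tested this during the recovery. Small files copied fine, but large files showed many corrupt blocks when checked with dbv.) Instead we used a tool (see "Recovery from a completely destroyed ASM disk header") to scan the disks at a low level and reassemble the datafiles directly from the ASM disks. Together with the control file, redo files and parameter file that had been copied out, we then renamed the relevant paths and opened the database directly.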

Q:\>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on Wed Jan 22 16:08:18 2014

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> set pages 1000
SQL> col name for a100
SQL> set lines 150
SQL> select file#,name from v$datafile;

     FILE# NAME
---------- --------------------------------------------------------------------
         1 +DATA/vspdb/datafile/system.256.778520603
         2 +DATA/vspdb/datafile/sysaux.257.778520603
         3 +DATA/vspdb/datafile/undotbs1.258.778520603
         4 +DATA/vspdb/datafile/users.259.778520603
         5 +DATA/vspdb/datafile/vsp_tbs.293.779926097
        …………
       147 +DATA/vspdb/datafile/index_dg.418.864665747
       148 +DATA/vspdb/datafile/data_dg.419.864667053
       149 +DATA/vspdb/datafile/vsp_mm_tbs.420.890410367
       150 +DATA/vspdb/datafile/vsp_mm_tbs.421.890410457

SQL> select member from v$logfile;

MEMBER
-------------------------------------------------------------------------------------
+DATA/vspdb/onlinelog/group_7.263.862676593
+DATA/vspdb/onlinelog/group_7.262.862676601
+DATA/vspdb/onlinelog/group_4.410.862652291
+DATA/vspdb/onlinelog/group_4.411.862652307
+DATA/vspdb/onlinelog/group_5.412.862653715
+DATA/vspdb/onlinelog/group_5.413.862653727
+DATA/vspdb/onlinelog/group_6.414.862676425
+DATA/vspdb/onlinelog/group_6.415.862676433

Renaming the datafiles and redo files, then opening the database

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

Elapsed: 00:00:04.51

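Because some blocks were overwritten and replaced with empty blocks, any access that touches such a block raises ORA-8103 (see "Simulating and resolving an ordinary ORA-08103" and "Simulating and resolving an extreme ORA-08103"). For objects like this, the simplest fix is to extract the data with dul, truncate the table and reload the data. Of course, if you want to be completely safe, rebuilding the database logically is the most reliable option.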

Recovering from Oracle Bug ORA-600 [k2vcbk_2]

A friend came to me because their database would not start: startup fails with ORA-600 [k2vcbk_2]. The database is 11.2.0.2 RAC running on AIX 6.1.

SQL> recover database;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 7930020
Session ID: 49 Serial number: 14761

Alert log of database node 1

Mon Sep 21 15:45:41 2015
Thread 1 advanced to log sequence 54076 (LGWR switch)
  Current log# 13 seq# 54076 mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Current log# 13 seq# 54076 mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Mon Sep 21 15:45:44 2015
Archived Log entry 74655 added for thread 1 sequence 54075 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:56:18 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184348):
ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184348/xifenfei1_ora_18088342_i184348.trc
Mon Sep 21 15:56:34 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Error 600 trapped in 2PC on transaction 7.16.120119. Cleaning up.
Error stack returned to user:
ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184349):
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Mon Sep 21 15:56:34 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184349/xifenfei1_ora_18088342_i184349.trc
Mon Sep 21 15:56:35 2015
Sweep [inc][184349]: completed
Sweep [inc][184348]: completed
Sweep [inc2][184348]: completed
opiodr aborting process unknown ospid (18088342) as a result of ORA-603
Mon Sep 21 15:57:12 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc  (incident=184274):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184274/xifenfei1_smon_7536810_i184274.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 21 15:57:16 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7536810): terminating the instance due to error 474
Mon Sep 21 15:57:18 2015
ORA-1092 : opitsk aborting process

Alert log of database node 2

Mon Sep 21 15:21:50 2015
Archived Log entry 74653 added for thread 2 sequence 23559 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:44:28 2015
Thread 2 advanced to log sequence 23561 (LGWR switch)
  Current log# 12 seq# 23561 mem# 0: +DG01/xifenfei/onlinelog/group_12.338.779457003
  Current log# 12 seq# 23561 mem# 1: +DG01/xifenfei/onlinelog/group_12.265.779582493
Mon Sep 21 15:44:31 2015
Archived Log entry 74654 added for thread 2 sequence 23560 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e)
  insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e))
  delete pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:56:35 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Mon Sep 21 15:57:10 2015
Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
Error stack returned to user:
ORA-03135: connection lost contact
opidcl aborting process unknown ospid (9175532) as a result of ORA-604
Mon Sep 21 15:57:17 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Mon Sep 21 15:57:23 2015
Reconfiguration started (old inc 18, new inc 20)
List of instances:
 2 (myinst: 2) 
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 21 15:57:23 2015
 LMS 2: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 0: 2 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 1: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Mon Sep 21 15:57:23 2015
minact-scn: Inst 2 is now the master inc#:20 mmon proc-id:6816208 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0d45.28c2bb5c gcalc-scn:0x0d45.28c3bd2e
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:20 new-inc#:20
Mon Sep 21 15:57:23 2015
Instance recovery: looking for dead threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Beginning instance recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 12626 KB redo, 1724 data blocks need recovery
Started redo application at
 Thread 1: logseq 54076, block 184416
Recovery of Online Redo Log: Thread 1 Group 13 Seq 54076 Reading mem 0
  Mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Completed redo application of 9.78MB
Completed instance recovery at
 Thread 1: logseq 54076, block 209669, scn 14590688357285
 1633 data blocks read, 1794 data blocks written, 12626 redo k-bytes read
Thread 1 advanced to log sequence 54077 (thread recovery)
Mon Sep 21 15:57:33 2015
Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
Redo thread 1 internally disabled at seq 54077 (SMON)
Error stack returned to user:
ORA-02050: transaction 21.18.1965522 rolled back, some remote DBs may be in-doubt
ORA-03113: end-of-file on communication channel
ORA-02063: preceding line from ZSK
Mon Sep 21 15:57:34 2015
Archived Log entry 74656 added for thread 1 sequence 54076 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:34 2015
ARC0: Archiving disabled thread 1 sequence 54077
Archived Log entry 74657 added for thread 1 sequence 54077 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:35 2015
Thread 2 advanced to log sequence 23562 (LGWR switch)
  Current log# 8 seq# 23562 mem# 0: +DG01/xifenfei/onlinelog/group_8.334.779456945
  Current log# 8 seq# 23562 mem# 1: +DG01/xifenfei/onlinelog/group_8.267.779582453
Mon Sep 21 15:57:36 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc  (incident=200218):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei2/incident/incdir_200218/xifenfei2_smon_6750672_i200218.trc
Archived Log entry 74658 added for thread 2 sequence 23561 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:38 2015
minact-scn: master continuing after IR
Mon Sep 21 15:57:41 2015
Dumping diagnostic data in directory=[cdmp_20150921155741], requested by (instance=2, osid=6750672 (SMON)), summary=[incident=200218].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing instance transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 6750672): terminating the instance due to error 474
Mon Sep 21 15:57:41 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
License high water mark = 291
Instance terminated by SMON, pid = 6750672
USER (ospid: 18874814): terminating the instance

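From the alert logs we can roughly see what happened. A distributed transaction on node 2 went wrong, and in 11.2.0.2 distributed transactions can span nodes, so pmon on node 2 tried to clean up the failed transaction. Because of a bug the transaction could not be cleaned up, which crashed node 1. After node 1 crashed, node 2 performed recovery, and the same distributed-transaction bug made smon's rollback fail, so that instance crashed too. Since the rollback cannot complete, the database cannot start normally. A search on MOS points to Bug 10222544 ORA-600 [k2vpci_2] from multi-branch distributed transaction
[Figure: ORA-600-k2vpci_2]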


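For this kind of problem, where a distributed transaction cannot be cleaned up, the fix is to find the transaction and commit it manually, or to suppress the transaction outright.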
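To find the transaction, it helps to be able to read the identifiers the node 2 alert log prints for it. The log shows the local transaction ID and the SCN in both decimal and hex, and the two forms can be converted as in this small sketch (the helper names are made up for illustration):

```python
def scn_from_hex(wrap_base):
    """Convert a 'wrap.base' hex SCN to decimal: wrap * 2**32 + base."""
    wrap, base = (int(part, 16) for part in wrap_base.split("."))
    return (wrap << 32) + base


def xid_from_hex(xid):
    """Convert a hex 'usn.slot.sqn' transaction id to its decimal form."""
    return ".".join(str(int(part, 16)) for part in xid.split("."))


# Values from the "DISTRIB TRAN" entries in the node 2 alert log.
print(scn_from_hex("d45.28c781f6"))
print(xid_from_hex("14.03.17677e"))
```

Both results match the decimal values printed next to the hex ones in the log, so the same conversion can be applied to identifiers that only appear in hex in a trace file.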

12c SYSAUX Recovery: Fixing a Wrong RESETLOGS Error

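A friend asked for help. Their file 3 had a large number of corrupt blocks, so they restored it from an RMAN backup, but during recover they found that archived logs were missing. On top of that, the database had gone through a resetlogs after the SCN of the missing archived logs. As a result the database can no longer start and reports ORA-01190. They wanted help bringing file 3 online and opening the whole database normally. (The database can of course be opened without sysaux, but migrating the data in that state is rather troublesome.)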

SQL> startup;
ORACLE instance started.

Total System Global Area 3.1868E+10 bytes
Fixed Size                  3601144 bytes
Variable Size            2.8655E+10 bytes
Database Buffers         3154116608 bytes
Redo Buffers               54804480 bytes
Database mounted.
ORA-01190: control file or data file 3 is from before the last RESETLOGS
ORA-01110: data file 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'

Output of the Oracle Database Recovery Check Result [script]
[Figure: wrong+resetlogs]


Trying incomplete recovery and opening the database with hidden parameters

Fri Oct 02 19:10:12 2015
ALTER DATABASE RECOVER  database until cancel  
Fri Oct 02 19:10:12 2015
Media Recovery Start
 Started logmerger process
Fri Oct 02 19:10:12 2015
Media Recovery failed with error 16433
Fri Oct 02 19:10:14 2015
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Oct 02 19:10:37 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5176.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:10:37 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5176.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
ALTER DATABASE RECOVER  database until cancel  
Fri Oct 02 19:11:18 2015
Media Recovery Start
 Started logmerger process
Fri Oct 02 19:11:18 2015
Media Recovery failed with error 16433
Fri Oct 02 19:11:19 2015
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
alter database open resetlogs
ORA-1139 signalled during: alter database open resetlogs...
alter database open
Fri Oct 02 19:11:49 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_4252.trc:
ORA-01190: control file or data file 3 is from before the last RESETLOGS
ORA-01110: data file 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-1190 signalled during: alter database open...
Fri Oct 02 19:15:38 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5292.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:15:38 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5292.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:20:39 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_2276.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:20:39 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_2276.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:25:40 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_4804.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:25:40 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_4804.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:30:41 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_876.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:30:41 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_876.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:32:40 2015
Shutting down instance (abort)

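The database hit ORA-16433, so it cannot be opened this way. From experience, this kind of problem usually calls for recreating the control file, but because the resetlogs SCN of file 3 is wrong, the control file cannot be recreated with that file included.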

Fri Oct 02 20:10:55 2015
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Fri Oct 02 20:10:55 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_5004.trc:
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-1503 signalled during:  CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8
 ...

Recreate the control file again, this time leaving out file 3

Fri Oct 02 20:33:11 2015
Successful mount of redo thread 1, with mount id 1419796614
Completed:  CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
--'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8

Continue recovering the database

ALTER DATABASE OPEN
Fri Oct 02 20:34:57 2015
…………
Archived Log entry 3 added for thread 1 sequence 8 ID 0x54a083a3 dest 1:
Fri Oct 02 20:35:16 2015
Tablespace 'SYSAUX' #1 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #3 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00003' in the controlfile.
Corrected file 15 plugged in read-only status in control file
Corrected file 16 plugged in read-only status in control file
Corrected file 17 plugged in read-only status in control file
Corrected file 18 plugged in read-only status in control file
Corrected file 19 plugged in read-only status in control file
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Fri Oct 02 20:35:19 2015
SMON: enabling tx recovery
Fri Oct 02 20:35:19 2015
*********************************************************************
WARNING: The following temporary tablespaces in container(CDB$ROOT)
         contain no files.
Starting background process SMCO
Fri Oct 02 20:35:19 2015
SMCO started with pid=45, OS id=1500 
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is AL32UTF8
No Resource Manager plan active
Fri Oct 02 20:35:21 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_2220.trc:
ORA-00376: file 3 cannot be read at this time
ORA-01111: name for data file 3 is unknown - rename to correct file
ORA-01110: data file 3: 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\MISSING00003'
Fri Oct 02 20:35:21 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_2220.trc:
ORA-00376: file 3 cannot be read at this time
ORA-01111: name for data file 3 is unknown - rename to correct file
ORA-01110: data file 3: 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\MISSING00003'
Error 376 happened during db open, shutting down database
USER (ospid: 2220): terminating the instance due to error 376
Fri Oct 02 20:35:26 2015
Instance terminated by USER, pid = 2220
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (2220) as a result of ORA-1092

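Because file 3 was left out of the control file but still exists in the data dictionary, opening the database created the placeholder file name MISSING00003 for it. The next step is to rename that entry so it points at the real file 3, and then attempt recovery: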

SQL> startup mount;
ORACLE instance started.

Total System Global Area 3.1868E+10 bytes
Fixed Size                  3601144 bytes
Variable Size            2.8655E+10 bytes
Database Buffers         3154116608 bytes
Redo Buffers               54804480 bytes
Database mounted.
SQL> alter database datafile 3 offline;

Database altered.

SQL> alter database rename file 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\MISSING00003' to 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF';

Database altered.

SQL> recover database until cancel;
ORA-00279: change 617412726 generated at 10/02/2015 20:35:06 needed for thread 1
ORA-00289: suggestion :
E:\APP\ORAADM\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2015_10_02\O1_MF_1_9_%U_.ARC
ORA-00280: change 617412726 for thread 1 is in sequence #9


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG
ORA-00310: archived log contains sequence 7; sequence 9 required
ORA-00334: archived log: 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'


ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF'


SQL> recover database until cancel;
ORA-00279: change 617412726 generated at 10/02/2015 20:35:06 needed for thread 1
ORA-00289: suggestion :
E:\APP\ORAADM\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2015_10_02\O1_MF_1_9_%U_.ARC
ORA-00280: change 617412726 for thread 1 is in sequence #9


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG
Log applied.
Media recovery complete.
SQL> alter database datafile 3 online;

Database altered.

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01122: database file 3 failed verification check
ORA-01110: data file 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-01202: wrong incarnation of this file - wrong creation time

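The ORA-01202 here is fairly self-explanatory: the control file was created without any information about file 3, so what the control file now records for file 3 does not match the creation time in the file header. Only the creation time mismatch is reported at this point; if that time were patched with bbed, many other fields would probably turn out to be inconsistent afterwards, so fixing and retrying one field at a time is feasible in theory but impractical. Instead, I used bbed to edit the file 3 header directly (slightly more awkward on Windows) and set its resetlogs information to match the other files: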

BBED> m /x 3c6b2b35
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  112 to  143           Dba:0x00c00002
------------------------------------------------------------------------
 3c6b2b35 386b2200 00000000 00000000 00000000 00000000 00004000 bb460000 

 <32 bytes per line>

BBED> set offset 116
        OFFSET          116

BBED> m /x 3137ca24
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  116 to  147           Dba:0x00c00002
------------------------------------------------------------------------
 3137ca24 00000000 00000000 00000000 00000000 00004000 bb460000 7dc12b35 

 <32 bytes per line>

BBED> m /x b9f8
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  484 to  515           Dba:0x00c00002
------------------------------------------------------------------------
 b9f8a424 00000000 e65e2435 01000000 d3410000 b89b0000 10000900 02000000 

 <32 bytes per line>

BBED> set offset +2
        OFFSET          486

BBED> m /x cc24
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  486 to  517           Dba:0x00c00002
------------------------------------------------------------------------
 cc240000 0000e65e 24350100 0000d341 0000b89b 00001000 09000200 00000000 

 <32 bytes per line>

BBED> m /x 87df offset 492
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  492 to  523           Dba:0x00c00002
------------------------------------------------------------------------
 87df2435 01000000 d3410000 b89b0000 10000900 02000000 00000000 00000000 

 <32 bytes per line>

BBED> 
BBED> m /x 2b35 offset +2
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  494 to  525           Dba:0x00c00002
------------------------------------------------------------------------
 2b350100 0000d341 0000b89b 00001000 09000200 00000000 00000000 00000000 

 <32 bytes per line>

BBED> d offset 140
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  140 to  171           Dba:0x00c00002
------------------------------------------------------------------------
 bb460000 7dc12b35 ba460000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> m /x 4248
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  140 to  171           Dba:0x00c00002
------------------------------------------------------------------------
 42480000 7dc12b35 ba460000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> d offset 148
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  148 to  179           Dba:0x00c00002
------------------------------------------------------------------------
 ba460000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> m /x 4148
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  148 to  179           Dba:0x00c00002
------------------------------------------------------------------------
 41480000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> sum apply
Check value for File 3, Block 2:
current = 0xd0c8, required = 0xd0c8

BBED> verify
DBVERIFY - Verification starting
FILE = SYSAUX01.dbf
BLOCK = 1


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
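
For readers unfamiliar with the hex strings passed to `m /x` in the session above: BBED writes the bytes exactly as typed, and on a little-endian platform such as this Windows host a multi-byte integer is stored least significant byte first. The sketch below is only an illustration of that byte-order conversion; the helper names are hypothetical, and which header field lives at which offset is not stated in the article, so no field names are assumed here.

```python
def le_hex_to_int(hex_bytes):
    """Read a hex string in on-disk (little-endian) order as an unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def int_to_le_hex(value, width=4):
    """Produce the on-disk (little-endian) hex string for an unsigned integer."""
    return value.to_bytes(width, "little").hex()


# The 4 bytes written at offset 116 in the session above.
scn_like = le_hex_to_int("3137ca24")
print(scn_like)                 # 617232177
print(int_to_le_hex(scn_like))  # 3137ca24
```

The decoded value, 617232177, is in the same range as the change number 617412726 that ORA-00279 asked for earlier, which is consistent with these bytes holding an SCN taken from the other file headers.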

After patching the file 3 header, recreate the control file once more, this time including file 3

Fri Oct 02 21:19:58 2015
Successful mount of redo thread 1, with mount id 1419797885
Completed:  CREATE CONTROLFILE REUSE DATABASE "orcl" NORESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8

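Continuing the recovery, the database opens normally and file 3 is online again. The data can now be exported directly, so the recovery is essentially complete.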

Recovering from ORA-01245 and ORA-01110

A friend contacted me: recover on his database was failing with ORA-01245 and ORA-01110, recovery could not proceed, and he asked for help.

SQL> recover database using backup controlfile until cancel;
…………

ERROR at line 1:
ORA-01245: offline file 1 will be lost if RESETLOGS is done
ORA-01110: data file 1: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\SYSTEM01.DBF'

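Checking the database with the Oracle Database Recovery Check script showed that datafile 1 was in OFFLINE status (screenshot: oracle_recovery_check). The alert log shows how it got there: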


Wed Aug 26 23:11:00 2015
alter database datafile 1 offline drop
Completed: alter database datafile 1 offline drop

This largely explains the ORA-01245: a datafile belonging to the SYSTEM tablespace had been taken offline.
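
Finding that entry by eye is easy in a short excerpt, but less so in a large alert log. As a rough sketch (the function name and regular expressions are my own, and they assume the alert log timestamp format shown in this article), the search can be automated like this:

```python
import re

# Command as it is echoed in the alert log; "Completed: ..." lines are skipped
# because the pattern is anchored at the start of the line.
OFFLINE_CMD = re.compile(r"^alter database datafile\s+(\d+)\s+offline\b", re.IGNORECASE)
# Timestamp lines such as "Wed Aug 26 23:11:00 2015".
TIMESTAMP = re.compile(r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) \w{3} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")


def offline_commands(alert_log):
    """Return (timestamp, file#) for each 'alter database datafile N offline' entry."""
    hits = []
    last_timestamp = None
    for raw in alert_log.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            last_timestamp = line
            continue
        match = OFFLINE_CMD.match(line)
        if match:
            hits.append((last_timestamp, int(match.group(1))))
    return hits


ALERT = """\
Wed Aug 26 23:11:00 2015
alter database datafile 1 offline drop
Completed: alter database datafile 1 offline drop
"""
print(offline_commands(ALERT))
```

Any hit for a file that belongs to the SYSTEM tablespace is worth investigating before attempting an open resetlogs.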

Redo information (screenshot: oracle_recovery_check_redo)

Mon Aug 24 22:38:35 2015
alter database clear unarchived logfile group 2
Clearing online log 2 of thread 1 sequence number 5705
Completed: alter database clear unarchived logfile group 2
Wed Aug 26 23:13:23 2015
alter database clear logfile group 3
Clearing online log 3 of thread 1 sequence number 5706
Completed: alter database clear logfile group 3

Every redo log except the current one has been cleared.

Attempting recovery

SQL> alter database datafile 1 online;

Database altered.

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


SQL> recover database using backup controlfile;
ORA-00279: change 63960710 generated at 08/23/2015 17:01:25 needed for thread 1
ORA-00289: suggestion :
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_5705_%U_.ARC
ORA-00280: change 63960710 for thread 1 is in sequence #5705


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG
ORA-00310: archived log contains sequence 5706; sequence 5705 required
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG'


SQL> recover database using backup controlfile;
ORA-00279: change 63960710 generated at 08/23/2015 17:01:25 needed for thread 1
ORA-00289: suggestion :
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_5705_%U_.ARC
ORA-00280: change 63960710 for thread 1 is in sequence #5705


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG'


SQL> recover database using backup controlfile;
ORA-00279: change 63960710 generated at 08/23/2015 17:01:25 needed for thread 1
ORA-00289: suggestion :
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_5705_%U_.ARC
ORA-00280: change 63960710 for thread 1 is in sequence #5705


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG
ORA-00310: archived log contains sequence 5707; sequence 5705 required
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG'

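Recovery needs the redo for sequence 5705, but that redo has been cleared, so the database can no longer be recovered by conventional means. The only option left is to use a hidden parameter to bypass the roll-forward (consistency check).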
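
The three attempts above can be summarised mechanically. The following is a minimal sketch, not part of the original troubleshooting: the function name is hypothetical, and the patterns key only on the two numbers in ORA-00310 and the quoted path in ORA-00334, so they should not depend on the NLS language of the message text.

```python
import re

# ORA-00310 carries two numbers: the sequence found in the log, then the
# sequence required. ORA-00334 carries the quoted log file name.
ORA_00310 = re.compile(r"ORA-00310:\D*(\d+)\D+(\d+)")
ORA_00334 = re.compile(r"ORA-00334:[^']*'([^']+)'")


def sequences_offered(output):
    """Map each log rejected with ORA-00310 to the sequence it actually holds."""
    offered = {}
    pending = None  # sequence taken from the most recent ORA-00310 line
    for line in output.splitlines():
        match = ORA_00310.search(line)
        if match:
            pending = int(match.group(1))
            continue
        match = ORA_00334.search(line)
        if match:
            if pending is not None:
                offered[match.group(1)] = pending
            pending = None
    return offered


SESSION = r"""
ORA-00310: archived log contains sequence 5706; sequence 5705 required
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG'
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG'
ORA-00310: archived log contains sequence 5707; sequence 5705 required
ORA-00334: archived log: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG'
"""

offered = sequences_offered(SESSION)
for path, seq in offered.items():
    print(path, seq)
print("sequence 5705 available:", 5705 in offered.values())
```

REDO03 holds 5706, REDO01 holds 5707, and REDO02 is empty after the clear, so none of the online logs can supply sequence 5705.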

Try opening the database again

ORACLE instance started.

Total System Global Area  778387456 bytes
Fixed Size                  1374808 bytes
Variable Size             486540712 bytes
Database Buffers          285212672 bytes
Redo Buffers                5259264 bytes
Database mounted.
SQL> recover database using backup controlfile;
ORA-00279: change 63960710 generated at 08/23/2015 17:01:25 needed for thread 1
ORA-00289: suggestion :
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_5705_%U_.ARC
ORA-00280: change 63960710 for thread 1 is in sequence #5705


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;

Database altered.

During database recovery, do not take SYSTEM tablespace datafiles offline. If you do, recovery will fail with ORA-01245 and ORA-01110, and the file will also end up in SYSOFF status.