ora-600 2037 ORA-7445 kcbs_dump_adv_state

有客户系统断电,导致数据库无法启动,让我们帮忙解决,通过分析主要是ORA-600 2037和ORA-7445 _kcbs_dump_adv_state等错误,通过人工recover解决.
数据库报ORA-03113,无法启动成功

C:\Documents and Settings\Administrator>sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on 星期五 5月 12 09:50:36 2017

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup
ORACLE 例程已经启动。

Total System Global Area 1258291200 bytes
Fixed Size                  1250548 bytes
Variable Size             218106636 bytes
Database Buffers         1031798784 bytes
Redo Buffers                7135232 bytes
数据库装载完毕。
ORA-03113: 通信通道的文件结束

分析alert日志

Fri May 12 09:50:43 2017
ALTER DATABASE OPEN
Fri May 12 09:50:43 2017
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Fri May 12 09:50:43 2017
Started redo scan
Fri May 12 09:50:43 2017
Completed redo scan
 1240 redo blocks read, 277 data blocks need recovery
Fri May 12 09:50:44 2017
Started redo application at
 Thread 1: logseq 5881, block 41179
Fri May 12 09:50:44 2017
Recovery of Online Redo Log: Thread 1 Group 1 Seq 5881 Reading mem 0
  Mem# 0 errs 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\xff\REDO01.LOG
Fri May 12 09:50:44 2017
Completed redo application
Fri May 12 09:50:44 2017
Errors in file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p006_6072.trc:
ORA-00600: internal error code, arguments: [6110], [193], [3], [], [], [], [], []

Fri May 12 09:50:44 2017
Hex dump of (file 3, block 14004) in trace file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p000_6024.trc
Corrupt block relative dba: 0x00c036b4 (file 3, block 14004)
Bad header found during crash/instance recovery
Data in bad block:
 type: 255 format: 7 rdba: 0x06010601
 last change scn: 0xa206.a2060601 seq: 0xb4 flg: 0x36
 spare1: 0x1 spare2: 0x6 spare3: 0x673
 consistency value in tail: 0x1b0a0708
 check value in block header: 0x36b4
 computed block checksum: 0xe4f5
Fri May 12 09:50:44 2017
Hex dump of (file 9, block 65507) in trace file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p003_6056.trc
Corrupt block relative dba: 0x0240ffe3 (file 9, block 65507)
Bad header found during crash/instance recovery
Data in bad block:
 type: 3 format: 6 rdba: 0x06020601
 last change scn: 0xa206.a2060602 seq: 0xe3 flg: 0xff
 spare1: 0x1 spare2: 0x6 spare3: 0x6dc
 consistency value in tail: 0xc1028001
 check value in block header: 0xffe3
 computed block checksum: 0xff01
Fri May 12 09:50:44 2017
Reread of rdba: 0x00c036b4 (file 3, block 14004) found different data
Fri May 12 09:50:44 2017
Reread of rdba: 0x0240ffe3 (file 9, block 65507) found different data
Fri May 12 09:50:44 2017
Errors in file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p005_6060.trc:
ORA-00600: internal error code,arguments:[2037],[17442602],[2718302723],[255],[9],[203],[657105414],[2147549568]
Fri May 12 09:50:44 2017
Errors in file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p000_6024.trc:
ORA-07445:exception encountered:core dump[ACCESS_VIOLATION][_kclcomplete+79][PC:0x72B0C7][ADDR:0x220][UNABLE_TO_READ][]
Fri May 12 09:50:44 2017
Errors in file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p006_6072.trc:
ORA-07445: exception encountered:core dump[ACCESS_VIOLATION][_kcbzdh+2496][PC:0x4A4928][ADDR:0xB][UNABLE_TO_READ][]
ORA-00600: internal error code, arguments: [6110], [193], [3], [], [], [], [], []
Errors in file e:\oracle\product\10.2.0\admin\xff\bdump\xff_p012_6128.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kcbs_dump_adv_state+723] 
                                 [PC:0x5975A3] [ADDR:0xCBC0CBB2] [UNABLE_TO_READ] []
ORA-00600:internal error code,arguments:[2037],[17430318],[2718303745],[128],[1],[203],[4147028486],[2147549568]

错误比较明显由于坏块导致应用日志恢复异常,主要错误集中在ORA-600 2037,ORA-7445 _kcbs_dump_adv_state,ORA-7445_kcbzdh,ORA-7445 _kclcomplete等

dbv检查数据文件

E:\>dbv file=E:\ORACLE\PRODUCT\10.2.0\ORADATA\xff\SYSAUX01.DBF

DBVERIFY: Release 10.2.0.1.0 - Production on 星期五 5月 12 09:57:39 2017

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - 开始验证: FILE = E:\ORACLE\PRODUCT\10.2.0\ORADATA\xff\SYSAUX01.DBF

页 13353 标记为损坏
Corrupt block relative dba: 0x00c03429 (file 3, block 13353)
Bad header found during dbv:
Data in bad block:
 type: 1 format: 6 rdba: 0x3429a206
 last change scn: 0x066f.066f3429 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x8c96
 consistency value in tail: 0x06018001
 check value in block header: 0x0
 block checksum disabled

页 14004 标记为损坏
Corrupt block relative dba: 0x00c036b4 (file 3, block 14004)
Bad header found during dbv:
Data in bad block:
 type: 1 format: 6 rdba: 0x36b4a206
 last change scn: 0x0673.067336b4 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xfb97
 consistency value in tail: 0x06010210
 check value in block header: 0x0
 block checksum disabled

页 15261 标记为损坏
Corrupt block relative dba: 0x00c03b9d (file 3, block 15261)
Bad header found during dbv:
Data in bad block:
 type: 2 format: 6 rdba: 0x3b9da206
 last change scn: 0x0673.06733b9d seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x0
 consistency value in tail: 0x06018001
 check value in block header: 0x5549
 block checksum disabled



DBVERIFY - 验证完成

检查的页总数: 58880
处理的页总数 (数据): 19318
失败的页总数 (数据): 0
处理的页总数 (索引): 18610
失败的页总数 (索引): 0
处理的页总数 (其它): 13747
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 7202
标记为损坏的总页数: 3
流入的页总数: 0
最高块 SCN            : 178325323 (0.178325323)


E:\>dbv file=E:\ORACLE\PRODUCT\10.2.0\ORADATA\xff\xff_BSE02

DBVERIFY: Release 10.2.0.1.0 - Production on 星期五 5月 12 10:10:24 2017

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - 开始验证: FILE = E:\ORACLE\PRODUCT\10.2.0\ORADATA\xff\xff_BSE02

页 65507 标记为损坏
Corrupt block relative dba: 0x0240ffe3 (file 9, block 65507)
Bad header found during dbv:
Data in bad block:
 type: 2 format: 6 rdba: 0xffe3a206
 last change scn: 0x06dc.06dcffe3 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xb32
 consistency value in tail: 0x060102ff
 check value in block header: 0x0
 block checksum disabled



DBVERIFY - 验证完成

检查的页总数: 1310720
处理的页总数 (数据): 34102
失败的页总数 (数据): 0
处理的页总数 (索引): 30270
失败的页总数 (索引): 0
处理的页总数 (其它): 10850
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 1235497
标记为损坏的总页数: 1
流入的页总数: 0
最高块 SCN            : 178325221 (0.178325221)

确实如alert日志报错,file 3和9 都出现坏块导致实例恢复无法进行。根据错误ORA-600 2037和ORA-7445 _kcbs_dump_adv_state,初步判断和During Startup (Open Database) Alert Log Shows ORA-600[2037] and ORA-7445[kcbs_dump_adv_state] (Doc ID 551993.1)文章描述相符(而且版本也相符)

尝试recover datafile部分file

E:\>sqlplus / as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on 星期五 5月 12 10:16:00 2017

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


连接到:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> recover datafile 1;
完成介质恢复。
SQL> recover datafile 2;
完成介质恢复。
SQL> recover datafile 3;
完成介质恢复。
SQL> recover datafile 4;
完成介质恢复。
SQL> recover datafile 9;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [kcratr1_lastbwr], [], [], [], [], [], [], []

ORA-00600 kcratr1_lastbwr错误比较明显,见ORA-00600:[Kcratr1_lastbwr] During Database Startup after a Crash (Doc ID 393984.1)

通过recover database处理

SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

然后通过查询dba_extents 处理坏块对象

补充ORA-600 2037错误

Format: ORA-600 [2037] [a] [b] 1 [d] [e] [f] [g]


VERSIONS:
  versions 8.0 and above

DESCRIPTION:

  During recovery we are examining a block to ensure that it is not
  corrupt prior to applying any change vectors.

  The block has failed this check and this exception is raised.

ARGUMENTS:
  Arg [a] Relative Data Block Address (RDBA) that the redo vector is for
  Arg [b] The Block format  
  Arg 1 RDBA in the block itself
  Arg [d] The block type
  Arg [e] The sequence number
  Arg [f] Flags, if set  
  Arg [g] The return value from the block head/tail checker.

MON_MODS$表ORA-600 13013报错处理

有朋友反馈数据库启动运行一点时间之后,然后就自动crash,让我们帮忙找原因,通过分析是由于smon进程触发ORA-600 13013导致数据库异常
alert日志报错信息

Thu Aug  4 18:39:44 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=33, OS id=22935
Thu Aug  4 18:39:44 2016
Completed: ALTER DATABASE OPEN
Thu Aug  4 18:39:44 2016
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Thu Aug  4 18:48:41 2016
Thread 1 advanced to log sequence 86746
  Current log# 3 seq# 86746 mem# 0: /opt/ora10/oradata/ora10g/redo03.log
Thu Aug  4 18:58:13 2016
Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_smon_22449.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [482], [4198075], [40], [4198075], [17], []
Thu Aug  4 18:58:56 2016
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 8 out of maximum 100 non-fatal internal errors.
Thu Aug  4 18:59:06 2016
Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_smon_22449.trc:
ORA-00600: internal error code, arguments: [13013], [5001], [482], [4198075], [40], [4198075], [17], []
Thu Aug  4 18:59:08 2016
Errors in file /opt/ora10/admin/ora10g/bdump/ora10g_pmon_22413.trc:
ORA-00474: SMON process terminated with error
Thu Aug  4 18:59:08 2016
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 22413

通过trace文件大概可以发现是由于ORA-600 13013错误导致数据库crash,而且这里有类似”SMON was doing flushing of monitored table stats”错误提示,根据经验,很可能是smon把表的dml操作收集信息相关.

ORA-600 [13013] 含义

ORA-600 [13013] [a] [b] {c} [d] [e] [f] 


This format relates to Oracle Server 8.0.3 to 10.1 

Arg [a] Passcount 
Arg [b] Data Object number 
Arg {c} Tablespace Relative DBA of block containing the row to be updated 
Arg [d] Row Slot number 
Arg [e] Relative DBA of block being updated (should be same as 1) 
Arg [f] Code 

根据这个错误信息,以及How to resolve ORA-00600 [13013], [5001] [ID 816784.1]中的描述

ORA-600 13013 对应对象

SQL> select object_name from dba_objects where object_id=482

OBJECT_NAME
--------------------------------------------------------------------------------
MON_MODS$

该对象正是和监控dml变化相关的表,smon会对其进行相关操作,以前写过一篇:MON_MODS$和MON_MODS_ALL$统计DML操作次数的文章
对于MON_MODS$表ORA-600 13013处理

SQL> analyze table mon_mods$ validate structure cascade;

analyze table mon_mods$ validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

 
SQL> select index_name from dba_indexes where table_name='MON_MODS$';

INDEX_NAME
------------------------------
I_MON_MODS$_OBJ

SQL> ALTER INDEX I_MON_MODS$_OBJ REBUILD;

Index altered.

SQL> analyze table mon_mods$ validate structure cascade;
analyze table mon_mods$ validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

SQL> CREATE TABLE MON_MODS_BAK AS SELECT * FROM MON_MODS$;

Table created.

SQL> SELECT COUNT(*) FROM MON_MODS$;

  COUNT(*)
----------
      1247

SQL> C/MON_MODS$/MON_MODS_BAK;
  1* SELECT COUNT(*) FROM MON_MODS_BAK
SQL> /

  COUNT(*)
----------
      1247

SQL> TRUNCATE TABLE MON_MODS$;

Table truncated.

SQL> INSERT INTO MON_MODS$ SELECT * fROM MON_MODS_BAK;

1247 rows created.

SQL> COMMIT;

Commit complete.

SQL>  analyze table mon_mods$ validate structure cascade;

Table analyzed.

自此关于MON_MODS$表相关的ORA-600 13013异常处理完全,当然也可以通过重建I_MON_MODS$_OBJ索引来解决,但是不能通过rebuild index解决.数据库也就不会因此而crash了.

ORA-00354 ORA-00353 ORA-00312异常处理

数据库启动报错
WIN平台oracle 9.2.0.6版本数据库redo log block header损坏,ORA-00354 ORA-00353 ORA-00312错误导致数据库无法启动

SQL >alter database open;
*
ERROR at line 1:

ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1892904 change 281470950178815
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\ZOYO\REDO03.LOG'
Sun Jan 24 15:44:05 2016
Database mounted in Exclusive Mode.
Completed: alter database mount exclusive
Sun Jan 24 15:44:05 2016
alter database open
Sun Jan 24 15:44:05 2016
Beginning crash recovery of 1 threads
Sun Jan 24 15:44:05 2016
Started redo scan
ORA-354 signalled during: alter database open...
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Sun Jan 24 15:44:32 2016
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

通过分析,确定损坏的redo03为当前redo,无法使用正常方法打开,加上_allow_resetlogs_corruption参数,尝试打开库,依旧失败

数据库报ORA-600 2662错误

Sun Jan 24 16:26:30 2016
SMON: enabling cache recovery
Sun Jan 24 16:26:30 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []

Sun Jan 24 16:26:31 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []

Sun Jan 24 16:26:31 2016
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 640
ORA-1092 signalled during: alter database open resetlogs...

ORA 600 2662的错误处理
根据经验,这个错误只需要推scn即可,可以通过bbed,隐含参数,event,oradebug,修改控制文件等方法进行,推scn之后,数据库报熟悉的ORA-00604 ORA-00607 ORA-600 4194错误,以前我们遇到的block大部分是128,这次报异常block为9.实际中跟版本有关系,在ORACLE 9.2.0.6中该错误为file 1 block 9.大部分版本为128

Sun Jan 24 16:29:39 2016
SMON: enabling cache recovery
Sun Jan 24 16:29:39 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []

Sun Jan 24 16:29:39 2016
Doing block recovery for fno: 1 blk: 401
Sun Jan 24 16:29:39 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Doing block recovery for fno: 1 blk: 9
Sun Jan 24 16:29:40 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Sun Jan 24 16:29:40 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 3432

ORA-00604 ORA-00607 ORA-600 4194分析trace文件

*** 2016-01-24 16:29:40.031
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
 
Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.01e112e1 seq: 0x01 flg: 0x04 tail: 0x12e10e01
frmt: 0x02 chkval: 0xba76 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47    
                  last map  0x00000000  #maps: 0      offset: 4128  
      Highwater::  0x00400191  ext#: 4      blk#: 0      ext size: 8     
  #blocks in seg. hdr's freelists: 0     
  #blocks below: 0     
  mapblk  0x00000000  offset: 4     
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7     
   0x00400011  length: 8     
   0x00400181  length: 8     
   0x00400189  length: 8     
   0x00400191  length: 8     
   0x00400199  length: 8     
  
  TRN CTL:: seq: 0x008e chd: 0x0060 ctl: 0x0024 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400191.008e.04 scn: 0x0000.01ded29c
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400191.008e.04 ext: 0x4  spc: 0x1c3e  
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334  
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788   
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0     
  TRN TBL::

从这里可以确定undo segment header中的分配block记录有问题,清除ktuxc.fbp.fbp[N].kuba.kdba相关记录,数据库正常打开

    . struct ktuxc  kernel transaction undo xaction table control with 15 members
    . {
    .   struct kscn   scn with 3 members
    .   {
04148     ub4           bas      = 0X9CD2DE01 = 31380124
04152     ub2           wrp      = 0X0000 = 0
04154     cc32          pad      = 0X0000 = 0
    .   }
    .   struct kuba   uba with 4 members
    .   {
04156     kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04160     ub2           seq      = 0X8E00 = 142
04162     ub1           rec      = 0X04 = 4
04163     cc16          pad      = 0X00 = 0
    .   }
04164   sb2           flg      = 0X0100 = 1
04166   ub2           seq      = 0X8E00 = 142
04168   sb2           nfb      = 0X0100 = 1
04170   cc32          pad1     = 0X0000 = 0
04172   ub4           inc      = 0X00000000 = 0
04176   sb2           chd      = 0X6000 = 96
04178   sb2           ctl      = 0X2400 = 36
04180   ub2x          mgc      = 0X0280 = 0x8002
04182   ub2           ver      = 0X0100 = 1
04184   ub2           xts      = 0X6800 = 104
04186   cc32          pad2     = 0X0000 = 0
04188   ub4           opt      = 0XFEFFFF7F = 2147483646
    .   ktufb fbp[5] (array with 5 elements)
    .     struct fbp   [0] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04192         kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04196         ub2           seq      = 0X8E00 = 142
04198         ub1           rec      = 0X04 = 4
04199         cc16          pad      = 0X00 = 0
    .       }
04200       sb2           ext      = 0X0400 = 4
04202       sb2           spc      = 0X3E1C = 7230
    .     }
    .     struct fbp   [1] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04204         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04208         ub2           seq      = 0X2F00 = 47
04210         ub1           rec      = 0X21 = 33
04211         cc16          pad      = 0X00 = 0
    .       }
04212       sb2           ext      = 0X0500 = 5
04214       sb2           spc      = 0X3413 = 4916
    .     }
    .     struct fbp   [2] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04216         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04220         ub2           seq      = 0X2E00 = 46
04222         ub1           rec      = 0X37 = 55
04223         cc16          pad      = 0X00 = 0
    .       }
04224       sb2           ext      = 0X0400 = 4
04226       sb2           spc      = 0X8807 = 1928
    .     }
    .     struct fbp   [3] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04228         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04232         ub2           seq      = 0X0000 = 0
04234         ub1           rec      = 0X00 = 0
04235         cc16          pad      = 0X00 = 0
    .       }
04236       sb2           ext      = 0X0000 = 0
04238       sb2           spc      = 0X0000 = 0
    .     }
    .     struct fbp   [4] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04240         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04244         ub2           seq      = 0X0000 = 0
04246         ub1           rec      = 0X00 = 0
04247         cc16          pad      = 0X00 = 0
    .       }
04248       sb2           ext      = 0X0000 = 0
04250       sb2           spc      = 0X0000 = 0
    .     }
    . }
Sun Jan 24 16:44:52 2016
SMON: enabling tx recovery
Sun Jan 24 16:44:52 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN

ORA-01115 ORA-01110 ORA-27067故障恢复案例

接到朋友恢复请求,aix 5.3,Oracle 10.2.0.1平台,数据库启动报ORA-01115 ORA-01110 ORA-27067错误,数据库无法正常打开

Mon Aug 10 13:25:22 2015
ALTER DATABASE   MOUNT
Mon Aug 10 13:25:29 2015
Setting recovery target incarnation to 1
Mon Aug 10 13:25:29 2015
Successful mount of redo thread 1, with mount id 432339141
Mon Aug 10 13:25:29 2015
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Aug 10 13:25:36 2015
alter database open
Mon Aug 10 13:25:36 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 13:25:37 2015
Started redo scan
Mon Aug 10 13:25:52 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:25:53 2015
Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc:
ORA-01115: IO error reading block from file 2 (block # 40704)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1572864
Mon Aug 10 13:25:53 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
Mon Aug 10 13:25:53 2015
Beginning crash recovery of 1 threads
Mon Aug 10 13:25:53 2015
Started redo scan
Mon Aug 10 13:26:09 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:26:12 2015
Aborting crash recovery due to error 1115
Mon Aug 10 13:26:12 2015
Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc:
ORA-01115: IO error reading block from file 2 (block # 39077)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1310720
ORA-1115 signalled during: alter database open...

这里报的前面两个错误ORA-01115 ORA-01110我们都非常熟悉,类似数据库启动遇到坏块或者io错误之时可能就会报如此错误。但是ORA-27067确实不多见,从mos上看,很多是由于rman备份之时的bug可能导致该错误。

dbv检测undo坏块文件

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015

Copyright (c) 1982, 2003, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 329600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 327504
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2096
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1887888 (0.1887888)

这里可以看到,undo文件本身并没有逻辑和物理的坏块,证明因为数据库异常的原因,可能是由于ORA-27067: size of I/O buffer is invalid导致。根据官方文档ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH中的解释,我们基本上可以确定很可能是由于10.2.0.1在aix平台的jfs2系统中,由于大量事务操作,突然abort掉数据库(也可能断电),从而数据库在启动的时候进行实例恢复,而由于内部的bug,导致实例恢复无法成功。通过我们处理后的,数据库完美启动,数据0丢失

数据库启动日志

Mon Aug 10 16:34:14 2015
alter database open
Mon Aug 10 16:34:14 2015
Beginning crash recovery of 1 threads
parallel recovery started with 15 processes
Mon Aug 10 16:34:14 2015
Started redo scan
Mon Aug 10 16:34:27 2015
Completed redo scan
7889582 redo blocks read, 0 data blocks need recovery
Mon Aug 10 16:34:27 2015
Started redo application at
Thread 1: logseq 664704, block 1286922
Mon Aug 10 16:34:27 2015
Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0
Mem# 0 errs 0: /dev/rredo04
Mon Aug 10 16:34:32 2015
Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0
Mem# 0 errs 0: /dev/rredo05
Mon Aug 10 16:34:38 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0
Mem# 0 errs 0: /dev/rredo06
Mon Aug 10 16:34:40 2015
Completed redo application
Mon Aug 10 16:34:40 2015
Completed crash recovery at
Thread 1: logseq 664706, block 1017805, scn 8554793334
0 data blocks read, 0 data blocks written, 7889582 redo blocks read
Mon Aug 10 16:34:40 2015
Thread 1 advanced to log sequence 664707
Thread 1 opened at log sequence 664707
Current log# 1 seq# 664707 mem# 0: /dev/rredo01
Successful open of redo thread 1
Mon Aug 10 16:34:40 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 10 16:34:40 2015
SMON: enabling cache recovery
Mon Aug 10 16:34:40 2015
Successfully onlined Undo Tablespace 1.
Mon Aug 10 16:34:40 2015
SMON: enabling tx recovery
Mon Aug 10 16:34:41 2015
Database Characterset is ZHS32GB18030
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Mon Aug 10 16:34:41 2015
SMON: Parallel transaction recovery tried
Mon Aug 10 16:34:42 2015
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon Aug 10 16:34:42 2015
Completed: alter database open
[/sql]

win中创建控制文件出现ORA-01565 ORA-27041 OSD-04002

oracle 在win平台上创建控制文件可能会出现ORA-01565 ORA-27041 OSD-04002错误

C:\Users\feicheng>sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期六 9月 13 16:20:38 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options



SQL> startup nomount;
ORACLE 例程已经启动。

Total System Global Area  400846848 bytes
Fixed Size                  2281656 bytes
Variable Size             188747592 bytes
Database Buffers          201326592 bytes
Redo Buffers                8491008 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\惜分飞\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
*
第 1 行出现错误:
ORA-01503: CREATE CONTROLFILE ??
ORA-01565: ???? 'D:\???\USERS01.DBF' ???
ORA-27041: ??????
OSD-04002: ????????????
O/S-Error: (OS 123) ????????????????????????????????

alert日志对应错误提示为

Sat Sep 13 16:27:48 2014
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = D:\oracle
Sat Sep 13 16:28:11 2014
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\???\USERS01.DBF'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Errors in file D:\ORACLE\diag\rdbms\xff\xff\trace\xff_ora_8136.trc:
ORA-01565: ???? 'D:\???\USERS01.DBF' ???
ORA-27041: ??????
OSD-04002: 无法打开文件
O/S-Error: (OS 123) 文件名、目录名或卷标语法不正确。
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\???\USERS01.DBF'
CHARACTER SET ZHS16GBK

ORA-01565 ORA-27041 OSD-04002的含义大致为:在创建控制文件的时候,有数据文件无法不存在.
另外在alert日志里面也可以看到,sqlplus中的”D:\惜分飞\USERS01.DBF”变为了”D:\???\USERS01.DBF”导致无法定位到数据文件,从而在创建数据文件之时出现ORA-01565 ORA-27041 OSD-04002错误.
解决放方法:
1.创建控制文件语句中不含中文

C:\Users\feicheng>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期六 9月 13 16:32:09 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> STARTUP NOMOUNT
ORACLE 例程已经启动。

Total System Global Area  400846848 bytes
Fixed Size                  2281656 bytes
Variable Size             188747592 bytes
Database Buffers          201326592 bytes
Redo Buffers                8491008 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\xifenfei\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;

控制文件已创建。

2.设置nls_lang为american_america.ZHS16GBK

C:\Users\feicheng>set NLS_LANG=american_america.ZHS16GBK

C:\Users\feicheng>sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Sat Sep 13 16:29:04 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
 13    'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
 14    'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
 15    'D:\惜分飞\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;

Control file created.

此时alert日志提示

Sat Sep 13 16:29:22 2014
CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\惜分飞\USERS01.DBF'
CHARACTER SET ZHS16GBK
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Sat Sep 13 16:29:25 2014
Successful mount of redo thread 1, with mount id 3507744098
Completed: CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 'D:\ORACLE\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\ORACLE\ORADATA\XFF\SYSTEM01.DBF',
  'D:\ORACLE\ORADATA\XFF\SYSAUX01.DBF',
  'D:\ORACLE\ORADATA\XFF\UNDOTBS01.DBF',
  'D:\惜分飞\USERS01.DBF'
CHARACTER SET ZHS16GBK

通过此实验简单说明:在oracle使用该过程中,尽可能少用中文路径或者文件名