ORA-600 999

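A user's database reported ORA-600 [999] at startup and could not be opened normally. We were asked to analyze the problem and help recover part of the data.
Database startup reports ORA-600 [999]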

Sun Jul 31 23:09:36 2016
SMON: enabling cache recovery
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_3356.trc  (incident=179779):
ORA-00600: internal error code, arguments: [999], [0x7FFAE748013], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_179779\orcl_smon_3356_i179779.trc
No Resource Manager plan active
Starting background process QMNC
Sun Jul 31 23:09:37 2016
QMNC started with pid=20, OS id=5068 
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (7, 1).
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_3356.trc:
ORA-00600: internal error code, arguments: [999], [0x7FFAE748013], [], [], [], [], [], [], [], [], [], []
Completed: alter database open
Sun Jul 31 23:09:38 2016
db_recovery_file_dest_size of 8680 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Trace dumping is performing id=[cdmp_20160731230939]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_3356.trc  (incident=179785):
ORA-00600: internal error code, arguments: [999], [0x7FFAE748013], [], [], [], [], [], [], [], [], [], []
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (7, 1).
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_3356.trc:
ORA-00600: internal error code, arguments: [999], [0x7FFAE748013], [], [], [], [], [], [], [], [], [], []
Sun Jul 31 23:09:41 2016
Starting background process CJQ0
Sun Jul 31 23:09:41 2016
CJQ0 started with pid=25, OS id=2572 
Process debug not enabled via parameter _debug_enable
Trace dumping is performing id=[cdmp_20160731230942]
PMON (ospid: 3948): terminating the instance due to error 474
Sun Jul 31 23:09:48 2016
opiodr aborting process unknown ospid (2592) as a result of ORA-1092
Sun Jul 31 23:09:48 2016
ORA-1092 : opitsk aborting process
Sun Jul 31 23:09:52 2016
Instance terminated by PMON, pid = 3948

After setting _offline_rollback_segments, the database opens normally

Sun Jul 31 23:18:13 2016
ALTER DATABASE OPEN
Thread 1 opened at log sequence 16
  Current log# 1 seq# 16 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 5.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4372.trc  (incident=182188):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38508], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_182188\orcl_smon_4372_i182188.trc
Doing block recovery for file 3 block 224
Resuming block recovery (PMON) for file 3 block 224
Block recovery from logseq 16, block 2945 to scn 15431544
Recovery of Online Redo Log: Thread 1 Group 1 Seq 16 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Trace dumping is performing id=[cdmp_20160731231815]
Block recovery stopped at EOT rba 16.2952.16
Block recovery completed at rba 16.2952.16, scn 0.15431543
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4372.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38508], [], [], [], [], [], [], [], []
Sun Jul 31 23:18:19 2016
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4372.trc  (incident=182189):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38508], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_182189\orcl_smon_4372_i182189.trc
Starting background process QMNC
Sun Jul 31 23:18:19 2016
QMNC started with pid=20, OS id=4920 
Doing block recovery for file 3 block 224
Resuming block recovery (PMON) for file 3 block 224
Block recovery from logseq 16, block 2945 to scn 15431544
Recovery of Online Redo Log: Thread 1 Group 1 Seq 16 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Block recovery completed at rba 16.2952.16, scn 0.15431545
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4372.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38508], [], [], [], [], [], [], [], []
Starting background process SMCO
Sun Jul 31 23:18:19 2016
SMCO started with pid=21, OS id=3176 
Sun Jul 31 23:18:20 2016
Trace dumping is performing id=[cdmp_20160731231820]
Completed: ALTER DATABASE OPEN

Attempting to drop the damaged rollback segment

Sun Jul 31 23:15:07 2016
drop rollback segment "_SYSSMU7_1101470402$"
Sun Jul 31 23:15:07 2016
Corrupt Block Found
         TSN = 2, TSNAME = UNDOTBS1
         RFN = 3, BLK = 224, RDBA = 12583136
         OBJN = -1, OBJD = -1, OBJECT = , SUBOBJECT = 
         SEGMENT OWNER = , SEGMENT TYPE = 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_5300.trc  (incident=181035):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [3], [224], [38508], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_181035\orcl_ora_5300_i181035.trc
Doing block recovery for file 3 block 224
Resuming block recovery (PMON) for file 3 block 224
Block recovery from logseq 14, block 8682 to scn 15397854
Recovery of Online Redo Log: Thread 1 Group 2 Seq 14 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Block recovery completed at rba 14.8688.16, scn 0.15397855
ORA-607 signalled during: drop rollback segment "_SYSSMU7_1101470402$"...
Corrupt Block Found
         TSN = 2, TSNAME = UNDOTBS1
         RFN = 3, BLK = 224, RDBA = 12583136
         OBJN = -1, OBJD = -1, OBJECT = , SUBOBJECT = 
         SEGMENT OWNER = , SEGMENT TYPE = 

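From this we can confirm that file 3 block 224 is corrupt, which is why dropping the rollback segment fails. This resembles the case documented on MOS, where a corrupt undo block causes the database to report ORA-600 [999].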
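
The file and block numbers appear in three places in the logs above: the kdBlkCheckError arguments, the "Doing block recovery" line, and the RDBA in the "Corrupt Block Found" report. The sketch below cross-checks them. It assumes the classic smallfile RDBA layout (relative file number in the high 10 bits, block number in the low 22 bits); the function names are hypothetical helpers, not Oracle APIs.

```python
import re

# Assumption: smallfile tablespace, so an RDBA packs a 10-bit relative
# file number above a 22-bit block number.
BLOCK_BITS = 22
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def encode_rdba(rfn, block):
    """Pack a relative file number and block number into an RDBA."""
    return (rfn << BLOCK_BITS) | (block & BLOCK_MASK)


def decode_rdba(rdba):
    """Split an RDBA into (relative file number, block number)."""
    return rdba >> BLOCK_BITS, rdba & BLOCK_MASK


def ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line."""
    return [arg for arg in re.findall(r"\[([^\]]*)\]", line) if arg]


if __name__ == "__main__":
    line = ("ORA-00600: internal error code, arguments: "
            "[kdBlkCheckError], [3], [224], [38508], [], [], [], []")
    name, file_no, block_no, check_code = ora600_args(line)
    print(name, file_no, block_no, check_code)
    print(decode_rdba(12583136))  # (3, 224)
```

Decoding RDBA 12583136 gives file 3, block 224, which matches both the ORA-600 arguments and the block recovery messages. In this database the absolute and relative file numbers happen to coincide.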

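ORA-600 [999] error information on MOS
The ORA-600 [999] case documented by Oracle is likewise caused by a corrupt undo block and is essentially the same as the error in this article.
[Image: ORA-600-999]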


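Because only part of the data was needed, we simply masked the rollback segment so that the database no longer crashed, and then exported the required objects.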

Finding the ASM file number behind a data file alias in an ASM disk group

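Recently several friends asked me how, when using amdu, to find the ASM file_number of a data file whose name is not in OMF format, so that amdu can recover data files from a disk group that cannot be mounted. This post gives a method based on a test. As we understand ASM, ASM file number 6 holds the alias directory records. By using kfed to analyze the records in its AUs, we can obtain the ASM file number that each data file alias points to.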

Simulating various aliases

D:\app\product\10.2.0\db_1\bin>sqlplus / as sysdba

SQL*Plus: Release 10.2.0.3.0 - Production on Wed Jul 27 22:48:48 2016

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DATA/ora10g/datafile/system.256.914797317
+DATA/ora10g/datafile/undotbs1.258.914797317
+DATA/ora10g/datafile/sysaux.257.914797317
+DATA/ora10g/datafile/users.259.914797317

SQL> create tablespace xifenfei
  2  datafile '+data/xifenfei01.dbf' size 10M;

Tablespace created.

SQL> alter tablespace xifenfei add
  2  datafile '+data/ora10g/datafile/xifenfei02.dbf' size 10m;

Tablespace altered.

SQL> alter tablespace xifenfei add
  2  datafile '+data/ora10g/xifenfei03.dbf' size 10m;

Tablespace altered.

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DATA/ora10g/datafile/system.256.914797317
+DATA/ora10g/datafile/undotbs1.258.914797317
+DATA/ora10g/datafile/sysaux.257.914797317
+DATA/ora10g/datafile/users.259.914797317
+DATA/xifenfei01.dbf
+DATA/ora10g/datafile/xifenfei02.dbf
+DATA/ora10g/xifenfei03.dbf

7 rows selected.

Examining the disk group and alias information

SQL> select name from v$asm_disk;

NAME
------------------------------
DATA_0000
DATA_0001

SQL> select path from v$asm_disk;

PATH
-----------------------------------------
H:\ASMDISK\ASMDISK1.DD
H:\ASMDISK\ASMDISK2.DD

SQL> SELECT NAME,FILE_NUMBER FROM V$ASM_ALIAS where file_number<>4294967295;

NAME                           FILE_NUMBER
------------------------------ -----------
SYSTEM.256.914797317                   256
SYSAUX.257.914797317                   257
UNDOTBS1.258.914797317                 258
USERS.259.914797317                    259
XIFENFEI.266.918341361                 266
XIFENFEI.267.918341389                 267
xifenfei02.dbf                         267
XIFENFEI.268.918341409                 268
Current.260.914797381                  260
group_1.261.914797385                  261
group_2.262.914797385                  262
group_3.263.914797387                  263
TEMP.264.914797393                     264
spfile.265.914797421                   265
spfileora10g.ora                       265
xifenfei03.dbf                         268
xifenfei01.dbf                         266

17 rows selected.

SQL> SELECT NAME,FILE_NUMBER FROM V$ASM_ALIAS;

NAME                           FILE_NUMBER
------------------------------ -----------
ORA10G                          4294967295
DATAFILE                        4294967295
SYSTEM.256.914797317                   256
SYSAUX.257.914797317                   257
UNDOTBS1.258.914797317                 258
USERS.259.914797317                    259
XIFENFEI.266.918341361                 266
XIFENFEI.267.918341389                 267
xifenfei02.dbf                         267
XIFENFEI.268.918341409                 268
CONTROLFILE                     4294967295
Current.260.914797381                  260
ONLINELOG                       4294967295
group_1.261.914797385                  261
group_2.262.914797385                  262
group_3.263.914797387                  263
TEMPFILE                        4294967295
TEMP.264.914797393                     264
PARAMETERFILE                   4294967295
spfile.265.914797421                   265
spfileora10g.ora                       265
xifenfei03.dbf                         268
xifenfei01.dbf                         266

23 rows selected.

From the SQL queries we can confirm the file numbers behind xifenfei0n.dbf: xifenfei01.dbf==>266, xifenfei02.dbf==>267, xifenfei03.dbf==>268
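
The mapping above can also be derived mechanically from the V$ASM_ALIAS rows. The sketch below assumes that system-generated names end in ".<file number>.<incarnation>" and that directory entries carry FILE_NUMBER 4294967295, as in the query output above. The `classify` helper is hypothetical and only illustrates the idea.

```python
import re

DIRECTORY_MARKER = 4294967295  # 0xffffffff, shown for directory entries

# System-generated ASM names end in ".<file number>.<incarnation>".
SYSTEM_NAME = re.compile(r"^(?P<tag>.+)\.(?P<file>\d+)\.(?P<incarnation>\d+)$")


def classify(rows):
    """Split (name, file_number) rows into system names and user aliases."""
    system, aliases = {}, {}
    for name, file_number in rows:
        if file_number == DIRECTORY_MARKER:
            continue  # a directory such as ORA10G or DATAFILE
        match = SYSTEM_NAME.match(name)
        if match and int(match.group("file")) == file_number:
            system[file_number] = name
        else:
            aliases[name] = file_number
    return system, aliases


if __name__ == "__main__":
    rows = [
        ("ORA10G", 4294967295),
        ("XIFENFEI.266.918341361", 266),
        ("XIFENFEI.267.918341389", 267),
        ("xifenfei02.dbf", 267),
        ("XIFENFEI.268.918341409", 268),
        ("xifenfei03.dbf", 268),
        ("xifenfei01.dbf", 266),
    ]
    system, aliases = classify(rows)
    for alias, number in sorted(aliases.items()):
        print(alias, "==>", number, system[number])
```

This only works while the ASM instance can mount the disk group. The kfed steps that follow reach the same mapping without a mounted disk group.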

Locating ASM file 6 with kfed

www.xifenfei.com>kfed read H:\ASMDISK\ASMDISK1.DD |grep f1b1
kfdhdb.f1b1locn:                      2 ; 0x0d4: 0x00000002
kfdhdb.f1b1fcn.base:                  0 ; 0x100: 0x00000000
kfdhdb.f1b1fcn.wrap:                  0 ; 0x104: 0x00000000

www.xifenfei.com>kfed read H:\ASMDISK\ASMDISK1.DD aun=2 blkn=6|grep kfffde|more
kfffde[0].xptr.au:                   26 ; 0x4a0: 0x0000001a
kfffde[0].xptr.disk:                  0 ; 0x4a4: 0x0000
kfffde[0].xptr.flags:                 0 ; 0x4a6: L=0 E=0 D=0 S=0
kfffde[0].xptr.chk:                  48 ; 0x4a7: 0x30
kfffde[1].xptr.au:           4294967295 ; 0x4a8: 0xffffffff
kfffde[1].xptr.disk:              65535 ; 0x4ac: 0xffff

From this we can confirm that the alias directory occupies a single AU, located at disk 0, AU 26 (0x1a).
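
Reading the extent pointers by eye works for one file. The sketch below does the same from captured kfed output. It treats an `xptr.au` of 4294967295 as an unused slot, as seen in kfffde[1] above, and assumes 1 MiB allocation units when converting an AU number to a byte offset. The AU size is an assumption and should be confirmed against kfdhdb.ausize in the disk header. The helper names are hypothetical.

```python
import re

UNUSED_AU = 0xFFFFFFFF   # 4294967295 marks an unused extent pointer
AU_SIZE = 1024 * 1024    # Assumption: 1 MiB allocation units

XPTR = re.compile(r"^kfffde\[(\d+)\]\.xptr\.(au|disk):\s+(\d+)\s*;")


def extents(kfed_lines):
    """Return [(disk, au)] for every used direct extent pointer."""
    slots = {}
    for line in kfed_lines:
        match = XPTR.match(line.strip())
        if match:
            index, field, value = match.groups()
            slots.setdefault(int(index), {})[field] = int(value)
    used = []
    for _, slot in sorted(slots.items()):
        if "au" in slot and "disk" in slot and slot["au"] != UNUSED_AU:
            used.append((slot["disk"], slot["au"]))
    return used


def au_byte_offset(au, au_size=AU_SIZE):
    """Byte offset of an allocation unit from the start of the disk."""
    return au * au_size


if __name__ == "__main__":
    sample = [
        "kfffde[0].xptr.au:                   26 ; 0x4a0: 0x0000001a",
        "kfffde[0].xptr.disk:                  0 ; 0x4a4: 0x0000",
        "kfffde[1].xptr.au:           4294967295 ; 0x4a8: 0xffffffff",
        "kfffde[1].xptr.disk:              65535 ; 0x4ac: 0xffff",
    ]
    print(extents(sample))      # [(0, 26)]
    print(au_byte_offset(26))   # 27262976
```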
Analyzing the aliases with kfed

www.xifenfei.com>kfed read H:\ASMDISK\ASMDISK1.DD aun=26 |more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 11 ; 0x002: KFBTYP_ALIASDIR
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 6 ; 0x008: file=6
kfbh.check: 1563703526 ; 0x00c: 0x5d3438e6
kfbh.fcn.base: 3461 ; 0x010: 0x00000d85
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kffdnd.bnode.incarn: 1 ; 0x000: A=1 NUMM=0x0
kffdnd.bnode.frlist.number: 4294967295 ; 0x004: 0xffffffff
kffdnd.bnode.frlist.incarn: 0 ; 0x008: A=0 NUMM=0x0
kffdnd.overfl.number: 4294967295 ; 0x00c: 0xffffffff
kffdnd.overfl.incarn: 0 ; 0x010: A=0 NUMM=0x0
kffdnd.parent.number: 0 ; 0x014: 0x00000000
kffdnd.parent.incarn: 1 ; 0x018: A=1 NUMM=0x0
kffdnd.fstblk.number: 0 ; 0x01c: 0x00000000
kffdnd.fstblk.incarn: 1 ; 0x020: A=1 NUMM=0x0
kfade[0].entry.incarn: 1 ; 0x024: A=1 NUMM=0x0
kfade[0].entry.hash: 2080305534 ; 0x028: 0x7bfef17e
kfade[0].entry.refer.number: 1 ; 0x02c: 0x00000001
kfade[0].entry.refer.incarn: 1 ; 0x030: A=1 NUMM=0x0
kfade[0].name: ORA10G ; 0x034: length=6
kfade[0].fnum: 4294967295 ; 0x
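
Each kfade[n] entry in the alias directory block pairs a name with a file number, so the alias-to-file mapping can be collected from the kfed text. The sketch below is a minimal, hypothetical parser for that. The kfade[0] lines are taken from the dump above. The kfade[1] lines are invented for illustration, modeled on the V$ASM_ALIAS output, and are not part of the original dump.

```python
import re

DIRECTORY_FNUM = 0xFFFFFFFF  # 4294967295: the entry is a directory

KFADE = re.compile(r"^kfade\[(\d+)\]\.(name|fnum):\s+(\S+)\s*;")


def alias_entries(kfed_lines):
    """Return [(name, fnum)] for each entry that has both fields."""
    entries = {}
    for line in kfed_lines:
        match = KFADE.match(line.strip())
        if match:
            index, field, value = match.groups()
            entries.setdefault(int(index), {})[field] = value
    pairs = []
    for _, entry in sorted(entries.items()):
        if "name" in entry and "fnum" in entry:
            pairs.append((entry["name"], int(entry["fnum"])))
    return pairs


def file_aliases(kfed_lines):
    """Keep only entries that point at a real ASM file number."""
    return {name: fnum for name, fnum in alias_entries(kfed_lines)
            if fnum != DIRECTORY_FNUM}


if __name__ == "__main__":
    sample = [
        "kfade[0].name: ORA10G ; 0x034: length=6",
        "kfade[0].fnum: 4294967295 ; 0x",
        # Hypothetical entry, not present in the truncated dump above.
        "kfade[1].name: xifenfei01.dbf ; 0x000: length=14",
        "kfade[1].fnum: 266 ; 0x000: 0x0000010a",
    ]
    print(file_aliases(sample))  # {'xifenfei01.dbf': 266}
```

Once the file number is known, it can be given to amdu's extract option in the form <diskgroup>.<file number>.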

_OFFLINE_ROLLBACK_SEGMENTS _CORRUPTED_ROLLBACK_SEGMENTS

When recovering from Oracle undo problems, the _OFFLINE_ROLLBACK_SEGMENTS and _CORRUPTED_ROLLBACK_SEGMENTS parameters are often needed. This post explains the difference between the two.
_OFFLINE_ROLLBACK_SEGMENTS parameter description
[Image: _offline_rollback_segments]


_CORRUPTED_ROLLBACK_SEGMENTS parameter description
[Image: _corrupted_rollback_segments]


Difference between _OFFLINE_ROLLBACK_SEGMENTS and _CORRUPTED_ROLLBACK_SEGMENTS
[Image: offline_corrupted]


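Both are Oracle hidden parameters. Use them with caution if you do not have Oracle Support behind you. They can leave the database logically inconsistent, so if you have used them, it is recommended to rebuild the database through a logical export and import.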

Index of the Oracle ASM article series - www.xifenfie.com

To make the ASM material easier to browse, today (July 28, 2016) I compiled an index of the ASM articles. For other ASM questions, you can reach me by mobile (17813235971) or QQ (107644445).
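Accessing ASM remotely
Common ASMCMD commands
Basic ASM administration (1)
Basic ASM administration (2)
Retrieving data files from ASM
Modifying data in ASM with bbed
Migrating from ASM to a file system
Migrating a regular database to ASM storage
Copying files out of ASM with dd
Configuring Oracle ASM disks
Setting disk group permissions in ASM
Monitoring ASM disk performance
create spfile to asm
pvid=yes prevents ASM from mounting - ASM recovery case
ASM data file migration (os->asm)
ASM data file migration (asm->asm)
ASM data file migration (asm->os)
Recovering an ASM disk formatted as NTFS - ASM recovery case
Copying files from ASM via ftp/http
ASM DISK HEADER backup and restore
Manually repairing a damaged ASM DISK HEADER
Recovering from a completely destroyed asm disk header - ASM recovery case
Retrieving data files with dd when ASM fails to start
Recovering an asm disk formatted as an ext4 file system - ASM recovery case
ORACLE 12C ASM new feature: shared password file
Backing up ASM metadata with md_backup and md_restore
Unrecognized partition prevents the asm diskgroup from mounting - ASM recovery case
Using losetup to turn an ordinary Linux file into an asm disk
Recommendations for backing up and preserving the scene before Oracle recovery - ASM environments
root.sh fails in a multi-CPU environment, ASM reports ORA-04031
asmlib failure reports ORA-00600[kfklLibFetchNext00]
Improper asm sga_target setting prevents 11gr2 rac from starting
Recovering an asm diskgroup that cannot mount after pvid was mistakenly set on an asm disk - ASM recovery case
Using _asm_allow_only_raw_disks to use ordinary files as asm disks
ORACLE 12C RAC: renaming the disk group that holds ocr/votedisk/asm spfile
Repairing a damaged asm disk header from its automatic backup
A recovery with zero data loss after oracleasm createdisk re-created an asm disk - ASM recovery case
ADHU (ASM Disk Header Utility) - asm disk header backup and restore tool
How to Get the Contents of an Spfile on ASM when ASM/GRID is down
ORA-15042: ASM disk “N” is missing from group number “M” failure recovery - ASM recovery case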

ORA-600 [4193]

ORA-600 4193 explanation

ERROR:              

  Format: ORA-600 [4193] [a] [b]

VERSIONS:           
  versions 6.0 to 12.1

DESCRIPTION:        

  A mismatch has been detected between Redo records and Rollback (Undo) 
  records.

  We are validating the Undo block sequence number in the undo block against 
  the Redo block sequence number relating to the change being applied.

  This error is reported when this validation fails.

ARGUMENTS:
  Arg [a] Undo record seq number
  Arg [b] Redo record seq number

FUNCTIONALITY:
  KERNEL TRANSACTION UNDO





ORA-600 [4193] [a] [b] [ ] [ ]  [ ]        
Versions: 7.2.2  - 9.2.0                              Source: ktuc.c
===========================================================================
Meaning: seq# mismatch while adding an undo record to an undo block. This 
         is done by the application of redo. 
---------------------------------------------------------------------------
Argument Description:

    a. (ktubhseq): undo record seq# - this is the seq# of the block that 
                                      this undo record WILL BE APPLIED TO. 
                                      This is from the Undo Block. It is 
                                      NOT the seq# of the undo block itself.
                                      
    b. (ktudbseq): redo RECORD seq# - this is the seq# number in the block 
                                      that this redo WILL BE APPLIED TO. 
                                      This is from the Redo Record. 

---------------------------------------------------------------------------
Diagnosis:

    This error is raised in kturdb which handles the adding of undo records 
    by the application of redo. 
    
    When we try to apply redo to an undo block (forward changes are made by 
    the application of redo to a block) we check that the seq# in the undo 
    record matches the seq# in the redo record. These seq# should be the 
    same because when we apply a redo record we must apply it to the 
    correct version of the block. We can only apply a redo record to a 
    block that contains the same seq# as in the redo record. 

    If the seq# do not match then this error is raised. This implies some 
    kind of block corruption in either the redo or the undo block. 

7.3.x - 8.1.7.x
ASSERT2(ubh->ktubhseq == db->ktudbseq, OERI(4193), KSESVSGN,
            ubh->ktubhseq, db->ktudbseq);
9.2.x
ksesic2(OERI(4193), ksenrg(ubh->ktubhseq), ksenrg(db->ktudbseq));

struct ktubh
{
  kxid  ktubhxid;      /* txid of tx currently using or last used this block */
  ub2   ktubhseq;                              /* undo block sequence number */
  ub1   ktubhcnt;    /* high water mark record index, number of undo entries */
  ub1   ktubhirb;  /* rollback record index, rec index to start the rollback */
  ub1   ktubhicl;  /* collecting record index, rec index to start retrieving col info */
  ub1   ktubhflg;                                                 /* dummy */
  ub2   ktubhidx[1];     /* byte offset of record in block, grows at runtime */
};

struct ktudb   /* Kernel Transaction Undo Data operation Block (redo) */
{
  ub2    ktudbsiz;                                          /* size of entry */
  ub2    ktudbspc;                 /* verification: space left in undo block */
  ub2    ktudbflg;            /* flag to indicate the kind of redo operation */
  kxid   ktudbxid;                                          /* current tx id */
  ub2    ktudbseq;                                  /* block sequence number */
  ub1    ktudbrec;                       /* new record index for this change */
};
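
The check described in the Diagnosis section reduces to comparing two sequence numbers before a redo record is applied to an undo block. The toy model below only illustrates that comparison. The class and function names are hypothetical, and it is not a reproduction of Oracle's kturdb logic.

```python
class SeqMismatch(Exception):
    """Toy stand-in for ORA-600 [4193] [a] [b]."""

    def __init__(self, undo_seq, redo_seq):
        super().__init__(f"ORA-600 [4193] [{undo_seq}] [{redo_seq}]")
        self.undo_seq = undo_seq   # argument [a], modeled on ktubhseq
        self.redo_seq = redo_seq   # argument [b], modeled on ktudbseq


def validate_seq(ktubhseq, ktudbseq):
    """Raise when the undo block and the redo record disagree on seq#."""
    if ktubhseq != ktudbseq:
        raise SeqMismatch(ktubhseq, ktudbseq)


if __name__ == "__main__":
    validate_seq(10, 10)           # matching seq#: the redo may be applied
    try:
        validate_seq(10, 11)       # mismatch: the block version is wrong
    except SeqMismatch as err:
        print(err)                 # ORA-600 [4193] [10] [11]
```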

ORA-600 [4193] is handled the same way as described in How to resolve ORA-600 [4194] errors