联系:手机(13429648788) QQ(107644445)
链接:https://www.orasos.com/ora-6002037%e4%b8%8eora-07445kcbs_dump_adv_state%e9%94%99%e8%af%af.html
标题:ORA-600[2037]与ORA-07445[kcbs_dump_adv_state]错误
作者:惜分飞©版权所有[文章允许转载,但必须以链接方式注明源地址,否则追究法律责任.]
一台win oracle 数据库,重启后发现数据库无法访问,检查发现是Bug 4899479,但是oracle未提供完整的解决方法,这里根据自己对于数据库启动过程的理解,通过屏蔽前滚和回滚,拉起来数据库
数据库版本平台信息
ORACLE:11.1.0.7 OS:WIN 2008 R2 X64
数据库启动报错
Tue Apr 16 12:36:31 2013 alter database open Beginning crash recovery of 1 threads parallel recovery started with 7 processes Started redo scan Completed redo scan 28878 redo blocks read, 7353 data blocks need recovery Started redo application at Thread 1: logseq 7960, block 14132 Recovery of Online Redo Log: Thread 1 Group 1 Seq 7960 Reading mem 0 Mem# 0: D:\APP\SDWLJG-DB101\ORADATA\WLJG\REDO01.LOG Tue Apr 16 12:36:32 2013 RECOVERY OF THREAD 1 STUCK AT BLOCK 915068 OF FILE 9 Hex dump of (file 9, block 1698691) in trace file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p001_1500.trc Corrupt block relative dba: 0x0259eb83 (file 9, block 1698691) Bad header found during crash/instance recovery Data in bad block: type: 0 format: 0 rdba: 0x0000a206 last change scn: 0x2359.0259eb83 seq: 0xf7 flg: 0x0b spare1: 0x0 spare2: 0x0 spare3: 0x601 consistency value in tail: 0x02c10243 check value in block header: 0x0 block checksum disabled Reread of rdba: 0x0259eb83 (file 9, block 1698691) found valid data Slave exiting with ORA-1172 exception Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p001_1500.trc: ORA-01172: recovery of thread 1 stuck at block 915068 of file 9 ORA-01151: use media recovery to recover block, restore backup if needed Tue Apr 16 12:36:32 2013 Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p003_4088.trc (incident=187558): ORA-00600: internal error code, arguments: [2037], [12619645], [41474], [6], [1], [247], [12619645], [0], [], [], [], [] Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\incident\incdir_187558\wljg_p003_4088_i187558.trc ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+1352] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x16BFD20] [UNABLE_TO_READ] [] ORA-00600: internal error code, arguments: [2037], [12619645], [41474], [6], [1], [247], [12619645], [0], [], [], [], [] Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\incident\incdir_187559\wljg_p003_4088_i187559.trc Errors in file c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p006_1216.trc (incident=187567):
这里提示file 9 block 915068异常,但是通过dbv检查发现file 9无任何坏块.
trace文件内容
Dump continued from file: c:\app\sdwljg-db101\diag\rdbms\wljg\wljg\trace\wljg_p003_4088.trc
ORA-00600: internal error code, arguments: [2037], [12620930], [41474], [2], [1], [247], [12619645], [0], [], [], [], []
** DBGRL Error: ARB Alert Log
** DBGRL Error: <msg time='2013-04-16T11:05:58.522+08:00' org_id='oracle' comp_id='rdbms'
 msg_id='dbgexProcessError:1097:3370026720' type='TRACE' level='16'
 host_id='SDWLSCJG-DB' host_addr='172.18.1.15'>
 <txt>Incident details in: c:\app\sdwljg-db101\diag\rdbms\wljg\wlj
========= Dump for incident 129879 (ORA 600 [2037]) ========
*** 2013-04-16 11:05:58.522
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst1()+111        CALL???  skdstdst()+0         000000000 000000000 01CFC9B80
                                                   000000200
ksedst()+63          CALL???  ksedst1()+0          000000005 021B00600 005D30C80
                                                   000002004
dbkedDefDump()+1012  CALL???  ksedst()+0           000000000 000000000 000000000
                                                   000000000
ksedmp()+51          CALL???  dbkedDefDump()+0     000000003 000000002 021AF92C0
                                                   000405038
__PGOSF184_ksfdmp()  CALL???  ksedmp()+0           000000000 000000000 000000000
+27                                                27F00000000
dbgexPhaseII()+266   CALL???  __PGOSF184_ksfdmp()  00000000D 0082FAE50 000000000
                              +0                   000000004
dbgexProcessError()  CALL???  dbgexPhaseII()+0     021B00600 021AFCA50 000000201
+1313                                              000000000
dbgeExecuteForError  CALL???  dbgexProcessError()  021B00600 021B07590 000000001
()+55                         +0                   000000000
dbgePostErrorKGE()+  CALL???  dbgeExecuteForError  021AFCA30 021AFCA80 00000002E
1608                          ()+0                 000000005
dbkePostKGE_kgsf()+  CALL???  dbgePostErrorKGE()+  01CFC99D0 021B0E080 000000258
65                            0                    021B0E080
kgeade()+556         CALL???  dbkePostKGE_kgsf()+  000002000 000000000 000000009
                              0                    000000004
kgeriv_int()+105     CALL???  kgeade()+0           3A4F00000003 000C09482
                                                   0FFFFFFFF 000000000
kgeriv()+27          CALL???  kgeriv_int()+0       3A9A024E0 000000000 01CFC9410
                                                   000000000
kgesiv()+102         CALL???  kgeriv()+0           0000008D5 0000008C3 021AFD9A0
                                                   000AFDC73
ksesic7()+125        CALL???  kgesiv()+0           006371F20 000000007 27F912000
                                                   200000004
kcoexam()+248        CALL???  ksesic7()+0          2000007F5 000000000 000C09482
                                                   000000000
kcbtema()+2154       CALL???  kcoexam()+0          27FFC22C8 39E113470 3A940BBB8
                                                   000000000
kcrpap()+355         CALL???  kcbtema()+0          27FFC22C8 28BFC2628 000000000
                                                   021B10200
kcrpdv()+1655        CALL???  kcrpap()+0           021B101A0 000000002 000000004
                                                   000000512
kxfprdp()+1384       CALL???  kcrpdv()+0           3A7AD3098 000000000 00000000C
                                                   00757CF00
opirip()+1396        CALL???  kxfprdp()+0          00000001E 005CDB518 021AFF9E0
                                                   000000000
opidrv()+855         CALL???  opirip()+0           000000032 000000004 021AFFD30
                                                   000000000
sou2o()+52           CALL???  opidrv()+213         000000032 000000004 021AFFD30
                                                   021AFFDB0
opimai_real()+295    CALL???  sou2o()+0            000000000 7FEFD9819B5
                                                   000000000 000000000
opimai()+96          CALL???  opimai_real()+0      000000000 000000000 000000000
                                                   000000000
BackgroundThreadSta  CALL???  opimai()+0           021AFFE98 000000001 000000000
rt()+695                                           000000000
00000000775AF56D     CALL???  BackgroundThreadSta  00A26B7A0 000000000 000000000
                              rt()+0               000000000
0000000077923281     CALL???  00000000775AF560     000000000 000000000 000000000
                                                   000000000
 
--------------------- Binary Stack Dump ---------------------
查询mos发现During Startup (Open Database) Alert Log Shows ORA-600[2037] and ORA-7445[kcbs_dump_adv_state] [ID 551993.1]和我们这里展示的错误相符,引起该问题的原因主要是因为:The database may crash and fail to open due to undo/redo corruption if you are using distributed transactions.因为使用分布式事务的时候,数据库crash导致undo/redo corruption,从而使得数据库无法正常启动.
故障处理思路
因为通过数据库alert日志可以知道,数据库是在做前滚的时候并发进程失败,设置fast_start_parallel_rollback=false,禁止数据库实例恢复并发,可以恢复依然失败.因为前滚过不去,那就通过设置隐含参数禁止数据库前滚,在open数据库的过程中发现ora-600[2662]错误,推进scn,继续open数据库发现ora-600[4194],通过设置undo管理模式,屏蔽事务,屏蔽回滚段等方法,终于重新open库并重建undo,然后重建库算是完成恢复任务
 
	        
During Startup (Open Database) Alert Log Shows ORA-600[2037] and ORA-7445[kcbs_dump_adv_state] [ID 551993.1]
Applies to: Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3 - Release: 10.2 to 10.2 Information in this document applies to any platform. ***Checked for relevance on 22-Feb-2012*** Symptoms The database may crash and fail to open due to undo/redo corruption if you are using distributed transactions. The following error may be seen in the alert log: ORA-00600: internal error code, arguments: [2037] The call stack can look like this: kcoexam kcbtema kcrpap kcrpdv kxfprdp ... In addition, the following error may also be seen in the alert log: ORA-07445: exception encountered: core dump [kcbs_dump_adv_state] Cause The problem is caused by Bug 4899479 Undo/redo corruption if distributed transactions used. Details: Redo/undo corruption and associated dumps and/or ORA-600 errors/memory corruption can occur from a transaction which uses "in memory undo" pool memory if the transaction also includes distributed operations. e.g.: Insert using buffered inserts, distributed operation, subsequent insert can lead to this problem if the first insert uses in-memory undo. Solution To resolve the corruption, you will need to do a point in time recovery from a good backup. To prevent the bug from occurring, you may choose either of these options: 1. Apply the 10.2.0.4 patchset which was not available at the time of this writing (Feb 2008). 2. Check MetaLink for backport Patch 4899479 for your operating system and database release.