open数据库报ora-600 kdsgrp1故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:open数据库报ora-600 kdsgrp1故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户服务器异常关机之后,数据库无法正常启动,查看alert日志发现如下报ORA-600 kdsgrp1报错信息导致数据库无法正常open

2025-12-05T16:17:23.315325+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
stopping change tracking
Undo initialization recovery: err:0 start: 7620046 end: 7620062 diff: 16 ms (0.0 seconds)
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc  (incident=1243378):
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\his\his\incident\incdir_1243378\his_ora_9672_i1243378.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2025-12-05T16:17:25.341421+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2025-12-05T16:17:25.533576+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc  (incident=1243379):
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\his\his\incident\incdir_1243379\his_ora_9672_i1243379.trc
2025-12-05T16:17:27.007101+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2025-12-05T16:17:27.008097+08:00
Undo initialization online undo segments: err:600 start: 7620062 end: 7623343 diff: 3281 ms (3.3 seconds)
2025-12-05T16:17:27.008097+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc:
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
2025-12-05T16:17:27.008097+08:00
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc:
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc  (incident=1243380):
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-01092: ORACLE 实例终止。强制断开连接
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\his\his\incident\incdir_1243380\his_ora_9672_i1243380.trc
2025-12-05T16:17:28.540363+08:00
opiodr aborting process unknown ospid (9672) as a result of ORA-603

尝试人工启动数据库,报错比较明显也是ORA-600 kdsgrp1错误

C:\Users\Administrator>sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Fri Dec 5 16:45:48 2025
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> recover datafile 1;
Media recovery complete.
SQL> recover database;
Media recovery complete.
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 10800
Session ID: 5025 Serial number: 34108

查看报错trace文件

*** CLIENT DRIVER:(SQL*PLUS) 2025-12-05T16:17:23.924647+08:00
 
[TOC00000]
Jump to table of contents
Dump continued from file: D:\APP\ADMINISTRATOR\diag\rdbms\his\his\trace\his_ora_9672.trc
[TOC00001]
ORA-00600: 内部错误代码, 参数: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []

[TOC00001-END]
[TOC00002]
========= Dump for incident 1243378 (ORA 600 [kdsgrp1]) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
kdsDumpState: cdb: 0 dspdb: 0 type: 1 info: 0x5 flag: 0x0
kdsDumpState: sample type: 0 layer: 0
* kdsgrp1-1: ***********************************************
            row 0x00c02754.24 continuation at: 0x00c02754.24 file# 3 block# 10068 slot 36 not found (dscnt: 0)
 state kdscurrid points to 0x00c28206.17
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 36 ..... nrows: 60
Dumping kcb descriptor:
kcbds 0x1e08fe71510: pdb 0, tsn 1, rdba 0x00c02754, afn 3, objd 11331, cls 1, tidflg 0x8 0x80 0x0

在这里基本上可以确认报ORA-600 错误的是file 3 block 10068,dataobj 为11331,根据经验初步怀疑是sysaux中的wrh$_某个对象异常.对启动过程做10046进一步确认

EXEC #2063975471112:c=17341,e=35163,p=54,cr=283,cu=0,mis=1,r=0,dep=1,og=4,plh=794648223,tim=9717774784
WAIT #2063975471112: nam='db file sequential read' ela= 115 file#=3 block#=11434 blocks=1 obj#=11579 tim=9717774961
WAIT #2063975471112: nam='db file sequential read' ela= 93 file#=3 block#=11435 blocks=1 obj#=11579 tim=9717775098
WAIT #2063975471112: nam='db file sequential read' ela= 89 file#=3 block#=11436 blocks=1 obj#=11579 tim=9717775217
WAIT #2063975471112: nam='db file sequential read' ela= 98 file#=3 block#=11437 blocks=1 obj#=11579 tim=9717775361
WAIT #2063975471112: nam='db file sequential read' ela= 23087 file#=3 block#=11438 blocks=1 obj#=11579 tim=9717798471
WAIT #2063975471112: nam='db file sequential read' ela= 83 file#=3 block#=11439 blocks=1 obj#=11579 tim=9717798681
WAIT #2063975471112: nam='db file sequential read' ela= 183 file#=3 block#=10075 blocks=1 obj#=11332 tim=9717799100
WAIT #2063975471112: nam='db file sequential read' ela= 124 file#=3 block#=10078 blocks=1 obj#=11332 tim=9717799291
WAIT #2063975471112: nam='db file sequential read' ela= 115 file#=3 block#=165702 blocks=1 obj#=11332 tim=9717799460
WAIT #2063975471112: nam='db file sequential read' ela= 109 file#=3 block#=83355 blocks=1 obj#=11332 tim=9717799621
WAIT #2063975471112: nam='db file sequential read' ela= 139 file#=3 block#=165703 blocks=1 obj#=11332 tim=9717799816
WAIT #2063975471112: nam='db file sequential read' ela= 139 file#=3 block#=10079 blocks=1 obj#=11332 tim=9717800074
WAIT #2063975471112: nam='db file sequential read' ela= 112 file#=3 block#=165696 blocks=1 obj#=11332 tim=9717800248
WAIT #2063975471112: nam='db file sequential read' ela= 110 file#=3 block#=165701 blocks=1 obj#=11332 tim=9717800502
WAIT #2063975471112: nam='db file sequential read' ela= 107 file#=3 block#=83354 blocks=1 obj#=11332 tim=9717800665
WAIT #2063975471112: nam='db file sequential read' ela= 109 file#=3 block#=83356 blocks=1 obj#=11332 tim=9717800823
WAIT #2063975471112: nam='db file sequential read' ela= 107 file#=3 block#=83358 blocks=1 obj#=11332 tim=9717800978
WAIT #2063975471112: nam='db file sequential read' ela= 106 file#=3 block#=165698 blocks=1 obj#=11332 tim=9717801133
WAIT #2063975471112: nam='db file sequential read' ela= 106 file#=3 block#=164358 blocks=1 obj#=11331 tim=9717801298
WAIT #2063975471112: nam='db file sequential read' ela= 108 file#=3 block#=10068 blocks=1 obj#=11331 tim=9717801509
2025-12-05T16:52:21.382064+08:00
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []

FETCH #2063975471112:c=1599462,e=1655487,p=20,cr=26,cu=0,mis=0,r=0,dep=1,og=4,plh=794648223,tim=9719430286
STAT #2063975471112 id=1 cnt=0 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=0 pr=0 pw=0 str=1 time=8 us)'
STAT #2063975471112 id=2 cnt=0 pid=1 pos=1 obj=0 op='HASH JOIN  (cr=0 pr=0 pw=0 str=1 time=5 us cost=9 size=34 card=1)'
STAT #2063975471112 id=3 cnt=1 pid=2 pos=1 obj=11579 op='TABLE ACCESS FULL WRM$_SNAPSHOT
 (cr=12 pr=6 pw=0 str=1 time=23966 us cost=4 size=17 card=1)'
STAT #2063975471112 id=4 cnt=1 pid=3 pos=1 obj=0 op='SORT AGGREGATE (cr=6 pr=3 pw=0 str=1 time=23513 us)'
STAT #2063975471112 id=5 cnt=1 pid=4 pos=1 obj=11579 op='TABLE ACCESS FULL WRM$_SNAPSHOT
(cr=6 pr=3 pw=0 str=1 time=23497 us cost=4 size=13 card=1)'
STAT #2063975471112 id=6 cnt=6 pid=2 pos=2 obj=11331 op='TABLE ACCESS BY INDEX ROWID BATCHED WRH$_UNDOSTAT
(cr=13 pr=13 pw=0 str=1 time=2455 us cost=4 size=102 card=6)'
STAT #2063975471112 id=7 cnt=7 pid=6 pos=1 obj=11332 op='INDEX SKIP SCAN WRH$_UNDOSTAT_PK
(cr=12 pr=12 pw=0 str=1 time=2304 us cost=2 size=0 card=6)'
2025-12-05T16:52:23.005928+08:00
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []

<error barrier> at 0x000000275BED14D0 placed dbsdrv.c@4959
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
<error barrier> at 0x000000275BED14D0 placed dbsdrv.c@4959
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []

到这里基本上可以确认是WRH$_UNDOSTAT_PK去找WRH$_UNDOSTAT中午对应记录,从而出现该问题,对应的具体sql为

PARSING IN CURSOR #2063975471112 len=233 dep=1 uid=0 oct=3 lid=0 tim=9717739563
  hv=517686153 ad='1e1efda1f20' sqlid='d4m7ss0gdqhw9'
select max(maxconcurrency) from sys.wrh$_undostat where instance_number = :1 and dbid = :2 and snap_id in  
(select snap_id from dba_hist_snapshot where end_interval_time >(select max(end_interval_time)-7 from dba_hist_snapshot))

BINDS #2063975471112:

 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0000 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=8e8646e8  bln=22  avl=02  flg=05
  value=1
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=0000 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=8e864700  bln=22  avl=06  flg=01
  value=971429521

对于这种问题,处理起来相对比较简单,绕过oracle open过程对该sql的执行,然后open库对该表的index进行rebuild

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 3
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 3
Enter value for block_id: 10068
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 10068 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1

OWNER
--------------------------------------------------------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME
------------------ ------------------------------
PARTITION_NAME
--------------------------------------------------------------------------------
SYS
WRH$_UNDOSTAT
TABLE              SYSAUX



SQL> select index_name from dba_indexes where table_name='WRH$_UNDOSTAT';

INDEX_NAME
--------------------------------------------------------------------------------
WRH$_UNDOSTAT_PK

SQL> alter index WRH$_UNDOSTAT_PK REBUILD ONLINE;

Index altered.

至此该数据库完成恢复任务,重启之后一切正常.