An 11.2.0.4 RAC database running on AIX had one node crash suddenly, with the LMS2 process reporting ORA-600 [kghstack_underflow_internal_2]. The alert log shows:
Thu Aug 03 18:43:16 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc (incident=761244):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761244/xff2_lms2_2884404_i761244.trc
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc (incident=761245):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11AB5BBF0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_761245/xff2_lms2_2884404_i761245.trc
Thu Aug 03 18:43:19 2023
Dumping diagnostic data in directory=[cdmp_20230803184319], requested by (instance=2, osid=2884404 (LMS2)), summary=[incident=761245].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Aug 03 18:43:23 2023
Sweep [inc][761245]: completed
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
Sweep [inc][761244]: completed
Sweep [inc2][761245]: completed
Sweep [inc2][761244]: completed
Thu Aug 03 18:43:29 2023
Errors in file /u01/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms2_2884404.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
LMS2 (ospid: 2884404): terminating the instance due to error 484
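As a rough aid for reading an alert log like the one above, the following Python sketch counts the distinct ORA-00600 argument pairs and picks out the process that terminated the instance. The function name `summarize_alert` and the `SAMPLE` text are illustrative assumptions, not part of any Oracle tooling. The sample lines are copied from the log above, and the regular expressions only cover the line shapes seen there.

```python
import re
from collections import Counter

# Captures the first two bracketed arguments of an ORA-00600 line.
ORA600_RE = re.compile(
    r"ORA-00600: internal error code, arguments: \[([^\]]*)\], \[([^\]]*)\]"
)
# Captures process name, OS pid and error number from the termination line.
TERMINATE_RE = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)",
    re.MULTILINE,
)


def summarize_alert(text):
    """Return (Counter of (arg1, arg2) pairs, terminator tuple or None)."""
    counts = Counter(ORA600_RE.findall(text))
    match = TERMINATE_RE.search(text)
    return counts, (match.groups() if match else None)


# Hypothetical sample: three lines copied from the alert log excerpt.
SAMPLE = """\
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11074D658], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x11AB5BBF0], [], [], [], [], [], [], [], [], [], []
LMS2 (ospid: 2884404): terminating the instance due to error 484
"""

counts, terminator = summarize_alert(SAMPLE)
for (code, address), hits in counts.items():
    print(code, address, hits)
print(terminator)
```

Grouping by the first two arguments makes it easy to see that every incident carries the same internal code, with only the address argument varying.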
Analysis of the Call Stack Trace section in the trace file:
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+40 bl 0000000109B3EE38 000000000 ? 000000001 ?
000000003 ? 000000000 ?
000000000 ? 000000001 ?
000000003 ? 000000000 ?
ksedst1()+112 call skdstdst() 1777D9901C4FD34D ?
4840284100000000 ?
FFFFFFFFFFECE20 ?
2A501377F67A7 ? 10A742204 ?
000000000 ? 1107486C0 ?
2050033FFFECE28 ?
ksedst()+40 call ksedst1() FFFFFFFFFFFE0002 ?
0000060F1 ? 000000001 ?
10A46AD18 ? 000000000 ?
000000000 ? 000002004 ?
000000001 ?
dbkedDefDump()+1516 call ksedst() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 300000003 ?
ksedmp()+72 call dbkedDefDump() 3107486C0 ? 110000A28 ?
FFFFFFFFFFED630 ? 1106ABC70 ?
100125778 ? FFFFFFFFFFED5B0 ?
FFFFFFFFFFEDA30 ? 1106ABC70 ?
ksfdmp()+100 call ksedmp() 000000002 ? 000000000 ?
000000002 ? 10AF71A68 ?
10A0720F8 ? 000000000 ?
1108EC608 ? 1107486C0 ?
dbgexPhaseII()+1904 call ksfdmp() FFFFFFFFFFFE0002 ?
0000060F1 ? 000000002 ?
000000000 ? 000000002 ?
10A0720F0 ? 000000000 ?
001050005 ?
dbgexProcessError() call dbgexPhaseII() 1107486C0 ? 1108EFB28 ?
+1556 0000B9D9D ? 200000000 ?
FFFFFFFFFFEE548 ? 000000104 ?
FFFFFFFFFFEDBB0 ?
FB400000000 ?
dbgeExecuteForError call dbgexProcessError() 1107486C0 ? 1108EC608 ?
()+72 100000000 ? 000000000 ?
FFFFFFFFFFF29E0 ?
2840288000000012 ?
10013DA4C ? 1108EE350 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 000000002 ? 000000128 ?
2044 () FFFFFFFFFFFE0002 ?
215265335E5162 ?
3726000000000001 ?
10A46AD18 ? 10A46CB00 ?
FFFFFFFFFFF1D30 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 000000001 ? 10A46AD18 ?
68 25800000000 ? 109E7A740 ?
000000000 ? 000000038 ?
FFFFFFFFFFF2800 ? 11AB1AC50 ?
kgeadse()+380 call dbkePostKGE_kgsf() 900000000512C74 ?
9001000A008DAD0 ? 000000000 ?
9001000A008DAD0 ?
8000000FFFF2C40 ?
7000147E8F28C98 ? 400000008 ?
1100054A0 ?
kgerinv_internal()+ call kgeadse() 7FFFFFFFFFFFFFFF ?
48 FFFFFFFFFFFEF8FF ?
000000019 ? 110476528 ?
000000001 ? 000000017 ?
00000000B ? 000000000 ?
kgerinv()+48 call kgerinv_internal() FFFFFFFFFFFEF8FF ?
FFFFFFFFFFFFFFFF ?
FFFFFFFFFFFFFFFF ?
7FFFFFFFFFFFFFFF ?
1001648E0 ? FFFFFFFFFFF25E0 ?
1106ABC70 ? 11073B3C0 ?
kgeasnmierr()+72 call kgerinv() 000000000 ? 215265335E5162 ?
372600383A0F5000 ?
000000004 ? 10A328F7C ?
FFFFFFFFFFF2898 ? 000000002 ?
0FFFFFFFF ?
kghstack_underflow_ call kgeasnmierr() 11AB967A0 ? 000000000 ?
internal()+280 FFFFFFFFFFF2860 ? 100000001 ?
000000002 ? 11AB5BBF0 ?
000000000 ? 11AB96778 ?
kghstack_free()+716 call kghstack_underflow_ 10A328F7C ? 110A2FEC0 ?
internal() 000000004 ? 000000000 ?
000000000 ? 000000000 ?
000000080 ? 80000000000000 ?
ktudda()+912 call kghstack_free() 11AB5BBF0 ? 7215265335E5162 ?
3726000000000008 ?
000000102 ? 109E747E0 ?
FFFFFFFFFFF2A90 ? 000000048 ?
28408880FFFFFFFF ?
kcbtdu()+1636 call ktudda() 70001383A0F4014 ? 000000000 ?
1FE800000000 ? 07F7F7F7F ?
FFFFFFFF80808080 ?
000000000 ? 000000030 ?
FFFFFFFFFFF2B30 ?
kcbzdh()+3200 call kcbtdu() 35900000359 ? 100000001 ?
000000001 ? 200000001 ?
000000001 ? 00000005D ?
200066665D20 ? 000000000 ?
kcbzpnd()+504 call kcbzdh() 70001383F6D64B8 ? 000002004 ?
2107486C0 ? 10A74269E ?
1107486C0 ? FFFFFFFFFFF3B30 ?
FFFFFFFFFFF38E0 ? 000000000 ?
kcbdnb()+724 call kcbzpnd() 10A74267C ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 0001CE860 ?
000000000 ? 000000000 ?
dbkedDefDump()+5528 call kcbdnb() 200000000 ? 000000000 ?
000000000 ? 000000000 ?
1100224D0 ? 000000018 ?
110001366 ? 000000000 ?
ksedmp()+72 call dbkedDefDump() 3107486C0 ? 110000A28 ?
FFFFFFFFFFF3FC0 ? 1106ABC70 ?
100125778 ? 000000000 ?
FFFFFFFFFFF3FB0 ? 1106ABC70 ?
ksfdmp()+100 call ksedmp() 000000002 ? 000000000 ?
000000002 ? 10AF71A68 ?
10A0720F8 ? 000000000 ?
1109DE650 ? 1107486C0 ?
dbgexPhaseII()+1904 call ksfdmp() 11074B65C ? 000000001 ?
000000002 ? 000000000 ?
000000002 ? 10A0720F0 ?
000000000 ? 001050005 ?
dbgexProcessError() call dbgexPhaseII() 1107486C0 ? 1109DC860 ?
+1556 0000B9D9C ? 200000000 ?
FFFFFFFFFFF4ED8 ? 000000082 ?
FFFFFFFFFFF4560 ?
88A4422A00000000 ?
dbgeExecuteForError call dbgexProcessError() 1107486C0 ? 1109DE650 ?
()+72 100000000 ? 000000000 ?
000000000 ? 000000000 ?
0DFFFFFFF ? 1109E0398 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 00000000A ? 000000000 ?
2044 () 000000001 ? 000000001 ?
000000000 ? 000000000 ?
FFFFFFFFFFFB4E0 ? 000000000 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 000000000 ? FFFFFFFFFFF96B0 ?
68 2580000000A ? 109E7A740 ?
000000000 ? 000000000 ?
FFFFFFFFFFF9190 ? 11AB1AC50 ?
kgeadse()+380 call dbkePostKGE_kgsf() 000000001 ? 000000008 ?
000000000 ? 10A30EA38 ?
110000C20 ? 700014771160D68 ?
700014772ADB3A8 ? 000000001 ?
kgerinv_internal()+ call kgeadse() 000000003 ? 000000000 ?
48 11074B65C ? 000000001 ?
000000000 ? FFFFFFFFFFF96B0 ?
00000000A ? 000000001 ?
kgerinv()+48 call kgerinv_internal() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
kgeasnmierr()+72 call kgerinv() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
FFFFFFFFFFF92B0 ?
48102840FFFFA5B0 ?
11AB5BBB8 ? 11074D658 ?
kghstack_underflow_ call kgeasnmierr() 022028200 ? 022202820 ?
internal()+280 11AB5BBB8 ? 100000001 ?
000000002 ? 11074D658 ?
0442C2394 ? 000002000 ?
kghstack_free()+716 call kghstack_underflow_ FFFFFFFFFFF92B0 ?
internal() FFFFFFFFFFF95B8 ?
FFFFFFFFFFF92B0 ? 000000001 ?
FFFFFFFFFFF92B0 ?
FFFFFFFFFFF95E8 ?
FFFFFFFFFFF95B8 ? 11074B650 ?
ktundo()+924 call kghstack_free() 0DEADBEEF ? 11074D668 ?
11074B654 ? 300000000 ?
1FFFFB4E0 ? FFFFFFFFFFFB4E0 ?
FFFFFFFFFFF94C0 ?
FFFFFFFFFFF9470 ?
kturCRBackoutOneChg call ktundo() 19FFFFB5E0 ?
()+848 494CEDB3FFFF9E50 ?
FFFFFFFFFFF9E48 ? 000000000 ?
000000000 ? FFFFFFFFFFFA5B0 ?
100000000 ? FFFFFFFFFFFB4E0 ?
ktrgcm()+5816 call kturCRBackoutOneChg FFFFFFFFFFFA5B0 ?
() 19FFFFA440 ?
FFFFFFFFFFFA5B8 ? 000000000 ?
1FFFFA478 ? FFFFFFFFFFFB4E0 ?
000000000 ? 000000000 ?
ktrget3()+832 call ktrgcm() FFFFFFFFFFFAC80 ? 000000000 ?
000000000 ? 000000003 ?
058F7501F ? 000000001 ?
000000004 ? 000000003 ?
ktrget2()+104 call ktrget3() 000000002 ? 700000000014488 ?
7000147E9C41A50 ? 000000022 ?
110A123A0 ? 000000000 ?
FFFFFFFFFFFB080 ? 110A123B8 ?
kclgeneratecr()+654 call ktrget2() FFFFFFFFFFFB4D0 ? 110AA1610 ?
0 14F11E4E00 ? 0F11E4E00 ?
357FED028 ? 000030000 ?
7000147E9C41A50 ?
700000000014488 ?
kclgcr()+812 call kclgeneratecr() 11A209508 ? FFFFFFFFFFFBFC0 ?
FFFFFFFFFFFBC18 ? 000000000 ?
0FFFFBB10 ? 01A275AC8 ?
1761D7F302ED25AC ?
20000011A275AC8 ?
kclcrrf()+536 call kclgcr() FFFFFFFFFFFBC20 ?
FFFFFFFFFFFBD00 ? 101F5080C ?
000000000 ? 0000003E8 ?
000000028 ? 0000000C8 ?
FFFFFFFFFFFBF88 ?
kjblcrcbk()+896 call kclcrrf() 000000001 ? 000000000 ?
7000147EB0F07B8 ?
7000147576C4471 ?
401472C30C7F0 ?
7000147576C4408 ?
7000147576C3190 ?
7000147576C7170 ?
kjblpcr()+304 call kjblcrcbk() FFFFFFFFFFFBDA8 ? 000000038 ?
7000147FABBDB48 ? 600000006 ?
000000016 ? 11A209468 ?
000000013 ? 0001C2153 ?
kjbmpbast()+1792 call kjblpcr() 000000012 ? 000000168 ?
000000002 ? 70001109FDB8148 ?
357000000000357 ?
7000144F31F7750 ?
895000000000895 ? 000000000 ?
kjmxmpm()+760 call kjbmpbast() 1000000000000 ? 80000001E ?
000000000 ? 11A2951C8 ?
C000000000 ? 000000000 ?
1000000000000 ? 000000000 ?
kjmpbmsg()+3508 call kjmxmpm() 000000000 ? 11A3769E0 ?
FFFFFFFFFFFC380 ? 06DBFBAEF ?
101E13820 ? 11A3769E0 ?
7000147E339AE08 ?
FFFFFFFFFFFC210 ?
kjmsm()+13416 call kjmpbmsg() 11A209448 ? 7000147E339AE08 ?
100000019 ? 100000000 ?
000000000 ? 000000000 ?
000000000 ? 7000000000168FD ?
ksbrdp()+2216 call kjmsm() 7000000000168E0 ?
7000000000168FC ? 048244028 ?
000000E00 ? 1108B69F0 ?
100637768 ? 000000001 ?
700000007 ?
opirip()+1620 call ksbrdp() FFFFFFFFFFFFE22 ? 10AFA5FC8 ?
FFFFFFFFFFFDC10 ? 000000000 ?
000000001 ? 000000000 ?
01380038F ? 000000001 ?
opidrv()+608 call opirip() 10AFA23B0 ? 410134118 ?
FFFFFFFFFFFED80 ?
2F7530312F ? 108A7E8C4 ?
1106ABC70 ?
652F70726F647563 ?
1106ABC70 ?
sou2o()+136 call opidrv() 3208A885B0 ? 400000000 ?
FFFFFFFFFFFED80 ?
23001801CD0000 ? 000000010 ?
1106ABC70 ? 000000000 ?
000000000 ?
opimai_real()+188 call sou2o() FFFFFFFFFFFEDF0 ?
4424444B00000001 ?
9000000000D73CC ?
BADC0FFEE0DDF00D ?
000000003 ? 9001000A008DAD0 ?
A0000000A000000 ? 10B6A8F30 ?
ssthrdmain()+276 call opimai_real() 9001000A0011A60 ?
FFFFFFFFFFFF148 ?
FFFFFFFFFFFEEF0 ? 10B6E9280 ?
90000000008582C ?
9001000A008DAD0 ?
FFFFFFFFFFFEED0 ?
9001000A008DAD0 ?
main()+204 call ssthrdmain() 3F0003660 ? FFFFFFFFFFFF238 ?
FFFFFFFFFFFF2A0 ?
9FFFFFFF000D658 ?
9FFFFFFF00009A0 ? 000000000 ?
000000000 ? 9FFFFFFF000D658 ?
__start()+112 call main() 000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------
A search on MOS for comparable cases turned up the following note: LMON or LMS Process Crashes Instance With ORA-600 [kghstack_underflow_internal_2] (Doc ID 2003278.1)
The LMON or LMS process crash the instance with an error like:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x110A10838], [], [], [], [], [], [], [], [], [], []
ORA-1092 : opitsk aborting process
Instance terminated by LMS1, pid = 14024818
Review of the generated tracefiles reveals a call stack similar to:
... kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp ...
- OR -
... kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg ktrgcm ktrget3 ktrget2 kclgcr ...
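Comparing the two MOS signatures with the call stack from this incident amounts to an ordered-subsequence check: every function in the signature must appear in the observed stack in the same order, though other frames may sit in between. The sketch below illustrates that idea. The `OBSERVED` list was transcribed by hand from the lower part of the trace above (function names that wrap across lines were joined), and `matches_signature` is a hypothetical helper, not an Oracle utility.

```python
def matches_signature(signature, frames):
    """True if every name in signature occurs in frames, in order (case-insensitive)."""
    remaining = iter(frame.lower() for frame in frames)
    # Each any() call consumes the iterator up to its match, which enforces ordering.
    return all(any(name.lower() == frame for frame in remaining) for name in signature)


# Frames of the original failure, most recent call first (hand-transcribed).
OBSERVED = [
    "kghstack_underflow_internal", "kghstack_free", "ktundo",
    "kturCRBackoutOneChg", "ktrgcm", "ktrget3", "ktrget2",
    "kclgeneratecr", "kclgcr", "kclcrrf", "kjblcrcbk", "kjblpcr",
    "kjbmpbast", "kjmxmpm", "kjmpbmsg", "kjmsm", "ksbrdp",
]

# The two call stacks quoted from Doc ID 2003278.1.
SIGNATURE_A = (
    "kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read "
    "kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp"
).split()
SIGNATURE_B = (
    "kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg "
    "ktrgcm ktrget3 ktrget2 kclgcr"
).split()

print("signature A:", matches_signature(SIGNATURE_A, OBSERVED))
print("signature B:", matches_signature(SIGNATURE_B, OBSERVED))
```

Under these assumptions only the second signature matches. The extra `kclgeneratecr` frame between `ktrget2` and `kclgcr` does not break the match, because the check tolerates intervening frames.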
Because the call stack matches the second pattern, the problem is confirmed as Bug 18687067 - ORA-600 [KGHSTACK_UNDERFLOW_INTERNAL_2], which was closed as a duplicate of Bug 20675347 - ORA-07445 [KGHSTACK_OVERFLOW_INTERNAL()+644] (the bug is caused by an AIX compiler issue causing volatile variables in the Oracle kernel not to be handled properly). The fix is to upgrade the database to 12.1 or later, or to apply patch 20675347.