8i升级到9i出现ORA-07445[pevm_MOVC_i()+18]

一个朋友数据库从8i升级到9i后,出现ORA-07445[pevm_MOVC_i()+18]错误
alert日志ORA-07445[pevm_MOVC_i()+18]

Mon Jul 16 12:21:54 2012
Errors in file /oracle/admin/ora8/udump/ora8_ora_8938.trc:
ORA-07445: exception encountered: core dump [pevm_MOVC_i()+18] [SIGSEGV] [Address not mapped to object] [0x7] [] []

trace文件

--版本平台信息
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /oracle/product/9.2.0
System name:	Linux
Node name:	localhost.localdomain
Release:	2.6.18-194.el5PAE
Version:	#1 SMP Tue Mar 16 22:00:21 EDT 2010
Machine:	i686
Instance name: ora8
Redo thread mounted by this instance: 1
Oracle process number: 15
Unix process pid: 8938, image: oracle@localhost.localdomain (TNS V1-V3)

--trace信息
*** 2012-07-16 12:21:54.399
*** SESSION ID:(12.6) 2012-07-16 12:21:54.399
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x7, PC: [0x9bfac06, pevm_MOVC_i()+18]
Registers:
%eax: 0x00000000 %ebx: 0x00000025 %ecx: 0x00000000
%edx: 0xbf93bf50 %edi: 0x00000000 %esi: 0x002ff1d8
%esp: 0xbf93bc28 %ebp: 0xbf93bc60 %eip: 0x09bfac06
%efl: 0x00010296
  pevm_MOVC_i()+6 (0x9bfabfa) mov %edi,0xffffffcc(%ebp)
  pevm_MOVC_i()+9 (0x9bfabfd) mov %esi,0xffffffd0(%ebp)
  pevm_MOVC_i()+12 (0x9bfac00) mov %ebx,0xffffffc8(%ebp)
  pevm_MOVC_i()+15 (0x9bfac03) mov 0x14(%ebp),%eax
> pevm_MOVC_i()+18 (0x9bfac06) movb 0x7(%eax),%dl
  pevm_MOVC_i()+21 (0x9bfac09) mov $0x0,0xfffffff0(%ebp)
  pevm_MOVC_i()+28 (0x9bfac10) movb %dl,0xffffffe0(%ebp)
  pevm_MOVC_i()+31 (0x9bfac13) movb %dl,0xffffffe0(%ebp)
  pevm_MOVC_i()+34 (0x9bfac16) cmpb $0x1,%dl
*** 2012-07-16 12:21:54.407
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [pevm_MOVC_i()+18] [SIGSEGV] [Address not mapped to object] [0x7] [] []
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+269         call     ksedst()+0           1 ? 0 ? 0 ? 1 ? 64252C31 ?
                                                   6666006C ?
ssexhd()+1108        call     ksedmp()+0           3 ? 0 ? 0 ? 0 ? 0 ? 0 ?
pevm_MOVC_i()+18     signal   ssexhd()+0           B ? BF93B8BC ? BF93B93C ?
pfrrun()+8458        call     pevm_MOVC_i()+0      2FF19C ? 16 ? BE14650 ? 0 ?
pricar()+1277        call     pfrrun()+0           2FF19C ? 1 ? BF93CCFC ?
                                                   AD638A0 ? 2DFBAC ? 0 ?
pricbr()+427         call     pricar()+0           BF93DA88 ? BF93D084 ?
                                                   9BEAE0C ? 1 ? 0 ? 98C93728 ?
prient2()+598        call     pricbr()+0           BF93DA88 ? BF93D084 ? 0 ?
prient()+1438        call     prient2()+0          BF93DA88 ? BF93D084 ? 1 ?
                                                   BF93E4E0 ? 0 ?
kkxrpc()+347         call     prient()+0           BF93DA88 ? AD638A0 ?
                                                   BF93E534 ? 38 ? 1C8C997 ? 0 ?
kporpc()+138         call     kkxrpc()+0           4C ? F ? BF93E63C ?
opiodr()+5238        call     kjushutdown()+2671   4C ? F ? BF93E63C ?
ttcpip()+2124        call     opiodr()+0           4C ? F ? BF93E63C ? 0 ?
Cannot find symbol in /lib/libc.so.6.
opitsk()+1635        call     ttcpip()+0           AD638A0 ? 4C ? BF93E63C ? 0 ?
                                                   BF93EF14 ? BF93EF10 ?
opiino()+602         call     opitsk()+0           0 ? 0 ? AD638A0 ? BE01DE0 ?
                                                   103 ? 0 ?
opiodr()+5238        call     kjushutdown()+2671   3C ? 4 ? BF9402E0 ?
opidrv()+517         call     opiodr()+0           3C ? 4 ? BF9402E0 ? 0 ?
sou2o()+25           call     opidrv()+0           3C ? 4 ? BF9402E0 ?
main()+182           call     sou2o()+0            BF9402C4 ? 3C ? 4 ?
                                                   BF9402E0 ? 0 ? 0 ?
00125E9C             call     main()+0             2 ? BF940384 ? BF940390 ?
                                                   88A810 ? 0 ? 1 ?
 
---------------------Binary Stack Dump ---------------------

--进程信息
Process global information:
     process: 0x962ba0b8, call: 0x96342cd8, xact: (nil), curses: 0x962e4070, usrses: 0x962e4070
  ----------------------------------------
  SO: 0x962ba0b8, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=15, calls cur/top: 0x96342cd8/0x96342cd8, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 196 0 4
              last post received-location: kslpsr
              last process to post me: 962b7828 1 6
              last post sent: 0 0 15
              last post sent-location: ksasnd
              last process posted by me: 962b7828 1 6
    (latch info) wait_event=0 bits=0
    Process Group: DEFAULT, pseudo proc: 0x962d9444
    O/S info: user: oracle, term: UNKNOWN, ospid: 8938
    OSD pid info: Unix process pid: 8938, image: oracle@localhost.localdomain (TNS V1-V3)
    ----------------------------------------
    SO: 0x962e4070, type: 4, owner: 0x962ba0b8, flag: INIT/-/-/0x00
    (session) trans: (nil), creator: 0x962ba0b8, flag: (8000041) USR/- BSY/-/-/-/-/-
              DID: 0001-000F-00000004, short-term DID: 0000-0000-00000000
              txn branch: (nil)
              oct: 0, prv: 0, sql: (nil), psql: 0x98c3b858, user: 95/DDDD
    O/S info: user: mis, term: LANDERSVR3, ospid: 7904:3012, machine: XANDER\LANDERSVR3
              program: c:\orant\bin\f50run32.exe c:\forms\bas9010.fmx
    application name: c:\orant\bin\f50run32.exe c:\forms\bas9010.fmx, hash value=0
    last wait for 'db file sequential read' blocking sess=0x0 seq=1277 wait_time=11
                file#=1, block#=21b, blocks=1

1.该用户程序是从8i升级到9i之后产生该错误
2.报错的访问程序是FORM 5

解决方案
查询MOS[ID 273411.1]发现是因为FORM 5和9i不兼容导致该错误,ORACLE未给出解决方案,言外之意,如果FORM不能升级,那就只能把ORACLE重新降级到8i.

温馨提示
在做oracle数据库升级前,需要实现进行评估,测试,如果是oracle相关软件和oracle数据库结合紧密,升级前最好需要和ORACLE技术人员确认是否兼容.

ORA-00600[kcbshlc_1]导致数据库 down 案例

一台服务器因为ORA-00600[kcbshlc_1]错误引起PMON异常导致数据库down掉

Sun Jul  8 17:20:10 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
PMON: terminating instance due to error 472

分析trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /opt/oracle/product/10.2.0
System name:	Linux
Node name:	localhost.localdomain
Release:	2.6.9-89.ELsmp
Version:	#1 SMP Mon Apr 20 10:33:05 EDT 2009
Machine:	x86_64
Instance name: xff
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 16412, image: oracle@localhost.localdomain (PMON)

*** 2012-07-08 03:00:11.351
*** SERVICE NAME:(SYS$BACKGROUND) 2012-07-08 03:00:11.338
*** SESSION ID:(1105.1) 2012-07-08 03:00:11.338
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
freeing in-flux r/w latch for process state: 1fc165d248
... in-flux r/w latch  1fc1fcc9b0 Child cache buffers chains level=1 child#=4753 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 113266196
        state=busy(exclusive) (val=0x2000000000000071) holder orapid = 113
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     139 (2, 1341687611, 2)
     192 (2, 1341687611, 2)
     191 (2, 1341687611, 2)
     173 (2, 1341687611, 2)
     185 (2, 1341687611, 2)
     176 (2, 1341687611, 2)
     174 (2, 1341687611, 2)
     118 (2, 1341687611, 2)
     190 (2, 1341687611, 2)
     179 (2, 1341687611, 2)
     184 (1, 1341687611, 1)
     189 (1, 1341687611, 1)
     177 (1, 1341687611, 1)
     195 (1, 1341687611, 1)
     187 (1, 1341687611, 1)
     194 (1, 1341687611, 1)
     147 (1, 1341687611, 1)
     183 (1, 1341687611, 1)
     143 (1, 1341687611, 1)
     144 (1, 1341687611, 1)
     186 (1, 1341687611, 1)
     188 (1, 1341687611, 1)
     196 (1, 1341687611, 1)
     145 (1, 1341687611, 1)
     193 (1, 1341687611, 1)
     waiter count=25
*** 2012-07-08 03:50:06.228
 wsd 0x1f8169ac20, sbuf 0xac1ffafe8, setid 10, op 3
lcuridx 1, lasz 0x3c1ffc110
*** 2012-07-08 16:30:05.294
freeing in-flux r/w latch for process state: 20406507f0
... in-flux r/w latch  1f81265f28 Child cache buffers chains level=1 child#=14180 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 71341989
        state=busy(exclusive) (val=0x2000000000000066) holder orapid = 102
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     121 (2, 1341736205, 2)
     116 (2, 1341736205, 2)
     125 (2, 1341736205, 2)
     140 (2, 1341736205, 2)
     145 (2, 1341736205, 2)
     waiter count=5
freeing in-flux r/w latch for process state: 1fc165f9d0
... in-flux r/w latch  1f813aec18 Child cache buffers chains level=1 child#=20914 
        Location from where latch is held: kcbrls: kslbegin: 
        Context saved from call: 96505705
        state=busy(exclusive) (val=0x200000000000007b) holder orapid = 123
*** 2012-07-08 17:20:10.876
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
*** 2012-07-08 17:20:10.876
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgeasnmierr()+163    call     kgerinv()            0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000000 ?
kcbshlc()+239        call     kgeasnmierr()        0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
kslilcr()+770        call     kcbshlc()            0066876E0 ? 1F801DDB28 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
ksl_cleanup()+1567   call     kslilcr()            7FBFFFCE50 ? 000000000 ?
                                                   7FBFFFDCE0 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxfl()+492         call     ksl_cleanup()        000000000 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxda()+55          call     ksuxfl()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksucln()+1390        call     ksuxda()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksbrdp()+794         call     ksucln()             060008100 ? 000000000 ?
                                                   FFFFFFFF9720ED9F ?
                                                   1F801DDB28 ? 0066876E0 ?
                                                   000000021 ?
opirip()+616         call     ksbrdp()             060008100 ? 000000000 ?
                                                   000000001 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opimai_real()+317    call     sou2o()              7FBFFFF670 ? 000000032 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF700 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
__libc_start_main()  call     main()               000000003 ? 7FBFFFF700 ?
+219                                               000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
_start()+42          call     __libc_start_main()  000713984 ? 000000001 ?
                                                   7FBFFFF848 ? 005288D00 ?
                                                   000000000 ? 000000003 ?

通过这个trace可以看出数据库运行在LINUX 64操作系统,版本是10.2.0.4。
出现错误的原因:
PMON在清理1fc165d248的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理20406507f0的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理1fc165f9d0的时候,因为被orapid = 123持有,导致清理失败.

查询MOS[443909.1]
发现是unpublished Bug 4723109.处理方法打上Patch 4723109.

DBCA Fails With ORA-15243

今天接到朋友的电话说他们装ORACLE 11G R1 RAC的时候遇到ORA-12801/ORA-15243错误,请求我帮忙解决
具体情况
AIX系统以前装过11G R2 RAC,现因为项目要求11G R1,已经重装了系统,然后安装R1,在安装到DBCA配置ASM的时候,出现ORA-12801/ORA-15243错误

ORA-12801: error signaled in parallel query server PZ99, instance wmsdb1:+ASM1(1)
ORA-15243: 11.2.0.0.0 is not a valid version number


通过SQLPLUS登录ASM1实例查询发现该有一个ORADATA磁盘组,包含了一个/dev/rhdisk1.通过询问,得出结论是这个磁盘组以前是安装R2的时候作为存储OCR和VOTINGDISK使用,重装系统的时候未对该磁盘进行处理.

处理思路[想办法清除磁盘中asm信息]
1.尝试通过sqlplus 删除该磁盘组,报该磁盘组处于dismount状态
2.尝试mount该磁盘组,提示版本无效(ORA-15243)[当前的asm程序是11.1而磁盘组信息是11.2 程序当然不一致了]
3.直接使用dd清理该asm disk header信息(dd if=/dev/zero of=/dev/rhdisk1 bs=4096 count=1)
4.重新运行dbca一切工作正常

MOS中相关文章[1460997.1]只适合linux asmlib情况

Applies to:
Oracle Server - Enterprise Edition - Version 11.1.0.7 and later
Information in this document applies to any platform.

Symptoms
On : 11.1.0.7 version, STORAGE

When attempting to create database or query gv$asm_diskgroup,
the following error occurs.

ERROR
-----------------------
ORA-12801: error signaled in parallel query server PZ99, instance dchilcmsdb2.hq.navteq.com:+ASM2 (2)
ORA-15243: 11.2.0.0.0 is not a valid version number


STEPS
-----------------------
The issue can be reproduced at will with the following steps:
1. Previously had 11GR2 installed and configured. Removed this installation then installed 
   11.1.0.7  and created diskgroups using some of the same disks previously used.
2. Attempt to create database and receive the errors. Drop the newly 
   created diskgroups and query the view still get same errors.


BUSINESS IMPACT
-----------------------
The issue has the following business impact:
Due to this issue, users cannot create new database.

Changes
 Removed 11.2.0.1 installation and installed 11.1.0.7 software without cleaning up all of 
 the diskgroup information from previous installation.

Cause
All the current information shows that we are using correct binaries and 
that the diskgroups that are being used have correct comparability settings. 
HTML shows that the disks for the old diskgroup are still being discovered. 
This in conjunction with the text of the error as follows shows that 
we are picking up 11.2.0.0.0 as version from somewhere.
ORA-15243: 11.2.0.0.0 is not a valid version number
 
Problem was caused by the disks that had been used for the 
OCR/Voting disk diskgroup in 11GR2 installation still being present and accessible.
 

Solution
As the root user execute /etc/init.d/oracleasm/deletedisk command against all the disks 
that were previously used for the OCR/Voting disk diskgroup then try the operation again.

关闭数据库出现ORA-00379错误

关闭数据库出现ORA-00379错误

SQL> shutdown immediate
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00379: 缓冲池 DEFAULT 中无法提供 8K 块大小的空闲缓冲区

查看内存分配

SQL> show parameter sga;

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 412M
sga_target                           big integer 0

SQL> select * from v$sgainfo;

NAME                                  BYTES RES
-------------------------------- ---------- ---
Fixed SGA Size                      1333676 No
Redo Buffers                        6078464 No
Buffer Cache Size                 104857600 Yes
Shared Pool Size                  142606336 Yes
Large Pool Size                     4194304 Yes
Java Pool Size                     12582912 Yes
Streams Pool Size                         0 Yes
Shared IO Pool Size                       0 Yes
Granule Size                        4194304 No
Maximum SGA Size                  431038464 No
Startup overhead in Shared Pool    46137344 No
Free SGA Memory Available         159383552

--spfile中分配情况
orcl.__db_cache_size=104857600
orcl.__java_pool_size=12582912
orcl.__large_pool_size=4194304
orcl.__pga_aggregate_target=104857600
orcl.__sga_target=281018368
orcl.__shared_io_pool_size=0
orcl.__shared_pool_size=142606336
orcl.__streams_pool_size=0

--初始化参数
*.sga_max_size=0
*.sga_target=536870912
*.memory_max_target=536870912
*.memory_target=536870912

alert日志

Mon Jul 02 11:30:19 2012
DIA0 started with pid=8, OS id=1520 
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_dia0_1520.trc  (incident=10883):
ORA-04030: out of process memory when trying to allocate 29916 bytes (heap_ksdhngreq,msg_body:ksdhng)
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_dia0_1520.trc  (incident=10884):
ORA-04030: out of process memory when trying to allocate 8204 bytes (diag pga,dbgtbDefaultBucket)
ORA-04030: out of process memory when trying to allocate 29916 bytes (heap_ksdhngreq,msg_body:ksdhng)
ORA-4030 : opidrv aborting process DIA0 ospid (1348_1520)
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_dia0_1520.trc  (incident=12013):
ORA-04030: out of process memory when trying to allocate 8204 bytes (diag pga,dbgtbDefaultBucket)
ORA-04030: out of process memory when trying to allocate 8204 bytes (diag pga,dbgtbDefaultBucket)
ORA-04030: out of process memory when trying to allocate 29916 bytes (heap_ksdhngreq,msg_body:ksdhng)
Process debug not enabled via parameter _debug_enable

Mon Jul 02 11:33:19 2012
Trace dumping is performing id=[cdmp_20120702113319]
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_1820.trc:
ORA-00379: no free buffers available in buffer pool DEFAULT for block size 8K
ORA-00379: no free buffers available in buffer pool DEFAULT for block size 8K
Mon Jul 02 11:33:49 2012
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_1820.trc:
ORA-00379: no free buffers available in buffer pool DEFAULT for block size 8K
Mon Jul 02 11:34:38 2012
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_1820.trc:
ORA-00379: no free buffers available in buffer pool DEFAULT for block size 8K
Mon Jul 02 11:37:05 2012
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_2400.trc:
ORA-00379: 缓冲池 DEFAULT 中无法提供 8K 块大小的空闲缓冲区
ORA-00379: 缓冲池 DEFAULT 中无法提供 8K 块大小的空闲缓冲区
ORA-00379: 缓冲池 DEFAULT 中无法提供 8K 块大小的空闲缓冲区

Tue Jul 03 09:58:06 2012
WARNING: sga_target 432013312 cannot be more than memory_target (432013312) - 
pga_aggregate_target (104857600/0) or untunable pga 104857600, 73783296

通过这里可以看出,系统的data buffe和pga都有内存不足的报错.

解决问题
问题的原因是由于内存分配不多,导致sga组件被消耗完,现在数据库不能正常关闭,修改了相关的内存参数的配置[避免该bug采用asmm内存管理]也无法生效,现在需要做的任务是重启数据库.导致数据库不能被关闭的原因是因为data buffer中的脏数据不能写入新数据.查询MOS发现是Bug 7702085.正常关闭库解决办法手工刷sga组件,然后升级数据库到11.2.0.1 (Base Release)/11.1.0.7.3 (Patch Set Update)/11.1.0.7 Patch 25 on Windows Platforms

SQL>  alter system flush BUFFER_CACHE;

System altered.

SQL> alter system flush  SHARED_POOL;

System altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

spfile被覆盖导致ORA-600[kmgs_parameter_update_timeout_1]

数据库出现如下错误ORA-00600[kmgs_parameter_update_timeout_1]

Thu Jun 21 17:42:45 BEIST 2012
alter tablespace TS_TAB_WG_SYSMGR_01 add datafile '/dev/rvgoradata3_1_01'
Thu Jun 21 17:42:58 BEIST 2012
Completed: alter tablespace TS_TAB_WG_SYSMGR_01 add datafile '/dev/rvgoradata3_1_01'
Thu Jun 21 17:45:31 BEIST 2012
System State dumped to trace file /oracle/app/oracle/admin/bomc3/bdump/bomc3_mmon_19530138.trc
Thu Jun 21 17:45:42 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_mmon_19530138.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [1565], [], [], [], [], [], []
ORA-01565: error in identifying file '/dev/rvgoradata3_1_01'
ORA-27086: unable to lock file - already in use
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 8
Additional information: 18874484
Thu Jun 21 17:45:49 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:52 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:53 BEIST 2012
Errors in file /oracle/app/oracle/admin/bomc3/bdump/bomc3_dbw0_18874484.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ksprcvsp1], [0], [0], [], [], [], [], []
Thu Jun 21 17:45:53 BEIST 2012
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 18874484

通过这个错误可以看出大概:TS_TAB_WG_SYSMGR_01增加数据文件/dev/rvgoradata3_1_01成功后,然后mmon启动收集统计信息,读取spfile文件信息出错.最后dbw进程读取spfile文件出错,使得dbwn进程终止,从而数据库abort掉.通过这些信息,初步怀疑是增加数据文件的时候,错误的把spfile文件的裸设备作为一个新数据文件增加到数据库中,导致spfile被覆盖,从而出现mmon和dbwn访问spfile出错.

找出证据
如果spfile使用裸设备而且文件名是dev/rvgoradata3_1_01,那很可能是通过init_SID.ora中的spfile项实现,查找该文件内容果然发现

[zwq_acc1:/home/xifenfei]cat initbomc3.ora
spfile='/dev/rvgoradata3_1_01'

通过这些可以确定是用户增加数据文件时,错误的把spfile文件当中新的控制问及爱你增加到相关表空间中导致该问题.

解决办法
1.如果有备份spfile文件,使用备份spfile文件
2.如果有pfile文件,使用pfile创建spfile
3.如果上面两个都没有,那么使用alert中相关信息创建pfile文件然后创建spfile