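Resolving ORA-00600 [kcfrbd_3]

A friend's database lost power. After a series of recovery attempts, including recreating the control file, it ended up failing at startup with ORA-00600 [kcfrbd_3]: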

Wed Dec 05 10:26:34 2012
Thread 1 advanced to log sequence 11
Thread 1 opened at log sequence 11
  Current log# 1 seq# 11 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Wed Dec 05 10:26:34 2012
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Dec 05 10:26:34 2012
SMON: enabling cache recovery
Wed Dec 05 10:26:35 2012
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Wed Dec 05 10:26:35 2012
SMON: enabling tx recovery
Wed Dec 05 10:26:35 2012
Database Characterset is ZHS16GBK
Wed Dec 05 10:26:35 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []

replication_dependency_tracking turned off (no async multimaster replication found)
Wed Dec 05 10:26:36 2012
Fatal internal error happened while SMON was doing active transaction recovery.
Wed Dec 05 10:26:36 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []

SMON: terminating instance due to error 474

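The error is clear: the database did open, but a transaction could not be rolled back, SMON hit a fatal error during active transaction recovery, and the instance was terminated, so the database could not start normally. The fix is the usual undo-handling approach: switch to manual undo management, set an event to suppress transaction recovery, and use a hidden parameter to mask the affected rollback segments; then rebuild the undo tablespace. At that point txchecker can be used to check for abnormal transactions. If important base (dictionary) objects are affected, the database has to be rebuilt; if only a few other objects are affected, rebuilding those objects is enough.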
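The pfile side of this approach can be sketched as follows. This is only an illustration under stated assumptions: the original names neither the event nor the hidden parameter, so event 10513 (which disables SMON transaction recovery) and `_offline_rollback_segments` are assumed here, and the function name and segment names are hypothetical placeholders. The function only prints parameter lines to be reviewed and merged into a pfile by hand.

```shell
#!/bin/sh
# undo_bypass_pfile [SEGMENT ...]
# Prints pfile lines for the "manual undo + event + hidden parameter" approach.
# ASSUMPTIONS: event 10513 and _offline_rollback_segments are not named in the
# original text; segment names passed in are placeholders.
undo_bypass_pfile() {
    # Join the segment names given as arguments with commas.
    segs=
    for s in "$@"; do
        segs=${segs:+$segs,}$s
    done

    printf '%s\n' 'undo_management=manual'
    printf '%s\n' 'event="10513 trace name context forever, level 2"'
    # Only emit the hidden parameter when at least one segment was given.
    if [ -n "$segs" ]; then
        printf '%s\n' "_offline_rollback_segments=($segs)"
    fi
    return 0
}
```

For example, `undo_bypass_pfile '_SYSSMU1$' '_SYSSMU2$'` prints three lines ending with `_offline_rollback_segments=(_SYSSMU1$,_SYSSMU2$)`. Once the database opens with settings like these, a new undo tablespace is created, the old one is dropped, and the temporary parameters are removed again.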

ORA-00600[qmxtriCheckAndRewriteQb0]

The database reported ORA-00600 [qmxtriCheckAndRewriteQb0]:

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/oracle/product/10.2.0
System name:	AIX
Node name:	abc
Release:	3
Version:	5
Machine:	00C58A644C00
Instance name: XFF2
Redo thread mounted by this instance: 2
Oracle process number: 434
Unix process pid: 492340, image: oracle@abc

*** ACTION NAME:() 2012-11-12 08:46:47.132
*** MODULE NAME:() 2012-11-12 08:46:47.132
*** SERVICE NAME:(ORCL) 2012-11-12 08:46:47.132
*** CLIENT ID:() 2012-11-12 08:46:47.132
*** SESSION ID:(870.58602) 2012-11-12 08:46:47.132
*** 2012-11-12 08:46:47.132
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [qmxtriCheckAndRewriteQb0], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT EXTRACTVALUE(配置,'//SYSTEM[@XTH="'||:B1 ||'"]/FILE') , 
WHERE EXTRACTVALUE(配置,'//SYSTEM[@XTH="'||:B1 ||'"]/BM')=:B2  AND ROWNUM<2
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000021d535f70        25  procedure ZLTOOLS.ZL_MBRUNLOG_INSERT
7000002b6819368         1  anonymous block
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 000000000 ?
ksedmp+0290          bl       ksedst               104A2C690 ?
ksfdmp+0018          bl       03F26C3C             
kgerinv+00dc         bl       _ptrgl               
kgeasnmierr+004c     bl       kgerinv              7000002F735A838 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   0FFFFBFFF ?
IPRA.$qmxtriCheckAn  bl       03F25970             
dRewriteQb_rec+0194                                
IPRA.$qmxtriCheckAn  bl       IPRA.$qmxtriCheckAn  1000881EC ? 000000000 ?
dRewriteQb_rec+006c           dRewriteQb_rec       000000000 ?
IPRA.$qmxtriCheckAn  bl       IPRA.$qmxtriCheckAn  FFFFFFFFFFF07E0 ? 000000033 ?
dRewriteQb_rec+006c           dRewriteQb_rec       1056037F8 ?
qmxtriCheckAndRewri  bcl      dmqlKMlod+00c0       000000000 ? 110421CB0 ?
teQb+0094                                          FFFFFFFFFFE87C0 ?
qmxtrxq+0210         bl       03F252EC             
qmxtrxop+00a4        bl       qmxtrxq              FFFFFFFFFFF25B8 ?
                                                   700000282F66DD0 ? 110195E98 ?
koksspend+02b0       bl       qmxtrxop             100346AB4 ?
kkmdrvend+01a8       bl       koksspend            000000001 ? 104B3A8A8 ?
                                                   000000000 ?
kkmdrv+004c          bl       kkmdrvend            FFFFFFFFFFE8BE0 ?
                                                   883843401048F2F8 ?
opiSem+13c0          bl       kkmdrv               000000000 ? 000000000 ?
                                                   000000000 ? 11022AC50 ?
opiDeferredSem+0234  bl       opiSem               FFFFFFFFFFE9CE0 ?
                                                   7000001E327CCE0 ? 000000111 ?
                                                   100000001 ?
opitca+01e8          bl       opiDeferredSem       
kksFullTypeCheck+00  bl       03F25230             
1c                                                 
rpiswu2+034c         bl       _ptrgl               
kksSetBindType+0d28  bl       rpiswu2              70000030850C178 ?
                                                   3300000033 ?
                                                   FFFFFFFFFFF0570 ?
                                                   FFFFFFFFFFF0578 ?
                                                   7000002F6F0C700 ?
                                                   33104027D8 ?
                                                   FFFFFFFFFFF1F48 ? 000000000 ?
kksfbc+1054          bl       kksSetBindType       70000030F58F400 ? 1107CB418 ?
                                                   70000001003B800 ?
                                                   10200003000 ? 110000FF8 ?
                                                   7000000100ECAB8 ?
                                                   FFFFFFFFFFF1480 ?
                                                   481A408400003000 ?
opiexe+098c          bl       01F960BC             
opipls+185c          bl       opiexe               FFFFFFFFFFF3900 ?
                                                   FFFFFFFFFFF39E8 ?
                                                   FFFFFFFFFFF38A0 ?
opiodr+0ae0          bl       _ptrgl               
rpidrus+01bc         bl       opiodr               66FFFF54B0 ? 608736A20 ?
                                                   FFFFFFFFFFF67C0 ?
                                                   1510195E98 ?
skgmstack+00c8       bl       _ptrgl               
rpidru+0088          bl       skgmstack            102320840 ? 000000000 ?
                                                   000000002 ? 000000000 ?
                                                   FFFFFFFFFFF5F88 ?
rpiswu2+034c         bl       _ptrgl               
rpidrv+095c          bl       rpiswu2              70000030850C178 ? 110469C28 ?
                                                   11044AA58 ? 000000000 ?
                                                   FFFFFFFFFFF5D60 ?
                                                   3300000000 ? 000000000 ?
                                                   000000000 ?
psddr0+02bc          bl       03F266D4             
psdnal+01d0          bl       psddr0               1500000000 ? 6600000000 ?
                                                   FFFFFFFFFFF67C0 ?
                                                   30100BACC8 ?
pevm_EXECC+01f8      bl       _ptrgl               
pfrinstr_EXECC+0070  bl       pevm_EXECC           10147B2A4 ? 000000000 ?
                                                   700000262828B72 ?
pfrrun_no_tool+005c  bl       _ptrgl               
pfrrun+1014          bl       pfrrun_no_tool       FFFFFFFFFFF6B20 ?
                                                   7000002B6819368 ? 3100ECBB0 ?
plsql_run+06b4       bl       pfrrun               1107D84A8 ?
peicnt+0224          bl       plsql_run            1107D84A8 ? 10001102676F8 ?
                                                   000000000 ?
kkxexe+0250          bl       peicnt               FFFFFFFFFFF7E38 ? 1107D84A8 ?
opiexe+2ef8          bl       kkxexe               11047E1C8 ?
kpoal8+0edc          bl       opiexe               FFFFFFFFFFFB454 ?
                                                   FFFFFFFFFFFB1A8 ?
                                                   FFFFFFFFFFF9628 ?
opiodr+0ae0          bl       _ptrgl               
ttcpip+1020          bl       _ptrgl               
opitsk+1124          bl       01F96AC8             
opiino+0990          bl       opitsk               0FFFFD490 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl               
opidrv+0484          bl       01F95914             
sou2o+0090           bl       opidrv               3C02D99B7C ? 4A076D928 ?
                                                   FFFFFFFFFFFF390 ?
opimai_real+01bc     bl       01F93294             
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?
 
--------------------- Binary Stack Dump ---------------------

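This part of the trace tells us:
1. Operating system: AIX 5.3 (64-bit)
2. Database version: 10.2.0.4
3. The SQL statement calls the EXTRACTVALUE function
4. The Call Stack Trace information

Searching MOS turns up a matching note [ID 467350.1]: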

Cause
Bug 6030982 ORA-600 [QMXTRICHECKANDREWRITEQB0] WITH QUERY USING EXTRACTVALUE FUNCTION

Solution
This bug is going to be fixed in future 10.2.0.5.0 and 11g
In the meantime, user can workaround by 

set
event = "19027 trace name context forever, level 1"
within init.ora or spfile file then bounce database.

or

SQL> alter session set events ='19027 trace name context forever, level 1';
SQL> Alter system flush shared_pool;
-- Execute affected query

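From the MOS note we can confirm:
1. The database hit this bug while executing the EXTRACTVALUE function
2. The bug is fixed in 11g and 10.2.0.5
3. Setting event = "19027 trace name context forever, level 1" works around the problem temporarily

Recommendations
1. If upgrading the database is feasible, upgrade.
2. If the database cannot be upgraded right away, set event 19027 at session level during off-peak hours, flush the shared pool, and rerun the failing SQL. If that resolves the error, set the event at system level at a suitable time as a temporary workaround.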
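The two stages of the second recommendation can be sketched as a small helper that prints the statements for each stage. This is a rough sketch, not part of the MOS note: the function name `event_19027_sql` is hypothetical, and the system-level form assumes an spfile is in use and adds `sid='*'` because the trace comes from a RAC instance.

```shell
#!/bin/sh
# event_19027_sql session|system
# Prints the SQL for the chosen stage; it does not connect to the database.
event_19027_sql() {
    case ${1:-session} in
        session)
            # Stage 1: test in one session, then rerun the failing query
            # in that same session.
            printf '%s\n' \
                "alter session set events '19027 trace name context forever, level 1';" \
                "alter system flush shared_pool;"
            ;;
        system)
            # Stage 2: persist the event in the spfile (ASSUMES an spfile;
            # takes effect only after the instances are restarted).
            printf '%s\n' \
                "alter system set event='19027 trace name context forever, level 1' scope=spfile sid='*';"
            ;;
        *)
            echo "usage: event_19027_sql session|system" >&2
            return 1
            ;;
    esac
}
```

A typical use would be `event_19027_sql session > test_19027.sql`, appending the failing statement to that file so it runs in the same session. If other events are already set in the spfile, the new value has to include them as well, since setting `event` replaces the previous value.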

Resolving ORA-39006/ORA-39213 in expdp

An expdp export failed with ORA-39006/ORA-39213; running dbms_metadata_util.load_stylesheets resolved it.
expdp not working

--Export AWR data
SQL> @?/rdbms/admin/awrextr.sql
…………
Exception encountered in AWR_EXTRACT
ORA-39006: internal error
ORA-39213: Metadata processing is not available
begin
*
ERROR at line 1:
ORA-31623: a job is not attached to this session via the specified handle
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 911
ORA-06512: at "SYS.DBMS_DATAPUMP", line 4710
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 656
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 962
ORA-06512: at line 3

--Export a single table
$ expdp "'/ as sysdba'" dumpfile=xifenfei.dmp tables=scott.t_xifenfei

Export: Release 10.2.0.1.0 - 64bit Production on Wednesday, 31 October, 2012 13:03:20

Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORA-39006: internal error
ORA-39213: Metadata processing is not available

Error descriptions

$ oerr ora 39006
39006, 00000, "internal error"
// *Cause:  An unexpected error occurred while processing a Data Pump job.
//          Subsequent messages supplied by DBMS_DATAPUMP.GET_STATUS 
//          will further describe the error.
// *Action: Contact Oracle Customer Support.
$ oerr ora 39213
39213, 00000, "Metadata processing is not available"
// *Cause:  The Data Pump could not use the Metadata API.  Typically,
//          this is caused by the XSL stylesheets not being set up properly.
// *Action: Connect AS SYSDBA and execute dbms_metadata_util.load_stylesheets
//          to reload the stylesheets.

Resolving ORA-39006/ORA-39213

--List the components installed in the database
SQL> col COMP_NAME for a35
SQL> select comp_name, version, status from dba_registry;

COMP_NAME                           VERSION                        STATUS
----------------------------------- ------------------------------ ----------------------
Oracle Database Catalog Views       10.2.0.1.0                     VALID
Oracle Database Packages and Types  10.2.0.1.0                     VALID
Oracle Workspace Manager            10.2.0.1.0                     VALID
JServer JAVA Virtual Machine        10.2.0.1.0                     VALID
Oracle XDK                          10.2.0.1.0                     VALID
Oracle Database Java Packages       10.2.0.1.0                     VALID
Oracle Expression Filter            10.2.0.1.0                     VALID
Oracle Data Mining                  10.2.0.1.0                     VALID
Oracle Text                         10.2.0.1.0                     VALID
Oracle XML Database                 10.2.0.1.0                     VALID
Oracle Rules Manager                10.2.0.1.0                     VALID
Oracle interMedia                   10.2.0.1.0                     VALID
OLAP Analytic Workspace             10.2.0.1.0                     VALID
Oracle OLAP API                     10.2.0.1.0                     VALID
OLAP Catalog                        10.2.0.1.0                     VALID
Spatial                             10.2.0.1.0                     VALID
Oracle Enterprise Manager           10.2.0.1.0                     VALID

17 rows selected.

--If any of the following components is missing, install it with the corresponding script below
Oracle Database Catalog Views
Oracle Database Packages and Types 
JServer JAVA Virtual Machine
Oracle XDK    
Oracle Database Java Packages

--Install with the scripts below (choose according to the missing component)
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/javavm/install/initjvm.sql
 
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/xdk/admin/initxml.sql
 
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/rdbms/admin/catjava.sql
 
SQL> connect / as sysdba
SQL> @$ORACLE_HOME/rdbms/admin/utlrp.sql

--Run sys.dbms_metadata_util.load_stylesheets
SQL> execute sys.dbms_metadata_util.load_stylesheets;

PL/SQL procedure successfully completed.

Testing the expdp export

$ expdp "'/ as sysdba'" dumpfile=xifenfei.dmp tables=scott.t_xifenfei  Directory=AWR_DIR

Export: Release 10.2.0.1.0 - 64bit Production on Wednesday, 31 October, 2012 14:18:04

Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Starting "SYS"."SYS_EXPORT_TABLE_01":  '/******** AS SYSDBA' dumpfile=xifenfei.dmp 
tables=scott.t_xifenfei Directory=AWR_DIR 
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 7 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "SCOTT"."T_XIFENFEI"                        5.374 MB   57376 rows
Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TABLE_01 is:
  /data/enmotech/xifenfei.dmp
Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at 14:18:11

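The test shows that when no required component is missing, running dbms_metadata_util.load_stylesheets resolves the ORA-39006/ORA-39213 errors from expdp. If a component is missing, install it first and then run dbms_metadata_util.load_stylesheets.

Resolving ORA-19625/ORA-27054 in RMAN backups

Analysis of ORA-19625/ORA-27054 errors when RMAN backs up archived logs stored on an NFS mount in a RAC environment
Environment

OS: AIX 6100-06
DB: 11.1.0.6.0 RAC
Archived logs: on an NFS mount

RMAN fails while archiving the current log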

sql statement: alter system archive log  current
Starting backup at 25-OCT-11
current log archived
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 02/17/2012 13:03:51
RMAN-06059: expected archived log not found, lost of archived log compromises recoverability
ORA-19625: error identifying file /rarchlogA/1_13775_764866137.dbf
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 6

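Because the RAC archived logs are stored on an NFS file system, the error above occurs when the RMAN archived log backup runs alter system archive log current.

Mount status of the relevant directories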

$ df -g
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
…………
/dev/lv_arch      50.00     43.24   14%      166     1% /arch1
oradb2:/arch2     50.00     35.31   30%      463     1% /arch2
/dev/baklv        90.00     83.82    7%       10     1% /backup


$ mount
…………
        /dev/lv_oracle   /oracle          jfs2   Oct 14 11:27 rw,log=/dev8
         /dev/lv_arch     /arch1           jfs2   Oct 14 11:27 rw,log=/dev8
     oradb2   /arch2           /arch2           nfs3   Oct 14 11:47

The mount output shows that the NFS file system is mounted with default options, which is why RMAN cannot work with the files on it.

Cause

From Oracle 10g R2, Oracle checks the options with which an NFS mount is mounted on the filesystem.
This is done to ensure that no corruption of the database can happen, as incorrectly
mounted NFS volumes can result in data corruption.

There is no single set of NFS mount options that works across all Oracle platforms.
Please ensure that you have the proper mount options specified by the NAS vendor / vendor user guide.


The exact checks used for an NFS mounted disk vary between platforms but in general 
the basic checks will include the following checks
 

a) The mount table (eg; /etc/mnttab) can be read to check the mount options 
b) The NFS mount is mounted with the "hard" option 
c) The mount options include rsize>=32768 and wsize>=32768 
d) For RAC environments, where NFS disks are supported, the "noac" mount option is used. 
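
Checks (b) to (d) can be approximated with a small shell function before running RMAN. This is a rough sketch, not Oracle's actual check: the function name `check_nfs_opts` is hypothetical, and the comma-separated option string is assumed to be supplied by the caller, since how to read it from the mount table differs between platforms.

```shell
#!/bin/sh
# check_nfs_opts OPTIONS [rac]
# Prints one line per failed check; returns non-zero if any check fails.
# Pass "rac" as the second argument to also require noac.
check_nfs_opts() {
    opts=$1
    rac=${2:-}
    rc=0
    hard=no
    noac=no
    rsize=0
    wsize=0

    # Split the comma-separated option string and pick out what we need.
    old_ifs=$IFS
    IFS=,
    for o in $opts; do
        case $o in
            hard)    hard=yes ;;
            noac)    noac=yes ;;
            rsize=*) rsize=${o#rsize=} ;;
            wsize=*) wsize=${o#wsize=} ;;
        esac
    done
    IFS=$old_ifs

    [ "$hard" = yes ] || { echo "missing: hard"; rc=1; }
    # A non-numeric value makes the test fail, which is reported as too small.
    [ "$rsize" -ge 32768 ] 2>/dev/null || { echo "rsize < 32768"; rc=1; }
    [ "$wsize" -ge 32768 ] 2>/dev/null || { echo "wsize < 32768"; rc=1; }
    if [ "$rac" = rac ] && [ "$noac" != yes ]; then
        echo "missing: noac"
        rc=1
    fi
    return $rc
}
```

Called with an empty option string and `rac`, as for the /arch2 mount shown earlier, it reports all four problems; with `rw,bg,hard,rsize=32768,wsize=32768,noac` it prints nothing and returns 0.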

Solution
1. Temporary workaround
As suggested in the bug the workaround recommended is to use the Event 10298.
alter system set events '10298 trace name context forever, level 32';

2. Permanent fix
See: http://www.orasos.com/3269.html

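Oracle 11.2.0.3: SYS.DBMS_WORKLOAD_REPOSITORY error when generating an AWR HTML report

Finding that AWR does not work just when you need to analyze database performance is as frustrating as reaching the battlefield and discovering your gun has no bullets. Today I hit exactly that: on 11.2.0.3 on Windows, generating an AWR report in HTML format fails. MOS shows that the problem occurs on many platforms (Windows, Linux, AIX and others), so watch out for it.
Database version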

SQL> SELECT * FROM V$VERSION;

BANNER
-------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

AWR error (html)

SQL> @?/rdbms/admin/awrrpt.sql
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "SYS.DBMS_WORKLOAD_REPOSITORY", line 919
ORA-06512: at line 1

Setting errorstack

SQL> alter session set events '6502 trace name errorstack level 12';

Session altered.

Analyzing the error

----- Error Stack Dump -----
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
----- Current SQL Statement for this session (sql_id=572fbaj0fdw2b) -----
select output from table(dbms_workload_repository.awr_report_html( :dbid,
                                                            :inst_num,
                                                            :bid, :eid,
                                                            :rpt_options ))


----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
94348684       919  package body SYS.DBMS_WORKLOAD_REPOSITORY
983BAD54         1  anonymous block
----- Call Stack Trace -----
_skdstdst()+121      CALLrel  _kgdsdst()           19D99520 2
_ksedst1()+93        CALLrel  _skdstdst()          19D99520 0 1 485816 4863B2
                                                   485816
_ksedst()+49         CALLrel  _ksedst1()           0 1
_dbkedDefDump()+368  CALLrel  _ksedst()            0
6                                                  
_ksedmp()+44         CALLrel  _dbkedDefDump()      C 0
_dbkdaKsdActDriver(  CALLreg  00000000             C
)+4209                                             
…………

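MOS shows a match with Bug 13575143, so we can be confident that this is the bug. Further testing shows that awrrpt is not the only script affected: any AWR report that produces HTML output can hit a similar error (for example awrrpti.sql, awrddrpt.sql and awrddrpi.sql). Further analysis also shows that the bug originates from Bug 6458801 (REPLACE on a CLOB can corrupt multibyte data, ID 6458801.8). That bug is described as fixed in 11.2.0.1, but the analysis here shows it is not actually fixed in 11.2.0.3. Oracle has not provided a good fix for this problem, so a workaround is the only option for now: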

They are able to generate the AWR report in the .txt format
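
Besides choosing the text format in awrrpt.sql, a text report can be pulled directly from the package. A minimal sketch, assuming `dbms_workload_repository.awr_report_text` is called with the same dbid, instance number, begin snapshot and end snapshot arguments as the HTML call in the trace above; the function name `awr_text_sql` and any ids passed to it are placeholders.

```shell
#!/bin/sh
# awr_text_sql DBID INST_NUM BEGIN_SNAP END_SNAP
# Prints a SQL*Plus script that selects the AWR report in text format.
# It does not connect to the database itself.
awr_text_sql() {
    if [ $# -ne 4 ]; then
        echo "usage: awr_text_sql DBID INST_NUM BEGIN_SNAP END_SNAP" >&2
        return 1
    fi
    printf '%s\n' \
        "set pagesize 0 linesize 80 trimspool on heading off feedback off" \
        "select output from table(dbms_workload_repository.awr_report_text($1, $2, $3, $4));"
}
```

The output can be saved to a file, for example `awr_text_sql <dbid> <inst_num> <bid> <eid> > awr_text.sql`, and run from SQL*Plus with spooling enabled to capture the report.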