ORA-00600[kcfrbd_3]故障解决

朋友一数据库因为断电,被重建控制文件等操作折腾的最后出现在启动的时候出现ORA-00600[kcfrbd_3]

Wed Dec 05 10:26:34 2012
Thread 1 advanced to log sequence 11
Thread 1 opened at log sequence 11
  Current log# 1 seq# 11 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Wed Dec 05 10:26:34 2012
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Dec 05 10:26:34 2012
SMON: enabling cache recovery
Wed Dec 05 10:26:35 2012
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Wed Dec 05 10:26:35 2012
SMON: enabling tx recovery
Wed Dec 05 10:26:35 2012
Database Characterset is ZHS16GBK
Wed Dec 05 10:26:35 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []

replication_dependency_tracking turned off (no async multimaster replication found)
Wed Dec 05 10:26:36 2012
Fatal internal error happened while SMON was doing active transaction recovery.
Wed Dec 05 10:26:36 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []

SMON: terminating instance due to error 474

这个错误很明显:数据库已经open成功了,但是因为有事务不能正常被回滚,然后数据库的smon进程异常,从而使得数据库不能正常启动,解决该问题的方法也是很简单,就是常规的undo处理思路(使用人工undo管理,event屏蔽事务,隐含参数屏蔽回滚段),然后重建undo表空间,这个时候可以结合txchecker来检测是否有异常事务:如果有重要基表对象异常,需要重建库;如果是个别其他对象异常,可以通过重建该对象解决

创建包含null值index

在不少时候我们需要对一个值是否是null的查询,根据Oracle的特点,我们单纯在在这个列上创建一个index不能满足这个需求,因为b-tree index中就是不包含null列.通过创建含常数列的复合index可以满足该需求
数据库版本

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

创建测试表

SQL> create table t_xifenfei 
  2  as
  3  select * from dba_objects;

Table created.

SQL> desc t_xifenfei;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 OBJECT_NAME                                        VARCHAR2(128)
 SUBOBJECT_NAME                                     VARCHAR2(30)
 OBJECT_ID                                          NUMBER
 DATA_OBJECT_ID                                     NUMBER
 OBJECT_TYPE                                        VARCHAR2(19)
 CREATED                                            DATE
 LAST_DDL_TIME                                      DATE
 TIMESTAMP                                          VARCHAR2(19)
 STATUS                                             VARCHAR2(7)
 TEMPORARY                                          VARCHAR2(1)
 GENERATED                                          VARCHAR2(1)
 SECONDARY                                          VARCHAR2(1)

创建可能含null列index

SQL> create index ind_object_id on t_xifenfei(object_id);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'T_XIFENFEI',cascade => true);

PL/SQL procedure successfully completed.

查看执行计划

SQL> SET AUTOT TRACE EXP stat
SQL> SELECT * FROM T_XIFENFEI WHERE OBJECT_ID IS NULL;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 548923532

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |     1 |    95 |   159   (2)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI |     1 |    95 |   159   (2)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID" IS NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        695  consistent gets
          0  physical reads
          0  redo size
        995  bytes sent via SQL*Net to client
        389  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

可以看到我们创建的一个普通的index,查询判断是否为null的时候,没有被应用该index,而是直接使用全表扫描.

创建支持null index

SQL> drop index ind_object_id ;

Index dropped.

SQL> create index ind_object_id on t_xifenfei(object_id,0);  

Index created.

SQL>  exec dbms_stats.gather_table_stats(user,'T_XIFENFEI',cascade => true);

PL/SQL procedure successfully completed.

查看执行计划

SQL> SELECT * FROM T_XIFENFEI WHERE OBJECT_ID IS NULL;

Execution Plan
----------------------------------------------------------
Plan hash value: 804765899

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |    95 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI    |     1 |    95 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID" IS NULL)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        995  bytes sent via SQL*Net to client
        389  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

这里使用了index,并且执行中逻辑读大幅度下降,很大程度提高了程序的执行效率,逻辑读从695降低为2.

原因分析
建立一个包含列和常数的复合index,可以实现该需求,根据b-tree index的特点,只有当index中包含的列都为null的时候,才不会别在index中记录,因为设置了index中包含的常数列,所以就是列为null,也会被包含在该index中,从而查询null值的时候依然可以使用到该index

v$sgainfo中Free SGA Memory Available的各种情况解释

今天网友问到了v$sgainfo中的Free SGA Memory Available的一些情况,开始我也比较迷糊,为什么会出现Free SGA Memory Available的值不为0,通过查询一些试验和测试,对这个问题进行了有力的说明
数据库版本10G

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

当前SGA各组件信息

SQL> select name,bytes/1024/1024 from v$sgainfo;

NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20853043
Redo Buffers                           6.7890625
Buffer Cache Size                            192
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             304
Startup overhead in Shared Pool               36
Free SGA Memory Available                      0

11 rows selected.

sga配置

SQL> show parameter sga;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ---------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 304M
sga_target                           big integer 304M

在10g中,当采用asmm管理时,如果sga_max_size=sga_target,则Free SGA Memory Available为0

修改sga_target

SQL> alter system set sga_target=290M;

System altered.

再次查询v$sgainfo

SQL> select name,bytes/1024/1024 from v$sgainfo;

NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20853043
Redo Buffers                           6.7890625
Buffer Cache Size                            180
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             304
Startup overhead in Shared Pool               36
Free SGA Memory Available                     12

11 rows selected.

再次查看sga_target值

SQL> show parameter sga;

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 304M
sga_target                           big integer 292M

在10g中,当采用asmm管理时,如果sga_max_size>sga_target,则Free SGA Memory Available为sga_max_size-sga_target

找出sga_target修改为292M而不是290M原因

SQL> select component, granule_size from v$sga_dynamic_components;

COMPONENT                                                        GRANULE_SIZE
---------------------------------------------------------------- ------------
shared pool                                                           4194304
large pool                                                            4194304
java pool                                                             4194304
streams pool                                                          4194304
DEFAULT buffer cache                                                  4194304
KEEP buffer cache                                                     4194304
RECYCLE buffer cache                                                  4194304
DEFAULT 2K buffer cache                                               4194304
DEFAULT 4K buffer cache                                               4194304
DEFAULT 8K buffer cache                                               4194304
DEFAULT 16K buffer cache                                              4194304
DEFAULT 32K buffer cache                                              4194304
ASM Buffer Cache                                                      4194304

13 rows selected.

SQL> select 4194304/1024/1024 from dual;

4194304/1024/1024
-----------------
                4

SQL> select 292/4 from dual;

     292/4
----------
        73

因为sga的内存分配是按照GRANULE为单位进行的,而在该库中sga对应组件的GRANULE为4M,所以我们修改的290M的最近的GRANULE整数倍为292M

sga_target为0的情况

SQL> show parameter sga_target;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
sga_target                           big integer 0

SQL> select name,bytes/1024/1024 from v$sgainfo;

NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.20846176
Redo Buffers                           6.7890625
Buffer Cache Size                            180
Shared Pool Size                              88
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Granule Size                                   4
Maximum SGA Size                             292
Startup overhead in Shared Pool               36
Free SGA Memory Available                      0

11 rows selected.

在10g中,如果不采用asmm内存管理模式,Free SGA Memory Available为0

11g数据库版本

SQL> select * from v$version;

BANNER
-----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

查询v$sgainfo信息

SQL> select name,bytes/1024/1024 from v$sgainfo;

NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.28236008
Redo Buffers                          6.03515625
Buffer Cache Size                             20
Shared Pool Size                             116
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Shared IO Pool Size                            0
Granule Size                                   4
Maximum SGA Size                      299.320313
Startup overhead in Shared Pool       52.2684898
Free SGA Memory Available                    140

oracle内存相关参数设置

SQL> show parameter memory;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 300M
memory_target                        big integer 300M
shared_memory_address                integer     0

SQL> show parameter pga;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------
pga_aggregate_target                 big integer 0

数据库动态组件内存分配

SQL> select COMPONENT,CURRENT_SIZE/1024/1024 CURRENT_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;

COMPONENT                                                        CURRENT_SIZE
---------------------------------------------------------------- ------------
shared pool                                                               116
large pool                                                                  4
java pool                                                                   4
streams pool                                                                8
SGA Target                                                                160
DEFAULT buffer cache                                                       20
KEEP buffer cache                                                           0
RECYCLE buffer cache                                                        0
DEFAULT 2K buffer cache                                                     0
DEFAULT 4K buffer cache                                                     0
DEFAULT 8K buffer cache                                                     0
DEFAULT 16K buffer cache                                                    0
DEFAULT 32K buffer cache                                                    0
Shared IO Pool                                                              0
PGA Target                                                                140
ASM Buffer Cache                                                            0

16 rows selected.

初步怀疑:在11g的amm内存管理模式下Free SGA Memory Available和PGA Target相等

尝试修改pga_aggregate_target值

SQL> alter system set pga_aggregate_target=150M;

System altered.

再次查询v$sgainfo

SQL>  select name,bytes/1024/1024 from v$sgainfo;

NAME                             BYTES/1024/1024
-------------------------------- ---------------
Fixed SGA Size                        1.28236008
Redo Buffers                          6.03515625
Buffer Cache Size                              8
Shared Pool Size                             116
Large Pool Size                                4
Java Pool Size                                 4
Streams Pool Size                              8
Shared IO Pool Size                            0
Granule Size                                   4
Maximum SGA Size                      299.320313
Startup overhead in Shared Pool       52.2684898
Free SGA Memory Available                    152

查询动态组件内存分布

SQL> select COMPONENT,CURRENT_SIZE/1024/1024 CURRENT_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;

COMPONENT                                                        CURRENT_SIZE
---------------------------------------------------------------- ------------
shared pool                                                               116
large pool                                                                  4
java pool                                                                   4
streams pool                                                                8
SGA Target                                                                148
DEFAULT buffer cache                                                        8
KEEP buffer cache                                                           0
RECYCLE buffer cache                                                        0
DEFAULT 2K buffer cache                                                     0
DEFAULT 4K buffer cache                                                     0
DEFAULT 8K buffer cache                                                     0
DEFAULT 16K buffer cache                                                    0
DEFAULT 32K buffer cache                                                    0
Shared IO Pool                                                              0
PGA Target                                                                152
ASM Buffer Cache                                                            0

16 rows selected.

进一步证明:在11g的amm内存管理模式下Free SGA Memory Available和PGA Target相等

设置pga_aggregate_target为150M,PGA Target为152M原因分析

SQL> select COMPONENT,GRANULE_SIZE from V$MEMORY_DYNAMIC_COMPONENTS;

COMPONENT                                                        GRANULE_SIZE
---------------------------------------------------------------- ------------
shared pool                                                           4194304
large pool                                                            4194304
java pool                                                             4194304
streams pool                                                          4194304
SGA Target                                                            4194304
DEFAULT buffer cache                                                  4194304
KEEP buffer cache                                                     4194304
RECYCLE buffer cache                                                  4194304
DEFAULT 2K buffer cache                                               4194304
DEFAULT 4K buffer cache                                               4194304
DEFAULT 8K buffer cache                                               4194304
DEFAULT 16K buffer cache                                              4194304
DEFAULT 32K buffer cache                                              4194304
Shared IO Pool                                                        4194304
PGA Target                                                            4194304
ASM Buffer Cache                                                      4194304

16 rows selected.

原理同上述10g中的sga_target分析,不再重复,主要就是:150M不能被4M整除,所以取最近的152M

整体总结
1.如果不采用asmm和amm,Free SGA Memory Available为0
2.如果采用asmm,当sga_max_size=sga_target,则Free SGA Memory Available为0
3.如果采用asmm,当sga_max_size>sga_target,则Free SGA Memory Available为sga_max_size-sga_target
4.如果采用amm,Free SGA Memory Available和PGA Target相等

WARNING: Subscription for node down event still pending

监听日志中出现很多”WARNING: Subscription for node down event still pending”警告

--监听日志中内容
……
26-NOV-2012 09:35:48 * ping * 0
WARNING: Subscription for node down event still pending
……
26-NOV-2012 09:45:49 * ping * 0
WARNING: Subscription for node down event still pending
……

--监听日志大小
$ ls -l /oracle/product/10g/network/log/listener.log
-rw-r-----    1 oracle   dba       229526148 Nov 26 14:20 /oracle/product/10g/network/log/listener.log

--总计条数
$ grep "Subscription for node down event still pending" \
> /oracle/product/10g/network/log/listener.log|wc -l
   77306

数据库版本和平台

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Productio
NLSRTL Version 10.2.0.5.0 - Production

监听日志配置

$ more /oracle/product/10g/network/admin/listener.ora
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /oracle/product/10g)
      (PROGRAM = extproc)
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = xifenfei)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )

解决方法

--监听文件中增加
SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF

--reload监听
lsnrctl reload

补充说明

SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF will prevent the messages 
from being written to the log file and may also prevent the TNS Listener from hanging periodically. 

Setting SUBSCRIBE_FOR_NODE_DOWN_<listener_name> to OFF disables 
a necessary RAC functionality(Oracle Notification Service[ONS],fast application notification[FAN]). 
The above workaround is recommended only for non-RAC environments. 
The issue may be present in all 10g and newer installations.

具体参考:372959.1和340091.1

新删除data guard归档日志shell脚本

以前写过删除dataguard归档日志的方法(删除data guard归档日志),但是以前的方法确实不够灵活也不够简便,现在提供最新的一次在客户现场部署的dg删除归档日志的shell脚本

#!/bin/sh
source ~/.bash_profile

grep "Media Recovery Log" $ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_${ORACLE_SID}.log|
\ awk '{print $4}'|sed -e 's/^/rm /' >/tmp/rmarchlog.sh
chmod +x /tmp/rmarchlog.sh

/tmp/rmarchlog.sh
cd $ORACLE_BASE/admin/$ORACLE_SID/bdump

cat alert_${ORACLE_SID}.log >>alert_${ORACLE_SID}.log.bak
echo ''>alert_${ORACLE_SID}.log

rm -f /tmp/rmarchlog.sh


$ORACLE_HOME/bin/rman target / <<XIFENFEI
crosscheck archivelog all;
delete expired archivelog all;
YES
exit;
XIFENFEI

根据alert日志中dg应用日志的信息”Media Recovery Log”信息来删除掉相关的归档日志,可以保证应用过的归档日志都被删除,而没有应用的归档日志都保留.