CACHE BUFFERS CHAINS等待事件

关于CACHE BUFFERS CHAINS描述

CACHE BUFFERS CHAINS latch is acquired when searching for data blocks cached in the buffer cache. 
Since the Buffer cache is implemented as a sum of chains of blocks, each of those chains is protected 
by a child of this latch when needs to be scanned. Contention in this latch can be caused by very heavy 
access to a single block. This can require the application to be reviewed.

产生CACHE BUFFERS CHAINS原因

The main cause of the cache buffers chains latch contention is usually a hot block issue. 
This happens when multiple sessions repeatedly access one or more blocks that are protected 
by the same child cache buffers chains latch.

CACHE BUFFERS CHAINS 处理方法
1) Examine the application to see if the execution of certain DML and SELECT statements can be reorganized to eliminate contention on the object.

处理方法如下:
--通过报告确定latch: cache buffers chains 等待

Top 5 Timed Events                                      Avg    %Total
~~~~~~~~~~~~~~~~~~                                      wait   Call
Event                          Waits        Time (s)    (ms)   Time   Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
latch: cache buffers chains          74,642      35,421    475    6.1 Concurrenc
CPU time                                         11,422           2.0
log file sync                        34,890       1,748     50    0.3 Commit
latch free                            2,279         774    340    0.1 Other
db file parallel write               18,818         768     41    0.1 System I/O
-------------------------------------------------------------

--找出逻辑读高sql
SQL ordered by Gets         DB/Inst:  Snaps: 1-2
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> Total Buffer Gets:   265,126,882
-> Captured SQL account for   99.8% of Total


                            Gets                CPU      Elapsed
Buffer Gets    Executions   per Exec     %Total Time (s) Time (s)  SQL Id
-------------- ------------ ------------ ------ -------- --------- -------------
   256,763,367       19,052     13,477.0   96.8 ######## ######### a9nchgksux6x2
Module: JDBC Thin Client
SELECT * FROM SALES ....

     1,974,516      987,056          2.0    0.7    80.31    110.94 ct6xwvwg3w0bv
SELECT COUNT(*) FROM ORDERS ....

--逻辑读大对象
Segments by Logical Reads           
-> Total Logical Reads:     265,126,882
-> Captured Segments account for   98.5% of Total

           Tablespace                      Subobject  Obj.       Logical
Owner         Name    Object Name            Name     Type         Reads  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
DMSUSER    USERS      SALES                           TABLE  212,206,208   80.04
DMSUSER    USERS      SALES_PK                        INDEX   44,369,264   16.74
DMSUSER    USERS      SYS_C0012345                    INDEX    1,982,592     .75
DMSUSER    USERS      ORDERS_PK                       INDEX      842,304     .32
DMSUSER    USERS      INVOICES                        TABLE      147,488     .06
          -------------------------------------------------------------
处理思路:
1.Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary. 
  This may be within a single session or across multiple sessions.

2.Check for suboptimal SQL (this is the most common cause of the events)  
 look at the execution plan for the SQL being run and try to reduce the 
 gets per executions which will minimize the number of blocks being accessed 
 and therefore reduce the chances of multiple sessions contending for the same block.

Note:1342917.1 Troubleshooting ‘latch: cache buffers chains’ Wait Contention

2) Decrease the buffer cache -although this may only help in a small amount of cases.

3) DBWR throughput may have a factor in this as well.If using multiple DBWR’s then increase the number of DBWR’s.

4) Increase the PCTFREE for the table storage parameters via ALTER TABLE or rebuild. This will result in less rows per block.

找出热点对象
First determine which latch id(ADDR) are interesting by examining the number of 
sleeps for this latch. The higher the sleep count, the more interesting the 
latch id(ADDR) is:

SQL> select CHILD#  "cCHILD"
     ,      ADDR    "sADDR"
     ,      GETS    "sGETS"
     ,      MISSES  "sMISSES"
     ,      SLEEPS  "sSLEEPS" 
     from v$latch_children 
     where name = 'cache buffers chains'
     order by 5, 1, 2, 3;

Run the above query a few times to to establish the id(ADDR) that has the most 
consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found
then this latch address can be used to get more details about the blocks
currently in the buffer cache protected by this latch. 
The query below should be run just after determining the ADDR with 
the highest sleep count.

SQL> column segment_name format a35
     select /*+ RULE */
       e.owner ||'.'|| e.segment_name  segment_name,
       e.extent_id  extent#,
       x.dbablk - e.block_id + 1  block#,
       x.tch,
       l.child#
     from
       sys.v$latch_children  l,
       sys.x$bh  x,
       sys.dba_extents  e
     where
       x.hladdr  = '&ADDR' and
       e.file_id = x.file# and
       x.hladdr = l.addr and
       x.dbablk between e.block_id and e.block_id + e.blocks -1
     order by x.tch desc ;

Example of the output :
SEGMENT_NAME                     EXTENT#      BLOCK#       TCH    CHILD#
-------------------------------- ------------ ------------ ------ ----------
SCOTT.EMP_PK                       5            474          17     7,668
SCOTT.EMP                          1            449           2     7,668

Depending on the TCH column (The number of times the block is hit by a SQL 
statement), you can identify a hot block. The higher the value of the TCH column,
the more frequent the block is accessed by SQL statements.

5) Consider implementing reverse key indexes (if range scans aren’t commonly used against the segment)

xdb组件中关于ftp/http监听

发现监听进程监听8080和2100端口

--监听端口
[oracle@xifenfei ~]$ netstat -nap|grep tnslsnr
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:8080                0.0.0.0:*                   LISTEN      29866/tnslsnr
tcp        0      0 0.0.0.0:1521                0.0.0.0:*                   LISTEN      29866/tnslsnr
tcp        0      0 0.0.0.0:2100                0.0.0.0:*                   LISTEN      29866/tnslsnr

--进程名称
[oracle@xifenfei ~]$ ps -ef|grep 29866
oracle   29866     1  0 00:20 pts/0    00:00:00 /u01/oracle/9.2.0/db_1/bin/tnslsnr LISTENER -inherit

查看监听状态

[oracle@xifenfei ~]$ lsnrctl status

LSNRCTL for Linux: Version 9.2.0.4.0 - Production on 08-MAY-2012 00:26:50

Copyright (c) 1991, 2002, Oracle Corporation.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xifenfei)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 9.2.0.4.0 - Production
Start Date                08-MAY-2012 00:20:47
Uptime                    0 days 0 hr. 6 min. 3 sec
Trace Level               off
Security                  OFF
SNMP                      OFF
Listener Parameter File   /u01/oracle/9.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/oracle/9.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=8080))(Presentation=HTTP)(Session=RAW))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=2100))(Presentation=FTP)(Session=RAW))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "xffXDB" has 1 instance(s).
  Instance "xff", status READY, has 1 handler(s) for this service...
Service "xifenfei" has 2 instance(s).
  Instance "xff", status UNKNOWN, has 1 handler(s) for this service...
  Instance "xff", status READY, has 1 handler(s) for this service...
The command completed successfully
--从这里可以看出oracle的listener确实监听了8080和2100端口

查看listener.ora文件

[oracle@xifenfei ~]$ more /u01/oracle/9.2.0/db_1/network/admin/listener.ora
# LISTENER.ORA Network Configuration File: /u01/oracle/9.2.0/db_1/network/admin/listener.ora
# Generated by Oracle configuration tools.

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = xifenfei)(PORT = 1521))
      )
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
      )
    )
  )

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /u01/oracle/9.2.0/db_1)
      (PROGRAM = extproc)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = xifenfei)
      (ORACLE_HOME = /u01/oracle/9.2.0/db_1)
      (SID_NAME = xff)
    )
  )
--从中确认未监听8080和2100端口,看来这两个端口是在动态注册监听的时候加入进去的.
--我们知道oracle的xdb组件可能会启用http和ftp功能,这两个功能可能会开启响应端口,我们分析xdb组件.

查看xdb组件是否工作正常

SQL> select 
  2  comp_name, status, version from 
  3  DBA_REGISTRY where 
  4  comp_name='Oracle XML Database'; 

COMP_NAME                      STATUS                 VERSION
------------------------------ ---------------------- ----------
Oracle XML Database            VALID                  9.2.0.4.0

SQL> select  count(*) from  dba_objects where owner='XDB' and status='INVALID';

  COUNT(*)
----------
         0

SQL>   show parameter dispatchers;

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
dispatchers                          string                 (PROTOCOL=TCP) (SERVICE=xffXDB)
max_dispatchers                      integer                5
mts_dispatchers                      string                 (PROTOCOL=TCP) (SERVICE=xffXDB)
mts_max_dispatchers                  integer                5
--查询证明xdb组件应该工作正常

查看xdb中ftp和http启动相关端口

SQL> select dbms_xdb.GETFTPPORT() from dual; 
select dbms_xdb.GETFTPPORT() from dual
       *
ERROR at line 1:
ORA-00904: "DBMS_XDB"."GETFTPPORT": invalid identifier


SQL> select dbms_xdb.GETHTTPPORT() from dual; 
select dbms_xdb.GETHTTPPORT() from dual
       *
ERROR at line 1:
ORA-00904: "DBMS_XDB"."GETHTTPPORT": invalid identifier
--9i中为提供上述查询端口的相关程序.

SQL> set long 10000
SQL> set pagesize 0
SQL> SELECT dbms_xdb.cfg_get FROM dual;
--从中找到类此这样记录,确实ftp启用2100端口,http启用8080端口
     <ftpconfig>
        <ftp-port>2100</ftp-port>
        <ftp-listener>local_listener</ftp-listener>
        <ftp-protocol>tcp</ftp-protocol>
        <session-timeout>6000</session-timeout>
     </ftpconfig>
     <httpconfig>
        <http-port>8080</http-port>
        <http-listener>local_listener</http-listener>
        <http-protocol>tcp</http-protocol>
        <session-timeout>6000</session-timeout>
        <server-name>XDB HTTP Server</server-name>
     </httpconfig>
--到这里我们可以确定是由于xdb组件中的ftp和http功能自动注册导致监听了2100和8080端口

修改xdb中监听ftp和http端口

SQL> call dbms_xdb.cfg_update(updateXML(dbms_xdb.cfg_get(),'
   2 /xdbconfig/sysconfig/protocolconfig/httpconfig/http-port/text()',8888));

Call completed.

SQL> call dbms_xdb.cfg_update(updateXML(dbms_xdb.cfg_get(),
   2 '/xdbconfig/sysconfig/protocolconfig/ftpconfig/ftp-port/text()',2222));

Call completed.

SQL> commit;

Commit complete.

SQL>  exec dbms_xdb.cfg_refresh;

PL/SQL procedure successfully completed.

SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
[oracle@xifenfei ~]$ lsnrctl status

LSNRCTL for Linux: Version 9.2.0.4.0 - Production on 08-MAY-2012 00:57:13

Copyright (c) 1991, 2002, Oracle Corporation.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xifenfei)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 9.2.0.4.0 - Production
Start Date                08-MAY-2012 00:20:47
Uptime                    0 days 0 hr. 36 min. 26 sec
Trace Level               off
Security                  OFF
SNMP                      OFF
Listener Parameter File   /u01/oracle/9.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/oracle/9.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=8888))(Presentation=HTTP)(Session=RAW))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=2222))(Presentation=FTP)(Session=RAW))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "xffXDB" has 1 instance(s).
  Instance "xff", status READY, has 1 handler(s) for this service...
Service "xifenfei" has 2 instance(s).
  Instance "xff", status UNKNOWN, has 1 handler(s) for this service...
  Instance "xff", status READY, has 1 handler(s) for this service...
The command completed successfully

xdb中ftp和http监听

SQL> alter system reset dispatchers  scope=spfile sid='*';

System altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
Database mounted.
Database opened.

SQL> show parameter dispatchers;

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ---------
dispatchers                          string
max_dispatchers                      integer                5
mts_dispatchers                      string
mts_max_dispatchers                  integer                5

--SELECT dbms_xdb.cfg_get FROM dual;中信息
      <ftpconfig>
        <ftp-port>2222</ftp-port>
        <ftp-listener>local_listener</ftp-listener>
        <ftp-protocol>tcp</ftp-protocol>
        <session-timeout>6000</session-timeout>
      </ftpconfig>
      <httpconfig>
        <http-port>8888</http-port>
        <http-listener>local_listener</http-listener>
        <http-protocol>tcp</http-protocol>
        <session-timeout>6000</session-timeout>
        <server-name>XDB HTTP Server</server-name>
      </httpconfig>

[oracle@xifenfei dbs]$ lsnrctl status

LSNRCTL for Linux: Version 9.2.0.4.0 - Production on 08-MAY-2012 01:10:07

Copyright (c) 1991, 2002, Oracle Corporation.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xifenfei)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 9.2.0.4.0 - Production
Start Date                08-MAY-2012 00:20:47
Uptime                    0 days 0 hr. 49 min. 20 sec
Trace Level               off
Security                  OFF
SNMP                      OFF
Listener Parameter File   /u01/oracle/9.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/oracle/9.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "xifenfei" has 2 instance(s).
  Instance "xff", status UNKNOWN, has 1 handler(s) for this service...
  Instance "xff", status READY, has 1 handler(s) for this service...
The command completed successfully
--证明已经关闭了xdb 组件的ftp/http监听

xdb组件中的ftp/http监听在9i数据库中,只要你安装了xdb组件,会自动启用这功能.
在10g及其11g中默认不启用.所以为了你的数据库安全,如果不使用这些功能,建议手工关闭

使用dblink导致的/*+ OPAQUE_TRANSFORM */

数据库版本

--目标端
SQL> select * from v$version;

BANNER
-----------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 32-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

--源端
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

目标端创建dblink

SQL> create database link dblink_xff connect to test identified by
  2  test using 'ip/mcrm';

数据库链接已创建。

dblink查询操作测试

--目标端
SQL> select count(*) from t_xifenfei@dblink_xff;

  COUNT(*)
----------
     50645

--源端
SQL> select sql_text from v$sql where lower(sql_text) like '%t_xifenfei%' 
and sql_text not like '%lower(%' and sql_text not like '%OPT_DYN_SAMP%';

SQL_TEXT
-------------------------------------------------------------------
SELECT COUNT(*) FROM "T_XIFENFEI" "A1"
SELECT /*+ FULL(P) +*/ * FROM "T_XIFENFEI" P

dblink创建空表测试

--目标端
SQL> create table  chf.t_xifenfei as select * from t_xifenfei@dblink_xff where 1=0;

表已创建。

--源端
SQL> select sql_text from v$sql where lower(sql_text) like '%t_xifenfei%' 
and sql_text not like '%lower(%' and sql_text not like '%OPT_DYN_SAMP%';

SQL_TEXT
----------------------------------------------------------------------------
SELECT /*+ FULL(P) +*/ * FROM "T_XIFENFEI" P

dblink创建表插入数据

--目标端
SQL> create table  chf.t_xifenfei_new as select * from t_xifenfei@dblink_xff;

表已创建。

--源端
SQL> select sql_text from v$sql where lower(sql_text) like '%t_xifenfei%' 
and sql_text not like '%lower(%' and sql_text not like '%OPT_DYN_SAMP%';

SQL_TEXT
--------------------------------------------------------------------------------

SELECT /*+ FULL(P) +*/ * FROM "T_XIFENFEI" P
SELECT "OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJE

CT_TYPE","CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED",

"SECONDARY" FROM "T_XIFENFEI" "T_XIFENFEI"

dblink insert select插入数据测试

--目标端
SQL> insert into chf.t_xifenfei
  2  select * from t_xifenfei@dblink_xff;

已创建 50645 行。

--源端
SQL> select sql_text from v$sql where lower(sql_text) like '%t_xifenfei%' 
and sql_text not like '%lower(%' and sql_text not like '%OPT_DYN_SAMP%';

SQL_TEXT
--------------------------------------------------------------------------------
SELECT /*+ FULL(P) +*/ * FROM "T_XIFENFEI" P
SELECT /*+ OPAQUE_TRANSFORM */ "OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID

","DATA_OBJECT_ID","OBJECT_TYPE","CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS",

"TEMPORARY","GENERATED","SECONDARY" FROM "T_XIFENFEI" "T_XIFENFEI"

除掉OPAQUE_TRANSFORM 提示

--目标端
SQL> alter session set events '22825 trace name context forever, level 1' ;

会话已更改。

SQL> insert into chf.t_xifenfei
  2  select * from t_xifenfei@dblink_xff;

已创建 50645 行。

--源端
SQL> select sql_text from v$sql where lower(sql_text) like '%t_xifenfei%' and sq
l_text not like '%lower(%' and sql_text not like '%OPT_DYN_SAMP%';

SQL_TEXT
--------------------------------------------------------------------------------

SELECT /*+ FULL(P) +*/ * FROM "T_XIFENFEI" P
SELECT "OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJE

CT_TYPE","CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED",

"SECONDARY" FROM "T_XIFENFEI" "T_XIFENFEI"

通过dblink的相关实验可以得出,在 insert-as-remote-select的时候,源端库上会出现/*+ OPAQUE_TRANSFORM */的hint提示.该hint的作用是:给出源端目标端要求的数据类型的明确信息(The OPAQUE_TRANSFORM hint is to help with the transformation of datatype when certain type of operations are done within the database).屏蔽盖hint的方法是设置event:22825 trace name context forever, level 1(官方文档还提供了另外两种hint的方式屏蔽这个,但是我测试均未成功)

记录一次比较棘手数据库恢复要点

在最近的一次数据库异常恢复过程中遇到不少问题,把重点记录下
ORA-00704/ORA-01555错误

Fri May  4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May  4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name "_SYSSMU40$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs...

这里的提示可以看出obj$基表中有事务存在,查询这个表的时候,要去找40号回滚段中相关数据;通过非常规方法,
查找到40号回滚段的状态是offliine了(这个查询出来的信息和是否使用隐含参数无关).

问题原因,为什么40号回滚段变得offline?
Fri May  4 17:36:26 2012
alter tablespace undotbs offline
Fri May  4 17:36:26 2012
ORA-1109 signalled during: alter tablespace undotbs offline...
Fri May  4 17:37:29 2012
alter database datafile  '/dev/rundodbs01' offline drop
Fri May  4 17:37:29 2012
Completed: alter database datafile  '/dev/rundodbs01' offline drop
因为强制offline 了file# 2文件导致(一个undo表空间文件)

解决方法:
1.bbed提交事务
因为现在生产的trace文件中未有关于obj$ 未提交事务的记录,做10046也为发现该记录,如果要使用bbed修改该事务,
那需要dump obj$相关的数据块(在mount状态下dump),然后找到相关事务,再修改

2.强制让file# 2 online
因为在resetlogs前file#2 已经offline掉了,所以要使得该文件能够成功online,需要先推进scn

ORA-00600[krhpfh_03-1209]

SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [782415504],
[782428968], [3987078030], [2379], [0], [0]
ORA-01110: data file 2: '/dev/rundodbs01'

问题原因:
数据库处于非归档模式下,连续三次resetlogs,引起该bug

解决办法:
重建控制文件

但是这里问题出现了,因为file# 2的resetlogs scn和其他数据文件不一致,导致在file# 2 online的前提下,无法重建.
这样就处在了一个循环中(需要online file# 2 又要重建控制文件),这样的问题,可以通过bbed修改file# 2的resetlogs scn完成
或者先让file# 2 offline(没有加drop)掉,重建控制文件(除掉file# 2的文件记录)

ORA-00600[25025]

SMON: enabling cache recovery
Fri May  4 22:36:36 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1167402.trc:
ORA-00600: internal error code, arguments: [25025], [2], [], [], [], [], [], []
Fri May  4 22:36:38 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1167402.trc:
ORA-00600: internal error code, arguments: [25025], [2], [], [], [], [], [], []
Fri May  4 22:36:38 2012
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 1167402

错误原因:
因为有undo文件不在undo对应的表空间中,而我们的file# 2文件确实是undo文件,而且重建控制文件时候未加入进来

解决办法:
  undo_management          = AUTO
  undo_tablespace          = UNDODBS(file# 2属于该表空间)
修改为
  undo_management          = MANUAL
  undo_tablespace          = SYSTEM
或者bbed修改file# 2的header,然后重建控制文件

ORA-00600[4137]

Errors in file /oracle/admin/standdb/bdump/perfdb_smon_1290564.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Fri May  4 23:20:52 2012
create undo tablespace undotbs3 datafile '/dev/rundodbs21' size 20400M
Fri May  4 23:23:47 2012
Errors in file /oracle/admin/standdb/bdump/perfdb_smon_1290564.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Fri May  4 23:23:48 2012
Errors in file /oracle/admin/standdb/bdump/perfdb_pmon_1520126.trc:
ORA-00474: SMON process terminated with error
Fri May  4 23:23:48 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 1520126

错误原因:
_smon_internal_errlimit(limit of SMON internal errors) SMON遇到了内部错误,最大允许100次,
不断计数增长,达到100的时候,数据库smon进程自动down掉,从而导致数据库down

解决办法:
1.临时解决办法:设置_smon_internal_errlimit一个较大值
3.根本解决办法:使用undo隐含参数,删除有问题undo 回滚段和undo表空间或者使用10513 事件

Bind Variable Peeking 测试

相关参数

SQL> select * from v$version;

BANNER
---------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> show parameter optimizer_mode;

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ----------------
optimizer_mode                       string                 ALL_ROWS

SQL> show parameter cursor_sharing;

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ----------------
cursor_sharing                       string                 EXACT

SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
  8  /
Enter value for param: _optim_peek_user_binds
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%_optim_peek_user_binds%')

NAME                             VALUE                    DESCRIPTION
-------------------------------- ------------------------ ----------------------------------
_optim_peek_user_binds           TRUE                     enable peeking of user binds

创建模拟表

SQL> create table t_xifenfei(id number,name varchar2(30));

Table created.

SQL>  begin
  2     for i in 1..100000 loop
  3          insert into t_xifenfei values(i,'xifenfei');
  4      end loop;
  5      commit;
  6    end;
  7    /

PL/SQL procedure successfully completed.

SQL> update t_xifenfei SET name='www.orasos.com' where mod(id,20000)=0;

5 row updated.

SQL> commit;

Commit complete.

SQL> create index i_xifenfei on t_xifenfei(name);

Index created.

默认收集统计信息,查看执行计划

SQL> exec  DBMS_STATS.gather_table_stats(user,'T_XIFENFEI',CASCADE=>TRUE);

PL/SQL procedure successfully completed.

SQL> set autot trace exp 
SQL> select id from t_xifenfei where name='xifenfei';

Execution Plan
----------------------------------------------------------
Plan hash value: 548923532

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            | 50000 |   683K|   103   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI | 50000 |   683K|   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NAME"='xifenfei')

SQL> select id from t_xifenfei where name='www.orasos.com';

Execution Plan
----------------------------------------------------------
Plan hash value: 548923532

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            | 50000 |   683K|   103   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI | 50000 |   683K|   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NAME"='www.orasos.com')
--这里可以发现,对于这样少量的列的情况,没有选择一个合适的执行计划

准确收集统计信息

SQL> exec  DBMS_STATS.gather_table_stats(user,'T_XIFENFEI',CASCADE=>TRUE,
    2 method_opt => 'FOR ALL COLUMNS SIZE 254',estimate_percent => 100);

PL/SQL procedure successfully completed.

再次查看执行计划

SQL> select id from t_xifenfei where name='www.orasos.com';


Execution Plan
----------------------------------------------------------
Plan hash value: 1926396081

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |    14 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI |     1 |    14 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_XIFENFEI |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("NAME"='www.orasos.com')


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
        320  consistent gets
          0  physical reads
          0  redo size
        418  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

SQL> select id from t_xifenfei where name='xifenfei';

99995 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 548923532

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            | 99999 |  1367K|   103   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI | 99999 |  1367K|   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NAME"='xifenfei')


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
       6970  consistent gets
          0  physical reads
          0  redo size
    1455968  bytes sent via SQL*Net to client
      73745  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      99995  rows processed
--通过这里可以看出在完整的收集表和index包括直方图信息后,数据库执行计划正常
--也说明一点:在数据列分布不均匀的时候,依靠数据库自动收集直方图还是不怎么拷贝.

使用AUTOTRACE测试

SQL> set autot trace exp
SQL> var a varchar2(30);
SQL> exec :a := 'www.orasos.com';

PL/SQL procedure successfully completed.

SQL> select id from t_xifenfei where name=:a;

Execution Plan
----------------------------------------------------------
Plan hash value: 548923532

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            | 50000 |   683K|   103   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI | 50000 |   683K|   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("NAME"=:A)
--这里可以发现11g的Bind Variable Peeking 没有使用正确的执行计划,其实这个是AUTOTRACE本身的bug导致

收集下面sql执行计划(peeking测试需要)get_plan.sql脚本

SQL> select * from t_xifenfei where name='wwww.orasos.com' and id=100;

no rows selected

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 2708637417
select * from t_xifenfei where name='wwww.orasos.com' and id=100

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI |     1 |    14 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_XIFENFEI |     3 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=100)
   2 - access("NAME"='wwww.orasos.com')
SQL>  select * from t_xifenfei where name='xifenfei' and id=100;

        ID NAME
---------- ------------------------------------------------------------
       100 xifenfei

1 row selected.

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 1355242984
 select * from t_xifenfei where name='xifenfei' and id=100

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |   103 (100)|          |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI |     1 |    14 |   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ID"=100 AND "NAME"='xifenfei'))
--这里可以看到,两个执行计划都我们希望的

测试peeking功能

SQL> alter system flush shared_pool;

System altered.

SQL> select * from t_xifenfei where name='xifenfei' and id=100;

        ID NAME
---------- ------------------------------------------------------------
       100 xifenfei

1 row selected.

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 2860562673
select * from t_xifenfei where name='xifenfei' and id=100

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |   103 (100)|          |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI |     1 |    14 |   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ID"=100 AND "NAME"='xifenfei'))
SQL> var b varchar2(30);
SQL> exec :b := 'www.orasos.com';

PL/SQL procedure successfully completed.

SQL> select * from t_xifenfei where name=:b and id=100;

no rows selected

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 4157424768
select * from t_xifenfei where name=:b and id=100

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |   103 (100)|          |
|*  1 |  TABLE ACCESS FULL| T_XIFENFEI |     1 |    14 |   103   (1)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ID"=100 AND "NAME"=:B))

--重新硬解析
SQL> alter system flush shared_pool;

System altered.

SQL> var b varchar2(30);
SQL> exec :b := 'www.orasos.com';

PL/SQL procedure successfully completed.

SQL> select * from t_xifenfei where name=:b and id=100;

no rows selected

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 4157424768
select * from t_xifenfei where name=:b and id=100

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI |     1 |    14 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_XIFENFEI |     6 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=100)
   2 - access("NAME"=:B)
SQL> var b varchar2(30);
SQL> exec :b := 'xifenfei';

PL/SQL procedure successfully completed.

SQL> select * from t_xifenfei where name=:b and id=100;

        ID NAME
---------- ------------------------------------------------------------
       100 xifenfei

1 row selected.

SQL> @get_plan.sql

Rollback complete.

Enter value for hash_value: 4157424768
select * from t_xifenfei where name=:b and id=100

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T_XIFENFEI |     1 |    14 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I_XIFENFEI |     6 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"=100)
   2 - access("NAME"=:B)
--虽然oracle 11g宣称在在Bind Variable Peeking上增强了很多,
--但是这里的实验,依然证明他存在问题,导致执行计划不正确

通过整体实验过程,证明几个问题:
1.默认的的DBMS_STATS收集统计信息不一定使得所有执行计划均正确,特别在数据很不均匀分布时.
2.AUTOTRACE不能跟踪Bind Variable Peeking
3.Bind Variable Peeking是在硬解析时候生效,虽然11g进行了改善,但是有些时候效果还是不明显,如果数据很不均匀,在发现sql语句很多不合适的时候,建议先删除该sql的执行计划,让其再次硬解析,碰碰运气,如果一直效果不好,建议不适用绑定参数形式(正确的执行计划,更多的硬解析)
4._optim_peek_user_binds参数可以关闭Bind Variable Peeking功能,很不推荐.