Testing methods to improve short-connection performance

Posted on May 12, 2012 by 惜分飞

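Create the test scripts
Run the test_login.sh script in three sessions at the same time to simulate database performance under many short-lived connections.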

[oracle@xifenfei tmp]$ more test_login.sh 
#!/bin/bash
echo "start login database `date`*********" >>/tmp/test_1.log
e=2000
for((i=1;i<=$e;i=i+1))
do
/tmp/login_oracle.sh 
done
echo "end login database `date`*********" >>/tmp/test_1.log

[oracle@xifenfei tmp]$ more login_oracle.sh 
#!/bin/bash
sqlplus chf/xifenfei@ORA11G_P<<XFF>/dev/null 
select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') from dual;
exit
XFF

--Replace ORA11G_P with the TNS alias that matches each test scenario
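
The three concurrent sessions can also be started from one shell instead of three terminals. The following is only a sketch: run_parallel is a hypothetical helper name, not part of the original test, and it assumes /tmp/test_login.sh is executable.

```shell
# run_parallel N CMD [ARGS...]: start N copies of CMD in the background, then wait.
# Hypothetical helper; the original test used three separate sessions.
run_parallel() {
  n=$1; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" &              # each copy runs as its own background job
    i=$(( i + 1 ))
  done
  wait                  # block until every background copy has finished
}

# Usage (script path taken from the listing above):
# run_parallel 3 /tmp/test_login.sh
```

All copies append to the same /tmp/test_1.log, so the start and end lines of the three runs end up interleaved in one file, as in the results shown for each case.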

Case 1: a single listener

start login database Tue May  1 18:03:32 CST 2012*********
start login database Tue May  1 18:03:35 CST 2012*********
start login database Tue May  1 18:03:37 CST 2012*********

end login database Tue May  1 18:08:20 CST 2012*********
end login database Tue May  1 18:08:25 CST 2012*********
end login database Tue May  1 18:08:26 CST 2012*********

--Elapsed time (end minus start, in log order) for 2000 sessions to log in, query, and exit
4:48
4:50
4:49
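
The elapsed times are simply end time minus start time. A minimal sketch of that arithmetic follows; to_seconds and elapsed are hypothetical helper names, and the sketch assumes both timestamps fall on the same day.

```shell
# to_seconds HH:MM:SS -> seconds since midnight
to_seconds() {
  # split the argument on ":" into positional parameters
  old_ifs=$IFS; IFS=:; set -- $1; IFS=$old_ifs
  # strip one leading zero so values such as "08" are not read as octal
  h=${1#0}; m=${2#0}; s=${3#0}
  echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# elapsed START END -> M:SS (assumes START and END are on the same day)
elapsed() {
  d=$(( $(to_seconds "$2") - $(to_seconds "$1") ))
  printf '%d:%02d\n' $(( d / 60 )) $(( d % 60 ))
}

elapsed 18:03:32 18:08:20   # first session of case 1, prints 4:48
```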

Case 2: three listeners, with TNS load balancing configured on the client

--Listener configuration
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1521))
    )
  )
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
     (GLOBAL_DBNAME = ora11g)
     (ORACLE_HOME = /u01/oracle/oracle/product/11.2.0/db_1)
     (SID_NAME = ora11g)
    )
  )
ADR_BASE_LISTENER = /u01/oracle

LISTENER1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1522))
    )
  )
SID_LIST_LISTENER1 =
  (SID_LIST =
    (SID_DESC =
     (GLOBAL_DBNAME = ora11g)
     (ORACLE_HOME = /u01/oracle/oracle/product/11.2.0/db_1)
     (SID_NAME = ora11g)
    )
  )
ADR_BASE_LISTENER1 = /u01/oracle


LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1523))
    )
  )
SID_LIST_LISTENER2 =
  (SID_LIST =
    (SID_DESC =
     (GLOBAL_DBNAME = ora11g)
     (ORACLE_HOME = /u01/oracle/oracle/product/11.2.0/db_1)
     (SID_NAME = ora11g)
    )
  )
ADR_BASE_LISTENER2 = /u01/oracle

--TNS configuration
ORA11G_M =
  (DESCRIPTION =
      (LOAD_BALANCE=ON)
      (FAILOVER=ON)
      (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1521))
       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1522))
       (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1523))
       (LOAD_BALANCE = yes)
    )
    (CONNECT_DATA =
     (SERVER=DEDICATED)
      (SERVICE_NAME = ora11g)
    )
  )

--Test results
start login database Tue May  1 17:51:45 CST 2012*********
start login database Tue May  1 17:51:49 CST 2012*********
start login database Tue May  1 17:51:51 CST 2012*********

end login database Tue May  1 17:55:58 CST 2012*********
end login database Tue May  1 17:56:06 CST 2012*********
end login database Tue May  1 17:56:09 CST 2012*********

--Elapsed time (end minus start, in log order) for 2000 sessions to log in, query, and exit
4:13
4:17
4:18

Case 3: Database Resident Connection Pooling (DRCP, new in 11g)

--Start the default DRCP pool
SQL> exec dbms_connection_pool.start_pool();

PL/SQL procedure successfully completed.

--TNS configuration
ORA11G_P =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1521))
    )
    (CONNECT_DATA =
     (SERVER=POOLED)  --note: POOLED requests a DRCP pooled server
      (SERVICE_NAME = ora11g)
    )
  )
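
Before running the test it can help to confirm that the pool is actually active. The sketch below follows the heredoc style of login_oracle.sh and queries the 11g dictionary view DBA_CPOOL_INFO. The function name drcp_status_sql and the "/ as sysdba" login in the usage line are assumptions, not part of the original test.

```shell
# drcp_status_sql: print a SQL*Plus script that reports the DRCP pool state.
# Hypothetical helper; DBA_CPOOL_INFO is the 11g view describing connection pools.
drcp_status_sql() {
  cat <<'SQL'
set pagesize 100 linesize 200
select connection_pool, status, minsize, maxsize
  from dba_cpool_info;
exit
SQL
}

# Usage (assumes OS authentication as a DBA user):
# drcp_status_sql | sqlplus -s / as sysdba
```

After dbms_connection_pool.start_pool() the default pool should be reported with STATUS = ACTIVE.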

--Test results
start login database Tue May  1 18:19:58 CST 2012*********
start login database Tue May  1 18:20:01 CST 2012*********
start login database Tue May  1 18:20:03 CST 2012*********

end login database Tue May  1 18:23:16 CST 2012*********
end login database Tue May  1 18:23:19 CST 2012*********
end login database Tue May  1 18:23:21 CST 2012*********

--Elapsed time (end minus start, in log order) for 2000 sessions to log in, query, and exit
3:18
3:18
3:18

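Summary
If the listener turns out to be the bottleneck for short-lived connections, consider running several listeners with TNS load balancing, which relieves the listener bottleneck to some extent. On an 11g database, consider the new DRCP feature, which improves the efficiency of short-lived connections considerably. Because DRCP is still a new feature in 11g, its stability is unproven, so be sure to test it thoroughly before using it. Where possible, the better option is still to have the application use long-lived connections as much as it can, which improves database session efficiency.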

shared pool latch wait event

Posted on May 11, 2012 by 惜分飞

About the shared pool latch

The shared pool latch is used to protect critical operations when allocating 
and freeing memory in the shared pool. 
If an application makes use of literal (unshared) SQL then this can severely 
limit scalability and throughput. The cost of parsing a new SQL statement is 
expensive both in terms of CPU requirements and the number of times the library 
cache and shared pool latches may need to be acquired and released. Before Oracle9, 
there was just one such latch for the entire database to protect the allocation of 
memory in the library cache. In Oracle9, multiple children were introduced to relieve
contention on this resource.

Ways to reduce shared pool latch contention

Avoid hard parses when possible, parse once, execute many. 
Eliminate  literal SQL so that same sql is shared by many sessions.
Size the shared_pool adequately to avoid reloads
Use of MTS (shared server option) also greatly influences the shared pool latch.

Finding SQL that does not use bind variables

--9i
SELECT substr(sql_text,1,40) "SQL", 
         count(*) , 
         sum(executions) "TotExecs"
    FROM v$sqlarea
   WHERE executions < 5
   GROUP BY substr(sql_text,1,40)
  HAVING count(*) > 30
   ORDER BY 2
  ;

--10g and later
SET pages 10000
SET linesize 250
column FORCE_MATCHING_SIGNATURE format 99999999999999999999999
WITH c AS
     (SELECT  FORCE_MATCHING_SIGNATURE,
              COUNT(*) cnt
     FROM     v$sqlarea
     WHERE    FORCE_MATCHING_SIGNATURE!=0
     GROUP BY FORCE_MATCHING_SIGNATURE
     HAVING   COUNT(*) > 20
     )
     ,
     sq AS
     (SELECT  sql_text                ,
              FORCE_MATCHING_SIGNATURE,
              row_number() over (partition BY FORCE_MATCHING_SIGNATURE ORDER BY sql_id DESC) p
     FROM     v$sqlarea s
     WHERE    FORCE_MATCHING_SIGNATURE IN
              (SELECT FORCE_MATCHING_SIGNATURE
              FROM    c
              )
     )
SELECT   sq.sql_text                ,
         sq.FORCE_MATCHING_SIGNATURE,
         c.cnt "unshared count"
FROM     c,
         sq
WHERE    sq.FORCE_MATCHING_SIGNATURE=c.FORCE_MATCHING_SIGNATURE
AND      sq.p                       =1
ORDER BY c.cnt DESC;

Checking the overall parse profile of the database

select to_char(100 * sess / calls, '999999999990.00') || '%' cursor_cache_hits, 
to_char(100 * (calls - sess - hard) / calls, '999990.00') || '%' soft_parses, 
to_char(100 * hard / calls, '999990.00') || '%' hard_parses 
from ( select value calls from v$sysstat where name = 'parse count (total)' ), 
( select value hard from v$sysstat where name = 'parse count (hard)' ), 
( select value sess from v$sysstat where name = 'session cursor cache hits' );
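
The three percentages are plain ratios against parse count (total). The sketch below repeats the arithmetic outside the database. The function name parse_ratios and the input values are made up for illustration; they are not measurements from any system.

```shell
# parse_ratios CALLS SESS HARD: reproduce the three ratios computed by the query.
#   CALLS = parse count (total), SESS = session cursor cache hits,
#   HARD  = parse count (hard). Hypothetical helper with made-up inputs.
parse_ratios() {
  awk -v calls="$1" -v sess="$2" -v hard="$3" 'BEGIN {
    printf "cursor_cache_hits=%.2f%%\n", 100 * sess / calls
    printf "soft_parses=%.2f%%\n",       100 * (calls - sess - hard) / calls
    printf "hard_parses=%.2f%%\n",       100 * hard / calls
  }'
}

parse_ratios 1000 600 50
# cursor_cache_hits=60.00%
# soft_parses=35.00%
# hard_parses=5.00%
```

The three values always add up to 100%, because every parse call is counted as a session cursor cache hit, a hard parse, or (by subtraction) a soft parse.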

Reference: Note 62143.1 Troubleshooting: Tuning the Shared Pool and Tuning Library Cache Latch

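Oracle 8.0.5 installation screenshots

Posted on May 11, 2012 by 惜分飞

This gallery contains 7 photos.

Newer-generation DBAs (myself included) rarely get a chance to work with an ORACLE 8.0.5 database. I happened to get hold of the installation package for this version today, so I installed it right away and took screenshots to share. Download the complete ORACLE 8.0.5 installation screenshots: oracle_8.0.5_install_pic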

cache buffers lru chain latch wait event

Posted on May 10, 2012 by 惜分飞

Official explanation of the cache buffers lru chain latch

The cache buffer lru chain latch is acquired in order to introduce a new block into the buffer cache 
and when writing a buffer back to disk, specifically when trying to scan the LRU (least recently used) chain 
containing all the dirty blocks in the buffer cache.

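Possible causes of cache buffers lru chain latch contention

A process that wants to read or modify the LRU and LRUW lists must hold the cache buffers lru chain latch the whole time.
If contention occurs while doing so, the process waits on the latch: cache buffers lru chain event.
In summary, the cache buffers lru chain latch is acquired in the following two situations:
1. When a process wants to read a block that is not yet loaded in memory, it searches the LRU list for a free buffer to allocate, and it needs the cache buffers lru chain latch while doing so.
2. When DBWR writes dirty buffers to the data files, it searches the LRUW list and moves the corresponding buffers to the LRU list, and it must also hold the cache buffers lru chain latch while doing so.
2.1) DBWR writes dirty buffers to the data files in the following situations:
2.2) An Oracle process asks DBWR to write dirty buffers in order to obtain free buffers.
2.3) An Oracle process asks for the dirty buffers of the affected objects to be written for operations such as Parallel Query, Tablespace Backup, or Truncate/Drop.
2.4) A checkpoint is performed, either periodically or for administrative reasons.
2.5) Oracle performs periodic checkpoints to guarantee recovery within the time specified by FAST_START_MTTR_TARGET (or LOG_CHECKPOINT_TIMEOUT).
2.6) A checkpoint also occurs when the administrator issues a checkpoint command or when a log file switch takes place.

The most important cause of cache buffers lru chain latch contention is an excessive number of requests for free buffers. Inefficient SQL is the most typical source of such requests: when several sessions run inefficient SQL at the same time, they contend for the cache buffers lru chain latch both while searching for free buffers and while dirty buffers are being written. The probability of contention is high when several sessions scan different tables or indexes at the same time. Because these sessions load different blocks into memory, requests for free buffers increase, and so does the probability of contention on the working set. In particular, when data is modified frequently and there are many dirty buffers, DBWR searches the LRUW list more often because of checkpoints, which makes the contention worse.

Another important characteristic of cache buffers lru chain latch contention is that it is accompanied by physical I/O. If the problem is caused by an inefficient index scan, db file sequential read waits occur together with the lru chain latch contention; if it is caused by unnecessary full table scans, db file scattered read waits occur together with it. In practice, cache buffers chains latch contention and cache buffers lru chain latch contention often occur together, because complex applications combine the patterns above.

A data buffer that is too small or a checkpoint interval that is too short also increases cache buffers lru chain latch contention. However, the data buffer on today's databases is rarely too small, and the checkpoint interval is usually left at its default, so the cause of cache buffers lru chain latch contention usually still comes down to inefficient SQL.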

CACHE BUFFERS CHAINS wait event

Posted on May 9, 2012 by 惜分飞

About CACHE BUFFERS CHAINS

CACHE BUFFERS CHAINS latch is acquired when searching for data blocks cached in the buffer cache. 
Since the Buffer cache is implemented as a sum of chains of blocks, each of those chains is protected 
by a child of this latch when needs to be scanned. Contention in this latch can be caused by very heavy 
access to a single block. This can require the application to be reviewed.

Causes of CACHE BUFFERS CHAINS contention

The main cause of the cache buffers chains latch contention is usually a hot block issue. 
This happens when multiple sessions repeatedly access one or more blocks that are protected 
by the same child cache buffers chains latch.

Handling CACHE BUFFERS CHAINS contention
1) Examine the application to see if the execution of certain DML and SELECT statements can be reorganized to eliminate contention on the object.

Proceed as follows:
--Confirm the latch: cache buffers chains wait from the report

Top 5 Timed Events                                      Avg    %Total
~~~~~~~~~~~~~~~~~~                                      wait   Call
Event                          Waits        Time (s)    (ms)   Time   Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
latch: cache buffers chains          74,642      35,421    475    6.1 Concurrenc
CPU time                                         11,422           2.0
log file sync                        34,890       1,748     50    0.3 Commit
latch free                            2,279         774    340    0.1 Other
db file parallel write               18,818         768     41    0.1 System I/O
-------------------------------------------------------------
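
The Avg wait (ms) column can be re-derived from the Waits and Time (s) columns as a quick sanity check when reading such a report. A minimal sketch follows; avg_wait_ms is a hypothetical helper name.

```shell
# avg_wait_ms WAITS TIME_S: average wait in milliseconds = Time (s) * 1000 / Waits
avg_wait_ms() {
  awk -v waits="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", secs * 1000 / waits }'
}

avg_wait_ms 74642 35421   # latch: cache buffers chains, prints 475
avg_wait_ms 34890 1748    # log file sync, prints 50
```

An average of about 475 ms per latch wait is far above the other events listed, which is why this latch is the one to investigate first.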

--Find the SQL with high logical reads
SQL ordered by Gets         DB/Inst:  Snaps: 1-2
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> Total Buffer Gets:   265,126,882
-> Captured SQL account for   99.8% of Total


                            Gets                CPU      Elapsed
Buffer Gets    Executions   per Exec     %Total Time (s) Time (s)  SQL Id
-------------- ------------ ------------ ------ -------- --------- -------------
   256,763,367       19,052     13,477.0   96.8 ######## ######### a9nchgksux6x2
Module: JDBC Thin Client
SELECT * FROM SALES ....

     1,974,516      987,056          2.0    0.7    80.31    110.94 ct6xwvwg3w0bv
SELECT COUNT(*) FROM ORDERS ....

--Objects with high logical reads
Segments by Logical Reads           
-> Total Logical Reads:     265,126,882
-> Captured Segments account for   98.5% of Total

           Tablespace                      Subobject  Obj.       Logical
Owner         Name    Object Name            Name     Type         Reads  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
DMSUSER    USERS      SALES                           TABLE  212,206,208   80.04
DMSUSER    USERS      SALES_PK                        INDEX   44,369,264   16.74
DMSUSER    USERS      SYS_C0012345                    INDEX    1,982,592     .75
DMSUSER    USERS      ORDERS_PK                       INDEX      842,304     .32
DMSUSER    USERS      INVOICES                        TABLE      147,488     .06
          -------------------------------------------------------------
Approach:
1.Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary. 
  This may be within a single session or across multiple sessions.

2.Check for suboptimal SQL (this is the most common cause of the events)  
 look at the execution plan for the SQL being run and try to reduce the 
 gets per executions which will minimize the number of blocks being accessed 
 and therefore reduce the chances of multiple sessions contending for the same block.

Note:1342917.1 Troubleshooting ‘latch: cache buffers chains’ Wait Contention

2) Decrease the buffer cache, although this may only help in a small number of cases.

3) DBWR throughput may be a factor in this as well. If using multiple DBWRs, then increase the number of DBWRs.

4) Increase the PCTFREE for the table storage parameters via ALTER TABLE or rebuild. This will result in fewer rows per block.

Finding hot objects
First determine which latch id(ADDR) are interesting by examining the number of 
sleeps for this latch. The higher the sleep count, the more interesting the 
latch id(ADDR) is:

SQL> select CHILD#  "cCHILD"
     ,      ADDR    "sADDR"
     ,      GETS    "sGETS"
     ,      MISSES  "sMISSES"
     ,      SLEEPS  "sSLEEPS" 
     from v$latch_children 
     where name = 'cache buffers chains'
     order by 5, 1, 2, 3;

Run the above query a few times to establish the id(ADDR) that has the most 
consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found
then this latch address can be used to get more details about the blocks
currently in the buffer cache protected by this latch. 
The query below should be run just after determining the ADDR with 
the highest sleep count.

SQL> column segment_name format a35
     select /*+ RULE */
       e.owner ||'.'|| e.segment_name  segment_name,
       e.extent_id  extent#,
       x.dbablk - e.block_id + 1  block#,
       x.tch,
       l.child#
     from
       sys.v$latch_children  l,
       sys.x$bh  x,
       sys.dba_extents  e
     where
       x.hladdr  = '&ADDR' and
       e.file_id = x.file# and
       x.hladdr = l.addr and
       x.dbablk between e.block_id and e.block_id + e.blocks -1
     order by x.tch desc ;

Example of the output :
SEGMENT_NAME                     EXTENT#      BLOCK#       TCH    CHILD#
-------------------------------- ------------ ------------ ------ ----------
SCOTT.EMP_PK                       5            474          17     7,668
SCOTT.EMP                          1            449           2     7,668

Depending on the TCH column (The number of times the block is hit by a SQL 
statement), you can identify a hot block. The higher the value of the TCH column,
the more frequent the block is accessed by SQL statements.

5) Consider implementing reverse key indexes (if range scans aren’t commonly used against the segment)