手动提交分布式事务一例

一.alert文件中出现了很多类此记录

Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name
Fri Feb 10 05:25:01 2012
Errors in file /tmp/recover/ahcx216_reco_7956.trc:
ORA-12154: TNS:could not resolve service name

这里可以看出来两个信息:
1.出错的进程是rec0进程,而该进程的作用是解决分布式事务失败后遗留问题(事务提交或者回滚)
2.错误信息是tns不能被解析

二.查看trace文件

/tmp/recover/ahcx216_reco_7956.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0/db_1
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.9-89.0.0.0.1.ELhugemem
Version:        #1 SMP Tue May 19 04:38:38 EDT 2009
Machine:        i686
Instance name: ahcx216
Redo thread mounted by this instance: 1
Oracle process number: 7
Unix process pid: 7956, image: oracle@localhost.localdomain (RECO)


*** SESSION ID:(6.1) 2012-02-10 04:58:24.886
*** 2012-02-10 04:58:24.886
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
………………
*** 2012-02-10 05:25:01.580
ERROR, tran=6.1.712757, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name
*** 2012-02-10 05:25:01.627
ERROR, tran=12.19.99059, session#=1, ose=0:
ORA-12154: TNS:could not resolve service name

通过这里我们可以看出事务id分别为12.19.99059和6.1.712757有问题

三.查看dba_2pc_pending视图

SQL> select local_tran_id,state,fail_time,retry_time from dba_2pc_pending;

LOCAL_TRAN_ID     STATE            FAIL_TIME           RETRY_TIME
-------------- ---------------- --------------       --------------
6.1.712757	collecting	2010/12/1 13:39:03   2012/2/10 5:38:52
12.19.99059	collecting	2010/12/1 15:56:26   2012/2/10 5:38:52

可以看出,果然有两个分布式事务在2010年12月1日出了问题(本库是一个问题库,在把库拉起来后发现该问题)
因为是异机恢复,而且间隔时间较长,很多tns的信息都已经不存在,所以需要手工提交分布式事务

四.手动提交事务

SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-30019: Illegal rollback Segment operation in Automatic Undo mode
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 2

SQL> alter session set "_smu_debug_mode"=4;

Session altered.

-------------------------------------------------------------
--设置UNDO_SUPPRESS_ERRORS=true也可以解决此问题
alter system set UNDO_SUPPRESS_ERRORS = TRUE;
EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('<事务ID>');
commit;
alter system set UNDO_SUPPRESS_ERRORS = false;
--------------------------------------------------------------

SQL> commit;

SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');

PL/SQL procedure successfully completed.


SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059'); END;

*
ERROR at line 1:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 1
--第一个分布式事务处理后,未提交导致

SQL> commit;

Commit complete.

SQL> exec DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

五.补充说明
开始在另一个会话中,执行失败原因

SQL> alter session set "_smu_debug_mode"=4;

会话已更改。

SQL> commit;

提交完成。

SQL> BEGIN
  2  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('6.1.712757');
  3  DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('12.19.99059');
  4  END;
  5  /
BEGIN
*
第 1 行出现错误:
ORA-01453: SET TRANSACTION must be first statement of transaction
ORA-06512: at "SYS.DBMS_TRANSACTION", line 65
ORA-06512: at "SYS.DBMS_TRANSACTION", line 85
ORA-06512: at line 3

因为这里的begin end中包含了两个事务的清理,在清理完第一个事务之后,需要提交才能够清理第二个,这里因为没有提交,导致ORA-01453错误。

通过sql语句获取awr/statspack逻辑读/物理读

在日常的性能监控中,我们经常需要需要通过一段时间内数据库的逻辑读/物理读来大致反映数据库的性能情况,这里列出通过awr和statspack来获取相关数据(查询最近30天数据,除掉第一条和最后一条数据)

awr逻辑读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM DBA_HIST_SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
          AND b.INSTANCE_NUMBER = &INST_NUM
     AND B.STAT_NAME IN ('session logical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')"END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,(SELECT END_INTERVAL_TIME,SNAP_ID
    FROM DBA_HIST_SNAPSHOT B
   WHERE  B.DBID = (SELECT dbid FROM v$database)
     AND B.INSTANCE_NUMBER = &INST_NUM) B
     WHERE A.SNAP_ID=B.snap_id
     AND END_INTERVAL_TIME>=SYSDATE-30;

awr物理读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM DBA_HIST_SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
          AND b.INSTANCE_NUMBER = &INST_NUM
     AND B.STAT_NAME IN ('physical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(END_INTERVAL_TIME,'YYYY-MM-DD HH24:MI:SS')"END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,(SELECT END_INTERVAL_TIME,SNAP_ID
    FROM DBA_HIST_SNAPSHOT B
   WHERE  B.DBID = (SELECT dbid FROM v$database)
     AND B.INSTANCE_NUMBER = &INST_NUM) B
     WHERE A.SNAP_ID=B.snap_id
     AND END_INTERVAL_TIME>=SYSDATE-30;

statspack逻辑读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM STATS$SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
     AND B.INSTANCE_NUMBER = &INST_NUM
     AND B.NAME IN ('session logical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(SNAP_TIME, 'YYYY-MM-DD HH24:MI:SS') "END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,
       (SELECT SNAP_TIME, SNAP_ID
          FROM STATS$SNAPSHOT B
         WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
           AND B.INSTANCE_NUMBER = &INST_NUM) B
 WHERE A.SNAP_ID = B.SNAP_ID
  AND snap_time>=SYSDATE-30;

statspack物理读

WITH A AS
 (SELECT B.SNAP_ID, SUM(VALUE)/1024/1024/1024*8192 VALUE
    FROM STATS$SYSSTAT B
   WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
     AND B.INSTANCE_NUMBER = &INST_NUM
     AND B.NAME IN ('physical reads')
   GROUP BY B.SNAP_ID
   ORDER BY SNAP_ID)
SELECT A.SNAP_ID,
       LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "START_VALUE(G)",
       VALUE "END_VALUE(G)",
       TO_CHAR(SNAP_TIME, 'YYYY-MM-DD HH24:MI:SS') "END_TIME",
       VALUE - LAG(VALUE, 1, '0') OVER(ORDER BY A.SNAP_ID) "D-VALUE(G)"
  FROM A,
       (SELECT SNAP_TIME, SNAP_ID
          FROM STATS$SNAPSHOT B
         WHERE B.DBID = (SELECT DBID FROM V$DATABASE)
           AND B.INSTANCE_NUMBER = &INST_NUM) B
 WHERE A.SNAP_ID = B.SNAP_ID
 AND snap_time>=SYSDATE-30;

ODU恢复PACKAGE/VIEW/DBLINK等数据脚本操作

当数据库不能拉起来时候,我们可以通过odu找回丢失的数据,今天客户出要求,需要我帮忙找回不能起来库中的包,过程,函数,视图,触发器等之类东西。这些东西都是存在system空间中,存在一些系统的基表中,我们如果能够拿到这些基表的数据然后进行处理,原则上就可以得到客户需要的东西
一、PROCEDURE/FUNCTION/PACKAGE/PACKAGE BODY/TRIGGER/TYPE/TYPE BODY之类恢复
1. 查看这些数据存储在什么基表中

select u.name, o.name,
decode(o.type#, 7, 'PROCEDURE', 8, 'FUNCTION', 9, 'PACKAGE',
               11, 'PACKAGE BODY', 12, 'TRIGGER', 13, 'TYPE', 14, 'TYPE BODY',
               'UNDEFINED'),
s.line, s.source
from sys.obj$ o, sys.source$ s, sys.user$ u
where o.obj# = s.obj#
  and o.owner# = u.user#
  and ( o.type# in (7, 8, 9, 11, 12, 14) OR
       ( o.type# = 13 AND o.subname is null))
union all
select u.name, o.name, 'JAVA SOURCE', s.joxftlno, s.joxftsrc
from sys.obj$ o, x$joxfs s, sys.user$ u
where o.obj# = s.joxftobn
  and o.owner# = u.user#
  and o.type# = 28;

通过这个sql知道主要存储在sys.obj$ o, sys.source$ s, sys.user$中(至于x$joxfs表,我暂时在数据库中未找到,不能使用odu提取数据,也就是说我这里不能恢复出来JAVA SOURCE)

2. 使用odu导出数据(注意output_format设置为dmp,防止出现意外)
使用类此unload table sys.obj$ 下面语句导出数据

3. 加载这些数据到新库中(切记不要使用sys schema)
使用类此imp chf/xifenfei file=d:/SYS_OBJ$.dmp fromuser=sys touser=chf导入数据

4. 在查询出需要对象

   SELECT u.name USER_NAME,
         o.name OBJECT_nAME,
         DECODE (o.type#,
                 7, 'PROCEDURE',
                 8, 'FUNCTION',
                 9, 'PACKAGE',
                 11, 'PACKAGE BODY',
                 12, 'TRIGGER',
                 13, 'TYPE',
                 14, 'TYPE BODY',
                 'UNDEFINED')
            OBJECT_tYPE,
         s.line,
         s.source
    FROM chf.obj$ o, chf.source$ s, chf.user$ u
   WHERE o.obj# = s.obj# AND o.owner# = u.user#
         AND (o.type# IN (7, 8, 9, 11, 12, 14)
              OR (o.type# = 13 AND o.subname IS NULL))
--过来条件选择出需要对象
-- AND U.name='BSSADMIN'
ORDER BY U.NAME,
         O.NAME,
         O.TYPE#,
         S.LINE;

5. 使用说明
在line=1的source前面加上create or replace ,使用合适的用户创建需要对象

二、VIEW恢复
1. 找出视图存储基表
sys.obj$,sys.view$,sys.user$
2. 使用odu导出需要表
3. 使用imp导入数据库
4. 查询语句

    SELECT u.name username, o.name viewname, v.text
  FROM chf.obj$ o, chf.view$ v, chf.user$ u
 WHERE o.obj# = v.obj# AND o.owner# = u.user# 
 --过滤条件,挑选需要视图
 AND u.name = 'MAS_ADMIN';

5. 使用说明
create VIEW OR REPLACE username.viewname AS+TEXT中内容

三、dblink恢复
1. 找出关联基表
Sys.link$和sys.user$
2. odu导出相关表
3. imp导入数据
4. 查询语句

SELECT U.NAME   USERNAME,
       L.NAME   DBLINK_NAME,
       L.USERID LOGIN_USER,
       L.HOST   TNS,
       L.CTIME
  FROM CHF.LINK$ L, CHF.USER$ U
 WHERE L.OWNER# = U.USER#
  --过滤条件
   AND U.NAME = 'PUBLIC'

5. 使用说明
根据查询出来信息,自己创建DBLINK

FAST_START_PARALLEL_ROLLBACK与回滚恢复

1.模拟产生大事务需回滚

[oracle@node1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Thu Feb 16 12:47:08 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options

SQL> show sga;

Total System Global Area  622149632 bytes
Fixed Size                  2230912 bytes
Variable Size             406848896 bytes
Database Buffers          205520896 bytes
Redo Buffers                7548928 bytes

SQL> create table chf.xifenfei tablespace users 
  2  as
  3  select * from dba_objects;

--下面两句多次执行
SQL> insert into chf.xifenfei 
  2  select * from chf.xifenfei;

73831 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from  chf.xifenfei;

  COUNT(*)
----------
  18900736

SQL> select bytes/1024/1024 from dba_segments where segment_name='XIFENFEI';

BYTES/1024/1024
---------------
           2103
--删除数据不提交
SQL> delete from chf.xifenfei;

18900736 rows deleted.

--直接kill掉ora_dbw进程

2.FAST_START_PARALLEL_ROLLBACK=LOW(默认值)

SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone) 
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss')) 
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') 
  5  from v$fast_start_transactions;

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    545624     103020     442604 2012-02-16 13:57:38 2012-02-16 13:47:02


SQL> /

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    545624     122614     423010 2012-02-16 13:57:42 2012-02-16 13:47:31

--每秒钟回滚undo数据块数量
SQL> select (122614-103020)/29 from dual;

(122614-103020)/29
------------------
        675.655172

--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   24901     1  4 13:44 ?        00:00:15 ora_p000_chf
oracle   24903     1  3 13:44 ?        00:00:12 ora_p001_chf
oracle   24905     1  3 13:44 ?        00:00:12 ora_p002_chf
oracle   24907     1  3 13:44 ?        00:00:12 ora_p003_chf
oracle   24909     1  3 13:44 ?        00:00:12 ora_p004_chf
oracle   24911     1  3 13:44 ?        00:00:12 ora_p005_chf
oracle   24913     1  3 13:44 ?        00:00:12 ora_p006_chf
oracle   24915     1  3 13:44 ?        00:00:12 ora_p007_chf
oracle   24917     1  3 13:44 ?        00:00:12 ora_p008_chf
oracle   24919     1  3 13:44 ?        00:00:12 ora_p009_chf
oracle   24921     1  3 13:44 ?        00:00:12 ora_p010_chf
oracle   24923     1  3 13:44 ?        00:00:12 ora_p011_chf
oracle   24925     1  3 13:44 ?        00:00:12 ora_p012_chf
oracle   24927     1  3 13:44 ?        00:00:12 ora_p013_chf
oracle   24929     1  3 13:44 ?        00:00:12 ora_p014_chf
oracle   24931     1  3 13:44 ?        00:00:12 ora_p015_chf

说明:该机器操作系统是8个CPU

并发数=CPU*2

3.FAST_START_PARALLEL_ROLLBACK =HIGH

SQL> alter system set FAST_START_PARALLEL_ROLLBACK =HIGH;

System altered.

SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone) 
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss')) 
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') 
  5  from v$fast_start_transactions;

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    380434      25351     355083 2012-02-16 14:03:53 2012-02-16 13:49:39


SQL> /

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    380434      39592     340842 2012-02-16 14:03:33 2012-02-16 13:50:12

--每秒钟回滚undo数据块数量
SQL> select (39592-25351)/33 from dual;

(39592-25351)/33
----------------
      431.545455

--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   24901     1  4 13:44 ?        00:00:15 ora_p000_chf
oracle   24903     1  3 13:44 ?        00:00:12 ora_p001_chf
oracle   24905     1  3 13:44 ?        00:00:12 ora_p002_chf
oracle   24907     1  3 13:44 ?        00:00:12 ora_p003_chf
oracle   24909     1  3 13:44 ?        00:00:12 ora_p004_chf
oracle   24911     1  3 13:44 ?        00:00:12 ora_p005_chf
oracle   24913     1  3 13:44 ?        00:00:12 ora_p006_chf
oracle   24915     1  3 13:44 ?        00:00:12 ora_p007_chf
oracle   24917     1  3 13:44 ?        00:00:12 ora_p008_chf
oracle   24919     1  3 13:44 ?        00:00:12 ora_p009_chf
oracle   24921     1  3 13:44 ?        00:00:12 ora_p010_chf
oracle   24923     1  3 13:44 ?        00:00:12 ora_p011_chf
oracle   24925     1  3 13:44 ?        00:00:12 ora_p012_chf
oracle   24927     1  3 13:44 ?        00:00:12 ora_p013_chf
oracle   24929     1  3 13:44 ?        00:00:12 ora_p014_chf
oracle   24931     1  3 13:44 ?        00:00:12 ora_p015_chf
oracle   25072     1  0 13:48 ?        00:00:01 ora_p016_chf
oracle   25074     1  0 13:48 ?        00:00:01 ora_p017_chf
oracle   25076     1  0 13:48 ?        00:00:01 ora_p018_chf
oracle   25078     1  0 13:48 ?        00:00:01 ora_p019_chf
oracle   25080     1  0 13:48 ?        00:00:01 ora_p020_chf
oracle   25082     1  0 13:48 ?        00:00:01 ora_p021_chf
oracle   25084     1  0 13:48 ?        00:00:01 ora_p022_chf
oracle   25086     1  0 13:48 ?        00:00:01 ora_p023_chf
oracle   25088     1  0 13:48 ?        00:00:01 ora_p024_chf
oracle   25090     1  0 13:48 ?        00:00:01 ora_p025_chf
oracle   25092     1  0 13:48 ?        00:00:01 ora_p026_chf
oracle   25094     1  0 13:48 ?        00:00:01 ora_p027_chf
oracle   25096     1  0 13:48 ?        00:00:01 ora_p028_chf
oracle   25098     1  0 13:48 ?        00:00:01 ora_p029_chf
oracle   25100     1  0 13:48 ?        00:00:01 ora_p030_chf
oracle   25102     1  0 13:48 ?        00:00:01 ora_p031_chf

1.说明问题:直接修改FAST_START_PARALLEL_ROLLBACK =HIGH后,
  数据库在原来并发进程基础上,又重新启动额外进程

2.修改FAST_START_PARALLEL_ROLLBACK后,以前回滚过的数据块是成功的
  v$fast_start_transactions视图重新开始计算

3.并发数=CPU*4
[/shell]

<strong>4.FAST_START_PARALLEL_ROLLBACK=FALSE</strong>
1
SQL> alter system set FAST_START_PARALLEL_ROLLBACK=FALSE;

System altered.

--直接修改为FALSE后,观察到数据库的并发等资源都没有释放,重启数据库释放资源继续试验
SQL> startup force;
ORACLE instance started.

Total System Global Area  622149632 bytes
Fixed Size                  2230912 bytes
Variable Size             406848896 bytes
Database Buffers          205520896 bytes
Redo Buffers                7548928 bytes
Database mounted.
Database opened.
SQL> select  undoblockstotal "Total", undoblocksdone "Done", undoblockstotal-undoblocksdone
  2   "ToDo",decode(cputime,0,'unknown',to_char(sysdate+(((undoblockstotal-undoblocksdone) 
  3   / (undoblocksdone / cputime)) / 86400),'yyyy-mm-dd hh24:mi:ss')) 
  4  "Estimated time to complete",to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') 
  5  from v$fast_start_transactions;

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    306828      15735     291093 2012-02-16 14:04:34 2012-02-16 13:52:33


SQL> /

     Total       Done       ToDo Estimated time to c TO_CHAR(SYSDATE,'YY
---------- ---------- ---------- ------------------- -------------------
    306828      65861     240967 2012-02-16 14:05:15 2012-02-16 13:54:46

--每秒钟回滚undo数据块数量
SQL> select (65861-15735)/133 from dual;

(65861-15735)/133
-----------------
       376.887218

--数据库并发回滚进程数
[oracle@node1 ~]$ ps -ef|grep ora_p0
oracle   25252     1  1 13:51 ?        00:00:00 ora_p000_chf
oracle   25254     1  0 13:51 ?        00:00:00 ora_p001_chf
oracle   25256     1  1 13:51 ?        00:00:00 ora_p002_chf
oracle   25258     1  1 13:51 ?        00:00:00 ora_p003_chf
oracle   25260     1  1 13:51 ?        00:00:00 ora_p004_chf
oracle   25262     1  1 13:51 ?        00:00:00 ora_p005_chf
oracle   25264     1  1 13:51 ?        00:00:00 ora_p006_chf

1.数据库重启后,以前的回滚依然生效(v$fast_start_transactions.undoblockstotal变小)

2.FAST_START_PARALLEL_ROLLBACK=false还是有并发,而非官方文档描述(Parallel rollback is disabled)

5.总结

通过这三种情况下的每秒钟回滚undo数据块数量比较可以知道在LOW状态下最快,HIGH状态下次之,FALSE最慢。其实这个实验没有任何实际说明力,只是想说明几个问题:
1)Oracle大事物回滚,是没有办法取消,但是可以通过FAST_START_PARALLEL_ROLLBACK干预回滚速度
2)数据库的并发效率高于低,取决于系统的资源情况(如果你系统的cpu非常强大,那么可能设置HIGH速度最快)
3)回滚的数据类型,在回滚表中数据时可能设置并发比FALSE快,
但是如果是要回滚串行数据(如:index),那么可能串行方法方式速度更快
4)根据你的系统的使用状况,比如你想让系统的业务受到的影响最小,那么设置FALSE可能是个不错的选择。

6.补充官方说明

FAST_START_PARALLEL_ROLLBACK specifies the degree of parallelism used when recovering terminated transactions. 
Terminated transactions are transactions that are active before a system failure. 
If a system fails when there are uncommitted parallel DML or DDL transactions, 
then you can speed up transaction recovery during startup by using this parameter.

Values:

FALSE
Parallel rollback is disabled

LOW
Limits the maximum degree of parallelism to 2 * CPU_COUNT

HIGH
Limits the maximum degree of parallelism to 4 * CPU_COUNT

Note:If you change the value of this parameter, 
then transaction recovery will be stopped and restarted with the new implied degree of parallelism.

遭遇ORA-07445[kkdliac()+346]使用odu抢救数据

1.Oracle启动报错

-bash-2.05b$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.8.0 - Production on 星期三 2月 15 10:31:53 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1527846824 bytes
Fixed Size                   452520 bytes
Variable Size             385875968 bytes
Database Buffers         1140850688 bytes
Redo Buffers                 667648 bytes
Database mounted.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

2.alert文件

Wed Feb 15 10:32:06 2012
ALTER DATABASE   MOUNT
Wed Feb 15 10:32:10 2012
Successful mount of redo thread 1, with mount id 412256678
Wed Feb 15 10:32:10 2012
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Wed Feb 15 10:33:44 2012
alter database open
Wed Feb 15 10:33:44 2012
Beginning crash recovery of 1 threads
Wed Feb 15 10:33:44 2012
Started redo scan
Wed Feb 15 10:33:44 2012
Completed redo scan
 0 redo blocks read, 0 data blocks need recovery
Wed Feb 15 10:33:44 2012
Started recovery at
 Thread 1: logseq 2, block 3, scn 2862.4075508322
Wed Feb 15 10:33:44 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /data1z/oracle/oradata/ahcx216/redo01.log
Wed Feb 15 10:33:44 2012
Completed redo application
Wed Feb 15 10:33:44 2012
Ended recovery at
 Thread 1: logseq 2, block 3, scn 2862.4075528323
 0 data blocks read, 0 data blocks written, 0 redo blocks read
Crash recovery completed successfully
Wed Feb 15 10:33:44 2012
Thread 1 advanced to log sequence 3
Thread 1 opened at log sequence 3
  Current log# 3 seq# 3 mem# 0: /data1z/oracle/oradata/ahcx216/redo03.log
Successful open of redo thread 1
Wed Feb 15 10:33:45 2012
SMON: enabling cache recovery
Wed Feb 15 10:33:45 2012
Errors in file /data1z/oracle/admin/ahcx216/udump/ahcx216_ora_21325.trc:
ORA-07445: exception encountered: core dump [kkdliac()+346] [SIGSEGV] [Address not mapped to object] [0x43] [] []

3.trace文件

/data1z/oracle/admin/ahcx216/udump/ahcx216_ora_21325.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /data1z/oracle/product/9.2
System name:	Linux
Node name:	aiserch1
Release:	2.4.21-27.ELsmp
Version:	#1 SMP Wed Dec 1 21:59:02 EST 2004
Machine:	i686
Instance name: ahcx216
Redo thread mounted by this instance: 1
Oracle process number: 12
Unix process pid: 21325, image: oracle@aiserch1 (TNS V1-V3)

*** SESSION ID:(11.13) 2012-02-15 10:33:44.739
Thread checkpoint rba:0x000002.00000002.0010 scn:0x0b2e.f2eb5261
Cache low rba is 0xffffffff.ffffffff.ffff
Use incremental checkpoint on-disk rba
Thread 1 recovery from rba:0x000002.00000003.0000 scn:0x0b2e.f2eb5262
----- Recovery Hash Table Statistics ---------
Hash table buckets = 32768
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
----------------------------------------------
*** 2012-02-15 10:33:44.766
KCRA: start recovery claims for 0 data blocks
*** 2012-02-15 10:33:44.766
KCRA: buffers claimed = 0/0, eliminated = 0
*** 2012-02-15 10:33:44.766
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
----- Recovery Hash Table Statistics ---------
Hash table buckets = 32768
Longest hash chain = 0
Average hash chain = 0/0 = 0.0
Max compares per lookup = 0
Avg compares per lookup = 0/0 = 0.0
----------------------------------------------
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x43, PC: [0x8880c00, kkdliac()+346]
Registers:
%eax: 0xa0f6b170 %ebx: 0x00000003 %ecx: 0xa0f6b170
%edx: 0x00000001 %edi: 0x00000000 %esi: 0x00000000
%esp: 0xbfff1cb8 %ebp: 0xbfff1cd8 %eip: 0x08880c00
%efl: 0x00010286
  kkdliac()+329 (0x8880bef) jnz 0x8880f61
  kkdliac()+335 (0x8880bf5) mov %edx,%eax
  kkdliac()+337 (0x8880bf7) mov 0xc0(%eax),%edx
  kkdliac()+343 (0x8880bfd) mov %edx,0xffffffe0(%ebp)
> kkdliac()+346 (0x8880c00) movzw 0x40(%ebx),%esi
  kkdliac()+350 (0x8880c04)  mov %si,0xffffffd4(%ebp)
  kkdliac()+354 (0x8880c08) movb 0x42(%ebx),%al
  kkdliac()+357 (0x8880c0b) movb %al,0xffffffd6(%ebp)
  kkdliac()+360 (0x8880c0e) movzw 0x8(%ebx),%edx
*** 2012-02-15 10:33:45.029
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kkdliac()+346] [SIGSEGV] [Address not mapped to object] [0x43] [] []
Current SQL statement for this session:
create table bootstrap$ ( line#         number not null,   obj#           number not null,   
sql_text   varchar2(4000) not null)   storage (initial 50K objno 56 extents (file 1 block 377))
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+274         call     ksedst()             1 ? 0 ? 0 ? 1 ? 65252838 ?
                                                   2C297862 ?
ssexhd()+1113        call     ksedmp()             3 ? 0 ? 0 ? 0 ? 0 ? 0 ?
__pthread_sighandle  call     00000000             B ? B75CEC90 ? B75CED10 ? 0 ?
r_rt()+122                                         0 ? 0 ?
kkdliac()+346        signal   00000000             B ? B75CEC90 ? B75CED10 ?
ctcdrv()+1688        call     kkdlcob()            BFFF1DC8 ? 2 ? 0 ? 38 ? 0 ?
                                                   0 ? 0 ?
opiexe()+9647        call     ctcdrv()             B749CD78 ? BFFF3258 ?
                                                   BFFF33B0 ?
opiosq0()+1170       call     opiexe()             4 ? 0 ? BFFF379C ?
opiosq()+19          call     opiosq0()            3 ? F ? BFFF4418 ? 0 ?
opiodr()+1133        call     00000000             4A ? F ? BFFF4418 ?
__PGOSF163_rpidrus(  call     opiodr()             4A ? F ? BFFF4418 ? 2 ?
)+145                                              
skgmstack()+137      call     00000000             BFFF3ED0 ? 2 ? BFFF428C ?
                                                   BFFF3EE8 ? BFFF3ED0 ?
                                                   BFFF3EE8 ?
rpidru()+98          call     skgmstack()          BFFF3EE8 ? AF61BC0 ? F618 ?
                                                   821F376 ? BFFF3ED0 ?
rpiswu2()+315        call     00000000             BFFF428C ? 0 ? 0 ? 0 ?
                                                   BFFF3F64 ? F618 ?
rpidrv()+1087        call     rpiswu2()            9D7416FC ? 0 ? BFFF4394 ? 2 ?
                                                   BFFF43B4 ? 0 ?
rpisplu()+298        call     rpidrv()             2 ? 4A ? BFFF4418 ? 8 ?
                                                   9EA3F16C ? B749DE88 ?
rpispl()+28          call     rpisplu()            2 ? 0 ? 0 ? B749DE88 ? BC ?
                                                   0 ? 0 ?
kqlbebs()+781        call     rpispl()             2 ? 0 ? B749DE88 ? BC ? 0 ?
                                                   0 ?
kqlblfc()+106        call     kqlbebs()            0 ? BFFF5590 ?
adbdrv()+2220        call     kqlblfc()            0 ? BFFF5590 ? BFFF5590 ?
opiexe()+10319       call     adbdrv()             B74A022C ? AF5A904 ?
                                                   B74AD040 ? 1 ? 0 ? 1 ?
opiosq0()+1170       call     opiexe()             4 ? 0 ? BFFF6928 ?
kpooprx()+206        call     opiosq0()            3 ? E ? BFFF6A68 ? 24 ?
kpoal8()+599         call     kpooprx()            BFFF7604 ? BFFF7510 ? 13 ?
                                                   1 ? 0 ? 24 ?
opiodr()+1133        call     00000000             5E ? 14 ? BFFF7600 ?
ttcpip()+4250        call     00000000             5E ? 14 ? BFFF7600 ? 0 ?
opitsk()+1077        call     ttcpip()             AF5A900 ? 5E ? BFFF7600 ? 0 ?
                                                   BFFF87D8 ? BFFF87D4 ?
opiino()+1287        call     opitsk()             0 ? 0 ? AF5A900 ? CC349D8 ?
                                                   F1 ? 0 ?
opiodr()+1133        call     00000000             3C ? 4 ? BFFF9B9C ?
opidrv()+418         call     opiodr()             3C ? 4 ? BFFF9B9C ? 0 ?
sou2o()+30           call     opidrv()             3C ? 4 ? BFFF9B9C ?
main()+187           call     sou2o()              BFFF9B80 ? 3C ? 4 ?
                                                   BFFF9B9C ? 890710 ? 0 ?
__libc_start_main()  call     00000000             2 ? BFFF9C04 ? BFFF9C10 ?
+188                                               890518 ? 2 ? 82174E0 ?
 
--------------------- Binary Stack Dump ---------------------
从这里看出数据库是在open过程中,创建bootstrap$错误,这个错误对于现在的我来说,暂时没有办法去解决。
为了能够抢救出客户需要的其中一个用户下面的数据,我不得不采用odu来解决问题。

4.填写ontrol.txt文件

SQL> set pagesize 1000
SQL> set linesize 200
SQL> col name for a50
SQL> select file#,ts#,rfile#,name from v$datafile;

     FILE#        TS#     RFILE# NAME
---------- ---------- ---------- ----------------------------------------------
         1          0          1 /data1z/oracle/oradata/ahcx216/system01
         2          1          2 /data1z/oracle/oradata/ahcx216/undotbs01
         3          3          3 /data1z/oracle/oradata/ahcx216/CITY
         4          4          4 /data1z/oracle/oradata/ahcx216/DATATS
         5          5          5 /data1z/oracle/oradata/ahcx216/indx01
         6          6          6 /data1z/oracle/oradata/ahcx216/tools01
         7          7          7 /data1z/oracle/oradata/ahcx216/users
         8          4          8 /data1z/oracle/oradata/ahcx216/datats02
         9          4          9 /data1z/oracle/oradata/ahcx216/datats03
      ………………

5.登录odu

bash-2.05b$ ./odu

Oracle Data Unloader:Release 3.0.8

Copyright (c) 2008,2009 XiongJun. All rights reserved.

Web: http://www.laoxiong.net
Email: magic007cn@gmail.com

loading default config.......

byte_order little
block_size  8192
db_timezone -7
client_timezone 8
data_path   data
charset_name ZHS16GBK
ncharset_name AL16UTF16
output_format text
lob_storage infile
clob_byte_order little



load control file 'config.txt' successful
loading default control file ......


 ts#   fn  rfn bsize   blocks bf offset filename
---- ---- ---- ----- -------- -- ------ --------------------------------------------
   0    1    1  8192    32000 N       0 /data1z/oracle/oradata/ahcx216/system01
   1    2    2  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/undotbs01
   3    3    3  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/CITY
   4    4    4  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/DATATS
   5    5    5  8192   524288 N       0 /data1z/oracle/oradata/ahcx216/indx01
………………
load control file 'control.txt' successful
loading dictionary data......

6.加载数据字典

ODU> unload dict
CLUSTER C_USER# file_no: 1 block_no: 89
TABLE OBJ$ file_no: 1 block_no: 121
CLUSTER C_OBJ# file_no: 1 block_no: 25
CLUSTER C_OBJ# file_no: 1 block_no: 25
found IND$'s obj# 19
found IND$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:3
found TABPART$'s obj# 230
found TABPART$'s dataobj#:230,ts#:0,file#:1,block#:1657,tab#:0
found INDPART$'s obj# 234
found INDPART$'s dataobj#:234,ts#:0,file#:1,block#:1689,tab#:0
found TABSUBPART$'s obj# 240
found TABSUBPART$'s dataobj#:240,ts#:0,file#:1,block#:1737,tab#:0
found INDSUBPART$'s obj# 245
found INDSUBPART$'s dataobj#:245,ts#:0,file#:1,block#:1777,tab#:0
found IND$'s obj# 19
found IND$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:3
found LOB$'s obj# 156
found LOB$'s dataobj#:2,ts#:0,file#:1,block#:25,tab#:6
found LOBFRAG$'s obj# 258
found LOBFRAG$'s dataobj#:258,ts#:0,file#:1,block#:1881,tab#:0

7.列出用户

ODU> list user

     USER#   USERNAME                      
----------   ------------------------------
       …………         
        20   HS_ADMIN_ROLE                 
        25   MAS_ADMIN                     
        24   HF_CX                         
        21   WMSYS                         
        22   WM_ADMIN_ROLE                 
        23   CITY                          
        26   BB_ADMIN                      
        27   AQ_ADMIN                      
        28   WH_ADMIN                      
        29   LA_ADMIN                      
       …………         

8.列出用户表

          
ODU> list table TL_ADMIN 

      OBJ#   OBJECT_NAME                   
----------   ------------------------------
   2286779   WT_MX_201005                  
     32309   BD_DEALER_308                 
    126172   CDMA_FAVOUR_BAOYUE            
     79675   DAISHENG_TMP                  
     52634   RYSJ_T_308                    
     25577   FEE_ID_T                      
    931167   A136_20080304                 
   1514084   WANGTONG_MOKUAI_2009          
………………                     

10.导出用户下表

ODU> unload user TL_ADMIN
Unloading user TL_ADMIN's tables.

Unloading table: WT_MX_201005,object ID: 2286779
Unloading segment,storage(Obj#=2286779 DataObj#=2286779 TS#=3 File#=31 Block#=8571 Cluster=0)
25205 rows unloaded


Unloading table: BD_DEALER_308,object ID: 32309
Unloading segment,storage(Obj#=32309 DataObj#=32309 TS#=3 File#=31 Block#=17587 Cluster=0)
60 rows unloaded


Unloading table: CDMA_FAVOUR_BAOYUE,object ID: 126172
Unloading segment,storage(Obj#=126172 DataObj#=126172 TS#=3 File#=3 Block#=30899 Cluster=0)
31 rows unloaded


Unloading table: DAISHENG_TMP,object ID: 79675
Unloading segment,storage(Obj#=79675 DataObj#=342004 TS#=3 File#=31 Block#=19451 Cluster=0)
7504 rows unloaded
…………………………

11.查看导出来数据

-bash-2.05b$ ls -l
-rw-r--r--    1 oracle   dba           658  2月 15 08:43 TL_ADMIN_AMORTIZE.ctl
-rw-r--r--    1 oracle   dba           763  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.ctl
-rw-r--r--    1 oracle   dba           683  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.sql
-rw-r--r--    1 oracle   dba         45813  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.txt
-rw-r--r--    1 oracle   dba           748  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.ctl
-rw-r--r--    1 oracle   dba           686  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.sql
-rw-r--r--    1 oracle   dba        263046  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.txt
-rw-r--r--    1 oracle   dba           777  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.ctl
-rw-r--r--    1 oracle   dba           696  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.sql
……………………………………

12.创建表脚本

-bash-2.05b$ ls -l *.sql
-rw-r--r--    1 oracle   dba           312  2月 15 08:43 TL_ADMIN_A136_20080304.sql
-rw-r--r--    1 oracle   dba           683  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.sql
-rw-r--r--    1 oracle   dba           686  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.sql
-rw-r--r--    1 oracle   dba           696  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.sql
-rw-r--r--    1 oracle   dba           147  2月 15 08:44 TL_ADMIN_AMORTIZE_OWE.sql
………………
复制这些文件列表,使用UltraEdit编辑文件(recover_create_tab.sql),生成如下脚本
@@TL_ADMIN_A136_20080304.sql
@@TL_ADMIN_AMORTIZE_HM_COPY.sql
@@TL_ADMIN_AMORTIZE_HM.sql
@@TL_ADMIN_AMORTIZE_ID.sql
@@TL_ADMIN_AMORTIZE_OWE.sql
@@TL_ADMIN_AMORTIZE_PAY.sql
然后使用sqlplus登录数据库,直接执行recover_create_tab.sql,创建了所有需要的表

13.加载数据脚本

-bash-2.05b$ ls -l *.ctl
-rw-r--r--    1 oracle   dba           507  2月 15 08:43 TL_ADMIN_A136_20080304.ctl
-rw-r--r--    1 oracle   dba           658  2月 15 08:43 TL_ADMIN_AMORTIZE.ctl
-rw-r--r--    1 oracle   dba           763  2月 15 08:43 TL_ADMIN_AMORTIZE_HM_COPY.ctl
-rw-r--r--    1 oracle   dba           748  2月 15 08:43 TL_ADMIN_AMORTIZE_HM.ctl
-rw-r--r--    1 oracle   dba           777  2月 15 08:43 TL_ADMIN_AMORTIZE_ID.ctl
-rw-r--r--    1 oracle   dba           344  2月 15 08:44 TL_ADMIN_AMORTIZE_OWE.ctl
-rw-r--r--    1 oracle   dba           342  2月 15 08:43 TL_ADMIN_AMORTIZE_PAY.ctl
也同样使用UltraEdit出来,生成文件(recover_load_data.sh)
sqlldr TL_ADMIN/password  control= TL_ADMIN_A136_20080304.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_HM_COPY.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_HM.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_ID.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_OWE.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_AMORTIZE_PAY.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_BD_DEALER_308.ctl
sqlldr TL_ADMIN/password  control= TL_ADMIN_BRXZ.ctl
然后在shell中执行recover_load_data.sh脚本导入数据,至此数据恢复完成

注意:recover_create_tab.sql和recover_load_data.sh需要在当前抽取出来数据的目录中。