使用flashback query恢复被删除plsql

今天一个朋友在11g中误删除生产中的一个过程,让他通过对dba_source视图的flashback query找回该过程.
从10g及其以后的版本中,如果被删除的plsql被及时发现(undo 未被覆盖掉)可以使用flashback query功能实现恢复.
创建plsql并删除

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> create or replace PROCEDURE  p_test_del(in_put varchar2) as
  2  begin
  3  dbms_output.put_line(in_put);
  4  end;
  5  /

Procedure created.

SQL> set serveroutput on
SQL> exec p_test_del('www.orasos.com');
www.orasos.com

PL/SQL procedure successfully completed.

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') from dual;

TO_CHAR(SYSDATE,'YY
-------------------
2012-07-02 08:34:45

SQL> drop  PROCEDURE  p_test_del;

Procedure dropped.

恢复plsql

SQL> set pagesize 0
SQL> column text format a4000
SQL> spool /tmp/get_delete_proc.xff
SQL> SELECT text
  2         FROM DBA_source AS OF TIMESTAMP TO_TIMESTAMP('2012-07-02 08:34:45', 'YYYY-MM-DD HH24:MI:SS')
  3        WHERE OWNER = 'CHF' AND NAME = 'P_TEST_DEL' ORDER BY LINE;
PROCEDURE  p_test_del(in_put varchar2) as
begin
dbms_output.put_line(in_put);
end;

SQL> spool off;
SQL> !more /tmp/get_delete_proc.xff
SQL> SELECT text
  2         FROM DBA_source AS OF TIMESTAMP TO_TIMESTAMP('2012-07-02 08:34:45', 'YYYY-MM-DD HH24:MI:SS')
  3        WHERE OWNER = 'CHF' AND NAME = 'P_TEST_DEL' ORDER BY LINE;
create PROCEDURE  p_test_del(in_put varchar2) as 
begin                         
dbms_output.put_line(in_put); 
end;        

SQL> spool off;

重建plsql

SQL> create PROCEDURE  p_test_del(in_put varchar2) as 
  2  begin                                     
  3  dbms_output.put_line(in_put);             
  4  end; 
  5  /

Procedure created.

SQL> set serveroutput on
SQL> exec p_test_del('惜分飞');
惜分飞

PL/SQL procedure successfully completed.

ORA-00600[kcbshlc_1]导致数据库 down 案例

一台服务器因为ORA-00600[kcbshlc_1]错误引起PMON异常导致数据库down掉

Sun Jul  8 17:20:10 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
Errors in file /opt/oracle/admin/xff/bdump/xff_pmon_16412.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
Sun Jul  8 17:20:12 2012
PMON: terminating instance due to error 472

分析trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /opt/oracle/product/10.2.0
System name:	Linux
Node name:	localhost.localdomain
Release:	2.6.9-89.ELsmp
Version:	#1 SMP Mon Apr 20 10:33:05 EDT 2009
Machine:	x86_64
Instance name: xff
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 16412, image: oracle@localhost.localdomain (PMON)

*** 2012-07-08 03:00:11.351
*** SERVICE NAME:(SYS$BACKGROUND) 2012-07-08 03:00:11.338
*** SESSION ID:(1105.1) 2012-07-08 03:00:11.338
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
freeing in-flux r/w latch for process state: 1fc165d248
... in-flux r/w latch  1fc1fcc9b0 Child cache buffers chains level=1 child#=4753 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 113266196
        state=busy(exclusive) (val=0x2000000000000071) holder orapid = 113
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     139 (2, 1341687611, 2)
     192 (2, 1341687611, 2)
     191 (2, 1341687611, 2)
     173 (2, 1341687611, 2)
     185 (2, 1341687611, 2)
     176 (2, 1341687611, 2)
     174 (2, 1341687611, 2)
     118 (2, 1341687611, 2)
     190 (2, 1341687611, 2)
     179 (2, 1341687611, 2)
     184 (1, 1341687611, 1)
     189 (1, 1341687611, 1)
     177 (1, 1341687611, 1)
     195 (1, 1341687611, 1)
     187 (1, 1341687611, 1)
     194 (1, 1341687611, 1)
     147 (1, 1341687611, 1)
     183 (1, 1341687611, 1)
     143 (1, 1341687611, 1)
     144 (1, 1341687611, 1)
     186 (1, 1341687611, 1)
     188 (1, 1341687611, 1)
     196 (1, 1341687611, 1)
     145 (1, 1341687611, 1)
     193 (1, 1341687611, 1)
     waiter count=25
*** 2012-07-08 03:50:06.228
 wsd 0x1f8169ac20, sbuf 0xac1ffafe8, setid 10, op 3
lcuridx 1, lasz 0x3c1ffc110
*** 2012-07-08 16:30:05.294
freeing in-flux r/w latch for process state: 20406507f0
... in-flux r/w latch  1f81265f28 Child cache buffers chains level=1 child#=14180 
        Location from where latch is held: kcbgtcr: kslbegin excl: 
        Context saved from call: 71341989
        state=busy(exclusive) (val=0x2000000000000066) holder orapid = 102
    waiters [orapid (seconds since: put on list, posted, alive check)]:
     121 (2, 1341736205, 2)
     116 (2, 1341736205, 2)
     125 (2, 1341736205, 2)
     140 (2, 1341736205, 2)
     145 (2, 1341736205, 2)
     waiter count=5
freeing in-flux r/w latch for process state: 1fc165f9d0
... in-flux r/w latch  1f813aec18 Child cache buffers chains level=1 child#=20914 
        Location from where latch is held: kcbrls: kslbegin: 
        Context saved from call: 96505705
        state=busy(exclusive) (val=0x200000000000007b) holder orapid = 123
*** 2012-07-08 17:20:10.876
 wsd 0x1f8169a6c8, sbuf (nil), setid 9, op 0
lcuridx 0, lasz (nil)
*** 2012-07-08 17:20:10.876
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcbshlc_1], [33], [], [], [], [], [], []
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
ksfdmp()+21          call     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgerinv()+161        call     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFFCEB0 ? 7FBFFFCF10 ?
                                                   7FBFFFCE50 ? 000000000 ?
kgeasnmierr()+163    call     kgerinv()            0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000000 ?
kcbshlc()+239        call     kgeasnmierr()        0066876E0 ? 2A97200260 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
kslilcr()+770        call     kcbshlc()            0066876E0 ? 1F801DDB28 ?
                                                   7FBFFFCF10 ? 7FBFFFCE50 ?
                                                   000000000 ? 000000021 ?
ksl_cleanup()+1567   call     kslilcr()            7FBFFFCE50 ? 000000000 ?
                                                   7FBFFFDCE0 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxfl()+492         call     ksl_cleanup()        000000000 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksuxda()+55          call     ksuxfl()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksucln()+1390        call     ksuxda()             1FC165B8E0 ? 000000000 ?
                                                   000000000 ? 1F801DDB28 ?
                                                   0066876E0 ? 000000021 ?
ksbrdp()+794         call     ksucln()             060008100 ? 000000000 ?
                                                   FFFFFFFF9720ED9F ?
                                                   1F801DDB28 ? 0066876E0 ?
                                                   000000021 ?
opirip()+616         call     ksbrdp()             060008100 ? 000000000 ?
                                                   000000001 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF698 ? 060008100 ?
                                                   0066876E0 ? 000000021 ?
opimai_real()+317    call     sou2o()              7FBFFFF670 ? 000000032 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF700 ?
                                                   000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
__libc_start_main()  call     main()               000000003 ? 7FBFFFF700 ?
+219                                               000000004 ? 7FBFFFF698 ?
                                                   0066876E0 ? 000000021 ?
_start()+42          call     __libc_start_main()  000713984 ? 000000001 ?
                                                   7FBFFFF848 ? 005288D00 ?
                                                   000000000 ? 000000003 ?

通过这个trace可以看出数据库运行在LINUX 64操作系统,版本是10.2.0.4。
出现错误的原因:
PMON在清理1fc165d248的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理20406507f0的时候,因为被orapid = 102持有,导致清理失败.
PMON在清理1fc165f9d0的时候,因为被orapid = 123持有,导致清理失败.

查询MOS[443909.1]
发现是unpublished Bug 4723109.处理方法打上Patch 4723109.

dul 10 export_mode=true功能增强

在有次8i的库恢复中,因为硬盘损坏导致几个表出现很多诡异性坏块,尝试使用dul对其进行挖掘数据,当时使用dul 9 遇到一个难题:当一张表中有lob类型,同时又有varchar2类型,而且varchar2类型数据中包含回车键,使得解决起来很麻烦(因为export_mode=false支持lob,但是不支持字符串含回车;export_mode=true支持字符串含回车,但是不支持lob),最后放弃了对部分数据的挖掘.这个问题让我一直不甘心,今天测试dul 10 发现是用export_mode=true可以完美解决该问题
创建模拟表和插入数据

SQL> desc t_xff
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C_BLOB                                             BLOB
 C_VARCHAR                                          VARCHAR2(4000)

SQL> declare
  2  a_blob BLOB;
  3  bfile_name BFILE := BFILENAME('ULTLOBDIR','awr_ora11g_2012-06-01_174_175.html');
  4  begin
  5  insert into t_xff(C_BLOB,C_VARCHAR) values (
  6  empty_blob(),
  7  'www.orasos.com
  8  WWW.XIFENFEI.COM
  9  惜分飞
 10  欢迎访问惜分飞博客
 11  提供数据库异常恢复技术支持')
 12  returning C_BLOB into a_blob;
 13  dbms_lob.fileopen(bfile_name);
 14  dbms_lob.loadfromfile(a_blob, bfile_name, dbms_lob.getlength(bfile_name));
 15  dbms_lob.fileclose(bfile_name);
 16  commit;
 17  end;
 18  /

PL/SQL procedure successfully completed.

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from t_xff;

LENGTH(C_VARCHAR) DBMS_LOB.GETLENGTH(C_BLOB)
----------------- --------------------------
               61                    4282573

SQL>  select c_varchar from t_xff;

C_VARCHAR
---------------------------------------------------------------
www.orasos.com
WWW.XIFENFEI.COM
惜分飞
欢迎访问惜分飞博客
提供数据库异常恢复技术支持

dul 挖数据

[oracle@xifenfei dul]$ ./dul

Data UnLoader: 10.2.0.5.13 - Internal Only - on Mon Jul  2 04:29:10 2012
with 64-bit io functions

Copyright (c) 1994 2012 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only

DUL> bootstrap;
DUL> desc chf.t_xff;
Table CHF.T_XFF
obj#= 51353, dataobj#= 51353, ts#= 4, file#= 4, block#=67
      tab#= 0, segcols= 2, clucols= 0
Column information:
icol# 01 segcol# 01       C_BLOB len 4000 type 113 BLOB
  LOB Segment: dataobj#= 51354, ts#= 4, file#= 4, block#=75 chunk=1
  LOB Index: dataobj#= 51355, ts#= 4, file#= 4, block#=83
icol# 02 segcol# 02    C_VARCHAR len 4000 type  1 VARCHAR2 cs 852(ZHS16GBK)

--export_mode=false
DUL> unload table chf.t_xff;
. unloading (index organized) table     LOB01000053      65 rows unloaded
Preparing lob metadata from lob index
Reading LOB01000053.dat 65 entries loaded and sorted 65 entries
. unloading table                     T_XFF       1 row  unloaded

--导出数据文件
-rw-r--r-- 1 oracle oinstall 6.1K Jul  2 04:15 LOB01000053.dat
-rw-r--r-- 1 oracle oinstall  335 Jul  2 04:15 LOB01000053.ctl
-rw-r--r-- 1 oracle oinstall 8.2M Jul  2 04:15 CHF_T_XFF.dat
-rw-r--r-- 1 oracle oinstall  263 Jul  2 04:15 CHF_T_XFF.ctl

----export_mode=true
DUL> unload table chf.t_xff;
. unloading (index organized) table     LOB01000053
DUL: Warning: Recreating file "LOB01000053.ctl"
      65 rows unloaded
Preparing lob metadata from lob index
Reading LOB01000053.dat 65 entries loaded and sorted 65 entries
. unloading table                     T_XFF       1 row  unloaded

--导出数据文件
-rw-r--r-- 1 oracle oinstall    6229 Jul  2 04:29 LOB01000053.dat
-rw-r--r-- 1 oracle oinstall     335 Jul  2 04:29 LOB01000053.ctl
-rw-r--r-- 1 oracle oinstall 4285027 Jul  2 04:29 CHF_T_XFF.dmp

导入数据测试
sqlldr导入

SQL> truncate table chf.t_xff;

Table truncated.

[oracle@xifenfei dul]$ sqlldr chf/xifenfei control=CHF_T_XFF.ctl

SQL*Loader: Release 10.2.0.1.0 - Production on Mon Jul 2 04:23:18 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL*Loader-510: Physical record in data file (CHF_T_XFF.dat) is longer than the maximum(1048576)
SQL*Loader-2026: the load was aborted because SQL Loader cannot continue.
[oracle@xifenfei dul]$ sqlldr chf/xifenfei control=CHF_T_XFF.ctl readsize=20971520

SQL*Loader: Release 10.2.0.1.0 - Production on Mon Jul 2 04:26:50 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from chf.t_xff;

no rows selected

--试验结果证明在出现表中同时有lob和varchar2列(含回车)时,export_mode=false不能正常工作

imp导入

SQL> drop table chf.t_xff;

Table dropped.

[oracle@xifenfei dul]$ imp chf/xifenfei file=CHF_T_XFF.dmp full=y

Import: Release 10.2.0.1.0 - Production on Mon Jul 2 04:30:30 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

Export file created by EXPORT:V07.00.07 via conventional path

Warning: the objects were exported by Bernard's DUL, not by you

. importing Bernard's DUL's objects into CHF
. importing Bernard's DUL's objects into CHF
. . importing table                        "T_XFF"          1 rows imported

SQL> select length(c_varchar),dbms_lob.getlength(c_blob) from t_xff;

LENGTH(C_VARCHAR) DBMS_LOB.GETLENGTH(C_BLOB)
----------------- --------------------------
               61                    4282573

SQL>  select c_varchar from t_xff;

C_VARCHAR
---------------------------------------------------------------
www.orasos.com
WWW.XIFENFEI.COM
惜分飞
欢迎访问惜分飞博客
提供数据库异常恢复技术支持

--试验结果证明在出现表中同时有lob和varchar2列(含回车)时,export_mode=true正常工作

DBCA Fails With ORA-15243

今天接到朋友的电话说他们装ORACLE 11G R1 RAC的时候遇到ORA-12801/ORA-15243错误,请求我帮忙解决
具体情况
AIX系统以前装过11G R2 RAC,现因为项目要求11G R1,已经重装了系统,然后安装R1,在安装到DBCA配置ASM的时候,出现ORA-12801/ORA-15243错误

ORA-12801: error signaled in parallel query server PZ99, instance wmsdb1:+ASM1(1)
ORA-15243: 11.2.0.0.0 is not a valid version number


通过SQLPLUS登录ASM1实例查询发现该有一个ORADATA磁盘组,包含了一个/dev/rhdisk1.通过询问,得出结论是这个磁盘组以前是安装R2的时候作为存储OCR和VOTINGDISK使用,重装系统的时候未对该磁盘进行处理.

处理思路[想办法清除磁盘中asm信息]
1.尝试通过sqlplus 删除该磁盘组,报该磁盘组处于dismount状态
2.尝试mount该磁盘组,提示版本无效(ORA-15243)[当前的asm程序是11.1而磁盘组信息是11.2 程序当然不一致了]
3.直接使用dd清理该asm disk header信息(dd if=/dev/zero of=/dev/rhdisk1 bs=4096 count=1)
4.重新运行dbca一切工作正常

MOS中相关文章[1460997.1]只适合linux asmlib情况

Applies to:
Oracle Server - Enterprise Edition - Version 11.1.0.7 and later
Information in this document applies to any platform.

Symptoms
On : 11.1.0.7 version, STORAGE

When attempting to create database or query gv$asm_diskgroup,
the following error occurs.

ERROR
-----------------------
ORA-12801: error signaled in parallel query server PZ99, instance dchilcmsdb2.hq.navteq.com:+ASM2 (2)
ORA-15243: 11.2.0.0.0 is not a valid version number


STEPS
-----------------------
The issue can be reproduced at will with the following steps:
1. Previously had 11GR2 installed and configured. Removed this installation then installed 
   11.1.0.7  and created diskgroups using some of the same disks previously used.
2. Attempt to create database and receive the errors. Drop the newly 
   created diskgroups and query the view still get same errors.


BUSINESS IMPACT
-----------------------
The issue has the following business impact:
Due to this issue, users cannot create new database.

Changes
 Removed 11.2.0.1 installation and installed 11.1.0.7 software without cleaning up all of 
 the diskgroup information from previous installation.

Cause
All the current information shows that we are using correct binaries and 
that the diskgroups that are being used have correct comparability settings. 
HTML shows that the disks for the old diskgroup are still being discovered. 
This in conjunction with the text of the error as follows shows that 
we are picking up 11.2.0.0.0 as version from somewhere.
ORA-15243: 11.2.0.0.0 is not a valid version number
 
Problem was caused by the disks that had been used for the 
OCR/Voting disk diskgroup in 11GR2 installation still being present and accessible.
 

Solution
As the root user execute /etc/init.d/oracleasm/deletedisk command against all the disks 
that were previously used for the OCR/Voting disk diskgroup then try the operation again.

11g DirectPath Reads 噩梦案例

DirectPath Reads 说明
在oracle 11g以前的版本中,如果对大表进行全表扫描,wait event是:db file scattered read;在11g中,如果对大表进行全表扫描,wait event是:direct path read。在11g中,大表全表扫描时数据块不经过sga而直接进pga,这样会造成每次进行大表全表扫描,物理读都是很大,而在10g中,由于全表扫描的数据块在sga中已经存在,所以执行全表扫描时,它的物理读为0。但是这里主要是oracle在优化策略上的进步,即假定大表频繁全表扫描这种现象,在生产库上不会太多,通过把数据直接读入pga,进而减少了cache buffer的繁忙交换程度,提高了cache buffer的使用效率.

DirectPath Reads 优势
1. 减少了对栓的使用,避免可能的栓争用
2. 物理IO的大小不再取决于buffer_cache中所存在的块;试想某个8个块的extent中1,3,5,7号块在高速缓存中,而2,4,6,8块没有被缓存,传统的方式在读取该extent时将会是对2,4,6,8块进行4次db file sequential read,这是一种十分可怕的状况,其效率往往要比单次读取这个区间的所有8个块还要低得多,虽然Oracle为了避免这种情况总是尽可能的不缓存大表的块(读入后总是放在队列最冷的一端);而direct path read则可以完全避免这类问题,尽可能地单次读入更多的物理块。

DirectPath Reads 噩梦
这一切听起来都很美好,但是在大并发的OLTP系统中,这东西简直是一个噩梦.通过一个awr来说明该问题:这个是一个系统的awr报告,朋友反馈说系统有段运行缓慢,请求帮忙找出原因
分析总体信息

系统这段时间会话临时大幅度增加(从102增加到223),系统出现异常繁忙(60.62*16=969.92<2,454.52)
分析Load Profile信息

通过这个截图发现系统的业务不是很大,但是Physical reads参数异常
1.物理读大小:25071.1*8192/1024/1024=195.86796875M/S
2.物理读将近逻辑读一半,这个在一般系统中很难得到这个比例,进一步说明物理读过高

分析Top 5信息

这里可以发现direct path read等待很多

分析Host CPU

可以发现iowait很大占40.5%,io等待异常高(195M/S能不高吗?)
补充说明:在这里我们看到的%Idle=1-%System-%User不包括%WIO

处理建议
通过上面的评估,可以确定大部分是由于 导致了数据库的物理读过高,从而使得系统反应变慢,处理方法就是关闭掉11g该新特性
alter system set event= ‘10949 trace name context forever, level 1’ scope=spfile;
重启数据库