利用flashback database实现部分对象回滚

flashback database功能在生产库中,很少被直接使用,因为没有多少业务可以承受整个数据库级别的回滚.但是如果发生一些让人意想不到的误操作时候,想回滚该操作,我们不得不使用历史的备份来进行不完全恢复.如果没有历史备份,那简直是人生一个悲剧的发生.这里通过使用结合flashback database,实现flashback table级别不能完成的恢复,而且确保整个数据库的其他数据还是最新.这些操作比如:修改表结构,删除数据库用户等操作.这里通过修改表列的处理思路来展示该功能的使用方法,其他处理方法类此
1.确定启用flashback database功能

SQL> select flashback_on from v$database;

FLASHBACK_ON
------------------
YES

SQL>  show parameter flash

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_flash_cache_file                  string
db_flash_cache_size                  big integer 0
db_flashback_retention_target        integer     1440

2.模拟表结构被修改

SQL> create table t_xifenfei
  2  as
  3  select object_id,object_name from dba_objects;

表已创建。

SQL> alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';

会话已更改。

SQL>  select sysdate from dual;

SYSDATE
-------------------------
17-6月 -2012 15:25:24

SQL> ALTER TABLE t_xifenfei drop column object_name;

表已更改。

3.尝试flashback query功能

SQL> SELECT * FROM t_xifenfei as of timestamp to_timestamp('2012-06-17 15:25:24','yyyy-mm-dd hh24:mi:ss');
SELECT * FROM t_xifenfei as of timestamp to_timestamp('2012-06-17 15:25:24','yyyy-mm-dd hh24:mi:ss')
              *
第 1 行出现错误:
ORA-01466: 无法读取数据 - 表定义已更改
--这个证明因为ddl操作发生在表上,无法使用flashback table/query等操作

4.尝试flashback database

SQL> shutdown immediate
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> STARTUP MOUNT;
ORACLE 例程已经启动。

Total System Global Area  535662592 bytes
Fixed Size                  1385840 bytes
Variable Size             390072976 bytes
Database Buffers          138412032 bytes
Redo Buffers                5791744 bytes
数据库装载完毕。
SQL>  flashback database to timestamp to_date('2012-06-17 15:25:24','yyyy-mm-ddhh24:mi:ss');

闪回完成。

SQL> alter database open read only;

数据库已更改。

SQL> DESC CHF.T_XIFENFEI
 名称                                      是否为空? 类型
 ----------------------------------------- -------- ----------------------------

 OBJECT_ID                                          NUMBER
 OBJECT_NAME                                        VARCHAR2(128)

5.导出需要回滚对象

C:\Users\XIFENFEI>EXP chf/xifenfei tables=t_xifenfei file=d:\t_xifenfei.dmp 
>log=d:\t_xifenfei.log

Export: Release 11.2.0.3.0 - Production on 星期日 6月 17 15:40:37 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
已导出 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集

即将导出指定的表通过常规路径...
. . 正在导出表                      T_XIFENFEI导出了       75270 行
成功终止导出, 没有出现警告。

6.恢复数据库至最新状态

SQL> shutdown immediate
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL>  startup mount
ORACLE 例程已经启动。

Total System Global Area  535662592 bytes
Fixed Size                  1385840 bytes
Variable Size             390072976 bytes
Database Buffers          138412032 bytes
Redo Buffers                5791744 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

SQL> desc chf.t_xifenfei
 名称                                      是否为空? 类型
 ----------------------------------------- -------- ----------------------------

 OBJECT_ID                                          NUMBER

7.导入正确数据

SQL> drop table chf.t_xifenfei purge;

表已删除。

SQL> host imp chf/xifenfei tables=t_xifenfei file=d:\t_xifenfei.dmp 
>log=d:\t_xifenfei.log

Import: Release 11.2.0.3.0 - Production on 星期日 6月 17 15:45:53 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options

经由常规路径由 EXPORT:V11.02.00 创建的导出文件
已经完成 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集中的导入
. 正在将 CHF 的对象导入到 CHF
. 正在将 CHF 的对象导入到 CHF
. . 正在导入表                    "T_XIFENFEI"导入了       75270 行
成功终止导入, 没有出现警告。

SQL> desc chf.t_xifenfei
 名称                                      是否为空? 类型
 ----------------------------------------- -------- ----------------------------

 OBJECT_ID                                          NUMBER
 OBJECT_NAME                                        VARCHAR2(128)

使用asm disk header 自动备份信息恢复异常asm disk header

通过参考kamus的Where is the backup of ASM disk header block,发现从10.2.0.5开始的asm确实存在自动备份asm disk header功能.有了这个功能对于那些不备份asm disk header的同学,提供了一层保证,也增加了asm的安全性.
对于10.2.0.5.0以及以后版本,不管au size是多少,asm disk header自动备份存储的位置是第2个au的倒数第2个block.
计算方法:AU中包含的block num[AU_SIZE/block_size]*2-2[因为从第一个块从0计数],通过该方法计算结论为:
1M AU在510
2M AU在1022
4M AU在2046
8M AU在4094
16M AU在8190
32M AU在16382
64M AU在32766
1.对比备份asm disk header

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "xifenfei.com"  from dual;

xifenfei.com
-------------------
2012-06-17 09:41:19

SQL>  select group_number,DISK_NUMBER,PATH,HEADER_STATUS 
   2  from v$asm_disk where group_number<>0;

GROUP_NUMBER DISK_NUMBER PATH            HEADER_STATU
------------ ----------- --------------- ------------
           1           1 /dev/raw/raw2   MEMBER
           1           0 /dev/raw/raw1   MEMBER

SQL> select group_number,name,BLOCK_SIZE,ALLOCATION_UNIT_SIZE from v$asm_diskgroup;

GROUP_NUMBER NAME                           BLOCK_SIZE ALLOCATION_UNIT_SIZE
------------ ------------------------------ ---------- --------------------
           1 DATA                                 4096              1048576

rac1->  kfed read /dev/raw/raw1 blknum=510|>/tmp/xifenfei.510
rac1->  kfed read /dev/raw/raw1 blknum=0|>/tmp/xifenfei.0
rac1-> ll /tmp/xifenfei*
-rw-r--r--  1 oracle oinstall 6606 Jun 14 04:11 /tmp/xifenfei.0
-rw-r--r--  1 oracle oinstall 6606 Jun 14 04:12 /tmp/xifenfei.510
rac1-> diff /tmp/xifenfei.510 /tmp/xifenfei.0
--通过对比发现两者无不同记录返回,证明他们记录内容完全相同

2.尝试破坏asm disk header

rac1-> dd if=/dev/zero of=/dev/raw/raw1 bs=4096 count=1
1+0 records in
1+0 records out
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000

SQL> select group_number,DISK_NUMBER,PATH,HEADER_STATUS 
   2 from v$asm_disk where group_number<>0;

GROUP_NUMBER DISK_NUMBER PATH            HEADER_STATU
------------ ----------- --------------- ------------
           1           1 /dev/raw/raw2   MEMBER
           1           0 /dev/raw/raw1   CANDIDATE

SQL> alter diskgroup  data dismount;

Diskgroup altered.

SQL> alter diskgroup  data mount;
alter diskgroup  data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

3.使用kfed repair修改损坏asm disk header

rac1-> kfed  repair '/dev/raw/raw1'
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:              2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check:                   883602253 ; 0x00c: 0x34aab34d
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
…………

SQL> alter diskgroup  data mount;

Diskgroup altered.

4.使用kfed merge恢复asm disk header

rac1-> dd if=/dev/zero of=/dev/raw/raw1 bs=4096 count=1
1+0 records in
1+0 records out
rac1->  kfed read /dev/raw/raw1 blknum=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000

SQL> alter diskgroup  data dismount;

Diskgroup altered.

SQL> alter diskgroup  data mount;
alter diskgroup  data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

rac1->  kfed merge /dev/raw/raw1 /tmp/xifenfei.510

SQL> alter diskgroup  data mount;

Diskgroup altered.

通过试验证明在10.2.0.5及其以后版本中,对于备份的asm disk header我们可以通过使用kfed repair和kfed merge来恢复.

11G RAC库 ORA-00600[ktubko_1]错误

数据库版本信息

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "www.orasos.com" from dual;

www.orasos.com
-------------------
2012-06-12 14:44:53

数据库启动报ORA-00600[ktubko_1]错误

Database Characterset is ZHS16GBK
Errors in file /u01/diag/rdbms/xff/XFF1/trace/XFF1_smon_17248.trc  (incident=21754):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/diag/rdbms/xff/XFF1/incident/incdir_21754/XFF1_smon_17248_i21754.trc
Tue Jun 12 10:37:10 2012
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance XFF1 (pid = 19) - Error 600 encountered while recovering transaction (4, 2) on object 5887.
Errors in file /u01/diag/rdbms/xff/XFF1/trace/XFF1_smon_17248.trc:
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []

查看trace文件

Incorrect next uba in kturCurrBackoutOneChg while backing out xid: 0x0004.002.0000022b uba: 0x00c02068.00af.3b
Undo record:
ktubu redo: slt: 2 rci: 58 opc: 10.22 objn: 5887 objd: 5887 tsn: 1
Undo type:  Regular undo       Undo type:  Last buffer split:  No 
Tablespace Undo:  No 
             0x00000000
index undo for leaf key operations
KTB Redo 
op: 0x04  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0003.003.0000030a uba: 0x00c0200c.010a.15
                      flg: C---    lkc:  0     scn: 0x0000.00118c55
Dump kdilk : itl=3, kdxlkflg=0x1 sdc=0 indexid=0x800df2 block=0x00800df3
(kdxlre): restore leaf row (clear leaf delete flags)
key :(5):  02 c1 0d 01 80
keydata/bitmap: (6):  00 81 0f 41 00 01
Undo block: tsn 0x2 rdba: 0xc02068
Dump of buffer cache at level 4 for tsn=2 rdba=12591208
BH (0x33ff7264) file#: 3 rdba: 0x00c02068 (3/8296) class: 24 ba: 0x33f24000

*** 2012-06-12 10:36:40.265
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: -1 objn: 0 tsn: 2 afn: 3 hint: f
  hash: [0x3e78a49c,0x3e78a49c] lru: [0x33ff73ec,0x33ff723c]
  obj-flags: object_ckpt_list
  ckptq:[0x3e05cd68,0x33ff7f0c]fileq:[0x3e05cda4,0x3e05cda4]objq:[0x3b9094a4,0x3b9094a4]objaq:[0x33ff7acc,0x3b909494]
  st: XCURRENT md: NULL fpin: 'ktuwh23: ktubko' tch: 1
  flags: buffer_dirty redo_since_read
  LRBA: [0x15.26.0] LSCN: [0x0.11acb6] HSCN: [0x0.11acb6] HSUB: [1]
Data block dump: tsn: 0x1 rdba: 0x800df3
Dump of buffer cache at level 3 for tsn=1 rdba=8392179
BH (0x33ff70b4) file#: 2 rdba: 0x00800df3 (2/3571) class: 1 ba: 0x33f20000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: 5887 objn: 5887 tsn: 1 afn: 2 hint: f
  hash: [0x3e7d85e4,0x3e7d85e4] lru: [0x33ff723c,0x3e05c760]
  ckptq: [NULL] fileq: [NULL] objq: [0x3b9092a8,0x3b9092a8] objaq: [0x3b9092a0,0x3b9092a0]
  st: XCURRENT md: NULL fpin: 'kdiwh27: kdiulk' tch: 1
  flags:
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  buffer tsn: 1 rdba: 0x00800df3 (2/3571)
  scn: 0x0000.00118c67 seq: 0x01 flg: 0x06 tail: 0x8c670601
  frmt: 0x02 chkval: 0x5d04 type: 0x06=trans data

Block header dump:  0x00800df3
 Object id on Block? Y
 seg/obj: 0x16ff  csc: 0x00.118c60  itc: 3  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x800df0 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   0x0006.01d.00000319  0x00c00332.009b.15  --U-    1  fsc 0x000f.00118c67
0x03   0x0003.003.0000030a  0x00c0200c.010a.15  C---    0  scn 0x0000.00118c55


kcra_dump_redo_internal: skipped for critical process
Dumping redo for undo$
kcra_dump_redo_internal: skipped for critical process

*** 2012-06-12 10:36:43.906
Incident 21754 created, dump file: /u01/diag/rdbms/xff/XFF1/incident/incdir_21754/XFF1_smon_17248_i21754.trc
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []

ORACLE Instance XFF1 (pid = 19) - Error 600 encountered while recovering transaction (4, 2) on object 5887.

*** 2012-06-12 10:37:10.646
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            BFFD84FC ? 2 ?
ksedst1()+77         call     skdstdst()           BFFD84FC ? 0 ? 1 ? 8592A48 ?
                                                   85928C6 ? 8592A48 ?
ksedst()+33          call     ksedst1()            0 ? 1 ?
dbkedDefDump()+2704  call     ksedst()             0 ? 0 ? 3FFE17CC ? 3E7C5344 ?
                                                   1 ? 1 ?
ksedmp()+47          call     dbkedDefDump()       3 ? 0 ?
kturRecoverTxn()+52  call     ksedmp()             3 ? B6B3FEA4 ? 2C ?
471                                                B6B3FE38 ? 1 ? B6B3FE58 ?
kturRecoverUndoSegm  call     kturRecoverTxn()     BFFD9078 ? 2 ? 1 ? 0 ? 11 ?
ent()+1091                                         4 ? 1 ?
kturRecoverActiveTx  call     kturRecoverUndoSegm  4 ? 0 ? 1 ? 0 ? FFFF ? 11 ?
ns()+931                      ent()                4 ?
ktprbeg()+281        call     kturRecoverActiveTx  10 ? 0 ?
                              ns()                 
ktmmon()+13050       call     ktprbeg()            0 ? 1 ? 0 ? B6B5B72C ? 0 ?
                                                   0 ?
ktmSmonMain()+174    call     ktmmon()             20018C2C ? B6B46D2C ?
                                                   114E7B00 ? 0 ? 0 ? B6B59F50 ?
ksbrdp()+826         call     00000000             20018C2C ? 432A884E ? 0 ? 0 ?
                                                   0 ? 0 ?
opirip()+559         call     ksbrdp()             0 ? 0 ? 0 ? 0 ? 0 ? 0 ?
opidrv()+515         call     opirip()             32 ? 4 ? BFFDB20C ?
sou2o()+80           call     opidrv()             32 ? 4 ? BFFDB20C ?
opimai_real()+230    call     sou2o()              BFFDB1F0 ? 32 ? 4 ?
                                                   BFFDB20C ?
ssthrdmain()+212     call     00000000             3 ? BFFDB338 ? 0 ? 4318AF14 ?
                                                   BFFDB2F4 ? 4317E670 ?
main()+147           call     ssthrdmain()         3 ? BFFDB338 ?
__libc_start_main()  call     00000000             1 ? BFFDB434 ? BFFDB43C ?
+220                                               4317E828 ? 0 ? 1 ?
_start()+33          call     __libc_start_main()  856F1C4 ? 1 ? BFFDB434 ?
                                                   BCF1440 ? BCF1430 ?
                                                   43170790 ?
 

--------------------- Binary Stack Dump ---------------------

通过上面的trace可以看出是2/3571中包含了事务,但是和3/8296[4号回滚段]回滚中的信息不相符,从而出现了在数据库启动回滚的时候出现该错误.查询mos[ID 1318986.1]发现这个是数据库的Bug 10205230比较相似,虽说在11.2.0.2中修复而且在asm中不受该影响,我这里库是11.2.0.3的asm rac照样出现该bug.

解决方法
通过alert日志提示object可以找到object_id=5887.当然也可以通过trace中的rdba来确定

SQL> col OBJECT_NAME for a30 
SQL> select object_name,object_type,owner from dba_objects where object_id=5887;

OBJECT_NAME                    OBJECT_TYPE         OWNER
------------------------------ ------------------- ------------------------------
WRI$_ADV_MESSAGE_GROUPS_PK     INDEX               SYS


SQL> alter index sys.WRI$_ADV_MESSAGE_GROUPS_PK rebuild online;

Index altered.

补充说明:如果损坏对象是表,需要使用DBMS_REPAIR跳过坏块,然后重建表

重启数据库观察
数据库已经正常,开始报undo回滚段错误的记录已经不再存在,数据库恢复正常

Tue Jun 12 13:50:43 2012
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Tue Jun 12 13:51:11 2012
No Resource Manager plan active
Tue Jun 12 13:52:01 2012
Starting background process GTX0
Tue Jun 12 13:52:01 2012
GTX0 started with pid=29, OS id=14234 
Starting background process RCBG
Tue Jun 12 13:52:04 2012
RCBG started with pid=41, OS id=14238 
replication_dependency_tracking turned off (no async multimaster replication found)
Tue Jun 12 13:54:01 2012
Starting background process QMNC
Tue Jun 12 13:54:01 2012
QMNC started with pid=42, OS id=14279 
Tue Jun 12 13:57:26 2012
Completed: ALTER DATABASE OPEN

使用bbed让rac中的sysaux数据文件online

一个朋友的11g rac库的sysaux表空间因某种原因缺少历史归档,导致无法正常online,是的数据库的很多功能受限.通过实现展示恢复过程.
模拟环境

SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
+XIFENFEI/xff/datafile/sysaux.257.776961315                   2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE

6 rows selected.

SQL> alter database datafile 2 offline;

Database altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     14
Next log sequence to archive   15
Current log sequence           15
SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     19
Next log sequence to archive   19
Current log sequence           20

--删除部分归档日志
[grid@rac1 ~]$ asmcmd
ASMCMD> ls
DATA/
XIFENFEI/
ASMCMD> cd data
ASMCMD> ls
XFF/
rac-cluster/
ASMCMD> cd xff 
ASMCMD> ls
ARCHIVELOG/
CONTROLFILE/
ONLINELOG/
ASMCMD> cd archivelog
ASMCMD> ls
2012_03_03/
2012_04_13/
2012_04_30/
2012_05_01/
2012_05_24/
2012_06_12/
ASMCMD> cd 2012_06_12
ASMCMD> ls
thread_1_seq_15.280.785752747
thread_1_seq_16.281.785752845
thread_1_seq_17.282.785752929
thread_1_seq_18.283.785753043
thread_1_seq_19.284.785753115
ASMCMD> rm thread_1_seq_16.281.785752845
ASMCMD> rm thread_1_seq_15.280.785752747

尝试online 数据文件

SQL> alter database datafile 2 online;
alter database datafile 2 online
*
ERROR at line 1:
ORA-01113: file 2 needs media recovery
ORA-01110: data file 2: '+XIFENFEI/xff/datafile/sysaux.257.776961315'


SQL> recover datafile 2;
ORA-00279: change 1155352 generated at 06/12/2012 08:20:10 needed for thread 1
ORA-00289: suggestion :
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-00280: change 1155352 for thread 1 is in sequence #15


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747'
ORA-17503: ksfdopn:2 Failed to open file
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-15012: ASM file
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747' does not exist


ORA-00308: cannot open archived log
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747'
ORA-17503: ksfdopn:2 Failed to open file
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-15012: ASM file
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747' does not exist

准备bbed修改数据文件
现在datafile 2不能恢复,我们需要修改的就是该datafile header 相关的scn等信息,另外拷贝一个数据文件出来做修改时候参考

RMAN> copy datafile 2 to  '/tmp/auxsys.dbf_rman';

Starting backup at 2012-06-12 08:59:07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 instance=XFF1 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=+XIFENFEI/xff/datafile/sysaux.257.776961315
output file name=/tmp/auxsys.dbf_rman tag=TAG20120612T090029 RECID=1 STAMP=785754322
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:03:50
Finished backup at 2012-06-12 09:05:36

RMAN>  copy datafile 4 to '/tmp/user.dbf_rman';

Starting backup at 2012-06-12 09:09:28
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+XIFENFEI/xff/datafile/user_dd.dbf
output file name=/tmp/user.dbf_rman tag=TAG20120612T090932 RECID=2 STAMP=785754582
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
Finished backup at 2012-06-12 09:09:48

bbed修改datafile header

[oracle@rac1 tmp]$ bbed password=blockedit listfile=/tmp/o_bbed  mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Tue Jun 12 09:37:30 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /tmp/auxsys.dbf_rman                                                 0
     2  /tmp/user.dbf_rman                                                   0

BBED> set file 2 block 1
        FILE#           2
        BLOCK#          1


BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a787
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed5a9cd
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x00000014
         ub4 kcrbabno                       @504      0x000000c5
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc                             
ub4 kcvfhcpc                                @140      0x00000086

BBED> p kcvfhccc                             
ub4 kcvfhccc                                @148      0x00000085

BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1

BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a118
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed59e3a
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x0000000f
         ub4 kcrbabno                       @504      0x0000c4ed
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc
ub4 kcvfhcpc                                @140      0x00000079

BBED>  p kcvfhccc 
ub4 kcvfhccc                                @148      0x00000078

/*
确定需要修改项kscnbas/kcvcptim/kcvfhcpc/kcvfhccc的相关信息
*/

BBED> set count 16
        COUNT           16

BBED> d file 2 block 1 offset 484                            
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  484 to  499           Dba:0x00800001
------------------------------------------------------------------------
 87a71100 00001000 cda9d52e 01000000 

 <32 bytes per line>

BBED> m /x 87a71100 file 1 block 1 offset 484
BBED-00209: invalid number (87a71100)


BBED> m /x 87a7 file 1 block 1 offset 484
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  484 to  499           Dba:0x00400001
------------------------------------------------------------------------
 87a71100 00000000 3a9ed52e 01000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 492
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  492 to  507           Dba:0x00800001
------------------------------------------------------------------------
 cda9d52e 01000000 14000000 c5000000 

 <32 bytes per line>

BBED> m /x cda9d52e file 1 block 1 offset 492
BBED-00209: invalid number (cda9d52e)


BBED> d file 1 block 1 offset 492
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  492 to  507           Dba:0x00400001
------------------------------------------------------------------------
 3a9ed52e 01000000 0f000000 edc40000 

 <32 bytes per line>

BBED> m /x cda9 file 1 block 1 offset 492
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  492 to  507           Dba:0x00400001
------------------------------------------------------------------------
 cda9d52e 01000000 0f000000 edc40000 

 <32 bytes per line>

BBED> d file 1 block 1 offset 140
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  140 to  155           Dba:0x00400001
------------------------------------------------------------------------
 79000000 2970bc2e 78000000 00000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 140
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  140 to  155           Dba:0x00800001
------------------------------------------------------------------------
 86000000 2970bc2e 85000000 00000000 

 <32 bytes per line>

BBED> m /x 86000000 file 1 block 1 offset 140
BBED-00209: invalid number (86000000)


BBED> m /x 8600 file 1 block 1 offset 140
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  140 to  155           Dba:0x00400001
------------------------------------------------------------------------
 86000000 2970bc2e 78000000 00000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 148
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  148 to  163           Dba:0x00800001
------------------------------------------------------------------------
 85000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> m /x 8500 file 1 block 1 offset 148
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  148 to  163           Dba:0x00400001
------------------------------------------------------------------------
 85000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1

BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a787
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed5a9cd
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x0000000f
         ub4 kcrbabno                       @504      0x0000c4ed
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc
ub4 kcvfhcpc                                @140      0x00000086

BBED>  p kcvfhccc 
ub4 kcvfhccc                                @148      0x00000085

BBED> sum apply
Check value for File 1, Block 1:
current = 0x48c4, required = 0x48c4

使用修改后数据文件尝试online

SQL> alter database rename file '+XIFENFEI/xff/datafile/sysaux.257.776961315' to '/tmp/auxsys.dbf_rman';

Database altered.

SQL> recover database datafile 2 ;
ORA-00274: illegal recovery option DATAFILE


SQL> recover database datafile 2;
ORA-00274: illegal recovery option DATAFILE


SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-01122: database file 2 failed verification check
ORA-01110: data file 2: '/tmp/auxsys.dbf_rman'
ORA-01207: file is more recent than control file - old control file

尝试重建控制文件

SQL> alter database backup controlfile to trace as '/tmp/xifenfei.ctl';

Database altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1346140 bytes
Variable Size             411043236 bytes
Database Buffers          117440512 bytes
Redo Buffers                5832704 bytes
SQL> @xifenfei_ctl

CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode

--在rac中重建控制文件需要设置cluster_database=FALSE 

SQL> alter system set  cluster_database=FALSE scope=spfile;

System altered.

SQL> shutdown immediate;
ORA-01507: database not mounted


ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1346140 bytes
Variable Size             411043236 bytes
Database Buffers          117440512 bytes
Redo Buffers                5832704 bytes
SQL> @xifenfei_ctl

Control file created.

online数据文件
重建控制文件恢复数据库之后 datafile 2自动online成功,省去了手工处理麻烦,如果没有自动online,请手工处理

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> col name for a52
SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
/tmp/auxsys.dbf_rman                                          2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE

6 rows selected.

文件系统中的datafile 2 恢复到asm中

SQL> alter database datafile 2 offline;

Database altered.

RMAN> copy datafile 2 to '+XIFENFEI';

Starting backup at 2012-06-12 10:55:42
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=21 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/tmp/auxsys.dbf_rman
output file name=+XIFENFEI/xff/datafile/sysaux.257.785761227 tag=TAG20120612T105800 RECID=1 STAMP=785762097
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:16:24
Finished backup at 2012-06-12 11:15:05


RMAN> switch datafile 2 to copy;

datafile 2 switched to datafile copy "+XIFENFEI/xff/datafile/sysaux.257.785761227"

RMAN> recover datafile 2;

Starting recover at 2012-06-12 11:30:32
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:01:30

Finished recover at 2012-06-12 11:34:11

RMAN> sql 'alter database datafile 2 online';

sql statement: alter database datafile 2 online

验证和收尾工作

SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
+XIFENFEI/xff/datafile/sysaux.257.785761227                   2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE


SQL> alter system set  cluster_database=true scope=spfile;

System altered.

--然后重启节点

resmgr:cpu quantum等待

昨天晚上数据库升级(使用exp/imp从9i升级到11g),开启业务,数据库出现很多resmgr:cpu quantum等待

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> SELECT sid,event FROM v$session WHERE wait_class#<>6;

       SID EVENT
---------- ----------------------------------------------------------------
        27 resmgr:cpu quantum
        38 resmgr:cpu quantum
        43 resmgr:cpu quantum
        46 resmgr:cpu quantum
       113 resmgr:cpu quantum
       118 resmgr:cpu quantum
       125 resmgr:cpu quantum
       140 resmgr:cpu quantum
       143 resmgr:cpu quantum
       199 resmgr:cpu quantum
       205 resmgr:cpu quantum

       SID EVENT
---------- ----------------------------------------------------------------
       212 resmgr:cpu quantum
       220 resmgr:cpu quantum
       221 resmgr:cpu quantum
       223 resmgr:cpu quantum
       238 resmgr:cpu quantum
       241 resmgr:cpu quantum
       301 resmgr:cpu quantum
       313 resmgr:cpu quantum
       314 resmgr:cpu quantum
       405 resmgr:cpu quantum
       410 resmgr:cpu quantum

       SID EVENT
---------- ----------------------------------------------------------------
       415 resmgr:cpu quantum
       435 resmgr:cpu quantum
       502 resmgr:cpu quantum
       503 resmgr:cpu quantum
       509 resmgr:cpu quantum
       510 resmgr:cpu quantum
       512 resmgr:cpu quantum
       521 resmgr:cpu quantum
       526 resmgr:cpu quantum
       528 resmgr:cpu quantum
       532 resmgr:cpu quantum

       SID EVENT
---------- ----------------------------------------------------------------
       533 enq: TX - row lock contention
       589 resmgr:cpu quantum
       596 resmgr:cpu quantum
       600 resmgr:cpu quantum
       609 resmgr:cpu quantum
       611 resmgr:cpu quantum
       625 resmgr:cpu quantum
       635 null event
       707 resmgr:cpu quantum
       727 resmgr:cpu quantum
       731 SQL*Net message to client

44 rows selected.

查询alert日志

Sat Jun 09 06:00:00 2012
Setting Resource Manager plan SCHEDULER[0x310C]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Jun 10 02:00:00 2012
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter

Sun Jun 10 06:00:00 2012
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Mon Jun 11 02:00:00 2012
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter

Mon Jun 11 22:00:00 2012
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Tue Jun 12 02:00:00 2012
Closing scheduler window
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter

Tue Jun 12 22:00:00 2012
Setting Resource Manager plan SCHEDULER[0x3108]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Wed Jun 13 02:00:00 2012
Closing scheduler window
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter

从这里可以看出来,因为SCHEDULER定时启动和关闭资源管理的DEFAULT_MAINTENANCE_PLAN从而导致在晚上10点到2点Resource Manager plan处于启用状态.上线测试刚好在晚上2点之前,所有当时查询的时候发现很多resmgr:cpu quantum等待是因为Resource Manager plan启用导致(使用SCHEDULER控制其启用和关闭),很多情况下数据库跑的应用比较单一,不是十分的需要启动资源管理.
在11g中关闭方法如下

1. Set the current resource manager plan to null (or another plan that is not restrictive):

alter system set resource_manager_plan='' scope=both;

2. Change the active windows to use the null resource manager plan (or other nonrestrictive plan) using:

execute dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','RESOURCE_PLAN',''); 
execute dbms_scheduler.set_attribute('WEEKEND_WINDOW','RESOURCE_PLAN','');

3. Then, for each window_name (WINDOW_NAME from DBA_SCHEDULER_WINDOWS), run:

execute dbms_scheduler.set_attribute('<window name>','RESOURCE_PLAN','');

SQL> select WINDOW_NAME  from DBA_SCHEDULER_WINDOWS;

WINDOW_NAME
------------------------------
MONDAY_WINDOW
TUESDAY_WINDOW
WEDNESDAY_WINDOW
THURSDAY_WINDOW
FRIDAY_WINDOW
SATURDAY_WINDOW
SUNDAY_WINDOW
WEEKNIGHT_WINDOW
WEEKEND_WINDOW

9 rows selected.