使用bbed让rac中的sysaux数据文件online

一个朋友的11g rac库的sysaux表空间因某种原因缺少历史归档,导致无法正常online,是的数据库的很多功能受限.通过实现展示恢复过程.
模拟环境

SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
+XIFENFEI/xff/datafile/sysaux.257.776961315                   2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE

6 rows selected.

SQL> alter database datafile 2 offline;

Database altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     14
Next log sequence to archive   15
Current log sequence           15
SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     19
Next log sequence to archive   19
Current log sequence           20

--删除部分归档日志
[grid@rac1 ~]$ asmcmd
ASMCMD> ls
DATA/
XIFENFEI/
ASMCMD> cd data
ASMCMD> ls
XFF/
rac-cluster/
ASMCMD> cd xff 
ASMCMD> ls
ARCHIVELOG/
CONTROLFILE/
ONLINELOG/
ASMCMD> cd archivelog
ASMCMD> ls
2012_03_03/
2012_04_13/
2012_04_30/
2012_05_01/
2012_05_24/
2012_06_12/
ASMCMD> cd 2012_06_12
ASMCMD> ls
thread_1_seq_15.280.785752747
thread_1_seq_16.281.785752845
thread_1_seq_17.282.785752929
thread_1_seq_18.283.785753043
thread_1_seq_19.284.785753115
ASMCMD> rm thread_1_seq_16.281.785752845
ASMCMD> rm thread_1_seq_15.280.785752747

尝试online 数据文件

SQL> alter database datafile 2 online;
alter database datafile 2 online
*
ERROR at line 1:
ORA-01113: file 2 needs media recovery
ORA-01110: data file 2: '+XIFENFEI/xff/datafile/sysaux.257.776961315'


SQL> recover datafile 2;
ORA-00279: change 1155352 generated at 06/12/2012 08:20:10 needed for thread 1
ORA-00289: suggestion :
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-00280: change 1155352 for thread 1 is in sequence #15


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747'
ORA-17503: ksfdopn:2 Failed to open file
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-15012: ASM file
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747' does not exist


ORA-00308: cannot open archived log
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747'
ORA-17503: ksfdopn:2 Failed to open file
+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747
ORA-15012: ASM file
'+DATA/xff/archivelog/2012_06_12/thread_1_seq_15.280.785752747' does not exist

准备bbed修改数据文件
现在datafile 2不能恢复,我们需要修改的就是该datafile header 相关的scn等信息,另外拷贝一个数据文件出来做修改时候参考

RMAN> copy datafile 2 to  '/tmp/auxsys.dbf_rman';

Starting backup at 2012-06-12 08:59:07
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 instance=XFF1 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=+XIFENFEI/xff/datafile/sysaux.257.776961315
output file name=/tmp/auxsys.dbf_rman tag=TAG20120612T090029 RECID=1 STAMP=785754322
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:03:50
Finished backup at 2012-06-12 09:05:36

RMAN>  copy datafile 4 to '/tmp/user.dbf_rman';

Starting backup at 2012-06-12 09:09:28
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+XIFENFEI/xff/datafile/user_dd.dbf
output file name=/tmp/user.dbf_rman tag=TAG20120612T090932 RECID=2 STAMP=785754582
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
Finished backup at 2012-06-12 09:09:48

bbed修改datafile header

[oracle@rac1 tmp]$ bbed password=blockedit listfile=/tmp/o_bbed  mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Tue Jun 12 09:37:30 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /tmp/auxsys.dbf_rman                                                 0
     2  /tmp/user.dbf_rman                                                   0

BBED> set file 2 block 1
        FILE#           2
        BLOCK#          1


BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a787
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed5a9cd
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x00000014
         ub4 kcrbabno                       @504      0x000000c5
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc                             
ub4 kcvfhcpc                                @140      0x00000086

BBED> p kcvfhccc                             
ub4 kcvfhccc                                @148      0x00000085

BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1

BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a118
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed59e3a
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x0000000f
         ub4 kcrbabno                       @504      0x0000c4ed
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc
ub4 kcvfhcpc                                @140      0x00000079

BBED>  p kcvfhccc 
ub4 kcvfhccc                                @148      0x00000078

/*
确定需要修改项kscnbas/kcvcptim/kcvfhcpc/kcvfhccc的相关信息
*/

BBED> set count 16
        COUNT           16

BBED> d file 2 block 1 offset 484                            
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  484 to  499           Dba:0x00800001
------------------------------------------------------------------------
 87a71100 00001000 cda9d52e 01000000 

 <32 bytes per line>

BBED> m /x 87a71100 file 1 block 1 offset 484
BBED-00209: invalid number (87a71100)


BBED> m /x 87a7 file 1 block 1 offset 484
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  484 to  499           Dba:0x00400001
------------------------------------------------------------------------
 87a71100 00000000 3a9ed52e 01000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 492
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  492 to  507           Dba:0x00800001
------------------------------------------------------------------------
 cda9d52e 01000000 14000000 c5000000 

 <32 bytes per line>

BBED> m /x cda9d52e file 1 block 1 offset 492
BBED-00209: invalid number (cda9d52e)


BBED> d file 1 block 1 offset 492
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  492 to  507           Dba:0x00400001
------------------------------------------------------------------------
 3a9ed52e 01000000 0f000000 edc40000 

 <32 bytes per line>

BBED> m /x cda9 file 1 block 1 offset 492
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  492 to  507           Dba:0x00400001
------------------------------------------------------------------------
 cda9d52e 01000000 0f000000 edc40000 

 <32 bytes per line>

BBED> d file 1 block 1 offset 140
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  140 to  155           Dba:0x00400001
------------------------------------------------------------------------
 79000000 2970bc2e 78000000 00000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 140
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  140 to  155           Dba:0x00800001
------------------------------------------------------------------------
 86000000 2970bc2e 85000000 00000000 

 <32 bytes per line>

BBED> m /x 86000000 file 1 block 1 offset 140
BBED-00209: invalid number (86000000)


BBED> m /x 8600 file 1 block 1 offset 140
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  140 to  155           Dba:0x00400001
------------------------------------------------------------------------
 86000000 2970bc2e 78000000 00000000 

 <32 bytes per line>

BBED> d file 2 block 1 offset 148
 File: /tmp/user.dbf_rman (2)
 Block: 1                Offsets:  148 to  163           Dba:0x00800001
------------------------------------------------------------------------
 85000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> m /x 8500 file 1 block 1 offset 148
 File: /tmp/auxsys.dbf_rman (1)
 Block: 1                Offsets:  148 to  163           Dba:0x00400001
------------------------------------------------------------------------
 85000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> set file 1 block 1
        FILE#           1
        BLOCK#          1

BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484     
   struct kcvcpscn, 8 bytes                 @484     
      ub4 kscnbas                           @484      0x0011a787
      ub2 kscnwrp                           @488      0x0000
   ub4 kcvcptim                             @492      0x2ed5a9cd
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500     
      struct kcvcprba, 12 bytes             @500     
         ub4 kcrbaseq                       @500      0x0000000f
         ub4 kcrbabno                       @504      0x0000c4ed
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00

BBED> p kcvfhcpc
ub4 kcvfhcpc                                @140      0x00000086

BBED>  p kcvfhccc 
ub4 kcvfhccc                                @148      0x00000085

BBED> sum apply
Check value for File 1, Block 1:
current = 0x48c4, required = 0x48c4

使用修改后数据文件尝试online

SQL> alter database rename file '+XIFENFEI/xff/datafile/sysaux.257.776961315' to '/tmp/auxsys.dbf_rman';

Database altered.

SQL> recover database datafile 2 ;
ORA-00274: illegal recovery option DATAFILE


SQL> recover database datafile 2;
ORA-00274: illegal recovery option DATAFILE


SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-01122: database file 2 failed verification check
ORA-01110: data file 2: '/tmp/auxsys.dbf_rman'
ORA-01207: file is more recent than control file - old control file

尝试重建控制文件

SQL> alter database backup controlfile to trace as '/tmp/xifenfei.ctl';

Database altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1346140 bytes
Variable Size             411043236 bytes
Database Buffers          117440512 bytes
Redo Buffers                5832704 bytes
SQL> @xifenfei_ctl

CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS  ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode

--在rac中重建控制文件需要设置cluster_database=FALSE 

SQL> alter system set  cluster_database=FALSE scope=spfile;

System altered.

SQL> shutdown immediate;
ORA-01507: database not mounted


ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.

Total System Global Area  535662592 bytes
Fixed Size                  1346140 bytes
Variable Size             411043236 bytes
Database Buffers          117440512 bytes
Redo Buffers                5832704 bytes
SQL> @xifenfei_ctl

Control file created.

online数据文件
重建控制文件恢复数据库之后 datafile 2自动online成功,省去了手工处理麻烦,如果没有自动online,请手工处理

SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> col name for a52
SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
/tmp/auxsys.dbf_rman                                          2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE

6 rows selected.

文件系统中的datafile 2 恢复到asm中

SQL> alter database datafile 2 offline;

Database altered.

RMAN> copy datafile 2 to '+XIFENFEI';

Starting backup at 2012-06-12 10:55:42
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=21 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/tmp/auxsys.dbf_rman
output file name=+XIFENFEI/xff/datafile/sysaux.257.785761227 tag=TAG20120612T105800 RECID=1 STAMP=785762097
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:16:24
Finished backup at 2012-06-12 11:15:05


RMAN> switch datafile 2 to copy;

datafile 2 switched to datafile copy "+XIFENFEI/xff/datafile/sysaux.257.785761227"

RMAN> recover datafile 2;

Starting recover at 2012-06-12 11:30:32
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:01:30

Finished recover at 2012-06-12 11:34:11

RMAN> sql 'alter database datafile 2 online';

sql statement: alter database datafile 2 online

验证和收尾工作

SQL> select name,file#,status from v$datafile;

NAME                                                      FILE# STATUS
---------------------------------------------------- ---------- -------
+XIFENFEI/xff/datafile/system.256.776961315                   1 SYSTEM
+XIFENFEI/xff/datafile/sysaux.257.785761227                   2 ONLINE
+XIFENFEI/xff/datafile/undotbs1.258.776961317                 3 ONLINE
+XIFENFEI/xff/datafile/user_dd.dbf                            4 ONLINE
+XIFENFEI/xff/datafile/undotbs2.264.776961693                 5 ONLINE
+XIFENFEI/asm/datafile/xifenfei01.dbf.268.781967893           6 ONLINE


SQL> alter system set  cluster_database=true scope=spfile;

System altered.

--然后重启节点

忘记执行end backup命令数据库恢复

遇到两次begin backup忘记end backup导致的悲剧.虽然不是自己亲身经历,但是感触很深,这里做了一个小实验,说明在begin backup后忘记end backup,而又丢失了备份归档日志,且数据库异常重启的事故恢复(这里为了加大实验难道,并且使用begin backup命令后的热备文件恢复)
模拟begin end

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     37
Next log sequence to archive   39
Current log sequence           39
SQL> alter tablespace bbed begin backup;

Tablespace altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     37
Next log sequence to archive   39
Current log sequence           39
SQL> drop table chf.t_xff;

Table dropped.

SQL> create table chf.t_xff
  2    as
  3  select * from dba_objects;

Table created.

SQL> alter system switch logfile;

System altered.

SQL>   delete from chf.t_XFF;

30811 rows deleted.

SQL>   commit;

Commit complete.

SQL>   alter system  switch logfile;

System altered.

SQL> /

System altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     40
Next log sequence to archive   42
Current log sequence           42

cp备份文件

[oracle@xifenfei xifenfei]$ cp bbed01.dbf bbed01.dbf_05
[oracle@xifenfei xifenfei]$ cp bbed02.dbf bbed02.dbf_05

继续操作数据库

SQL> alter system switch logfile;

System altered.

SQL>   insert into chf.t_xff
  2    select * from dba_objects;

30811 rows created.

SQL> commit;

Commit complete.

SQL>  archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/xifenfei/archive
Oldest online log sequence     41
Next log sequence to archive   43
Current log sequence           43
SQL> alter system switch logfile;

System altered.

模拟异常关闭数据库

SQL> shutdown immediate;
ORA-01149: cannot shutdown - file 11 has online backup set
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'
SQL> shutdown abort;
ORACLE instance shut down.

删除部分归档日志(模拟归档日志丢失)

[oracle@xifenfei archive]$ mv 1_39.dbf 1_39.dbf_bak
[oracle@xifenfei archive]$ mv 1_40.dbf 1_40.dbf_bak

启动数据库

[oracle@xifenfei xifenfei]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jun 5 03:02:56 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
Database mounted.
ORA-01113: file 11 needs media recovery
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'

分析相关SCN

SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
  2  To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;

     FILE# STATUS  SCN               TIME
---------- ------- ----------------- -------------------
        11 ONLINE     12286828683164 2012-06-05 02:55:43
        12 ONLINE     12286828683164 2012-06-05 02:55:43

SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
  2  to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;

     FILE# SCN              STOP_SCN
---------- ---------------- ----------------
         1   12286828684636
         2   12286828684636
         3   12286828684636
         4   12286828684636
         5   12286828684636
         6   12286828684636
         7   12286828684636
         8   12286828684636
         9   12286828684636
        10   12286828684636
        11   12286828683164
        12   12286828683164

12 rows selected.

SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
  2  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
  3  from v$datafile_header;

     FILE# SCN               RESETLOGS SCN
---------- ----------------- -----------------
         1    12286828684636            174968
         2    12286828684636            174968
         3    12286828684636            174968
         4    12286828684636            174968
         5    12286828684636            174968
         6    12286828684636            174968
         7    12286828684636            174968
         8    12286828684636            174968
         9    12286828684636            174968
        10    12286828684636            174968
        11    12286828683164            174968
        12    12286828683164            174968

12 rows selected.

SQL> select file#,to_char(CHANGE#,'9999999999999999') "SCN",
  2  to_char(TIME,'yyyy-mm-dd hh24:mi:ss') "TIME" from v$backup;

     FILE# SCN               TIME
---------- ----------------- -------------------
         1                 0
         2                 0
         3                 0
         4                 0
         5                 0
         6                 0
         7                 0
         8                 0
         9                 0 
        10                 0
        11    12286828683164 2012-06-05 02:55:43
        12    12286828683164 2012-06-05 02:55:43

12 rows selected.

发现数据库未end backup

Tue Jun  5 02:55:43 2012
alter tablespace bbed begin backup
Tue Jun  5 02:55:43 2012
Completed: alter tablespace bbed begin backup

尝试end backup
出现这个错误是正常的,因为我替换回来的bbed表空间数据文件的版本信息可能和控制文件的不一致,解决方法是重建控制文件

SQL> alter tablespace bbed end backup;
alter tablespace bbed end backup
*
ERROR at line 1:
ORA-01235: END BACKUP failed for 2 file(s) and succeeded for 0
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/oradata/xifenfei/bbed02.dbf'
ORA-01208: data file is an old version - not accessing current version
ORA-01122: database file 11 failed verification check
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'
ORA-01208: data file is an old version - not accessing current version

重建控制文件

SQL> shutdown abort;
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes

SQL>@ctl.sql
Control file created.

尝试恢复数据库

SQL> recover database;
ORA-00279: change 12286828683164 generated at 06/05/2012 02:55:43 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/xifenfei/archive/1_39.dbf
ORA-00280: change 12286828683164 for thread 1 is in sequence #39


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log
'/u01/oracle/oradata/xifenfei/archive/1_39.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3


ORA-00308: cannot open archived log
'/u01/oracle/oradata/xifenfei/archive/1_39.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3

执行end backup

SQL> alter tablespace bbed end backup;

Tablespace altered.

再次查看相关SCN
可以发现end backup之后,datafile header 的scn发生了改变,说明begin backup主要是冻住了datafile header scn

SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
  2  To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;

     FILE# STATUS  SCN               TIME
---------- ------- ----------------- -------------------
         1 ONLINE     12286828684636 2012-06-05 03:00:46
         2 ONLINE     12286828684636 2012-06-05 03:00:46
         3 ONLINE     12286828684636 2012-06-05 03:00:46
         4 ONLINE     12286828684636 2012-06-05 03:00:46
         5 ONLINE     12286828684636 2012-06-05 03:00:46
         6 ONLINE     12286828684636 2012-06-05 03:00:46
         7 ONLINE     12286828684636 2012-06-05 03:00:46
         8 ONLINE     12286828684636 2012-06-05 03:00:46
         9 ONLINE     12286828684636 2012-06-05 03:00:46
        10 ONLINE     12286828684636 2012-06-05 03:00:46
        11 ONLINE     12286828683821 2012-06-05 02:56:26
        12 ONLINE     12286828683821 2012-06-05 02:56:26

12 rows selected.

SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
  2  to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;

     FILE# SCN              STOP_SCN
---------- ---------------- ----------------
         1   12286828684636
         2   12286828684636
         3   12286828684636
         4   12286828684636
         5   12286828684636
         6   12286828684636
         7   12286828684636
         8   12286828684636
         9   12286828684636
        10   12286828684636
        11   12286828684636
        12   12286828684636

12 rows selected.

SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
  2  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
  3  from v$datafile_header;

     FILE# SCN               RESETLOGS SCN
---------- ----------------- -----------------
         1    12286828684636            174968
         2    12286828684636            174968
         3    12286828684636            174968
         4    12286828684636            174968
         5    12286828684636            174968
         6    12286828684636            174968
         7    12286828684636            174968
         8    12286828684636            174968
         9    12286828684636            174968
        10    12286828684636            174968
        11    12286828683821            174968
        12    12286828683821            174968

12 rows selected.

再次尝试恢复数据库

SQL> recover database;
ORA-00279: change 12286828683821 generated at 06/05/2012 02:56:26 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/xifenfei/archive/1_41.dbf
ORA-00280: change 12286828683821 for thread 1 is in sequence #41


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
Log applied.
Media recovery complete.
SQL> alter database open;

Database altered.

总结说明
在数据库忘记end backup,而又被异常重启数据库时候,会提示你需要恢复.这个时候如果你有所有的归档日志,那没有任何问题,直接recover就可以了.如果因为begin backup命令执行比较久,部分归档日志丢失,这个时候不能直接recover,可以先尝试end backup,然后在recover.如果在这个时候还发现有部分日志不存在,那只能考虑bbed修改datafile header的scn.
温馨提醒:各位dba在执行begin backup之后一定要记得end backup

rman通过nfs备份

挂载nfs

[root@xifenfei tmp]# mount -t nfs 192.168.1.90:/tmp/nfs /nfs
[root@xifenfei tmp]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              18G   12G  5.2G  70% /
tmpfs                 737M     0  737M   0% /dev/shm
/dev/sdb1              20G  7.8G   11G  42% /u01/oracle/oradata
192.168.1.90:/tmp/nfs
                       18G   13G  3.9G  77% /nfs

rman备份

[oracle@xifenfei nfs]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Wed May 30 16:31:40 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: XFF (DBID=3426707456)

RMAN> backup datafile 1 format '/nfs/rman/system01_%U';

Starting backup at 3-JUN-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=146 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 05/30/2012 16:32:17
ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode
continuing other job steps, job failed will not be re-run
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 3-JUN-12
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 05/30/2012 16:32:20
ORA-19504: failed to create file "/nfs/rman/system01_02nc9rgh_1_1"
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 3

重新挂载nfs

[root@xifenfei ~]# umount /nfs
[root@xifenfei tmp]# mount -t nfs -o rw,bg,hard,rsize=32768,wsize=32768,
>vers=3,nointr,timeo=600,tcp 192.168.1.90:/tmp/nfs /nfs

rman重新备份

[oracle@xifenfei ~]$ rman target /

Recovery Manager: Release 10.2.0.1.0 - Production on Wed May 30 16:38:14 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: XFF (DBID=3426707456)

RMAN> backup datafile 1 format '/nfs/rman/system01_%U';

Starting backup at 3-JUN-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=143 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=/u01/oracle/oradata/XFF/system01.dbf
channel ORA_DISK_1: starting piece 1 at 3-JUN-12
channel ORA_DISK_1: finished piece 1 at 3-JUN-12
piece handle=/nfs/rman/system01_07nc9rrv_1_1 tag=TAG20120530T163823 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:02:15
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 3-JUN-12
channel ORA_DISK_1: finished piece 1 at 3-JUN-12
piece handle=/nfs/rman/system01_08nc9s07_1_1 tag=TAG20120530T163823 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 3-JUN-12

查看nfs源端文件

[root@xifenfei rman]# pwd
/tmp/nfs/rman
[root@xifenfei rman]# ls -l
total 378300
-rw-r-----  1 54321 54321 379846656 Jun  3 13:45 system01_07nc9rrv_1_1
-rw-r-----  1 54321 54321   7143424 Jun  3 13:45 system01_08nc9s07_1_1

要点说明
Mount Options for Oracle files when used with NFS on NAS devices [ID 359515.1]

ORA-39126: 在 KUPW$WORKER.PUT_DDLS [TABLE_STATISTICS] 中 Worker 发生意外致命错误

使用impdp导入数据报如下错误导致导入终止

处理对象类型 SCHEMA_EXPORT/TABLE/TRIGGER
处理对象类型 SCHEMA_EXPORT/TABLE/INDEX/FUNCTIONAL_AND_BITMAP/INDEX
处理对象类型 SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
处理对象类型 SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
ORA-39126: 在 KUPW$WORKER.PUT_DDLS [TABLE_STATISTICS] 中 Worker 发生意外致命错误

ORA-06502: PL/SQL: 数字或值错误
LPX-00225: end-element tag "HIST_GRAM_LIST_ITEM" does not match start-element tag "EPVALUE"

ORA-06512: 在 "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: 在 "SYS.KUPW$WORKER", line 9001

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
26ABF4B0     20462  package body SYS.KUPW$WORKER
26ABF4B0      9028  package body SYS.KUPW$WORKER
26ABF4B0     16665  package body SYS.KUPW$WORKER
26ABF4B0      3956  package body SYS.KUPW$WORKER
26ABF4B0      9725  package body SYS.KUPW$WORKER
26ABF4B0      1775  package body SYS.KUPW$WORKER
290D454C         2  anonymous block

ORA-39097: 数据泵作业出现意外的错误 -1427
ORA-39065: DISPATCH 中出现意外的主进程异常错误
ORA-01427: 单行子查询返回多个行

作业 "EAS"."SYS_IMPORT_SCHEMA_01" 因致命错误于 15:21:20 停止

从这里可以看出是在执行TABLE_STATISTICS的时候因为EPVALUE列的数据类型和导入数据不匹配,问题发生上面错误,导致impdp job终止.

解决办法
参考文档:[ID 878626.1]
1.如果数据已经expdp导出,建议在导入的时候屏蔽掉统计信息导入EXCLUDE=STATISTICS,导入后使用DBMS_STATS 重新收集统计信息
2.如果数据尚未expdp导出,建议在导出的时候屏蔽掉统计信息导出EXCLUDE=STATISTICS导入后使用DBMS_STATS 重新收集统计信息

undo坏块导致数据库异常终止案例

在处理的众多undo问题的数据库中,这个是第一例遇到因为undo出现坏块导致数据库自动down案例(oracle 9.2.0.8 aix)
undo坏块导致数据库down

Fri Jun  1 00:45:13 2012
Successfully onlined Undo Tablespace 1.
Fri Jun  1 00:45:13 2012
SMON: enabling tx recovery
Fri Jun  1 00:45:13 2012
Database Characterset is ZHS16GBK
Fri Jun  1 00:45:13 2012
SMON: about to recover undo segment 52
SMON: about to recover undo segment 52
SMON: mark undo segment 52 as available
SMON: Restarting fast_start parallel rollback
SMON: about to recover undo segment 52
SMON: mark undo segment 52 as available
SMON: ignoring slave err,downgrading to serial rollback
SMON: about to recover undo segment 52
ORACLE Instance acc1 (pid = 9) - Error 1578 encountered while recovering transaction (52, 29).
Fri Jun  1 00:45:14 2012
Errors in file /oraacc/app/admin/acc/bdump/acc1_smon_734734.trc:
ORA-01578: ORACLE data block corrupted (file # 169, block # 55887)
ORA-01110: data file 169: '/dev/raccount07_01lv'
Fri Jun  1 00:45:15 2012
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: alter database open
Fri Jun  1 00:45:16 2012
Errors in file /oraacc/app/admin/acc/bdump/acc1_pmon_766940.trc:
ORA-00474: SMON process terminated with error
Fri Jun  1 00:45:16 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 766940

这里可以看出数据库因为回滚段52出现坏块导致回滚的时候smon终止,数据库down

检测坏块

$ dbv file='/dev/raccount07_01lv' blocksize=8192

DBVERIFY: Release 9.2.0.8.0 - Production on Fri Jun 1 01:25:10 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dev/raccount07_01lv

DBV-00200: Block, dba 708893135, already marked corrupted

DBV-00200: Block, dba 708893151, already marked corrupted

DBV-00200: Block, dba 708893263, already marked corrupted


DBVERIFY - Verification complete

Total Pages Examined         : 1048320
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1048320
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 0
Total Pages Marked Corrupt   : 3
Total Pages Influx           : 0
Highest block SCN            : 12843497215383 (2990.1545000343)

检测对应文件号/数据块号

SQL> select DBMS_UTILITY.data_block_address_file(708893135) file#,
  2  DBMS_UTILITY.data_block_address_block(708893135) block#  from dual;

     FILE#     BLOCK#
---------- ----------
       169      55759

SQL> select DBMS_UTILITY.data_block_address_file(708893151) file#,
  2  DBMS_UTILITY.data_block_address_block(708893151) block#  from dual;

     FILE#     BLOCK#
---------- ----------
       169      55775

SQL> select DBMS_UTILITY.data_block_address_file(708893263) file#,
  2  DBMS_UTILITY.data_block_address_block(708893263) block#  from dual;

     FILE#     BLOCK#
---------- ----------
       169      55887

解决办法

--隐含参数
_corrupted_rollback_segments= _SYSSMU52$

--open库后
alter session set "_smu_debug_mode"=4;
drop rollback segment "_SYSSMU52$";