使用bbed修复损坏datafile header

相关信息和准备工作

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for Linux: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') "www.orasos.com" from dual;

www.orasos.com
-------------------
2012-05-29 19:39:48

启动数据块异常

SQL> startup
ORACLE instance started.

Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes
Database mounted.
ORA-01110: data file 11: '/u01/oracle/oradata/xifenfei/bbed01.dbf'
ORA-01115: IO error reading block from file 11 (block # 1)
ORA-27072: skgfdisp: I/O error

bbed检测datafile header

[oracle@xifenfei ~]$ bbed password=blockedit blocksize=8192  listfile=/home/oracle/bbed.file mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Sat May 26 05:29:37 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /home/oracle/users01.dbf                                             0
     2  /home/oracle/system01.dbf.head                                       0
     3  /home/oracle/data11.ora                                              0
     4  /u01/oracle/oradata/xifenfei/system01.dbf                            0
     5  /u01/oracle/oradata/xifenfei/users01.dbf                             0
     6  /home/oracle/data11.ora.10                                           0
    11  /u01/oracle/oradata/xifenfei/bbed01.dbf                              0
    12  /u01/oracle/oradata/xifenfei/bbed02.dbf                              0

BBED> set file 11
        FILE#           11

BBED> set block 1
        BLOCK#          1

BBED> map
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                                     Dba:0x02c00001
------------------------------------------------------------
BBED-00400: invalid blocktype (00)

BBED> d
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:    0 to  511           Dba:0x02c00001
------------------------------------------------------------------------
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>
--header 记录全部为0,证明数据文件header坏掉

拷贝数据块
为了方便,拷贝同一个表空间的数据块

BBED> set file 12
        FILE#           12

BBED> set block 1
        BLOCK#          1

BBED> d count 16
 File: /u01/oracle/oradata/xifenfei/bbed02.dbf (12)
 Block: 1                Offsets:    0 to   15           Dba:0x03000001
------------------------------------------------------------------------
 0b020000 01000003 00000000 00000104 

 <32 bytes per line>

BBED> copy dba 0x03000001 to dba 0x02c00001
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:    0 to   15           Dba:0x02c00001
------------------------------------------------------------------------
 0b020000 01000003 00000000 00000104 

 <32 bytes per line>

BBED> show
        FILE#           11
        BLOCK#          1
        OFFSET          0
        DBA             0x02c00001 (46137345 11,1)
        FILENAME        /u01/oracle/oradata/xifenfei/bbed01.dbf
        BIFILE          bifile.bbd
        LISTFILE        /home/oracle/bbed.file
        BLOCKSIZE       8192
        MODE            Edit
        EDIT            Unrecoverable
        IBASE           Dec
        OBASE           Dec
        WIDTH           80
        COUNT           16
        LOGFILE         log.bbd
        SPOOL           No

BBED> map
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                                     Dba:0x02c00001
------------------------------------------------------------
 Data File Header

 struct kcvfh, 360 bytes                    @0       

 ub4 tailchk                                @8188    

修改数据块内容

BBED>  p kcvfh
struct kcvfh, 360 bytes                     @0       
   struct kcvfhbfh, 20 bytes                @0       
      ub1 type_kcbh                         @0        0x0b
      ub1 frmt_kcbh                         @1        0x02
      ub1 spare1_kcbh                       @2        0x00
      ub1 spare2_kcbh                       @3        0x00
      ub4 rdba_kcbh                         @4        0x03000001
      ub4 bas_kcbh                          @8        0x00000000
      ub2 wrp_kcbh                          @12       0x0000
      ub1 seq_kcbh                          @14       0x01
      ub1 flg_kcbh                          @15       0x04 (KCBHFCKV)
      ub2 chkval_kcbh                       @16       0xb10a
      ub2 spare3_kcbh                       @18       0x0000
   struct kcvfhhdr, 76 bytes                @20      
      ub4 kccfhswv                          @20       0x09200000
      ub4 kccfhcvn                          @24       0x08000000
      ub4 kccfhdbi                          @28       0x5314b4cd
      text kccfhdbn[0]                      @32      X
      text kccfhdbn[1]                      @33      I
      text kccfhdbn[2]                      @34      F
      text kccfhdbn[3]                      @35      E
      text kccfhdbn[4]                      @36      N
      text kccfhdbn[5]                      @37      F
      text kccfhdbn[6]                      @38      E
      text kccfhdbn[7]                      @39      I
      ub4 kccfhcsq                          @40       0x000001d8
      ub4 kccfhfsz                          @44       0x00001400
      s_blkz kccfhbsz                       @48       0x00
      ub2 kccfhfno                          @52       0x000c
      ub2 kccfhtyp                          @54       0x0003
      ub4 kccfhacid                         @56       0x00000000
      ub4 kccfhcks                          @60       0x00000000
      text kccfhtag[0]                      @64       
      text kccfhtag[1]                      @65       
      text kccfhtag[2]                      @66       
      text kccfhtag[3]                      @67       
      text kccfhtag[4]                      @68       
      text kccfhtag[5]                      @69       
      text kccfhtag[6]                      @70       
      text kccfhtag[7]                      @71       
      text kccfhtag[8]                      @72       
      text kccfhtag[9]                      @73       
      text kccfhtag[10]                     @74       
      text kccfhtag[11]                     @75       
      text kccfhtag[12]                     @76       
      text kccfhtag[13]                     @77       
      text kccfhtag[14]                     @78       
      text kccfhtag[15]                     @79       
      text kccfhtag[16]                     @80       
      text kccfhtag[17]                     @81       
      text kccfhtag[18]                     @82       
      text kccfhtag[19]                     @83       
      text kccfhtag[20]                     @84       
      text kccfhtag[21]                     @85       
      text kccfhtag[22]                     @86       
      text kccfhtag[23]                     @87       
      text kccfhtag[24]                     @88       
      text kccfhtag[25]                     @89       
      text kccfhtag[26]                     @90       
      text kccfhtag[27]                     @91       
      text kccfhtag[28]                     @92       
      text kccfhtag[29]                     @93       
      text kccfhtag[30]                     @94       
      text kccfhtag[31]                     @95       
   ub4 kcvfhrdb                             @96       0x00000000
   struct kcvfhcrs, 8 bytes                 @100     
      ub4 kscnbas                           @100      0xc00a3405
      ub2 kscnwrp                           @104      0x0b2c
   ub4 kcvfhcrt                             @108      0x2ebeb8c3
   ub4 kcvfhrlc                             @112      0x2e51408f
   struct kcvfhrls, 8 bytes                 @116     
      ub4 kscnbas                           @116      0x0002ab78
      ub2 kscnwrp                           @120      0x0000
   ub4 kcvfhbti                             @124      0x00000000
   struct kcvfhbsc, 8 bytes                 @128     
      ub4 kscnbas                           @128      0x00000000
      ub2 kscnwrp                           @132      0x0000
   ub2 kcvfhbth                             @136      0x0000
   ub2 kcvfhsta                             @138      0x0000 (NONE)
   struct kcvfhckp, 36 bytes                @140     
      struct kcvcpscn, 8 bytes              @140     
         ub4 kscnbas                        @140      0xc00b6467
         ub2 kscnwrp                        @144      0x0b2c
      ub4 kcvcptim                          @148      0x2ebf0c07
      ub2 kcvcpthr                          @152      0x0001
      union u, 12 bytes                     @156     
         struct kcvcprba, 12 bytes          @156     
            ub4 kcrbaseq                    @156      0x00000015
            ub4 kcrbabno                    @160      0x0000429a
            ub2 kcrbabof                    @164      0x0010
         struct kcvcptr, 12 bytes           @156     
            struct kcrtrscn, 8 bytes        @156     
               ub4 kscnbas                  @156      0x00000015
               ub2 kscnwrp                  @160      0x429a
            ub4 kcrtrtim                    @164      0x09110010
      ub1 kcvcpetb[0]                       @168      0x02
      ub1 kcvcpetb[1]                       @169      0x00
      ub1 kcvcpetb[2]                       @170      0x00
      ub1 kcvcpetb[3]                       @171      0x00
      ub1 kcvcpetb[4]                       @172      0x00
      ub1 kcvcpetb[5]                       @173      0x00
      ub1 kcvcpetb[6]                       @174      0x00
      ub1 kcvcpetb[7]                       @175      0x00
   ub4 kcvfhcpc                             @176      0x0000000d
   ub4 kcvfhrts                             @180      0x2ebeea4f
   ub4 kcvfhccc                             @184      0x0000000c
   struct kcvfhbcp, 36 bytes                @188     
      struct kcvcpscn, 8 bytes              @188     
         ub4 kscnbas                        @188      0x00000000
         ub2 kscnwrp                        @192      0x0000
      ub4 kcvcptim                          @196      0x00000000
      ub2 kcvcpthr                          @200      0x0000
      union u, 12 bytes                     @204     
         struct kcvcprba, 12 bytes          @204     
            ub4 kcrbaseq                    @204      0x00000000
            ub4 kcrbabno                    @208      0x00000000
            ub2 kcrbabof                    @212      0x0000
         struct kcvcptr, 12 bytes           @204     
            struct kcrtrscn, 8 bytes        @204     
               ub4 kscnbas                  @204      0x00000000
               ub2 kscnwrp                  @208      0x0000
            ub4 kcrtrtim                    @212      0x00000000
      ub1 kcvcpetb[0]                       @216      0x00
      ub1 kcvcpetb[1]                       @217      0x00
      ub1 kcvcpetb[2]                       @218      0x00
      ub1 kcvcpetb[3]                       @219      0x00
      ub1 kcvcpetb[4]                       @220      0x00
      ub1 kcvcpetb[5]                       @221      0x00
      ub1 kcvcpetb[6]                       @222      0x00
      ub1 kcvcpetb[7]                       @223      0x00
   ub4 kcvfhbhz                             @224      0x00000000
   struct kcvfhxcd, 16 bytes                @228     
      ub4 space_kcvmxcd[0]                  @228      0x00000000
      ub4 space_kcvmxcd[1]                  @232      0x00000000
      ub4 space_kcvmxcd[2]                  @236      0x00000000
      ub4 space_kcvmxcd[3]                  @240      0x00000000
   word kcvfhtsn                            @244      12
   ub2 kcvfhtln                             @248      0x0004
   text kcvfhtnm[0]                         @250     B
   text kcvfhtnm[1]                         @251     B
   text kcvfhtnm[2]                         @252     E
   text kcvfhtnm[3]                         @253     D
   text kcvfhtnm[4]                         @254      
   text kcvfhtnm[5]                         @255      
   text kcvfhtnm[6]                         @256      
   text kcvfhtnm[7]                         @257      
   text kcvfhtnm[8]                         @258      
   text kcvfhtnm[9]                         @259      
   text kcvfhtnm[10]                        @260      
   text kcvfhtnm[11]                        @261      
   text kcvfhtnm[12]                        @262      
   text kcvfhtnm[13]                        @263      
   text kcvfhtnm[14]                        @264      
   text kcvfhtnm[15]                        @265      
   text kcvfhtnm[16]                        @266      
   text kcvfhtnm[17]                        @267      
   text kcvfhtnm[18]                        @268      
   text kcvfhtnm[19]                        @269      
   text kcvfhtnm[20]                        @270      
   text kcvfhtnm[21]                        @271      
   text kcvfhtnm[22]                        @272      
   text kcvfhtnm[23]                        @273      
   text kcvfhtnm[24]                        @274      
   text kcvfhtnm[25]                        @275      
   text kcvfhtnm[26]                        @276      
   text kcvfhtnm[27]                        @277      
   text kcvfhtnm[28]                        @278      
   text kcvfhtnm[29]                        @279      
   ub4 kcvfhrfn                             @280      0x0000000c
   struct kcvfhrfs, 8 bytes                 @284     
      ub4 kscnbas                           @284      0x00000000
      ub2 kscnwrp                           @288      0x0000
   ub4 kcvfhrft                             @292      0x2ebee9f9
   struct kcvfhafs, 8 bytes                 @296     
      ub4 kscnbas                           @296      0x00000000
      ub2 kscnwrp                           @300      0x0000
   ub4 kcvfhbbc                             @304      0x00000000
   ub4 kcvfhncb                             @308      0x00000000
   ub4 kcvfhmcb                             @312      0x00000000
   ub4 kcvfhlcb                             @316      0x00000000
   ub4 kcvfhbcs                             @320      0x00000000
   ub2 kcvfhofb                             @324      0x0000
   ub2 kcvfhnfb                             @326      0x0000
   ub4 kcvfhprc                             @328      0x00000000
   struct kcvfhprs, 8 bytes                 @332     
      ub4 kscnbas                           @332      0x00000000
      ub2 kscnwrp                           @336      0x0000
   struct kcvfhprfs, 8 bytes                @340     
      ub4 kscnbas                           @340      0x00000000
      ub2 kscnwrp                           @344      0x0000
   ub4 kcvfhtrt                             @356      0x00000000

/*需要修改内容
ub4 rdba_kcbh                         @4        0x03000001
ub4 kccfhfsz                          @44       0x00001400
ub2 kccfhfno                          @52       0x000c
struct kcvfhcrs, 8 bytes                 @100     
      ub4 kscnbas                           @100      0xc00a3405
      ub2 kscnwrp                           @104      0x0b2c
ub4 kcvfhrfn                             @280      0x0000000c
/
/*修改值(通过错误提示结合file$表)
rdba_kcbh  02c00001
kccfhfsz  00000500
kccfhfno 000b
kscnbas c00a32b8
kcvfhrfn 0000000b
/

BBED> set offset 4
        OFFSET          4

BBED> m /x 0100c002
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:    4 to   19           Dba:0x02c00001
------------------------------------------------------------------------
 0100c002 00000000 00000104 0ab10000 

 <32 bytes per line>

BBED> set offset 44
        OFFSET          44

BBED> m /x 00050000
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:   44 to   59           Dba:0x02c00001
------------------------------------------------------------------------
 00050000 00200000 0c000300 00000000 

 <32 bytes per line>

BBED> set offset 52
        OFFSET          52

BBED> m /x 
BBED-00203: incomplete/malformed command


BBED> m /x 0b00                                                                                                        
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:   52 to   67           Dba:0x02c00001
------------------------------------------------------------------------
 0b000300 00000000 00000000 00000000 

 <32 bytes per line>

BBED> set offset 100
        OFFSET          100

BBED> m /x b8320ac0
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:  100 to  115           Dba:0x02c00001
------------------------------------------------------------------------
 b8320ac0 2c0b0000 c3b8be2e 8f40512e 

 <32 bytes per line>

BBED> set offset 280
        OFFSET          280

BBED> m /x 0b000000
 File: /u01/oracle/oradata/xifenfei/bbed01.dbf (11)
 Block: 1                Offsets:  280 to  295           Dba:0x02c00001
------------------------------------------------------------------------
 0b000000 00000000 00000000 f9e9be2e 

 <32 bytes per line>

BBED> sum apply
Check value for File 11, Block 1:
current = 0xa777, required = 0xa777

重建控制文件open数据库

SQL> alter database backup controlfile to trace as '/tmp/t_xifenfie.ctl';

Database altered.

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.

SQL> startup nomount
ORACLE instance started.

Total System Global Area  353441008 bytes
Fixed Size                   451824 bytes
Variable Size             184549376 bytes
Database Buffers          167772160 bytes
Redo Buffers                 667648 bytes

SQL> CREATE CONTROLFILE REUSE DATABASE "XIFENFEI" NORESETLOGS  ARCHIVELOG
  2      MAXLOGFILES 50
  3      MAXLOGMEMBERS 5
  4      MAXDATAFILES 100
  5      MAXINSTANCES 1
  6      MAXLOGHISTORY 226
  7  LOGFILE
  8    GROUP 1 '/u01/oracle/oradata/xifenfei/redo01.log'  SIZE 100M,
  9    GROUP 2 '/u01/oracle/oradata/xifenfei/redo02.log'  SIZE 100M,
 10    GROUP 3 '/u01/oracle/oradata/xifenfei/redo03.log'  SIZE 100M
 11  DATAFILE
 12    '/u01/oracle/oradata/xifenfei/system01.dbf',
 13    '/u01/oracle/oradata/xifenfei/undotbs01.dbf',
 14    '/u01/oracle/oradata/xifenfei/cwmlite01.dbf',
 15    '/u01/oracle/oradata/xifenfei/drsys01.dbf',
 16    '/u01/oracle/oradata/xifenfei/example01.dbf',
 17    '/u01/oracle/oradata/xifenfei/indx01.dbf',
 18    '/u01/oracle/oradata/xifenfei/odm01.dbf',
 19    '/u01/oracle/oradata/xifenfei/tools01.dbf',
 20    '/u01/oracle/oradata/xifenfei/users01.dbf',
 21    '/u01/oracle/oradata/xifenfei/xdb01.dbf',
 22    '/u01/oracle/oradata/xifenfei/bbed01.dbf',
 23    '/u01/oracle/oradata/xifenfei/bbed02.dbf'
 24  CHARACTER SET ZHS16GBK
 25  ;

Control file created.

SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required


SQL> alter database open;

Database altered.

至此通过拷贝相同表空间的datafile header修复损坏的datafile header

使用exp/imp 导入11g数据到9i

现在有个需求,需要使用exp/imp导入11g的数据库数据到9i中,解决这个问题一般来说想到三种方法思路,一个个尝试(其实从高版本服务端支持低版本客户端的原则,可以大概的猜测出使用9i的客户端处理该问题)
方法1:导出导入都使用11g客户端

--11g客户端导出
[oracle@xifenfei ~]$ exp chf/xifenfei file=/tmp/t_xifenfei.dmp 
>log=/tmp/t_xifenfei.log tables=chf.t_xifenfei

Export: Release 11.2.0.3.0 - Production on Fri May 18 18:15:18 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI          2 rows exported
Export terminated successfully without warnings.

--11g客户端导入
[oracle@xifenfei ~]$ imp chf/xifenfei@ora9i file=/tmp/t_xifenfei_11g.dmp 
>log=/tmp/t_xifenfei.log tables=chf.t_xifenfei

Import: Release 11.2.0.3.0 - Production on Fri May 18 18:17:24 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


IMP-00058: ORACLE error 6550 encountered
ORA-06550: line 1, column 33:
PLS-00302: component 'SET_NO_OUTLINES' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
IMP-00000: Import terminated unsuccessfully

这个错误是版本不兼容导致:PLS-00302: component ‘SET_NO_OUTLINES’ must be declared

方法2:11g客户端导出,9i客户端导入

--11g客户端导出
[oracle@xifenfei ~]$ exp chf/xifenfei file=/tmp/t_xifenfei.dmp 
>log=/tmp/t_xifenfei.log tables=chf.t_xifenfei

Export: Release 11.2.0.3.0 - Production on Fri May 18 18:15:18 2012

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI          2 rows exported
Export terminated successfully without warnings.

--传输到9i
[oracle@xifenfei tmp]$ scp t_xifenfei.dmp 192.168.1.10:/tmp/
The authenticity of host '192.168.1.10 (192.168.1.10)' can't be established.
RSA key fingerprint is 3d:0c:d1:4b:45:bd:a3:f5:25:eb:4d:52:d2:32:03:69.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.10' (RSA) to the list of known hosts.
oracle@192.168.1.10's password: 
t_xifenfei.dmp                          100%   56KB  56.0KB/s   00:00    

--9i客户端导入
[oracle@xifenfei ~]$ imp chf/xifenfei file=/tmp/t_xifenfei.dmp tables=t_xifenfei

Import: Release 9.2.0.4.0 - Production on Thu May 24 23:32:18 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

IMP-00010: not a valid export file, header failed verification
IMP-00000: Import terminated unsuccessfully
--版本不兼容(高版本的dump文件低版本不能识别)

方法3:9i客户端导出,9i客户端导入

--9i客户端导出
[oracle@xifenfei ~]$ exp chf/xifenfei@ora11g file=/tmp/t_xifenfei_11g.dmp 
>log=/tmp/t_xifenfei.log tables=chf.t_xifenfei

Export: Release 9.2.0.4.0 - Production on Thu May 24 23:37:20 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI          2 rows exported
Export terminated successfully without warnings.

--9i客户端导入
[oracle@xifenfei log]$ imp chf/xifenfei file=/tmp/t_xifenfei_11g.dmp log=/tmp/xifenfei.log full=y

Import: Release 9.2.0.4.0 - Production on Fri May 25 03:22:14 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

Export file created by EXPORT:V09.02.00 via conventional path
import done in ZHS16GBK character set and AL16UTF16 NCHAR character setSegmentation fault
--导入数据遇到setSegmentation fault异常终止

解决setSegmentation fault异常终止

--修改exu9defpswitches视图
[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Fri May 18 22:29:00 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> CREATE OR REPLACE VIEW exu9defpswitches ( 
  2                  compflgs, nlslensem ) AS 
  3          SELECT  a.value, b.value 
  4          FROM    sys.v$parameter a, sys.v$parameter b 
  5          WHERE   a.name = 'plsql_code_type' AND 
  6                  b.name = 'nls_length_semantics' ;

View created.

--9i导出11g数据
[oracle@xifenfei tmp]$ exp chf/xifenfei@ora11g file=/tmp/t_xifenfei_11g.dmp 
>log=/tmp/xifenfei.log tables=t_xifenfei

Export: Release 9.2.0.4.0 - Production on Fri May 25 04:08:32 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in ZHS16GBK character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table                     T_XIFENFEI          2 rows exported
Export terminated successfully without warnings.

--9i导入数据
[oracle@xifenfei tmp]$ imp chf/xifenfei file=/tmp/t_xifenfei_11g.dmp 
>log=/tmp/xifenfei.log tables=t_xifenfei

Import: Release 9.2.0.4.0 - Production on Fri May 25 04:08:53 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production

Export file created by EXPORT:V09.02.00 via conventional path
import done in ZHS16GBK character set and AL16UTF16 NCHAR character set
. importing CHF's objects into CHF
. . importing table                   "T_XIFENFEI"          2 rows imported
Import terminated successfully without warnings.
--至此导入成功,完成了11gr2数据导入到9ir2中

通过一系列的实验证明,需要把11g的数据导入到9i中,需要使用9i的客户端进行,其中exu9defpswitches视图需要重建,否则会出现setSegmentation fault异常,导致导入失败.

rac redo log file被意外覆盖数据库恢复

当前日志被覆盖导致错误
朋友的一客户在一套rac上包含了两个数据库,其其中一个库增加redo group时候,覆盖了另外一个库的redo,悲剧的是刚好是current redo

Wed May 16 17:03:05 2012
ALTER DATABASE OPEN
This instance was first to open
Wed May 16 17:03:09 2012
Beginning crash recovery of 2 threads
 parallel recovery started with 15 processes
Wed May 16 17:03:11 2012
Started redo scan
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
Abort recovery for domain 0
Wed May 16 17:03:11 2012
Aborting crash recovery due to error 305
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
ORA-305 signalled during: ALTER DATABASE OPEN...
Wed May 16 17:03:13 2012
Shutting down instance (abort)

使用_allow_resetlogs_corruption= TRUE进行恢复

Wed May 16 18:16:48 2012
SMON: enabling cache recovery
Wed May 16 18:16:48 2012
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Wed May 16 18:16:48 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], 
[2522], [18446744072024247666], [], [], []
Wed May 16 18:16:50 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2105454.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [2522], [18446744072024280773], 
[2522], [18446744072024247666], [], [], []
Wed May 16 18:16:50 2012
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 2105454
ORA-1092 signalled during: alter database open resetlogs...

ORA-600[KCLCHKBLK_4], is signaled because the SCN in a tempfile block is too high.
The same reason caused the ORA-600[2662]s in the alert logs.
因为是临时文件的scn太大的问题,那就比较好解决:
启动数据库到mount状态,查询出来相关temp file,然后drop掉.

ORA-00600[6856]

Wed May 16 20:25:16 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []
ORACLE Instance odsdb1 (pid = 16) - Error 600 encountered while recovering transaction 
(10, 8) on object 7162533.
Wed May 16 20:25:16 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2482210.trc:
ORA-00600: internal error code, arguments: [6856], [0], [0], [], [], [], [], []

这里的4193和4194是比较熟悉的,根据这里的提示猜测6856也是和undo有关系
ORA-600[6856]SMON is trying to recover a dead transaction.
But the undo application runs into an internal error (trying to delete a row that is already deleted).
因为smon回滚的时候出现上面错误,解决方法是想办法终止回滚,使用event=”10513 trace name context forever, level 2″.

ORA-00600[4193]/ORA-00600[4194]

Wed May 16 20:25:17 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], []
Wed May 16 20:25:18 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/dev/rods_redo2_1_1'
ORA-00600: internal error code, arguments: [4194], [22], [25], [], [], [], [], []
Wed May 16 20:25:56 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2547936.trc:
ORA-00600: internal error code, arguments: [4193], [22248], [22252], [], [], [], [], []

太常见错误,不再做说明,虽然使用event是的库open成功,因为部分回滚段有问题,该错误还是会出现(还是喜欢直接屏蔽回滚段)

ORA-00600[ktpridestroy2]

Wed May 16 20:36:26 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_smon_2101296.trc:
ORA-00600: internal error code, arguments: [ktpridestroy2], [], [], [], [], [], [], []

This error could be the result of a corruption and involves the parallel rollback that SMON enables each startup.
解决:fast_start_parallel_rollback=false

ORA-00600[kturacf1]/ORA-00600[kcbgcur_9]

Wed May 16 20:49:15 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_2007088.trc:
ORA-00600: internal error code, arguments: [kturacf1], [2097152], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcbgcur_9], [780140563], [4], [4294959056], [2097152], [], [], []

ORA-00600[kturacf1]错误未查询到原因
ORA-00600[kcbgcur_9]错误原因可能是:Buffers are pinned in a specific class order to prevent internal deadlocks.
因为这两个错误是job产生非致命错误,在这次的处理过程中可以忽略

ORA-00600[4097]

Wed May 16 21:05:05 2012
Errors in file /oracle/admin/odsdb/bdump/odsdb1_j000_1716282.trc:
ORA-12012: error on auto execute of job 6603
ORA-20001: ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
ORA-06512: at "EPBI.UP_SYSLOG_ONLINE_USER", line 141
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []

When an instance has a rollback segment offline and the instance crashes, or
the user does a shutdown abort, the rollback segment wrap number does not get
updated. If that segment is then dropped and recreated immediately after the
instance is restarted, the wrap number could be lower than existing wrap
numbers. This will cause the ORA-600[4097] to occur in subsequent
transactions using Rollback.
这个错误也是因为回滚段wrap number未被及时更新导致的异常.

总结这次恢复过程
1.因当前redo丢失,使用隐含参数打开库,然后出现ORA-600[KCLCHKBLK_4](这个错误比较少见,更何况这个库是10.2.0.4)
2.undo出现问题出现ORA-00600[6856]错误不是很多见
3.接下来都是一些列undo导致的问题,其实如果开始就直接使用隐含参数删除掉有问题回滚段,效果可能会比event好.
4.因为部分trace文件没有拿到,未做深入分析,文章列出整体恢复思路
5.本次恢复的数据库版本是10.2.0.4,根据mos描述,很多错误应不会发生,但是实际还是发生了,MOS的版本范围,不要太看重.
6.其实这篇文章的本质不是展示恢复过程,而是再一次提醒:操作数据库慎重,特别是一台机器上多套库.

ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复

open数据库alert日志报ORA-00600[kcratr1_lostwrt]错误

Mon May 14 14:57:28 2012
ALTER DATABASE OPEN
Mon May 14 14:57:29 2012
Beginning crash recovery of 1 threads
Mon May 14 14:57:29 2012
Started redo scan
Mon May 14 14:57:29 2012
Errors in file d:\oracle\admin\cqgasold\udump\cqgasold_ora_504.trc:
ORA-00600: 内部错误代码,参数: [kcratr1_lostwrt], [], [], [], [], [], [], []

ORA-600 signalled during: alter database open...

查询相关SCN
同一个查询中SCN相同,省略

SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
  2  To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;

未选定行

SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
  2  to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;

     FILE# SCN              STOP_SCN                                            
---------- ---------------- ----------------                                    
         1       7842987188                                                     
         2       7842987188                                                     
         3       7842987188                                                     

SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
  2  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
  3  from v$datafile_header;

     FILE# SCN               RESETLOGS SCN                                      
---------- ----------------- -----------------                                  
         1        7842991811                 1                                  
         2        7842991811                 1                                  
         3        7842991811                 1                                  

这里看到奇怪现象datafile scn小于datafile_header scn,数据库异常断电一般来说也不会出现这样的情况,个人猜测是错误的恢复或者使用历史控制文件导致,对于这样的现状,我先尝试着使用using backup controlfile方式恢复,结果失败.估计控制文件有异常,本着先拉起库原则,重建控制文件.

进行完全恢复

SQL> recover database;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码,参数: [3020], [8388617], [1], [23403], [25], [112],[], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 9)
ORA-10564: tablespace UNDOTBS1
ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF'
ORA-10560: block type 'KTU SMU HEADER BLOCK'

尝试跳过坏块继续恢复

SQL> recover database allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码,参数: [3020], [8388610], [1], [23403], [2264], [16],[], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 2)
ORA-10564: tablespace UNDOTBS1
ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF'
ORA-10560: block type 'KTFB Bitmapped File Space Header'

使用dbv检查坏块数量

C:\>dbv file='d:\oracle\oradata\cqgasold\undo_1.dbf' blocksize=8192

DBVERIFY: Release 9.2.0.5.0 - Production on 星期二 5月 15 19:43:42 2012

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

DBVERIFY - 验证正在开始 : FILE = d:\oracle\oradata\cqgasold\undo_1.dbf

DBV-00200: 块, dba 8388617, 已经标记为崩溃
汇入的页369 - 可能是介质损坏
***
Corrupt block relative dba: 0x00800171 (file 2, block 369)
Fractured block found during dbv:
Data in bad block -
 type: 2 format: 2 rdba: 0x00800171
 last change scn: 0x0001.d37c57db seq: 0x1 flg: 0x04
 consistency value in tail: 0x4e930260
 check value in block header: 0x8202, computed block checksum: 0x4e5f
 spare1: 0x0, spare2: 0x0, spare3: 0x0
***

汇入的页417 - 可能是介质损坏
***
Corrupt block relative dba: 0x008001a1 (file 2, block 417)
Fractured block found during dbv:
Data in bad block -
 type: 2 format: 2 rdba: 0x008001a1
 last change scn: 0x0001.d37c53d4 seq: 0x2 flg: 0x04
 consistency value in tail: 0x4b6b0201
 check value in block header: 0x6ae7, computed block checksum: 0x5abc
 spare1: 0x0, spare2: 0x0, spare3: 0x0
***
…………
--类此记录很多,我放弃了跳过坏块修复的方法

恢复过程中提示坏块数据库文件离线恢复

SQL> alter database datafile 'd:\oracle\oradata\cqgasold\undo_1.dbf' offline;

数据库已更改。

SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
ERROR 位于第 1 行:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00376: 此时无法读取文件 2
ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF'

到了这一步,根据经验,数据库被open的可能性很多了,很有可能是open以后因为smon回滚导致数据库down

查看日志,屏蔽回滚段,完成恢复

Tue May 15 19:59:52 2012
alter database open
Tue May 15 19:59:52 2012
Beginning crash recovery of 1 threads
Tue May 15 19:59:52 2012
Started redo scan
Tue May 15 19:59:52 2012
Completed redo scan
 323 redo blocks read, 82 data blocks need recovery
Tue May 15 19:59:52 2012
Started recovery at
 Thread 1: logseq 23404, block 3, scn 0.0
Recovery of Online Redo Log: Thread 1 Group 4 Seq 23404 Reading mem 0
  Mem# 0 errs 0: F:\ORACLE\ORADATA\LOGCQGASOLD4.ORA
Tue May 15 19:59:52 2012
Completed redo application
Tue May 15 19:59:52 2012
Ended recovery at
 Thread 1: logseq 23404, block 326, scn 1.3548264979
 82 data blocks read, 82 data blocks written, 323 redo blocks read
Crash recovery completed successfully
Tue May 15 19:59:53 2012
Thread 1 advanced to log sequence 23405
Thread 1 opened at log sequence 23405
  Current log# 2 seq# 23405 mem# 0: D:\ORACLE\ORADATA\CQGASOLD\REDO02.LOG
Successful open of redo thread 1
Tue May 15 19:59:53 2012
SMON: enabling cache recovery
SMON: enabling tx recovery
Tue May 15 19:59:54 2012
Database Characterset is ZHS16GBK
Tue May 15 19:59:55 2012
replication_dependency_tracking turned off (no async multimaster replication found)
ORA-604 signalled during: alter database open...
Tue May 15 19:59:56 2012
SMON: about to recover undo segment 1
SMON: mark undo segment 1 as needs recovery
SMON: about to recover undo segment 2
SMON: mark undo segment 2 as needs recovery
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
SMON: about to recover undo segment 4
SMON: mark undo segment 4 as needs recovery
SMON: about to recover undo segment 5
SMON: mark undo segment 5 as needs recovery
SMON: about to recover undo segment 6
SMON: mark undo segment 6 as needs recovery
SMON: about to recover undo segment 7
SMON: mark undo segment 7 as needs recovery
SMON: about to recover undo segment 8
SMON: mark undo segment 8 as needs recovery
SMON: about to recover undo segment 9
SMON: mark undo segment 9 as needs recovery
SMON: about to recover undo segment 10
SMON: mark undo segment 10 as needs recovery
Tue May 15 20:00:37 2012
Shutting down instance (abort)

看到这里,可以大概确定是因为undo文件离线,导致回滚段异常.
这个问题,基本上可以确定通过隐含参数屏蔽回滚段,然后open数据库,重建undo删除异常undo,数据库恢复完成。

记录一次rman备份ORA-19502/ORA-27063错误原因分析

rman备份出现ORA-19502/ORA-27063错误

RMAN> 2> 3> 4> 5> 6> 7> 8> 9> 10> 11> 12> 13> 14> 15> 16> 17> 18> 19> 20> 
allocated channel: t11
channel t11: sid=824 instance=ncdb1 devtype=DISK

allocated channel: t12
channel t12: sid=838 instance=ncdb1 devtype=DISK

allocated channel: t13
channel t13: sid=809 instance=ncdb1 devtype=DISK

allocated channel: t14
channel t14: sid=886 instance=ncdb1 devtype=DISK

allocated channel: t15
channel t15: sid=620 instance=ncdb1 devtype=DISK

allocated channel: t16
channel t16: sid=599 instance=ncdb1 devtype=DISK

allocated channel: t17
channel t17: sid=482 instance=ncdb1 devtype=DISK

allocated channel: t18
channel t18: sid=506 instance=ncdb1 devtype=DISK
一共开通8个通道

channel t12: starting full datafile backupset
channel t12: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t12: starting piece 1 at 14-MAY-12
--通道12备份数据文件

channel t17: starting full datafile backupset
channel t17: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t17: starting piece 1 at 14-MAY-12
--通道17备份数据文件

channel t15: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mpnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t15: backup set complete, elapsed time: 06:07:59
channel t11: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mlnb04jk_1_1 tag=TAG20120514T204954 comment=NONE
channel t11: backup set complete, elapsed time: 06:17:25
channel t16: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mqnb04jm_1_1 tag=TAG20120514T204954 comment=NONE
channel t16: backup set complete, elapsed time: 06:34:49
channel t14: finished piece 1 at 15-MAY-12
piece handle=/rman/db_monb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t14: backup set complete, elapsed time: 06:40:05
channel t18: finished piece 1 at 15-MAY-12
piece handle=/rman/db_msnb04jn_1_1 tag=TAG20120514T204954 comment=NONE
channel t18: backup set complete, elapsed time: 06:43:38
channel t13: finished piece 1 at 15-MAY-12
piece handle=/rman/db_mnnb04jl_1_1 tag=TAG20120514T204954 comment=NONE
channel t13: backup set complete, elapsed time: 07:40:56
--这里可以看出rman的备份完成了通道11/13/14/15/16/18,也就是说目前为止通道12/17未完成.

RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel
--通道12报错(硬盘空间不足)

channel t11: starting full datafile backupset
channel t11: specifying datafile(s) in backupset
input datafile fno=00008 name=/dev/rnc32g_39
input datafile fno=00016 name=/dev/rnc32g_47
input datafile fno=00024 name=/dev/rnc32g_57
input datafile fno=00032 name=/dev/rnc32g_25
input datafile fno=00040 name=/dev/rnc32g_33
input datafile fno=00048 name=/dev/rnc32g_3
input datafile fno=00056 name=/dev/rnc32g_11
input datafile fno=00064 name=/dev/rnc32g_19
input datafile fno=00072 name=/dev/rnc32g_67
input datafile fno=00080 name=/dev/rnc32g_106
input datafile fno=00088 name=/dev/rnc32g_114
input datafile fno=00096 name=/dev/rnc32g_87
input datafile fno=00104 name=/dev/rnc32g_95
input datafile fno=00112 name=/dev/rnc32g_103
input datafile fno=00120 name=/dev/rnc32g_75
input datafile fno=00003 name=/dev/rnc50_sysaux
input datafile fno=00130 name=/dev/rnc32g_119
channel t11: starting piece 1 at 15-MAY-12
--在通道12报错后,通道11已经完成了上次备份,所以启动备份通道12出错的数据文件

RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel
--通道17也因为磁盘空间报错

channel t13: starting full datafile backupset
channel t13: specifying datafile(s) in backupset
input datafile fno=00002 name=/dev/rnc32g_22
input datafile fno=00013 name=/dev/rnc32g_44
input datafile fno=00021 name=/dev/rnc32g_54
input datafile fno=00029 name=/dev/rnc32g_62
input datafile fno=00037 name=/dev/rnc32g_30
input datafile fno=00045 name=/dev/rnc32g_38
input datafile fno=00053 name=/dev/rnc32g_8
input datafile fno=00061 name=/dev/rnc32g_16
input datafile fno=00069 name=/dev/rnc32g_64
input datafile fno=00077 name=/dev/rncundo_33g_4
input datafile fno=00085 name=/dev/rnc32g_111
input datafile fno=00093 name=/dev/rnc32g_84
input datafile fno=00101 name=/dev/rnc32g_92
input datafile fno=00109 name=/dev/rnc32g_100
input datafile fno=00117 name=/dev/rnc32g_72
input datafile fno=00006 name=/dev/rnc50_4g_1
channel t13: starting piece 1 at 15-MAY-12
--通道13也尝试备份通道17失败的数据文件

RMAN-03009: failure of backup command on t11 channel at 05/15/2012 04:39:59
ORA-19504: failed to create file "/rman/db_mtnb104u_1_1"
ORA-27044: unable to write the header block of file
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: 3
Addition
--因为当前没有空闲空间,通道11终止,
--这个时候rman异常终止,导致后续的通道13终止记录未打印到日志

阅读完rman日志,很好理解因为存放rman备份的磁盘空间不足导致了一系列错误

检查磁盘剩余空间

Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4          10.00      9.75    3%     6548     1% /
/dev/hd2          10.00      4.55   55%    84383     8% /usr
/dev/hd9var        5.00      4.04   20%     6290     1% /var
/dev/hd3           5.00      3.87   23%     1551     1% /tmp
/dev/hd1          10.00      9.91    1%      382     1% /home
/proc                 -         -    -         -     -  /proc
/dev/hd10opt       5.00      4.89    3%     3502     1% /opt
/dev/archalv      99.00     82.98   17%       96     1% /archa
/dev/fslv01       40.00     19.49   52%    72324     2% /ora10
/dev/fslv00     1800.00    467.25   75%       10     1% /rman

这下让人迷糊了,磁盘空间还剩余467.25G,怎么会报错呢?

分析原因

RMAN-03009: failure of backup command on t12 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30481025 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mmnb04jl_1_1", blockno 30480897 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t12 disabled, job failed on it will be run on another channel

RMAN-03009: failure of backup command on t17 channel at 05/15/2012 04:39:58
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753793 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/rman/db_mrnb04jm_1_1", blockno 30753665 (blocksize=8192)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
channel t17 disabled, job failed on it will be run on another channel

这两个通道在写入rman备份到磁盘中的时候,在05/15/2012 04:39:58发现磁盘空间不足,两个通道分别准备写入30480897/30753665号块的时候出错,那么当时这两个通道分别写入的数据块数为30480896/30753664,写入文件大小为(30480896+30753664)*8192/1024/1024/1024=467.1826171875G.这里可以看出磁盘剩余空间467.25G,其实当时已经写入了467.1826171875G,继续写入的时候出错.然后rman为了保证备份的正确性,自动删除了当时已经备份的467.1826171875G错误的备份文件.从而在备份结束后看到磁盘空间还有大量剩余而rman包空间不足的现象.