记录因磁盘头被重写,抢救redo恢复经历

客户使用赛门铁克做同城异地容灾部署extent rac,因某种情况导致主备容灾不同步,然后在主库中进行了若干操作,导致主库所有裸设备丢失,然后进行了一些列的操作,主库识别了裸设备,但是oracle出现异常
数据库裸设备异常

Fri May 31 22:07:39 2013
ORA-00202: control file: '/dev/rcontrol2'
ORA-27047: unable to read the header block of file
Additional information: 2
ORA-205 signalled during: ALTER DATABASE   MOUNT...

使用备份还原控制文件后,查询数据文件头v$datafile_header.error全部为”WRONG FILE TYPE”,使用bbed去查看,10个block以内全部是0,证明数据库文件头也损坏。因为客户的数据库虽然有rman备份,但涉及到memory,对redo的信息也很敏感,所以希望能够在他们当前的情况下评估redo是否可以应用,确保他们的数据不丢失.验证文件头

DATA FILE #1: 
  (name #4) /dev/rsystem
creation size=128000 block size=8192 status=0xe head=4 tail=4 dup=1
 tablespace 0, index=1 krfil=1 prev_file=0
 unrecoverable scn: 0x0000.00000000 01/01/1988 00:00:00
 Checkpoint cnt:22469 scn: 0x0000.7b4f9d86 05/29/2013 22:09:50
 Stop scn: 0xffff.ffffffff 05/15/2013 00:08:31
 Creation Checkpointed at scn:  0x0000.00000009 05/20/2007 21:52:41
 thread:1 rba:(0x1.3.10)
 enabled  threads:  01000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Offline scn: 0x0000.00000000 prev_range: 0
 Online Checkpointed at scn:  0x0000.00000000 
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
 Hot Backup end marker scn: 0x0000.00000000
 aux_file is NOT DEFINED 
File header version cannot be determined due to corruption
Dump may be suspect
 V10 STYLE FILE HEADER:
	Software vsn=0=0x0, Compatibility Vsn=0=0x0
	Db ID=0=0x0, Db Name=''
	Activation ID=0=0x0
	Control Seq=0=0x0, File size=0=0x0
	File Number=0, Blksiz=0, File Type=0 UNKNOWN
Tablespace #0 -   rel_fn:0 
Creation   at   scn: 0x0000.00000000 01/01/1988 00:00:00
Backup taken at scn: 0x0000.00000000 01/01/1988 00:00:00 thread:0
 reset logs count:0x0 scn: 0x0000.00000000 reset logs terminal rcv data:0x0 scn: 0x0000.00000000
 prev reset logs count:0x0 scn: 0x0000.00000000 prev reset logs terminal rcv data:0x0 scn: 0x0000.00000000
 recovered at 01/01/1988 00:00:00
 status:0x0 root dba:0x00000000 chkpt cnt: 0 ctl cnt:0
begin-hot-backup file size: 0
Checkpointed at scn:  0x0000.00000000 
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
Backup Checkpointed at scn:  0x0000.00000000 
 thread:0 rba:(0x0.0.0)
 enabled  threads:  00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000
External cache id: 0x0 0x0 0x0 0x0
Absolute fuzzy scn: 0x0000.00000000
Recovery fuzzy scn: 0x0000.00000000 01/01/1988 00:00:00
Terminal Recovery Stamp  01/01/1988 00:00:00

这里可以看出来文件头完全损坏,dump redo header

LOG FILE #1: 
  (name #1) /dev/rredo11
 Thread 1 redo log links: forward: 2 backward: 0
 siz: 0x3c000 seq: 0x00003f9c hws: 0x2 bsz: 512 nab: 0x1763c flg: 0x1 dup: 1
 Archive links: fwrd: 0 back: 0 Prev scn: 0x0000.7b458379
 Low scn: 0x0000.7b4583ad 05/29/2013 01:32:39
 Next scn: 0x0000.7b4fa0e5 05/29/2013 22:17:59
 FILE HEADER:
	Software vsn=0=0x0, Compatibility Vsn=0=0x0
	Db ID=0=0x0, Db Name=''
	Activation ID=0=0x0
	Control Seq=0=0x0, File size=0=0x0
	File Number=0, Blksiz=0, File Type=0 UNKNOWN
 descrip:""
 thread: 0 nab: 0x0 seq: 0x00000000 hws: 0x0 eot: 0 dis: 0
 reset logs count: 0x0 scn: 0x0000.00000000
 Low scn: 0x0000.00000000 01/01/1988 00:00:00
 Next scn: 0x0000.00000000 01/01/1988 00:00:00
 Enabled scn: 0x0000.00000000 01/01/1988 00:00:00
 Thread closed scn: 0x0000.00000000 01/01/1988 00:00:00
 Log format vsn: 0x0 Disk cksum: 0x0 Calc cksum: 0x0
 Terminal Recovery Stop scn: 0x0000.00000000
 Terminal Recovery Stamp  01/01/1988 00:00:00
 Most recent redo scn: 0x0000.00000000
 Largest LWN: 0 blocks
 Miscellaneous flags: 0x0
 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000

验证redo header已经异常,dump redo logfile全部提示文件头错误


现在情况已经很明显,客户的库因为使用了裸设备,online 磁盘的过程中,所有的裸设备卷已经重写了文件头,oracle的datafile header信息不在了,无法正常操作完成.我们决定使用rman备份来恢复该数据库,然后想办法处理redo

注册带库备份集
在rman恢复过程中,我们遇到一个问题,客户的库rman备份策略有问题,一周一个全备,每天一个增量备份,一次归档备份,最后一次增量备份后备份了控制文件,但是最后一次归档备份之后无控制文件,而且是归档的备份发生在增量备份之后,因为是使用了带库无catalog库,我们增量恢复之后,数据不一致需要归档,但是归档,而归档的备份未记录在还原出来的控制文件中,需要人工注册带库的备份集到控制文件中

--涉及到3个节点都有配置不同的NB_ORA_CLIENT=pysa,如果在同一个节点中还原归档日志,需要配置如下
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS  'ENV=(NB_ORA_CLIENT=pysa)';

--如果在默认节点直接分配sbt通道即可
configure default device type to sbt_tape;

--注册带库备份集
catalog device type 'sbt_tape' backuppiece 'al_21395_1_816744765';

分析redo
通过ue打开dd出来的redo文件,我们分析得到20000h(10*8192)全部为0,应该是和赛门铁克存储管理系统有关系,后面开始是aix的设备头信息

正常redo文件信息

该库redo信息对比(获得aix偏移量)

对比正常redo起点信息和经验我们定位到aix的设备头偏移量为1000h(4096),整体偏移量为21000h(10*8192+4096)
该库的redo起点为21000h,也就是说,我们需要执行的dd语句为类似语句(redo 大小为120M)

dd if=/dev/vx/rdsk/dg/redo31 bs=512 skip=264 count=245761 of=/arch/xifenfei/redo31

dd出来所有数据后,对先要已经应用了归档的库,继续尝试recover redo

SQL> recover database using backup controlfile;
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:08 needed for thread
1
ORA-00289: suggestion : /arch/1_16289_623109141.dbf
ORA-00280: change 2069348436 for thread 1 is in sequence #16289


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo12
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:08 needed for thread
2


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo23
ORA-00279: change 2069348436 generated at 05/30/2013 16:18:07 needed for thread
3
ORA-00289: suggestion : /arch/3_3898_623109141.dbf
ORA-00280: change 2069348436 for thread 3 is in sequence #3898


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo31
ORA-00279: change 2069455373 generated at 05/30/2013 20:03:26 needed for thread
1
ORA-00289: suggestion : /arch/1_16290_623109141.dbf
ORA-00280: change 2069455373 for thread 1 is in sequence #16290
ORA-00278: log file '/arch/xifenfei/redo12' no longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo11
ORA-00279: change 2069475956 generated at 05/30/2013 20:04:09 needed for thread
3
ORA-00289: suggestion : /arch/3_3899_623109141.dbf
ORA-00280: change 2069475956 for thread 3 is in sequence #3899
ORA-00278: log file '/arch/xifenfei/redo31' no longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/arch/xifenfei/redo33
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;

Database altered.

到这一步我们完整的通过dd跳过了由于赛门铁克管理磁盘导致的磁盘头损坏的块,从裸设备中复制出redo到文件系统,然后进行恢复,完整的抢救了客户的数据,减少了客户的损坏.这里温馨提示对于非常重要的系统(涉及钱),强烈建议redo多路冗余,光依靠存储容灾,备份,dg依然不够

通过基表获取segment header block

数据库不能open的时候,可以通过dul挖取相关基表(user$,obj$,ts$,tab$,seg$,file$),从而来获得segment header信息,然后通过dump该block,结合shell脚本获得extents分布脚本来获得extent分布

   SELECT NVL (u.name, 'SYS'),
          o.name,
          o.subname,
          so.object_type,
          s.type#,
          DECODE (BITAND (s.spare1, 2097408),
                  2097152, 'SECUREFILE',
                  256, 'ASSM',
                  'MSSM'),
          ts.ts#,
          ts.name,
          ts.blocksize,
          f.file#,
          s.block#,
          s.blocks * ts.blocksize,
          s.blocks,
          s.extents,
          s.iniexts * ts.blocksize,
          s.extsize * ts.blocksize,
          s.minexts,
          s.maxexts,
          DECODE (BITAND (s.spare1, 4194304), 4194304, bitmapranges, NULL),
          TO_CHAR (
             DECODE (
                BITAND (s.spare1, 2097152),
                2097152, DECODE (s.lists,
                                 0, 'NONE',
                                 1, 'AUTO',
                                 2, 'MIN',
                                 3, 'MAX',
                                 4, 'DEFAULT',
                                 'INVALID'),
                NULL)),
          DECODE (BITAND (s.spare1, 2097152), 2097152, s.groups, NULL),
          DECODE (BITAND (ts.flags, 3), 1, TO_NUMBER (NULL), s.extpct),
          DECODE (BITAND (ts.flags, 32),
                  32, TO_NUMBER (NULL),
                  DECODE (s.lists, 0, 1, s.lists)),
          DECODE (BITAND (ts.flags, 32),
                  32, TO_NUMBER (NULL),
                  DECODE (s.groups, 0, 1, s.groups)),
          s.file#,
          BITAND (s.cachehint, 3),
          BITAND (s.cachehint, 12) / 4,
          BITAND (s.cachehint, 48) / 16,
          NVL (s.spare1, 0),
          o.dataobj#
     FROM chf.user$ u,
          chf.obj$ o,
          chf.ts$ ts,
          ( SELECT DECODE (BITAND (t.property, 8192), 8192, 'NESTED TABLE', 'TABLE') OBJECT_TYPE,
          2 OBJECT_TYPE_ID,
          5 SEGMENT_TYPE_ID,
          t.obj# OBJECT_ID,
          t.file# HEADER_FILE,
          t.block# HEADER_BLOCK,
          t.ts# TS_NUMBER
     FROM chf.tab$ t) so,
          chf.seg$ s,
          chf.file$ f
    WHERE     s.file# = so.header_file
          AND s.block# = so.header_block
          AND s.ts# = so.ts_number
          AND s.ts# = ts.ts#
          AND o.obj# = so.object_id
          AND o.owner# = u.user#(+)
          AND s.type# = so.segment_type_id
          AND o.type# = so.object_type_id
          AND s.ts# = f.ts#
          AND s.file# = f.relfile#
          and o.name in('XIFENFEI','T_XIFENFEI');

记录一次ORA-600 4000数据库故障恢复

ORA-600[4000]错误
一朋友数据库因为当前redo丢失,在恢复的过程中启动报ORA-600[4000]错误

SMON: enabling cache recovery
Thu May 30 16:24:17 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/udump/xifenfei1_ora_1458370.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Thu May 30 16:24:19 2013
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1458370
ORA-1092 signalled during: alter database open resetlogs...

分析trace文件

*** 2013-05-30 16:24:17.979
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4000], [83], [], [], [], [], [], []
Current SQL statement for this session:
select ctime, mtime, stime from obj$ where obj# = :1
--确定是obj$对象异常,通过某种手段找到obj$的objid和dataobjid均为16,对应16进制为12

Block header dump:  0x0040007a
 Object id on Block? Y
 seg/obj: 0x12  csc: 0xc1e.a329e76f  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0053.02a.000598bd  0x0d407e46.4f52.2f  --U-    1  fsc 0x0000.a329e772

这里比较明显obj$对象在rdba为0040007a的block上,scn为0c1e.a329e76f(13325725984623)且未提交的事务,这样的现象就决定了处理的特殊性(不是因为块延迟清理导致访问undo现象,该现象直接推进scn解决,而该情况不行)

数据文件头scn

SQL> SELECT DISTINCT CHECKPOINT# FROM V$DATAFILE_HEADER;

    CHECKPOINT_CHANGE#
-------------------------
           13324676536960

bbed查看文件头scn

   struct kcvfhckp, 160 bytes               @484     
      struct kcvcpscn, 8 bytes              @484     
         ub4 kscnbas                        @484      0x649c9a80
         ub2 kscnwrp                        @488      0x0c1e

这里看到的文件头scn也是为13324676536960(0c1e.649c9a80)和sql查询结果一致,也就是说数据库中的obj$的某个对象含有事务,且scn大于文件头scn(因为当前redo丢失,无法前滚,所以出现该情况),当数据库访问obj$的时候,为了事务的一致性,就需要访问undo(这里提示为83 回滚段),而undo异常,所以smon进程回滚失败,数据库无法正常启动

使用bbed提交事务

BBED> map
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122                                   Dba:0x0040007a
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 48 bytes                     @20      

 struct kdbh, 14 bytes                      @68      

 struct kdbt[1], 4 bytes                    @82      

 sb2 kdbr[108]                              @86      

 ub1 freespace[802]                         @302     

 ub1 rowdata[7084]                          @1104    

 ub4 tailchk                                @8188    


BBED> p ktbbh
struct ktbbh, 48 bytes                      @20      
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24      
      ub4 ktbbhsg1                          @24       0x00000012
      ub4 ktbbhod1                          @24       0x00000012
   struct ktbbhcsc, 8 bytes                 @28      
      ub4 kscnbas                           @28       0xa329e76f
      ub2 kscnwrp                           @32       0x0c1e
   b2 ktbbhict                              @36       1
   ub1 ktbbhflg                             @38       0x02 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x00000000
   struct ktbbhitl[0], 24 bytes             @44      
      struct ktbitxid, 8 bytes              @44      
         ub2 kxidusn                        @44       0x0053
         ub2 kxidslt                        @46       0x002a
         ub4 kxidsqn                        @48       0x000598bd
      struct ktbituba, 8 bytes              @52      
         ub4 kubadba                        @52       0x0d407e46
         ub2 kubaseq                        @56       0x4f52
         ub1 kubarec                        @58       0x2f
      ub2 ktbitflg                          @60       0x2001 (KTBFUPB)<--需要提交
      union _ktbitun, 2 bytes               @62      
         b2 _ktbitfsc                       @62       0
         ub2 _ktbitwrp                      @62       0x0000
      ub4 ktbitbas                          @64       0xa329e772

BBED> set count 32
        COUNT           32

BBED> set offset 60
        OFFSET          60

BBED> d
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 20010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb 

 <32 bytes per line>

BBED> m /x 8001
 File: /oradata/sys/xifenfei/system01.dbf (1)
 Block: 122              Offsets:   60 to   91           Dba:0x0040007a
------------------------------------------------------------------------
 80010000 a329e772 0001006c ffff00ea 040c0368 03680000 006c1f7c 1f3c1efb 

 <32 bytes per line>

BBED> sum apply
Check value for File 1, Block 122:
current = 0xafd6, required = 0xafd6

尝试open数据库ORA-600[2662]解决

Thu May 30 21:16:00 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532996],[3102],[2745973074],[4194397],[],[]
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:01 2013
Trace dumping is performing id=[cdmp_20130530211601]
Thu May 30 21:16:02 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:03 2013
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Thu May 30 21:16:05 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_smon_819664.trc:
ORA-00600: internal error code, arguments:[2662],[3102],[2737532997],[3102],[2745973074],[4194397],[],[]
Thu May 30 21:16:08 2013
Errors in file /u02/oracle/app/oracle/admin/xifenfei/bdump/xifenfei1_pmon_958764.trc:
ORA-00474: SMON process terminated with error
Thu May 30 21:16:08 2013
PMON: terminating instance due to error 474

数据库在open过程中遇到大量ORA-00600[2662],这个是因为数据库中文件头的scn小于访问的数据块scn导致该问题,解决方法推荐scn,如果数据库的scn本身就很大(和时间理论scn较接近),推进过程中可能遇到如下错误,这个时候就需要选择合适的方法/合适的值来推进scn

SQL> startup pfile=/home/oracle/pfile force
ORACLE instance started.

Total System Global Area 5.5835E+10 bytes
Fixed Size                  2177056 bytes
Variable Size            3.2867E+10 bytes
Database Buffers         2.2951E+10 bytes
Redo Buffers               14598144 bytes
Database mounted.
ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified

后面的工作因为没有redo前滚,而且该库故障时有大量事务在跑,现在无法前滚,导致大量的undo回滚段异常,index和data不一致等故障,需要做的就是屏蔽undo seg,重建undo,重建库

Viewing Information About CDBs and PDBs

在ORACLE 12C中引入了CDB和PDB的概念,实现了ORACLE数据库的可插拔,在一个CDB数据库中,有多个PDB,而每一个PDB又可以理解为一个独立的传统ORACLE 数据库,那为了能够通过一个sql查询获得整个CDB数据库的信息,ORALCE 引入了CDB_*开头的视图,该视图就是在传统的DBA_*视图基础之上增加了CON_ID,用来区分不同的PDB,从而实现了一个简单sql查询在有足够权限的情况下,可以查询所有PDB中信息
ORACLE 12C版本

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

确定是否是CDB

SQL> SELECT CDB FROM V$DATABASE;

CDB
---
YES

YES表示该数据库是CDB,如果是NO表示是NO-CDB(普通数据库)

CDB中各容器信息

SQL> COLUMN NAME FORMAT A8
SQL> SELECT NAME, CON_ID, DBID, CON_UID, GUID FROM V$CONTAINERS ORDER BY CON_ID;

NAME         CON_ID       DBID    CON_UID GUID
-------- ---------- ---------- ---------- --------------------------------
CDB$ROOT          1 1922813718          1 DB85A3D39F8E7703E0431CAAE80A8C44
PDB$SEED          2 4048821679 4048821679 D49B80694E4A449BE0430100007F906F
PDB1              3 3313918585 3313918585 D49BF37938FB4C10E0430100007F6CBD
PDB2              4 3872456618 3872456618 D49BFA33332F4C3EE0430100007FA059

SQL> select CON_NAME_TO_ID('PDB1') FROM DUAL;

CON_NAME_TO_ID('PDB1')
----------------------
                     3

SQL> SELECT CON_DBID_TO_ID(3313918585) FROM DUAL;

CON_DBID_TO_ID(3313918585)
--------------------------
                         3

SQL> SELECT CON_UID_TO_ID(3313918585) FROM DUAL;

CON_UID_TO_ID(3313918585)
-------------------------
                        3

PDB部分信息

SQL> COLUMN PDB_NAME FORMAT A15
SQL> SELECT PDB_ID, PDB_NAME, STATUS FROM DBA_PDBS ORDER BY PDB_ID;

    PDB_ID PDB_NAME        STATUS
---------- --------------- -------------
         2 PDB$SEED        NORMAL
         3 PDB1            NORMAL
         4 PDB2            NORMAL

SQL> COLUMN NAME FORMAT A15
SQL> COLUMN RESTRICTED FORMAT A10
SQL> COLUMN OPEN_TIME FORMAT A30
SQL> SELECT NAME, OPEN_MODE, RESTRICTED, OPEN_TIME FROM V$PDBS;

NAME            OPEN_MODE  RESTRICTED OPEN_TIME
--------------- ---------- ---------- ------------------------------
PDB$SEED        READ ONLY  NO         12-MAY-13 08.51.53.177 AM
PDB1            READ WRITE NO         12-MAY-13 09.35.48.646 AM
PDB2            MOUNTED               12-MAY-13 08.56.59.859 AM

CDB中查询对象信息

SQL> COLUMN PDB_NAME FORMAT A15
SQL> COLUMN OWNER FORMAT A15
SQL> COLUMN TABLE_NAME FORMAT A30
SQL> SELECT p.PDB_ID, p.PDB_NAME, t.OWNER, t.TABLE_NAME
  2  FROM DBA_PDBS p, CDB_TABLES t
  3  WHERE p.PDB_ID > 2 
  4  AND T.TABLE_NAME='COL$'
  5  AND p.PDB_ID = t.CON_ID
  6  ORDER BY p.PDB_ID;

    PDB_ID PDB_NAME        OWNER           TABLE_NAME
---------- --------------- --------------- ------------------------------
         3 PDB1            SYS             COL$
         4 PDB2            SYS             COL$

查询在CDB中的PDB数据/临时文件信息

SQL> COLUMN PDB_ID FORMAT 999
SQL> COLUMN PDB_NAME FORMAT A8
SQL> COLUMN FILE_ID FORMAT 9999
SQL> COLUMN TABLESPACE_NAME FORMAT A10
SQL> COLUMN FILE_NAME FORMAT A45
SQL> SELECT p.PDB_ID, p.PDB_NAME, d.FILE_ID, d.TABLESPACE_NAME, d.FILE_NAME
  2  FROM DBA_PDBS p, CDB_DATA_FILES d
  3  WHERE p.PDB_ID = d.CON_ID
  4  ORDER BY p.PDB_ID;

PDB_ID PDB_NAME FILE_ID TABLESPACE FILE_NAME
------ -------- ------- ---------- ---------------------------------------------
     2 PDB$SEED       5 SYSTEM     +DATA/pdb$seed_system01.dbf
     2 PDB$SEED       7 SYSAUX     +DATA/pdb$seed_sysaux01.dbf
     3 PDB1           9 SYSAUX     +DATA/pdb1_pdb$seed_sysaux01.dbf
     3 PDB1          10 USERS      +DATA/cdb/pdb1_users01.dbf
     3 PDB1           8 SYSTEM     +DATA/pdb1_pdb$seed_system01.dbf
     4 PDB2          13 USERS      +DATA/cdb/pdb2_users01.dbf
     4 PDB2          12 SYSAUX     +DATA/pdb2_pdb$seed_sysaux01.dbf
     4 PDB2          11 SYSTEM     +DATA/pdb2_pdb$seed_system01.dbf

8 rows selected.

SQL> COLUMN CON_ID FORMAT 999
SQL> COLUMN FILE_ID FORMAT 9999
SQL> COLUMN TABLESPACE_NAME FORMAT A15
SQL> COLUMN FILE_NAME FORMAT A45
SQL> SELECT CON_ID, FILE_ID, TABLESPACE_NAME, FILE_NAME
  2  FROM CDB_TEMP_FILES
  3  ORDER BY CON_ID;

CON_ID FILE_ID TABLESPACE_NAME FILE_NAME
------ ------- --------------- ---------------------------------------------
     1       1 TEMP            +DATA/cdb/temp01.dbf
     2       2 TEMP            +DATA/pdbseed_temp01.dbf
     3       3 TEMP            +DATA/pdb1_temp01.dbf
     4       4 TEMP            +DATA/pdb2_temp01.dbf

查询PDB的service信息

SQL> COLUMN NETWORK_NAME FORMAT A30
SQL> COLUMN PDB FORMAT A15
SQL> COLUMN CON_ID FORMAT 999
SQL> SELECT PDB, NETWORK_NAME, CON_ID FROM CDB_SERVICES
  2  WHERE PDB IS NOT NULL AND
  3  CON_ID > 2
  4  ORDER BY PDB;

PDB             NETWORK_NAME                   CON_ID
--------------- ------------------------------ ------
PDB1            pdb1                                3
PDB2            pdb2                                4

PDB中可以修改参数

SELECT NAME FROM V$SYSTEM_PARAMETER
WHERE ISPDB_MODIFIABLE = 'TRUE'
ORDER BY NAME;

查看PDB历史信息

SQL> COLUMN DB_NAME FORMAT A10
SQL> COLUMN CON_ID FORMAT 999
SQL> COLUMN PDB_NAME FORMAT A15
SQL> COLUMN OPERATION FORMAT A16
SQL> COLUMN OP_TIMESTAMP FORMAT A10
SQL> COLUMN CLONED_FROM_PDB_NAME FORMAT A15
SQL> SELECT DB_NAME, CON_ID, PDB_NAME, OPERATION, OP_TIMESTAMP, CLONED_FROM_PDB_NAME
  2  FROM CDB_PDB_HISTORY
  3  WHERE CON_ID > 2
  4  ORDER BY CON_ID;

DB_NAME    CON_ID PDB_NAME        OPERATION        OP_TIMESTA CLONED_FROM_PDB
---------- ------ --------------- ---------------- ---------- ---------------
SEEDDATA        3 PDB$SEED        UNPLUG           29-APR-13
CDB             3 PDB1            CREATE           01-FEB-13  PDB$SEED
CDB             3 PDB$SEED        PLUG             01-FEB-13  PDB$SEED
SEEDDATA        4 PDB$SEED        UNPLUG           29-APR-13
CDB             4 PDB2            CREATE           01-FEB-13  PDB$SEED
CDB             4 PDB$SEED        PLUG             01-FEB-13  PDB$SEED

6 rows selected.

ORACLE 12C PDB 维护基础介绍

CDB和PDB是ORACLE 12C一个很亮的新特性,由于他们的引入导致传统的ORACLE数据库管理理念不少发生了改变,这里列举了部分最基本的cdb和pdb管理方式
cdb和pdb关系图

ORACLE 12C版本

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

启动关闭pdb

SQL> startup
ORACLE instance started.

Total System Global Area  597098496 bytes
Fixed Size                  2291072 bytes
Variable Size             272632448 bytes
Database Buffers          314572800 bytes
Redo Buffers                7602176 bytes
Database mounted.
Database opened.
SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED

SQL> alter PLUGGABLE database pdb1 open;

Pluggable database altered.

SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           MOUNTED

SQL> alter PLUGGABLE database pdb1 close;

Pluggable database altered.

SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED

SQL> alter PLUGGABLE database all open;

Pluggable database altered.

SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           READ WRITE

SQL> alter PLUGGABLE database all close;

Pluggable database altered.

SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           MOUNTED
         4 3872456618 PDB2                           MOUNTED

SQL> alter session set container=pdb1;

Session altered.

SQL> startup
Pluggable Database opened.
SQL>  select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         3 3313918585 PDB1                           READ WRITE

pdb的管理可以在cdb中进行也可以在pdb中进行,如果是cdb中进行,需要PLUGGABLE关键字,如果是pdb中直接和普通数据库一样

登录pdb

[oracle@xifenfei ~]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 12-MAY-2013 08:07:02

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=xifenfei)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                11-MAY-2013 18:30:54
Uptime                    0 days 13 hr. 36 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/grid/product/12.1/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/xifenfei/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xifenfei)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=xifenfei)(PORT=5500))
(Security=(my_wallet_directory=/u01/oracle/12.1/db_1/admin/cdb/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM", status READY, has 1 handler(s) for this service...
Service "cdb" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "cdbXDB" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
Service "pdb2" has 1 instance(s).
  Instance "cdb", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@xifenfei ~]$ tnsping pdb1

TNS Ping Utility for Linux: Version 12.1.0.1.0 - Production on 12-MAY-2013 08:07:09

Copyright (c) 1997, 2013, Oracle.  All rights reserved.

Used parameter files:


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xifenfei)
(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = pdb1)))
OK (20 msec)
[oracle@xifenfei ~]$ sqlplus sys/xifenfei@pdb1 as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Sun May 12 08:08:02 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> show con_name;

CON_NAME
------------------------------
PDB1


[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Sun May 12 08:09:14 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> alter session set container=pdb1;

Session altered.

SQL> show con_name;

CON_NAME
------------------------------
PDB1

pdb可以通过alter session container进入也可以直接通过tns方式登录

创建用户

SQL> show con_name;

CON_NAME
------------------------------
CDB$ROOT

SQL> select con_id,dbid,NAME,OPEN_MODE from v$pdbs;

    CON_ID       DBID NAME                           OPEN_MODE
---------- ---------- ------------------------------ ----------
         2 4048821679 PDB$SEED                       READ ONLY
         3 3313918585 PDB1                           READ WRITE
         4 3872456618 PDB2                           MOUNTED

SQL> create user xff identified by xifenfei;
create user xff identified by xifenfei
            *
ERROR at line 1:
ORA-65096: invalid common user or role name


SQL> !oerr ora 65096
65096, 00000, "invalid common user or role name"
// *Cause:  An attempt was made to create a common user or role with a name
//          that wass not valid for common users or roles.  In addition to 
//          the usual rules for user and role names, common user and role 
//          names must start with C## or c## and consist only of ASCII 
//          characters.
// *Action: Specify a valid common user or role name.
//

SQL> create user c##xff identified by xifenfei;

User created.

SQL> SELECT USERNAME,CON_ID,USER_ID FROM CDB_USERS WHERE USERNAME='C##XFF';

USERNAME       CON_ID    USER_ID
---------- ---------- ----------
C##XFF              1        103
C##XFF              3        104

SQL> alter session set container=pdb1;

Session altered.

SQL> show con_name

CON_NAME
------------------------------
PDB1
SQL> create user xff identified by xifenfei;

User created.

SQL> create user c##abc identified by xifenfei;
create user c##abc identified by xifenfei
            *
ERROR at line 1:
ORA-65094: invalid local user or role name

创建用户默认的是container=all,在cdb中只能创建全局用户(c##开头),会在cdb和所有的pdb中创建该用户(但是pdb中的全局用户需要另外授权才能够在pdb中访问)。在pdb中只能创建的用户为本地用户

用户授权

SQL> grant connect to c##xff;

Grant succeeded.

SQL> select GRANTEE,con_id from cdb_ROLE_PRIVS where GRANTED_ROLE='CONNECT' AND GRANTEE='C##XFF';

GRANTEE                            CON_ID
------------------------------ ----------
C##XFF                                  1

SQL> grant resource to c##xff container=all;

Grant succeeded.

SQL>  select GRANTEE,con_id from cdb_ROLE_PRIVS where GRANTED_ROLE='RESOURCE' AND  GRANTEE='C##XFF';

GRANTEE                            CON_ID
------------------------------ ----------
C##XFF                                  1
C##XFF                                  3

用户授权默认情况下是只会给当前container,在cdb中也可以指定container=all,对所有open的pdb且存在该用户都进行授权

修改参数

SQL> alter system set open_cursors=500 container=all;

System altered.

SQL> conn sys/xifenfei@pdb1 as sysdba
Connected.
SQL> show parameter open_cursors;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     500
SQL>  alter system set open_cursors=100;
 alter system set open_cursors=100
*
ERROR at line 1:
ORA-01219: database or pluggable database not open: queries allowed on fixed
tables or views only


SQL> alter database open;

Database altered.

SQL> alter system set open_cursors=100;

System altered.

SQL>  show parameter open_cursors;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     100
SQL> conn / as sysdba
Connected.
SQL> show parameter open_cursors;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
open_cursors                         integer     500

这里可以看到在cdb中修改,pdb会继承进去;如果在pdb中修改会覆盖pdb从cdb中继承的参数含义