重建控制文件丢失undo异常恢复—ORA-600 25025模拟与恢复

在现实的情况中,有些人因为某种原因重建控制文件(丢失undo[有意或者无意]),然后又resetlogs库尝试恢复,这样的操作可能导致各种比较麻烦的恢复,这里模拟ORA-600[25025]异常恢复
模拟ORA-600[25025]错误

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> alter database backup controlfile to trace as '/tmp/ctl';

Database altered.

SQL> create table chf.t_xifenfei_www as select * from dba_objects where 1=0;

Table created.

SQL> insert into chf.t_xifenfei_www select * from dba_objects;

74749 rows created.


--另外一个会话abort
SQL> shutdown abort;
ORACLE instance shut down.


SQL> STARTUP NOMOUNT
ORACLE instance started.

Total System Global Area  175775744 bytes
Fixed Size                  1343668 bytes
Variable Size             117444428 bytes
Database Buffers           50331648 bytes
Redo Buffers                6656000 bytes
SQL> !vi /tmp/ctl.sql

CREATE CONTROLFILE REUSE DATABASE "ORA11G" NORESETLOGS  ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
    MAXLOGHISTORY 292
LOGFILE
  GROUP 1 '/u01/oracle/oradata/ora11g/redo01.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 '/u01/oracle/oradata/ora11g/redo02.log'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 '/u01/oracle/oradata/ora11g/redo03.log'  SIZE 50M BLOCKSIZE 512
DATAFILE
  '/u01/oracle/oradata/ora11g/system01.dbf',
  '/u01/oracle/oradata/ora11g/sysaux01.dbf',
 -- '/u01/oracle/oradata/ora11g/undo02.dbf',
  '/u01/oracle/oradata/ora11g/users01.dbf',
  '/u01/oracle/oradata/ora11g/dbfs01.dbf',
  '/u01/oracle/oradata/ora11g/tts_xifenfei02.dbf',
  '/u01/oracle/oradata/ora11g/tts_xifenfei01.dbf',
  '/u01/oracle/oradata/ora11g/system02.dbf',
  '/u01/oracle/oradata/ora11g/czum01.dbf',
  '/u01/oracle/oradata/ora11g/undotbs02.dbf',
  '/u01/oracle/oradata/sp2008',
  '/u01/oracle/oradata/sp_2009',
  '/u01/oracle/oradata/sp_2010',
  '/u01/oracle/oradata/sp_2011',
  '/u01/oracle/oradata/sp_2012',
  '/u01/oracle/oradata/sp_2013',
  '/u01/oracle/oradata/sp_2014',
  '/u01/oracle/oradata/sp_2015',
  '/u01/oracle/oradata/sp_2016',
  '/u01/oracle/oradata/sp_2017',
  '/u01/oracle/oradata/sp_2018',
  '/u01/oracle/oradata/sp_2019',
  '/u01/oracle/oradata/sp_2020',
  '/u01/oracle/oradata/sp_2021',
  '/u01/oracle/oradata/sp_2022',
  '/u01/oracle/oradata/sp_2023',
  '/u01/oracle/oradata/sp_2024',
  '/u01/oracle/oradata/sp_2025',
  '/u01/oracle/oradata/sp_20max'
CHARACTER SET ZHS16GBK
;

"/tmp/ctl.sql" [New] 43L, 1519C written

SQL> @/tmp/ctl.sql

Control file created.

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo01.log
ORA-00310: archived log contains sequence 37; sequence 38 required
ORA-00334: archived log: '/u01/oracle/oradata/ora11g/redo01.log'


ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'


SQL>  recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo03.log
ORA-00310: archived log contains sequence 39; sequence 38 required
ORA-00334: archived log: '/u01/oracle/oradata/ora11g/redo03.log'


ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/ora11g/system01.dbf'


SQL> recover database using backup controlfile until cancel;
ORA-00279: change 12696930343864 generated at 05/18/2013 01:17:54 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_38_805394597.dbf
ORA-00280: change 12696930343864 for thread 1 is in sequence #38


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo02.log
ORA-00279: change 12696930370956 generated at 08/26/2013 13:00:25 needed for
thread 1
ORA-00289: suggestion :
/u01/oracle/oradata/ora11g/archivelog/1_39_805394597.dbf
ORA-00280: change 12696930370956 for thread 1 is in sequence #39
ORA-00278: log file '/u01/oracle/oradata/ora11g/redo02.log' no longer needed
for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/ora11g/redo03.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [],
[], [], [], [], []
Process ID: 12603
Session ID: 125 Serial number: 3

alert日志信息

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile until cancel  ...
ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo02.log'
Media Recovery Log /u01/oracle/oradata/ora11g/redo02.log
Mon Aug 26 13:05:00 2013
ORA-279 signalled during: ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo02.log'  ...
Mon Aug 26 13:05:10 2013
ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo03.log'
Media Recovery Log /u01/oracle/oradata/ora11g/redo03.log
Mon Aug 26 13:05:10 2013
Incomplete recovery applied all redo ever generated.
Recovery completed through change 12696930370973 time 08/26/2013 13:00:56
Media Recovery Complete (ora11g)
Completed: ALTER DATABASE RECOVER    LOGFILE '/u01/oracle/oradata/ora11g/redo03.log'
alter database open resetlogs
RESETLOGS after complete recovery through change 12696930370973
Archived Log entry 1 added for thread 1 sequence 37 ID 0xfa6fa6cb dest 1:
Archived Log entry 2 added for thread 1 sequence 38 ID 0xfa6fa6cb dest 1:
Archived Log entry 3 added for thread 1 sequence 39 ID 0xfa6fa6cb dest 1:
Clearing online redo logfile 1 /u01/oracle/oradata/ora11g/redo01.log
Clearing online log 1 of thread 1 sequence number 37
Mon Aug 26 13:05:22 2013
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u01/oracle/oradata/ora11g/redo02.log
Clearing online log 2 of thread 1 sequence number 38
Clearing online redo logfile 2 complete
Clearing online redo logfile 3 /u01/oracle/oradata/ora11g/redo03.log
Clearing online log 3 of thread 1 sequence number 39
Clearing online redo logfile 3 complete
Resetting resetlogs activation ID 4201621195 (0xfa6fa6cb)
Online log /u01/oracle/oradata/ora11g/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/oracle/oradata/ora11g/redo03.log: Thread 1 Group 3 was previously cleared
Mon Aug 26 13:05:33 2013
Setting recovery target incarnation to 2
Mon Aug 26 13:05:33 2013
Using SCN growth rate of 16384 per second
Mon Aug 26 13:05:33 2013
Assigning activation ID 4220644150 (0xfb91eb36)
LGWR: STARTING ARCH PROCESSES
Mon Aug 26 13:05:33 2013
ARC0 started with pid=20, OS id=12679
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Aug 26 13:05:35 2013
ARC1 started with pid=21, OS id=12683
Mon Aug 26 13:05:35 2013
ARC2 started with pid=22, OS id=12687
Mon Aug 26 13:05:36 2013
ARC3 started with pid=24, OS id=12691
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /u01/oracle/oradata/ora11g/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 26 13:05:36 2013
SMON: enabling cache recovery
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc  (incident=146705):
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/oracle/diag/rdbms/ora11g/ora11g/incident/incdir_146705/ora11g_ora_12603_i146705.trc
Mon Aug 26 13:05:45 2013
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Undo initialization errored: err:600 serial:0 start:57601994 end:57610584 diff:8590 (85 seconds)
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc:
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_12603.trc:
ORA-00600: internal error code, arguments: [25025], [3], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 12603): terminating the instance due to error 600
Instance terminated by USER, pid = 12603
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (12603) as a result of ORA-1092
Mon Aug 26 13:05:47 2013
ORA-1092 : opitsk aborting process

trace文件

*** 2013-08-26 13:05:38.945
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=7j16t46cacjt9) -----
alter database open resetlogs

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41        call     kgdsdst()            BF8A34E4 ? 2 ?
ksedst1()+77         call     skdstdst()           BF8A34E4 ? 0 ? 1 ? 8592C38 ?
                                                   8592AB6 ? 8592C38 ?
ksedst()+33          call     ksedst1()            0 ? 1 ?
dbkedDefDump()+2704  call     ksedst()             0 ? BF8A40E7 ? 47EF1FF4 ?
                                                   BF8A3D66 ? 0 ? BF8A3BF4 ?
ksedmp()+47          call     dbkedDefDump()       3 ? 2 ?
ksfdmp()+59          call     ksedmp()             3EB ? BF8A5458 ? F1DADED ?
                                                   116056E0 ? 3EB ? 116056E0 ?
dbgexPhaseII()+1725  call     00000000             116056E0 ? 3EB ?
dbgexProcessError()  call     dbgexPhaseII()       B6C515AC ? B6A0C890 ?
+2550                                              BF8A8D30 ?
dbgeExecuteForError  call     dbgexProcessError()  B6C515AC ? B6A0C890 ? 1 ? 0 ?
()+65                                              B6C515AC ? B6A0C890 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  B6C515AC ? B6A0C890 ? 0 ? 1 ?
1794                          ()                   0 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   116056E0 ? B6C330D4 ? 258 ?
50
kgeade()+324         call     00000000             116056E0 ? B6C330D4 ? 258 ?
kgeriv_int()+107     call     kgeade()             116056E0 ? 11605808 ?
                                                   B6C330D4 ? 258 ? 0 ? 61C1 ?
kgeriv()+25          call     kgeriv_int()         116056E0 ? B6C330D4 ? 61C1 ?
                                                   0 ? 1 ? BF8A96B8 ?
kgesiv()+98          call     kgeriv()             116056E0 ? B6C330D4 ? 61C1 ?
                                                   1 ? BF8A96B8 ?
ksesic1()+48         call     kgesiv()             116056E0 ? B6C330D4 ? 61C1 ?
                                                   1 ? BF8A96B8 ? 61C1 ? 1 ?krta2rfn()+78        
                                                   call     ksesic1()            61C1 ? 0 ? 3 ? 0 ?
kttsinfo()+496       call     krta2rfn()           3 ? 0 ? 0 ? 0 ? 0 ? 0 ?
ktusmout_online_ut(  call     kttsinfo()           9 ? 0 ? 0 ? BF8A9948 ?
)+810
ktusmiut_init_ut()+  call     ktusmout_online_ut(  B000 ? 89E57F8 ? 9 ?

                                                   BF8A96B8 ?
348                           )
ktuini()+518         call     ktusmiut_init_ut()   0 ? 0 ? 0 ? 0 ? 1 ? 0 ?
adbdrv()+46345       call     ktuini()             0 ? BF8A9DE0 ? 1004BF42 ?
                                                   BF8AA258 ? B6A0BFFC ?
                                                   10EA1F20 ?
opiexe()+16835       call     adbdrv()             25C8F1F8 ? 0 ? 0 ? 2A34F528 ?
                                                   2A0400C8 ? BF8AF07C ?
opiosq0()+3437       call     opiexe()             4 ? 0 ? BF8AFE8C ?
kpooprx()+239        call     opiosq0()            3 ? E ? BF8B0184 ? A4 ? 0 ?
kpoal8()+607         call     kpooprx()            BF8B2D6C ? BF8B10AC ? 1D ?
                                                   1 ? 0 ? A4 ?
opiodr()+962         call     00000000             5E ? 1C ? BF8B2D68 ?
ttcpip()+1930        call     00000000             5E ? 1C ? BF8B2D68 ? 0 ?
opitsk()+1355        call     ttcpip()             11616580 ? 5E ? BF8B2D68 ?
                                                   0 ? BF8B29F8 ? BF8B2E90 ?
                                                   FDEBA80 ? 0 ? BF8B2EBC ?
opiino()+827         call     opitsk()             0 ? 0 ?
opiodr()+962         call     00000000             3C ? 4 ? BF8B3E2C ?
opidrv()+479         call     opiodr()             3C ? 4 ? BF8B3E2C ? 0 ?
sou2o()+80           call     opidrv()             3C ? 4 ? BF8B3E2C ?
opimai_real()+109    call     sou2o()              BF8B3E10 ? 3C ? 4 ?
                                                   BF8B3E2C ?
ssthrdmain()+212     call     00000000             2 ? BF8B3F58 ? 0 ? 47DA6F14 ?
                                                   BF8B3F14 ? 47D9A670 ?
main()+147           call     ssthrdmain()         2 ? BF8B3F58 ?
__libc_start_main()  call     00000000             2 ? BF8B4054 ? BF8B4060 ?
+220                                               47D9A828 ? 0 ? 1 ?
_start()+33          call     __libc_start_main()  856F3B4 ? 2 ? BF8B4054 ?
                                                   BCC1EA0 ? BCC1E90 ?
                                                   47D8C790 ?


--------------------- Binary Stack Dump ---------------------

MOS中有类似描述ORA-600 [25025] [25] While Opening the Clone Database in Resetlog Mode (Doc ID 603100.1),该解决方案是重建控制文件增加所有数据文件,在本次测试中,我就是人为除掉了undo,模拟undo丢失[其实数据库已经resetlogs过了,就算加入undo重建控制文件也不会成功(人工修改undo文件头除外)],又做了不正确的重建控制文件操作的故障,我提供解决方案如下

解决办法

--参数文件修改
undo_management='manual'

--尝试open数据库
recover database;
alter database open;

--新建undo
create undo tablespace undo_new datafile '' size 100m autoextend on next 10m maxsize 30G;

--屏蔽需要恢复回滚段
select tablespace_name,segment_name,status from dba_rollback_segs;
_corrupted_rollback_segments

--重启数据库使得_corrupted_rollback_segments生效
shutdown immediate;
startup 

--删除老undo
drop tablespace old_undo

--修改参数
shutdonw immediate
undo_management='auto'
undo_tablespace='unod_new'

--启动数据库
startup

--导出数据,导入新库

姊妹篇:重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复

ORACLE db top 命令(oratop)

oracle 也推出来了数据库的top命令,叫做oratap,是使用c语言写的,目前只支持linux x86与x64的11gR2与12cR1
使用说明

[oracle@xifenfei tmp]$ ./oratap  -h

oratop: Release 13.2.4

Usage: 

         oratop [ [Options] [Logon] ]

         Logon:
                {username[/password][@connect_identifier] | / }
                [AS {SYSDBA|SYSOPER}]

         Options:
             -i : Interval Delay (requires value in seconds, default: 3s)
             -f : Long format for header & section 4 (default: 80 column)
             -r : IORL mode for Section 2 (default is IOPS)
             -d : Real-Time Top 5 Wait Events (default: Cumulative)
             -m : MODULE/ACTION mode for Section 4 (default: USER/PROGRAM_NAME)
             -b : Batch mode
             -n : maximum number of iterations (requires number)
             -h : Help

使用说明
1. Log in to the system as the Oracle RDBMS software installation owner
2. Stage the oratop executable on the server on which the tool will be executed. On a RAC system it is only necessary to stage the executable on one node as it is RAC aware.
3. Change the name of the executable, eg.

$ mv oratop* oratop

4. Validate the permissions for oratop are 755 (-rwxr-xr-x). If the permissions are not currently set to 755, set the permissions on oratop as follows:

$ chmod 755 oratop

5. Configure the execution environment as follows (if not already set accordingly)

$ export TERM=xterm #or vt100
$ export ORACLE_HOME=<11.2 database home>
$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib
$ export PATH=$ORACLE_HOME/bin:$PATH
$ export ORACLE_SID=<local 11.2 database SID to be monitored> #only needed if connecting to a local database

6. Invoke the tool to monitor a local database as follows (interval can be any interval in seconds):

$ ./oratop -i 10 / as sysdba

7. Non-dba privileged user requires specific grants to use the tool. See User Guide for grants required.
8. Databases can be monitored remotely using a tns alias as described in the oratop Users Guide .

$ ./oratop -i 10 username/password@tns_alias
$ ./oratop -i 10 system/manager@tns_alias

9. To exit the program simply press the keyboard key “q”. To abort, CTRL-C.
Troubleshooting:
If you receive an error similar to
./oratop: error while loading shared libraries: libclntsh.so.11.1: cannot open shared object file: No such file or directory that is an indication that libclntsh.so.11.1 was not found in the path defined in LD_LIBRARY_PATH similar to

$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib (ie., the location of libclntsh.so.11.1)

使用效果

[oracle@xifenfei tmp]$ ./oratap  -i 10 chf/xifenfei@ora11g_p

oratop: Release 13.2.4 Production on Mon Aug 26 03:10:17 2013

Copyright (c) 2011, Oracle.  All rights reserved.


Processing ...


oratop 1:  148 ora1 03:10:13 up  3.3m,  1 ins, 301M mt,   3 sn,  1 us,   6% db
ID %CU HLD MBPS IOPS %FR PGAU ASC ASI ASW ASP   AAS USN  TPS UCPS SSRT DBC DBW
 1   0   0    0    1   5 133M   1   0   0   0   0.1   3    0    2   0u   0   0 
                                                                               
EVENT     (Cumulative)           TOT WAITS  TIME(s)  AVG_MS PCT  WAIT_CLASS   
DB CPU                                          218          68               
control file parallel write           5535       30     5.5   9  System I/O   
db file sequential read              10080       27     2.7   8  User I/O     
os thread startup                      198       25   124.9   8  Concurrency  
db file async I/O submit              2575       21     8.0   7  System I/O   
                                                                              
ID   SID  SPID USR PROG  PGA OPN SQLID/BLOCKER  E/T STATUS STE WAIT_EVENT  W/T
 1   148  8254 CHF DEDI   2M SEL 4agz3g5aajkdc    0 ACTIVE CPU wa for cpu   0u

参考文档:
oratop – utility for near real-time monitoring of databases, RAC and Single Instance (Doc ID 1500864.1)

使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障

以前写过一篇乱用_allow_resetlogs_corruption参数导致悲剧的文章,昨天晚上又遇到一个朋友不谨慎使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
环境描述
系统环境:solaris
数据库版本:10.2.0.5.7
数据存储方式:ASM
数据量:15T以上
补充事宜:数据库SCN距离headroom只有54天

报ORA-00020错误,实例crash
数据库因为超过了系统的进程数,出现dbwn进程写数据文件异常

Sun Aug 25 16:00:41 CST 2013
Errors in file /opt/oracle/admin/orcl/bdump/orcl_dbw0_7490.trc:
ORA-01148: 无法刷新数据文件 22 的文件大小
ORA-01110: 数据文件 22: '+DATA/orcl/datafile/index_jh.dbf'
ORA-00020: 超出最大进程数 ()
Sun Aug 25 16:00:41 CST 2013
Errors in file /opt/oracle/admin/orcl/bdump/orcl_dbw0_7490.trc:
ORA-01242: 数据文件出现介质故障: 数据库处于 NOARCHIVELOG 模式
ORA-01110: 数据文件 22: '+DATA/orcl/datafile/index_jh.dbf'
Sun Aug 25 16:00:41 CST 2013
DBW0: terminating instance due to error 1242
Termination issued to instance processes. Waiting for the processes to exit
Sun Aug 25 16:00:51 CST 2013
Instance termination failed to kill one or more processes
Instance terminated by DBW0, pid = 7490

ORA-00600[kcbtema_10]
实例恢复出现ORA-00600: 内部错误代码, 参数: [kcbtema_10], [1], [], [], [], [], [], []

Sun Aug 25 19:19:23 CST 2013
ALTER DATABASE OPEN
Sun Aug 25 19:19:38 CST 2013
Beginning crash recovery of 1 threads
 parallel recovery started with 16 processes
Sun Aug 25 19:19:40 CST 2013
Started redo scan
Sun Aug 25 19:20:07 CST 2013
Completed redo scan
 12016413 redo blocks read, 93405 data blocks need recovery
Sun Aug 25 19:20:19 CST 2013
Started redo application at
 Thread 1: logseq 53681, block 1091966
Sun Aug 25 19:20:19 CST 2013
Recovery of Online Redo Log: Thread 1 Group 1 Seq 53681 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/redo_1_1.log
  Mem# 1: +DATA/orcl/onlinelog/redo_1_2.log
Sun Aug 25 19:20:21 CST 2013
Errors in file /opt/oracle/admin/orcl/bdump/orcl_p011_16944.trc:
ORA-00600: 内部错误代码, 参数: [kcbtema_10], [1], [], [], [], [], [], []
Sun Aug 25 19:20:23 CST 2013
Errors in file /opt/oracle/admin/orcl/bdump/orcl_p011_16944.trc:
ORA-00600: 内部错误代码, 参数: [kcbtema_10], [1], [], [], [], [], [], []
Sun Aug 25 19:20:23 CST 2013
Aborting crash recovery due to slave death, attempting serial crash recovery
Sun Aug 25 19:20:23 CST 2013
Beginning crash recovery of 1 threads
Sun Aug 25 19:20:23 CST 2013
Started redo scan
Sun Aug 25 19:20:47 CST 2013
Completed redo scan
 12016413 redo blocks read, 93405 data blocks need recovery
Sun Aug 25 19:20:54 CST 2013
Started redo application at
 Thread 1: logseq 53681, block 1091966
Sun Aug 25 19:20:54 CST 2013
Recovery of Online Redo Log: Thread 1 Group 1 Seq 53681 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/redo_1_1.log
  Mem# 1: +DATA/orcl/onlinelog/redo_1_2.log
Sun Aug 25 19:20:54 CST 2013
Errors in file /opt/oracle/admin/orcl/udump/orcl_ora_16751.trc:
ORA-00600: 内部错误代码, 参数: [kcbtema_10], [1], [], [], [], [], [], []
Sun Aug 25 19:20:56 CST 2013
Aborting crash recovery due to error 600
Sun Aug 25 19:20:56 CST 2013
Errors in file /opt/oracle/admin/orcl/udump/orcl_ora_16751.trc:
ORA-00600: 内部错误代码, 参数: [kcbtema_10], [1], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

使用隐含参数

ALTER SYSTEM SET _allow_resetlogs_corruption=TRUE SCOPE=SPFILE;

报ORA-00704/ORA-01555
因为在前面的恢复中进行了不完全恢复,因此这里加入隐含参数,然后尝试resetlogs,然后报如下错误

Sun Aug 25 20:11:54 CST 2013
alter database open resetlogs
Sun Aug 25 20:12:10 CST 2013
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 13429649847189
Resetting resetlogs activation ID 1312390734 (0x4e397e4e)
Sun Aug 25 20:16:25 CST 2013
Setting recovery target incarnation to 2
Sun Aug 25 20:16:42 CST 2013
************************************************************
Warning: The SCN headroom for this database is only 54 days!
************************************************************
Sun Aug 25 20:16:43 CST 2013
Assigning activation ID 1352200163 (0x5098efe3)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: +DATA/orcl/onlinelog/redo_1_1.log
  Current log# 1 seq# 1 mem# 1: +DATA/orcl/onlinelog/redo_1_2.log
Successful open of redo thread 1
Sun Aug 25 20:16:43 CST 2013
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Aug 25 20:16:52 CST 2013
SMON: enabling cache recovery
Sun Aug 25 20:16:52 CST 2013
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0c36.d582339b):
Sun Aug 25 20:16:52 CST 2013
select ctime, mtime, stime from obj$ where obj# = :1
Sun Aug 25 20:16:52 CST 2013
Errors in file /opt/oracle/admin/orcl/udump/orcl_ora_2859.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号 143 (名称为 "_SYSSMU143$") 过小
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Termination issued to instance processes. Waiting for the processes to exit
Sun Aug 25 20:17:02 CST 2013
Instance termination failed to kill one or more processes
Instance terminated by USER, pid = 2859
ORA-1092 signalled during: alter database open resetlogs...

数据库当前SCN

SQL > select CHECKPOINT_CHANGE# from v$database;

CHECKPOINT_CHANGE#
------------------
    13429649947222

SQL > select distinct CHECKPOINT_CHANGE# from v$datafile_header;

CHECKPOINT_CHANGE#
------------------
    13429649947222

解决方法
因为该数据库版本为10.2.0.5.7,已经包含了scn patch,因此不能使用event或者隐含参数来修改scn,而且该库容量15T以上(asm),因此也无法使用bbed修改数据文件头,最后决定使用ordebug来解决该问题
使用oradebug DUMPvar SGA kcsgscn_
使用oradebug poke 0x14A9EDC30 8 0xC36E583BA56

sqlplus / as sysdba
startup mount

oradebug setmypid
oradebug DUMPvar SGA kcsgscn_
oradebug poke 0x14A9EDC30 8 0xC36E583BA56

recover database;
alter database open;

事后总结
查询MOS,发现ORA-00600[kcbtema_10] Raised During Recovery Operations (Doc ID 472282.1)

--故障原因
The cause of this problem has been identified and verified in unpublished Bug 5184359 ORA-600 [KCBTEMA_10].
Due to this bug, during recovery, the class designation of a data block has changed.
--处理方法 
SQL>startup mount
SQL>recover database;
SQL>alter database open;

因为MOS上给的解决思路在该数据库中已经无法尝试,不能确定该方法一定可行,但是对于本次的恢复过程中,没有任何直接recover database操作(只有一次不完全恢复)确实让人有无限的遗憾和可惜。对于本次应该先查询MOS,尝试该种方法,慎重使用_allow_resetlogs_corruption参数

ORACLE 12C dbms_utility.expand_sql_text 查看SQL视图基表

在ORACLE 12C之间的版本,如果一条sql中包含了N多视图,我们要查询这条sql访问了哪些基表,以及他们的关联条件是一件比较麻烦的时候,需要进入一个个视图然后一个个去分析,纯体力活,到了12C之后,ORACLE 提供了dbms_utility.expand_sql_text来获得某条sql所对应的全部基表,不用我们再一级一级的去看.

dbms_utility.expand_sql_text


创建测试表
三个表分别来自数据库的v$datafile,v$tablespace,v$tempfile视图

CDB_PDB@CHF> create table datafile
  2  as
  3  select * from v$datafile;

表已创建。

CDB_PDB@CHF> create table tablespace
  2  as
  3  select * from v$tablespace;

表已创建。

CDB_PDB@CHF> create table tempfile
  2  as
  3  select * from v$tempfile;

表已创建。

创建视图
创建三个视图,datafile与tablespace,tempfile与tablespace,这两个视图然后做union all产生新视图

CDB_PDB@CHF> create view v_xifenfei1 as
  2  select d.name dname,t.name tname from datafile d,tablespace t where d.ts#=t.ts#;

视图已创建。

CDB_PDB@CHF> create view v_xifenfei2 as
  2  select d.name dname,t.name tname from tempfile d,tablespace t where d.ts#=t.ts#;

视图已创建。

CDB_PDB@CHF> create view v_xifenfei
  2  as
  3  select * from v_xifenfei1
  4  union all
  5  select * from v_xifenfei2;

视图已创建。

找出查询视图sql对应基表

CDB_PDB@CHF>  set linesize 32767 pagesize 0 serveroutput on
CDB_PDB@CHF>  declare
  2     original_sql clob :='select * from v_xifenfei';
  3     expanded_sql clob := empty_clob();
  4  begin
  5      dbms_utility.expand_sql_text(original_sql,expanded_sql);
  6      dbms_output.put_line(expanded_sql);
  7  end;
  8  /
SELECT "A1"."DNAME" "DNAME","A1"."TNAME" "TNAME" FROM  ( (SELECT "A4"."DNAME" "DNAME","A4"."TNAME" "TNAME" FROM  (SELECT
 "A6"."NAME" "DNAME","A5"."NAME" "TNAME" FROM CHF."DATAFILE" "A6",CHF."TABLESPACE" "A5" WHERE "A6"."TS#"="A5"."TS#") "A4
") UNION ALL  (SELECT "A3"."DNAME" "DNAME","A3"."TNAME" "TNAME" FROM  (SELECT "A8"."NAME" "DNAME","A7"."NAME" "TNAME" FR
OM CHF."TEMPFILE" "A8",CHF."TABLESPACE" "A7" WHERE "A8"."TS#"="A7"."TS#") "A3")) "A1"

PL/SQL 过程已成功完成。

格式化sql语句

/* Formatted on 2013/8/24 22:33:33 (QP5 v5.227.12220.39754) */
SELECT "A1"."DNAME" "DNAME", "A1"."TNAME" "TNAME"
  FROM ( (SELECT "A4"."DNAME" "DNAME", "A4"."TNAME" "TNAME"
            FROM (SELECT "A6"."NAME" "DNAME", "A5"."NAME" "TNAME"
                    FROM CHF."DATAFILE" "A6", CHF."TABLESPACE" "A5"
                   WHERE "A6"."TS#" = "A5"."TS#") "A4")
        UNION ALL
        (SELECT "A3"."DNAME" "DNAME", "A3"."TNAME" "TNAME"
           FROM (SELECT "A8"."NAME" "DNAME", "A7"."NAME" "TNAME"
                   FROM CHF."TEMPFILE" "A8", CHF."TABLESPACE" "A7"
                  WHERE "A8"."TS#" = "A7"."TS#") "A3")) "A1"

这里就非常清晰的看到是datafile与tablespace、tempfile与tablespace做union all的sql语句

ORACLE 12C Windows-Linux 部署DATAGURAD

环境描述
win 64中的ORACLE 12C(primary)与Linux 64中的ORACLE 12C(standby)搭建datagurad

primary force logging

C:\Users\XIFENFEI>sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on 星期六 8月 24 16:59:53 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

CDB_CDB$ROOT@SYS> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
PL/SQL Release 12.1.0.1.0 - Production
CORE    12.1.0.1.0      Production
TNS for 64-bit Windows: Version 12.1.0.1.0 - Production
NLSRTL Version 12.1.0.1.0 - Production

CDB_CDB$ROOT@SYS> alter database force logging ;

数据库已更改。

primary rman backup

backup filesperset = 5 as compressed backupset database format 'd:/backup/rman/full_%U.rman';

primary 生成standby controlfile

CDB_CDB$ROOT@SYS> ALTER DATABASE CREATE STANDBY CONTROLFILE AS 'c:/control01.ctl';

数据库已更改。

CDB_CDB$ROOT@SYS> create pfile='e:/pfile.txt' from spfile;

文件已创建。

standby 参数文件

DB_CREATE_FILE_DEST='+DATA'
db_create_online_log_dest_1='+DATA'
db_unique_name='cdb_dg'
service_names='cdb'
log_archive_dest_1='LOCATION=/u02/app/oracle/archivelog/ valid_for=(all_logfiles,all_roles)'
log_archive_dest_2='service=primary lgwr async valid_for=(online_logfiles,primary_role) db_unique_name=cdb'
log_archive_config='dg_config=(cdb,cdb_dg)'
standby_file_management=auto
db_file_name_convert='E:\APP\XIFENFEI\ORADATA\','+DATA\CDB_DG\DATAFILE\'
log_file_name_convert='E:\APP\XIFENFEI\ORADATA\','+DATA\CDB_DG\LOGFILE\'
fal_server=primary

primary 修改参数

CDB_CDB$ROOT@SYS> alter system set log_archive_config='dg_config=(cdb,cdb_dg)';

系统已更改。

CDB_CDB$ROOT@SYS> alter system set log_archive_dest_2='
 2 service=standby lgwr async valid_for=(online_logfiles,primary_role) db_unique_name=cdb_dg';

系统已更改。

CDB_CDB$ROOT@SYS> alter system set log_archive_dest_1=
  2 'LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles)';

系统已更改。

CDB_CDB$ROOT@SYS> alter system set fal_server=standby;

系统已更改。

CDB_CDB$ROOT@SYS> alter system set standby_file_management=auto;

系统已更改。

CDB_CDB$ROOT@SYS> alter system set db_file_name_convert='+DATA\CDB_DG\DATAFILE\','E:\APP\XIFENFEI\ORADATA\' scope=spfile;

系统已更改。

CDB_CDB$ROOT@SYS> alter system set log_file_name_convert='+DATA\CDB_DG\LOGFILE\','E:\APP\XIFENFEI\ORADATA\' scope=spfile;

系统已更改。

primary and standby tns

STANDBY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.30.32 )(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = cdb)
    )
  )

PRIMARY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.30.1 )(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = cdb)
    )
  )

standby restore controlfile

ASMCMD> cp /tmp/CONTROL01.CTL control01.ctl
copying /tmp/CONTROL01.CTL -> +data/cdb/control01.ctl

standby password file

[oracle@xifenfei dbs]$ pwd
/u01/app/oracle/product/12.1/db_1/dbs
[oracle@xifenfei dbs]$ cp /tmp/rman/PWDcdb.ora  orapwcdb

standby mount

SYS% cdb> create spfile from pfile='initcdb.ora';

File created.

SYS% cdb> startup mount;
ORACLE instance started.

Total System Global Area  521936896 bytes
Fixed Size                  2290264 bytes
Variable Size             314576296 bytes
Database Buffers          197132288 bytes
Redo Buffers                7938048 bytes
Database mounted.

SYS% cdb> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
PL/SQL Release 12.1.0.1.0 - Production
CORE    12.1.0.1.0      Production
TNS for Linux: Version 12.1.0.1.0 - Production
NLSRTL Version 12.1.0.1.0 - Production

SYS% cdb> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
E:\APP\XIFENFEI\ORADATA\CDB\SYSTEM01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\PDBSEED\SYSTEM01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\SYSAUX01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\PDBSEED\SYSAUX01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\UNDOTBS01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\USERS01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\PDB\SYSTEM01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\PDB\SYSAUX01.DBF
E:\APP\XIFENFEI\ORADATA\CDB\PDB\PDB_USERS01.DBF

9 rows selected.

standby rman restore

--清理控制文件中的备份集垃圾
DELETE noprompt OBSOLETE;
crosscheck backup;
delete noprompt expired backup;

--注册新备份集
catalog start with '/tmp/rman/';

--还原数据文件
run
{
set newname for database to '+data';
restore database;
switch datafile all;
}

standby clear redo

SYS% cdb> select group# from v$log;

    GROUP#
----------
         4
         6
         5

SYS% cdb> alter database clear logfile group 4;

Database altered.

SYS% cdb> alter database clear logfile group 5;

Database altered.

SYS% cdb> alter database clear logfile group 6;

Database altered.

standby add standby redolog

SYS% cdb> ALTER DATABASE ADD STANDBY LOGFILE GROUP 10  size 50M;

Database altered.

SYS% cdb> ALTER DATABASE ADD STANDBY LOGFILE GROUP 11  size 50M;

Database altered.

SYS% cdb> ALTER DATABASE ADD STANDBY LOGFILE GROUP 12  size 50M;

Database altered.

SYS% cdb> ALTER DATABASE ADD STANDBY LOGFILE GROUP 13  size 50M;

Database altered.

primary add standby redolog

CDB_CDB$ROOT@SYS> ALTER DATABASE ADD STANDBY LOGFILE GROUP 10  'E:\APP\XIFENFEI\ORADATA\CDB\std_redo10.log' size 50M;

数据库已更改。

CDB_CDB$ROOT@SYS> ALTER DATABASE ADD STANDBY LOGFILE GROUP 11  'E:\APP\XIFENFEI\ORADATA\CDB\std_redo11.log' size 50M;

数据库已更改。

CDB_CDB$ROOT@SYS> ALTER DATABASE ADD STANDBY LOGFILE GROUP 12  'E:\APP\XIFENFEI\ORADATA\CDB\std_redo12.log' size 50M;

数据库已更改。

CDB_CDB$ROOT@SYS> ALTER DATABASE ADD STANDBY LOGFILE GROUP 13  'E:\APP\XIFENFEI\ORADATA\CDB\std_redo13.log' size 50M;

数据库已更改。

standby readonly

SYS% cdb> ALTER DATABASE OPEN READ ONLY;

Database altered.

standby start mrp

SYS% cdb> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

Database altered.

DATAGURAD 是否正常检查

--primary
CDB_CDB$ROOT@SYS> archive log list;
数据库日志模式            存档模式
自动存档             启用
存档终点            USE_DB_RECOVERY_FILE_DEST
最早的联机日志序列     374
下一个存档日志序列   376
当前日志序列           376

--standby 
Mon Aug 12 13:56:51 2013
All non-current ORLs have been archived.
Mon Aug 12 13:56:53 2013
Media Recovery Log /u02/app/oracle/archivelog/1_370_820595806.dbf
Mon Aug 12 13:56:57 2013
Media Recovery Log /u02/app/oracle/archivelog/1_371_820595806.dbf
Mon Aug 12 13:57:02 2013
Media Recovery Log /u02/app/oracle/archivelog/1_372_820595806.dbf
Mon Aug 12 13:57:04 2013
Media Recovery Log /u02/app/oracle/archivelog/1_373_820595806.dbf
Mon Aug 12 13:57:05 2013
Media Recovery Log /u02/app/oracle/archivelog/1_374_820595806.dbf
Media Recovery Waiting for thread 1 sequence 375
Mon Aug 12 13:57:19 2013
Primary database is in MAXIMUM PERFORMANCE mode
RFS[2]: Assigned to RFS process (PID:26114)
RFS[2]: No standby redo logfiles created for thread 1 
RFS[2]: Opened log for thread 1 sequence 376 dbid 1937199326 branch 820595806
Mon Aug 12 13:57:19 2013
RFS[3]: Assigned to RFS process (PID:26118)
RFS[3]: Opened log for thread 1 sequence 375 dbid 1937199326 branch 820595806
Mon Aug 12 13:57:19 2013
Archived Log entry 16 added for thread 1 sequence 375 rlc 820595806 ID 0x7377d8de dest 2:
Mon Aug 12 13:57:22 2013
Media Recovery Log /u02/app/oracle/archivelog/1_375_820595806.dbf
Media Recovery Waiting for thread 1 sequence 376 (in transit)