ORACLE system 坏块恢复—C_TS$

Posted on 2014 年 11 月 22 日 by 惜分飞

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread 
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread 
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL 
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel  
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel  

Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel  
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL 
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level 
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level 
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel  
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel  
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856
1
<strong>中间多次重启和resetlogs，还出现ORA-600 2663错误</strong>
1
Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程

dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf

DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv: 
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0

Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720


DBVERIFY - Verification complete

Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS$,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext, 
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags, 
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2 
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >     
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2]，明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit

BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf

BBED> set block 8192
        BLOCK#          8192

BBED> set block 60
        BLOCK#          60

BBED> set count 64
        COUNT           64

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)


BBED> set block 61
        BLOCK#          61

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 72 bytes                     @20      

 struct kdbh, 14 bytes                      @92      

 struct kdbt[3], 12 bytes                   @106     

 sb2 kdbr[2]                                @118     

 ub1 freespace[7959]                        @122     

 ub1 rowdata[107]                           @8081    

 ub4 tailchk                                @8188    


BBED> p kcbh
struct kcbh, 20 bytes                       @0       
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000

BBED> set block 60
        BLOCK#          60

BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204 
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000 

 <32 bytes per line>

BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837 
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000 

 <32 bytes per line>

BBED> set block 60
        BLOCK#          60

BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204 
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000 

 <32 bytes per line>

BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 72 bytes                     @20      

 struct kdbh, 14 bytes                      @92      

 struct kdbt[3], 12 bytes                   @106     

 sb2 kdbr[2]                                @118     

 ub1 freespace[7598]                        @122     

 ub1 rowdata[468]                           @7720    

 ub4 tailchk                                @8188    


BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908

BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;

NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents; 
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv: 
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0

Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101


DBVERIFY - Verification complete

Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc

Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库

rm -rf delete datafile recovery

Posted on 2014 年 11 月 18 日 by 惜分飞

因为人员离职闹得不愉快,系统工程师离职后,由于公司未及时关闭其vpn,数据库服务器(Linux 6.5 Oracle 11.2.0.1)帐号未及时被修改,最后直接上去rm ORACLE_BASE给干掉,悲剧的是ORADATA目录也在里面,更加悲剧的是所有数据文件都在里面.也就是说数据库彻底被删除,而且没有任何备份.朋友咨询了我,让我给予支持.最后比较幸运,文件没有被覆盖,inode都还在,通过extundelete顺利恢复所有数据文件,控制文件,redo文件(extundelete恢复Linux被删除文件),数据库顺利打开,实现0丢失,算是一次完美的恢复

[root@DB1 tmp]# tar xvf extundelete-0.2.4.tar 
extundelete-0.2.4/
extundelete-0.2.4/acinclude.m4
extundelete-0.2.4/missing
extundelete-0.2.4/autogen.sh
extundelete-0.2.4/aclocal.m4
extundelete-0.2.4/configure
extundelete-0.2.4/LICENSE
extundelete-0.2.4/README
extundelete-0.2.4/install-sh
extundelete-0.2.4/config.h.in
extundelete-0.2.4/src/
extundelete-0.2.4/src/extundelete.cc
extundelete-0.2.4/src/block.h
extundelete-0.2.4/src/kernel-jbd.h
extundelete-0.2.4/src/insertionops.cc
extundelete-0.2.4/src/block.c
extundelete-0.2.4/src/cli.cc
extundelete-0.2.4/src/extundelete-priv.h
extundelete-0.2.4/src/extundelete.h
extundelete-0.2.4/src/jfs_compat.h
extundelete-0.2.4/src/Makefile.in
extundelete-0.2.4/src/Makefile.am
extundelete-0.2.4/configure.ac
extundelete-0.2.4/depcomp
extundelete-0.2.4/Makefile.in
extundelete-0.2.4/Makefile.am
[root@DB1 tmp]# cd extundelete-0.2.4
[root@DB1 extundelete-0.2.4]# ./configure 
Configuring extundelete 0.2.4
Writing generated files to disk
[root@DB1 extundelete-0.2.4]# make && make install
make -s all-recursive
Making all in src
Making install in src
  /usr/bin/install -c extundelete '/usr/local/bin'
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
/dev/sdb1       2.0T   71M  1.9T   1% /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
umount: /home: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                3914c(oracle)  8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 3914
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
/home:                8372c(oracle)
[root@DB1 extundelete-0.2.4]# kill -9 8372
[root@DB1 extundelete-0.2.4]# fuser -m -u /home
[root@DB1 extundelete-0.2.4]# umount /dev/sdb1
[root@DB1 extundelete-0.2.4]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       244G   11G  221G   5% /
tmpfs            16G   72K   16G   1% /dev/shm
/dev/sda1       190M   62M  119M  35% /boot
[root@DB1 extundelete-0.2.4]# extundelete /dev/sdb1 --restore-all
NOTICE: Extended attributes are not restored.
Loading filesystem metadata ... 16384 groups loaded.
Loading journal descriptors ... 26542 descriptors loaded.
Searching for recoverable inodes in directory / ... 
18896 recoverable inodes found.
Looking through the directory structure for deleted files ... 
2 recoverable inodes still lost.
Unable to restore inode 43778050 (file.43778050): Space has been reallocated.
[root@DB1 extundelete-0.2.4]# ls
acinclude.m4  autogen.sh  config.h.in  config.status  configure.ac  install-sh  Makefile     Makefile.in 
aclocal.m4    config.h    config.log   configure      depcomp       LICENSE     Makefile.am  missing    
[root@DB1 extundelete-0.2.4]# cd RECOVERED_FILES/
[root@DB1 RECOVERED_FILES]# ls
app  file.43778051  oracle  oraInventory
[root@DB1 RECOVERED_FILES]# cd app
[root@DB1 app]# ls
admin  cfgtoollogs  diag  oracle  oradata  orcl  ORCL
[root@DB1 app]# cd oradata
[root@DB1 oradata]# ls
orcl
[root@DB1 oradata]# cd orcl
[root@DB1 orcl]# ls
control01.ctl  redo01.log  redo02.log  redo03.log  sysaux01.dbf  system01.dbf  undotbs01.dbf  users01.dbf
[root@DB1 orcl]# ls -ltr
total 2908776
-rw-r--r--. 1 root root  734011392 Nov 18 02:06 system01.dbf
-rw-r--r--. 1 root root 1069555712 Nov 18 02:06 sysaux01.dbf
-rw-r--r--. 1 root root  120594432 Nov 18 02:06 undotbs01.dbf
-rw-r--r--. 1 root root  887365632 Nov 18 02:06 users01.dbf
-rw-r--r--. 1 root root    9748480 Nov 18 02:06 control01.ctl
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo01.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo02.log
-rw-r--r--. 1 root root   52429312 Nov 18 02:06 redo03.log
[root@DB1 orcl]#

再次提醒各位:数据库备份重于一切,防天灾的同时还要防人灾,也希望圈子里面以后不要听到类似故障.

惜分飞blog—Oracle异常恢复汇总

Posted on 2014 年 11 月 10 日 by 惜分飞

在2014年11月11日来临之际,我整理Blog中和异常恢复案例相关的部分文章,供大家参考:
dul处理分区表
 误删除dual表恢复
 dul恢复drop表测试
 跳过obj$坏块方法
 bbed解决ORA-01190
exp dmp文件损坏恢复
 当前联机日志损坏恢复
 ORA-01578坏块解决(1)
ORA-01578坏块解决(2)
undo异常处理步骤（9i）
DUL挖ORACLE 8.0数据库
 undo异常处理步骤（10g）
ORA-600 2663 故障恢复
 dul恢复truncate表测试
 bbed处理ORA-01200故障
 dul 10支持oracle 11g r2
使用 dul 挖数据文件初试
 sysaux数据文件异常恢复
 ORA-01244/ORA-01110解决
 恢复被rm意外删除数据文件
 ORA-00600[4194]故障解决
 ORA-01207/ORA-00338恢复
 DUL10直接支持ORACLE 8.0
bbed修改undo$(回滚段)状态
 记录8.0.5数据库恢复过程
 ORA-600[4194]/[4193]解决
 使用rman找回被误删除表空间
 记录一次系统回滚段坏块恢复
 使用bbed解决ORA-00600[2662]
ORA-00600[kcfrbd_3]故障解决
 数据库启动ORA-08103故障恢复
 asm disk header 彻底损坏恢复
 数据库恢复遭遇ORA-00600[3705]
Oracle 11g丢失access$恢复方法
 undo segment header坏块异常恢复
 ORA-600 kghstack_free2异常恢复
 ORA-00600[kccpb_sanity_check_2]
重建控制文件引发ORA-00218故障
 dul支持ORACLE 12C CDB数据库恢复
 ORACLE 8.0.5 ORA-01207故障恢复
 dul 10 export_mode=true功能增强
 完美解决dul处理clob字段乱码问题
 手工修复ASM DISK HEADER 异常
 undo坏块导致数据库异常终止案
 bbed 恢复 GLOBAL_NAME 为空故障
 通过bbed解决ORA-00600[4000]案例
 重建控制文件丢失数据文件导致悲剧
 异常断电导致current redo损坏处理
 创建控制文件遭遇ORA-600 kccscf_1
使用bbed修复损坏datafile header
通过修改控制文件scn推进数据库scn
bbed打开丢失部分system数据文件库
 某集团ebs数据库redo undo丢失导致悲剧
 obj$坏块exp/expdp导出不能正常执行
 处理fast_recovery_area无剩余空间案例
 obj$坏块情况下exp导出单个表解决方案
 ORA-00600 [ktbdchk1: bad dscn] 解决
 记录因磁盘头被重写,抢救redo恢复经历
 rac redo log file被意外覆盖数据库恢复
 使用bbed让rac中的sysaux数据文件online
通过bbed修改回滚段状态解决ORA-00704故障
 处理smon清理临时段导致数据库异常案例
 使用DUL挖数据文件恢复非数据外对象方法
 一次侥幸的OSD-04016 O/S-Error异常恢复
 记录一次ORA-600 4000数据库故障恢复
 一起ORA-600 3020故障恢复的大体思路
 redo异常 ORA-600 kclchkblk_4 故障恢复
 Oracle 12C的第一次异常恢复—文件头坏块
 数据库启动报ORA-00704 ORA-39714错误解决
 ORACLE 8.1.7 数据库ORA-600 4000故障恢复
 数据库报ORA-00607/ORA-00600[4194]错误
 创建控制文件遭遇ORA-00600[3753]故障解决
 ORA-00600[kcbshlc_1]导致数据库 down 案例
 ORACLE 8.1.7 数据库ORA-600 4194故障恢复
 ORA-00600[kcrf_resilver_log_1]异常恢复
 控制文件异常导致ORA-00600[kccsbck_first]
分享一次ORA-01113 ORA-01110故障处理过程
 记录一次ORA-600 3004 恢复过程和处理思路
 Oracle安全警示录:加错裸设备导致redo异常
 ORA-600 kcratr_nab_less_than_odr故障解决
 双机mount数据库出现ORA-00600[kccsbck_first]
又一起存储故障导致ORA-00333 ORA-00312恢复
 通过bbed模拟ORA-00607/ORA-00600 4194 故障
 记录一次ORA-00316 ORA-00312 redo异常恢复
 asmlib异常报ORA-00600[kfklLibFetchNext00]
某个pdb可以在root pdb open状态下进行恢复
 使用bbed解决ORA-00607/ORA-00600[4194]故障
 乱用_allow_resetlogs_corruption参数导致悲剧
 遭遇ORA-07445[kkdliac()+346]使用odu抢救数据
 ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O
创建控制文件出现ORA-01565 ORA-27041 OSD-04002
记录一次system表空间坏块(ORA-01578)数据库恢复
 模拟基表事务未提交数据库crash,undo丢失恢复异常恢复
 重建控制文件丢失undo异常恢复—ORA-01173模拟与恢复
 修改props$.NLS_CHARACTERSET导致ORA-00900异常恢复
 ORA-600[2037]与ORA-07445[kcbs_dump_adv_state]错误
 误drop tablespace后使用flashback database闪回异常处理
 数据库恢复历史再次刷新到Oracle 7.3.2版本—redo异常恢复
 ORA-27086: skgfglk: unable to lock file – already in use
ORACLE 12C ORA-07445[ktuHistRecUsegCrtMain()+1173]恢复
 ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复
 表空间online出现ORA-00600[kcbz_check_objd_typ]处理过程
 _allow_resetlogs_corruption和adjust_scn解决ORA-01190
spfile被覆盖导致ORA-600[kmgs_parameter_update_timeout_1]
重建控制文件丢失undo异常恢复—ORA-600 25025模拟与恢复
 使用_allow_resetlogs_corruption打开无归档日志rman备份库
 使用_allow_resetlogs_corruption导致ORA-00704/ORA-01555故障
 记录一次ORA-600 kccpb_sanity_check_2和ORA-600 kcbgtcr_13 错误恢复
 ORA-00600[17182],ORA-00600[25027],ORA-00600[kghfrempty:ds]故障处理
 因RAC的undo_management参数不一致导致数据库mount报ORA-01105 ORA-01606
使用bbed解决ORA-01178 file N created before last CREATE CONTROLFILE, cannot recreate
通过多次resetlogs规避类似ORA-01248: file N was created in the future of incomplete recovery错误
 bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决
 记录一次ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00600[4000]/ORA-00600[4193]的数据库恢复

ORA-600 kclchkblk_4 故障恢复

Posted on 2014 年 11 月 03 日 by 惜分飞

朋友和我说,他们数据库由于存储控制器异常,导致数据库无法正常启动.
数据库recover database 提示需要已经覆盖的redo

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:23:35 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, Data Mining and Real Application Testing options

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread
2
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

SQL> select thread#,group#,sequence# from v$log;

   THREAD#     GROUP#  SEQUENCE#
---------- ---------- ----------
         1          1      30261
         1          2      30262
         1          3      30258
         1          4      30259
         1          5      30260
         2          6      29966
         2          7      29967
         2          8      29968
         2          9      29969
         2         10      29970

10 rows selected.

SQL> select group#,member from v$logfile;

    GROUP# MEMBER
---------- --------------------------------------------------
         1 /dev/md/oradg/rdsk/d115
         2 /dev/md/oradg/rdsk/d116
         3 /dev/md/oradg/rdsk/d117
         4 /dev/md/oradg/rdsk/d118
         5 /dev/md/oradg/rdsk/d119
         6 /dev/md/oradg/rdsk/d120
         7 /dev/md/oradg/rdsk/d121
         8 /dev/md/oradg/rdsk/d122
         9 /dev/md/oradg/rdsk/d123
        10 /dev/md/oradg/rdsk/d124

10 rows selected.

SQL> recover database;
ORA-00279: change 1639063379 generated at 11/01/2014 12:06:33 needed for thread 2
ORA-00289: suggestion : /export/home/oracle/product/10.2.0/db_1/dbs/arch2_29919_790965041.dbf
ORA-00280: change 1639063379 for thread 2 is in sequence #29919


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.

数据库没归档，redo覆盖，无法继续恢复,使用_allow_resetlogs_corruption屏蔽前滚,继续恢复

SQL> startup mount pfile='/tmp/pfile.txt';
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> recover database until cancel;          
ORA-00279: change 1639063379 generated at 11/01/2014 12:58:20 needed for thread
1
ORA-00289: suggestion :
/export/home/oracle/product/10.2.0/db_1/dbs/arch1_30262_790965041.dbf
ORA-00280: change 1639063379 for thread 1 is in sequence #30262


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/dev/md/oradg/rdsk/d105'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

查看alert日志

Mon Nov 03 17:37:11 CST 2014
Completed crash recovery at
 Thread 2: logseq 1, block 3, scn 1639083385
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Picked broadcast on commit scheme to generate SCNs
Mon Nov 03 17:37:12 CST 2014
Thread 2 advanced to log sequence 2 (thread open)
Thread 2 opened at log sequence 2
  Current log# 7 seq# 2 mem# 0: /dev/md/oradg/rdsk/d121
Successful open of redo thread 2
Mon Nov 03 17:37:12 CST 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Nov 03 17:37:13 CST 2014
SMON: enabling cache recovery
Mon Nov 03 17:37:13 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Errors in file /export/home/oracle/admin/iesdb/udump/iesdb2_ora_15700.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [1640589405], [0], [1639117006], [], [], []
Mon Nov 03 17:37:14 CST 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 15700
ORA-1092 signalled during: alter database open...

参考:In 10.1.0.2: ORA-600 [kclchkblk_4] and ORA-600 [2662] After Recovery of Database (Doc ID 275902.1),删除tempfile继续打开数据库

SQL> alter database tempfile '/dev/md/oradg/rdsk/d109' drop;

Database altered.

SQL> c/109/110
  1* alter database tempfile '/dev/md/oradg/rdsk/d110' drop
SQL> /

Database altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced

数据库依然报ORA-00600kclchkblk_4]错误,因此解决该问题选择使用bbed修改文件头scn来完成,具体参考类似文章:使用bbed修复损坏datafile header

-bash-3.2$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Nov 3 17:41:17 2014

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.txt' mount; 
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
SQL> alter database open;

Database altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d109' size 8388608000 autoextend off;

Tablespace altered.

SQL> alter tablespace temp add tempfile '/dev/md/oradg/rdsk/d110' size 8388608000 autoextend off;

Tablespace altered.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 2097152000 bytes
Fixed Size                  2053120 bytes
Variable Size            1090522112 bytes
Database Buffers          855638016 bytes
Redo Buffers              148938752 bytes
Database mounted.
Database opened.

至此数据库恢复完成,建议逻辑方式重建数据库

记录一次Oracle 12C 文件头损坏恢复

Posted on 2014 年 11 月 03 日 by 惜分飞

接到第一个使用Oracle 12C作为生产库的恢复救援.有两个业务数据文件报文件头损坏,其他数据文件全部是9月份的一次备份,在当前的条件下,希望我们能够帮他们恢复出来业务文件中的数据
数据库版本信息

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

数据库故障
具体脚本请参考:数据库恢复检查脚本(Oracle Database Recovery Check)
控制文件信息

控制文件中关于数据文件信息

数据文件头信息

alert日志报错

Reading datafile '/app/oracle/oradata/freetouch/sales.dbf' for corruption at rdba: 0x00000001 (file 4, block 1)
Reread (file 4, block 1) found same corrupt data (no logical check)
Hex dump of (file 5, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc

Corrupt block relative dba: 0x00000001 (file 5, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

Reading datafile '/app/oracle/oradata/freetouch/drp_200200' for corruption at rdba: 0x00000001 (file 5, block 1)
Reread (file 5, block 1) found same corrupt data (no logical check)
Hex dump of (file 4, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc

Corrupt block relative dba: 0x00000001 (file 4, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

dbv 检测

[oracle@db trace]$ dbv file=/app/oracle/oradata/freetouch/drp_200200

DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 14:08:34 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/drp_200200


DBVERIFY - Verification complete

Total Pages Examined         : 194560
Total Pages Processed (Data) : 114596
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 26198
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 37787
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 15979
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 129603862 (0.129603862)

[oracle@db ~]$ dbv file=/app/oracle/oradata/freetouch/sales.dbf

DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 23:12:05 2014

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/sales.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 655360
Total Pages Processed (Data) : 294938
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 233404
Total Pages Failing   (Index): 0
Total Pages Processed (Lob)  : 38
Total Pages Failing   (Lob)  : 0
Total Pages Processed (Other): 23252
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 103728
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 134665298 (0.134665298)

dbv检测结果无坏块,但是v$datafile_header和alert日志中报坏块,初步判断是由于该文件是bigfile,dbv未检测到文件头坏块,实际该该数据文件头损坏，其他block正常.所幸的是该库有9月份的rman备份(中间归档丢失),因此使用rman还原出来9月份的数据文件,然后使用dd拷贝两个 block(block 0和block 1)到异常文件.

[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 of=/tmp/odu/sales.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0176368 s, 929 kB/s
[root@db freetouch]# dd if=/tmp/odu/sales.2 of=/app/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 conv=notrunc 
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4281e-05 s, 255 MB/s
[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 of=/tmp/odu/drp_200200.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0185934 s, 881 kB/s
[root@db freetouch]# dd if=/tmp/odu/drp_200200.2 of=/app/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 conv=notrunc 
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4419e-05 s, 254 MB/s

尝试恢复数据库

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 4 belongs to an orphan incarnation
ORA-01110: data file 4: '/app/oracle/oradata/freetouch/sales.dbf'

使用bbed修改相关文件头,然后继续恢复
具体见：bbed解决ORA-01190类似方法处理

SQL> recover database using backup controlfile;
ORA-00279: change 129603904 generated at 11/02/2014 19:19:54 needed for thread
1
ORA-00289: suggestion :
/app/oracle/recovery_area/VALUENET/archivelog/2014_11_02/o1_mf_1_1_%u_.arc
ORA-00280: change 129603904 for thread 1 is in sequence #1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel;
ORA-00308: cannot open archived log 'cancel;'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [129603911], [0],
[129603913], [29360256], [], [], [], [], [], []
Process ID: 19881
Session ID: 1 Serial number: 3

出现ORA-600[2662]错误,因为scn相差比较小，重启数据库机器,出现ORA-600[4194]错误

SQL> startup pfile='/tmp/pfile.txt' mount
ORACLE instance started.

Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/app/oracle/oradata/freetouch/system01.dbf'


SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件后继续恢复

SQL> alter database open resetlogs;    
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [46], [19], [], [], [], [],
[], [], [], [], []
Process ID: 20351
Session ID: 1 Serial number: 3

设置undo_management=MANUAL然后继续恢复

[oracle@db tmp]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Sun Nov 2 19:29:45 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/pfile.txt'
ORACLE instance started.

Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
Database opened.

这次的恢复也证明Oracle 12C确实有着越来越多的用户在使用.