记录一次oracle现场故障处理经过

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:记录一次oracle现场故障处理经过

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

近期到现场进行了一个数据库恢复,我在恢复之前该库先由于硬件进行恢复,然后由其他人对其进行了一系列数据库恢复,但是未恢复成功,客户希望我们到现场进行处理(因为网络原因无法远程).接手库之后,处理第一个问题,是客户在进行现场备份的时候(把linux数据拷贝到win的过程中)发现有几个文件拷贝异常,这个错误很可能是由于当初的硬件故障修复之后留下的后遗症(由于io设备错误,无法运行此项请求),通过工具进行拷贝,恢复出来
20210403210131


DUL> copy file from  /oradata2/xifenfeidata.dbf to /oradata2/xifenfeidata.dbf

starting copy datafile '/oradata1/xifenfeidata.dbf' to '/oradata2/xifenfeidata.dbf'
read data error from file '/oradata1/xifenfeidata.dbf'.error message:Input/output error
read block# error: 560171
read data error from file '/oradata1/xifenfeidata.dbf'.error message:Input/output error
read block# error: 560179
datafile copy completed with 2 block error.
[oracle@localhost ~]$ dbv file=/oradata2/xifenfeidata.dbf blocksize=16384

DBVERIFY: Release 11.2.0.3.0 - Production on Mon Mar 29 17:28:17 2021

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata2/xifenfeidata.dbf
Page 560171 is marked corrupt
Corrupt block relative dba: 0x3bc88c2b (file 239, block 560171)
Completely zero block found during dbv: 

Page 560179 is marked corrupt
Corrupt block relative dba: 0x3bc88c33 (file 239, block 560179)
Completely zero block found during dbv: 



DBVERIFY - Verification complete

Total Pages Examined         : 4194302
Total Pages Processed (Data) : 2230726
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 1936953
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 26618
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 3
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 304929867 (106.304929867)

修复完相关无法拷贝文件之后,启动数据库报控制文件异常

Mon Mar 29 15:03:38 2021
alter database mount
USER (ospid: 29044): terminating the instance
Mon Mar 29 15:03:42 2021
System state dump requested by (instance=1, osid=29044), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff/trace/xff_diag_28961.trc
Instance terminated by USER, pid = 29044

尝试重建ctl

[oracle@localhost ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Mar 29 17:40:17 2021

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1.7704E+10 bytes
Fixed Size                  2235568 bytes
Variable Size            2348811088 bytes
Database Buffers         1.5301E+10 bytes
Redo Buffers               52580352 bytes
SQL> @/tmp/ctl.sql
CREATE CONTROLFILE REUSE DATABASE xff NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 249: '/oradata/xff/system03.dbf'

初步判断是由于对方之前恢复导致部分文件resetlogs scn异常,通过bbed进行判断确认

BBED> set file 1
        FILE#           1

BBED> p kcvfhrls
struct kcvfhrls, 8 bytes                    @116     
   ub4 kscnbas                              @116      0x00000001
   ub2 kscnwrp                              @120      0x0000

BBED> set file 249
        FILE#           249

BBED> p kcvfhrls
struct kcvfhrls, 8 bytes                    @116     
   ub4 kscnbas                              @116      0x00000001
   ub2 kscnwrp                              @120      0x0000

通过bbed修改相关值,然后重建控制文件成功,尝试resetlogs库,报ORA-01248错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 234 was created in the future of incomplete recovery
ORA-01110: data file 234: '/oradata1/xifenfeidata5.DBF'

关于ORA-01248的错误解释

01248, 00000, "file %s was created in the future of incomplete recovery"
// *Cause:  Attempting to do a RESETLOGS open with a file entry in the
//          control file that was originally created after the UNTIL time 
//          of the incomplete recovery.
//          Allowing such an entry may hide the version of the file that 
//          is needed at this time.  The file number may be in use for 
//          a different file which would be lost if the RESETLOGS was allowed.
// *Action: If more recovery is desired then apply redo until the creation
//          time of the file is reached. If the file is not wanted and the
//          same file number is not in use at the stop time of the recovery,
//          then the file can be taken offline with the FOR DROP option.
//          Otherwise a different control file is needed to allow the RESETLOGS.
//          Another backup can be restored and recovered, or a control file can
//          be created via CREATE CONTROLFILE.

大概的意思是文件的创建时间大于文件当前的scn,通过查询确实如此

SQL> select file#,CREATION_CHANGE#,CREATION_TIME from v$datafile_header where file#=234;

           FILE# CREATION_CHANGE# CREATION_
---------------- ---------------- ---------
             234     419298664864 02-AUG-19

SQL> SELECT status,  
  2  to_char(checkpoint_change#,'9999999999999999') "SCN",
  3  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,
  4  count(*) ROW_NUM
  5  FROM v$datafile_header
  6  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  SCN               CHECKPOINT_TIME     FUZ          ROW_NUM
------- ----------------- ------------------- --- ----------------
ONLINE       417750848223 2021-02-23 23:50:46 YES                7
ONLINE       417750848223 2021-03-21 11:44:25 NO               396

通过对部分scn进行修改(比如减小创建时间的scn),然后尝试resetlogs库

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 5 with name
"_SYSSMU5_2708889888$" too small
Process ID: 3182
Session ID: 1 Serial number: 3

这个错误比较简单,参考以前的部分文章:在数据库open过程中常遇到ORA-01555汇总数据库open过程遭遇ORA-1555对应sql语句补充,处理之后,数据库open成功

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1.7704E+10 bytes
Fixed Size                  2235568 bytes
Variable Size            2348811088 bytes
Database Buffers         1.5301E+10 bytes
Redo Buffers               52580352 bytes
Database mounted.
SQL> alter database open;

Database altered.

本次数据库恢复基本上完成,已经最大限度恢复数据,导出数据到新库,完成恢复任务

Oracle Recovery Tools 解决ORA-01190 ORA-01248等故障

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools 解决ORA-01190 ORA-01248等故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天有一个客户数据库恢复请求,通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本分析发现resetlog信息异常
20210106182747


导致数据库恢复报ORA-01190 ORA-01110错

alter database open
Errors in file c:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_4404.trc:
ORA-01190: 控制文件或数据文件 1 来自最后一个 RESETLOGS 之前
ORA-01110: 数据文件 1: 'C:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1190 signalled during: alter database open...

通过Oracle Recovery Tools工具进行修复resetlog 信息
20210106183450


再次尝试open数据库报ORA-1248错

SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01248: ?? 44 ????????????
ORA-01110: ???? 44: 'E:\ORADATA\ORCL\XIFENFEI.DBF'

Wed Jan 06 14:44:44 2021
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1248 signalled during: alter database open resetlogs...

再次通过Oracle Recovery Tools进行修复SCN,数据库open成功

T:\xff>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期三 1月 6 14:47:36 2021

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup mount 
ORACLE 例程已经启动。

Total System Global Area 6.9214E+10 bytes
Fixed Size                  2182712 bytes
Variable Size            3.5165E+10 bytes
Database Buffers         3.3823E+10 bytes
Redo Buffers              224296960 bytes
数据库装载完毕。
SQL>
SQL>
SQL> alter database open;

数据库已更改。

通过多次resetlogs规避类似ORA-01248: file N was created in the future of incomplete recovery错误

数据库现状
控制文件
recover_xifenfei0
控制文件中数据文件信息
recover_xifenfei1
数据文件头信息
recover_xifenfei2
redo信息
recover_xifenfei3
根据当前数据库恢复检查脚本(Oracle Database Recovery Check)收集的信息,数据库的是非归档状态,而且redo已经覆盖,数据库datafile 5 无法直接online.遇到这样情况,可以使用bbed修改文件头scn实现online(使用bbed让rac中的sysaux数据文件online),也可以通过使用_allow_resetlogs_corruption等隐含参数实现online.本恢复案例中有180个数据文件,160个offline,然后open数据库,所以大量数据文件无法正常online,bbed工作量太大.在恢复过程中不幸遇到ORA-01248

数据库resetlogs出现ORA-01248错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01248: file 5 was created in the future of incomplete recovery
ORA-01110: data file 5: 'F:\TTDATA\PUBRTS.DAT'

alert日志记录

Fri Oct 10 15:09:26 2014
alter database open resetlogs
Fri Oct 10 15:09:26 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1248 signalled during: alter database open resetlogs...
Fri Oct 10 15:15:22 2014
alter database open
Fri Oct 10 15:15:22 2014
ORA-1589 signalled during: alter database open...
Fri Oct 10 15:15:30 2014
alter database  open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1248 signalled during: alter database  open resetlogs...

尝试offline文件然后resetlogs

SQL>ALTER DATABASE DATAFILE 5  OFFLINE;

Database altered.


sql>ALTER DATABASE OPEN RESETLOGS;

ERROR at line 1:
ORA-01245: ffline file 5 will be lost if resetlogs is done
ORA-01110: data file 5: 'F:\TTDATA\PUBRTS.DAT'

alert日志

Fri Oct 10 15:19:37 2014
ALTER DATABASE DATAFILE 5 offline
Fri Oct 10 15:19:37 2014
Completed: ALTER DATABASE DATAFILE 5 offline
Fri Oct 10 15:19:40 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...

出现该错误原因是由于数据库是非归档模式,offline数据文件需要使用offline drop

Fri Oct 10 15:22:16 2014
alter database datafile 5 offline drop
Fri Oct 10 15:22:17 2014
Completed: alter database datafile 5 offline drop
Fri Oct 10 15:23:13 2014
alter database open resetlogs
Fri Oct 10 15:23:14 2014
Fri Oct 10 15:23:49 2014
RESETLOGS after complete recovery through change 1422423346
Resetting resetlogs activation ID 3503292347 (0xd0cfffbb)
Fri Oct 10 15:24:01 2014
Setting recovery target incarnation to 3
Fri Oct 10 15:24:04 2014
Assigning activation ID 3649065262 (0xd980512e)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=3772
Fri Oct 10 15:24:04 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=24, OS id=3668
Fri Oct 10 15:24:05 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLTTDB\REDO01.LOG
Successful open of redo thread 1
Fri Oct 10 15:24:05 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Oct 10 15:24:05 2014
ARC0: STARTING ARCH PROCESSES
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the 'no FAL' ARCH
ARC2 started with pid=25, OS id=636
Fri Oct 10 15:24:06 2014
ARC0: Becoming the 'no SRL' ARCH
Fri Oct 10 15:24:06 2014
ARC1: Becoming the heartbeat ARCH
Fri Oct 10 15:24:06 2014
SMON: enabling cache recovery
Fri Oct 10 15:24:07 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #5 is offline, but is part of an online tablespace.
data file 5: 'F:\TTDATA\PUBRTS.DAT'
Dictionary check complete
Fri Oct 10 15:24:19 2014
SMON: enabling tx recovery
Fri Oct 10 15:24:19 2014
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=26, OS id=868
Fri Oct 10 15:24:21 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Oct 10 15:24:21 2014
LOGSTDBY: Validation complete
Completed: alter database open resetlogs

open成功后,再次resetlogs库,实现数据文件online

Fri Oct 10 15:28:44 2014
ALTER DATABASE DATAFILE 5 online
Fri Oct 10 15:28:44 2014
Completed: ALTER DATABASE DATAFILE 5 online
Fri Oct 10 15:31:46 2014
alter database open resetlogs
Fri Oct 10 15:31:46 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Setting recovery target incarnation to 4
Fri Oct 10 15:32:00 2014
Assigning activation ID 3649091231 (0xd980b69f)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=700
Fri Oct 10 15:32:00 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=24, OS id=3360
Fri Oct 10 15:32:01 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLTTDB\REDO01.LOG
Successful open of redo thread 1
Fri Oct 10 15:32:01 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Oct 10 15:32:01 2014
ARC0: STARTING ARCH PROCESSES
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the 'no FAL' ARCH
ARC2 started with pid=25, OS id=2016
Fri Oct 10 15:32:02 2014
ARC0: Becoming the 'no SRL' ARCH
Fri Oct 10 15:32:02 2014
ARC1: Becoming the heartbeat ARCH
Fri Oct 10 15:32:02 2014
SMON: enabling cache recovery
Fri Oct 10 15:32:03 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Fri Oct 10 15:32:15 2014
Dictionary check complete
Fri Oct 10 15:32:15 2014
SMON: enabling tx recovery
Fri Oct 10 15:32:15 2014
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=26, OS id=256
Fri Oct 10 15:32:17 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Oct 10 15:32:17 2014
LOGSTDBY: Validation complete
Completed: alter database open resetlogs