Handling an Oracle power-failure fault

Posted on 2022-05-31 by 惜分飞

After an abnormal power loss, recovering the database's datafiles fails with ORA-00283, ORA-00742 and ORA-00312.

 
D:\check_db>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue May 31 00:38:42 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover datafile 1;
ORA-00283: recovery session canceled due to errors
ORA-00742: Log read detects lost write in thread %d sequence %d block %d
ORA-00312: online log 3 thread 1:
'D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG'

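This error is fairly clearly the result of a lost write caused by the abnormal power loss. Without a backup there is no good way to handle this kind of failure; the only option is to bypass the consistency checks and force the database open. The attempt to force it open failed as follows: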

SQL> startup mount pfile='d:/pfile.txt'
ORACLE instance started.

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
Database mounted.
SQL> recover database until cancel;
ORA-00279: change 18755939194213 generated at  needed for thread 1


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG
ORA-00600: internal error code, arguments: [3020], [2], [78824], [8467432], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 78824, file
offset is 645726208 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: 'D:\ORADATA\ORCL\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 80834


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [krsi_al_hdr_update.15], [4294967295], [], [],[], [], [], [], [], [], [], []
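
The ORA-10567 message earlier in this output gives both a block number (78824) and a byte offset (645726208) for the SYSAUX block that redo could not be applied to. As a small worked check, the sketch below shows how the two relate. The 8 KB block size is an assumption inferred from those two numbers; the source does not state it.

```python
# Assumption: 8 KB database blocks, inferred from the figures in ORA-10567.
BLOCK_SIZE = 8192


def block_to_offset(block_no: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the byte offset of a block inside its datafile."""
    return block_no * block_size


# Block number taken from the ORA-00600 [3020] / ORA-10567 output.
print(block_to_offset(78824))  # 645726208
```

The same arithmetic is useful in reverse when a tool reports only a byte offset and the block number is needed.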

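ORA-600 [krsi_al_hdr_update.15] is mainly caused by damaged redo that prevents resetlogs from completing; see "Alter Database Open Resetlogs returns error ORA-00600: [krsi_al_hdr_update.15]" (Doc ID 2026541.1) for details. After dealing with that problem, opening with resetlogs again reported ORA-600 [2662]: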

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [4366], [4112122046],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [4366], [4112122045],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [4366], [4112122040],
[4366], [4112228996], [12583040], [], [], [], [], [], []
Process ID: 4644
Session ID: 1701 Serial number: 3

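This problem is relatively simple and can be bypassed by adjusting the SCN. After that, opening the database reported ORA-600 [4194] and related errors: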
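
To see how far the SCN is behind, the ORA-600 [2662] arguments can be decoded. The sketch below assumes the commonly cited argument layout for this error (current SCN wrap and base, dependent SCN wrap and base, then the data block address the dependent SCN came from) and the usual 10-bit file / 22-bit block split of a DBA. Both are assumptions, not something the post states, and the helper names are illustrative only.

```python
from typing import List, NamedTuple, Tuple


class Scn(NamedTuple):
    wrap: int
    base: int

    def value(self) -> int:
        # An SCN in wrap.base form is wrap * 2**32 + base.
        return (self.wrap << 32) | self.base


def decode_2662(args: List[int]) -> Tuple[Scn, Scn, int, Tuple[int, int]]:
    """Decode ORA-600 [2662] arguments under the assumed layout."""
    current = Scn(args[0], args[1])
    dependent = Scn(args[2], args[3])
    gap = dependent.value() - current.value()
    dba = args[4]
    # Assumed DBA layout: top 10 bits relative file#, low 22 bits block#.
    file_block = (dba >> 22, dba & 0x3FFFFF)
    return current, dependent, gap, file_block


# Arguments from the first ORA-00600 [2662] line in the output above.
cur, dep, gap, where = decode_2662([4366, 4112122046, 4366, 4112228996, 12583040])
print(gap)    # 106950
print(where)  # (3, 128)
```

Under these assumptions the database SCN trails the block's SCN by 106950, which is the minimum distance the SCN has to be moved forward before the open can get past this check.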

SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [4194], [

SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322982):
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322982\orcl_smon_5112_i322982.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3340.trc  (incident=323030):
ORA-00600: internal error code, arguments: [4194], [
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_323030\orcl_ora_3340_i323030.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:04 2022
Sweep [inc][322982]: completed
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (10, 33).
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Checker run found 1 new persistent data failures
Tue May 31 09:05:05 2022
Sweep [inc][323030]: completed
Sweep [inc2][322982]: completed
Tue May 31 09:05:14 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322983):
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [0], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322983\orcl_smon_5112_i322983.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:14 2022
ORA-600 signalled during: alter database open...
Block recovery stopped at EOT rba 2.61.16
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Block recovery from logseq 2, block 60 to scn 18755939643393
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_2_K9BSVC11_.LOG
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Dumping diagnostic data in directory=[cdmp_2022053],requested by(instance=1,osid=5112(SMON)),summary=[incident=322983].
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [3], [], [], [], [], [], [], [], [], []
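
The log above mixes two SCN notations: decimal (18755939643393) and wrap.base (4366.4112429058). A minimal sketch for converting between them follows, assuming the classic 32-bit base used by this 11.2 database; the function names are made up for illustration.

```python
from typing import Tuple

SCN_BASE_RANGE = 2 ** 32  # assumed: 32-bit base, as in the wrap.base values above


def scn_to_decimal(wrap: int, base: int) -> int:
    """Convert a wrap.base SCN to its decimal form."""
    return wrap * SCN_BASE_RANGE + base


def decimal_to_scn(value: int) -> Tuple[int, int]:
    """Convert a decimal SCN to (wrap, base)."""
    return divmod(value, SCN_BASE_RANGE)


# "Block recovery completed at ... scn 4366.4112429058"
print(scn_to_decimal(4366, 4112429058))   # 18755939643394
# "ORA-00279: change 18755939194213 ..."
print(decimal_to_scn(18755939194213))     # (4366, 4111979877)
```

The first result is one greater than the "to scn 18755939643393" value logged for the same block recovery, so the two notations in the alert log agree with each other.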

After the damaged undo was dealt with, the database opened normally:

SQL> shutdown immediate;
ORA-00600: internal error code, arguments: [4193], [


SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
Database mounted.
SQL> alter database open;

Database altered.

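hcheck found some data dictionary inconsistencies, so the customer was advised to do a logical export and import the data into a new database:

HCheck Version 07MAY18 on 31-MAY-2022 09:12:22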
----------------------------------------------
Catalog Version 11.2.0.4.0 (1102000400)
db_name: ORCL

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp      Result
------------------------------ ... ---------- -- ---------- -------------- ------
.- LobNotInObj                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingOIDOnObjCol          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- SourceNotInObj              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OversizedFiles              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorDefaultStorage          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorStorage                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- TabPartCountMismatch        ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OrphanedTabComPart          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingSum$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingDir$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- DuplicateDataobj            ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- ObjSynMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjSeqMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedUndo                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndex               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexPartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTable               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTablePartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTableSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- MissingPartCol              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedSeg$                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndPartObj#         ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0024: Orphaned Index Partition Obj# (no OBJ$) (Doc ID 1360935.1)
ORPHAN INDPART$: OBJ#=149167 BO#=6378 - no OBJ$ row
ORPHAN INDPART$: OBJ#=149168 BO#=6378 - no OBJ$ row

.- DuplicateBlockUse           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- FetUet                      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- Uet0Check                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- SeglessUET                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadInd$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0030: OBJ$ INDEX entry has no IND$ or INDPART$/INDSUBPART$ entry (Doc ID 1360528.1)
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148278 SYS Name=WRH$_FILESTATXS_PK PARTITION=WRH$_FILEST_1572571104_16462
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148920 SYS Name=WRH$_FILESTATXS_PK PARTITION=WRH$_FILEST_1572571104_16678

.- BadTab$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadIcolDepCnt               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjIndDobj                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- TrgAfterUpgrade             ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjType0                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadOwner                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- StmtAuditOnCommit           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadPublicObjects            ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadSegFreelist              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadDepends                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 WARN

HCKW-0016: Dependency$ p_timestamp mismatch for VALID objects (Doc ID 1361045.1)

[E] - P_OBJ#=6376 D_OBJ#=6765

.- CheckDual                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjectNames                 ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadCboHiLo                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ChkIotTs                    ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- NoSegmentIndex              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- BadNextObject               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DroppedROTS                 ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- FilBlkZero                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DbmsSchemaCopy              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanedObjError            ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- ObjNotLob                   ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- MaxControlfSeq              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- SegNotInDeferredStg         ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- SystemNotRfile1             ... 1102000400 >   902000000 05/31 09:12:24 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ObjNotTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
---------------------------------------
31-MAY-2022 09:12:24  Elapsed: 2 secs
---------------------------------------
Found 4 potential problem(s) and 1 warning(s)
Contact Oracle Support with the output and trace file
to check if the above needs attention or not

PL/SQL procedure successfully completed.
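
With this many checks, the few that did not pass are easy to miss. The following is a rough sketch of filtering a saved hcheck report down to its FAIL and WARN lines. It assumes each check line has the shape shown above (".- Name ... version ... status") and that wrapped lines have been rejoined; it is not part of hcheck itself.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# ".- BadInd$   ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL"
CHECK_LINE = re.compile(r"^\.- (\S+)\s+\.\.\..*\b(PASS|FAIL|WARN)\s*$")


def non_passing(report: str) -> List[Tuple[str, str]]:
    """Return (check name, status) for every check that is not PASS."""
    found = []
    for line in report.splitlines():
        match = CHECK_LINE.match(line)
        if match and match.group(2) != "PASS":
            found.append((match.group(1), match.group(2)))
    return found


sample = """\
.- OrphanedSeg$                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndPartObj#         ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL
.- BadDepends                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 WARN
"""
print(non_passing(sample))
```

Run against the full report above, this would list OrphanedIndPartObj#, BadInd$ and BadDepends, the three checks behind the "4 potential problem(s) and 1 warning(s)" summary.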

O/S-Error: (OS 23) Data error (cyclic redundancy check): database recovery

Posted on 2022-05-18 by 惜分飞

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

A customer's database suddenly crashed while running. Inspection turned up errors such as ORA-27070, OSD-04016 and O/S-Error: (OS 23).

Thu May 12 11:25:53 2022
KCF: write/open error block=0x19e95f online=1
     file=57 H:\ORADATA\xifenfei\XFF51.DBF
     error=27070 txt: 'OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 23) Data error (cyclic redundancy check).'
Thu May 12 11:25:53 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw0_3532.trc:
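ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
ORA-01114: IO error writing block to file 57 (block # 1698143)
ORA-01110: data file 57: 'H:\ORADATA\xifenfei\XFF51.DBF'
ORA-27070: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 23) Data error (cyclic redundancy check).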

DBW0: terminating instance due to error 1242
Thu May 12 11:25:54 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_mman_3528.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:54 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_3544.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw1_3536.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_psp0_3524.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_ckpt_3548.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:25:55 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_3520.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:06 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_q002_37468.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:08 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_reco_3556.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:08 2022
Errors in file e:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_3552.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode

Thu May 12 11:26:10 2022
Instance terminated by DBW0, pid = 3532
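
The KCF line reports the failing block in hexadecimal (block=0x19e95f), while ORA-01114 reports it in decimal (block # 1698143). A small sketch for pulling the block number out of a KCF line and confirming the two match; the regular expression is an assumption based only on the single line shown above.

```python
import re
from typing import Optional

KCF_BLOCK = re.compile(r"block=(0x[0-9a-fA-F]+)")


def kcf_block_number(line: str) -> Optional[int]:
    """Return the decimal block number from a 'KCF: write/open error' line."""
    match = KCF_BLOCK.search(line)
    return int(match.group(1), 16) if match else None


print(kcf_block_number("KCF: write/open error block=0x19e95f online=1"))  # 1698143
```

Knowing the exact block makes it easier to relate the Oracle error to whatever the operating system or storage layer reports for the same region of the file.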

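Restarting the database produced similar errors (ORA-27070: async read/write failed; OSD-04016: Error queuing an asynchronous I/O request).

A dbv check of the datafile also reported errors.

From the above it is fairly safe to conclude that a lower-level fault (file system or hardware) was preventing normal access to the database files. The system event log showed errors as well.

The damaged file was recovered with specialized recovery software, after which the database opened normally (skipping the bad blocks).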

Recovering a dropped tablespace on ASM

Posted on 2022-05-10 by 惜分飞

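A few days ago I recovered a case in which a tablespace had been dropped on an ordinary file system (see "Sharing an extremely lucky drop tablespace data recovery"). Now another customer has dropped a tablespace that they believed was no longer needed, only to find that a large amount of business table data was gone. Analysis and a review of past events confirmed the cause: during an earlier data migration, the data had been written into a tablespace with the same name as in the source database rather than into the new tablespace it was supposed to go to, so dropping that tablespace wiped out the data of many tables.

The customer runs ASM, and the drop tablespace statement included the including contents and datafiles clause, so the tablespace's datafiles were deleted as well. For this kind of recovery, the usual approach is to first recover the deleted datafiles at the ASM level and then extract the table data from them in the same way as when the SYSTEM tablespace is lost (this customer had a historical backup, which made it easier to put things together).

Before recovering the deleted files, you first need to identify the dropped tablespace and its files. Analyzing the underlying dictionary tables file$ and ts$, together with the alert log, confirms the file numbers, file names and other details of the deleted files.

Because the files have already been removed from the ASM disk group, they cannot be restored directly. Instead, the ASM disk group is scanned to locate the corresponding blocks (handled much as in "Recovering datafiles lost through improper ASM disk group operations"), and the result is analyzed to see whether the file is damaged.

The initial assessment was that the file recovery should work well. The datafile was recovered and then checked with dbv.

The remaining steps are straightforward: use Oracle DUL to extract the data, following an approach similar to "Testing recovery of a dropped table with DUL", and have the business side verify it. If you run into this situation and have no valid backup, protect the scene as far as possible (do not write anything to the ASM disk group or the file system) and then contact us, so that as much data as possible can be recovered.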

Sharing an extremely lucky drop tablespace data recovery

Posted on 2022-04-15 by 惜分飞

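This is an account of an extremely lucky recovery. The database was originally a test database, and the application vendor planned to put the application into production today. The procedure was: drop the user, drop the tablespace, then create the tablespace again and import the data for go-live. For some reason the customer had ended up entering some real production data while testing, and it was mercilessly deleted. The alert log shows some of the vendor's operations.

The business tablespace was created in August 2021: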

Wed Aug 18 09:49:03 2021
create tablespace xifenfei datafile 'D:\app\Administrator\oradata\xifenfei\xifenfei.dbf' size 10g
Wed Aug 18 09:52:28 2021
Completed: create tablespace xifenfei datafile 'D:\app\Administrator\oradata\xifenfei\xifenfei.dbf' size 10g

Today the tablespace was dropped:

Tue Apr 12 11:15:02 2022
drop tablespace xifenfei including contents and datafiles
WARNING: Cannot delete file D:\APP\ADMINISTRATOR\ORADATA\xifenfei\xifenfei.DBF
Errors in file d:\app\administrator\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_4296.trc:
ORA-01265: Unable to delete DATA D:\APP\ADMINISTRATOR\ORADATA\xifenfei\xifenfei.DBF
ORA-27056: could not delete file
OSD-04024: Unable to delete file.
O/S-Error: (OS 32) The process cannot access the file because it is being used by another process.
Completed: drop tablespace xifenfei including contents and datafiles

The customer then tried to create the new tablespace, got ORA-01119, and manually deleted the datafile:

Tue Apr 12 11:49:02 2022
create tablespace xifenfei datafile'D:\oracle\oradata\xifenfei\xifenfei.dbf'size 20480m
ORA-1119 signalled during: create tablespace xifenfei datafile'D:\oracle\oradata\xifenfei\xifenfei.dbf'size 20480m...
Tue Apr 12 11:49:16 2022
create tablespace xifenfei datafile'D:\oracle\oradata\xifenfei\xifenfei.dbf'size 20480m
ORA-1119 signalled during: create tablespace xifenfei datafile'D:\oracle\oradata\xifenfei\xifenfei.dbf'size 20480m...

The new tablespace was then created successfully and datafiles were added:

Tue Apr 12 12:08:43 2022
create tablespace xifenfei datafile'D:\app\Administrator\oradata\xifenfei\xifenfei.dbf'size 5120m
Tue Apr 12 12:10:25 2022
Completed: create tablespace xifenfei datafile'D:\app\Administrator\oradata\xifenfei\xifenfei.dbf'size 5120m
Tue Apr 12 12:11:19 2022
alter tablespace xifenfei add datafile'D:\app\Administrator\oradata\xifenfei\xifenfei1.dbf'size 5120m
Tue Apr 12 12:13:02 2022
Completed: alter tablespace xifenfei add datafile'D:\app\Administrator\oradata\xifenfei\xifenfei1.dbf'size 5120m
alter tablespace xifenfei add datafile'D:\app\Administrator\oradata\xifenfei\xifenfei2.dbf'size 5120m
Tue Apr 12 12:14:52 2022
Completed: alter tablespace xifenfei add datafile'D:\app\Administrator\oradata\xifenfei\xifenfei2.dbf'size 5120m

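In short, the customer deleted one 10 GB business datafile and then created three 5 GB business datafiles. The goal is to recover the core data of two tables from before the deletion, which means finding the old 10 GB datafile. However, 15 GB of new datafiles were written after the 10 GB file was deleted, and one of the new files has the same file# as the deleted one, so in theory a straightforward block-level scan cannot be used for the recovery. File-system-level undelete was tried first, but no record of the file remained because the directory entry had been overwritten, so that route was a dead end. A block scan then found the starting positions of two file# 5 files (at block 2 and block 0 respectively), with end positions corresponding to file sizes of 10 GB and 5 GB. Experience suggested that these two contiguous regions of allocated disk space were very likely the two file# 5 files.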
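
The idea behind such a block scan can be sketched roughly as follows. This is not the tool used in this case. It assumes 8 KB blocks, a little-endian platform, and the widely described block header layout in which bytes 4 to 7 hold the relative data block address (top 10 bits relative file#, low 22 bits block#). A real scanner would also validate the block type, tail and checksum.

```python
import struct
from typing import List, Tuple

BLOCK_SIZE = 8192  # assumed; the post does not state the block size


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a relative DBA into (relative file#, block#)."""
    return rdba >> 22, rdba & 0x3FFFFF


def find_file_blocks(image: bytes, file_no: int,
                     block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Return (byte offset, block#) for each block whose header names file_no."""
    hits = []
    for offset in range(0, len(image) - block_size + 1, block_size):
        # Assumed layout: little-endian RDBA at bytes 4..7 of the block.
        (rdba,) = struct.unpack_from("<I", image, offset + 4)
        rel_file, block_no = decode_rdba(rdba)
        if rel_file == file_no:
            hits.append((offset, block_no))
    return hits


def fake_block(rel_file: int, block_no: int) -> bytes:
    """Build a synthetic block for demonstration only."""
    block = bytearray(BLOCK_SIZE)
    struct.pack_into("<I", block, 4, (rel_file << 22) | block_no)
    return bytes(block)


image = fake_block(5, 2) + fake_block(5, 3) + fake_block(6, 4)
print(find_file_blocks(image, 5))  # [(0, 2), (8192, 3)]
print(10 * 1024**3 // BLOCK_SIZE)  # 1310720 blocks in a 10 GB file
```

A run of hits with consecutive block numbers whose length matches the expected block count (1310720 for the 10 GB file under the 8 KB assumption) is what distinguishes the old file from the new 5 GB file that reuses file# 5.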

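The data was copied out with WinHex and checked with a tool.

Apart from a damaged block 1 (block 0 is not counted), all the other blocks were fine. In other words, the deleted 10 GB datafile had lost only its file header and all the business data was still there. The data the customer needed was then extracted with DUL, completing the recovery. For similar recoveries involving lost files, file system corruption, files with a size of 0 KB and so on, see these earlier blog posts:
Oracle recovery after Windows file system corruption
Recovery after deleting a database with dbca or rm
Oracle recovery after file system repartitioning
Recovery from a mistaken restore database operation
Recovery of datafiles damaged by file system corruption
Recovery of Oracle datafiles that are 0 KB in size or missing
Recovering datafiles deleted with rm -rf: file system undelete plus Oracle fragment reassembly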

Handling ORA-600 ktbsdp2

Posted on 2022-04-05 by 惜分飞

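A customer reported a database problem on a two-node RAC: with both nodes started, one node could not open normally, and the other would also go down after a while. Below are the errors from the node that could not open (the node that did open reported similar errors when it eventually went down):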

2022-03-30T19:19:12.870813+08:00
[84321] Successfully onlined Undo Tablespace 4.
Undo initialization finished serial:0 start:1728252021 end:1728256302 diff:4281 ms (4.3 seconds)
Verifying minimum file header compatibility for tablespace encryption..
Verifying file header compatibility for tablespace encryption completed for pdb 0
2022-03-30T19:19:13.953252+08:00
Database Characterset is ZHS16GBK
2022-03-30T19:19:14.538155+08:00
Errors in file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_p02o_85718.trc  (incident=1093927):
ORA-00600: internal error code, arguments: [ktbsdp2], [18446744073709551615], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_1093927/xff2_p02o_85718_i1093927.trc
2022-03-30T19:19:15.536582+08:00
ORACLE Instance xff2 (pid = 57) - Error 607 encountered while recovering transaction (73, 12) on object 112841.
2022-03-30T19:19:33.699944+08:00
Errors in file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_smon_84007.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [ktbsdp2], [18446744073709551615], [], [], [], [], [], [], [], [], [], []
2022-03-30T19:19:34.673840+08:00
Errors in file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_smon_84007.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [ktbsdp2], [18446744073709551615], [], [], [], [], [], [], [], [], [], []
2022-03-30T19:19:34.673954+08:00
Errors in file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_smon_84007.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [ktbsdp2], [18446744073709551615], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_smon_84007.trc  (incident=1092704):
ORA-607 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /oracle/app/oracle/diag/rdbms/xff/xff2/incident/incdir_1092704/xff2_smon_84007_i1092704.trc
2022-03-30T19:19:35.422779+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2022-03-30T19:19:36.154689+08:00
Starting background process GTX0
2022-03-30T19:19:36.169007+08:00
GTX0 started with pid=370, OS id=87409 
2022-03-30T19:19:36.645876+08:00
USER (ospid: 84007): terminating the instance due to error 607
2022-03-30T19:19:36.680109+08:00
opiodr aborting process unknown ospid (87439) as a result of ORA-1092
2022-03-30T19:19:36.681091+08:00
ORA-1092 : opitsk aborting process
2022-03-30T19:19:36.740357+08:00
System state dump requested by (instance=2, osid=84007 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_diag_83895_20220330191936.trc
2022-03-30T19:19:40.135579+08:00
Instance terminated by USER, pid = 84007

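Analysis of these errors pointed to a problem with a transaction. A search of MOS found a similar report: Bug 32208691 - After upgrade from 12.1 to 19.3 drop columns fails ORA-600[ktbsdp2] ORA-600[4512] (Doc ID 32208691.8). The customer confirmed that they had been adding a column to the table with object ID 112841 using PL/SQL Developer when the network dropped and the operation failed. When I later queried that table it raised the same error, which essentially confirms that a broken transaction on this table was the cause. The table's data was recovered with DUL and the table was then dropped, after which the database started normally with no further errors. An hcheck run showed the data dictionary to be essentially consistent (apart from some anomalies in statistics, which in principle do not affect database operation):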
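
The object ID mentioned above comes straight from the alert log line "Error 607 encountered while recovering transaction (73, 12) on object 112841". As a rough sketch, lines of this shape can be pulled out of an alert log as follows; the pattern is based only on the one line shown in this post, and the function name is illustrative.

```python
import re
from typing import List, Tuple

RECOVERY_ERROR = re.compile(
    r"Error (\d+) encountered while recovering transaction "
    r"\((\d+), (\d+)\) on object (\d+)"
)


def failed_recoveries(alert_text: str) -> List[Tuple[int, int, int, int]]:
    """Return (error, undo segment#, slot, object id) for each matching line."""
    return [tuple(int(g) for g in m.groups())
            for m in RECOVERY_ERROR.finditer(alert_text)]


line = ("ORACLE Instance xff2 (pid = 57) - Error 607 encountered while "
        "recovering transaction (73, 12) on object 112841.")
print(failed_recoveries(line))  # [(607, 73, 12, 112841)]
```

The object ID can then be matched against OBJECT_ID in DBA_OBJECTS to find the table that transaction recovery keeps failing on.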

[oracle@xifenfei2 ~]$ sqlplus / as sysdba @hcheck.sql

SQL*Plus: Release 12.2.0.1.0 Production on Thu Mar 31 00:38:32 2022

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

HCheck Version 07MAY18 on 31-MAR-2022 00:38:34
----------------------------------------------
Catalog Version 12.2.0.1.0 (1202000100)
db_name: xff
Is CDB?: NO

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp      Result
------------------------------ ... ---------- -- ---------- -------------- ------
.- LobNotInObj                 ... 1202000100 <=  *All Rel* 03/31 00:38:34 PASS
.- MissingOIDOnObjCol          ... 1202000100 <=  *All Rel* 03/31 00:38:34 PASS
.- SourceNotInObj              ... 1202000100 <=  *All Rel* 03/31 00:38:34 PASS
.- OversizedFiles              ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- PoorDefaultStorage          ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- PoorStorage                 ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- TabPartCountMismatch        ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- OrphanedTabComPart          ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- MissingSum$                 ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- MissingDir$                 ... 1202000100 <=  *All Rel* 03/31 00:38:38 PASS
.- DuplicateDataobj            ... 1202000100 <=  *All Rel* 03/31 00:38:40 PASS
.- ObjSynMissing               ... 1202000100 <=  *All Rel* 03/31 00:38:42 PASS
.- ObjSeqMissing               ... 1202000100 <=  *All Rel* 03/31 00:38:42 PASS
.- OrphanedUndo                ... 1202000100 <=  *All Rel* 03/31 00:38:44 PASS
.- OrphanedIndex               ... 1202000100 <=  *All Rel* 03/31 00:38:44 PASS
.- OrphanedIndexPartition      ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedIndexSubPartition   ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedTable               ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedTablePartition      ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedTableSubPartition   ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- MissingPartCol              ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedSeg$                ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- OrphanedIndPartObj#         ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- DuplicateBlockUse           ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- FetUet                      ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- Uet0Check                   ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- SeglessUET                  ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- BadInd$                     ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- BadTab$                     ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- BadIcolDepCnt               ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- ObjIndDobj                  ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- TrgAfterUpgrade             ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- ObjType0                    ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- BadOwner                    ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- StmtAuditOnCommit           ... 1202000100 <=  *All Rel* 03/31 00:38:45 PASS
.- BadPublicObjects            ... 1202000100 <=  *All Rel* 03/31 00:38:46 PASS
.- BadSegFreelist              ... 1202000100 <=  *All Rel* 03/31 00:38:50 PASS
.- BadDepends                  ... 1202000100 <=  *All Rel* 03/31 00:38:50 PASS
.- CheckDual                   ... 1202000100 <=  *All Rel* 03/31 00:38:57 PASS
.- ObjectNames                 ... 1202000100 <=  *All Rel* 03/31 00:38:57 WARN

HCKW-0018: OBJECT name clashes with SCHEMA name (Doc ID 2363142.1)
Schema=MHWZ PACKAGE=MHWZ.MHWZ
Schema=MHWZ PACKAGE BODY=MHWZ.MHWZ

.- BadCboHiLo                  ... 1202000100 <=  *All Rel* 03/31 00:39:01 WARN

HCKW-0019: HIST_HEAD$.LOWVAL > HIVAL (Doc ID 1361047.1)
OBJ# 324163 INTCOL#=22
OBJ# 482668 INTCOL#=4
OBJ# 442865 INTCOL#=31
OBJ# 436924 INTCOL#=31
OBJ# 580529 INTCOL#=8
OBJ# 459432 INTCOL#=31
OBJ# 451260 INTCOL#=31
OBJ# 530980 INTCOL#=21
OBJ# 498442 INTCOL#=5
OBJ# 652114 INTCOL#=8
OBJ# 701695 INTCOL#=21
OBJ# 831961 INTCOL#=31
OBJ# 831962 INTCOL#=31
OBJ# 831963 INTCOL#=31

.- ChkIotTs                    ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- NoSegmentIndex              ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- BadNextObject               ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- DroppedROTS                 ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- FilBlkZero                  ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- DbmsSchemaCopy              ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- OrphanedIdnseqObj           ... 1202000100 >  1201000000 03/31 00:39:09 PASS
.- OrphanedIdnseqSeq           ... 1202000100 >  1201000000 03/31 00:39:09 PASS
.- OrphanedObjError            ... 1202000100 >  1102000000 03/31 00:39:09 PASS
.- ObjNotLob                   ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- MaxControlfSeq              ... 1202000100 <=  *All Rel* 03/31 00:39:09 PASS
.- SegNotInDeferredStg         ... 1202000100 >  1102000000 03/31 00:39:13 PASS
.- SystemNotRfile1             ... 1202000100 >   902000000 03/31 00:39:13 PASS
.- DictOwnNonDefaultSYSTEM     ... 1202000100 <=  *All Rel* 03/31 00:39:13 PASS
.- OrphanTrigger               ... 1202000100 <=  *All Rel* 03/31 00:39:13 PASS
.- ObjNotTrigger               ... 1202000100 <=  *All Rel* 03/31 00:39:13 PASS
---------------------------------------
31-MAR-2022 00:39:13  Elapsed: 39 secs
---------------------------------------
Found 0 potential problem(s) and 16 warning(s)
Contact Oracle Support with the output and trace file
to check if the above needs attention or not

PL/SQL procedure successfully completed.

Statement processed.

Complete output is in trace file:
/oracle/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_26887_HCHECK.trc