硬件故障后数据文件大小不对故障处理—Oracle碎片扫描恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:硬件故障后数据文件大小不对故障处理—Oracle碎片扫描恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有硬件恢复圈朋友找到我,说硬件恢复之后dbv报dbv-00102错误,让我给看看是否可以处理
dbv-00102


这个是oracle dbv中一种常见错误,一般是由于block 0 不对,或者是由于文件大小不对引起,让把恢复文件发给我,进行检查

SQL> select name,bytes/1024/1024/1024 from v$datafile_header;

NAME                                                                             BYTES/1024/1024/1024
-------------------------------------------------------------------------------- --------------------
H:\BAIDUNETDISK\ORADATA\XXXXORCL\SYSTEM01.DBF                                             2.080078125
H:\BAIDUNETDISK\ORADATA\XXXXORCL\SYSAUX01.DBF                                             2.880859375
H:\BAIDUNETDISK\ORADATA\XXXXORCL\UNDOTBS01.DBF                                           9.0087890625
H:\BAIDUNETDISK\ORADATA\XXXXORCL\USERS01.DBF                                          31.993408203125
H:\BAIDUNETDISK\ORADATA\XXXXORCL\USERS02.DBF                                                8.1640625
H:\BAIDUNETDISK\ORADATA\XXXXORCL\USERS03.DBF                                              7.958984375
H:\BAIDUNETDISK\ORADATA\XXXXORCL\USERS04.DBF                                              7.958984375
H:\BAIDUNETDISK\ORADATA\XXXXORCL\USERS05.DBF                                                 7.890625

已选择8行。

确定USER02-USERS05的dbf文件实际大小(数据文件头记录)在8G左右,但是目前恢复出来的文件大小只有4G左右
4g


在恢复工具中直接查看文件大小情况
rs

这里比较明显rs中虽然显示文件状态良好,但是实际大小也不对(得出经验:以后恢复中不能太依赖这个状态),根据反馈现场是三个盘的raid5,中途做了一次强制上线,然后客户也使用win pe拷贝过一次数据,大小和现在一样,也是少了近4G.第一反应可能是由于raid盘弄的不对,但是经过对其他文件的确认,多完全没有问题,排除了盘错误的问题,怀疑是由于文件系统异常导致,对于这种的情况,文件系统层面肯定无法恢复,考虑使用自研的OraScan工具进行扫描(OraScan(Oracle 碎片扫描工具) 使用说明)
ora1
ora2

通过OraScan扫描找到相关block,并提取出来数据文件
file

使用dbv检测文件

C:\Users\XFF>dbv file=H:\BaiduNetdisk\xff\YFKJORCL.USERS.4.7.4.N.DBF

DBVERIFY: Release 11.2.0.4.0 - Production on 星期日 6月 7 18:06:30 2026

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - 开始验证: FILE = H:\BAIDUNETDISK\XFF\YFKJORCL.USERS.4.7.4.N.DBF


DBVERIFY - 验证完成

检查的页总数: 1043200
处理的页总数 (数据): 67167
失败的页总数 (数据): 0
处理的页总数 (索引): 37995
失败的页总数 (索引): 0
处理的页总数 (其他): 861109
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 76929
标记为损坏的总页数: 0
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 347454063 (0.347454063)

把文件拷贝替换掉之前恢复的USERS02-USER05.DBF,然后尝试打开数据库

SQL> recover database ;
完成介质恢复。
SQL> alter database open ;
alter database open
*
第 1 行出现错误:
ORA-03113: 通信通道的文件结尾
进程 ID: 3308
会话 ID: 14 序列号: 3

查看alert日志分析原因

Sun Jun 07 14:43:51 2026
Recovery of Online Redo Log: Thread 1 Group 2 Seq 36464 Reading mem 0
  Mem# 0: H:\BAIDUNETDISK\ORADATA\YFKJORCL\REDO02.LOG
Completed: ALTER DATABASE RECOVER  database   
alter database open 
Beginning crash recovery of 1 threads
 parallel recovery started with 19 processes
Started redo scan
Completed redo scan
 read 2353 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 36464, block 15876
Recovery of Online Redo Log: Thread 1 Group 2 Seq 36464 Reading mem 0
  Mem# 0: H:\BAIDUNETDISK\ORADATA\YFKJORCL\REDO02.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 36464, block 20582, scn 347475303
 0 data blocks read, 0 data blocks written, 2353 redo k-bytes read
Sun Jun 07 14:43:57 2026
Errors in file c:\app\xff\diag\rdbms\yfkjorcl\o11201\trace\o11201_lgwr_2204.trc:
ORA-00314: ?? 3 (???? 1) ??? sequence# 36462 ? 32025 ???
ORA-00312: ???? 3 ?? 1: 'H:\BAIDUNETDISK\ORADATA\YFKJORCL\REDO03.LOG'
Errors in file c:\app\xff\diag\rdbms\yfkjorcl\o11201\trace\o11201_lgwr_2204.trc:
ORA-00314: ?? 3 (???? 1) ??? sequence# 36462 ? 32025 ???
ORA-00312: ???? 3 ?? 1: 'H:\BAIDUNETDISK\ORADATA\YFKJORCL\REDO03.LOG'
Errors in file c:\app\xff\diag\rdbms\yfkjorcl\o11201\trace\o11201_ora_3308.trc:
ORA-00314: 日志 1 (用于线程 ) 要求的 sequence#  与  不匹配
ORA-00312: 联机日志 3 线程 1: 'H:\BAIDUNETDISK\ORADATA\YFKJORCL\REDO03.LOG'
USER (ospid: 3308): terminating the instance due to error 314
Sun Jun 07 14:44:02 2026
Instance terminated by USER, pid = 3308

由于redo group 异常导致库无法正常open,但是由于已经recover database成功,因此大概率可以clear该redo 组

SQL> select group#,status from v$log;

    GROUP# STATUS
---------- ----------------
         1 INACTIVE
         3 INACTIVE
         2 CURRENT

SQL> alter database clear logfile group 3;

数据库已更改。

SQL> alter database open;

数据库已更改。

数据库open成功,然后使用expdp导出数据,完成本次恢复任务.

obet实现对数据文件坏块检测功能

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:obet实现对数据文件坏块检测功能

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

通过一段时间的测试和使用,obet修复了不少bug,关于obet的以往功能和特性的文章:
Oracle数据块编辑工具( Oracle Block Editor Tool)-obet
obet(Oracle Block Editor Tool)第二版发布
并且也在客户的生产环境上进行了实战:obet快速修改scn/resetlogs恢复数据库(缺少归档,ORA-00308).利用周末的时间又对obet的工具进行了功能增强,增加了dbv(数据块校验)功能.

Oracle dbv的不足
1. oracle dbv需要在安装oracle服务端的环境下才能执行
2. oracle dbv对于文件大小不正确(文件头记录block数和实际文件大小不匹配),文件头损坏等情况都可能导致dbv无法执行,类似下面的报错

C:\Users\XFF>dbv file=H:\BaiduNetdisk\kingdee\system01.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on 星期日 1月 11 17:30:29 2026

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBV-00107: 未知标头格式 (0) (2054913149)

C:\Users\XFF>
C:\Users\XFF>dbv file=H:\BaiduNetdisk\kingdee\users01.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on 星期日 1月 11 20:10:05 2026

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBV-00102: FILE (H:\BAIDUNETDISK\KINGDEE\USERS01.DBF) 在 end read 操作 (-1) 期间出现文件 I/O 错误

C:\Users\XFF>

3. oracle dbv一条命令执行检测一个数据文件,如果数据文件多,检查起来很繁琐
4. oracle dbv么有检查进度,对于io性能较慢,数据文件较大的情况,无法跟踪检查进度.

obet的dbv功能使用
1. 配置listfile.txt文件,格式为file# name

1 H:\BaiduNetdisk\kingdee\system01.dbf
5 H:\BaiduNetdisk\kingdee\EAS_D_EAS_STANDARD.ORA

2. 启动obet,执行open listfile.txt(如果不在obet目录提供完整路径)

OBET> open listfile.txt
Loaded 2 files from config file 'listfile.txt'.

OBET> info

Loaded files (2 total):
----------------------------------------
Number  Path
----------------------------------------
     1  H:\BaiduNetdisk\kingdee\system01.dbf
     5  H:\BaiduNetdisk\kingdee\EAS_D_EAS_STANDARD.ORA
----------------------------------------

3. 执行dbv命令(logfile 部分为可选),默认会记录日志在obet目录下面dbv_年月日时分秒.log的日志

OBET> dbv

===============================================
DBV (Data Block Verification)
Block Size: 8192 bytes
===============================================

Verifying file #1: H:\BaiduNetdisk\kingdee\system01.dbf (131841 blocks) - Started: 2026-01-11 18:03:58
file 1, block 0: checksum error (expected 0xC4DA, got 0xC478), bad block
file 1, block 1: checksum error (expected 0xF835, got 0xB835), bad block
  Progress: 10000 / 131841 blocks checked...
  Progress: 20000 / 131841 blocks checked...
……………….
  Progress: 120000 / 131841 blocks checked...
  Progress: 130000 / 131841 blocks checked...
  File #1 completed: 0 all zero, 0 soft corrupted, 0 tailchk error, 2 checksum error, 0 rdba error
Verifying file #5: H:\BaiduNetdisk\kingdee\EAS_D_EAS_STANDARD.ORA (4194303 blocks) - Started: 2026-01-11 18:04:08
  Progress: 10000 / 4194303 blocks checked...
  Progress: 20000 / 4194303 blocks checked...
……………………
  Progress: 260000 / 4194303 blocks checked...
  Progress: 270000 / 4194303 blocks checked...
file 5, block 277678: tailchk error (expected 0x0228BB21, got 0x0228AD21), bad block
file 5, block 277679: rdba error (expected 277679, got 277615), bad block
file 5, block 277680: rdba error (expected 277680, got 277616), bad block
file 5, block 277681: rdba error (expected 277681, got 277617), bad block
………………
file 5, block 277692: rdba error (expected 277692, got 277628), bad block
file 5, block 277693: rdba error (expected 277693, got 277629), bad block
file 5, block 277694: rdba error (expected 277694, got 277630), bad block
file 5, block 279406: tailchk error (expected 0x02281E22, got 0x00000700), bad block
file 5, block 279407: rdba error (expected 279407, got 448), bad block
file 5, block 279408: rdba error (expected 279408, got 0), bad block
………………
  Progress: 280000 / 4194303 blocks checked...
  Progress: 290000 / 4194303 blocks checked...
  Progress: 300000 / 4194303 blocks checked...
  Progress: 310000 / 4194303 blocks checked...
file 5, block 312932: tailchk error (expected 0x0106C0B8, got 0x010629B8), bad block
file 5, block 312933: rdba error (expected 312933, got 312869), bad block
file 5, block 312934: rdba error (expected 312934, got 312870), bad block
file 5, block 312935: rdba error (expected 312935, got 312871), bad block
file 5, block 312936: rdba error (expected 312936, got 312872), bad block
file 5, block 312937: rdba error (expected 312937, got 312873), bad block
file 5, block 312938: rdba error (expected 312938, got 312874), bad block
file 5, block 312939: rdba error (expected 312939, got 312875), bad block
………………
  Progress: 4180000 / 4194303 blocks checked...
  Progress: 4190000 / 4194303 blocks checked...
  File #5 completed: 1 all zero, 0 soft corrupted, 13 tailchk error, 1 checksum error, 255 rdba error

DBV completed at: 2026-01-11 18:09:30

===============================================
DBV Summary:
Total blocks checked: 4325872
Total all zero blocks found: 1
Total all rdba error blocks found: 255
Total all tailchk error blocks found: 13
Total all soft corrupted blocks found: 0
Total all checksum error blocks found: 3
Total bad blocks found: 272
===============================================

Detailed report saved to: dbv_20260111180358.log

OBET>

检测效果截图
obet-dbv