ORA-600 ksvworkmsgalloc: bad reaper

A friend wanted to restore a 12c database into a 19c home and then run an upgrade test, but hit several errors while opening the database and asked me to help analyze them.
resetlogs reported ORA-00392 ORA-00312:

SQL> alter database open resetlogs upgrade;
alter database open resetlogs upgrade
*
ERROR at line 1:
ORA-00392: log 7 of thread 1 is being cleared, operation not allowed
ORA-00312: online log 7 thread 1: '/DBS1/data/NDBS/onlinelog/redo07_m1.log '
ORA-00312: online log 7 thread 1: '/DBS1/arch/NDBS/onlinelog/redo07_m2.log '

This error is generally caused by an incorrect redo log group status, such as a group marked CLEARING_CURRENT. To handle it:

SQL> select group#,status from v$log;

          GROUP# STATUS
---------------- ----------------
               1 CLEARING
               2 CLEARING
               3 CLEARING
               4 CLEARING
              10 CLEARING
               6 CLEARING
               7 CLEARING_CURRENT
               8 CLEARING
               9 CLEARING
               5 CLEARING

10 rows selected.


SQL> alter database clear logfile group 7;

Database altered.

SQL> select group#,status from v$log;

          GROUP# STATUS
---------------- ----------------
               1 CLEARING
               2 CLEARING
               3 CLEARING
               4 CLEARING
              10 CLEARING
               6 CLEARING
               7 CURRENT
               8 CLEARING
               9 CLEARING
               5 CLEARING

10 rows selected.

The next resetlogs attempt then failed with ORA-600 [ksvworkmsgalloc: bad reaper]:

SQL> alter database open resetlogs upgrade;
alter database open resetlogs upgrade
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [ksvworkmsgalloc: bad reaper], [0x080010003], [], [], []

Searching MOS for this error turned up Open Resetlogs Fail with ORA-00600[ksvworkmsgalloc: bad reaper] (Doc ID 2728106.1), which describes the problem being triggered while clearing redo files during a non-ASM to ASM migration when the db_create_online_log_dest_1 parameter is not set. This database went the other way, from ASM to a filesystem, and presumably hit the same redo clearing during resetlogs. The fix was to set db_create_online_log_dest_1=/DBS1/data and db_create_online_log_dest_2=/DBS1/arch for this database, after which the open succeeded.
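A minimal sketch of that workaround, assuming the instance uses an spfile and the destination paths above (the values are from this case; adjust them to your environment):

SQL> alter system set db_create_online_log_dest_1='/DBS1/data' scope=spfile;
SQL> alter system set db_create_online_log_dest_2='/DBS1/arch' scope=spfile;
SQL> shutdown immediate
SQL> startup mount
SQL> alter database open resetlogs upgrade;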

ORA-600 krccfl_chunk troubleshooting

A database reported an ORA-600 [krccfl_chunk] error during startup:

2025-05-06T10:37:47.428203+08:00
Completed: ALTER DATABASE MOUNT /* db agent *//* {2:50212:2} */
ALTER DATABASE OPEN /* db agent *//* {2:50212:2} */
2025-05-06T10:37:47.433709+08:00
This instance was first to open
Block change tracking file is current.
Ping without log force is disabled:
  not an Exadata system.
start recovery: pdb 0, passed in flags x4 (domain enable 5) 
2025-05-06T10:37:48.203383+08:00
Beginning crash recovery of 2 threads
2025-05-06T10:37:48.568120+08:00
 parallel recovery started with 32 processes
2025-05-06T10:37:48.610951+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611037+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611243+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.611438+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.614947+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.616591+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617188+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617253+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617428+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617606+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617676+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.617809+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636568+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636568+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.636620+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637156+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637300+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637881+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.637999+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638112+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638241+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638304+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638338+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.638347+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.641621+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.642926+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643092+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643192+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643204+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643372+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643516+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.643573+08:00
start recovery: pdb 0, passed in flags x5 (domain enable 5) 
2025-05-06T10:37:48.748956+08:00
Started redo scan
2025-05-06T10:37:49.849382+08:00
Completed redo scan
 read 469347 KB redo, 1213 data blocks need recovery
2025-05-06T10:37:50.007840+08:00
Started redo application at
 Thread 1: logseq 369323, block 651514, offset 0
 Thread 2: logseq 132962, block 1319944, offset 0
2025-05-06T10:37:50.016910+08:00
Recovery of Online Redo Log: Thread 1 Group 13 Seq 369323 Reading mem 0
  Mem# 0: +DATA/orcl/ONLINELOG/group_13.349.978709791
  Mem# 1: +FRA/orcl/ONLINELOG/group_13.12992.978709793
2025-05-06T10:37:50.025725+08:00
Recovery of Online Redo Log: Thread 2 Group 18 Seq 132962 Reading mem 0
  Mem# 0: +DATA/orcl/ONLINELOG/group_18.354.978710003
  Mem# 1: +FRA/orcl/ONLINELOG/group_18.12997.978710005
2025-05-06T10:37:51.063556+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_ora_68031.trc(incident=868005)(PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [krccfl_chunk], [0x7F9BBB30BE58], [166528],[],[],[],[],[],[],[],[],[]
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl2/incident/incdir_868005/orcl2_ora_68031_i868005.trc
2025-05-06T10:37:52.269823+08:00
Dumping diagnostic data in directory=[cdmp_20250506103752],requested by(instance=2,osid=68031),summary=[incident=868005].
2025-05-06T10:37:52.306517+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2025-05-06T10:37:52.310723+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310813+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310820+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310853+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310902+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310907+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310945+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.310950+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.310987+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311002+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311009+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311017+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311055+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311055+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311064+08:00
Slave encountered ORA-10388 exception during crash recovery
2025-05-06T10:37:52.311071+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311080+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311107+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311119+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311126+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311135+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p000_69617.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311156+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311184+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311203+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311205+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311211+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p001_69619.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311276+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p002_69621.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311276+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311280+08:00
Recovery slave process is holding some recovery locks. Killing the instance now.
2025-05-06T10:37:52.311308+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p003_69623.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311329+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p004_69625.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311341+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p005_69627.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311345+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p007_69631.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311353+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p008_69633.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311374+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p006_69629.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311386+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p009_69635.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311402+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00a_69637.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311513+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00c_69641.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.311515+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_p00b_69639.trc:
ORA-10388: parallel query server interrupt (failure)
2025-05-06T10:37:52.348331+08:00
USER (ospid: 69617): terminating the instance due to error 10388
2025-05-06T10:37:52.585589+08:00
System state dump requested by (instance=2, osid=69617 (P000)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_67490_20250506103752.trc
2025-05-06T10:37:54.016704+08:00
License high water mark = 34
2025-05-06T10:37:55.387072+08:00
Instance terminated by USER, pid = 69617
2025-05-06T10:37:55.388683+08:00
Warning: 2 processes are still attach to shmid 2850830:
 (size: 45056 bytes, creator pid: 65902, last attach/detach pid: 67492)
2025-05-06T10:37:56.018027+08:00
USER (ospid: 69907): terminating the instance
2025-05-06T10:37:56.021711+08:00
Instance terminated by USER, pid = 69907

Searching MOS turned up similar articles:
Database doesn't open after crash ORA-00600 [krccfl_chunk] (Doc ID 2967548.1)
Bug 33251482 - ORA-487 / ORA-600 [krccfl_chunk] : CTWR process terminated during PDB creation (Doc ID 33251482.8)

Analyzing this customer's situation: the trace line "Block change tracking file is current." confirms that BCT (block change tracking) was enabled, and the log also shows this is a PDB environment. Digging further, we found that a datafile had previously been created on local storage with a Windows-style path (this is actually a RAC environment):

2024-12-23T11:07:09.168322+08:00
PDBODS(5):Completed: alter tablespace PDBODS_DATA add datafile 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\USERS02.DBF'
 size 5000M autoextend on next 1000M maxsize 32000M

The path actually stored in the database is now /u01/app/oracle/product/12.2.0.1/dbhome_1/dbs/D:APPADMINISTRATORORADATAORCLUSERS02.DBF
Given this, solving the problem is fairly simple: disable BCT, open the database from the node holding the local datafile, then copy the datafile into ASM; a minimal sketch follows.
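Assuming the stray datafile is file 102 and the target disk group is +DATA (both placeholders here; take the real file# from v$datafile), the sequence might look like:

SQL> alter database disable block change tracking;
SQL> alter database open;

RMAN> sql 'alter database datafile 102 offline';
RMAN> backup as copy datafile 102 format '+DATA';
RMAN> switch datafile 102 to copy;
RMAN> recover datafile 102;
RMAN> sql 'alter database datafile 102 online';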

Oracle Recovery Tools recovery case roundup (202505)

The Oracle Recovery Tools utility has been available for some time now and has been used in a large number of customer recovery cases, greatly improving recovery efficiency, especially on Windows where bbed or a similar tool would otherwise be needed. Here is a roundup of real-world cases handled with the tool:
Oracle Recovery Tools repairing corrupt free-space blocks
Oracle Recovery Tools in action: batch corrupt-block repair
Oracle Recovery Tools quickly recovering from ORA-19909
Oracle Recovery Tools resolving an ORA-600 3020 fault
Oracle Recovery Tools recovering from csc higher than block scn
Oracle Recovery Tools recovering from a MISSING00000 file fault
Oracle Recovery Tools quickly recovering a datafile omitted when the controlfile was recreated
One-click recovery of ORA-01113 ORA-01110 with Oracle Recovery Tools
Oracle Recovery Tools resolving ORA-01190, ORA-01248 and similar faults
Oracle Recovery Tools quickly fixing a sysaux file that cannot be brought online
Oracle Recovery Tools recovery: an ORA-00704 ORA-01555 fault
ORA-01113 ORA-01110 errors do not always need Oracle Recovery Tools
Oracle Recovery Tools resolving ORA-00279 ORA-00289 ORA-00280 faults
Oracle Recovery Tools repairing ORA-600 6101 / kdxlin: psno out of range faults
Oracle Recovery Tools one-click fix for ORA-00376 ORA-01110 faults (file offline)
Oracle Recovery Tools repairing ORA-00742 and ORA-600 ktbair2: illegal inheritance faults
Oracle Recovery Tools quickly recovering a database that will not start after power loss (ORA-01555, MISSING000 and similar issues)
Software download: OraRecovery download
Usage instructions: user guide

ORA-600 kddummy_blkchk: database stuck in a restart loop

A 10g RAC running on HP suddenly went abnormal, and afterwards the database restarted itself every time it had run for a short while; the customer asked for analysis and a fix:

Thu May  8 06:23:21 2025
ALTER DATABASE OPEN
Picked broadcast on commit scheme to generate SCNs
Thu May  8 06:23:21 2025
Thread 1 opened at log sequence 74302
  Current log# 1 seq# 74302 mem# 0: /dev/vgdata/rrac_redo01
Successful open of redo thread 1
Thu May  8 06:23:21 2025
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu May  8 06:23:21 2025
SMON: enabling cache recovery
Thu May  8 06:23:22 2025
Successfully onlined Undo Tablespace 1.
Thu May  8 06:23:22 2025
SMON: enabling tx recovery
Thu May  8 06:23:22 2025
Database Characterset is ZHS16CGB231280
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 4
replication_dependency_tracking turned off (no async multimaster replication found)
Thu May  8 06:23:23 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Starting background process QMNC
QMNC started with pid=22, OS id=15792
Thu May  8 06:23:25 2025
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:23:25 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:23:26 2025
Completed: ALTER DATABASE OPEN
Thu May  8 06:23:26 2025
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 22 to scn 46740761996
Thu May  8 06:23:26 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 74302 Reading mem 0
  Mem# 0: /dev/vgdata/rrac_redo01
Block recovery stopped at EOT rba 74302.33.16
Block recovery completed at rba 74302.33.16, scn 10.3791089036
Thu May  8 06:23:33 2025
Trace dumping is performing id=[cdmp_20250508062324]
Thu May  8 06:25:55 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:25:58 2025
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:27:32 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 372 to scn 46740952565
Thu May  8 06:27:41 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.
Thu May  8 06:27:43 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Doing block recovery for file 118 block 333578
Block recovery from logseq 74302, block 372 to scn 46740952565
Thu May  8 06:27:45 2025
Recovery of Online Redo Log: Thread 1 Group 1 Seq 74302 Reading mem 0
  Mem# 0: /dev/vgdata/rrac_redo01
Block recovery completed at rba 74302.394.16, scn 10.3791279606
Thu May  8 06:27:47 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_smon_15721.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [118], [333578], [18019], [], [], [], []
Thu May  8 06:28:07 2025
Errors in file /users/oracle/admin/orcl/bdump/orcl1_pmon_15690.trc:
ORA-00474: SMON process terminated with error
Thu May  8 06:28:07 2025
PMON: terminating instance due to error 474

The database restarts because the SMON process fails, which brings the database down; the RAC stack then restarts it automatically, producing the restart loop. SMON fails because of "ORACLE Instance orcl1 (pid = 13) - Error 607 encountered while recovering transaction (9, 33) on object 775794.": there is a transaction that must be rolled back, but the rollback hits the ORA-600 [kddummy_blkchk] error and cannot complete, so SMON aborts and the instance crashes. Handling this is relatively simple:
1. Use the ORA-600 [kddummy_blkchk] arguments to identify the failing object (see the sketch after this list)

ORA-00600: internal error code, arguments: [kddummy_blkchk], [a], [b], [c]

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Block number
Arg [c] Internal error code returned from kcbchk() which indicates the problem encountered.

2. Disable transaction rollback and open the database, then rebuild the objects identified through dba_extents (or from the object# in the transaction error message), which completes the recovery.
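A minimal sketch of both steps, using the file#/block# from this alert log (118/333578) and event 10513, which is commonly used to stop SMON from recovering dead transactions (set it only temporarily, and remove it once the objects are rebuilt):

SQL> -- map the ORA-600 arguments [a]=file#, [b]=block# to a segment
SQL> select owner, segment_name, segment_type
  2    from dba_extents
  3   where file_id = 118
  4     and 333578 between block_id and block_id + blocks - 1;

SQL> -- temporarily disable SMON transaction rollback, then restart
SQL> alter system set event='10513 trace name context forever, level 2' scope=spfile;
SQL> shutdown immediate
SQL> startup
SQL> -- rebuild/recreate the affected object(s), then remove the event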

Currently known bugs associated with ORA-600 [kddummy_blkchk] [a] [b]:

Bug       Fixed in                  Description
12349316 11.2.0.4, 12.1.0.2, 12.2.0.1 DBMS_SPACE_ADMIN.TABLESPACE_FIX_BITMAPS fails with ORA-600 [kddummy_blkchk] / ORA-600 [kdBlkCheckError] / ORA-607
17325413 11.2.0.3.BP23, 11.2.0.4.2, 11.2.0.4.BP04, 12.1.0.1.3, 12.1.0.2, 12.2.0.1 Drop column with DEFAULT value and NOT NULL definition ends up with Dropped Column Data still on Disk leading to Corruption
13715932 11.2.0.4, 12.1.0.1 ORA-600 [kddummy_blkchk] [18038] while adding extents to a large datafile
12417369 11.2.0.2.5, 11.2.0.2.BP13, 11.2.0.2.GIPSU05, 11.2.0.3, 12.1.0.1 Block corruption from rollback on compressed table
10324526 10.2.0.5.4, 11.1.0.7.8, 11.2.0.2.3, 11.2.0.2.BP06, 11.2.0.3, 12.1.0.1 ORA-600 [kddummy_blkchk] [6106] / corruption on COMPRESS table in TTS
10113224 11.2.0.3, 12.1.0.1 Index coalesce may generate invalid redo if blocks in the buffer cache are invalid/corrupted
9726702 11.2.0.3, 12.1.0.1 DBMS_SPACE_ADMIN.assm_segment_verify reports HWM related inconsistencies
9724970 11.2.0.1.BP08, 11.2.0.2.2, 11.2.0.2.BP02, 11.2.0.3, 12.1.0.1 Block Corruption with PDML UPDATE. ORA-600 [4511] OERI[kdblkcheckerror] by block check
9711859 10.2.0.5.1, 11.1.0.7.6, 11.2.0.2, 12.1.0.1 ORA-600 [ktsptrn_fix-extmap] / ORA-600 [kdblkcheckerror] during extent allocation caused by bug 8198906
9581240 11.1.0.7.9, 11.2.0.2, 12.1.0.1 Corruption / ORA-600 [kddummy_blkchk] [6101] / ORA-600 [7999] after RENAME operation during ROLLBACK
9350204 11.2.0.3, 12.1.0.1 Spurious ORA-600 [kddummy_blkchk] .. [6145] during CR operations on tables with ROWDEPENDENCIES
9231605 11.1.0.7.4, 11.2.0.1.3, 11.2.0.1.BP02, 11.2.0.2, 12.1.0.1 Block corruption with missing row on a compressed table after DELETE
9119771 11.2.0.2, 12.1.0.1 OERI [kddummy_blkchk]..[6108] from 'SHRINK SPACE CASCADE'
9019113 11.2.0.1.BP02, 11.2.0.2, 12.1.0.1 ORA-600 [17182] ORA-7445 [memcpy] ORA-600 [kdBlkCheckError] for OLTP COMPRESS table in OLTP Compression REDO during RECOVERY
8951812 11.2.0.2, 12.1.0.1 Corrupt index by rebuild online. Possible OERI [kddummy_blkchk] by SMON
8720802 10.2.0.5, 11.2.0.1.BP07, 11.2.0.2, 12.1.0.1 Add check for row piece pointing to itself (db_block_checking,dbv,rman,analyze)
8331063 11.2.0.3, 12.1.0.1 Corrupt Undo. ORA-600 [2015] in Undo Block During Rollback
6523037 11.2.0.1.BP07, 11.2.0.2.2, 11.2.0.2.BP01, 11.2.0.3, 12.1.0.1 Corruption / ORA-600 [kddummy_blkchk] [6110] on update
8277580 11.1.0.7.2, 11.2.0.1, 11.2.0.2, 12.1.0.1 Corruption on compressed tables during Recovery and Quick Multi Delete (QMD).
9964102 11.2.0.1 OERI:2015 / OERI:kddummy_blkchk / undo corruption from supplemental logging with compressed tables
8613137 11.1.0.7.2, 11.2.0.1 ORA-600 updating table with DEFERRED constraints
8360192 11.1.0.7.6, 11.2.0.1 ORA-600 [kdBlkCheckError] [6110] / corruption from insert
8239658 10.2.0.5, 11.2.0.1 Dump / corruption writing row to compressed table
8198906 10.2.0.5, 11.2.0.1 OERI [kddummy_blkchk] / OERI [5467] for an aborted transaction of allocating extents
7715244 11.1.0.7.2, 11.2.0.1 Corruption on compressed tables. Error codes 6103 / 6110
7662491 10.2.0.4.2, 10.2.0.5, 11.1.0.7.4, 11.2.0.1 Array Update can corrupt a row. Errors OERI[kghstack_free1] or OERI[kddummy_blkchk][6110]
7411865 10.2.0.4.2, 10.2.0.5, 11.1.0.7.1, 11.2.0.1 OERI:13030 / ORA-1407 / block corruption from UPDATE .. RETURNING DML with trigger
7331181 11.2.0.1 ORA-1555 or OERI [kddummy_blkchk] [file#] [block#] [6126] during CR Rollback in query
7293156 11.1.0.7, 11.2.0.1 ORA-600 [2023] by Parallel Transaction Rollback when applying Multi-block undo Head-piece / Tail-piece
7041254 11.1.0.7.5, 11.2.0.1 ORA-19661 during RMAN restore check logical of compressed backup / IOT dummy key
6760697 10.2.0.4.3, 10.2.0.5, 11.1.0.7, 11.2.0.1 DBMS_SPACE_ADMIN.ASSM_SEGMENT_VERIFY does not detect certain segment header block corruption
6647480 10.2.0.4.4, 10.2.0.5, 11.1.0.7.3, 11.2.0.1 Corruption / OERI [kddummy_blkchk] .. [18021] with ASSM
6134368 10.2.0.5, 11.2.0.1 ORA-1407 / block corruption from UPDATE .. RETURNING DML with trigger - SUPERCEDED
6057203 10.2.0.4, 11.1.0.7, 11.2.0.1 Corruption with zero length column (ZLC) / OERI [kcbchg1_6] from Parallel update
6653934 10.2.0.4.2, 10.2.0.5, 11.1.0.7 Dump / block corruption from ONLINE segment shrink with ROWDEPENDENCIES
6674196 10.2.0.4, 10.2.0.5, 11.1.0.6 OERI / buffer cache corruption using ASM, OCFS or any ksfd client like ODM
5599596 10.2.0.4, 11.1.0.6 Block corruption / OERI [kddummy_blkchk] on clustered or compressed tables
5496041 10.2.0.4, 11.1.0.6 OERI[6006] / index corruption on compressed index
5386204 10.2.0.4.1, 10.2.0.5, 11.1.0.6 Block corruption / OERI[kddummy_blkchk] after direct load of ASSM segment
5363584 10.2.0.4, 11.1.0.6 Array insert into table can corrupt redo
4602031 10.2.0.2, 11.1.0.6 Block corruption from UPDATE or MERGE into compressed table
4493447 11.1.0.6 Spurious ORA-600 [kddummy_blkchk] [file#] [block#] [6145] on rollback of array update
4329302 11.1.0.6 OERI [kddummy_blkchk] [file#] [block#] [6145] on rollback of update with logminer
6075487 10.2.0.4 OERI[kddummy_blkchk]..[18020/18026] for DDL on plugged ASSM tablespace with FLASHBACK
4054640 10.1.0.5, 10.2.0.1 Block corruption / OERI [kddummy_blkchk] at physical standby
4000840 10.1.0.4, 10.2.0.1, 9.2.0.7 Update of a row with more than 255 columns can cause block corruption
3772033 9.2.0.7, 10.1.0.4, 10.2.0.1 OERI[ktspfmb_create1] creating a LOB in ASSM using 2k blocksize

 

记录一次asm disk加入到vg通过恢复直接open库的案例

Not realizing the disk was already in use as an ASM disk, the customer partitioned it, made it a PV, added it to a VG, and allocated it to an LV, which took the database down.
OS-level analysis confirmed that one disk of the DATA disk group had been taken over, causing the database errors:

WARNING: ASMB force dismounting group 2 (DATA) due to failover
SUCCESS: diskgroup DATA was dismounted
2025-05-04T07:03:19.910082+08:00
KCF: read, write or open error, block=0x201544 online=1
        file=102 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_61.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.918972+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwc_18507.trc:
2025-05-04T07:03:19.952045+08:00
KCF: read, write or open error, block=0x2013e7 online=1
        file=102 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_61.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.964538+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbw7_18486.trc:
2025-05-04T07:03:19.967133+08:00
KCF: read, write or open error, block=0x230e71 online=1
        file=105 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_64.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.973289+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbw2_18466.trc:
2025-05-04T07:03:19.978514+08:00
KCF: read, write or open error, block=0x1f6e91 online=1
        file=86 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/xifenfei_52.dbf'
        error=15078 txt: ''
2025-05-04T07:03:19.991060+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwd_18511.trc:
2025-05-04T07:03:19.995762+08:00
KCF: read, write or open error, block=0x7f8 online=1
        file=15 '+DATA/ORCL/F7D939D6DBE06C71E053C30114AC1F10/DATAFILE/undotbs01.dbf'
        error=15078 txt: ''
2025-05-04T07:03:20.006862+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_dbwa_18498.trc:
2025-05-04T07:03:20.020739+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_imr0_18937.trc:

This customer was fairly lucky: after the disk was taken over, not much data was written to the corresponding LV, so only a small portion was overwritten.

[root@rac01 rules.d]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/nlas-root  800G  272G  528G   34% /
devtmpfs               284G     0  284G    0% /dev
tmpfs                  284G  637M  283G    1% /dev/shm
tmpfs                  284G  4.0G  280G    2% /run
tmpfs                  284G     0  284G    0% /sys/fs/cgroup
/dev/mapper/nlas-home  200G   64M  200G    1% /home
/dev/sda1              197M  158M   40M   80% /boot
tmpfs                   57G   40K   57G    1% /run/user/0
tmpfs                   57G   48K   57G    1% /run/user/1000
[root@rac01 rules.d]# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda2  nlas lvm2 a--  564.00g    0 
  /dev/sdb1  nlas lvm2 a--   <2.00t 1.51t
[root@rac01 rules.d]# vgs
  VG   #PV #LV #SN Attr   VSize VFree
  nlas   2   3   0 wz--n- 2.55t 1.51t
[root@rac01 rules.d]# lvs
  LV   VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home nlas -wi-ao---- 200.00g                                                    
  root nlas -wi-ao---- 800.00g                                                    
  swap nlas -wi-ao----  64.00g                                                    

Analysis of the disks at the raw level showed that the backup disk headers had all been damaged as well. Deeper analysis confirmed that F1B1 (ASM metadata file 1, block 1, the file directory) sat in the 10th AU of the sdb disk. Using this information we loaded the disk group with the DUL tool and examined its metadata: everything needed for the recovery loaded normally. (A kfed sketch for verifying the F1B1 location follows.)
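One way to double-check such a finding with kfed (which ships with Grid Infrastructure), assuming the default 19c 4M AU, so ausz=4194304 bytes; device name is from this case:

# disk header (AU 0, block 0); a healthy ASM disk shows kfbh.type KFBTYP_DISKHEAD
kfed read /dev/sdb aun=0 blkn=0 ausz=4194304
# F1B1: block 1 of the file directory; expect kfbh.type KFBTYP_FILEDIR here
kfed read /dev/sdb aun=10 blkn=1 ausz=4194304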


We extracted the datafiles directly to a filesystem with DUL and then opened the database successfully.
RMAN block checking then found fewer than 5,000 corrupt blocks in a 3TB+ database, a very good outcome; after dealing with the objects that owned the corrupt blocks, the recovery was complete (see the RMAN sketch after this list). Several factors contributed to such a good result:
1) After the ASM disk was added to the VG and allocated to an LV, writes were stopped immediately, avoiding further overwrites of the ASM disk.
2) This is a 19c database with the default 4M AU, which places datafile data relatively far into the disk, lowering the chance of it being overwritten.
3) The filesystem is XFS, which overwrites considerably less than ext4 would.
4) The disks are cloud SSDs, and TRIM was not triggered.
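A minimal sketch of that corruption check (standard RMAN commands; the follow-up query runs in SQL*Plus):

RMAN> backup validate check logical database;

SQL> select file#, block#, blocks, corruption_type
  2    from v$database_block_corruption;
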
Earlier cases of similar ASM disk recoveries:
Recovery after an ASM disk was added to a VG
Recovery after an ASM disk was damaged by dd
Recovery after ASM disk partitions were lost
pvid=yes leaving ASM unable to mount
Recovery of a damaged ASM disk header on Windows
Another case of an ASM disk added to a VG
Recovery after pvcreate on an ASM disk corrupted the disk group
Recovery after an ASM disk was added to another disk group
Yet another recovery after an ASM disk was mistakenly added to a VG and the LV extended
Another recovery after an ASM disk was formatted as an ext3 filesystem
A flawless recovery of an ASM disk formatted as NTFS
Recovery after a mistakenly set pvid left the ASM disk group unable to mount
Recovery of an ASM disk that was partitioned and formatted as ext4
Oracle ASM disk format recovery: formatted as an ext4 filesystem
Case share: zero data loss after oracleasm createdisk recreated an ASM disk