NIC fault prevents a database instance from starting

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

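Author: 惜分飞 (xifenfei) © All rights reserved. [May not be reproduced in any form without the author's consent; the author reserves the right to pursue legal liability.]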

In a two-node cluster, one node started normally while the instance on the other node would not start. Alert log of the failing node:

Tue Mar 07 19:07:29 2023
IPC Send timeout detected. Receiver ospid 6386 [
Tue Mar 07 19:07:29 2023
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms0_6386.trc:
IPC Send timeout detected. Receiver ospid 6402 [
Tue Mar 07 19:07:29 2023
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms4_6402.trc:
Tue Mar 07 19:07:29 2023
Received an instance abort message from instance 1
Please check instance 1 alert and LMON trace files for detail.
System state dump requested by (instance=2, osid=6384 (LMD0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_diag_6374_20230307190729.trc
LMD0 (ospid: 6384): terminating the instance due to error 481
Dumping diagnostic data in directory=[cdmp_20230307190729],
      requested by (instance=2, osid=6384 (LMD0)), summary=[abnormal instance termination].
Instance terminated by LMD0, pid = 6384

Alert log of the healthy node:

Tue Mar 07 19:02:07 2023
Reconfiguration started (old inc 20, new inc 22)
List of instances:
 1 2 (myinst: 1)
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Tue Mar 07 19:02:08 2023
 LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
 LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
 LMS 7: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue Mar 07 19:02:08 2023
Tue Mar 07 19:02:08 2023
 LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 LMS 6: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Submitted all GCS remote-cache requests
 Fix write in gcs resources
Tue Mar 07 19:02:27 2023
IPC Send timeout detected. Sender: ospid 6936 [oracle@xffnode1.localdomain (PING)]
Receiver: inst 2 binc 441249706 ospid 59731
Tue Mar 07 19:07:29 2023
IPC Send timeout detected. Sender: ospid 6946 [oracle@xffnode1.localdomain (LMS0)]
Receiver: inst 2 binc 429479852 ospid 6386
Tue Mar 07 19:07:29 2023
IPC Send timeout detected. Sender: ospid 6962 [oracle@xffnode1.localdomain (LMS4)]
Receiver: inst 2 binc 429479854 ospid 6402
Tue Mar 07 19:07:29 2023
IPC Send timeout detected. Sender: ospid 6966 [oracle@xffnode1.localdomain (LMS5)]

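These logs confirm that the two nodes could not communicate properly, so the new node could not join the cluster (the cluster reconfiguration could not complete) and the instance failed to start. In such cases the first thing to check is usually the private interconnect. The oswnetstat records showed a very large number of packet reassembles failed.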
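One way to quantify this from the OSWatcher archive is to pull the counter out of each `netstat -s` snapshot and look at how fast it grows. The sketch below assumes the snapshots are already available as text and that the counter line reads `<N> packet reassembles failed`, as in Linux `netstat -s` output. The function names and the sample numbers are illustrative and are not taken from this system.

```python
import re

# Matches the IP-statistics line printed by `netstat -s` on Linux.
_REASM_FAILED = re.compile(r"^\s*(\d+) packet reassembles failed", re.MULTILINE)


def reassembly_failures(netstat_text):
    """Return the 'packet reassembles failed' counter, or None if absent."""
    match = _REASM_FAILED.search(netstat_text)
    return int(match.group(1)) if match else None


def failure_deltas(snapshots):
    """Return the increase of the counter between consecutive snapshots."""
    counts = [reassembly_failures(text) for text in snapshots]
    counts = [c for c in counts if c is not None]
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


if __name__ == "__main__":
    # Made-up sample snapshots, for illustration only.
    samples = [
        "Ip:\n    100 reassemblies required\n    7 packet reassembles failed\n",
        "Ip:\n    900 reassemblies required\n    412 packet reassembles failed\n",
    ]
    print(failure_deltas(samples))
```

A counter that keeps climbing between snapshots, rather than a large but static value, is what points at an ongoing interconnect problem.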

This symptom is usually attributed to the default ipfrag_*_thresh values being too small. The following values were set:

net.ipv4.ipfrag_high_thresh = 16777216
net.ipv4.ipfrag_low_thresh = 15728640
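Before and after changing these values it can help to confirm what the kernel is actually using. The following is a minimal sketch, assuming a Linux host where the settings are exposed under `/proc/sys/net/ipv4`; the function name `check_ipfrag` and the `base` parameter (added so the check can be pointed at a test directory) are choices made here for illustration.

```python
import os

# Target values from the settings above (16 MiB and 15 MiB).
TARGETS = {
    "ipfrag_high_thresh": 16777216,
    "ipfrag_low_thresh": 15728640,
}


def check_ipfrag(base="/proc/sys/net/ipv4", targets=TARGETS):
    """Return {name: (current, wanted)} for every value below its target."""
    too_low = {}
    for name, wanted in targets.items():
        with open(os.path.join(base, name)) as handle:
            current = int(handle.read().strip())
        if current < wanted:
            too_low[name] = (current, wanted)
    return too_low


if __name__ == "__main__":
    for name, (current, wanted) in check_ipfrag().items():
        print(f"net.ipv4.{name} = {current}, wanted at least {wanted}")
```

Values changed with `sysctl -w` do not survive a reboot; to keep them they are normally also added to `/etc/sysctl.conf`.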

With these settings the database started, as a temporary fix, but instance reconfiguration still took too long.

packet reassembles failed kept increasing. Checking the NICs showed that one of the two 10GbE NICs used for HAIP was faulty.
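The post does not say which indicators exposed the bad NIC. As one possible check, the sketch below reads the per-interface error counters that Linux publishes under `/sys/class/net/<iface>/statistics`. The choice of counters, the function names, and the interface names in the usage line are assumptions made for illustration.

```python
import os

# Counters read from sysfs; which ones matter in a given case is an assumption.
COUNTERS = ("rx_errors", "rx_dropped", "rx_crc_errors", "tx_errors")


def nic_error_counters(iface, base="/sys/class/net"):
    """Read selected error counters for one interface from sysfs."""
    stats_dir = os.path.join(base, iface, "statistics")
    result = {}
    for name in COUNTERS:
        with open(os.path.join(stats_dir, name)) as handle:
            result[name] = int(handle.read().strip())
    return result


def suspect_nics(ifaces, base="/sys/class/net"):
    """Return the interfaces that have any non-zero error counter."""
    return [i for i in ifaces if any(nic_error_counters(i, base).values())]


if __name__ == "__main__":
    # Hypothetical names for the two private-interconnect NICs.
    print(suspect_nics(["eth2", "eth3"]))
```

Comparing the two HAIP interfaces side by side is the point: a healthy pair should show similar, near-zero error counts.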

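To limit the impact on database performance, the faulty NIC was temporarily disabled; after a restart the database came up normally.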

The NIC will be re-enabled once the problem has been fixed at the network level.

Latest version of the Oracle DUL tool

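Oracle's official DUL tool is still being updated and has now reached version 12.2.0.2.5. It supports Oracle 6 and all later versions and is a powerful tool for recovering Oracle databases in extreme situations.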

[oracle@xifenfei dul]$ ./dul

Data UnLoader: 12.2.0.2.5 - Internal Only - on Sun Mar  5 15:12:11 2023
with 64-bit io functions and the decompression option

Copyright (c) 1994 2023 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


DUL: Warning: Could not open parameter file <init.dul>
DUL: Warning: Compatible is set to 11 Values can be 6|7|8|9|10|11|12|17|18
DUL: Warning: no parameter file means no logfile
DUL> 

After configuring the init.dul file:

[oracle@iZbp1hx0enix3hix1kvyrxZ dul]$ ./dul

Data UnLoader: 12.2.0.2.5 - Internal Only - on Sun Mar  5 15:22:26 2023
with 64-bit io functions and the decompression option

Copyright (c) 1994 2023 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


Found db_id = 1588579327
Found db_name = ORCL
DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0    97281    0    1    0 /u01/app/oracle/oradata/orcl/system01.dbf
  1   2     0   387841    0    1    0 /u01/app/oracle/oradata/orcl/sysaux01.dbf
  2   3     0    37761    0    1    0 /u01/app/oracle/oradata/orcl/undotbs01.dbf
  4   4     0     5761    0    1    0 /u01/app/oracle/oradata/orcl/users01.dbf
  7   5     0    16385    0    1    0 /u01/app/oracle/oradata/orcl/t_xifenfei01.dbf
DUL> bootstrap;
Probing file = 1, block = 520
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      59 rows unloaded
Reading BOOTSTRAP.dat 59 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 11
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   86411 rows unloaded
. unloading table                      TAB$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
    2904 rows unloaded
. unloading table                      COL$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
   94714 rows unloaded
. unloading table                     USER$      88 rows unloaded
Reading USER.dat 88 entries loaded
Reading OBJ.dat 86411 entries loaded and sorted 86411 entries
Reading TAB.dat 2904 entries loaded
Reading COL.dat 94714 entries loaded and sorted 94714 entries
Reading BOOTSTRAP.dat 59 entries loaded

DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 11
 OBJ$: segobjno 18, file 1 block 240
 TAB$: segobjno 2, tabno 1, file 1  block 144
 COL$: segobjno 2, tabno 5, file 1  block 144
 USER$: segobjno 10, tabno 1, file 1  block 208
 TABPART$: segobjno 591, file 1 block 4000
 INDPART$: segobjno 596, file 1 block 4040
 TABCOMPART$: segobjno 613, file 1 block 4176
 INDCOMPART$: segobjno 618, file 1 block 4216
 TABSUBPART$: segobjno 603, file 1 block 4096
 INDSUBPART$: segobjno 608, file 1 block 4136
 IND$: segobjno 2, tabno 3, file 1  block 144
 ICOL$: segobjno 2, tabno 4, file 1  block 144
 LOB$: segobjno 2, tabno 6, file 1  block 144
 COLTYPE$: segobjno 2, tabno 7, file 1  block 144
 TYPE$: segobjno 518, tabno 1, file 1  block 3464
 COLLECTION$: segobjno 518, tabno 2, file 1  block 3464
 ATTRIBUTE$: segobjno 518, tabno 3, file 1  block 3464
 LOBFRAG$: segobjno 624, file 1 block 4264
 LOBCOMPPART$: segobjno 627, file 1 block 4288
 UNDO$: segobjno 15, file 1 block 224
 TS$: segobjno 6, tabno 2, file 1  block 176
 PROPS$: segobjno 98, file 1 block 800
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   86411 rows unloaded
. unloading table                      TAB$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335

DUL: Warning: Recreating file "TAB.ctl"
    2904 rows unloaded
. unloading table                      COL$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335

DUL: Warning: Recreating file "COL.ctl"
   94714 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      88 rows unloaded
. unloading table                  TABPART$     143 rows unloaded
. unloading table                  INDPART$     124 rows unloaded
. unloading table               TABCOMPART$       1 row  unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$      32 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
    4931 rows unloaded
. unloading table                     ICOL$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
    7644 rows unloaded
. unloading table                      LOB$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
    1031 rows unloaded
. unloading table                  COLTYPE$
DUL: Warning: Block has been marked soft corrupt
DUL: Error: While processing ts# 0 file# 1 block# 13335
    2565 rows unloaded
. unloading table                     TYPE$    2909 rows unloaded
. unloading table               COLLECTION$    1002 rows unloaded
. unloading table                ATTRIBUTE$   11328 rows unloaded
. unloading table                  LOBFRAG$       1 row  unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$       8 rows unloaded
. unloading table                    PROPS$      36 rows unloaded
Reading USER.dat 88 entries loaded
Reading OBJ.dat 86411 entries loaded and sorted 86411 entries
Reading TAB.dat 2904 entries loaded
Reading COL.dat 94714 entries loaded and sorted 94714 entries
Reading TABPART.dat 143 entries loaded and sorted 143 entries
Reading TABCOMPART.dat 1 entries loaded and sorted 1 entries
Reading TABSUBPART.dat 32 entries loaded and sorted 32 entries
Reading INDPART.dat 124 entries loaded and sorted 124 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 4931 entries loaded
Reading LOB.dat
DUL: Notice: Increased the size of DC_LOBS from 1024 to 8192 entries
 1031 entries loaded
Reading ICOL.dat 7644 entries loaded
Reading COLTYPE.dat 2565 entries loaded
Reading TYPE.dat 2909 entries loaded
Reading ATTRIBUTE.dat 11328 entries loaded
Reading COLLECTION.dat 1002 entries loaded
Reading BOOTSTRAP.dat 59 entries loaded
Reading LOBFRAG.dat 1 entries loaded and sorted 1 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 8 entries loaded
Reading PROPS.dat 36 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> 
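In a long session like this one it is easy to lose track of which dictionary tables hit corrupt blocks. The sketch below summarizes a captured DUL session log per table, using only the line shapes visible in the output above (`. unloading table`, `N rows unloaded`, `DUL: Error: While processing ts# ... file# ... block# ...`). The function name and the result layout are choices made here for illustration and are not part of DUL.

```python
import re

TABLE_RE = re.compile(r"^\. unloading table\s+(\S+)(?:\s+(\d+) rows?\s+unloaded)?")
ROWS_RE = re.compile(r"^\s*(\d+) rows?\s+unloaded")
ERROR_RE = re.compile(
    r"^DUL: Error: While processing ts# (\d+) file# (\d+) block# (\d+)"
)


def summarize_unload(log_text):
    """Map table name -> rows unloaded and (file#, block#) of reported errors."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        match = TABLE_RE.match(line)
        if match:
            # A new table starts; the row count may follow on a later line.
            current = match.group(1)
            rows = int(match.group(2)) if match.group(2) else None
            summary[current] = {"rows": rows, "bad_blocks": []}
            continue
        if current is None:
            continue
        match = ERROR_RE.match(line)
        if match:
            file_no, block_no = int(match.group(2)), int(match.group(3))
            summary[current]["bad_blocks"].append((file_no, block_no))
            continue
        match = ROWS_RE.match(line)
        if match:
            summary[current]["rows"] = int(match.group(1))
    return summary
```

If a table is unloaded more than once in the same log, as the dictionary tables are here, the later pass overwrites the earlier entry.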

Database recovery after mistakenly deleted ASM disks left a disk group unable to mount

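A customer mistakenly deleted two LUNs that were ASM disks. Because of the way this particular storage array works, the data of a deleted LUN cannot be recovered at the storage level, so the customer gave up entirely on hardware-level recovery. As a result the ASM disk group could not be mounted: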

SQL> ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:27928:40938} */ 
NOTE: cache registered group DATA number=3 incarn=0x60fa38b1
NOTE: cache began mount (first) of group DATA number=3 incarn=0x60fa38b1
NOTE: Assigning number (3,0) to disk (/dev/rdisk/VD02_DBF)
NOTE: Assigning number (3,1) to disk (/dev/rdisk/VD03_DBF)
NOTE: Assigning number (3,2) to disk (/dev/rdisk/VD04_DBF)
NOTE: Assigning number (3,3) to disk (/dev/rdisk/VD05_DBF)
NOTE: Assigning number (3,4) to disk (/dev/rdisk/VD06_DBF)
Thu Dec 29 10:21:20 2022
NOTE: GMON heartbeating for grp 3
GMON querying group 3 at 29 for pid 29, osid 3770
NOTE: Assigning number (3,5) to disk ()
NOTE: Assigning number (3,6) to disk ()
GMON querying group 3 at 30 for pid 29, osid 3770
NOTE: cache dismounting (clean) group 3/0x60FA38B1 (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 3770, image: oracle@db1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 3/0x60FA38B1 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=3 incarn=0x60fa38b1
NOTE: cache deleting context for group DATA 3/0x60fa38b1
GMON dismounting group 3 at 31 for pid 29, osid 3770
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0004 in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
NOTE: Disk  in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "6" is missing from group number "3" 
ORA-15042: ASM disk "5" is missing from group number "3" 
ERROR: ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:27928:40938} */

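This customer apparently had three disk groups holding data files. Two of the seven disks (LUNs) in the DATA disk group were deleted, so DATA could not be mounted, and the customer wanted to recover as much of its data as possible. Because the two LUNs are completely gone, copying the data files out of ASM with a tool such as DUL is not feasible: much of the ASM metadata was also on the lost LUNs, so the copied data files would have too many defects and the recovery result would be poor. Instead, the method from "Recovery from a completely corrupted ASM disk header" was used to recover, at the block level, the data from every block that could still be recovered.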
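To give a rough idea of what scanning at the block level means, the sketch below walks a raw image in fixed-size steps and keeps the chunks that look like Oracle data blocks, grouping them by the file and block number stored in the block header. It is not the author's tool. It assumes an 8 KiB block size and the commonly described block header layout (type at byte 0, RDBA at bytes 4-7, SCN base at bytes 8-11, sequence at byte 14, and a 4-byte tail that repeats the low SCN bytes, type, and sequence). Byte order depends on the platform, so it is a parameter.

```python
import struct

BLOCK_SIZE = 8192  # assumption: 8 KiB database block size


def decode_rdba(rdba):
    """Split a relative data block address into (relative file#, block#)."""
    return rdba >> 22, rdba & 0x3FFFFF


def scan_blocks(data, block_size=BLOCK_SIZE, byteorder="<"):
    """Return (offset, file#, block#) for chunks whose header and tail agree."""
    found = []
    for off in range(0, len(data) - block_size + 1, block_size):
        blk = data[off:off + block_size]
        block_type = blk[0]
        rdba, scn_base = struct.unpack_from(byteorder + "II", blk, 4)
        seq = blk[14]
        (tail,) = struct.unpack_from(byteorder + "I", blk, block_size - 4)
        expected_tail = ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq
        if block_type != 0 and tail == expected_tail:
            file_no, block_no = decode_rdba(rdba)
            found.append((off, file_no, block_no))
    return found
```

Blocks found this way can be written back into per-file images at `block# * block_size`, which is why the approach does not depend on the lost ASM metadata.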

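Because the SYSTEM tablespace was involved (Oracle itself was badly damaged), the dictionary was recovered with the help of a SYSTEM backup file the customer had from several years earlier, and then as many data files as possible were recovered. In the end the maximum amount of data was recovered, keeping the customer's loss to a minimum.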

Oracle recovery after a forced drive letter change on Windows

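A customer reported that they forcibly changed a Windows drive letter without shutting down the database. The database then failed, and startup reported errors: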

Sat Feb 25 12:50:40 2023
Recovery of Online Redo Log: Thread 1 Group 2 Seq 10440 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Sat Feb 25 12:50:40 2023
Completed redo application
Sat Feb 25 12:50:40 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p001_5604.trc:
ORA-00600: 内部错误代码, 参数: [2037], [25801018], [2973409798], [6], [255], [25], [1198764346], [100796692]

Sat Feb 25 12:50:40 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_7648.trc:
ORA-07445: 出现异常错误: 核心转储 [ACCESS_VIOLATION][_kslwlmod+166][PC:0x469742][ADDR:0x54F8][UNABLE_TO_WRITE][]
ORA-04096: 触发器 '' 的 WHEN 子句过大, 限量为 2K

Sat Feb 25 12:50:40 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_7252.trc:
ORA-00600: internal error code, arguments: [ksuapc2], [258], [0], [2], [1], [2], [], []

Sat Feb 25 12:50:43 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p001_5604.trc:
ORA-00081: 地址范围 [0x77240440, 0x77240444) 不可读
ORA-07445: 出现异常错误: 核心转储 [ACCESS_VIOLATION][_ksl_cleanup+723][PC:0x46E373][ADDR:0x1C][UNABLE_TO_READ][]
ORA-00081: 地址范围 [0x77240440, 0x77240444) 不可读
ORA-00600: 内部错误代码, 参数: [2037], [25801018], [2973409798], [6], [255], [25], [1198764346], [100796692]

Sat Feb 25 12:50:45 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw0_6332.trc:
ORA-07445: ??????: ???? [ACCESS_VIOLATION][_kews_idle_wait+378][PC:0x604AE6][ADDR:0xED30C470][UNABLE_TO_WRITE][]

Sat Feb 25 12:50:48 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_7648.trc:
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参数: [kslwlflux:1], [0xAB805400], [0x549C], [2], [], [], [], []
ORA-00081: 地址范围 [0x74480443, 0x74480447) 不可读
ORA-00600: 内部错误代码, 参

Sat Feb 25 12:51:34 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_d000_8116.trc:
ORA-07445: ??????: ???? [ACCESS_VIOLATION][_kmcgms+121][PC:0x5D6C71][ADDR:0x50][UNABLE_TO_WRITE][]

Sat Feb 25 12:52:04 2023
USER: terminating instance due to error 472
Sat Feb 25 12:52:48 2023
USER: terminating instance due to error 472
Sat Feb 25 12:52:48 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_6252.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION][_ksuitm+631][PC:0x410C07][ADDR:0x1][UNABLE_TO_READ][]

Sat Feb 25 12:55:35 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_7656.trc:
ORA-07445: ??????: ???? [ACCESS_VIOLATION][_kews_idle_wait+378][PC:0x604AE6][ADDR:0xE530C470][UNABLE_TO_WRITE][]

After some recovery work the database opened, then crashed again:

Sat Feb 25 15:05:49 2023
SMON: enabling tx recovery
Sat Feb 25 15:05:49 2023
Database Characterset is ZHS16GBK
Sat Feb 25 15:05:50 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_5308.trc:
ORA-00600: 内部错误代码, 参数: [4194], [34], [31], [], [], [], [], []

Sat Feb 25 15:05:50 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_7568.trc:
ORA-00600: 内部错误代码, 参数: [kcbgtcr_13], [], [], [], [], [], [], []

Sat Feb 25 15:05:51 2023
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Feb 25 15:05:51 2023
Doing block recovery for file 2 block 2951
Block recovery from logseq 10441, block 78 to scn 109906860017
Sat Feb 25 15:05:51 2023
Recovery of Online Redo Log: Thread 1 Group 3 Seq 10441 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Block recovery stopped at EOT rba 10441.81.16
Block recovery completed at rba 10441.81.16, scn 25.2532677517
Doing block recovery for file 2 block 113
Block recovery from logseq 10441, block 78 to scn 109906859718
Sat Feb 25 15:05:52 2023
Recovery of Online Redo Log: Thread 1 Group 3 Seq 10441 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 10441.80.16, scn 25.2532677516
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=30, OS id=6904
Sat Feb 25 15:05:53 2023
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Sat Feb 25 15:05:53 2023
Completed: alter database open
Sat Feb 25 15:05:53 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_5400.trc:
ORA-00600: 内部错误代码, 参数: [4194], [6], [4], [], [], [], [], []

Sat Feb 25 15:05:54 2023
DEBUG: Replaying xcb 0xac458698, pmd 0x8d7e9b9c for failed op 8
Doing block recovery for file 2 block 1515
No block recovery was needed
Sat Feb 25 15:05:55 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_5400.trc:
ORA-00600: 内部错误代码, 参数: [4194], [6], [4], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [4194], [6], [4], [], [], [], [], []

Sat Feb 25 15:05:56 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_5400.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [6], [4], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [6], [4], [], [], [], [], []

Sat Feb 25 15:05:57 2023
Doing block recovery for file 2 block 2951
Block recovery from logseq 10441, block 78 to scn 109906860017
Sat Feb 25 15:05:57 2023
Recovery of Online Redo Log: Thread 1 Group 3 Seq 10441 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 10441.81.16, scn 25.2532677620
Doing block recovery for file 2 block 113
Block recovery from logseq 10441, block 78 to scn 109906860083
Sat Feb 25 15:05:57 2023
Recovery of Online Redo Log: Thread 1 Group 3 Seq 10441 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 10441.138.16, scn 25.2532677684
Sat Feb 25 15:05:57 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j003_1716.trc:
ORA-12012: 自动执行作业 42568 出错
ORA-00607: 当更改数据块时出现内部错误
ORA-00607: 当更改数据块时出现内部错误

Sat Feb 25 15:05:59 2023
Flush retried for xcb 0xac4c5a80, pmd 0x8c0cec74
Doing block recovery for file 2 block 2951
Block recovery from logseq 10441, block 78 to scn 109906860017
Sat Feb 25 15:05:59 2023
Recovery of Online Redo Log: Thread 1 Group 3 Seq 10441 Reading mem 0
  Mem# 0 errs 0: G:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Sat Feb 25 15:05:59 2023
DEBUG: Replaying xcb 0xac458698, pmd 0x8d7e9b9c for failed op 8
Doing block recovery for file 2 block 1515
No block recovery was needed
Sat Feb 25 15:05:59 2023
Block recovery completed at rba 10441.81.16, scn 25.2532677620
Sat Feb 25 15:06:00 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j002_6400.trc:
ORA-00600: internal error code, arguments: [4194], [6], [4], [], [], [], [], []

Sat Feb 25 15:06:02 2023
DEBUG: Replaying xcb 0xac458698, pmd 0x8d7e9b9c for failed op 8
Doing block recovery for file 2 block 1515
No block recovery was needed
Sat Feb 25 15:06:02 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_pmon_1076.trc:
ORA-00600: 内部错误代码, 参数: [4194], [6], [4], [], [], [], [], []

Sat Feb 25 15:06:03 2023
PMON: terminating instance due to error 472
Sat Feb 25 15:06:03 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j007_7188.trc:
ORA-00472: PMON 进程因错误而终止

Sat Feb 25 15:06:03 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j006_7624.trc:
ORA-00472: PMON 进程因错误而终止

Sat Feb 25 15:06:03 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j005_5688.trc:
ORA-00472: PMON  process terminated with error

Sat Feb 25 15:06:10 2023
Errors in file g:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_7568.trc:
ORA-00472: PMON 进程因错误而终止

Instance terminated by PMON, pid = 1076
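The log above prints SCNs in two notations: as a single number (`to scn 109906860017`) and as `wrap.base` (`scn 25.2532677517`). The two are related by SCN = wrap × 2^32 + base. A small helper, with a name chosen here for illustration, makes them comparable:

```python
def scn_from_wrap_base(text):
    """Convert an SCN printed as 'wrap.base' into a single integer."""
    wrap, base = text.split(".")
    return (int(wrap) << 32) + int(base)


if __name__ == "__main__":
    print(scn_from_wrap_base("25.2532677517"))  # 109906859917
```

So the block recovery that completed at `scn 25.2532677517` ended at 109906859917, just below the requested target of 109906860017.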

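The ORA-600 [4194] error is mainly caused by an undo problem, which in turn brought down PMON and produced ORA-00472. After the undo was dealt with, the database opened and stayed up, the data was exported logically, and the recovery was completed with the customer's data fully recovered.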
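When an alert log mixes many internal errors, counting the first ORA-00600 argument helps show which one dominates. The sketch below is a minimal example that works on both the Chinese and English message variants seen in this log, since it only relies on the error number and the bracketed arguments. The function name is an illustrative choice.

```python
import re
from collections import Counter

# First bracketed argument on the same line as ORA-00600.
ORA600_RE = re.compile(r"ORA-00600: [^\[\n]*\[([^\]\n]*)\]")


def ora600_first_args(alert_text):
    """Count ORA-00600 occurrences by their first argument."""
    return Counter(ORA600_RE.findall(alert_text))


if __name__ == "__main__":
    sample = (
        "ORA-00600: 内部错误代码, 参数: [4194], [6], [4], [], [], [], [], []\n"
        "ORA-00600: internal error code, arguments: [4194], [6], [4], [], [], [], [], []\n"
        "ORA-00600: 内部错误代码, 参数: [kcbgtcr_13], [], [], [], [], [], [], []\n"
    )
    print(ora600_first_args(sample).most_common())
```

Lines that were cut off before the first bracket, as happens once in the log above, are simply not counted.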

Another case: recovering ASM disks that were mistakenly added to a VG, with the LV then extended

Yet another customer added three ASM disks to a VG (the root VG on both nodes) and then extended an LV as well.


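After adding them, an expdp export was run into that LV (it failed halfway and the database crashed), which caused a large number of blocks on the ASM disks to be overwritten by the dmp file that expdp was writing to the file system.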

kfed analysis of the damaged disks indicated that the damage extends to roughly the 90 GB mark.
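The post does not show how the 90 GB boundary was located. One way to probe a disk with kfed is to read the first block of allocation units at regular intervals and see where valid ASM metadata or data starts to reappear. The sketch below only does the arithmetic and prints the commands. It assumes a 1 MiB allocation unit and uses a hypothetical device name; the helper names are illustrative.

```python
AU_SIZE = 1024 * 1024  # assumption: default 1 MiB allocation unit
GIB = 1024 ** 3


def au_for_offset(offset_bytes, au_size=AU_SIZE):
    """Return the allocation unit number that contains a byte offset."""
    return offset_bytes // au_size


def kfed_sample_commands(device, upto_bytes, step_bytes, au_size=AU_SIZE):
    """Build `kfed read` commands that sample one AU every step_bytes."""
    commands = []
    for offset in range(0, upto_bytes + 1, step_bytes):
        aun = au_for_offset(offset, au_size)
        commands.append(f"kfed read {device} aun={aun} blkn=0")
    return commands


if __name__ == "__main__":
    # /dev/sdj is a hypothetical device name used for illustration.
    for command in kfed_sample_commands("/dev/sdj", 90 * GIB, 10 * GIB):
        print(command)
```

With a 1 MiB allocation unit, the 90 GB mark corresponds to AU 92160. If the disk group was created with a different AU size, the `au_size` argument has to match it.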

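For ASM disks with this much damage, ordinary kfed repair is not possible; the only option is a low-level, block-based scan and recovery. See:
Recovery from a completely corrupted ASM disk header
The data files were recovered through this low-level processing.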

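Then, using an available backup from two days earlier together with the DUL tool, the most recent data was recovered and merged at the application level, completing this recovery.
Similar recovery cases:
Recovering an ASM disk that was added to a VG
Another case of an ASM disk added to a VG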