ORA-01092: ORACLE 例程终止 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01092: ORACLE 例程终止 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-01092: ORACLE 例程终止。强行断开连接 错误

SQL> RECOVER DATABASE;
完成介质恢复。
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR 位于第 1 行:
ORA-01092: ORACLE 例程终止。强行断开连接

查看alert日志

Wed Jul 21 12:32:04 2021
SMON: enabling cache recovery
Wed Jul 21 12:32:04 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_3004.trc:
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []

Wed Jul 21 12:32:05 2021
Recovery of Online Redo Log: Thread 1 Group 2 Seq 495 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 2 Seq 495 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO02.LOG
Wed Jul 21 12:32:05 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_3004.trc:
ORA-00604: ?? SQL ? 1 ????
ORA-00607: ?????????????
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []

Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Wed Jul 21 12:32:05 2021
Errors in file c:\oracle\admin\dcpdm\bdump\dcpdm_pmon_13020.trc:
ORA-00604: error occurred at recursive SQL level 

Instance terminated by USER, pid = 3004
ORA-1092 signalled during: ALTER DATABASE OPEN...

trace文件信息

*** 2021-07-21 12:32:04.000
ksedmp: internal or fatal error
ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
_ksedmp+147          CALLrel  _ksedst+0            
_ksfdmp.108+e        CALLrel  _ksedmp+0            3
_kgeriv+89           CALLreg  00000000             4E59D98 3
_kseipre.107+3f      CALLrel  _kgeriv+0            
_ksesic2+24          CALLrel  _kseipre.107+0       
__VInfreq__kturdb+8  CALLrel  _ksesic2+0           1062 0 22 0 8
b                                                  
_kcoapl+1df          CALLreg  00000000             2BB0F94 2BB100A 11 6C37C014
_kcbapl+71           CALLrel  _kcoapl+0            2BB0F90 6C37C000 1 0 2000
_kcrfwr+734          CALLrel  _kcbapl+0            2BB0F90 6C3FC788 50D4FA0
_kcbchg1+7ec         CALLrel  _kcrfwr+0            
_ktuchg+630          CALLrel  _kcbchg1+0           0 4 50D5228 50D5240 0 0
_ktbchg2+75          CALLrel  _ktuchg+0            2 66F589A4 1 2C8CD14 2C8CD1C
                                                   2BB0F90 2C8C32C 2BB0ED0 0 0
_kddchg+18f          CALLrel  _ktbchg2+0           0 66F589A4 2C8CD14 2C8CD1C
                                                   2BB0F90 2C8C324 2BB0ED0 0 0
_kduovw.53+6e3       CALLrel  _kddchg+0            2C8C2E8 2C8CD14 2C8CD1C
                                                   2BB0F90 2BB0ED0 0 0
_kduurp.53+61a       CALLrel  _kduovw.53+0         2C8C2E8
_kdusru+aa5          CALLrel  _kduurp.53+0         2C8C2E8 66F589FC
_kauupd+12e          CALLrel  _kdusru+0            2C8C71C 66F589FC 2C8C2E8 0
_updrow+729          CALLrel  _kauupd+0            2C8C718 66F589FC 2C8C2E8 0
                                                   66F58448 E F 66F60EE0 12
                                                   50DBBA4 50DBBA8
_qerupFetch+107      CALLrel  _updrow+0            
_updaul+202          CALL???  00000000             66F58660 0 66F6BC3C 7FFF
_updThreePhaseExe+b  CALLrel  _updaul+0            66F6B9D0 50DBD34 0
6                                                  
_updexe+105          CALLrel  _updThreePhaseExe+0  66F6B9D0 0 2C8C2E8 50DBE10
                                                   66F6B9D0 1 50DBE10 0
_opiexe+f97          CALLrel  _updexe+0            66F6B9D0 50DBF4C
_opiodr+4cd          CALLreg  00000000             4 3 50DC898
_rpidrus.43+99       CALLrel  _opiodr+0            4 3 50DC898 A
_skgmstack+71        CALLreg  00000000             50DC488
_rpidru+6d           CALLrel  _skgmstack+0         50DC4A0 4E59C20 F618 778198
                                                   50DC488
_rpiswu2+17e         CALLreg  00000000             50DC7C0
_rpidrv+109          CALLrel  _rpiswu2+0           
_rpiexe+33           CALLrel  _rpidrv+0            A 4 50DC898 8
_ktuscu+2a8          CALLrel  _rpiexe+0            A
_kqrcmt+2c2          CALL???  00000000             66F6D654 3
..1.18_2.filter.95+  CALLrel  _kqrcmt+0            67B88CD4 1 0 4E59D98 4E59D98
159                                                FF 0 0 0
..1.23_5.filter.99+  CALLrel  _ktcrcm+0            67B88CD4 0 0 0 0 1 0 0
14d                                                
_ktuini+64           CALLrel  _ktuiup.99+0         50DD994
_adbdrv+2665         CALLrel  _ktuini+0            50DD994
..1.5_1.filter.29+2  CALLrel  _adbdrv+0            
9d                                                 
_opiosq0+9a4         CALLrel  _opiexe+0            4 0 50DDDDC
_kpooprx+c6          CALLrel  _opiosq0+0           3 E 50DDE74 24
_kpoal8+225          CALLrel  _kpooprx+0           50DE73C 50DE684 13 1 0 24
_opiodr+4cd          CALLreg  00000000             5E 14 50DE738
_ttcpip+a86          CALLreg  00000000             5E 14 50DE738 0
_opitsk+2f4          CALLrel  _ttcpip+0            
_opiino+5fc          CALLrel  _opitsk+0            0 0 4E5FEE8 2BDF044 F3 0
_opiodr+4cd          CALLreg  00000000             3C 4 50DFBD8
_opidrv+233          CALLrel  _opiodr+0            3C 4 50DFBD8 0
_sou2o+19            CALLrel  _opidrv+0            
_opimai+10a          CALLrel  _sou2o+0             
_OracleThreadStart@  CALLrel  _opimai+0            
4+35c                                              
7C824826             CALLreg  00000000             
 
--------------------- Binary Stack Dump ---------------------

比较明显时候由于在更新undo$的时候需要找前镜像信息

Block image after block recovery:
buffer tsn: 0 rdba: 0x0040018b (1/395)
scn: 0x0000.07d52871 seq: 0x01 flg: 0x04 tail: 0x28710201
frmt: 0x02 chkval: 0xc85e type: 0x02=KTU UNDO BLOCK
 
********************************************************************************
UNDO BLK:  
xid: 0x0000.05a.0000002d  seq: 0x33  cnt: 0x22  irb: 0x22  icl: 0x0   flg: 0x0000
 
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x1f04     0x02 0x1e20     0x03 0x1d3c     0x04 0x1c58     0x05 0x1b74     
0x06 0x1a90     0x07 0x19ac     0x08 0x18c8     0x09 0x17e4     0x0a 0x1700     
0x0b 0x161c     0x0c 0x1538     0x0d 0x1454     0x0e 0x1370     0x0f 0x128c     
0x10 0x11a8     0x11 0x10c4     0x12 0x0fe0     0x13 0x0efc     0x14 0x0e18     
0x15 0x0d34     0x16 0x0c50     0x17 0x0b6c     0x18 0x0a88     0x19 0x09a4     
0x1a 0x08c0     0x1b 0x07dc     0x1c 0x06f8     0x1d 0x0614     0x1e 0x0530     
0x1f 0x044c     0x20 0x0368     0x21 0x0284     0x22 0x01a0     
 
*-----------------------------
* Rec #0x1  slt: 0x0b  objn: 15(0x0000000f)  objd: 15  tblspc: 0(0x00000000)
*       Layer:  11 (Row)   opc: 1   rci 0x00   
Undo type:  Regular undo    Begin trans    Last buffer split:  No 
Temp Object:  No 
Tablespace Undo:  No 
rdba: 0x00000000
*-----------------------------
uba: 0x0040018a.0033.22 ctl max scn: 0x0000.07853941 prv tx scn: 0x0000.07853943
KDO undo record:
KTB Redo 
op: 0x04  ver: 0x01  
op: L  itl: xid:  0x0000.042.0000002d uba: 0x0040018a.0033.22
                      flg: C---    lkc:  0     scn: 0x0000.07d23460
KDO Op code: URP row dependencies Disabled
  xtype: XA  bdba: 0x0040006a  hdba: 0x00400069
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 7(0x7) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [ 9]  5f 53 59 53 53 4d 55 37 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 03
col  4: [ 3]  c2 02 06
col  5: [ 6]  c5 02 20 14 40 24
col  6: [ 1]  80
col  7: [ 4]  c3 0e 21 2d
col  8: [ 3]  c2 1b 34
col  9: [ 1]  80
col 10: [ 2]  c1 03
col 11: [ 2]  c1 02
col 16: [ 2]  c1 02

这部分信息异常,导致数据库update undo$的时候报ORA-00600: ?????????: [4194], [34], [8], [], [], [], [], []错误,通过修改对应的block信息,数据库正常open成功

SQL> alter database open;

数据库已更改。

但是关闭数据库又报ORA-600 4194错误

SQL> shutdown immediate;
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [94], [61], [], [], [], [], []

alert日志信息

Wed Jul 21 12:58:42 2021
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Waiting for dispatcher 'D000' to shutdown
All dispatchers and shared servers shutdown
Wed Jul 21 12:58:45 2021
ALTER DATABASE CLOSE NORMAL
Wed Jul 21 12:58:45 2021
Errors in file c:\oracle\admin\dcpdm\udump\dcpdm_ora_13628.trc:
ORA-00600: 内部错误代码,参数: [4194], [94], [61], [], [], [], [], []

Recovery of Online Redo Log: Thread 1 Group 3 Seq 496 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO03.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 496 Reading mem 0
  Mem# 0 errs 0: C:\ORACLE\ORADATA\DCPDM\REDO03.LOG
ORA-607 signalled during: ALTER DATABASE CLOSE NORMAL...

通过重建undo,数据库启动关闭正常,也没有再报其他错误,建议逻辑方式重建库
参考以前的类似文章:
数据库报ORA-00607/ORA-00600[4194]错误
使用bbed解决ORA-00607/ORA-00600[4194]故障
使用bbed解决ORA-00607/ORA-00600[4194]故障

ora-600 kfdpMetaBlk_pickle 故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ora-600 kfdpMetaBlk_pickle 故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户反馈集群的crs无法正常启动观察发现是由于gmon进程crash asm实例导致,经过测试确认是在mount data磁盘组的时候会触发给问题

SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 7517
Session ID: 918 Serial number: 5

对应的alert日志报ORA-600 [kfdpMetaBlk_pickle:01], [4294967295]错误

SQL> alter diskgroup data mount
NOTE: cache registered group DATA number=2 incarn=0x3078f05f
NOTE: cache began mount (first) of group DATA number=2 incarn=0x3078f05f
NOTE: Assigning number (2,1) to disk (/dev/rdisk/disk93)
NOTE: Assigning number (2,3) to disk (/dev/rdisk/disk96)
NOTE: Assigning number (2,2) to disk (/dev/rdisk/disk94)
NOTE: Assigning number (2,0) to disk (/dev/rdisk/disk92)
Sat Jul 17 05:21:01 2021
Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc  (incident=255833):
ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/crs_base/diag/asm/+asm/+ASM2/incident/incdir_255833/+ASM2_gmon_7457_i255833.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_gmon_7457.trc:
ORA-00600: internal error code, arguments: [kfdpMetaBlk_pickle:01], [4294967295], [0], [], [], [], [], [], [], [], [], []
GMON (ospid: 7457): terminating the instance due to error 493
Sat Jul 17 05:21:03 2021
System state dump requested by (instance=2, osid=7457 (GMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/crs_base/diag/asm/+asm/+ASM2/trace/+ASM2_diag_7429.trc
Instance terminated by GMON, pid = 7457

对于ORA-600 [kfdpMetaBlk_pickle:01], [4294967295]错误,查询了mos没有任何有效信息
kfdpMetaBlk_pickle


对应的trace文件发现如下信息

2021-07-17 03:51:16.277603*:800002A2:KGF:kgfdputl.c@1411:kgfdpMetaSet_getMaxClique():   inc=2 ver=4294967295
2021-07-17 03:51:16.277619 :800002A3:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2
2021-07-17 03:51:16.277620 :800002A4:KFDP:kfdp.c@9314:kfdpMetaSet_filterOld(): filtered old meta on disk 2
2021-07-17 03:51:16.277992 :800002A5:KFDP:kfdp.c@9417:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle upto 6 metablks
2021-07-17 03:51:16.277993 :800002A6:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 3
2021-07-17 03:51:16.278154 :800002A7:KFDP:kfdp.c@9425:kfdpMetaSet_readDta():kfdpMetaSet_readDta unpickle metablk for disk 1
2021-07-17 03:51:16.278268 :800002A8:KFDP:kfdp.c@5851:kfdp_read(): kfdp_read end ok=1
2021-07-17 03:51:16.278277 :800002A9:KFDP:kfdp.c@7073:kfdp_doQuery(): kfdp_doQuery   rewrite_kfdp=1
2021-07-17 03:51:16.278282 :800002AA:KFDP:kfdp.c@12511:kfdpLckValue_pickle(): kfdpLckValue_pickle size=0 
                            endian=0xff ndisks=0 lckvalid=0
2021-07-17 03:51:16.278293 :800002AB:db_trace:kfdp.c@12803:kfdpLck_convPriv(): [10499:19:396] 
                            kfdpLck_conv: grp=1, type=0, mode=5, line=7155
2021-07-17 03:51:16.278294 :800002AC:KFDP:kfdp.c@12663:kfdpLckValue_unpickle(): kfdpLckValue_unpickle
                            size=28 res=0 ok=0 ver=-1 dcnt=0 lckvalid=0 flags=0x2 inst=0 (I am 2) version=0
2021-07-17 03:51:16.278499*:800002AD:KGF:kgfdputl.c@485:kgfdpDta_getAllDsks(): kgfdpDta_getAllDsks using 
                            saved iterator 0x9ffffffffd571220 with 4 disks
2021-07-17 03:51:16.278688 :800002AE:KFDP:kfdp.c@5566:kfdp_write(): kfdp_write: pstDskCnt=3 grow=0 degenerate=0
2021-07-17 03:51:16.278688*:800002AF:KGF:kgfdputl.c@2619:kgfdpTraceSet(): writing pst to disks (n=3): 0 1 3

通过删除信息,基本上可以确认由于pst信息异常(pst中记录的只有0 1 3三个磁盘,认为2是老磁盘),但是实际磁盘为4个,导致gmon进程异常.通过底层解决该问题,数据库恢复成功

SQL> recover database using backup controlfile;
ORA-00279: change 30075814973 generated at 07/17/2021 01:12:08 needed for
thread 2
ORA-00289: suggestion : +FRA
ORA-00280: change 30075814973 for thread 2 is in sequence #120561


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_16
ORA-00279: change 30075814973 generated at 07/17/2021 01:11:54 needed for
thread 1
ORA-00289: suggestion :
+FRA/xff/archivelog/2021_07_17/thread_1_seq_79949.1543.1078103529
ORA-00280: change 30075814973 for thread 1 is in sequence #79949


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_13
ORA-00279: change 30075815013 generated at 07/17/2021 01:12:09 needed for
thread 1
ORA-00289: suggestion : +FRA
ORA-00280: change 30075815013 for thread 1 is in sequence #79950
ORA-00278: log file '/tmp/asm/group_13' no longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/tmp/asm/group_11
Log applied.
Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

运气不错,对于该故障的恢复,实现数据0丢失.

删除分区 oracle asm disk 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:删除分区 oracle asm disk 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友数据库故障请求case.大概操作是这样的:有一个39T的lun,通过parted分了15个分区,给oracle asm使用创建磁盘组data4,然后分了4个分区做成data5(由于ausize写错误了),删除掉磁盘组和这四个分区.然后重新分配了6个分区,并且使用最后5个分区创建了data5磁盘组.使用了一段时间之后,由于oracle空间不足,检查的时候误以为这个lun就前面15个分区使用,人工把后面的6个分区给删除了,并且创建了4个新分区,然后发现数据库crash了,发现误删除了在使用的分区.然后又把新创建的4个分区给删除了.接手该故障的时候,这个39T lun的分区信息如下

[root@node1 linux64]# parted /dev/mapper/36000d31003d39e000000000000000004
GNU Parted 2.1
Using /dev/mapper/36000d31003d39e000000000000000004
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print                                                            
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/36000d31003d39e000000000000000004: 39.6TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name         Flags
 1      2097kB  2000GB  2000GB               asmdata4-1
 2      2000GB  4000GB  2000GB               asmdata4-2
 3      4000GB  6000GB  2000GB               asmdata4-3
 4      6000GB  8000GB  2000GB               asmdata4-4
 5      8000GB  10.0TB  2000GB               asmdata4-5
 6      10.0TB  12.0TB  2000GB               asmdata4-6
 7      12.0TB  14.0TB  2000GB               asmdata4-7
 8      14.0TB  16.0TB  2000GB               asmdata4-8
 9      16.0TB  18.0TB  2000GB               asmdata4-9
10      18.0TB  20.0TB  2000GB               asmdata4-10
11      20.0TB  22.0TB  2000GB               asmdata4-11
12      22.0TB  24.0TB  2000GB               asmdata4-12
13      24.0TB  26.0TB  2000GB               asmdata4-13
14      26.0TB  28.0TB  2000GB               asmdata4-14
15      28.0TB  30.0TB  2000GB               asmdata4-15

客户正常使用情况下,这个lun上面相关分区的asm disk信息

 SQL> CREATE DISKGROUP DATA4 EXTERNAL REDUNDANCY  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p1' SIZE 1907346M ,
 '/dev/mapper/36000d31003d39e000000000000000004p2' SIZE 1907350M ,
 '/dev/mapper/36000d31003d39e000000000000000004p3' SIZE 1907348M ,
 '/dev/mapper/36000d31003d39e000000000000000004p4' SIZE 1907348M ,
 '/dev/mapper/36000d31003d39e000000000000000004p5' SIZE 1907350M  
ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='8M' /* ASMCA */ 

SQL> ALTER DISKGROUP DATA4 ADD  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p10' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p6' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p7' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p8' SIZE 1907350M ,
'/dev/mapper/36000d31003d39e000000000000000004p9' SIZE 1907348M /* ASMCA */ 

SQL> ALTER DISKGROUP DATA4 ADD  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p11' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p12' SIZE 1907350M ,
'/dev/mapper/36000d31003d39e000000000000000004p13' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p14' SIZE 1907348M ,
'/dev/mapper/36000d31003d39e000000000000000004p15' SIZE 1907350M /* ASMCA */ 

SQL> CREATE DISKGROUP DATA5 EXTERNAL REDUNDANCY  DISK 
'/dev/mapper/36000d31003d39e000000000000000004p17' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p18' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p19' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p20' SIZE 1716614M ,
'/dev/mapper/36000d31003d39e000000000000000004p21' SIZE 1621246M  
ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='4M' /* ASMCA */ 

基于客户现在的情况,data4中的所有分区都正常,主要是要找出来data5中的5个分区的数据.因为客户不确定p16分区大小,导致后续的5个分区起始位置不好定位.从而使得恢复无法进行.通过shell脚本结合kfed尝试定位asm disk header信息

#!/bin/bash
j=xxxxxxxxxxx
for ((i=xxxxxx; i<=j; i++))
do
 echo "-----$i--------" >> /home/get_au1.txt
 kfed read /dev/mapper/36000d31003d39e000000000000000004 aun=$i |
 > grep  "kfdhdb.dskname:              DATA" >> /home/get_au.txt
done

结果发现无法获取到结果,通过分析发现这里由于lun过大,导致aun值过大,从而使得kfed溢出无法读取到正常值.根据parted的特性,人工dd部分block进行分析

[root@node1 bak]# kfed read xifenfei.dd aun=134|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3357988283 ; 0x00c: 0xc826d5bb
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:              DATA5_0000 ; 0x028: length=10
kfdhdb.grpname:                   DATA5 ; 0x048: length=5
kfdhdb.fgname:               DATA5_0000 ; 0x068: length=10
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33116450 ; 0x0a8: HOUR=0x2 DAYS=0x9 MNTH=0x4 YEAR=0x7e5
kfdhdb.crestmp.lo:            477378560 ; 0x0ac: USEC=0x0 MSEC=0x10e SECS=0x7 MINS=0x7
kfdhdb.mntstmp.hi:             33116450 ; 0x0b0: HOUR=0x2 DAYS=0x9 MNTH=0x4 YEAR=0x7e5
kfdhdb.mntstmp.lo:            486256640 ; 0x0b4: USEC=0x0 MSEC=0x2ec SECS=0xf MINS=0x7
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  4194304 ; 0x0bc: 0x00400000
kfdhdb.mfact:                    454272 ; 0x0c0: 0x0006ee80
kfdhdb.dsksize:                  429153 ; 0x0c4: 0x00068c61
kfdhdb.pmcnt:                         2 ; 0x0c8: 0x00000002

顺利找到了data5中的第一块磁盘,而且确定了起始位置,然后构造相关的dd语句把分区的数据dd到一个新磁盘中

dd if=/dev/mapper/36000d31003d39e000000000000000004 bs=4M skip=xxxxx count=429153 of=/dev/sdu

然后通过kfed查看数据
20210704160347


通过类似方法依次处理,最终把5块asm disk全部找到,并且顺利dd到新的磁盘中.尝试启动crs,并mount data5
20210704153634

20210704153548

data5 磁盘组mount成功之后,数据库顺利启动,实现lun中删除分区之后,asm磁盘组数据完美恢复
20210704153728

这次运气还不错,仅仅是对lun的分区使用了parted进行了删除和创建等操作,没有格式化文件系统和做成新的asm disk,不然数据会有一部分丢失.对于有部分破坏的分区,需要通过底层碎片的方法进行最大限度抢救数据.参考类似文档:
asm disk被加入vg恢复
又一例asm格式化文件系统恢复
文件系统损坏导致数据文件异常恢复
一次完美的asm disk被格式化ntfs恢复
Oracle 数据文件大小为0kb或者文件丢失恢复
再一起asm disk被格式化成ext3文件系统故障恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例

drop tablesapce 数据恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:drop tablesapce 数据恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户执行了drop tablespace xxx including contents and datafiles,虽然删除命令返回错误,但是该表空间中大量对象被删除,文件没有被从系统层面删除

Tue Jun 29 14:38:26 2021
DROP TABLESPACE "XFF" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS
Tue Jun 29 14:38:32 2021
Thread 1 advanced to log sequence 975 (LGWR switch)
  Current log# 3 seq# 975 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO03.LOG
ORA-604 signalled during: DROP TABLESPACE "XFF" INCLUDING CONTENTS AND DATAFILES CASCADE CONSTRAINTS...
Tue Jun 29 14:40:44 2021
ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M
Completed: ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M
ALTER TABLESPACE "XFF" DROP DATAFILE  'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF'
WARNING: Cannot delete file D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF
Errors in file d:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_2528.trc:
ORA-01265: 无法删除 DATA D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF
ORA-27056: 无法删除文件
OSD-04024: 无法删除文件。
O/S-Error: (OS 32) 另一个程序正在使用此文件,进程无法访问。
Completed: ALTER TABLESPACE "XFF" DROP DATAFILE  'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF02.DBF'
Tue Jun 29 14:58:51 2021
ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M MAXSIZE UNLIMITED
Completed: ALTER DATABASE DATAFILE 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XFF'  AUTOEXTEND ON NEXT 50M MAXSIZE UNLIMITED

通过工具获取obj$字典信息,结合user$,判断业务用户表数据基本上全部删除
20210701193502


对于这类情况,常规方法无法恢复,只能按照表被drop的思路进行恢复.参考文章:dul恢复drop表测试,通过结合客户提供的表结构和一些特殊方法恢复出来所有对象的obj#,dataobj#信息,快速的恢复了客户数据,得到客户认可
20210701194153

ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

asm磁盘组增加磁盘进行扩容之后报ORA-15335: ASM metadata corruption detected in disk group ‘DATA’和ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479],磁盘组dismount,然后mount之后立马dismount掉.

Tue Jun 29 09:19:09 2021
SQL> ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,1) to disk (/dev/raw/raw5)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk DATA_0001
NOTE: requesting all-instance disk validation for group=2
Tue Jun 29 09:19:11 2021
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: initiating PST update: grp = 2
Tue Jun 29 09:19:16 2021
GMON updating group 2 at 7 for pid 27, osid 25020
NOTE: PST update grp = 2 completed successfully 
NOTE: membership refresh pending for group 2/0xb0c845ce (DATA)
GMON querying group 2 at 8 for pid 18, osid 3852
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/raw/raw5
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 2 at 9 for pid 18, osid 3852
SUCCESS: refreshed membership for 2/0xb0c845ce (DATA)
Tue Jun 29 09:19:20 2021
SUCCESS: ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */
NOTE: starting rebalance of group 2/0xb0c845ce (DATA) at power 1
Starting background process ARB0
Tue Jun 29 09:19:21 2021
ARB0 started with pid=33, OS id=25176 
NOTE: assigning ARB0 to group 2/0xb0c845ce (DATA) with 1 parallel I/O
cellip.ora not found.
Tue Jun 29 09:19:24 2021
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Jun 29 09:19:46 2021
WARNING: cache read  a corrupt block: group=2(DATA) dsk=0 blk=7 disk=0 (DATA_0000) incarn=3915953476 au=0 blk=7 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ERROR: cache failed to read group=2(DATA) dsk=0 blk=7 from disk(s): 0(DATA_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _arb0_+asm1 (25176) initiating offline of disk 0.3915953476 (DATA_0000) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 0/0xe968b544, mask = 0x6a, op = clear
Tue Jun 29 09:19:46 2021
GMON updating disk modes for group 2 at 10 for pid 33, osid 25176
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Jun 29 09:19:46 2021
NOTE: cache dismounting (not clean) group 2/0xB0C845CE (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25395, image: oracle@frsrac1 (B000)
Tue Jun 29 09:19:46 2021
NOTE: halting all I/Os to diskgroup 2 (DATA)
Tue Jun 29 09:19:46 2021
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=11.10715 last written ABA 11.10715
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc  (incident=54665):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_54665/+ASM1_arb0_25176_i54665.trc
Tue Jun 29 09:19:46 2021
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
Tue Jun 29 09:19:46 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 24)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 2 invalid = TRUE 
 796 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Jun 29 09:19:46 2021
WARNING: dirty detached from domain 2
NOTE: cache dismounted group 2/0xB0C845CE (DATA) 
SQL> alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */ 
Tue Jun 29 09:19:47 2021
ERROR: ORA-15130 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Tue Jun 29 09:19:47 2021
NOTE: stopping process ARB0
Tue Jun 29 09:19:47 2021
Sweep [inc][54665]: completed
Tue Jun 29 09:19:47 2021
Sweep [inc2][54665]: completed
NOTE: cache deleting context for group DATA 2/0xb0c845ce
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3852.trc:
ORA-15130: diskgroup "DATA" is being dismounted
GMON dismounting group 2 at 11 for pid 27, osid 25395
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */

通过kfed分析报错block,确认错误

kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       7 ; 0x004: blk=7
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2183628676 ; 0x00c: 0x82278784 <<======该值错误,应该为:686982479
kfbh.fcn.base:                     3430 ; 0x010: 0x00000d66
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdatb.aunum:                      2240 ; 0x000: 0x000008c0
kfdatb.shrink:                      448 ; 0x004: 0x01c0
kfdatb.ub2pad:                        0 ; 0x006: 0x0000

通过修复该错误,并且禁止reblance操作[增加磁盘数据需要重新分布],mount磁盘组,然后open库,发现redo已经被覆盖(非归档),强制打开库报错

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [2691201882], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [0], [2691201881], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [2691201879], [0],
[2691227745], [12583040], [], [], [], [], [], []
Process ID: 25110
Session ID: 287 Serial number: 3

通过对scn进行处理,数据库顺利open

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 5044088832 bytes
Fixed Size		    2261928 bytes
Variable Size		 1442843736 bytes
Database Buffers	 3590324224 bytes
Redo Buffers		    8658944 bytes
Database mounted.
SQL> alter database open;

Database altered.