ORA-01110 ORA-17070 OSD-04006 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01110 ORA-17070 OSD-04006 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我说应用访问数据库和导出数据都报ORA-01110 ORA-17070 OSD-04006之类错误,数据库可以正常open,但是业务访问关键数据和导出报错
20220611175122
20220611175156


对于这个错误,根据以往恢复经验,初步判断可能硬件异常(比如坏道,硬件故障)或者文件系统异常引起,让客户尝试拷贝该文件,确认该文件也无法拷贝
20220611174852

对于这种情况,如果放弃该文件,恢复其他文件数据,那样数据丢失比例太大,直接通过特定恢复工具对其损坏文件进行拷贝,最大限度强求当前文件数据,发现一些扇区损坏跳过继续拷贝
20220611174928

通过坏块检查工具进行检查确认该文件76个block损坏(对于32G的数据文件损坏1M数据,比较好效果)
20220611174909

对坏块进行处理,然后使用expdp导出数据,最大限度抢救数据
20220611175039

fdisk分区导致asm disk破坏数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:fdisk分区导致asm disk破坏数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

尝试mount data磁盘组

SQL> alter diskgroup DATADG mount 
NOTE: cache registered group DATADG number=1 incarn=0xbc43fafd
NOTE: cache began mount (first) of group DATADG number=1 incarn=0xbc43fafd
NOTE: Assigning number (1,0) to disk (/dev/raw/raw2)
Thu Jun 02 10:14:33 2022
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 27 for pid 27, osid 3853
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 28 for pid 27, osid 3853
NOTE: cache dismounting (clean) group 1/0xBC43FAFD (DATADG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 3853, image: oracle@node1 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xBC43FAFD (DATADG) 
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0xbc43fafd
NOTE: cache deleting context for group DATADG 1/0xbc43fafd
GMON dismounting group 1 at 29 for pid 27, osid 3853
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "1" 
ERROR: alter diskgroup DATADG mount
Thu Jun 02 10:14:33 2022
ASM Health Checker found 1 new failures

报错信息比较明显 datadg的disk number 为1的磁盘丢失了。通过fdisk确认磁盘情况

Disk /dev/sdb: 42.9 GB, 42949672960 bytes
64 heads, 32 sectors/track, 40960 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006c2be

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/sda: 53.7 GB, 53687091200 bytes
64 heads, 32 sectors/track, 51200 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00061443

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           2        2049     2097152   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2            2050       10241     8388608   82  Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sda3           10242       12289     2097152   83  Linux
Partition 3 does not end on cylinder boundary.
/dev/sda4           12290       51200    39844864    5  Extended
Partition 4 does not end on cylinder boundary.
/dev/sda5           12291       14338     2097152   83  Linux
/dev/sda6           14340       50178    36699136   83  Linux
/dev/sda7           50180       51200     1045504   83  Linux

Disk /dev/sdc: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1b3fba6b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        1045     8393931   83  Linux
/dev/sdc2            1046       26108   201318547+  83  Linux

Disk /dev/sdd: 536.9 GB, 536870912000 bytes
255 heads, 63 sectors/track, 65270 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x4c63ecad

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       65270   524281243+  83  Linux

Disk /dev/sde: 536.9 GB, 536870912000 bytes
255 heads, 63 sectors/track, 65270 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdf: 536.9 GB, 536870912000 bytes
255 heads, 63 sectors/track, 65270 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

根据客户反馈,异常的应该是一个500G的磁盘,而其中sdb为分区,通过kfed命令分析,确认sdc1为ocr磁盘,sdc2为datadg的一块磁盘,另外一块磁盘应该在sdd,sde,sdf三者之中,通过kfed分析sde,sdf均不可能是asm disk(一块是文件系统,一块是彻底没有使用的空盘),如果datadg的磁盘没有丢失,那应该就是sdd这块磁盘,通过dd 磁盘100M空间,然后通过kfed进行分析确认

E:\TEMP\xff>kfed read sdd.dd
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
006648400 00000000 00000000 00000000 00000000  [................]
        Repeat 26 times
0066485B0 00000000 00000000 4C63ECAD 01000000  [..........cL....]
0066485C0 FE830001 003FFFFF CB370000 00003E7F  [......?...7..>..]
0066485D0 00000000 00000000 00000000 00000000  [................]
        Repeat 1 times
0066485F0 00000000 00000000 00000000 AA550000  [..............U.]
006648600 00000000 00000000 00000000 00000000  [................]
  Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

E:\TEMP\xff>kfed read sdd1.dd
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
006768400 00000000 00000000 00000000 00000000  [................]
        Repeat 26 times
0067685B0 00000000 00000000 70D364B4 FE000000  [.........d.p....]
0067685C0 FE83FFFF D13FFFFF BB7603EB 00003A93  [......?...v..:..]
0067685D0 00000000 00000000 00000000 00000000  [................]
        Repeat 1 times
0067685F0 00000000 00000000 00000000 AA550000  [..............U.]
006768600 02038201 00000008 80000001 826037C1  [.............7`.]
006768EA0 00000079 00800105 0000007A 00800105  [y.......z.......]
006768EB0 0000007C 00800105 0000007D 00800105  [|.......}.......]
0067693C0 0000015C 00800105 0000015D 00800105  [\.......].......]
0067693D0 0000015F 00800105 00000160 00800105  [_.......`.......]
0067693E0 00000161 00800105 00000163 00800105  [a.......c.......]
0067693F0 00000164 00800105 00000166 00800105  [d.......f.......]
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

E:\TEMP\xff>kfed read sdd.dd blkn=1|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483649 ; 0x008: disk=1
kfbh.check:                  2197087544 ; 0x00c: 0x82f4e538
kfbh.fcn.base:                   616391 ; 0x010: 0x000967c7
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                         254 ; 0x006: 0x00fe
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000

通过上述信息分析,基本上可以确认sdd磁盘以前是asm disk,但是被fdisk进行了分区,基于这种情况,通过对磁盘组进行修复

E:\TEMP\xff>kfed read sdd.ok
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483649 ; 0x008: disk=1
kfbh.check:                   424926402 ; 0x00c: 0x1953dcc2
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        1 ; 0x024: 0x0001
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DATADG_0001 ; 0x028: length=11
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:              DATADG_0001 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33074858 ; 0x0a8: HOUR=0xa DAYS=0x15 MNTH=0xb YEAR=0x7e2
kfdhdb.crestmp.lo:           2375520256 ; 0x0ac: USEC=0x0 MSEC=0x1e4 SECS=0x19 MINS=0x23
kfdhdb.mntstmp.hi:             33074858 ; 0x0b0: HOUR=0xa DAYS=0x15 MNTH=0xb YEAR=0x7e2
kfdhdb.mntstmp.lo:           2375522304 ; 0x0b4: USEC=0x0 MSEC=0x1e6 SECS=0x19 MINS=0x23
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact:                    113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize:                  512000 ; 0x0c4: 0x0007d000
kfdhdb.pmcnt:                         6 ; 0x0c8: 0x00000006
kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn:                      0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]:                0 ; 0x0de: 0x0000
kfdhdb.dbcompat:              168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi:             33072461 ; 0x0e4: HOUR=0xd DAYS=0xa MNTH=0x9 YEAR=0x7e2
kfdhdb.grpstmp.lo:           3452534784 ; 0x0e8: USEC=0x0 MSEC=0x260 SECS=0x1c MINS=0x33

磁盘组mount成功,数据库open成功,实现数据0丢失
20220611171941
20220611172005


使用rman对数据库进行备份,并且重建磁盘组实现数据0丢失

ORA-600 kcvent_internal_02故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kcvent_internal_02故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-00600: internal error code, arguments: [kcvent_internal_02]错,无法正常open

Reconfiguration complete
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 22775 KB redo, 5055 data blocks need recovery
Started redo application at
 Thread 2: logseq 166395, block 88
Recovery of Online Redo Log: Thread 2 Group 3 Seq 166395 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.283.1036687245
  Mem# 1: +FLASH/orcl/onlinelog/group_3.264.1036687257
Recovery of Online Redo Log: Thread 2 Group 4 Seq 166396 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_4.284.1036687257
  Mem# 1: +FLASH/orcl/onlinelog/group_4.265.1036687257
Completed redo application of 15.97MB
Completed instance recovery at
 Thread 2: logseq 166396, block 15854, scn 27533037896
 5055 data blocks read, 5055 data blocks written, 22775 redo k-bytes read
Thread 2 advanced to log sequence 166397 (thread recovery)
Redo thread 2 internally disabled at seq 166397 
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_35652472.trc  (incident=195549):
ORA-00600: internal error code, arguments: [kcvent_internal_02], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_195549/orcl1_ora_35652472_i195549.trc

对应的trace文件信息

Dump continued from file: /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_35652472.trc
ORA-00600: internal error code, arguments: [kcvent_internal_02], [], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 195549 (ORA 600 [kcvent_internal_02]) ========

*** 2022-06-06 22:17:48.743
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=5fmpzya54p4hf) -----
ALTER DATABASE OPEN /* db agent *//* {1:38339:2} */

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B1E77C     000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
ksedst1()+112        call     skdstdst()           16F60DC8B26FAB02 ?
                                                   4846284100000000 ?
                                                   FFFFFFFFFFE46D0 ?
                                                   283C6E7C6A9A6 ? 10A6B923C ?
                                                   000000000 ? 110737880 ?
                                                   2050033FFFE46D8 ?
ksedst()+40          call     ksedst1()            000000000 ? 00000000A ?
                                                   07FFFFFFF ? 700000000003670 ?
                                                   000000000 ? 000000000 ?
                                                   000002004 ? 000000001 ?
dbkedDefDump()+1516  call     ksedst()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 300000003 ?
ksedmp()+72          call     dbkedDefDump()       310737880 ? 110000D40 ?
                                                   FFFFFFFFFFE4EE0 ? 1106AB740 ?
                                                   100124BB8 ? 000000000 ?
                                                   700011D7387FF08 ? 1106AB740 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AF01CA8 ?
                                                   10A041C38 ? 000000000 ?
                                                   11073C760 ? 110737880 ?
dbgexPhaseII()+1904  call     ksfdmp()             000000000 ? 00000000A ?
                                                   000000002 ? 000000000 ?
                                                   000000002 ? 10A041C30 ?
                                                   000000000 ? 001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       110737880 ? 11073A970 ?
+1556                                              00002FBDD ? 200000000 ?
                                                   FFFFFFFFFFE5DF8 ? 00000006C ?
                                                   200000000 ? 1000000000 ?
dbgeExecuteForError  call     dbgexProcessError()  110737880 ? 11073C760 ?
()+72                                              100000703 ? 000004000 ?
                                                   000000000 ? FFFFFFFFFFE9608 ?
                                                   000000001 ? 11073E4A8 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  FFFFFFFFFFE92B0 ?
2044                          ()                   700011D61558BB8 ? 102878B5C ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFE9608 ? 000000000 ?
                                                   000000000 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   07FFFFFFF ? 700000000003670 ?
68                                                 25800000001 ? 109E4A618 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFEA0B0 ? 1109C0040 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 069186EAB ?
kgerinv_internal()+  call     kgeadse()            000000002 ? 000000002 ?
48                                                 000000001 ? FFFFFFFFFFEAB58 ?
                                                   10A4E02F0 ? 000000002 ?
                                                   FFFFFFFFFFE9FE0 ? 000000000 ?
kgerinv()+48         call     kgerinv_internal()   200000002 ? 000000002 ?
                                                   FFFFFFFFFFEA060 ? 000000000 ?
                                                   102860EB0 ? FFFFFFFFFFEA458 ?
                                                   10285CE74 ? FFFFFFFFFFEA358 ?
kgeasnmierr()+72     call     kgerinv()            38400000001 ? 000000000 ?
                                                   10A4E0D20 ? 497F0A29CAE0 ?
                                                   000000001 ? FFFFFFFFFFEA1C0 ?
                                                   10A4E0D20 ? 110000D78 ?
kcvent_internal()+1  call     kgeasnmierr()        FFFFFFFFFFEA1C0 ? 200000002 ?
532                                                1F0410001F041 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000004 ?
kctenb_internal()+2  call     kcvent_internal()    FFFFFFFFFFEB378 ? 200000002 ?
772                                                FFFFFFFFFFEB448 ?
                                                   FFFFFFFFFFEB2E8 ?
                                                   41F6C57900000000 ?
                                                   000000000 ? FFFFFFFFFFEB330 ?
                                                   1106AB740 ?
kcfopd()+1508        call     kctenb_internal()    07FFFFFFF ? 000000000 ?
                                                   000000018 ? FFFFFFFFFFEC380 ?
                                                   000000000 ? 110A39050 ?
                                                   FFFFFFFFFFEC390 ? 000000000 ?
adbdrv()+8028        call     kcfopd()             081F0AD00 ? 00000000F ?
                                                   0FFFED4C0 ? 000000000 ?
                                                   FFFFFFFFFFED548 ? 100000000 ?
                                                   000000000 ? 1000100000000 ?
opiexe()+16048       call     adbdrv()             2300000023 ? 100000001 ?
                                                   000000000 ? FFFFFFFFFFF6960 ?
                                                   000000000 ? FFFFFFFFFFF6B60 ?
                                                   FFFFFFFFFFF6A98 ? 200000002 ?
opiosq0()+3984       call     opiexe()             700011E117B3B20 ? 000000000 ?
                                                   FFFFFFFFFFF7ED8 ? 110000D78 ?
                                                   000000001 ? 1109FA438 ?
                                                   FFFFFFFFFFF7E70 ?
                                                   2216414400000001 ?
kpooprx()+316        call     opiosq0()            300000000 ? 000000000 ?
                                                   000000000 ? A4000000000000 ?
                                                   000000000 ? FFFFFFFFFFF87F0 ?
                                                   28104221FFFF86F0 ?
                                                   1109FAB08 ?
kpoal8()+872         call     kpooprx()            1000CE68C ? 000000001 ?
                                                   FFFFFFFFFFFAD14 ? 100000001 ?
                                                   000000000 ? A40000000000A4 ?
                                                   109EB6D00 ? 000000000 ?
opiodr()+908         call     kpoal8()             100000000 ? 9001000A0091108 ?
                                                   000000FFF ? 07FFFFFF8 ?
                                                   FFFFFFFFFFF8F10 ? 000000018 ?
                                                   000000000 ? 000072FFF ?
ttcpip()+1028        call     opiodr()             5EFFFFA480 ? 1C00200048 ?
                                                   FFFFFFFFFFFA9F8 ? 000530058 ?
                                                   1108BEE30 ? 000000028 ?
                                                   FFFFFFFFFFFA3A0 ? 1108BEC70 ?
opitsk()+1612        call     ttcpip()             110135440 ? 000002078 ?
                                                   000000000 ? 110000D78 ?
                                                   110005210 ? 000000000 ?
                                                   FFFFFFFFFFFAA20 ?
                                                   2222208009EF13C0 ?
opiino()+940         call     opitsk()             110024C58 ? 000000000 ?
                                                   11079B550 ? 1107A0850 ?
                                                   110737880 ? FFFFFFFFFFFCAE0 ?
                                                   FFFFFFFFFFFEB3C ? 000000101 ?
opiodr()+908         call     opiino()             3C006C787C ?
                                                   BFF0000000000000 ?
                                                   FFFFFFFFFFFEF60 ?
                                                   FFFFFFFFFFFD5E9 ?
                                                   FFFFFFFFFFFD630 ? 1106AB740 ?
                                                   FFFFFFFFFFFD650 ?
                                                   9FFFFFFF000E608 ?
opidrv()+1132        call     opiodr()             3C0AFBC600 ? 410134340 ?
                                                   FFFFFFFFFFFEF60 ? 07530312F ?
                                                   108820CE4 ? 1106AB740 ?
                                                   7264626D732F6F72 ?
                                                   1106AB740 ?
sou2o()+136          call     opidrv()             3C0882A9D0 ? 41170031F ?
                                                   FFFFFFFFFFFEF60 ?
                                                   110017002A0000 ? 0E0DDF00D ?
                                                   1106AB740 ?
                                                   BADC0FFEE0DDF00D ?
                                                   BADC0FFEE0DDF00D ?
opimai_real()+560    call     sou2o()              FFFFFFFFFFFEFD0 ?
                                                   BADC0FFEE0DDF00D ?
                                                   90000000008BE3C ?
                                                   BADC0FFEE0DDF00D ?
                                                   000000002 ? 9001000A0091108 ?
                                                   A0000000A000000 ? 10B671248 ?
ssthrdmain()+276     call     opimai_real()        10B6B1D74 ? 9001000A0095260 ?
                                                   FFFFFFFFFFFF0B0 ? 10B6B1598 ?
                                                   FFFFFFFFFFFF0D0 ?
                                                   FFFFFFFFFFFF428 ?
                                                   900000000100968 ?
                                                   9001000A0091108 ?
main()+204           call     ssthrdmain()         240000000 ? FFFFFFFFFFFF418 ?
                                                   8FFFFFFF0000090 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   BADC0FFEE0DDF00D ?
                                                   BADC0FFEE0DDF00D ?
__start()+112        call     main()               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
 

--------------------- Binary Stack Dump ---------------------

该错误在mos,互联网上没有任何信息,不过在alert日志中发现类似信息

Mon Jun 06 23:03:58 2022
Error: Controlfile sequence number in file header is different from the one in memory
       Please check that the correct mount options are used if controlfile is located on NFS

初步判断可能和这个错误有关系,解决相关问题后,尝试open库

SQL> recover database;

ORA-00279: change 27533037896 generated at 06/06/2022 22:17:46 needed for
thread 2
ORA-00289: suggestion :
+FLASH/orcl/archivelog/2022_06_06/thread_2_seq_166396.6532.1106691471
ORA-00280: change 27533037896 for thread 2 is in sequence #166396


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
Log applied.
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01216: thread 2 is expected to be disabled after CREATE CONTROLFILE

SQL> !oerr ora 01216
01216, 00000, "thread %s is expected to be disabled after CREATE CONTROLFILE"
// *Cause:  A thread that was given during CREATE CONTROLFILE is enabled, but
//          the datafiles indicate that it should be disabled.  This is
//          probably because the logs supplied to the CREATE CONTROLFILE
//          are old (from before the disabling of the thread).
// *Action: This thread is not required to run the database.  The CREATE
//          CONTROLFILE statement can be reissued without the problem thread,
//          and, if desired, the thread can be recreated after the database
//          is open.

ORA-01216这个错误比较也比较少见,但是感觉和thread有关系,大概的意思是thread 被disable了

SQL> select thread#,STATUS FROM V$THREAD;

   THREAD# STATUS
---------- ------------------
         1 CLOSED
         2 CLOSED

通过人工强制把thread个open,然后数据库启动成功

SQL> select thread#,status from v$thread;

   THREAD# STATUS
---------- ------------------
         1 OPEN
         2 CLOSED

SQL> alter database open;

Database altered.

然后启动thread 2,open 第二个节点

--需要open节点
QL> startup
ORACLE instance started.

Total System Global Area 1.2961E+11 bytes
Fixed Size                  2262400 bytes
Variable Size            3.3018E+10 bytes
Database Buffers         9.6368E+10 bytes
Redo Buffers              221818880 bytes
ORA-01618: redo thread 2 is not enabled - cannot mount


--已经open节点
SQL> ALTER DATABASE ENABLE THREAD 2;

Database altered.

--需要open节点
SQL> ALTER DATABASE MOUNT;

Database altered.

SQL> ALTER DATABASE OPEN;

Database altered.
xifenfei1:/home/grid$crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.FLASH.dg
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.OCR.dg
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.asm
               ONLINE  ONLINE       xifenfei1                  Started             
               ONLINE  ONLINE       xifenfei2                  Started             
ora.gsd
               OFFLINE OFFLINE      xifenfei1                                      
               OFFLINE OFFLINE      xifenfei2                                      
ora.net1.network
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.ons
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
ora.registry.acfs
               ONLINE  ONLINE       xifenfei1                                      
               ONLINE  ONLINE       xifenfei2                                      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei1                                      
ora.cvu
      1        OFFLINE OFFLINE                                                   
ora.xifenfei1.vip
      1        ONLINE  ONLINE       xifenfei1                                      
ora.xifenfei2.vip
      1        ONLINE  ONLINE       xifenfei2                                      
ora.oc4j
      1        ONLINE  ONLINE       xifenfei2                                      
ora.orcl.db
      1        ONLINE  ONLINE       xifenfei1                  Open                
      2        ONLINE  ONLINE       xifenfei2                  Open                
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei1                                      

Oracle断电故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle断电故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

异常断电导致数据库异常恢复文件报ORA-00283 ORA-00742 ORA-00312

 
D:\check_db>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期二 5月 31 00:38:42 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover datafile 1;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 %d 序列 %d 块 %d 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1:
'D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG'

这个错误比较明显是由于异常断电引起的写丢失导致.而且这种故障在没有备份的情况下,没有什么好处理方法,只能屏蔽一致性强制拉库,尝试强制拉库报错如下

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
数据库装载完毕。
SQL> recover database until cancel;
ORA-00279: 更改 18755939194213 (在  生成) 对于线程 1 是必需的


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG
ORA-00600: internal error code, arguments: [3020], [2], [78824], [8467432], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 78824, file
offset is 645726208 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: 'D:\ORADATA\ORCL\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 80834


ORA-01112: 未启动介质恢复


SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [krsi_al_hdr_update.15], [4294967295], [], [],[], [], [], [], [], [], [], []

ORA-600 krsi_al_hdr_update.15错误,主要是由于redo异常导致无法resetlogs成功,具体参考:Alter Database Open Resetlogs returns error ORA-00600: [krsi_al_hdr_update.15], (Doc ID 2026541.1)描述,处理这个问题之后,再次resetlogs库,报ORA-600 2662错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [4366], [4112122046],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [4366], [4112122045],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [4366], [4112122040],
[4366], [4112228996], [12583040], [], [], [], [], [], []
进程 ID: 4644
会话 ID: 1701 序列号: 3

这个问题比较简单,通过修改scn即可绕过去,之后数据库open报ORA-600 4194等错误

SQL> alter database open ;
alter database open 
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [4194], [
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322982):
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322982\orcl_smon_5112_i322982.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3340.trc  (incident=323030):
ORA-00600: 内部错误代码, 参数: [4194], [
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_323030\orcl_ora_3340_i323030.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:04 2022
Sweep [inc][322982]: completed
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (10, 33).
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Checker run found 1 new persistent data failures
Tue May 31 09:05:05 2022
Sweep [inc][323030]: completed
Sweep [inc2][322982]: completed
Tue May 31 09:05:14 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322983):
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [0], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322983\orcl_smon_5112_i322983.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:14 2022
ORA-600 signalled during: alter database open...
Block recovery stopped at EOT rba 2.61.16
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Block recovery from logseq 2, block 60 to scn 18755939643393
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_2_K9BSVC11_.LOG
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Dumping diagnostic data in directory=[cdmp_2022053],requested by(instance=1,osid=5112(SMON)),summary=[incident=322983].
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [3], [], [], [], [], [], [], [], [], []

对异常undo进行处理,数据库正常open成功

SQL> shutdown immediate;
ORA-00600: 内部错误代码, 参数: [4193], [


SQL> shutdown abort;
ORACLE 例程已经关闭。
SQL> startup mount
ORACLE 例程已经启动。

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
数据库装载完毕。
SQL> alter database open;

数据库已更改。

hcheck检测有一些字典不一致,建议客户逻辑导出,然后导入到新库中

HCheck Version 07MAY18 on 31-5月 -2022 09:12:22
----------------------------------------------
Catalog Version 11.2.0.4.0 (1102000400)
db_name: ORCL

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp      Resul
t
------------------------------ ... ---------- -- ---------- -------------- -----
-
.- LobNotInObj                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingOIDOnObjCol          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- SourceNotInObj              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OversizedFiles              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorDefaultStorage          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorStorage                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- TabPartCountMismatch        ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OrphanedTabComPart          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingSum$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingDir$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- DuplicateDataobj            ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- ObjSynMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjSeqMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedUndo                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndex               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexPartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTable               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTablePartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTableSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- MissingPartCol              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedSeg$                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndPartObj#         ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0024: Orphaned Index Partition Obj# (no OBJ$) (Doc ID 1360935.1)
ORPHAN INDPART$: OBJ#=149167 BO#=6378 - no OBJ$ row
ORPHAN INDPART$: OBJ#=149168 BO#=6378 - no OBJ$ row

.- DuplicateBlockUse           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- FetUet                      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- Uet0Check                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- SeglessUET                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadInd$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0030: OBJ$ INDEX entry has no IND$ or INDPART$/INDSUBPART$ entry (Doc ID 13
60528.1)
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148278 SYS Name=WRH$_FILESTATXS
_PK PARTITION=WRH$_FILEST_1572571104_16462
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148920 SYS Name=WRH$_FILESTATXS
_PK PARTITION=WRH$_FILEST_1572571104_16678

.- BadTab$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadIcolDepCnt               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjIndDobj                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- TrgAfterUpgrade             ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjType0                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadOwner                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- StmtAuditOnCommit           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadPublicObjects            ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadSegFreelist              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadDepends                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 WARN

HCKW-0016: Dependency$ p_timestamp mismatch for VALID objects (Doc ID 1361045.1)

[E] - P_OBJ#=6376 D_OBJ#=6765

.- CheckDual                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjectNames                 ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadCboHiLo                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ChkIotTs                    ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- NoSegmentIndex              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- BadNextObject               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DroppedROTS                 ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- FilBlkZero                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DbmsSchemaCopy              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanedObjError            ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- ObjNotLob                   ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- MaxControlfSeq              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- SegNotInDeferredStg         ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- SystemNotRfile1             ... 1102000400 >   902000000 05/31 09:12:24 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ObjNotTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
---------------------------------------
31-5月 -2022 09:12:24  Elapsed: 2 secs
---------------------------------------
Found 4 potential problem(s) and 1 warning(s)
Contact Oracle Support with the output and trace file
to check if the above needs attention or not

PL/SQL 过程已成功完成。

pvcreate asm disk导致asm磁盘组异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:pvcreate asm disk导致asm磁盘组异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一客户asm磁盘组异常,无法正常mount

SQL> alter diskgroup datadg mount 
2022-05-28T19:08:55.114960+08:00
NOTE: cache registered group DATADG 1/0x2B504997
NOTE: cache began mount (first) of group DATADG 1/0x2B504997
NOTE: Assigning number (1,3) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (1,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (1,1) to disk (/dev/oracleasm/disks/DATA02)
2022-05-28T19:08:55.150062+08:00
ERROR: no read quorum in group: required 1, found 0 disks
2022-05-28T19:08:55.150684+08:00
NOTE: cache dismounting (clean) group 1/0x2B504997 (DATADG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 15103, image: oracle@XFF01 (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 1/0x2B504997 (DATADG)
NOTE: cache ending mount (fail) of group DATADG number=1 incarn=0x2b504997
NOTE: cache deleting context for group DATADG 1/0x2b504997
2022-05-28T19:08:55.191073+08:00
GMON dismounting group 1 at 36 for pid 37, osid 15103
2022-05-28T19:08:55.191258+08:00
NOTE: Disk DATA02 in mode 0x8 marked for de-assignment
NOTE: Disk DATA03 in mode 0x8 marked for de-assignment
NOTE: Disk DATA05 in mode 0x8 marked for de-assignment
ERROR: diskgroup DATADG was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATADG" cannot be mounted
ORA-15040: diskgroup is incomplete

通过报错信息,初步判断是由于少了asm disk导致(依据:1. ORA-15040,2.asmlib中的DATA01丢失),初步判断由于某种原因导致asmlib的磁盘异常,从而使得asm磁盘组无法正常mount,通过对dd 到本地的asm磁盘进行分析

C:\Users\XFF>kfed read H:\TEMP\asmdd\sdb6-o.dd
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
0066E8200 00000000 00000000 00000000 00000000  [................]
        Repeat 31 times
0066E8400 4542414C 454E4F4C 00000001 00000000  [LABELONE........]
0066E8410 4E06D490 00000020 324D564C 31303020  [...N ...LVM2 001]
0066E8420 34535542 476A7667 42546C48 6D384675  [BUS4gvjGHlTBuF8m]
0066E8430 7A385273 4B495777 73336242 33637449  [sR8zwWIKBb3sItc3]
0066E8440 48001000 000001E8 00100000 00000000  [...H............]
0066E8450 00000000 00000000 00000000 00000000  [................]
0066E8460 00000000 00000000 00001000 00000000  [................]
0066E8470 000FF000 00000000 00000000 00000000  [................]
0066E8480 00000000 00000000 00000002 00000000  [................]
0066E8490 00000000 00000000 00000000 00000000  [................]
  Repeat 214 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

通过这部分信息可以确认,一个asm disk被创建了pv,进一步分析pv信息
pv


对于这样的情况,表示asm disk被创建了pv但是pv没有加入到任何vg中,也就意味着该disk没有太大破坏,通过信息确认
20220529124428
20220529124734

主要是这两个部分信息被损坏,可以通过一些方法对这两个block信息进行重构

C:\Users\XFF>kfed read H:\TEMP\asmdd\sdb6.dd|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3196491921 ; 0x00c: 0xbe869891
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:   ORCLDISKDATA01 ; 0x000: length=14
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:        12592 ; 0x00c: 0x00003130
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                203424000 ; 0x020: 0x0c200100
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                  DATA01 ; 0x028: length=6
kfdhdb.grpname:                  DATADG ; 0x048: length=6
kfdhdb.fgname:                   DATA01 ; 0x068: length=6
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33083792 ; 0x0a8: HOUR=0x10 DAYS=0xc MNTH=0x4 YEAR=0x7e3
kfdhdb.crestmp.lo:           2268043264 ; 0x0ac: USEC=0x0 MSEC=0x3e6 SECS=0x32 MINS=0x21
kfdhdb.mntstmp.hi:             33134479 ; 0x0b0: HOUR=0xf DAYS=0x1c MNTH=0x5 YEAR=0x7e6
-- More  --

C:\Users\XFF>kfed read H:\TEMP\asmdd\sdb6.dd blkn=1|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2177715180 ; 0x00c: 0x81cd4bec
kfbh.fcn.base:                  3721754 ; 0x010: 0x0038ca1a
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                        1014 ; 0x004: 0x03f6
kfdfsb.cnt:                        1014 ; 0x006: 0x03f6
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                        0 ; 0x018: FREE=0x0 FRAG=0x0
kfdfse[1].fse:                        0 ; 0x019: FREE=0x0 FRAG=0x0
kfdfse[2].fse:                        0 ; 0x01a: FREE=0x0 FRAG=0x0
kfdfse[3].fse:                        0 ; 0x01b: FREE=0x0 FRAG=0x0
kfdfse[4].fse:                        0 ; 0x01c: FREE=0x0 FRAG=0x0
kfdfse[5].fse:                        0 ; 0x01d: FREE=0x0 FRAG=0x0
kfdfse[6].fse:                        0 ; 0x01e: FREE=0x0 FRAG=0x0
kfdfse[7].fse:                        0 ; 0x01f: FREE=0x0 FRAG=0x0
kfdfse[8].fse:                        0 ; 0x020: FREE=0x0 FRAG=0x0

通过dd写入到原磁盘,通过oracleasm scandisks扫描磁盘
scandisks


磁盘组mount成功
mount

数据库顺利open
20220529140558

20220529140813


这个案例能够完美恢复,主要是客户没有做进一步破坏,没有把这个pv加入到vg中并且写入数据,以前有过类似案例因为写入了数据,恢复比这个难多了,效果也没有这个好asm disk被加入vg恢复
如果不幸有类似oracle asm disk被破坏(格式化,dd部分,做成lv等),需要进行恢复支持,可以联系我们,做专业的恢复评估,最大限度,最快速度抢救数据,减少损失
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com
恢复过部分asm异常案例:
删除分区 oracle asm disk 恢复
asm disk 磁盘部分被清空恢复
又一例asm格式化文件系统恢复
一次完美的asm disk被格式化ntfs恢复
oracle asm disk格式化恢复—格式化为ext4文件系统
oracle asm disk格式化恢复—格式化为ntfs文件系统
分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例