Recovering ASM disks that were formatted

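A reader asked for help: an operator had carelessly mapped the ASM disks to another machine and formatted them as Windows NTFS file systems. This broke the ASM disk group and left the database unusable.
The ASM log reports ORA-27070 and ORA-27072 I/O errors: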

Mon Nov 30 12:00:13 2015
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27070: async read/write failed
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xf0f0bbfb disk_path:\\.\ORCLDISKDATA0
	 AU:1 disk_offset(bytes):2093056 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:868
WARNING: disk 0.4042308603 (DATA_0000) not responding to heart beat
ERROR: too many offline disks in PST (grp 1)
WARNING: Disk DATA_0000 in mode 0x7f will be taken offline
Mon Nov 30 12:00:13 2015
NOTE: process 576:37952 initiating offline of disk 0.4042308603 (DATA_0000) with mask 0x7e in group 1
WARNING: Disk DATA_0000 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 0/0xf0f0bbfb, mode = 0x15
kfdp_updateDsk(): 5 
kfdp_updateDskBg(): 5 
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
ERROR: no PST quorum in group: required 1, found 0
WARNING: Disk DATA_0000 in mode 0x7f offline aborted
Mon Nov 30 12:00:14 2015
SQL> alter diskgroup DATA dismount force /* ASM SERVER */ 
NOTE: cache dismounting (not clean) group 1/0xBB404B03 (DATA) 
Mon Nov 30 12:00:14 2015
NOTE: halting all I/Os to diskgroup DATA
Mon Nov 30 12:00:14 2015
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=367.7265 last written ABA 367.7265
NOTE: cache dismounted group 1/0xBB404B03 (DATA) 
kfdp_dismount(): 6 
kfdp_dismountBg(): 6 
NOTE: De-assigning number (1,0) from disk (\\.\ORCLDISKDATA0)
NOTE: De-assigning number (1,1) from disk (\\.\ORCLDISKDATA1)
NOTE: De-assigning number (1,2) from disk (\\.\ORCLDISKDATA2)
NOTE: De-assigning number (1,3) from disk (\\.\ORCLDISKDATA3)
NOTE: De-assigning number (1,4) from disk (\\.\ORCLDISKDATA4)
NOTE: De-assigning number (1,5) from disk (\\.\ORCLDISKDATA5)
NOTE: De-assigning number (1,6) from disk (\\.\ORCLDISKDATA6)
NOTE: De-assigning number (1,7) from disk (\\.\ORCLDISKDATA7)
SUCCESS: diskgroup DATA was dismounted
NOTE: cache deleting context for group DATA 1/-1153414397
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA

The ASM log makes the cause clear: the ASM disks could not be accessed, ORA-27072 was raised, and the disk group was force-dismounted.
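
The disk_offset values in the WARNING lines can be mapped back to an allocation unit and block number, which shows that every failed I/O falls inside AU 1. The sketch below assumes a 1 MiB AU (consistent with the log reporting AU:1 at offset 1048576) and uses the 4096-byte io_size from the log as the block size; `locate` is a hypothetical helper name, not part of any Oracle tool.

```python
AU_SIZE = 1024 * 1024  # assumption: 1 MiB allocation unit
BLOCK_SIZE = 4096      # io_size reported in the log


def locate(disk_offset):
    """Return (allocation unit, block within that AU) for a byte offset."""
    au, within_au = divmod(disk_offset, AU_SIZE)
    return au, within_au // BLOCK_SIZE


# disk_offset values copied from the WARNING lines in the log
for offset in (2093056, 1048576, 1052672):
    au, block = locate(offset)
    print(f"offset {offset}: AU {au}, block {block}")
```

The failed write on disk 0 lands in the last block of AU 1, and the failed reads on the other disks land in the first two blocks of AU 1.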

Analyzing the disks
[Image: asm-disk1]
[Image: asm-disk2]


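After talking with the customer, we confirmed that drives I: through O:, originally ASM disks, had been formatted as NTFS. Combined with the asmtool output, this also shows that one ASM disk escaped formatting: of the 8 disks in the disk group, 7 had been formatted.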

Inspecting the disks with kfed

C:\Users\Administrator>kfed read '\\.\J:'
kfbh.endian:                        235 ; 0x000: 0xeb
kfbh.hard:                           82 ; 0x001: 0x52
kfbh.type:                          144 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         78 ; 0x003: 0x4e
kfbh.block.blk:               542328404 ; 0x004: T=0 NUMB=0x20534654
kfbh.block.obj:                 2105376 ; 0x008: TYPE=0x0 NUMB=0x2020
kfbh.check:                        2050 ; 0x00c: 0x00000802
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                    63488 ; 0x014: 0x0000f800
kfbh.spare1:                   16711743 ; 0x018: 0x00ff003f
kfbh.spare2:                       2048 ; 0x01c: 0x00000800
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=2
kfbh.endian:                         70 ; 0x000: 0x46
kfbh.hard:                           73 ; 0x001: 0x49
kfbh.type:                           76 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         69 ; 0x003: 0x45
kfbh.block.blk:                  196656 ; 0x004: T=0 NUMB=0x30030
kfbh.block.obj:                33563364 ; 0x008: TYPE=0x0 NUMB=0x22e4
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                    65537 ; 0x010: 0x00010001
kfbh.fcn.wrap:                    65592 ; 0x014: 0x00010038
kfbh.spare1:                        416 ; 0x018: 0x000001a0
kfbh.spare2:                       1024 ; 0x01c: 0x00000400
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=256
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:              2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                    17662471 ; 0x00c: 0x010d8207
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
ERROR!!!, failed to get the oracore error message

C:\Users\Administrator>kfed read '\\.\J:' blkn=510
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………
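
The "Unknown Enum" headers that kfed prints for block 0 and block 2 are easier to read once the fields are packed back into raw bytes. The sketch below rebuilds the leading bytes from the decimal values shown above, assuming little-endian field storage on this Windows host (an assumption; the valid header at blkn=510 reports kfbh.endian 1). It illustrates how to read the output and is not a kfed feature.

```python
import struct

# kfbh.endian, hard, type, datfmt, block.blk, block.obj
# copied from the kfed read of block 0 on J:
block0 = struct.pack("<BBBBII", 235, 82, 144, 78, 542328404, 2105376)

# kfbh.endian, hard, type, datfmt copied from the kfed read with blkn=2
block2 = struct.pack("<BBBB", 70, 73, 76, 69)

print(block0.hex(" "))  # raw bytes as kfed misread them
print(block0[3:11])     # bytes 3..10 of the block
print(block2)           # first four bytes of block 2
```

Bytes 3 to 10 of block 0 spell "NTFS" followed by four spaces, the OEM ID of an NTFS boot sector, and block 2 starts with "FILE", the signature of an NTFS file record. Both confirm that the start of the disk now holds NTFS structures rather than an ASM header.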

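The analysis shows that the backup block of the ASM disk header had not been overwritten. In principle the disk group can be recovered from that backup block, which makes the recovery considerably easier.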
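
The block number 510 used above is not arbitrary. Assuming the backup header sits in the second-to-last block of AU 1 (which is what the output suggests: it was read with blkn=510 and reports kfbh.block.blk 254), its position can be derived from the AU size and the block size. The function name below is hypothetical, and the 1 MiB AU and 4096-byte block are assumptions for this disk group.

```python
def backup_header_block(au_size, block_size=4096):
    """Return (absolute block number, block number within AU 1)."""
    blocks_per_au = au_size // block_size
    absolute = 2 * blocks_per_au - 2  # second-to-last block of AU 1
    return absolute, absolute - blocks_per_au


print(backup_header_block(1024 * 1024))  # (510, 254)
```

For a disk group with a different AU size the same arithmetic would give a different blkn, so the value should be recomputed rather than reused.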

Repairing the disk header with kfed

C:\Users\Administrator> kfed repair '\\.\J:'
C:\Users\Administrator>kfed read '\\.\J:' 
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

Confirming the ASM disk information
After all 7 formatted disks had been processed in the same way, the tool showed the following disk information:
[Image: asm-disk3]


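Recovery
Given how NTFS lays out its structures on disk, we know that even though the backup block of the ASM disk header is intact, quite a few AUs in the middle of each ASM disk will still have been destroyed.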
[Image: ntfs]


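In this situation it is not appropriate to copy the datafiles out directly with a tool. The metadata that records where a file's blocks live may itself have been overwritten, so the copied file can come out damaged. We tested this during the recovery: small files copied fine, but large files, once copied and checked with dbv, showed many corrupt blocks.

Instead we used a tool (see the article "Recovery when the ASM disk header is completely destroyed") to scan the disks at a low level and reassemble the datafiles directly from the ASM disks. We then combined them with the copied control files, redo files and parameter file, renamed the relevant paths, and opened the database directly.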

Q:\>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期三 1月 22 16:08:18 2014

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL> set pages 1000
SQL> col name for a100
SQL> set lines 150
SQL> select file#,name from v$datafile;

     FILE# NAME
---------- --------------------------------------------------------------------
         1 +DATA/vspdb/datafile/system.256.778520603
         2 +DATA/vspdb/datafile/sysaux.257.778520603
         3 +DATA/vspdb/datafile/undotbs1.258.778520603
         4 +DATA/vspdb/datafile/users.259.778520603
         5 +DATA/vspdb/datafile/vsp_tbs.293.779926097
        …………
       147 +DATA/vspdb/datafile/index_dg.418.864665747
       148 +DATA/vspdb/datafile/data_dg.419.864667053
       149 +DATA/vspdb/datafile/vsp_mm_tbs.420.890410367
       150 +DATA/vspdb/datafile/vsp_mm_tbs.421.890410457

SQL> select member from v$logfile;

MEMBER
-------------------------------------------------------------------------------------
+DATA/vspdb/onlinelog/group_7.263.862676593
+DATA/vspdb/onlinelog/group_7.262.862676601
+DATA/vspdb/onlinelog/group_4.410.862652291
+DATA/vspdb/onlinelog/group_4.411.862652307
+DATA/vspdb/onlinelog/group_5.412.862653715
+DATA/vspdb/onlinelog/group_5.413.862653727
+DATA/vspdb/onlinelog/group_6.414.862676425
+DATA/vspdb/onlinelog/group_6.415.862676433

Rename the datafiles and redo files, then open the database

SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

已用时间:  00: 00: 04.51

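Because some blocks had been overwritten and were replaced with empty blocks, any access that reaches such a block raises ORA-08103 (see the articles "Simulating and resolving an ordinary ORA-08103" and "Simulating and resolving an extreme ORA-08103"). For the affected objects, the simplest fix is to extract the data with dul, truncate the table, and reload the data. If you want to be thoroughly safe, rebuilding the database logically is the most reliable option.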

ORA-15032 ORA-15040 ORA-15042: ASM failure recovery

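A friend asked for recovery help with an ASM disk group of 19 LUNs. One LUN had a problem, so they added a new LUN and dropped the old one in a single operation, but the operation hung halfway (the bad LUN was damaged at the storage level, so the rebalance could not complete). The storage engineer kept working on the faulty LUN and, very luckily, repaired it. Then, in the excitement, he deleted the newly added LUN directly on the storage array, even though part of the data had already been rebalanced onto it. At that point the ASM disk group was down for good and could not be mounted, and they asked for recovery support. For certain reasons the LUN could not be recovered at the storage level, so we were asked to recover at the database level.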

Mon Sep 21 19:52:35 2015
SQL> alter diskgroup  dg_XFF add disk '/dev/rhdisk116' size 716800M drop disk dg_XFF_0012
NOTE: Assigning number (1,20) to disk (/dev/rhdisk116)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DG_XFF_0020
NOTE: requesting all-instance disk validation for group=1
Mon Sep 21 19:52:44 2015
NOTE: skipping rediscovery for group 1/0xb94738f1 (DG_XFF) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0xb94738f1 (DG_XFF) on local instance.
NOTE: initiating PST update: grp = 1
Mon Sep 21 19:52:44 2015
GMON updating group 1 at 25 for pid 27, osid 12124486
NOTE: PST update grp = 1 completed successfully
NOTE: membership refresh pending for group 1/0xb94738f1 (DG_XFF)
GMON querying group 1 at 26 for pid 18, osid 10092734
NOTE: cache opening disk 20 of grp 1: DG_XFF_0020 path:/dev/rhdisk116
GMON querying group 1 at 27 for pid 18, osid 10092734
SUCCESS: refreshed membership for 1/0xb94738f1 (DG_XFF)
Mon Sep 21 19:52:47 2015
SUCCESS: alter diskgroup  dg_XFF add disk '/dev/rhdisk116' size 716800M drop disk dg_XFF_0012
NOTE: starting rebalance of group 1/0xb94738f1 (DG_XFF) at power 1
Starting background process ARB0
Mon Sep 21 19:52:47 2015
ARB0 started with pid=28, OS id=10944804
NOTE: assigning ARB0 to group 1/0xb94738f1 (DG_XFF) with 1 parallel I/O
NOTE: Attempting voting file refresh on diskgroup DG_XFF
Mon Sep 21 20:35:06 2015
SQL> ALTER DISKGROUP DG_XFF MOUNT  /* asm agent *//* {1:51107:7083} */
NOTE: cache registered group DG_XFF number=1 incarn=0xdd6f975a
NOTE: cache began mount (first) of group DG_XFF number=1 incarn=0xdd6f975a
NOTE: Assigning number (1,0) to disk (/dev/rhdisk10)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk11)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk16)
NOTE: Assigning number (1,3) to disk (/dev/rhdisk17)
NOTE: Assigning number (1,4) to disk (/dev/rhdisk22)
NOTE: Assigning number (1,5) to disk (/dev/rhdisk23)
NOTE: Assigning number (1,6) to disk (/dev/rhdisk28)
NOTE: Assigning number (1,7) to disk (/dev/rhdisk29)
NOTE: Assigning number (1,8) to disk (/dev/rhdisk33)
NOTE: Assigning number (1,9) to disk (/dev/rhdisk34)
NOTE: Assigning number (1,10) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,11) to disk (/dev/rhdisk40)
NOTE: Assigning number (1,12) to disk (/dev/rhdisk41)
NOTE: Assigning number (1,13) to disk (/dev/rhdisk45)
NOTE: Assigning number (1,14) to disk (/dev/rhdisk46)
NOTE: Assigning number (1,15) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,16) to disk (/dev/rhdisk52)
NOTE: Assigning number (1,17) to disk (/dev/rhdisk53)
NOTE: Assigning number (1,18) to disk (/dev/rhdisk57)
NOTE: Assigning number (1,19) to disk (/dev/rhdisk58)
Wed Sep 30 11:08:07 2015
NOTE: start heartbeating (grp 1)
GMON querying group 1 at 33 for pid 35, osid 4194488
NOTE: Assigning number (1,20) to disk ()
GMON querying group 1 at 34 for pid 35, osid 4194488
NOTE: cache dismounting (clean) group 1/0xDD6F975A (DG_XFF)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0xDD6F975A (DG_XFF)
NOTE: cache ending mount (fail) of group DG_XFF number=1 incarn=0xdd6f975a
NOTE: cache deleting context for group DG_XFF 1/0xdd6f975a
GMON dismounting group 1 at 35 for pid 35, osid 4194488
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
ERROR: diskgroup DG_XFF was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "20" is missing from group number "1"
ERROR: ALTER DISKGROUP DG_XFF MOUNT  /* asm agent *//* {1:51107:7083} */

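The cause is fairly obvious: the storage engineer deleted the LUN directly, so disk group DG_XFF lost ASM disk 20 and can no longer be mounted. The disk group had been rebalancing for quite a while, so the lost disk already held a large amount of data, including metadata. Even if we patched the PST to get the disk group mounted (which might not succeed), a lot of data would be lost, and what remained could not necessarily be read out directly. Had the disk only been added, with no rebalance having run for some reason, we could simply have modified the PST to get the disk group to mount.

In this case, all we can do is scan the disks at a low level and generate the datafiles, then unload the data from them to complete the recovery. Part of the file metadata was on the lost LUN, so copying files or unloading data straight from the surviving metadata would lose a lot of data. The disks to be recovered are: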

grp# dsk# bsize ausize disksize diskname        groupname       path
---- ---- ----- ------ -------- --------------- --------------- -------------
   1    0  4096  4096K   179200 DG_XFF_0000     DG_XFF          /dev/rhdisk10
   1    1  4096  4096K   179200 DG_XFF_0001     DG_XFF          /dev/rhdisk11
   1    2  4096  4096K   179200 DG_XFF_0002     DG_XFF          /dev/rhdisk16
   1    3  4096  4096K   179200 DG_XFF_0003     DG_XFF          /dev/rhdisk17
   1    4  4096  4096K   179200 DG_XFF_0004     DG_XFF          /dev/rhdisk22
   1    5  4096  4096K   179200 DG_XFF_0005     DG_XFF          /dev/rhdisk23
   1    6  4096  4096K   179200 DG_XFF_0006     DG_XFF          /dev/rhdisk28
   1    7  4096  4096K   179200 DG_XFF_0007     DG_XFF          /dev/rhdisk29
   1    8  4096  4096K   179200 DG_XFF_0008     DG_XFF          /dev/rhdisk33
   1    9  4096  4096K   179200 DG_XFF_0009     DG_XFF          /dev/rhdisk34
   1   10  4096  4096K   179200 DG_XFF_0010     DG_XFF          /dev/rhdisk4
   1   11  4096  4096K   179200 DG_XFF_0011     DG_XFF          /dev/rhdisk40
   1   12  4096  4096K   179200 DG_XFF_0012     DG_XFF          /dev/rhdisk41
   1   13  4096  4096K   179200 DG_XFF_0013     DG_XFF          /dev/rhdisk45
   1   14  4096  4096K   179200 DG_XFF_0014     DG_XFF          /dev/rhdisk46
   1   15  4096  4096K   179200 DG_XFF_0015     DG_XFF          /dev/rhdisk5
   1   16  4096  4096K   179200 DG_XFF_0016     DG_XFF          /dev/rhdisk52
   1   17  4096  4096K   179200 DG_XFF_0017     DG_XFF          /dev/rhdisk53
   1   18  4096  4096K   179200 DG_XFF_0018     DG_XFF          /dev/rhdisk57
   1   19  4096  4096K   179200 DG_XFF_0019     DG_XFF          /dev/rhdisk58

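We were fairly lucky this time: the lost disk group was only one business disk group, holding just 19 tablespaces and 10 partitioned tables. With the data dictionary intact, we recovered the data of the 10 partitioned tables (6443 partitions in total). The overall result is as follows: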
[Image: RECOVER]


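By total data volume the recovery ratio is 6003.26953/6027.26935*100% = 99.6018127%. For a disk group that lost a LUN onto which a large share of data had already been rebalanced, recovering this much is a very good result. Hats off to Lao Xiong (老熊).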
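
The ratio is easy to recheck. A minimal sketch using only the two totals quoted above; the unit is not stated in the source, so none is assumed here.

```python
recovered = 6003.26953  # recovered volume, as quoted above
total = 6027.26935      # total volume, as quoted above

ratio = recovered / total * 100
shortfall = total - recovered

print(f"recovered: {ratio:.5f}%")
print(f"not recovered: {shortfall:.5f}")
```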

ASM disk group fails to mount after a PVID was mistakenly set on the ASM disks: recovery

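Reposted: ASM disk group fails to mount after a PVID was mistakenly set on the ASM disks: recovery
A friend contacted me: after switching storage that had been directly attached to AIX over to a storage network with fibre channel switches, their RAC would not start, and they asked for support. Analysis showed that after the change the disk order was initially wrong, and maintenance staff set PVIDs on the ASM disks. As a result the ASM disk groups could not be mounted. The ASM disks of the disk group holding the voting disks were not accessible, so the RAC cssd process could not start, and the disk group holding the datafiles could not be mounted either. Repairing with kfed succeeded, with zero data loss.
Platform and version information (2-node RAC)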

$ sqlplus -v

SQL*Plus: Release 11.2.0.4.0 Production

$ uname -a
AIX db2 1 7 00F9733E4C00

Errors in the GI log

2014-12-20 16:44:08.769: 
[ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-12-20 16:44:11.775: 
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; 
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:26.791: 
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; 
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:41.812: 
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; 
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

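This shows that RAC cannot start because it cannot find the voting disks during startup. Going back through the log identifies the disks that hold the voting files: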

2014-12-15 17:36:15.424: 
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.433: 
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.445: 
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

So rhdisk4, rhdisk5 and rhdisk6 are the disks holding the voting files. Use kfed to look at the disk header:

$kfed read /dev/rhdisk4
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000  [................]
1102BEE10 00000000 00000000 00000000 00000000  [................]
        Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000  [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000  [................]
  Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]

$kfed read /dev/rhdisk4 blkn=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                          23 ; 0x006: 0x0017
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                      119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse:                       16 ; 0x019: FREE=0x0 FRAG=0x1
…………

$kfed read /dev/rhdisk4 blkn=510
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: blk=254
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                CRS_0000 ; 0x028: length=8
kfdhdb.grpname:                     CRS ; 0x048: length=3
kfdhdb.fgname:                 CRS_0000 ; 0x068: length=8
…………

The analysis above essentially confirms that the ASM disk header was destroyed. Next, determine what destroyed it:

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active      
hdisk1          00f9733e21b953e6                    rootvg          active      
hdisk2          00f9733e21b97a83                    appvg           active      
hdisk3          00f9733e21b98434                    appvg           active      
hdisk4          00f9733d67553e0a                    None                        
hdisk5          00f9733d67553f31                    None                        
hdisk6          00f9733d67554011                    None                        
hdisk7          00f9733d67554165                    None                        
hdisk8          00f9733d675541e5                    None                        
hdisk9          00f9733d675542e4                    None                        
hdisk10         none                                None   

[db2/dev#]ls -l rhdisk*
crw-------    2 root     system       24,  1 Oct 18 11:45 rhdisk0
crw-------    1 root     system       24,  3 Oct 18 13:27 rhdisk1
crw-------    1 root     system       24,  5 Dec 20 20:02 rhdisk10
crw-------    1 root     system       24,  2 Oct 18 13:32 rhdisk2
crw-------    1 root     system       24,  0 Oct 18 13:32 rhdisk3
crw-rw----    1 grid     asmadmin     24,  8 Dec 20 20:02 rhdisk4
crw-rw----    1 grid     asmadmin     24,  9 Dec 20 20:02 rhdisk5
crw-rw----    1 grid     asmadmin     24, 10 Dec 20 20:02 rhdisk6
crw-rw----    1 grid     asmadmin     24,  4 Dec 20 20:02 rhdisk7
crw-rw----    1 grid     asmadmin     24,  6 Dec 20 20:02 rhdisk8
crw-rw----    1 grid     asmadmin     24,  7 Dec 20 20:02 rhdisk9
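
The kfed hex dump of block 0 on rhdisk4 and the lspv listing can be tied together directly. The sketch below takes the first word of the dump and the two words at offset 0x80, both copied from the kfed output, and compares them with the PVID that lspv reports for hdisk4. Decoding the first word as EBCDIC (code page 037) is an assumption about how AIX marks the disk, added here only for illustration.

```python
# First four bytes of block 0, copied from the kfed hex dump
signature = bytes.fromhex("C9C2D4C1")

# The two words at offset 0x80 of the same dump
pvid_in_block0 = bytes.fromhex("00F9733D" "67553E0A")

# PVID reported by lspv for hdisk4
pvid_from_lspv = "00f9733d67553e0a"

print(signature.decode("cp037"))               # read as EBCDIC text
print(pvid_in_block0.hex() == pvid_from_lspv)  # PVID found inside block 0
```

The first word decodes to "IBMA", and the bytes at offset 0x80 are exactly the PVID shown by lspv, so the block that should hold the ASM disk header now holds what AIX wrote when the PVID was assigned.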

This largely shows that a PVID was written over the disk header, destroying the ASM disk header. Next, go through the ASM log to determine which disks are used as ASM disks:

SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY  DISK '/dev/rhdisk4',
'/dev/rhdisk5',
'/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ 
NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)
NOTE: initializing header on grp 1 disk CRS_0000
NOTE: initializing header on grp 1 disk CRS_0001
NOTE: initializing header on grp 1 disk CRS_0002

SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK 
'/dev/rhdisk9' SIZE 614400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ 
NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)
NOTE: initializing header on grp 2 disk DATA_0000


SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY  DISK 
'/dev/rhdisk8' SIZE 204800M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ 
NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)
NOTE: initializing header on grp 3 disk FBA_0000


SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY  DISK 
'/dev/rhdisk7' SIZE 102400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */ 
NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)
NOTE: initializing header on grp 4 disk ARCH_0000
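
With a longer ASM log, the disk list can be pulled out with a small script instead of by eye. The sketch below runs over a few "Assigning number" lines copied from the excerpts above; the regular expression and variable names are illustrative assumptions, not part of any Oracle tool.

```python
import re

# Sample lines copied from the ASM log excerpts above
log_lines = [
    "NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)",
    "NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)",
    "NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)",
    "NOTE: initializing header on grp 1 disk CRS_0000",
    "NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)",
    "NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)",
    "NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)",
]

pattern = re.compile(r"NOTE: Assigning number \((\d+),(\d+)\) to disk \(([^)]+)\)")

# Map each device path to (group number, disk number) at creation time
disks = {}
for line in log_lines:
    match = pattern.search(line)
    if match:
        group, disk, path = match.groups()
        disks[path] = (int(group), int(disk))

for path in sorted(disks):
    print(path, disks[path])
```

The group numbers here are the ones assigned when the disk groups were created; ASM can assign different group numbers at a later mount, so only the device paths should be relied on.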

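This confirms that the ASM disks are rhdisk[4-9]. kfed shows that all of them have the same problem as rhdisk4, which also matches the lspv output. After repairing the ASM disk headers with kfed repair: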

SQL> alter diskgroup data mount;

Diskgroup altered.

SQL> alter diskgroup fba mount;

Diskgroup altered.

SQL> alter diskgroup arch mount;

Diskgroup altered.

SQL> alter diskgroup crs mount;

Diskgroup altered.

SQL> select group_number,disk_number,path from v$asm_disk;

GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
           2           0 /dev/rhdisk4
           2           1 /dev/rhdisk5
           2           2 /dev/rhdisk6
           1           0 /dev/rhdisk7
           4           0 /dev/rhdisk8
           3           0 /dev/rhdisk9

6 rows selected.

SQL> select group_number,name from v$asm_diskgroup;

GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 ARCH
           2 CRS
           3 DATA
           4 FBA

This demonstrates that once kfed had repaired the disk headers, all ASM disk groups mounted successfully, and GI is back to normal:

[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.CRS.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.DATA.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.FBA.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.asm
               ONLINE  ONLINE       db1                      Started             
               ONLINE  ONLINE       db2                      Started             
ora.gsd
               OFFLINE OFFLINE      db1                                          
               OFFLINE OFFLINE      db2                                          
ora.net1.network
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.ons
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.registry.acfs
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1                                          
ora.cvu
      1        ONLINE  ONLINE       db1                                          
ora.db1.vip
      1        ONLINE  ONLINE       db1                                          
ora.db2.vip
      1        ONLINE  ONLINE       db2                                          
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open                
      2        ONLINE  ONLINE       db2                      Open                
ora.oc4j
      1        ONLINE  ONLINE       db1                                          
ora.scan1.vip
      1        ONLINE  ONLINE       db1                                          

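One thing was overlooked here: the PVIDs were not cleared before the disk headers were repaired, so after the repair the PVIDs are still stored in the ODM.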

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active      
hdisk1          00f9733e21b953e6                    rootvg          active      
hdisk2          00f9733e21b97a83                    appvg           active      
hdisk3          00f9733e21b98434                    appvg           active      
hdisk4          00f9733d67553e0a                    None                        
hdisk5          00f9733d67553f31                    None                        
hdisk6          00f9733d67554011                    None                        
hdisk7          00f9733d67554165                    None                        
hdisk8          00f9733d675541e5                    None                        
hdisk9          00f9733d675542e4                    None                        
hdisk10         none                                None   
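
A small sketch of how the lspv listing above could be checked automatically. It assumes the three-column layout shown (disk name, PVID, volume group) and flags disks that carry a PVID but belong to no volume group. The function names are made up for illustration, and a flagged disk is only a candidate: whether it is really an ASM disk still has to be confirmed separately.

```python
from typing import List, NamedTuple


class PV(NamedTuple):
    name: str  # e.g. "hdisk4"
    pvid: str  # "none" when no PVID is set
    vg: str    # "None" when the disk is in no volume group


def parse_lspv(text: str) -> List[PV]:
    """Parse lspv output into (name, pvid, vg) records."""
    pvs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            pvs.append(PV(fields[0], fields[1], fields[2]))
    return pvs


def disks_with_stray_pvid(pvs: List[PV]) -> List[str]:
    """Disks outside any volume group that still carry a PVID."""
    return [pv.name for pv in pvs
            if pv.vg == "None" and pv.pvid.lower() != "none"]


# lspv output from node db2, copied from the listing above
SAMPLE = """\
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
"""

print(disks_with_stray_pvid(parse_lspv(SAMPLE)))
```

For the sample this reports hdisk4 through hdisk9, the same disks whose PVIDs are cleared later in this post.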

Analysis showed that the FBA disk group holds no data, so it was used to test clearing the PVID directly.

$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup fba dismount;

Diskgroup altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
$ exit
You have mail in /usr/spool/mail/root
[db2/#]chdev -l hdisk8 -a pv=clear
hdisk8 changed
[db2/#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active      
hdisk1          00f9733e21b953e6                    rootvg          active      
hdisk2          00f9733e21b97a83                    appvg           active      
hdisk3          00f9733e21b98434                    appvg           active      
hdisk4          00f9733d67553e0a                    None                        
hdisk5          00f9733d67553f31                    None                        
hdisk6          00f9733d67554011                    None                        
hdisk7          00f9733d67554165                    None                        
hdisk8          none                                None                        
hdisk9          00f9733d675542e4                    None                        
hdisk10         none                                None                        
[db2/#]su - grid
$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup fba mount;

Diskgroup altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

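The test showed that the ASM disk header still works normally after the PVID is cleared directly. GI was then shut down, chdev was used to clear the PVIDs of hdisk4 through hdisk9, and GI came back up without any problem.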

[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.CRS.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.DATA.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.FBA.dg
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.asm
               ONLINE  ONLINE       db1                      Started             
               ONLINE  ONLINE       db2                      Started             
ora.gsd
               OFFLINE OFFLINE      db1                                          
               OFFLINE OFFLINE      db2                                          
ora.net1.network
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.ons
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
ora.registry.acfs
               ONLINE  ONLINE       db1                                          
               ONLINE  ONLINE       db2                                          
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1                                          
ora.cvu
      1        ONLINE  ONLINE       db1                                          
ora.db1.vip
      1        ONLINE  ONLINE       db1                                          
ora.db2.vip
      1        ONLINE  ONLINE       db2                                          
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open                
      2        ONLINE  ONLINE       db2                      Open                
ora.oc4j
      1        ONLINE  ONLINE       db1                                          
ora.scan1.vip
      1        ONLINE  ONLINE       db1                                          
[db1/#]lspv
hdisk0          00f9733df7c7a9db                    rootvg          active      
hdisk1          00f9733d21dad8fe                    rootvg          active      
hdisk2          00f9733d21dbd08b                    appvg           active      
hdisk3          00f9733d21dbd2ab                    appvg           active      
hdisk4          none                                None                        
hdisk5          none                                None                        
hdisk6          none                                None                        
hdisk7          none                                None                        
hdisk8          none                                None                        
hdisk9          none                                None                        
hdisk10         none                                None 
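
The clean-up step could be prepared with a tiny helper like the one below. It only prints the chdev commands (the same form used for hdisk8 in the test) so they can be reviewed before anyone runs them as root with GI stopped. The helper name and the dry-run approach are my own assumptions, not part of the original procedure.

```python
from typing import List


def pvid_clear_commands(first: int, last: int) -> List[str]:
    """Build 'chdev -l hdiskN -a pv=clear' for hdisk<first>..hdisk<last>."""
    return [f"chdev -l hdisk{n} -a pv=clear" for n in range(first, last + 1)]


# Dry run: print the commands instead of executing them
for cmd in pvid_clear_commands(4, 9):
    print(cmd)
```

After running the printed commands, lspv should show "none" in the PVID column for every listed disk, as in the output above.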

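At this point the ASM instance whose disk headers were damaged by setting PVIDs is back to normal, with zero data loss.
Reminder: on AIX, never set a PVID on a disk that is used as an ASM disk. Doing so damages the ASM disk header, and the disk group can no longer be mounted.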

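Recovery from a completely destroyed ASM disk header

When an ASM disk group cannot be mounted, there are several options. If only a small part of the disk header is damaged, or the ASM disk header still exists, it can be repaired with kfed or restored from a backed-up copy of the header, so that the disk group mounts and the database can be recovered. If there is no backup and the header cannot be repaired, amdu or dul can be tried to recover data from the unmountable disk group. In extreme cases (for example, the disk group is completely lost) where neither amdu nor dul can help, the last resort is to scan the disks and locate the datafile blocks directly. This experiment roughly simulates recovering a database after the first 10 MB of each ASM disk has been completely destroyed.
Test preparation
Create a new tablespace and the T_XIFENFEI test table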

SQL> create tablespace xifenfei datafile '+XIFENFEI' SIZE 50m;

Tablespace created.

SQL> CREATE TABLE T_XIFENFEI TABLESPACE XIFENFEI
  2  AS SELECT * FROM DBA_OBJECTS;

Table created.

SQL> SELECT COUNT(*) FROM T_XIFENFEI;

  COUNT(*)
----------
     50031

SQL> select ts#,rfile#,bytes/1024/1024,blocks,name from v$datafile;

       TS#     RFILE# BYTES/1024/1024     BLOCKS NAME
---------- ---------- --------------- ---------- --------------------------------------------------
         0          1             480      61440 +XIFENFEI/asm10g/datafile/system.256.845260203
         1          2              25       3200 +XIFENFEI/asm10g/datafile/undotbs1.258.845260205
         2          3             250      32000 +XIFENFEI/asm10g/datafile/sysaux.257.845260203
         4          4               5        640 +XIFENFEI/asm10g/datafile/users.259.845260205
         6          5              50       6400 +XIFENFEI/asm10g/datafile/xifenfei.266.845262139

SQL> select GROUP_NUMBER,DISK_NUMBER,STATE,TOTAL_MB,FREE_MB,NAME,path from  v$asm_disk;

GROUP_NUMBER DISK_NUMBER STATE      TOTAL_MB    FREE_MB NAME                 PATH
------------ ----------- -------- ---------- ---------- -------------------- ------------------
           1           0 NORMAL         2048          0 XIFENFEI_0000        /dev/raw/raw1
           1           1 NORMAL          784          0 XIFENFEI_0001        /dev/raw/raw2
           1           2 NORMAL         7059          0 XIFENFEI_0002        /dev/raw/raw3

-- Shut down the database
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

-- Shut down ASM
SQL> shutdown immediate
ASM diskgroups dismounted
ASM instance shutdown

Check which disks the raw devices map to

[oracle@xifenfei dul]$ more /etc/sysconfig/rawdevices
/dev/raw/raw1   /dev/sdc
/dev/raw/raw2   /dev/sdd1
/dev/raw/raw3   /dev/sdd2

Overwrite the disk headers with dd
Zero the first 10 MB of each ASM disk with dd to destroy the ASM disk completely

[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=10 conv=notrunc 
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.175424 seconds, 59.8 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=10 conv=notrunc 
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.11584 seconds, 90.5 MB/s
[oracle@xifenfei ~]$ dd if=/dev/zero of=/dev/raw/raw3 bs=1M count=10 conv=notrunc 
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.353435 seconds, 29.7 MB/s

Inspect the disks with kfed
Confirm that every ASM disk header has been completely destroyed

[oracle@xifenfei dul]$ kfed read /dev/raw/raw1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
[oracle@xifenfei dul]$ kfed read /dev/raw/raw2
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
[oracle@xifenfei dul]$ kfed read /dev/raw/raw3
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj:                       0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
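
The same check can be sketched without kfed. Going by the offsets in the kfed output above, kfbh.type is the single byte at offset 0x002, and the 32 bytes from 0x000 up to the end of kfbh.spare2 form the block header. The value 1 for a disk header block is taken from the dul message "seen 0 expect 1" further down. The function names and the sample bytes are hypothetical.

```python
KFBTYP_DISKHEAD = 1   # block type expected in block 0 of an ASM disk
KFBH_SIZE = 32        # 0x000 .. end of kfbh.spare2 in the kfed output


def kfbh_type(block: bytes) -> int:
    """Return kfbh.type, the byte at offset 0x002 of the block."""
    if len(block) < KFBH_SIZE:
        raise ValueError("need at least the 32-byte kfbh header")
    return block[2]


def header_is_wiped(block: bytes) -> bool:
    """True when every byte of the kfbh header is zero."""
    return not any(block[:KFBH_SIZE])


def check_disk(path: str, block_size: int = 4096) -> str:
    """Read block 0 of a device or image file and classify its header."""
    with open(path, "rb") as f:
        block = f.read(block_size)
    if header_is_wiped(block):
        return "wiped (all zero)"
    if kfbh_type(block) == KFBTYP_DISKHEAD:
        return "disk header block"
    return "type %d" % kfbh_type(block)


# Hypothetical in-memory blocks, so nothing touches a real device here
zeroed = bytes(4096)
fake_header = bytes([1, 0x82, KFBTYP_DISKHEAD, 1]) + bytes(4092)
print(header_is_wiped(zeroed), kfbh_type(fake_header))
```

check_disk("/dev/raw/raw1") would need read permission on the device. After the dd runs above it should report a wiped header, which matches the all-zero kfed output.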

Inspect the ASM disks with amdu

[oracle@xifenfei ~]$ amdu -diskstring '/dev/raw/raw*'
amdu_2014_04_18_23_17_17/
[oracle@xifenfei ~]$ cd amdu_2014_04_18_23_17_17
[oracle@xifenfei amdu_2014_04_18_23_17_17]$ ls
report.txt
[oracle@xifenfei amdu_2014_04_18_23_17_17]$ more report.txt 
-*-amdu-*-
…………
--------------------------------- Operations ---------------------------------

------------------------------- Disk Selection -------------------------------
 -diskstring '/dev/raw/raw*'

------------------------------ Reading Control -------------------------------

------------------------------- Output Control -------------------------------

********************************* DISCOVERY **********************************

----------------------------- DISK REPORT N0001 ------------------------------
                Disk Path: /dev/raw/raw1
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0002 ------------------------------
                Disk Path: /dev/raw/raw2
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0003 ------------------------------
                Disk Path: /dev/raw/raw3
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 65536 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **


******************************* END OF REPORT ********************************

This shows that when the ASM disk header is badly damaged, amdu cannot identify the disks, let alone recover the database stored on them.

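Inspect the completely destroyed ASM disk headers with dul
Test whether dul can still extract data from the ASM disk group when the ASM disk headers are completely destroyed. As with amdu, it cannot work properly.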

[oracle@xifenfei dul]$ ./dul

Data UnLoader: 10.2.0.5.28 - Internal Only - on Sat Apr 19 04:02:02 2014
with 64-bit io functions

Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw1

DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw2

DUL: Warning: block 0 is not a disk header block
DUL: Error: Block is not in use
DUL: Error: Block type mismatch ( seen 0 expect 1) when parsing block 0 of disk /dev/raw/raw3

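As can be seen here, when the ASM disk header is completely invalid, dul cannot identify the ASM disk group either (dul cannot operate normally in this situation).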

Scan the disks with a tool and extract the data blocks

CPFL> scan disk  /dev/raw/raw1
Scanning  disk /dev/raw/raw1, at 2014-04-19 04:05:11
Completed  disk /dev/raw/raw1, at 2014-04-19 04:05:56

CPFL> scan  disk  /dev/raw/raw1
Scanning  disk /dev/raw/raw1, at 2014-04-19 04:05:56
Completed  disk /dev/raw/raw1, at 2014-04-19 04:06:15

CPFL> scan  disk  /dev/raw/raw1
Scanning  disk /dev/raw/raw1, at 2014-04-19 04:06:15
Completed  disk /dev/raw/raw1, at 2014-04-19 04:07:44

CPFL> list datafiles
 Tablespace: SYSTEM    File:    1  Blocks:      61440 
 Tablespace: UNDOTBS1  File:    2  Blocks:       3200 
 Tablespace: SYSAUX    File:    3  Blocks:      32000 
 Tablespace: USERS     File:    4  Blocks:        640 
 Tablespace: XIFENFEI  File:    5  Blocks:       6400 

CPFL> copy datafile 1 to /u01/oracle/oradata/datafile/1.dbf
copy datafile start: 2014-04-19 04:10:35
copy datafile 1 have blocks 61440
copy datafile completed: 2014-04-19 04:11:18

CPFL> copy datafile 2  to /u01/oracle/oradata/datafile/2.dbf
copy datafile start: 2014-04-19 04:11:52
copy datafile 2 have blocks 3200
copy datafile completed: 2014-04-19 04:11:54

CPFL>  copy datafile 3  to /u01/oracle/oradata/datafile/3.dbf
copy datafile start: 2014-04-19 04:12:03
copy datafile 3 have blocks 32000
copy datafile completed: 2014-04-19 04:12:27

CPFL>  copy datafile 4  to /u01/oracle/oradata/datafile/4.dbf
copy datafile start: 2014-04-19 04:13:07
copy datafile 4 have blocks 640
copy datafile completed: 2014-04-19 04:13:08

CPFL> copy datafile 5 to /u01/oracle/oradata/datafile/5.dbf
copy datafile start: 2014-04-19 04:13:18
copy datafile 5 have blocks 6400
copy datafile completed: 2014-04-19 04:13:19

Check the data files extracted by the tool

[oracle@xifenfei datafile]$ ls -l
total 830320
-rw-r--r-- 1 oracle oinstall 503324672 Apr 19 04:34 1.dbf
-rw-r--r-- 1 oracle oinstall  26222592 Apr 19 04:34 2.dbf
-rw-r--r-- 1 oracle oinstall 262152192 Apr 19 04:34 3.dbf
-rw-r--r-- 1 oracle oinstall   5251072 Apr 19 04:34 4.dbf
-rw-r--r-- 1 oracle oinstall  52436992 Apr 19 04:34 5.dbf
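
A quick sanity check on these sizes, as a sketch. The block size of 8192 bytes is not stated in the post; it is derived from the v$datafile output earlier (480 MB / 61440 blocks). Each file on the OS should then be (blocks + 1) * 8192 bytes, the extra block being the file header block that v$datafile.blocks does not count.

```python
BLOCK_SIZE = 8192  # assumption: 480 MB / 61440 blocks from v$datafile

# file number -> (blocks from "list datafiles", size in bytes from "ls -l")
EXTRACTED = {
    1: (61440, 503324672),
    2: (3200, 26222592),
    3: (32000, 262152192),
    4: (640, 5251072),
    5: (6400, 52436992),
}


def expected_size(blocks: int, block_size: int = BLOCK_SIZE) -> int:
    """Data blocks plus one header block, in bytes."""
    return (blocks + 1) * block_size


for file_no, (blocks, size) in sorted(EXTRACTED.items()):
    status = "ok" if expected_size(blocks) == size else "MISMATCH"
    print(file_no, blocks, size, status)
```

All five files match, which suggests the copies are complete in length. It says nothing about the block contents; that is what the dul run below checks.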

Verify the extracted files with dul

[oracle@xifenfei dul]$ ./dul

Data UnLoader: 10.2.0.5.28 - Internal Only - on Sat Apr 19 06:56:09 2014
with 64-bit io functions

Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.

 Strictly Oracle Internal Use Only


DUL: Warning: Recreating file "dul.log"
Found db_id = 181793355
Found db_name = ASM10G

DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0    61440    0    1    0 /u01/oracle/oradata/datafile/1.dbf
  1   2     0     3200    0    1    0 /u01/oracle/oradata/datafile/2.dbf
  2   3     0    32000    0    1    0 /u01/oracle/oradata/datafile/3.dbf
  4   4     0      640    0    1    0 /u01/oracle/oradata/datafile/4.dbf
  6   5     0     6400    0    1    0 /u01/oracle/oradata/datafile/5.dbf
DUL> bootstrap;
Probing file = 1, block = 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   51171 rows unloaded
. unloading table                      TAB$    1576 rows unloaded
. unloading table                      COL$   55264 rows unloaded
. unloading table                     USER$      59 rows unloaded
Reading USER.dat 59 entries loaded
Reading OBJ.dat 51171 entries loaded and sorted 51171 entries
Reading TAB.dat 1576 entries loaded
Reading COL.dat 55264 entries loaded and sorted 55264 entries
Reading BOOTSTRAP.dat 57 entries loaded

DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 10
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 266, file 1 block 2121
 INDPART$: segobjno 271, file 1 block 2161
 TABCOMPART$: segobjno 288, file 1 block 2297
 INDCOMPART$: segobjno 293, file 1 block 2345
 TABSUBPART$: segobjno 278, file 1 block 2217
 INDSUBPART$: segobjno 283, file 1 block 2257
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 181, tabno 1, file 1  block 1297
 COLLECTION$: segobjno 181, tabno 2, file 1  block 1297
 ATTRIBUTE$: segobjno 181, tabno 3, file 1  block 1297
 LOBFRAG$: segobjno 299, file 1 block 2393
 LOBCOMPPART$: segobjno 302, file 1 block 2425
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 96, file 1 block 721
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   51171 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
    1576 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   55264 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      59 rows unloaded
. unloading table                  TABPART$      72 rows unloaded
. unloading table                  INDPART$      80 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    2231 rows unloaded
. unloading table                     ICOL$    3650 rows unloaded
. unloading table                      LOB$     530 rows unloaded
. unloading table                  COLTYPE$    1701 rows unloaded
. unloading table                     TYPE$    1945 rows unloaded
. unloading table               COLLECTION$     555 rows unloaded
. unloading table                ATTRIBUTE$    7275 rows unloaded
. unloading table                  LOBFRAG$       1 row  unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$       7 rows unloaded
. unloading table                    PROPS$      28 rows unloaded
Reading USER.dat 59 entries loaded
Reading OBJ.dat 51171 entries loaded and sorted 51171 entries
Reading TAB.dat 1576 entries loaded
Reading COL.dat 55264 entries loaded and sorted 55264 entries
Reading TABPART.dat 72 entries loaded and sorted 72 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 80 entries loaded and sorted 80 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 2231 entries loaded
Reading LOB.dat 530 entries loaded
Reading ICOL.dat 3650 entries loaded
Reading COLTYPE.dat 1701 entries loaded
Reading TYPE.dat 1945 entries loaded
Reading ATTRIBUTE.dat 7275 entries loaded
Reading COLLECTION.dat 555 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 1 entries loaded and sorted 1 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 7 entries loaded
Reading PROPS.dat 28 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> unload table sys.t_xifenfei;
. unloading table                T_XIFENFEI   50031 rows unloaded

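As shown here, the test table was created with 50031 rows, and dul unloads exactly 50031 rows for that table from the extracted data files. This proves that when the ASM disk headers are completely destroyed, amdu and dul cannot recover the database inside ASM directly, but a tool that scans the disks for datafile blocks can still recover the data completely [as long as the data in ASM has not been overwritten, it can be recovered with this method].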
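
The scanning tool used above is not named or documented in this post, so the following is only a sketch of the general idea, not its actual implementation. It assumes 8192-byte blocks, a little-endian platform, and the usual Oracle block header layout in which bytes 4 to 7 hold the relative data block address (RDBA): the top 10 bits are the relative file number and the low 22 bits the block number. A real scanner would also validate checksums and block tails, which is omitted here.

```python
import struct
from collections import defaultdict
from typing import Dict, List, Tuple

BLOCK_SIZE = 8192  # assumption, derived from v$datafile (480 MB / 61440 blocks)


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split an RDBA into (relative file number, block number)."""
    return rdba >> 22, rdba & 0x3FFFFF


def scan_buffer(buf: bytes, block_size: int = BLOCK_SIZE) -> Dict[int, List[Tuple[int, int]]]:
    """Map relative file number -> [(block number, byte offset in buf)]."""
    found = defaultdict(list)
    for off in range(0, len(buf) - block_size + 1, block_size):
        if buf[off] == 0:          # block type 0: zeroed or unformatted, skip
            continue
        (rdba,) = struct.unpack_from("<I", buf, off + 4)
        file_no, block_no = decode_rdba(rdba)
        found[file_no].append((block_no, off))
    return dict(found)


# Hypothetical test data: one zeroed block, then a block for file 5, block 10
fake_block = bytes([6, 0xA2, 0, 0]) + struct.pack("<I", (5 << 22) | 10)
fake_block += bytes(BLOCK_SIZE - len(fake_block))
print(scan_buffer(bytes(BLOCK_SIZE) + fake_block))
```

Once every block is tagged with its file and block number, a data file can be rebuilt by writing each block at block number * block size in the output file, which is presumably what "copy datafile" does.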


Retrieving data files from ASM

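Data is often lost because ASM cannot start normally. Two methods for retrieving the data files stored in ASM are given below.
I. Using the AMDU tool
AMDU is a free tool that ships with Oracle 11g. It analyzes ASM disk group metadata and extracts data files from disk groups that cannot be mounted. “NOTE:553639.1 Placeholder for AMDU binaries and using with ASM 10g” states explicitly that AMDU can also be used with 10g, and provides 10g-compatible AMDU builds for each operating system for download.
1. Set LD_LIBRARY_PATH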

rac1-> export LD_LIBRARY_PATH=/tmp/amdu

2. View the ASM disk information

rac1-> ./amdu -diskstring '/dev/raw/raw*'  
amdu_2012_03_05_03_17_19/
rac1-> cd amdu_2012_03_05_03_17_19/
rac1-> ls
report.txt
rac1-> more report.txt 
-*-amdu-*-

******************************* AMDU Settings ********************************
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      rac1
Release:        2.6.9-89.0.0.0.1.ELhugemem
Version:        #1 SMP Tue May 19 04:38:38 EDT 2009
Machine:        i686
amdu run:       05-MAR-12 03:17:19
Endianess:      1

--------------------------------- Operations ---------------------------------

------------------------------- Disk Selection -------------------------------
 -diskstring '/dev/raw/raw*'

------------------------------ Reading Control -------------------------------

------------------------------- Output Control -------------------------------

********************************* DISCOVERY **********************************

----------------------------- DISK REPORT N0001 ------------------------------
                Disk Path: /dev/raw/raw23
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0002 ------------------------------
                Disk Path: /dev/raw/raw11
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0003 ------------------------------
                Disk Path: /dev/raw/raw32
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0004 ------------------------------
                Disk Path: /dev/raw/raw31
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0005 ------------------------------
                Disk Path: /dev/raw/raw12
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0006 ------------------------------
                Disk Path: /dev/raw/raw2
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 1976 megabytes
               Group Name: DATA
                Disk Name: DATA_0001
       Failure Group Name: DATA_0001
              Disk Number: 1
            Header Status: 3
       Disk Creation Time: 2012/03/01 22:32:39.289000
          Last Mount Time: 2012/03/05 02:10:02.771000
    Compatibility Version: 0x0a100000
         Disk Sector Size: 512 bytes
         Disk size in AUs: 1976 AUs
         Group Redundancy: 1
      Metadata Block Size: 4096 bytes
                  AU Size: 1048576 bytes
                   Stride: 113792 AUs
      Group Creation Time: 2012/03/01 22:32:39.221000
  File 1 Block 1 location: AU 0

----------------------------- DISK REPORT N0007 ------------------------------
                Disk Path: /dev/raw/raw1
           Unique Disk ID: 
               Disk Label: 
     Physical Sector Size: 512 bytes
                Disk Size: 3137 megabytes
               Group Name: DATA
                Disk Name: DATA_0000
       Failure Group Name: DATA_0000
              Disk Number: 0
            Header Status: 3
       Disk Creation Time: 2012/03/01 22:32:39.289000
          Last Mount Time: 2012/03/05 02:10:02.771000
    Compatibility Version: 0x0a100000
         Disk Sector Size: 512 bytes
         Disk size in AUs: 3137 AUs
         Group Redundancy: 1
      Metadata Block Size: 4096 bytes
                  AU Size: 1048576 bytes
                   Stride: 113792 AUs
      Group Creation Time: 2012/03/01 22:32:39.221000
  File 1 Block 1 location: AU 2

******************************* END OF REPORT ********************************

The following can be learned from this report:
1) Only /dev/raw/raw1 and /dev/raw/raw2 are valid ASM disks
2) DATA is the only disk group
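
With many disks, reading report.txt by eye gets tedious. Below is a rough parser sketch that assumes the "key: value" layout of the DISK REPORT sections shown above and treats a disk as valid when its section contains a Group Name line. The function names are made up, and the sample is an abbreviated copy of the report.

```python
from typing import Dict, List, Tuple


def parse_amdu_report(text: str) -> List[Dict[str, str]]:
    """Return one dict of 'key: value' pairs per DISK REPORT section."""
    disks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if "DISK REPORT" in line:
            current = {}
            disks.append(current)
        elif current is not None and ":" in line and not line.startswith("*"):
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return disks


def valid_asm_disks(disks: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """(path, group, disk name) for sections that carry a Group Name."""
    return [(d["Disk Path"], d["Group Name"], d["Disk Name"])
            for d in disks if "Group Name" in d]


# Abbreviated from the report above
SAMPLE = """\
----------------------------- DISK REPORT N0001 ------------------------------
                Disk Path: /dev/raw/raw23
                Disk Size: 156 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

----------------------------- DISK REPORT N0006 ------------------------------
                Disk Path: /dev/raw/raw2
               Group Name: DATA
                Disk Name: DATA_0001
  File 1 Block 1 location: AU 0

----------------------------- DISK REPORT N0007 ------------------------------
                Disk Path: /dev/raw/raw1
               Group Name: DATA
                Disk Name: DATA_0000
  File 1 Block 1 location: AU 2
"""

print(valid_asm_disks(parse_amdu_report(SAMPLE)))
```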

3. Find the data file information

rac1-> ../amdu -diskstring '/dev/raw/raw*'  -dump 'DATA'
amdu_2012_03_05_03_19_38/
rac1-> cd amdu_2012_03_05_03_19_38/
rac1-> ls
DATA_0001.img  DATA.map  report.txt
rac1-> strings DATA_0001.img |grep DATAFILE
DATAFILE
DATAFILE
DATAFILE
DEVDB/DATAFILE/SYSTEM.256.776817753
DEVDB/DATAFILE/SYSAUX.257.776817753
DEVDB/DATAFILE/UNDOTBS1.258.776817753
DEVDB/DATAFILE/USERS.259.776817753
DEVDB/DATAFILE/UNDOTBS2.267.776817909

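From this output we can read off each data file's ASM file number (the first number after the file name, e.g. 259 in USERS.259.776817753)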
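
As a sketch, the names found by strings can be split mechanically. ASM system-generated names end in .<file number>.<incarnation>, and amdu takes <group>.<file number> as its -extract argument, as in the command in the next step. The helper names here are made up for illustration.

```python
from typing import Dict, Union


def parse_asm_name(name: str) -> Dict[str, Union[str, int]]:
    """Split e.g. 'DEVDB/DATAFILE/USERS.259.776817753' into its parts."""
    path, file_no, incarnation = name.rsplit(".", 2)
    db, file_type, tag = path.split("/")
    return {"db": db, "type": file_type, "tag": tag,
            "file": int(file_no), "incarnation": int(incarnation)}


def extract_arg(group: str, file_no: int) -> str:
    """Argument for 'amdu -extract', e.g. 'DATA.259'."""
    return f"{group}.{file_no}"


NAMES = [
    "DEVDB/DATAFILE/SYSTEM.256.776817753",
    "DEVDB/DATAFILE/SYSAUX.257.776817753",
    "DEVDB/DATAFILE/UNDOTBS1.258.776817753",
    "DEVDB/DATAFILE/USERS.259.776817753",
    "DEVDB/DATAFILE/UNDOTBS2.267.776817909",
]

for n in NAMES:
    info = parse_asm_name(n)
    print(info["tag"], extract_arg("DATA", info["file"]))
```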

4. Recover the data file

rac1-> ../../amdu -diskstring '/dev/raw/raw*' -extract 'DATA.259'
amdu_2012_03_05_03_27_21/
rac1-> cd amdu_2012_03_05_03_27_21/
rac1-> ls -l
total 5148
-rw-r--r--  1 oracle oinstall 5251072 Mar  5 03:27 DATA_259.f
-rw-r--r--  1 oracle oinstall    6468 Mar  5 03:27 report.txt
rac1-> dbv file=DATA_259.f

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Mar 5 03:28:00 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = DATA_259.f


DBVERIFY - Verification complete

Total Pages Examined         : 640
Total Pages Processed (Data) : 15
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 2
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 26
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 597
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 442362 (0.442362)
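
When many files are extracted, the dbv summaries can be checked by a script rather than by eye. The sketch below assumes the "Total Pages ... : n" layout printed above and treats a file as clean when every Failing and Marked Corrupt counter is zero. The function names are hypothetical.

```python
import re
from typing import Dict


def parse_dbv(text: str) -> Dict[str, int]:
    """Collect the 'Total Pages ...' counters from a dbv summary."""
    stats = {}
    for raw in text.splitlines():
        m = re.match(r"^(Total Pages .+?)\s*:\s*(\d+)\s*$", raw.strip())
        if m:
            # Collapse the padding inside the label, e.g. "Failing   (Data)"
            stats[" ".join(m.group(1).split())] = int(m.group(2))
    return stats


def is_clean(stats: Dict[str, int]) -> bool:
    """True when no page is reported as failing or marked corrupt."""
    return all(v == 0 for k, v in stats.items()
               if "Failing" in k or "Corrupt" in k)


# Summary copied from the dbv run above
SAMPLE = """\
Total Pages Examined         : 640
Total Pages Processed (Data) : 15
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 2
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 26
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 597
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 442362 (0.442362)
"""

stats = parse_dbv(SAMPLE)
print(stats["Total Pages Examined"], is_clean(stats))
```

In this run the 640 examined pages also agree with the file size: (640 + 1) * 8192 = 5251072 bytes, the size of DATA_259.f, assuming 8192-byte blocks.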

II. Using the aulasm tool
1. Configure the disks used by ASM

[root@rac1 oracle]# more disk
/dev/sdb1
/dev/sdb2

2. Run aulasm

[root@rac1 oracle]# ./aulasm_linux32_trial.bin 
AUL : AnySQL UnLoader(MyDUL) for Oracle ASM, release 2.0.0

(C) Copyright Lou Fangxin 2011-2012 (AnySQL.net), all rights reserved.

ASM> help
  SET        -- set options (BLOCK_SIZE, AU_SIZE, F1B1_DISK, F1B1_AU)
  OPEN       -- open data file
  ADD        -- add disk with customized disk num (disk, path)
  LIST       -- list opened disks information (to)
  LSAU       -- list file of given AU (disk, au, count)
  FILE       -- list all file entries in ASM (file, [disk, au, block]).
  ALIAS      -- list all alias entries in ASM (file, [disk, au, block]).
  EXTENT     -- list or adjust variable extent (level, au, count).
  FMAP       -- list file extent (file, [disk, au, block], to)
  DUMP       -- dump file extent (file, [disk, au, block], offset, to)
  COPY       -- copy file to os  (file, [disk, au, block], to)
  QUIT/EXIT  -- exit the program.
ASM> open disk
2012-03-05 03:32:48
2012-03-05 03:32:48
ASM> list
2012-03-05 03:32:49
Total Disks = 2, ausize=1048576, blksize=4096

 disk     size block disk name        disk group   disk path               
===== ======== ===== ================ ============ ========================
    0     3137  4096 DATA_0000        DATA         /dev/sdb1
    1     1976  4096 DATA_0001        DATA         /dev/sdb2

f1b1disk = 0, f1b1au = 2, score=256, compat=0x0a100000
file=(1,0,2,1), disk=(2,0,2,2), alias=(6,0,2,6)
2012-03-05 03:32:49
ASM> alias
2012-03-05 03:33:15
    file          inc parent fstblk flag  refer alias                     
======== ============ ====== ====== ==== ====== ==========================
      -1           -1      0      0    4      1 DEVDB
      -1           -1      0      1    4      2 DATAFILE
      -1           -1      0      1    4      3 CONTROLFILE
      -1           -1      0      1    4      4 ONLINELOG
      -1           -1      0      1    4      5 TEMPFILE
      -1           -1      0      1    4      6 PARAMETERFILE
      -1           -1      0      1    4      7 ARCHIVELOG
     256    776817753      1      2   18     -1 SYSTEM
     257    776817753      1      2   18     -1 SYSAUX
     258    776817753      1      2   18     -1 UNDOTBS1
     259    776817753      1      2   18     -1 USERS
     267    776817909      1      2   18     -1 UNDOTBS2
     261    776817829      1      3   18     -1 Current
     260    776817829      1      3   18     -1 Current
     262    776817833      1      4   18     -1 group_1
     263    776817835      1      4   18     -1 group_1
     264    776817837      1      4   18     -1 group_2
     265    776817841      1      4   18     -1 group_2
     268    776818011      1      4   18     -1 group_3
     269    776818013      1      4   18     -1 group_3
     270    776818015      1      4   18     -1 group_4
     271    776818017      1      4   18     -1 group_4
     266    776817859      1      5   18     -1 TEMP
     272    776818021      1      6   18     -1 spfile
      -1           -1      1      7    4      8 2012_03_01
      -1           -1      1      7    4      9 2012_03_03
      -1           -1      1      7    4     10 2012_03_05
     273    776818747      7      8   18     -1 thread_1_seq_2
     274    776819571      7      8   18     -1 thread_1_seq_3
     275    776819721      7      8   18     -1 thread_2_seq_1
     276    776820331      7      8   18     -1 thread_1_seq_4
     277    776992321      7      9   18     -1 thread_1_seq_5
     278    776992321      7      9   18     -1 thread_2_seq_2
     279    777089437      7     10   18     -1 thread_1_seq_6
2012-03-05 03:33:15
ASM> fmap file 259    
2012-03-05 03:34:49
fid=259, disk=0, au=56, block=3
siz=5, ecnt=6, eeof=6, flags=17, type=12, alias=109,-1
     extid  disk       auid  cnt flg chk
========== ===== ========== ==== === ===
         1     0        428    1   0 135
         2     1        408    1   0 178
         3     0        429    1   0 134
         4     1        409    1   0 179
         5     0        430    1   0 133
         6     1        410    1   0 176
2012-03-05 03:34:49
ASM> copy file 259 to user_259
2012-03-05 03:35:01
2012-03-05 03:35:01
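
The extent map printed by fmap above lists, for each extent of file 259, the disk and the allocation unit (AU) that hold it. The following is a minimal sketch of how such a map could be turned into a copy plan. It assumes that AU n starts at byte offset n * ausize on its disk and that every extent is exactly one AU (cnt = 1, as in this output). The helper names `extent_offsets` and `copy_file` are hypothetical, and this is not aulasm's actual implementation.

```python
AU_SIZE = 1048576  # ausize reported by the LIST command

# (extid, disk, auid) rows taken from the fmap output for file 259
EXTENTS = [
    (1, 0, 428), (2, 1, 408), (3, 0, 429),
    (4, 1, 409), (5, 0, 430), (6, 1, 410),
]

# disk number -> device path, as shown by LIST
DISK_PATHS = {0: "/dev/sdb1", 1: "/dev/sdb2"}


def extent_offsets(extents, au_size=AU_SIZE):
    """Return (disk, source byte offset, target byte offset) per extent.

    Assumes one AU per extent and that extent ids start at 1.
    """
    plan = []
    for extid, disk, auid in sorted(extents):
        plan.append((disk, auid * au_size, (extid - 1) * au_size))
    return plan


def copy_file(extents, disk_paths, out_path, au_size=AU_SIZE):
    """Hypothetical helper: read each AU from its disk and append it in extent order."""
    with open(out_path, "wb") as out:
        for disk, src_off, dst_off in extent_offsets(extents, au_size):
            with open(disk_paths[disk], "rb") as dev:
                dev.seek(src_off)
                out.seek(dst_off)
                out.write(dev.read(au_size))


if __name__ == "__main__":
    for disk, src_off, dst_off in extent_offsets(EXTENTS):
        print(f"disk {disk}: read {AU_SIZE} bytes at {src_off} -> file offset {dst_off}")
```

Six extents of one AU each give 6 * 1048576 = 6291456 bytes, which matches the size of user_259 in the directory listing that follows.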
[root@rac1 oracle]# ll
total 6252
-rwxr-xr-x  1 root   root       84619 Mar  6  2012 aulasm_linux32_trial.bin
-rw-r--r--  1 root   root          20 Mar  5 02:52 disk
-rw-r--r--  1 root   root     6291456 Mar  5 03:35 user_259
[root@rac1 oracle]# su - oracle
rac1-> dbv file=user_259 

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Mar 5 03:35:27 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

DBVERIFY - Verification starting : FILE = user_259


DBVERIFY - Verification complete

Total Pages Examined         : 640
Total Pages Processed (Data) : 15
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 2
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 26
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 597
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 442362 (0.442362)

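dbv verification shows that the two extracted data files are essentially identical (their sizes differ, probably because the two tools treat unformatted data blocks differently). This confirms that both methods can retrieve data files from a damaged ASM disk group.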
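
The size difference can be checked with simple arithmetic on the numbers shown in the aulasm session. The sketch below assumes an 8192-byte database block size and that dbv does not count the single file header block. Neither is stated in the output, so treat both as assumptions.

```python
AU_SIZE = 1048576        # ausize from the LIST output
EXTENT_COUNT = 6         # ecnt from the fmap output for file 259
DB_BLOCK_SIZE = 8192     # assumption: default 8 KB database block size

# Page counts reported by dbv for user_259
PAGES = {"data": 15, "index": 2, "other": 26, "seg": 0, "empty": 597}
PAGES_EXAMINED = 640

# The per-category counts should add up to the total examined
assert sum(PAGES.values()) == PAGES_EXAMINED

copied_bytes = EXTENT_COUNT * AU_SIZE            # whole AUs copied by aulasm
examined_bytes = PAGES_EXAMINED * DB_BLOCK_SIZE  # blocks dbv actually checked
header_bytes = DB_BLOCK_SIZE                     # assumption: one header block
tail_bytes = copied_bytes - examined_bytes - header_bytes

print("copied  :", copied_bytes)    # 6291456, matches the ll listing
print("examined:", examined_bytes)  # 5242880
print("tail    :", tail_bytes)      # bytes in the last AU beyond the data file
```

Under these assumptions the copy contains every allocated AU in full, so it carries roughly 1 MB of padding past the end of the 5 MB data file. A tool that stops at the data file's logical size would produce a smaller file with the same verified blocks.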

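III. Summary and comparison
1. AMDU is completely free; aulasm requires a paid license for more than 16 disks.
2. AMDU is somewhat more cumbersome to use than aulasm, but still acceptable overall.
3. Once the data files have been extracted from ASM, recovering the data or opening the database is straightforward.
If too many disk headers are damaged, or for any other reason neither method can extract the data files from the ASM disk group, see: "asm disk header 彻底损坏恢复" (recovering from completely corrupted ASM disk headers).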