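Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: ERROR: diskgroup XXXX was not mounted
Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]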
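The environment is a two-node 10.2.0.5 RAC on AIX. After the system disk on node 2 failed, node 1's system image was copied to node 2. The disk ordering on node 2 then no longer matched node 1, and after the AIX engineers carried out related operations, the datadg diskgroup could not be mounted once node 1 was restarted.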
SQL> alter diskgroup datadg mount
Mon Jun 10 23:23:46 CST 2019
NOTE: cache registered group DATADG number=1 incarn=0x8cf61164
Mon Jun 10 23:23:46 CST 2019
NOTE: Hbeat: instance first (grp 1)
Mon Jun 10 23:23:50 CST 2019
NOTE: start heartbeating (grp 1)
Mon Jun 10 23:23:50 CST 2019
NOTE: cache dismounting group 1/0x8CF61164 (DATADG)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATADG was not mounted
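Check the information for the datadg diskgroup. The alert log of an earlier successful mount (29 January 2019) lists the member disks: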
Tue Jan 29 19:21:45 CST 2019
NOTE: start heartbeating (grp 2)
NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/rhdisk6
Tue Jan 29 19:21:45 CST 2019
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7
NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/rhdisk8
NOTE: cache opening disk 3 of grp 2: DATADG_0003 path:/dev/rhdisk9
NOTE: cache mounting (first) group 2/0x60E59155 (DATADG)
* allocate domain 2, invalid = TRUE
Tue Jan 29 19:21:45 CST 2019
NOTE: attached to recovery domain 2
Tue Jan 29 19:21:45 CST 2019
NOTE: cache recovered group 2 to fcn 0.849668
Tue Jan 29 19:21:45 CST 2019
NOTE: LGWR attempting to mount thread 1 for disk group 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.849668 ABA
NOTE: seq=21 blk=5394
Tue Jan 29 19:21:46 CST 2019
NOTE: cache mounting group 2/0x60E59155 (DATADG) succeeded
SUCCESS: diskgroup DATADG was mounted
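When the alert log is long, the disk membership can be pulled out mechanically instead of read by eye. The following is a minimal Python sketch, not part of the original procedure: the function name `disks_in_log` is hypothetical, and the regular expression only assumes the "cache opening disk" line format seen in the log above.

```python
import re

# Matches lines such as:
#   NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7
OPEN_DISK = re.compile(r"cache opening disk (\d+) of grp (\d+): (\S+) path:(\S+)")


def disks_in_log(log_text):
    """Return {ASM disk name: device path} for every 'cache opening disk' line."""
    return {name: path for _disk, _grp, name, path in OPEN_DISK.findall(log_text)}


# Sample lines copied from the alert log above.
sample_log = """\
NOTE: cache opening disk 0 of grp 2: DATADG_0000 path:/dev/rhdisk6
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATADG_0001 path:/dev/rhdisk7
NOTE: cache opening disk 2 of grp 2: DATADG_0002 path:/dev/rhdisk8
NOTE: cache opening disk 3 of grp 2: DATADG_0003 path:/dev/rhdisk9
"""

for name, path in sorted(disks_in_log(sample_log).items()):
    print(name, path)
```

Because the pattern stops the path at the first whitespace, it also works on a log whose lines have been joined into a single line.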
From the alert log above, the datadg diskgroup is made up of four disks, rhdisk6 through rhdisk9. Querying the information for these disks shows:

This confirms that rhdisk7 is the abnormal disk. Its contents are analyzed with kfed:
D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 34 ; 0x001: 0x22
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 49407 ; 0x004: blk=49407
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 58396 ; 0x010: 0x0000e41c
kfbh.fcn.wrap: 131072 ; 0x014: 0x00020000
kfbh.spare1: 4294967064 ; 0x018: 0xffffff18
kfbh.spare2: 2105310074 ; 0x01c: 0x7d7c7b7a
005918A00 00002200 0000C0FF 00000000 00000000 [."..............]
005918A10 0000E41C 00020000 FFFFFF18 7D7C7B7A [............z{|}]
005918A20 00000000 00000000 00000000 00000000 [................]
Repeat 253 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd blkn=1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
006EF8A00 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
D:\BaiduNetdiskDownload\xifenfei>kfed read rhdisk7.dd blkn=2|more
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 33554432 ; 0x004: blk=33554432
kfbh.block.obj: 16777344 ; 0x008: file=128
kfbh.check: 3844041089 ; 0x00c: 0xe51f6981
kfbh.fcn.base: 1297484544 ; 0x010: 0x4d560b00
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdatb10.aunum: 0 ; 0x000: 0x00000000
kfdatb10.shrink: 49153 ; 0x004: 0xc001
kfdatb10.ub2pad: 20555 ; 0x006: 0x504b
kfdatb10.auinfo[0].link.next: 2048 ; 0x008: 0x0800
kfdatb10.auinfo[0].link.prev: 2048 ; 0x00a: 0x0800
kfdatb10.auinfo[0].free: 0 ; 0x00c: 0x0000
kfdatb10.auinfo[0].total: 49153 ; 0x00e: 0xc001
kfdatb10.auinfo[1].link.next: 4096 ; 0x010: 0x1000
kfdatb10.auinfo[1].link.prev: 4096 ; 0x012: 0x1000
kfdatb10.auinfo[1].free: 0 ; 0x014: 0x0000
kfdatb10.auinfo[1].total: 0 ; 0x016: 0x0000
kfdatb10.auinfo[2].link.next: 6144 ; 0x018: 0x1800
kfdatb10.auinfo[2].link.prev: 6144 ; 0x01a: 0x1800
kfdatb10.auinfo[2].free: 0 ; 0x01c: 0x0000
kfdatb10.auinfo[2].total: 0 ; 0x01e: 0x0000
kfdatb10.auinfo[3].link.next: 8192 ; 0x020: 0x2000
kfdatb10.auinfo[3].link.prev: 8192 ; 0x022: 0x2000
kfdatb10.auinfo[3].free: 0 ; 0x024: 0x0000
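The large values kfed prints for block 2 (for example blk=33554432) look like byte-order effects: the image comes from AIX, which is big-endian, while kfed here ran on Windows and decoded the 32-bit fields as little-endian. The Python sketch below illustrates this. It is an illustration only: the 32-byte layout is inferred from the offsets in the kfed output (0x000 to 0x01c), the names `parse_kfbh` and `swap32` are hypothetical, and the raw bytes are reconstructed from the values kfed printed, not read from the disk.

```python
import struct
from collections import namedtuple

# Field order follows the kfbh.* lines in the kfed output:
# four 1-byte fields, then seven 4-byte fields (32 bytes in total).
KfbhHeader = namedtuple(
    "KfbhHeader",
    "endian hard type datfmt blk obj check fcn_base fcn_wrap spare1 spare2",
)


def parse_kfbh(block, big_endian=True):
    """Decode the first 32 bytes of a block as a kfbh header."""
    prefix = ">" if big_endian else "<"
    return KfbhHeader(*struct.unpack(prefix + "4B7I", block[:32]))


def swap32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return struct.unpack(">I", struct.pack("<I", value))[0]


# Header bytes of block 2, reconstructed from the kfed values above (assumption).
raw = bytes.fromhex(
    "00820301" "00000002" "80000001" "81691fe5"
    "000b564d" "00000000" "00000000" "00000000"
)

as_windows_kfed = parse_kfbh(raw, big_endian=False)  # what kfed printed
as_aix = parse_kfbh(raw, big_endian=True)            # native interpretation

print(as_windows_kfed.blk, "->", as_aix.blk)
print(hex(as_windows_kfed.obj), "->", hex(as_aix.obj))
```

Read as big-endian, kfbh.block.blk becomes 2, which agrees with the blkn=2 that was requested. That suggests the header of block 2 is intact, unlike blocks 0 and 1.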
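Next, compare the disks to gauge the extent of the damage. On AIX, ASM disk blocks have a recognizable signature: they generally begin with the bytes 00 82 (kfbh.endian 0x00 followed by kfbh.hard 0x82, as in block 2 above). Open each disk with a tool, search for this marker, and compare the results.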
Healthy disk

Damaged disk
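The same marker comparison can be scripted. The following Python sketch is a rough illustration and not the tool used in this case: it assumes a metadata block size of 4096 bytes (the ASM default) and that only the metadata region at the start of the disk is scanned, since data extents do not carry this header. The function name `marked_blocks` is hypothetical.

```python
import io

ASM_BLOCK_SIZE = 4096   # assumption: default ASM metadata block size
MARKER = b"\x00\x82"    # kfbh.endian=0x00, kfbh.hard=0x82


def marked_blocks(stream, limit_blocks):
    """Yield the numbers of the blocks whose first two bytes equal MARKER."""
    for blkn in range(limit_blocks):
        block = stream.read(ASM_BLOCK_SIZE)
        if not block:          # end of image reached
            break
        if block.startswith(MARKER):
            yield blkn


# In-memory stand-in for a damaged image, mirroring the kfed findings:
# block 0 has hard=0x22, block 1 is zeroed, block 2 still has the marker.
demo = io.BytesIO(
    b"\x00\x22" + bytes(ASM_BLOCK_SIZE - 2)
    + bytes(ASM_BLOCK_SIZE)
    + b"\x00\x82" + bytes(ASM_BLOCK_SIZE - 2)
)
demo_hits = list(marked_blocks(demo, limit_blocks=3))
print(demo_hits)  # prints [2]
```

For real images, open each dd copy with `open(path, "rb")` and run the same scan over the same number of blocks on a healthy disk and on the damaged one. Block numbers present in the first list but absent from the second point to damaged metadata.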

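Based on the analysis above, the metadata damage on rhdisk7 appears to extend beyond blocks 0 and 1, so manually repairing the disk and continuing to use the diskgroup is unlikely to succeed. Since the customer's database is small, the approach chosen was to copy the data files, redo logs, and control files directly to a file system and then open the database on the local file system.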

Fortunately, the recovery was complete, with zero data loss.
