OGG-02771 Input trail file format RELEASE 19.1 is different from previous trail file form at RELEASE 11.2.

Contact: phone/WeChat (+86 17813235971), QQ (107644445); for QQ inquiries, contact 惜分飞

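Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal action.]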

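The source database was upgraded from 11.2.0.4 to 19c, and the OGG on the source was upgraded from 11.2 to 19.1 along with it (the target was already running OGG 19.1). Once the OGG trail files were shipped to the target, the replicat process went straight to ABENDED: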

GGSCI (xifenfei) 3> info replicat HISCA01,detail

REPLICAT   HISCA01   Last Started 2024-12-06 17:18   Status ABENDED
Checkpoint Lag       00:00:00 (updated 13:35:38 ago)
Log Read Checkpoint  File /data/ogg/dirdat/his/re000148
                     2024-12-06 01:12:04.078756  RBA 51446

Use view report to see the error details:

***********************************************************************
**                     Run Time Messages                             **
***********************************************************************


2024-12-06 17:50:55  INFO    OGG-02243  Opened trail file /data/ogg/dirdat/his/re000148 at 2024-12-06 17:50:55.559447.

2024-12-06 17:50:55  INFO    OGG-02232  Switching to next trail file /data/ogg/dirdat/his/re000000149 
     at 2024-12-06 17:50:55.559447 due to EOF. with current RBA 51,446.

Source Context :
  SourceModule            : [er.replicat.processloop]
  SourceID                : [er/replicat/processloop.cpp]
  SourceMethod            : [processReplicatLoop]
  SourceLine              : [1111]
  ThreadBacktrace         : [12] elements
                          : [/data/ogg/libgglog.so(CMessageContext::AddThreadContext())]
                          : [/data/ogg/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...))]
                          : [/data/ogg/libgglog.so(_MSG_Int32_String(CSourceContext*, int, int, char const*, CMessageFactory::MessageDisposition))]
                          : [/data/ogg/replicat()]
                          : [/data/ogg/replicat(ggs::er::ReplicatContext::run())]
                          : [/data/ogg/replicat()]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::MainThread::ExecMain())]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::Thread::RunThread(ggs::gglib::MultiThreading::Thread::ThreadArgs*))]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::MainThread::Run(int, char**))]
                          : [/data/ogg/replicat(main)]
                          : [/lib64/libc.so.6(__libc_start_main)]
                          : [/data/ogg/replicat()]

2024-12-06 17:50:55  ERROR   OGG-02171  Error reading LCR from data source. Status 524, data source type TrailDataSource.

Source Context :
  SourceModule            : [er.replicat.ReplicatContext]
  SourceID                : [er/replicat/ReplicatContext.cpp]
  SourceMethod            : [onTrailFormatChange]
  SourceLine              : [564]
  ThreadBacktrace         : [17] elements
                          : [/data/ogg/libgglog.so(CMessageContext::AddThreadContext())]
                          : [/data/ogg/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...))]
                          : [/data/ogg/libgglog.so(_MSG_String_String_String(CSourceContext*, int, char const*, char const*,
                             char const*, CMessageFactory::MessageDisposition))]
                          : [/data/ogg/replicat(ggs::er::ReplicatContext::onTrailFormatChange(char const*, unsigned short, unsigned short) const)]
                          : [/data/ogg/replicat(ggs::gglib::ggtrail::TrailDataSource::updateTrailCompat(ggs::gglib::ggtrail::TrailFile const&))]
                          : [/data/ogg/replicat(ggs::er::ReplicatTrailDataSource::updateTrailCompat(ggs::gglib::ggtrail::TrailFile const&))]
                          : [/data/ogg/replicat(ggs::gglib::ggtrail::TrailDataSource::
                             readNextTrailRecord(ggs::gglib::gglcr::CommonLCR**, long*, int&, int&, bool, bool))]
                          : [/data/ogg/replicat(ggs::er::ReplicatTrailDataSource::readLCR(ggs::gglib::gglcr::CommonLCR**, long&, bool&))]
                          : [/data/ogg/replicat(ggs::er::ReplicatContext::processReplicatLoop())]
                          : [/data/ogg/replicat(ggs::er::ReplicatContext::run())]
                          : [/data/ogg/replicat()]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::MainThread::ExecMain())]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::Thread::RunThread(ggs::gglib::MultiThreading::Thread::ThreadArgs*))]
                          : [/data/ogg/replicat(ggs::gglib::MultiThreading::MainThread::Run(int, char**))]
                          : [/data/ogg/replicat(main)]
                          : [/lib64/libc.so.6(__libc_start_main)]
                          : [/data/ogg/replicat()]

2024-12-06 17:50:55  ERROR   OGG-02771  Input trail file /data/ogg/dirdat/his/re000000149 format RELEASE 19.1 
                                        is different from previous trail file form at RELEASE 11.2.

State of the trail files:

[oracle@xifenfei his]$ ls -ltr
total 2167648
-rw-r----- 1 oracle oinstall 157604039 Nov 14 11:44 re000144
-rw-r----- 1 oracle oinstall 499999979 Nov 21 16:48 re000145
-rw-r----- 1 oracle oinstall 499999866 Dec  2 10:06 re000146
-rw-r----- 1 oracle oinstall 266123675 Dec  6 03:36 re000147
-rw-r----- 1 oracle oinstall     51446 Dec  6 04:15 re000148
-rw-r----- 1 oracle oinstall      1211 Dec  6 04:15 re000000149
-rw-r----- 1 oracle oinstall  43711175 Dec  6 17:50 re000000150

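In short, replicat finished reading file 148, but when it moved on to file 149 it found that the trail format had changed from 11.2 to 19.1, and the process abended.
To resolve this, manually reposition the replicat to the start of file 149: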

GGSCI (xifenfei) 5>  Alter replicat HISCA01 EXTSEQNO 149, EXTRBA 0

2024-12-06 17:53:01  INFO    OGG-06594  Replicat HISCA01 has been altered. 
Even the start up position might be updated, duplicate suppression remains active in next startup.
To override duplicate suppression, start HISCA01 with NOFILTERDUPTRANSACTIONS option.

REPLICAT altered.


GGSCI (xifenfei) 6> start HISCA01

Sending START request to MANAGER ...
REPLICAT HISCA01 starting

GGSCI (xifenfei) 8> stats HISCA01

Sending STATS request to REPLICAT HISCA01 ...

Start of Statistics at 2024-12-06 17:53:20.

Replicating from U_XFF_A.T_XFF to U_XFF_B.T_XFF:

*** Total statistics since 2024-12-06 17:53:12 ***
        Total inserts                                    431.00
        Total updates                                      0.00
        Total deletes                                    307.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                 738.00

*** Daily statistics since 2024-12-06 17:53:12 ***
        Total inserts                                    431.00
        Total updates                                      0.00
        Total deletes                                    307.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                 738.00

*** Hourly statistics since 2024-12-06 17:53:12 ***
        Total inserts                                    431.00
        Total updates                                      0.00
        Total deletes                                    307.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                 738.00

*** Latest statistics since 2024-12-06 17:53:12 ***
        Total inserts                                    431.00
        Total updates                                      0.00
        Total deletes                                    307.00
        Total upserts                                      0.00
        Total discards                                     0.00
        Total operations                                 738.00

End of Statistics.
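
The ls -ltr listing earlier in this post shows the sequence part of the trail file name growing from six digits (re000148) to nine (re000000149) at exactly the file where replicat abended. As a rough sketch, that boundary can be found from the file names alone. The helper name below is hypothetical, and treating a width change as the sign of a format change is an assumption drawn from this one listing, not a documented OGG rule.

```python
import re

# Hypothetical helper: a trail file name is taken to be a two-character
# prefix followed by a numeric sequence, e.g. "re000148".
TRAIL_NAME = re.compile(r"^([A-Za-z0-9]{2})(\d+)$")


def first_width_change(names):
    """Return (seqno, name) of the first file whose sequence-number width
    differs from the file before it, or None if the width never changes."""
    parsed = []
    for name in names:
        match = TRAIL_NAME.match(name)
        if match:
            digits = match.group(2)
            parsed.append((int(digits), len(digits), name))
    parsed.sort()  # order by numeric sequence, not by string
    for prev, cur in zip(parsed, parsed[1:]):
        if prev[1] != cur[1]:
            return cur[0], cur[2]
    return None


# File names copied from the listing in this post.
listing = ["re000144", "re000145", "re000146", "re000147",
           "re000148", "re000000149", "re000000150"]
change = first_width_change(listing)
if change is not None:
    seqno, name = change
    # Same command form as the one used in this post.
    print(f"ALTER REPLICAT HISCA01 EXTSEQNO {seqno}, EXTRBA 0")
```

The sketch only suggests where to reposition; the report file remains the authority on which file replicat actually rejected.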

OGG-02246 Source redo compatibility level 19.0.0 requires trail FORMAT 12.2 or higher

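Sometimes, for one reason or another, the source and target end up running different OGG versions. If the target version is higher than the source version, this is generally not a problem. If the source version is higher, however, consider adding settings like the following to the extract and data pump processes: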

EXTTRAIL <trail file>, FORMAT RELEASE 11.2
 
RMTTRAIL <trail file>, FORMAT RELEASE 11.2

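Note that with OGG 19 the lowest FORMAT RELEASE value accepted here is 12.2 (the error message ties this to the source redo compatibility level 19.0.0). Anything lower produces an error like the following, and the process cannot start: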

2024-12-05 15:06:59  INFO    OGG-02089  Source redo compatibility version is: 19.0.0.

2024-12-05 15:06:59  INFO    OGG-00546  Default thread stack size: 10485760.

Source Context :
  SourceModule     : [er.redo.ora]
  SourceID         : [er/redo/oracle/redoora.c]
  SourceMethod     : [validateOutTrailFileCompatibility]
  SourceLine       : [6931]
  ThreadBacktrace  : [15] elements
                   : [/ogg/libgglog.so(CMessageContext::AddThreadContext())]
                   : [/ogg/libgglog.so(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...))]
                   : [/ogg/libgglog.so(_MSG_String_String(CSourceContext*, int, char const*, char const*, CMessageFactory::MessageDisposition))]
                   : [/ogg/extract()]
                   : [/ogg/extract(RedoAPI::createInstance(ggs::gglib::ggdatasource::DataSource*, ggs::gglib::ggapp::ReplicationContext*))]
                   : [/ogg/extract(ggs::er::OraTranLogDataSource::setup())]
                   : [/ogg/extract(ggs::gglib::ggapp::ReplicationContext::establishStartPoints(char, ggs::gglib::ggdatasource::DataSourceParams const&))]
                   : [/ogg/extract(ggs::gglib::ggapp::ReplicationContext::initializeDataSources(ggs::gglib::ggdatasource::DataSourceParams&))]
                   : [/ogg/extract()]
                   : [/ogg/extract(ggs::gglib::MultiThreading::MainThread::ExecMain())]
                   : [/ogg/extract(ggs::gglib::MultiThreading::Thread::RunThread(ggs::gglib::MultiThreading::Thread::ThreadArgs*))]
                   : [/ogg/extract(ggs::gglib::MultiThreading::MainThread::Run(int, char**))]
                   : [/ogg/extract(main)]
                   : [/lib64/libc.so.6(__libc_start_main)]
                   : [/ogg/extract()]

2024-12-05 15:06:59  ERROR   OGG-02246  Source redo compatibility level 19.0.0 requires trail FORMAT 12.2 or higher.

2024-12-05 15:06:59  ERROR   OGG-01668  PROCESS ABENDING.

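The main cause is the BigSCN mechanism introduced in Oracle 12.2. Given the database versions customers most commonly run today, this problem is most likely to appear with a 19c source database and an 11.2 target database. For that kind of replication, the fix is to deploy OGG 19 on the target as well, using the OGG 19 build for 11.2 databases.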
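
Since a FORMAT RELEASE below 12.2 stops the extract at startup, it can be worth checking parameter files before starting the process. The following is a minimal sketch, not an OGG tool: the function name and the sample trail paths are hypothetical, the 12.2 floor is taken from the OGG-02246 message above, and comment handling in parameter files is ignored.

```python
import re

# Matches lines such as "EXTTRAIL <trail file>, FORMAT RELEASE 11.2".
TRAIL_FORMAT = re.compile(
    r"^\s*(EXTTRAIL|RMTTRAIL)\b.*?\bFORMAT\s+RELEASE\s+(\d+)\.(\d+)",
    re.IGNORECASE,
)


def too_low_formats(param_text, minimum=(12, 2)):
    """Return (line number, keyword, release) for every EXTTRAIL/RMTTRAIL
    line whose FORMAT RELEASE is below `minimum`."""
    problems = []
    for lineno, line in enumerate(param_text.splitlines(), start=1):
        match = TRAIL_FORMAT.match(line)
        if match is None:
            continue
        release = (int(match.group(2)), int(match.group(3)))
        if release < minimum:
            problems.append(
                (lineno, match.group(1).upper(), f"{release[0]}.{release[1]}")
            )
    return problems


# Hypothetical parameter file content, for illustration only.
sample = """EXTRACT ext1
EXTTRAIL ./dirdat/aa, FORMAT RELEASE 11.2
RMTTRAIL ./dirdat/bb, FORMAT RELEASE 12.2
"""
for lineno, keyword, release in too_low_formats(sample):
    print(f"line {lineno}: {keyword} uses FORMAT RELEASE {release}")
```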

GoldenGate 19 installation and patching

1. Download V983658-01.zip and upload it to /tmp
2. To install OGG 19.1.0.0.4, edit these two parameters in the silent-install response file /tmp/fbo_ggs_Linux_x64_shiphome/Disk1/response/oggcore.rsp

INSTALL_OPTION=ORA19c
SOFTWARE_LOCATION=/tmp/ogg
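
If the response file has to be prepared on several hosts, the same two edits can be scripted. This is a small sketch that assumes the file is made of plain KEY=value lines plus # comments; the function name and the sample text are hypothetical, and only the two keys and values shown above come from this post.

```python
def set_response_values(text, updates):
    """Return response-file text with KEY=value lines rewritten for every
    key in `updates`; keys not found in the text are appended at the end."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and not line.lstrip().startswith("#") and name in remaining:
            out.append(f"{name}={remaining.pop(name)}")
        else:
            out.append(line)  # comments and unrelated keys stay untouched
    out.extend(f"{name}={value}" for name, value in remaining.items())
    return "\n".join(out) + "\n"


# Hypothetical starting content; a real oggcore.rsp contains more entries.
sample = "# response file\nINSTALL_OPTION=\nSOFTWARE_LOCATION=\n"
print(set_response_values(
    sample,
    {"INSTALL_OPTION": "ORA19c", "SOFTWARE_LOCATION": "/tmp/ogg"},
), end="")
```

To apply it to the real file, read the file's text, pass it through the function, and write the result back; keeping a copy of the original first is a sensible precaution.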

3. Install OGG silently

[oracle@ora19c:/home/oracle]$ /tmp/fbo_ggs_Linux_x64_shiphome/Disk1/runInstaller -silent \
 -responseFile /tmp/fbo_ggs_Linux_x64_shiphome/Disk1/response/oggcore.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB.   Actual 6106 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 1257 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2024-12-02_10-57-00PM. Please wait ...
[oracle@ora19c:/home/oracle]$ You can find the log of this install session at:
 /data/app/oraInventory/logs/installActions2024-12-02_10-57-00PM.log
Successfully Setup Software.
The installation of Oracle GoldenGate Core was successful.
Please check '/data/app/oraInventory/logs/silentInstall2024-12-02_10-57-00PM.log' for more details.

[oracle@ora19c:/home/oracle]$ cd /tmp/ogg
[oracle@ora19c:/tmp/ogg]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Oct 17 2019 21:16:29
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.



GGSCI (ora19c) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           


GGSCI (ora19c) 2> create subdirs

Creating subdirectories under current directory /tmp/ogg

Parameter file                 /tmp/ogg/dirprm: created.
Report file                    /tmp/ogg/dirrpt: created.
Checkpoint file                /tmp/ogg/dirchk: created.
Process status files           /tmp/ogg/dirpcs: created.
SQL script files               /tmp/ogg/dirsql: created.
Database definitions files     /tmp/ogg/dirdef: created.
Extract data files             /tmp/ogg/dirdat: created.
Temporary files                /tmp/ogg/dirtmp: created.
Credential store files         /tmp/ogg/dircrd: created.
Masterkey wallet files         /tmp/ogg/dirwlt: created.
Dump files                     /tmp/ogg/dirdmp: created.


GGSCI (ora19c) 3> exit

4. Download patch p37236684_1925000OGGRU_Linux-x86-64.zip and OPatch p6880880_190000_Linux-x86-64.zip (version 12.2.0.1.44), then unpack them

[oracle@ora19c:/tmp]$ cd /tmp/
[oracle@ora19c:/tmp]$ unzip p37236684_1925000OGGRU_Linux-x86-64.zip 
[oracle@ora19c:/tmp]$ cd /tmp/ogg
[oracle@ora19c:/tmp/ogg]$ mv OPatch/ OPatch_bak
[oracle@ora19c:/tmp/ogg]$ unzip /tmp/p6880880_190000_Linux-x86-64.zip 
[oracle@ora19c:/tmp/ogg]$ /tmp/ogg/OPatch/opatch version
OPatch Version: 12.2.0.1.44

OPatch succeeded.

5. Apply the latest patch (37236684) to OGG 19.1.0.0.4

[oracle@ora19c:/tmp/ogg]$ export ORACLE_HOME=/tmp/ogg
[oracle@ora19c:/tmp/ogg]$ OPatch/opatch apply /tmp/37236684/
Oracle Interim Patch Installer version 12.2.0.1.44
Copyright (c) 2024, Oracle Corporation.  All rights reserved.


Oracle Home       : /tmp/ogg
Central Inventory : /data/app/oraInventory
   from           : /tmp/ogg/oraInst.loc
OPatch version    : 12.2.0.1.44
OUI version       : 12.2.0.4.0
Log file location : /tmp/ogg/cfgtoollogs/opatch/opatch2024-12-02_23-04-26PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   37236684  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.

Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/tmp/ogg')


Is the local system ready for patching? [y|n]
y
User Responded with: Y
Backing up files...
Applying interim patch '37236684' to OH '/tmp/ogg'

Patching component oracle.oggcore.ora19c, 19.1.0.0.0...
Patch 37236684 successfully applied.
Log file location: /tmp/ogg/cfgtoollogs/opatch/opatch2024-12-02_23-04-26PM_1.log

OPatch succeeded.
[oracle@ora19c:/tmp/ogg]$ OPatch/opatch lspatches
37236684;

OPatch succeeded.
[oracle@ora19c:/tmp/ogg]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.25.0.0.241105 OGGCORE_19.25.0.0.0OGGRU_PLATFORMS_241118.0932_FBO
Linux, x64, 64bit (optimized), Oracle 19c  on Nov 18 2024 13:19:46
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2024, Oracle and/or its affiliates. All rights reserved.



GGSCI (ora19c) 1> 
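
The ggsci banner is the quickest confirmation that the patch took effect: it reads 19.1.0.0.4 before patching and 19.25.0.0.241105 after. For a scripted check, the version line can be pulled out of captured banner text. A minimal sketch follows; the function name is hypothetical and the banner layout is assumed to match the two banners shown in this post.

```python
import re

# Expects a line such as "Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO".
VERSION_LINE = re.compile(r"^Version\s+(\d+(?:\.\d+)+)\s+(\S+)", re.MULTILINE)


def banner_version(banner):
    """Return ((major, minor, ...), build_label) from ggsci banner text,
    or None when no version line is present."""
    match = VERSION_LINE.search(banner)
    if match is None:
        return None
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2)


# Version lines copied from the two banners in this post.
before = "Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO"
after = "Version 19.25.0.0.241105 OGGCORE_19.25.0.0.0OGGRU_PLATFORMS_241118.0932_FBO"

# Tuples compare element by element, so 19.25 sorts above 19.1.
print(banner_version(after)[0] > banner_version(before)[0])  # prints True
```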

Recovering an ASM disk header damaged by dd

A friend ran dd against the header of an ASM disk and overwrote 2048 bytes of data.
[Screenshots omitted: dd-2048, asm-candidate, QQ20241202-204931]


Analysis confirmed that the GI software version is 11.2.0.4:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /u01/app/11.2.0/grid
System name:	Linux
Node name:	rac1
Release:	4.1.12-37.4.1.el6uek.x86_64
Version:	#2 SMP Tue May 17 07:23:38 PDT 2016
Machine:	x86_64

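In versions after 10.2.0.5, a backup copy of the ASM disk header is kept in the second-to-last block of the second AU (each block is 4 KB). The first step is therefore to determine the AU size, which can be read quickly from a healthy ASM disk (here, from a dd copy of a healthy disk's header):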

H:\TEMP\tmp\asmbak>kfed read sdcp.dd |grep ausize
kfdhdb.ausize:                 16777216 ; 0x0bc: 0x01000000
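
With the AU size known, the position of the backup header follows from simple arithmetic. The sketch below works it through for the values in this case; the function name is hypothetical, and the 4096-byte block size is the one stated above.

```python
def backup_header_location(au_size, block_size=4096):
    """Return (aun, blkn, byte_offset) for the backup disk header, taken to
    be the second-to-last block of the second AU (AU number 1)."""
    blocks_per_au = au_size // block_size
    aun = 1                      # AUs are numbered from 0, so the second AU is 1
    blkn = blocks_per_au - 2     # blocks are numbered from 0 within the AU
    byte_offset = aun * au_size + blkn * block_size
    return aun, blkn, byte_offset


# AU size reported by kfed for this disk group: 16777216 bytes (16 MB).
print(backup_header_location(16777216))  # prints (1, 4094, 33546240)
```

A 16 MB AU holds 4096 blocks of 4 KB, so the backup sits at block 4094 of AU 1, which is byte offset 33546240 in the disk image.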

Locate the backup disk header of the damaged ASM disk:

H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=4094 aun=1 aus=16777216|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                    4094 ; 0x004: blk=4094
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                   229348702 ; 0x00c: 0x0dab955e
kfbh.fcn.base:                 11727032 ; 0x010: 0x00b2f0b8
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:               DATA_0000 ; 0x028: length=9
kfdhdb.grpname:                    DATA ; 0x048: length=4
kfdhdb.fgname:                DATA_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             33123276 ; 0x0a8: HOUR=0xc DAYS=0x1e MNTH=0xa YEAR=0x7e5
kfdhdb.crestmp.lo:           2259134464 ; 0x0ac: USEC=0x0 MSEC=0x1ea SECS=0x2a MINS=0x21
kfdhdb.mntstmp.hi:             33162836 ; 0x0b0: HOUR=0x14 DAYS=0x12 MNTH=0x1 YEAR=0x7e8
kfdhdb.mntstmp.lo:           3600987136 ; 0x0b4: USEC=0x0 MSEC=0xad SECS=0x2a MINS=0x35
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                 16777216 ; 0x0bc: 0x01000000
kfdhdb.mfact:                    454272 ; 0x0c0: 0x0006ee80
kfdhdb.dsksize:                   65536 ; 0x0c4: 0x00010000
kfdhdb.pmcnt:                         2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn:                      0 ; 0x0d4: 0x00000000
kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
…………

Confirm that only the disk header on the damaged disk is corrupted (that is, check whether the second block is still good):

H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=0
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
0065D8400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]


H:\TEMP\tmp\asmbak>kfed read sdc.dd blkn=1|more
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2781697777 ; 0x00c: 0xa5cd56f1
kfbh.fcn.base:                 39359331 ; 0x010: 0x02589363
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                        1014 ; 0x004: 0x03f6
kfdfsb.cnt:                         147 ; 0x006: 0x0093
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                        0 ; 0x018: FREE=0x0 FRAG=0x0
…………
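
The same check can be approximated without kfed by looking at the type byte of each block in the dd image. This is only a sketch under assumptions: the field offset (kfbh.type at 0x002) and the three type codes are read off the kfed output in this post, the function names are hypothetical, and nothing here validates checksums the way kfed does.

```python
BLOCK_SIZE = 4096

# Type codes as printed by kfed in the output above.
KFBTYP_NAMES = {0: "KFBTYP_INVALID", 1: "KFBTYP_DISKHEAD", 2: "KFBTYP_FREESPC"}


def block_types(image, count=2, block_size=BLOCK_SIZE):
    """Return the kfbh.type name of the first `count` blocks of a disk
    image passed in as bytes."""
    names = []
    for n in range(count):
        block = image[n * block_size:(n + 1) * block_size]
        if len(block) < 3:
            break
        type_code = block[2]  # kfbh.type sits at offset 0x002 of the block
        names.append(KFBTYP_NAMES.get(type_code, f"type {type_code}"))
    return names


def blocks_overwritten(nbytes, block_size=BLOCK_SIZE):
    """Return the block numbers touched when the first `nbytes` bytes of
    the disk are overwritten."""
    return list(range((nbytes + block_size - 1) // block_size))


# Synthetic image mimicking this case: block 0 zeroed, block 1 intact.
intact = bytes([1, 0x82, 2, 2]) + bytes(BLOCK_SIZE - 4)
image = bytes(BLOCK_SIZE) + intact
print(block_types(image))        # prints ['KFBTYP_INVALID', 'KFBTYP_FREESPC']
print(blocks_overwritten(2048))  # prints [0]
```

Overwriting 2048 bytes stays inside block 0, the one block that has a backup copy, which is why this disk could be repaired from the backup header.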

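Based on the analysis above, the header was repaired directly from the backup ASM disk header using merge or repair, after which the ASM disk header status returned to normal.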


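This customer was lucky: the database is very large, yet only 2 KB of data was destroyed. Had more than 4 KB been overwritten, this could have been a much more troublesome incident. Once again, be very careful with dd operations on ASM disks. If too much of an ASM disk is destroyed by accident, see this earlier post on a similar case:
asm磁盘dd破坏恢复 (recovering an ASM disk damaged by dd)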

Recovering a disk group failure caused by deleting an asmlib disk

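A customer ran a drop disk operation on a disk group, then immediately ran oracleasm deletedisk at the Oracle asmlib layer and also deleted the disk partition at the operating-system level, which caused the disk group to dismount: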

Tue Nov 26 16:44:04 2024
SQL> alter diskgroup data drop disk DATA_0008 
NOTE: GroupBlock outside rolling migration privileged region
Tue Nov 26 08:44:05 2024
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
Tue Nov 26 08:44:14 2024
GMON querying group 2 at 48 for pid 18, osid 27385
SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
SUCCESS: alter diskgroup data drop disk DATA_0008
NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
Starting background process ARB0
Tue Nov 26 08:44:14 2024
ARB0 started with pid=38, OS id=56987 
NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
Tue Nov 26 08:44:17 2024
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Nov 26 08:44:57 2024
cellip.ora not found.
Tue Nov 26 17:08:46 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 17:10:30 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 09:34:38 2024
WARNING: cache read  a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
Tue Nov 26 09:34:38 2024
GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Nov 26 09:34:38 2024
NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA) 
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
Tue Nov 26 09:34:38 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc  (incident=413105):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
Tue Nov 26 09:34:39 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ERROR: ORA-15335 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: stopping process ARB0
Tue Nov 26 09:34:40 2024
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684

After the partition was re-created and the disk header was repaired with kfed repair, mounting the disk group again failed:

SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=2 incarn=0x73bec220
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
Tue Nov 26 11:48:22 2024
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 83 for pid 27, osid 15781
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
Tue Nov 26 11:48:22 2024
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Tue Nov 26 11:48:22 2024
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
NOTE: recovery initiating offline of disk 8 group 2 (*)
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Tue Nov 26 11:48:23 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]

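The customer ran oracleasm deletedisk, which in our experience wipes the first 1 MB of the ASM disk, and the disk was removed while the rebalance triggered by drop disk was still running. Under these conditions, the chance of repairing that 1 MB, mounting the disk group, and continuing to use it is small. The recommended options are therefore:
1. Recover the data from this disk group directly, then open the database
2. Extract directly the core table data the customer needs
An earlier customer case with a similar operation, where asmlib re-created the disk information: 分享oracleasm createdisk重新创建asm disk后数据0丢失恢复案例 (zero-data-loss recovery after oracleasm createdisk re-created an ASM disk)
A database recovery case after partition information was deleted: 删除分区 oracle asm disk 恢复 (recovering an Oracle ASM disk after its partition was deleted)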
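
The ORA-15196 arguments in the alert logs above are consistent with a wiped first 1 MB. The sketch below decodes them under stated assumptions: the high bit of the object number is taken to mark a disk-relative object (kfed prints 2147483648 as disk=0), the blocks are taken to be 4096 bytes in AU 0 (the alert log reports au=0 blk=98), and the function name is hypothetical. It is an interpretation aid, not Oracle's own decoding.

```python
import re

ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[[^\]]*\] \[(\w+)\] \[(\d+)\] \[(\d+)\] \[(\d+) != (\d+)\]"
)
DISK_FLAG = 0x80000000  # assumption: high bit marks a disk-relative object


def decode_ora_15196(line, block_size=4096):
    """Split an ORA-15196 message into its fields; byte_offset assumes the
    block lies in AU 0 and that blocks are `block_size` bytes."""
    match = ORA_15196.search(line)
    if match is None:
        return None
    check, obj, blk, found, expected = match.groups()
    obj = int(obj)
    return {
        "check": check,
        "disk": obj - DISK_FLAG if obj >= DISK_FLAG else None,
        "block": int(blk),
        "found": int(found),
        "expected": int(expected),
        "byte_offset": int(blk) * block_size,
    }


# Message copied from the alert log in this post.
msg = ("ORA-15196: invalid ASM block header [kfc.c:26368] "
       "[endian_kfbh] [2147483656] [98] [0 != 1]")
info = decode_ora_15196(msg)
print(info["disk"], info["block"], info["byte_offset"])  # prints 8 98 401408
print(info["byte_offset"] < 1024 * 1024)                 # prints True
```

Under those assumptions the failing block belongs to disk 8 (DATA_0008) and sits about 392 KB into the disk, inside the first 1 MB; the endian byte reads 0 where 1 was expected, which is what a zeroed block would show.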