ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

Posted on 2021 年 06 月 30 日 by 惜分飞

标题：ORA-15335: ASM metadata corruption detected in disk group ‘DATA’

asm磁盘组增加磁盘进行扩容之后报ORA-15335: ASM metadata corruption detected in disk group ‘DATA’和ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479],磁盘组dismount,然后mount之后立马dismount掉.

Tue Jun 29 09:19:09 2021
SQL> ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */ 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,1) to disk (/dev/raw/raw5)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk DATA_0001
NOTE: requesting all-instance disk validation for group=2
Tue Jun 29 09:19:11 2021
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0xb0c845ce (DATA) on local instance.
NOTE: initiating PST update: grp = 2
Tue Jun 29 09:19:16 2021
GMON updating group 2 at 7 for pid 27, osid 25020
NOTE: PST update grp = 2 completed successfully 
NOTE: membership refresh pending for group 2/0xb0c845ce (DATA)
GMON querying group 2 at 8 for pid 18, osid 3852
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/raw/raw5
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 2 at 9 for pid 18, osid 3852
SUCCESS: refreshed membership for 2/0xb0c845ce (DATA)
Tue Jun 29 09:19:20 2021
SUCCESS: ALTER DISKGROUP DATA ADD  DISK '/dev/raw/raw5' SIZE 102400M /* ASMCA */
NOTE: starting rebalance of group 2/0xb0c845ce (DATA) at power 1
Starting background process ARB0
Tue Jun 29 09:19:21 2021
ARB0 started with pid=33, OS id=25176 
NOTE: assigning ARB0 to group 2/0xb0c845ce (DATA) with 1 parallel I/O
cellip.ora not found.
Tue Jun 29 09:19:24 2021
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Jun 29 09:19:46 2021
WARNING: cache read  a corrupt block: group=2(DATA) dsk=0 blk=7 disk=0 (DATA_0000) incarn=3915953476 au=0 blk=7 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ERROR: cache failed to read group=2(DATA) dsk=0 blk=7 from disk(s): 0(DATA_0000)
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _arb0_+asm1 (25176) initiating offline of disk 0.3915953476 (DATA_0000) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 0/0xe968b544, mask = 0x6a, op = clear
Tue Jun 29 09:19:46 2021
GMON updating disk modes for group 2 at 10 for pid 33, osid 25176
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Jun 29 09:19:46 2021
NOTE: cache dismounting (not clean) group 2/0xB0C845CE (DATA) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 25395, image: oracle@frsrac1 (B000)
Tue Jun 29 09:19:46 2021
NOTE: halting all I/Os to diskgroup 2 (DATA)
Tue Jun 29 09:19:46 2021
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=11.10715 last written ABA 11.10715
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc  (incident=54665):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_54665/+ASM1_arb0_25176_i54665.trc
Tue Jun 29 09:19:46 2021
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
Tue Jun 29 09:19:46 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 24)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 2 invalid = TRUE 
 796 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
Tue Jun 29 09:19:46 2021
WARNING: dirty detached from domain 2
NOTE: cache dismounted group 2/0xB0C845CE (DATA) 
SQL> alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */ 
Tue Jun 29 09:19:47 2021
ERROR: ORA-15130 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_25176.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0000" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
ORA-15196: invalid ASM block header [kfc.c:26368] [check_kfbh] [2147483648] [7] [2183628676 != 686982479]
Tue Jun 29 09:19:47 2021
NOTE: stopping process ARB0
Tue Jun 29 09:19:47 2021
Sweep [inc][54665]: completed
Tue Jun 29 09:19:47 2021
Sweep [inc2][54665]: completed
NOTE: cache deleting context for group DATA 2/0xb0c845ce
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_3852.trc:
ORA-15130: diskgroup "DATA" is being dismounted
GMON dismounting group 2 at 11 for pid 27, osid 25395
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup DATA was dismounted
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER:2965915086 */

通过kfed分析报错block,确认错误

kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       7 ; 0x004: blk=7
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2183628676 ; 0x00c: 0x82278784 <<======该值错误,应该为:686982479
kfbh.fcn.base:                     3430 ; 0x010: 0x00000d66
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdatb.aunum:                      2240 ; 0x000: 0x000008c0
kfdatb.shrink:                      448 ; 0x004: 0x01c0
kfdatb.ub2pad:                        0 ; 0x006: 0x0000

通过修复该错误,并且禁止reblance操作[增加磁盘数据需要重新分布],mount磁盘组,然后open库,发现redo已经被覆盖(非归档),强制打开库报错

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [2691201882], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [0], [2691201881], [0],
[2691227745], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [2691201879], [0],
[2691227745], [12583040], [], [], [], [], [], []
Process ID: 25110
Session ID: 287 Serial number: 3

通过对scn进行处理,数据库顺利open

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 5044088832 bytes
Fixed Size		    2261928 bytes
Variable Size		 1442843736 bytes
Database Buffers	 3590324224 bytes
Redo Buffers		    8658944 bytes
Database mounted.
SQL> alter database open;

Database altered.

/u01空间100%导致数据库报ORA-01114 ORA-29701

Posted on 2021 年 06 月 29 日 by 惜分飞

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：/u01空间100%导致数据库报ORA-01114 ORA-29701

数据库操作报ORA-01114 ORA-29701错误

通过分析发现磁盘/u01分区使用100%

sqlplus / as sysdba 无法登陆

清理trace,释放/u01空间之后,数据库恢复正常，mos中关于ORA-01114错误描述

APPLIES TO:
BI Publisher (formerly XML Publisher) - Version 11.1.1.7.x and later
Information in this document applies to any platform.

SYMPTOMS
 While generating reports the following error occurred.

oracle.xdo.XDOException: java.sql.SQLException: ORA-01114: IO error writing block to file (block # ) 
ORA-01114: IO error writing block to file 201 (block # 755561) 
ORA-27072: File I/O error Additional information: 4 
Additional information: 755561 Additional information: 36864

CHANGES
 

CAUSE
 The issue is caused due to DB not having enough space.

SOLUTION
Cleared space in DB and the report worked fine.

lvm缩小xfs文件系统空间和对swap进行扩容操作

Posted on 2021 年 06 月 28 日 by 惜分飞

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：lvm缩小xfs文件系统空间和对swap进行扩容操作

xfs文件系统lvm缩小空间操作（/home从100G减小到80G)

[root@xifenfei ~]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  449G  6.0G  443G   2% /
devtmpfs                63G     0   63G   0% /dev
tmpfs                   63G     0   63G   0% /dev/shm
tmpfs                   63G   20M   63G   1% /run
tmpfs                   63G     0   63G   0% /sys/fs/cgroup
/dev/mapper/rhel-home  100G   38M  100G   1% /home
/dev/sda2             1014M  165M  850M  17% /boot
/dev/sda1              200M  9.8M  191M   5% /boot/efi
tmpfs                   13G  4.0K   13G   1% /run/user/42
tmpfs                   13G   32K   13G   1% /run/user/0
/dev/sr0               4.2G  4.2G     0 100% /media

[root@xifenfei u01]# xfsdump -f /home.xfsdump /home
xfsdump: using file dump (drive_simple) strategy
xfsdump: version 3.1.7 (dump format 3.0) - type ^C for status and control

 ============================= dump label dialog ==============================

please enter label for this dump session (timeout in 300 sec)
 -> home
session label entered: "tar czvf /home.tar.gz /home
home"

 --------------------------------- end dialog ---------------------------------

xfsdump: level 0 dump of xifenfei:/home
xfsdump: dump date: Fri Jun 25 11:37:13 2021
xfsdump: session id: 4d75008e-9927-417d-9722-52d13bb89eb0
xfsdump: session label: 
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 4828224 bytes
xfsdump: /var/lib/xfsdump/inventory created

 ============================= media label dialog =============================

please enter label for media in drive 0 (timeout in 300 sec)
 -> home
media label entered: "home"

 --------------------------------- end dialog ---------------------------------

xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: ending media file
xfsdump: media file size 4732672 bytes
xfsdump: dump size (non-dir files) : 4588480 bytes
xfsdump: dump complete: 4 seconds elapsed
xfsdump: Dump Summary:
xfsdump:   stream 0 /home.xfsdump OK (success)
xfsdump: Dump Status: SUCCESS

[root@xifenfei u01]# umount /home
[root@xifenfei u01]# lvreduce -L 80G /dev/mapper/rhel-home
  WARNING: Reducing active logical volume to 80.00 GiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce rhel/home? [y/n]: y
  Size of logical volume rhel/home changed from 100.00 GiB (25600 extents) to 80.00 GiB (20480 extents).
  Logical volume rhel/home successfully resized.

[root@xifenfei u01]# mkfs.xfs -f /dev/mapper/rhel-home
meta-data=/dev/mapper/rhel-home  isize=512    agcount=16, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=20971520, imaxpct=25
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=10240, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@xifenfei u01]# mount /home
xfsrestore -f /home.xfsdump /home
[root@xifenfei u01]# xfsrestore -f /home.xfsdump /home
xfsrestore: using file dump (drive_simple) strategy
xfsrestore: version 3.1.7 (dump format 3.0) - type ^C for status and control
xfsrestore: searching media for dump
xfsrestore: examining media file 0
xfsrestore: dump description: 
xfsrestore: hostname: xifenfei
xfsrestore: mount point: /home
xfsrestore: volume: /dev/mapper/rhel-home
xfsrestore: session time: Fri Jun 25 11:37:13 2021
xfsrestore: level: 0
xfsrestore: session label: "tar czvf /home.tar.gz /home
home"
xfsrestore: media label: "home"
xfsrestore: file system id: b996cff9-332b-4c07-96e1-8335a1f23627
xfsrestore: session id: 4d75008e-9927-417d-9722-52d13bb89eb0
xfsrestore: media id: 6094b9b5-a45f-4638-a0e2-c1b982ead67b
xfsrestore: using online session inventory
xfsrestore: searching media for directory dump
xfsrestore: reading directories
xfsrestore: 119 directories and 188 entries processed
xfsrestore: directory post-processing
xfsrestore: restoring non-directory files
xfsrestore: restore complete: 0 seconds elapsed
xfsrestore: Restore Summary:
xfsrestore:   stream 0 /home.xfsdump OK (success)
xfsrestore: Restore Status: SUCCESS
[root@xifenfei u01]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root  449G   14G  435G   4% /
devtmpfs                63G     0   63G   0% /dev
tmpfs                   63G   20M   63G   1% /run
tmpfs                   63G     0   63G   0% /sys/fs/cgroup
/dev/sda2             1014M  165M  850M  17% /boot
/dev/sda1              200M  9.8M  191M   5% /boot/efi
tmpfs                   13G  4.0K   13G   1% /run/user/42
tmpfs                   13G   28K   13G   1% /run/user/0
/dev/sr0               4.2G  4.2G     0 100% /media
tmpfs                   63G     0   63G   0% /dev/shm
/dev/mapper/rhel-home   80G   38M   80G   1% /home

xfs系统的lvm无法直接缩小空间,只能是通过xfsdump /home内容,然后lvm缩小空间重做xfs文件系统,再使用xfsdump还原

lvm扩容swap空间(swap从8G扩大到16G)

[root@xifenfei home]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128355       86907       26110         274       15338       37632
Swap:         8192           0        8192
[root@xifenfei home]# lvextend -L 16GB /dev/rhel/swap
  Size of logical volume rhel/swap changed from 8.00 GiB (2048 extents) to 16.00 GiB (4096 extents).
  Logical volume rhel/swap successfully resized.
[root@xifenfei home]# sync;sync
[root@xifenfei home]# swapoff /dev/rhel/swap
mkswap /dev/rhel/swap 
[root@xifenfei home]# mkswap /dev/rhel/swap 
mkswap: /dev/rhel/swap: warning: wiping old swap signature.
swapon /dev/rhel/swap Setting up swapspace version 1, size = 16777212 KiB
no label, UUID=8d79ccf4-1796-49c9-968d-23abb67bc6eb
[root@xifenfei home]# swapon /dev/rhel/swap 
[root@xifenfei home]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128355       86907       26110         274       15338       37632
Swap:         16383           0       16383

expdp 并行导出单表数据

Posted on 2021 年 06 月 18 日 by 惜分飞

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：expdp 并行导出单表数据

在某些情况下,需要使用并行的方法使用 datapump 对单个对象并行导出,导入加快数据迁移的数据
expdp导出操作

#!/bin/bash
chunk=10
for ((i=0;i<chunk;i++));
do
  expdp USERNAME/Password@DB_NAME TABLES=LOB_TEST QUERY=LOB_TEST:\"where mod\(dbms_rowid.rowid_block_number\(rowid\)\, 
>${chunk}\) = ${i}\" directory=DMP dumpfile=lob_test_${i}.dmp logfile= log_test_${i}.log &
  echo $i
done

impdp导入操作

#!/bin/bash
chunk=10
for ((i=0;i<chunk;i++));
do
 impdp USERNAME/Password@DB_NAME  directory=DMP REMAP_TABLE=LOB_TEST:LOB_TEST  remap_schema=source:target 
>dumpfile= lob_test_${i}.dmp logfile=TABLE_imp_log_test_${i}.log  DATA_OPTIONS=DISABLE_APPEND_HINT  CONTENT=DATA_ONLY &
 echo $i
done

在12c版本开始impdp可能会启用ENABLE_PARALLEL_DML特性,需要注意
参考:Optimising LOB Export and Import Performance via Oracle DataPump

datapump network_link遭遇ORA-12899错误

Posted on 2021 年 06 月 16 日 by 惜分飞

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：datapump network_link遭遇ORA-12899错误

在给一个客户使用expdp+network_link导出数据,然后通过impdp导入数据的过程中遇到ORA-12899问题.

对原库和现在库进行分析

原库和目标库表结构一致,原库该表存储数据实际长度确实为1,但是在impdp导入的时候提示需要长度为3.通过分析,确认原库的nls_length_semantics参数设置为char了,直接使用impdp+network_link不落地方式导入该表数据成功

根据上述情况,查询相关文档,确认类似记录为:
ORA-12899 When Using IMPDP Over Network Link (Doc ID 414901.1)
ORA-26059 During Impdp Using Export Dump Taken With NETWORK_LINK Option (Doc ID 2266956.1)
虽然都不是完全匹配该问题,但是基本上可以确认expdp的network_link和nls_length_semantics参数是引起该问题的根本原因,在后续的迁移中,尽量保持nls_length_semantics参数一致.