Recovering a disk group after an asmlib disk was deleted

Contact: mobile/WeChat (+86 17813235971), QQ (107644445), QQ consultation: 惜分飞

Author: 惜分飞 © All rights reserved [Reproduction in any form without my consent is prohibited; I reserve the right to pursue legal liability.]

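After running drop disk against a disk group, a customer immediately ran oracleasm deletedisk at the Oracle asmlib layer and then deleted the disk partition at the operating-system level. The disk group dismounted straight away: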

Tue Nov 26 16:44:04 2024
SQL> alter diskgroup data drop disk DATA_0008 
NOTE: GroupBlock outside rolling migration privileged region
Tue Nov 26 08:44:05 2024
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
Tue Nov 26 08:44:14 2024
GMON querying group 2 at 48 for pid 18, osid 27385
SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
SUCCESS: alter diskgroup data drop disk DATA_0008
NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
Starting background process ARB0
Tue Nov 26 08:44:14 2024
ARB0 started with pid=38, OS id=56987 
NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
Tue Nov 26 08:44:17 2024
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Tue Nov 26 08:44:57 2024
cellip.ora not found.
Tue Nov 26 17:08:46 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 17:10:30 2024
SQL> alter diskgroup data drop disk DATA_0008 
ORA-15032: not all alterations performed
ORA-15071: ASM disk "DATA_0008" is already being dropped
ERROR: alter diskgroup data drop disk DATA_0008
Tue Nov 26 09:34:38 2024
WARNING: cache read  a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
Tue Nov 26 09:34:38 2024
GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Nov 26 09:34:38 2024
NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA) 
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
Tue Nov 26 09:34:38 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc  (incident=413105):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
Tue Nov 26 09:34:39 2024
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ERROR: ORA-15335 thrown in ARB0 for group number 2
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
NOTE: stopping process ARB0
Tue Nov 26 09:34:40 2024
NOTE: LGWR doing non-clean dismount of group 2 (DATA)
NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684
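
As a reading aid for the ORA-15196 lines above, here is a minimal Python sketch that splits the bracketed arguments apart. The interpretation in the comments is an assumption on my part, based on how the values line up with dsk=8 blk=98 in the same log: the third argument is treated as an object number whose high bit marks physically addressed metadata (low bits giving the disk number), and the last argument as "found != expected". The name parse_ora15196 is hypothetical.

```python
import re

# Matches lines such as:
# ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
ORA15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[(?P<src>[^\]]+)\] \[(?P<field>[^\]]+)\] "
    r"\[(?P<obj>\d+)\] \[(?P<blk>\d+)\] "
    r"\[(?P<found>\d+) != (?P<expected>\d+)\]"
)

PHYSICAL_FLAG = 0x80000000  # assumption: high bit marks physically addressed metadata


def parse_ora15196(line):
    """Return a dict describing one ORA-15196 line, or None if it does not match."""
    m = ORA15196.search(line)
    if m is None:
        return None
    obj = int(m.group("obj"))
    return {
        "source": m.group("src"),
        "field": m.group("field"),
        "physical": bool(obj & PHYSICAL_FLAG),
        "disk_or_file": obj & ~PHYSICAL_FLAG,  # clear the high bit
        "block": int(m.group("blk")),
        "found": int(m.group("found")),
        "expected": int(m.group("expected")),
    }


sample = ("ORA-15196: invalid ASM block header [kfc.c:26368] "
          "[endian_kfbh] [2147483656] [98] [0 != 1]")
info = parse_ora15196(sample)
print(info)
```

For the sample line this yields disk 8, block 98, with 0 found where 1 was expected in endian_kfbh, which would be consistent with a metadata block that has been overwritten with zeros.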

After the partition was re-created and the disk header repaired with kfed repair, mounting the disk group again failed:

SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=2 incarn=0x73bec220
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
Tue Nov 26 11:48:22 2024
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 83 for pid 27, osid 15781
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
Tue Nov 26 11:48:22 2024
* allocate domain 2, invalid = TRUE 
kjbdomatt send to inst 2
Tue Nov 26 11:48:22 2024
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
NOTE: recovery initiating offline of disk 8 group 2 (*)
NOTE: cache initiating offline of disk 8 group DATA
NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
Tue Nov 26 11:48:23 2024
NOTE: halting all I/Os to diskgroup 2 (DATA)
NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]

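Because the customer ran oracleasm deletedisk, which in my experience wipes the first 1M of data at the head of the ASM disk, and because the disk was destroyed just as drop disk had triggered a rebalance, there is little chance of repairing that first 1M, mounting the disk group, and carrying on using it. My recommendations were therefore:
1. Recover the data in the disk group directly, then open the database
2. Extract only the core table data the customer needs
A similar operation by an earlier customer, where asmlib re-created the disk information, was recovered in: Sharing a recovery case with zero data loss after oracleasm createdisk re-created an ASM disk
A database recovery case after partition information was deleted: Deleted partition, Oracle ASM disk recovery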
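
The claim that the first 1M of the disk was wiped can be checked before deciding on an approach. The sketch below is a minimal illustration under stated assumptions: it works on a copy or image of the disk (the demo uses a scratch file, not a real device), and the helper name leading_zero_bytes is hypothetical. It only counts how many leading bytes are zero, up to 1 MiB.

```python
import os
import tempfile

ONE_MIB = 1024 * 1024


def leading_zero_bytes(path, limit=ONE_MIB, chunk=64 * 1024):
    """Count how many bytes at the start of path are zero, up to limit."""
    count = 0
    with open(path, "rb") as f:
        while count < limit:
            data = f.read(min(chunk, limit - count))
            if not data:
                break  # file is shorter than limit
            stripped = data.lstrip(b"\x00")
            count += len(data) - len(stripped)
            if stripped:
                break  # reached the first non-zero byte
    return count


# Demo on a scratch file standing in for a disk image (not a real device).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * ONE_MIB + b"\x01" + b"\x00" * 100)
    demo_path = tmp.name

zeroed = leading_zero_bytes(demo_path)
print(zeroed == ONE_MIB)
os.remove(demo_path)
```

Reading a real device this way needs read permission on it and should only ever be done read-only.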

Recovering Oracle after NTFS MFT corruption (NTFS file system failure)

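In a customer's virtualized environment, the database failed to start with ORA-01157 after a power outage. At the operating-system level the file was present, but dbv reported it as inaccessible.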


Suspecting file system corruption, I tried copying the file to another disk.

The operating system event log confirmed that the NTFS MFT was corrupted.

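Given this, I tried to recover the file with a file system recovery tool, which warned that the size of the recovered file did not match the size recorded in the metadata.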

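Comparing the size actually recovered with the size of the file itself gives 7811899392-7791460352, which is almost 20M (that is, the recovered data file is about 20M short). The database alert log confirms that the file had recently been extended by exactly 20M (the data file was added with an autoextend increment of 20M):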

2023-08-11T11:29:21.397236+08:00
ALTER TABLESPACE "HSHIS" ADD DATAFILE
'D:\APP\ADMINISTRATOR\ORADATA\HIS\HSHIS01.DBF' SIZE 10M AUTOEXTEND ON NEXT 20M MAXSIZE 8001M
Completed: ALTER TABLESPACE "HSHIS" ADD DATAFILE
'D:\APP\ADMINISTRATOR\ORADATA\HIS\HSHIS01.DBF' SIZE 10M AUTOEXTEND ON NEXT 20M MAXSIZE 8001M

2024-10-09T00:18:31.058537+08:00
Resize operation completed for file# 66, old size 7608320K, new size 7628800K
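
The size comparison can be worked through in a few lines of Python. The two byte counts and the resize figures come from the text and the alert log above; the 8K block size, and the idea that the on-disk size is the logged size plus one extra header block, are assumptions that the arithmetic happens to agree with.

```python
BLOCK_SIZE = 8192  # assumption: 8K database block size

actual_bytes = 7811899392      # size of the file itself, per the metadata
recovered_bytes = 7791460352   # size of the file the recovery tool produced

missing_bytes = actual_bytes - recovered_bytes
missing_mib = missing_bytes / (1024 * 1024)

# Resize line from the alert log: old size 7608320K, new size 7628800K
resize_kib = 7628800 - 7608320

# Logged new size plus one block for the file header block (assumption)
expected_bytes = 7628800 * 1024 + BLOCK_SIZE

print(missing_bytes, round(missing_mib, 2), resize_kib, expected_bytes == actual_bytes)
```

The shortfall works out to 20439040 bytes (about 19.49 MiB), slightly less than the 20480K added by the last resize, which fits a file that lost most of its final extension.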

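Block-level analysis of the file confirmed that the lost blocks were exactly the final 20M (the rdba of every block before that point was correct). For this kind of failure, padding the tail of the data file tricks the database into accepting the file (any business data written into the final 20M may be lost). With the file repaired I tried to open the database, but the result was not encouraging: the redo was damaged as well.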
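
The rdba check mentioned above can be sketched as follows. This is an illustration, not the tool used in this case. It assumes an 8K block size, a little-endian platform, a smallfile tablespace (22-bit block number), and that the rdba sits at byte offset 4 of each block header; the helper names are hypothetical. Blocks that were never formatted would also be reported as mismatches.

```python
import struct

BLOCK_SIZE = 8192   # assumption: 8K block size
RDBA_OFFSET = 4     # assumption: rdba at byte offset 4 of the block header
BLOCK_BITS = 22     # low 22 bits of a smallfile rdba hold the block number


def decode_rdba(rdba):
    """Split a 32-bit rdba into (relative file number, block number)."""
    return rdba >> BLOCK_BITS, rdba & ((1 << BLOCK_BITS) - 1)


def first_bad_block(data, rel_file_no):
    """Return the first block number whose rdba does not match its position.

    data is the raw content of a data file image; block 0 (the file
    header block) is skipped. Returns None when every block matches.
    """
    total = len(data) // BLOCK_SIZE
    for blk in range(1, total):
        off = blk * BLOCK_SIZE + RDBA_OFFSET
        (rdba,) = struct.unpack_from("<I", data, off)  # little-endian
        if decode_rdba(rdba) != (rel_file_no, blk):
            return blk
    return None


def make_block(rel_file_no, blk):
    """Build a synthetic block with only the header fields this check reads."""
    header = struct.pack("<BBHI", 0x06, 0xA2, 0, (rel_file_no << BLOCK_BITS) | blk)
    return header + bytes(BLOCK_SIZE - len(header))


# Synthetic demo: four good blocks for relative file 66, then a zeroed block.
image = bytes(BLOCK_SIZE) + b"".join(make_block(66, b) for b in range(1, 5))
damaged = image + bytes(BLOCK_SIZE)
print(first_bad_block(image, 66), first_bad_block(damaged, 66))
```

On a real file the scan would be done in chunks rather than by loading the whole image into memory; the in-memory form is kept here only to keep the sketch short.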


With consistency checks bypassed, the database was force-opened successfully:

2024-10-18T04:24:43.911107+08:00
ALTER DATABASE RECOVER    CANCEL  
2024-10-18T04:24:47.098637+08:00
Errors in file E:\TRACE\diag\rdbms\his\his\trace\his_pr00_2608.trc:
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\ORADATA\SYSTEM01.DBF'
2024-10-18T04:24:47.114278+08:00
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
2024-10-18T04:25:03.989398+08:00
alter database open resetlogs
2024-10-18T04:25:05.598781+08:00
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 2666786639 time 
Resetting resetlogs activation ID 3659241623 (0xda1b9897)
2024-10-18T04:25:12.380089+08:00
Setting recovery target incarnation to 3
2024-10-18T04:25:15.052071+08:00
Ping without log force is disabled:
  instance mounted in exclusive mode.
Endian type of dictionary set to little
2024-10-18T04:25:15.458286+08:00
Assigning activation ID 3703362676 (0xdcbcd474)
2024-10-18T04:25:15.505102+08:00
TT00 (PID:4092): Gap Manager starting
2024-10-18T04:25:15.551992+08:00
Redo log for group 1, sequence 1 is not located on DAX storage
2024-10-18T04:25:17.833250+08:00
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: E:\ORADATA\REDO01.LOG
Successful open of redo thread 1
2024-10-18T04:25:17.848888+08:00
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
stopping change tracking
2024-10-18T04:25:22.052035+08:00
Undo initialization recovery: err:0 start: 24275578 end: 24276578 diff: 1000 ms (1.0 seconds)
Undo initialization online undo segments: err:0 start: 24276578 end: 24276593 diff: 15 ms (0.0 seconds)
Undo initialization finished serial:0 start:24275578 end:24276640 diff:1062 ms (1.1 seconds)
Dictionary check beginning
Dictionary check complete
Verifying minimum file header compatibility for tablespace encryption..
Verifying file header compatibility for tablespace encryption completed for pdb 0
2024-10-18T04:25:23.114610+08:00
Database Characterset is AL32UTF8
No Resource Manager plan active
2024-10-18T04:25:29.036475+08:00
replication_dependency_tracking turned off (no async multimaster replication found)
2024-10-18T04:25:32.833386+08:00
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Starting background process AQPC
2024-10-18T04:25:33.145881+08:00
AQPC started with pid=37, OS id=5560 
2024-10-18T04:25:35.677167+08:00
Starting background process CJQ0
2024-10-18T04:25:35.708430+08:00
CJQ0 started with pid=39, OS id=2728 
2024-10-18T04:25:36.724036+08:00
Completed: alter database open resetlogs

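The data was then exported into a new database. Along the way, the 20M missing from the end of file# 66 prevented some data from being exported normally; this was handled by discarding the damaged portion and restoring the remaining good table data into the new database.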

Emptying the redo logs leads to ORA-27048: skgfifi: file header information is invalid

Short on disk space, the customer emptied Oracle's redo files with the > redo command.


After the database went down, startup failed with:

Fri Oct 04 10:32:57 2024
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_24876.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/xifenfei/redo03.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 13
Aborting crash recovery due to error 313
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_24876.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/xifenfei/redo03.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 13
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_24876.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/xifenfei/redo03.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 13
ORA-313 signalled during: alter database open...
Fri Oct 04 10:32:58 2024
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_m000_29646.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/xifenfei/redo01.log'
ORA-27047: unable to read the header block of file
Linux-x86_64 Error: 25: Inappropriate ioctl for device
Additional information: 1
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_m000_29646.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/xifenfei/redo02.log'
ORA-27047: unable to read the header block of file
Linux-x86_64 Error: 25: Inappropriate ioctl for device
Additional information: 1
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_m000_29646.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/xifenfei/redo03.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 11
Checker run found 6 new persistent data failures
Fri Oct 04 10:47:32 2024
db_recovery_file_dest_size of 4182 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
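
To see at a glance which redo members are affected, the error stacks in the excerpt above can be grouped by file. The following Python sketch is a hypothetical helper, not part of any Oracle tooling; it assumes each stack has the shape shown in this log (an ORA-00312 line naming the member, followed by the ORA codes that apply to it).

```python
import re

ONLINE_LOG = re.compile(
    r"ORA-00312: online log (?P<group>\d+) thread (?P<thread>\d+): '(?P<path>[^']+)'"
)
ORA_CODE = re.compile(r"^(ORA-\d{5}):")
CONTINUATION = ("Additional information", "Linux-x86_64 Error")


def summarize_redo_errors(lines):
    """Map each redo member path to the set of ORA codes that follow it."""
    result = {}
    current = None
    for line in lines:
        line = line.strip()
        m = ONLINE_LOG.search(line)
        if m:
            current = m.group("path")
            result.setdefault(current, set())
            continue
        code = ORA_CODE.match(line)
        if code and current is not None and code.group(1) != "ORA-00313":
            result[current].add(code.group(1))
        elif not code and not line.startswith(CONTINUATION):
            current = None  # any other line ends the current error stack
    return result


sample = """\
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/xifenfei/redo01.log'
ORA-27047: unable to read the header block of file
Linux-x86_64 Error: 25: Inappropriate ioctl for device
Additional information: 1
Errors in file /home/oracle/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_m000_29646.trc:
ORA-00313: open failed for members of log group 3 of thread 1
ORA-00312: online log 3 thread 1: '/u01/app/oracle/oradata/xifenfei/redo03.log'
ORA-27048: skgfifi: file header information is invalid
Additional information: 11
""".splitlines()

summary = summarize_redo_errors(sample)
for path, codes in sorted(summary.items()):
    print(path, sorted(codes))
```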

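In this situation every redo log has been emptied (including the current and active ones), so the only option is to force the database open. Luck was on our side and the forced open succeeded.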

Sun Oct 06 10:09:01 2024
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 25668466513
Resetting resetlogs activation ID 4222555315 (0xfbaf14b3)
Sun Oct 06 10:09:10 2024
Setting recovery target incarnation to 3
Sun Oct 06 10:09:10 2024
Assigning activation ID 79943739 (0x4c3d83b)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /u01/app/oracle/oradata/xifenfei/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Oct 06 10:09:11 2024
SMON: enabling cache recovery
Undo initialization finished serial:0 start:70198684 end:70198794 diff:110 (1 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
Sun Oct 06 10:09:12 2024
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sun Oct 06 10:09:13 2024
QMNC started with pid=23, OS id=4328 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Sun Oct 06 10:09:16 2024
db_recovery_file_dest_size of 4182 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Sun Oct 06 10:09:16 2024
Starting background process CJQ0
Sun Oct 06 10:09:16 2024
CJQ0 started with pid=25, OS id=4413 
Completed: alter database open resetlogs

Using the alert log to trace and critique a customer's own database recovery attempt

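A 12.1.0.2 database could not start normally after an abnormal power loss. The alert log is used here to analyze the customer's entire sequence of operations (my own operations are not included).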


According to the alert log, the initial failure was caused by corrupt blocks in the control file

Tue Sep 24 11:49:48 2024
alter database open
Tue Sep 24 11:49:48 2024
Ping without log force is disabled
.
Tue Sep 24 11:49:48 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_4715.trc:
ORA-01113: file 10 needs media recovery
ORA-01110: data file 10: '/u01/app/oracle/oradata/xifenfei.dbf'
ORA-1113 signalled during: alter database open...
alter database recover datafile '/u01/app/oracle/oradata/xifenfei.dbf'

The data files that could not be recovered normally were taken offline

Tue Sep 24 13:13:30 2024
Media Recovery Complete (orcl)
Completed: ALTER DATABASE RECOVER  datafile 15  
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' END BACKUP
ORA-1235 signalled during: ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' END BACKUP...
ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' offline
Completed: ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xifenfei.dbf' offline
Tue Sep 24 13:25:16 2024
 ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xff.dbf' offline
Completed:  ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/xff.dbf' offline

The customer then tried to open the database, hit ORA-600 4193, and the open failed

Tue Sep 24 13:27:06 2024
Media Recovery Complete (orcl)
Completed: ALTER DATABASE RECOVER  datafile 13   
alter database open
Tue Sep 24 13:27:16 2024
Ping without log force is disabled
.
Tue Sep 24 13:27:16 2024
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Tue Sep 24 13:27:16 2024
Started redo scan
Tue Sep 24 13:27:16 2024
Completed redo scan
 read 67 KB redo, 0 data blocks need recovery
Tue Sep 24 13:27:16 2024
Started redo application at
 Thread 1: logseq 7422, block 2, scn 119284797
Tue Sep 24 13:27:16 2024
Recovery of Online Redo Log: Thread 1 Group 3 Seq 7422 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/redo03.log
Tue Sep 24 13:27:16 2024
Completed redo application of 0.00MB
Tue Sep 24 13:27:16 2024
Completed crash recovery at
 Thread 1: logseq 7422, block 136, scn 119284798
 0 data blocks read, 0 data blocks written, 67 redo k-bytes read
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Starting background process TMON
Tue Sep 24 13:27:16 2024
TMON started with pid=32, OS id=10617 
Tue Sep 24 13:27:16 2024
Thread 1 advanced to log sequence 7423 (thread open)
Thread 1 opened at log sequence 7423
  Current log# 1 seq# 7423 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
Tue Sep 24 13:27:16 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Sep 24 13:27:16 2024
SMON: enabling cache recovery
Tue Sep 24 13:27:20 2024
[10553] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:6974064 end:6975474 diff:1410 ms (1.4 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #10 is offline, but is part of an online tablespace.
data file 10: '/u01/app/oracle/oradata/tbs_data.dbf'
File #14 is offline, but is part of an online tablespace.
data file 14: '/u01/app/oracle/oradata/corsmf03.dbf'
Dictionary check complete
Verifying minimum file header compatibility (11g) for tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Tue Sep 24 13:27:21 2024
SMON: enabling tx recovery
Tue Sep 24 13:27:21 2024
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Starting background process SMCO
Tue Sep 24 13:27:21 2024
SMCO started with pid=34, OS id=10632 
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_10523.trc  (incident=108129):
ORA-00600: internal error code, arguments: [4193], [21368], [21372], [], [], [], [], [], [], [], [], []
Incident details in:/u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_108129/orcl_smon_10523_i108129.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
…………
Tue Sep 24 13:27:24 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_10553.trc:
ORA-00600: internal error code, arguments: [4193], [21652], [21539], [], []
Tue Sep 24 13:27:24 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_10553.trc:
ORA-00600: internal error code, arguments: [4193], [21652], [21539], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 10553): terminating the instance due to error 600
Tue Sep 24 13:27:25 2024
Instance terminated by USER, pid = 10553
ORA-1092 signalled during: alter database open...

The customer re-created the control file, set the hidden parameter _allow_resetlogs_corruption, and tried to open the database with resetlogs, which raised ORA-600 2662

Tue Sep 24 14:30:22 2024
alter database open RESETLOGS
Tue Sep 24 14:32:09 2024
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 119237645 time 
Online log /u01/app/oracle/oradata/orcl/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/orcl/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/orcl/redo03.log: Thread 1 Group 3 was previously cleared
Tue Sep 24 14:32:09 2024
Setting recovery target incarnation to 2
Tue Sep 24 14:32:09 2024
Ping without log force is disabled
.
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Tue Sep 24 14:32:09 2024
Warning - High Database SCN: Current SCN value is 119237648, threshold SCN value is 0
If you have not previously reported this warning on this database, 
please notify Oracle Support so that additional diagnosis can be performed.
Starting background process TMON
Tue Sep 24 14:32:09 2024
TMON started with pid=25, OS id=15032 
Tue Sep 24 14:32:09 2024
Assigning activation ID 1708301307 (0x65d29bfb)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
Tue Sep 24 14:32:09 2024
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Sep 24 14:32:09 2024
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14937.trc  (incident=122458):
ORA-00600: internal error code, arguments: [2662], [0], [119484861], [0], [119484868], [16777344]……
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_122458/orcl_ora_14937_i122458.trc
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14937.trc  (incident=122459):
………………
Tue Sep 24 14:32:16 2024
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_124802/orcl_ora_14937_i124802.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [0], [119484866], [0], [119484868], [16777344]……
ORA-00600: internal error code, arguments: [2662], [0], [119484865], [0], [119484868], [16777344]……
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [119484861], [0], [119484868], [16777344]……

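That is as far as the customer's own recovery got, and it did not succeed. The customer made no fundamental mistakes (nothing damaged the resetlogs information of the files), but neither of the two ORA-600 errors was resolved
1. The database was opened with some files offline (without resetlogs, which avoided further damage to the resetlogs information of the offline files), but the open failed with ORA-600 4193
2. The control file was re-created before the forced open, which removed the risk of the offline data files ending up with header resetlogs information inconsistent with the other files after resetlogs (re-creating the control file brings offline files back online automatically)
3. The ORA-600 4193 raised at startup after the data files were first taken offline was never resolved. This error is usually caused by an undo problem and will most likely come up again during the forced open
4. The forced open hit ORA-600 2662, which requires adjusting the SCN; until that is done the database cannot be opened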
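
How far the SCN has to move can be read off the ORA-600 [2662] arguments. The sketch below assumes the commonly cited layout of those arguments (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, then the data block address the dependent SCN came from) and a 22-bit block number in that address. The function name decode_2662 is hypothetical.

```python
import re

ARGS = re.compile(r"\[(\d*)\]")


def decode_2662(line):
    """Decode the five arguments that follow [2662] (layout is an assumption)."""
    values = [int(v) for v in ARGS.findall(line) if v]
    if len(values) < 6 or values[0] != 2662:
        raise ValueError("not an ORA-600 [2662] line with five arguments")
    cur_wrap, cur_base, dep_wrap, dep_base, dba = values[1:6]
    current_scn = (cur_wrap << 32) | cur_base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        "gap": dependent_scn - current_scn,
        "file": dba >> 22,         # relative file number
        "block": dba & 0x3FFFFF,   # block number within the file
    }


line = ("ORA-00600: internal error code, arguments: "
        "[2662], [0], [119484861], [0], [119484868], [16777344]")
decoded = decode_2662(line)
print(decoded)
```

For the line taken from the log, the dependent SCN is 7 ahead of the current SCN and the address decodes to file 4, block 128. The gap is small here, but the current SCN keeps changing on each attempt, as the three different values in the log show.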

ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

An application connecting to a 10g database received ORA-12514: TNS:listener does not currently know of service requested in connect descriptor


Analysis of the alert log confirmed that the database was hitting ORA-600 4194 at startup

Mon Sep 23 16:12:42 2024
SMON: enabling cache recovery
Mon Sep 23 16:12:43 2024
Successfully onlined Undo Tablespace 1.
Mon Sep 23 16:12:43 2024
SMON: enabling tx recovery
Mon Sep 23 16:12:43 2024
Database Characterset is ZHS16GBK
Mon Sep 23 16:12:43 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_7832.trc:
ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []

DEBUG: Replaying xcb 0xae312888, pmd 0x9058f4d4 for failed op 8
Doing block recovery for file 2 block 5547
No block recovery was needed
Mon Sep 23 16:13:31 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_7832.trc:
ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []

Mon Sep 23 16:13:32 2024
DEBUG: Replaying xcb 0xae312888, pmd 0x9058f4d4 for failed op 8
Mon Sep 23 16:13:32 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_7832.trc:
ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []

Doing block recovery for file 2 block 5547
No block recovery was needed
Mon Sep 23 16:13:33 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\udump\xifenfei_ora_7832.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []

Mon Sep 23 16:14:18 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_5880.trc:
ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []

Mon Sep 23 16:14:19 2024
DEBUG: Replaying xcb 0xae312888, pmd 0x9058f4d4 for failed op 8
Mon Sep 23 16:14:19 2024
Non-fatal internal error happenned while SMON was doing shrinking of rollback segments.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Mon Sep 23 16:14:19 2024
Doing block recovery for file 2 block 5547
No block recovery was needed
Mon Sep 23 16:15:06 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_6952.trc:
ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []

Mon Sep 23 16:15:06 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_pmon_6952.trc:
ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []

Mon Sep 23 16:15:06 2024
PMON: terminating instance due to error 472
Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_psp0_2104.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_lgwr_3200.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw1_448.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw0_7436.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_mman_1704.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_dbw2_5072.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_ckpt_6628.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_reco_7924.trc:
ORA-00472: PMON  process terminated with error

Mon Sep 23 16:15:07 2024
Errors in file d:\oracle\product\10.2.0\admin\xifenfei\bdump\xifenfei_smon_5880.trc:
ORA-00472: PMON  process terminated with error

Instance terminated by PMON, pid = 6952

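This one is fairly simple: it is usually an undo problem. Switching undo to manual management and then rebuilding the undo tablespace completed this recovery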
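
The evidence for the undo diagnosis is that the same ORA-600 [4194] arguments and the same block keep recurring in the log. A minimal Python sketch of how that repetition could be tallied is shown below; the helper name summarize is hypothetical. Reading file 2 as the undo data file is an assumption that should be checked against the data dictionary.

```python
import re
from collections import Counter

# Works for both the English and the localized message text.
ORA600 = re.compile(r"ORA-00600: .*?\[(\d+)\], \[(\d*)\], \[(\d*)\]")
BLOCK_RECOVERY = re.compile(r"Doing block recovery for file (\d+) block (\d+)")


def summarize(lines):
    """Count ORA-600 argument triples and block-recovery targets."""
    errors = Counter()
    blocks = Counter()
    for line in lines:
        m = ORA600.search(line)
        if m:
            errors[m.groups()] += 1
        b = BLOCK_RECOVERY.search(line)
        if b:
            blocks[(int(b.group(1)), int(b.group(2)))] += 1
    return errors, blocks


sample = [
    "ORA-00600: 内部错误代码, 参数: [4194], [66], [50], [], [], [], [], []",
    "Doing block recovery for file 2 block 5547",
    "No block recovery was needed",
    "ORA-00600: internal error code, arguments: [4194], [66], [50], [], [], [], [], []",
]
errors, blocks = summarize(sample)
print(errors.most_common(), blocks.most_common())
```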