ORA-01172 ORA-01151 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-01172 ORA-01151 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

节点2报Error: Controlfile sequence number in file header is different from the one in memory,导致实例异常

Tue May 09 23:03:24 2023
Thread 2 cannot allocate new log, sequence 16728
Checkpoint not complete
  Current log# 3 seq# 16727 mem# 0: +DATA/xff/onlinelog/group_3.265.941900045
  Current log# 3 seq# 16727 mem# 1: +FRA/xff/onlinelog/group_3.259.941900045
Thread 2 advanced to log sequence 16728 (LGWR switch)
  Current log# 4 seq# 16728 mem# 0: +DATA/xff/onlinelog/group_4.266.941900045
  Current log# 4 seq# 16728 mem# 1: +FRA/xff/onlinelog/group_4.260.941900045
Tue May 09 23:03:31 2023
LNS: Standby redo logfile selected for thread 2 sequence 16728 for destination LOG_ARCHIVE_DEST_2
Tue May 09 23:03:32 2023
Archived Log entry 431615 added for thread 2 sequence 16727 ID 0x5ffc99b5 dest 1:
Tue May 09 23:05:30 2023
Error: Controlfile sequence number in file header is different from the one in memory
       Please check that the correct mount options are used if controlfile is located on NFS
USER (ospid: 30162): terminating the instance
Tue May 09 23:05:30 2023
System state dump requested by (instance=2, osid=30162), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_diag_6650.trc
Instance terminated by USER, pid = 30162

在节点1 进行实例重组之后,节点1 实例异常

Tue May 09 23:04:54 2023
Thread 1 cannot allocate new log, sequence 2060
Checkpoint not complete
  Current log# 1 seq# 2059 mem# 0: +DATA/xff/onlinelog/group_1.261.941899887
  Current log# 1 seq# 2059 mem# 1: +FRA/xff/onlinelog/group_1.257.941899887
Thread 1 advanced to log sequence 2060 (LGWR switch)
  Current log# 2 seq# 2060 mem# 0: +DATA/xff/onlinelog/group_2.262.941899889
  Current log# 2 seq# 2060 mem# 1: +FRA/xff/onlinelog/group_2.258.941899889
Tue May 09 23:04:58 2023
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
Tue May 09 23:05:31 2023
Reconfiguration started (old inc 20, new inc 22)
List of instances:
 1 (myinst: 1) 
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Tue May 09 23:05:31 2023
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Tue May 09 23:05:31 2023
 LMS 0: 3 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Tue May 09 23:05:32 2023
Instance recovery: looking for dead threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Tue May 09 23:06:00 2023
ARC1 (ospid: 26512): terminating the instance
Tue May 09 23:06:00 2023
System state dump requested by (instance=1, osid=26512 (ARC1)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_diag_26311.trc
Tue May 09 23:06:01 2023
ORA-1092 : opitsk aborting process
Instance terminated by ARC1, pid = 26512

实例重启报错

Recovery of Online Redo Log: Thread 1 Group 1 Seq 2059 Reading mem 0
  Mem# 0: +DATA/dbm/onlinelog/group_1.261.941899887
  Mem# 1: +FRA/dbm/onlinelog/group_1.257.941899887
Recovery of Online Redo Log: Thread 2 Group 3 Seq 16727 Reading mem 0
  Mem# 0: +DATA/dbm/onlinelog/group_3.265.941900045
  Mem# 1: +FRA/dbm/onlinelog/group_3.259.941900045
Recovery of Online Redo Log: Thread 2 Group 4 Seq 16728 Reading mem 0
  Mem# 0: +DATA/dbm/onlinelog/group_4.266.941900045
  Mem# 1: +FRA/dbm/onlinelog/group_4.260.941900045
Hex dump of (file 1, block 102777) in trace file /u01/app/oracle/diag/rdbms/dbm/dbm2/trace/dbm2_ora_30749.trc
Reading datafile '+DATA/dbm/datafile/system.256.941899799' for corruption at rdba: 0x00419179 (file 1, block 102777)
Reread (file 1, block 102777) found different corrupt data (logically corrupt)
Hex dump of (file 1, block 102777) in trace file /u01/app/oracle/diag/rdbms/dbm/dbm2/trace/dbm2_ora_30749.trc
RECOVERY OF THREAD 2 STUCK AT BLOCK 102777 OF FILE 1
Abort recovery for domain 0
Aborting crash recovery due to error 1172
Errors in file /u01/app/oracle/diag/rdbms/dbm/dbm2/trace/dbm2_ora_30749.trc:
ORA-01172: recovery of thread 2 stuck at block 102777 of file 1
ORA-01151: use media recovery to recover block, restore backup if needed
Abort recovery for domain 0
Errors in file /u01/app/oracle/diag/rdbms/dbm/dbm2/trace/dbm2_ora_30749.trc:
ORA-01172: recovery of thread 2 stuck at block 102777 of file 1
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN /* db agent *//* {0:890:17} */...

人工recover操作失败报ORA-600 3020错误

SQL> recover datafile 1;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [1], [102777], [4297081],[], []
ORA-10567: Redo is inconsistent with data block (file# 1, block# 102777, file
offset is 841949184 bytes)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: '+DATA/dbm/datafile/system.256.941899799'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 469884

---alert日志
Tue May 09 23:28:44 2023
ALTER DATABASE RECOVER  datafile 1  
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 2 Group 3 Seq 16727 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_3.265.941900045
  Mem# 1: +FRA/xff/onlinelog/group_3.259.941900045
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 1  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +FRA/xff/archivelog/2023_05_09/thread_1_seq_2055.20899.1136415701
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +FRA/xff/archivelog/2023_05_09/thread_1_seq_2056.20837.1136415753
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +FRA/xff/archivelog/2023_05_09/thread_1_seq_2057.20911.1136415803
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +FRA/xff/archivelog/2023_05_09/thread_1_seq_2058.21898.1136415853
Recovery of Online Redo Log: Thread 2 Group 4 Seq 16728 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_4.266.941900045
  Mem# 1: +FRA/xff/onlinelog/group_4.260.941900045
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2059 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_1.261.941899887
  Mem# 1: +FRA/xff/onlinelog/group_1.257.941899887
Hex dump of (file 1, block 102777) in trace file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_16246.trc
Reading datafile '+DATA/xff/datafile/system.256.941899799' for corruption at rdba: 0x00419179 (file 1, block 102777)
Reread (file 1, block 102777) found different corrupt data (logically corrupt)
Hex dump of (file 1, block 102777) in trace file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_16246.trc
Tue May 09 23:28:59 2023
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_16246.trc  (incident=6868615):
ORA-00600: internal error code, arguments: [3020], [1], [102777], [4297081], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 1, block# 102777, file offset is 841949184 bytes)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: '+DATA/xff/datafile/system.256.941899799'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 469884
Incident details in: /u01/app/oracle/diag/rdbms/xff/xff1/incident/incdir_6868615/xff1_ora_16246_i6868615.trc
Tue May 09 23:29:00 2023
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...

根据上述报错信息可以确认报错的是一个index,而且非系统核心对象,可以通过allow 1 corruption方式进行恢复,并且open库成功

SQL> recover  datafile 1 allow 1 corruption;
Media recovery complete.
SQL> alter database open;

Database altered.

SQL> select owner,object_name,object_type from dba_objects where object_id=469884;

OWNER
--------------------------------------------------------------------------------
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
---------------------------------------------------------
SYSTEM
PK_XFF_SERVERS
INDEX

SQL> alter index system.PK_XFF_SERVERS rebuild online;

Index altered.

数据库完美恢复,数据0丢失,业务可以直接正常使用

存储双活同步导致数据库异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储双活同步导致数据库异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户双活存储异常之后,单个存储运行,故障存储修复之后,双活同步,出现多套系统异常,上一篇:Control file mount id mismatch!故障处理,这套是win的rac无法正常启动,ocr磁盘组异常(报ORA-600 kfrValAcd30无法正常mount)

C:\Users\Administrator>crsctl start cluster -all
CRS-2672: 尝试启动 'ora.crf' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.asm' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.crf' (在 'xff1' 上)
CRS-2672: 尝试启动 'ora.asm' (在 'xff1' 上)
CRS-2676: 成功启动 'ora.crf' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.crf' (在 'xff1' 上)
CRS-5017: 资源操作 "ora.asm start" 遇到以下错误:
ORA-00600: internal error code, arguments: [kfrValAcd30], [OCR_VOTE], [1], [14], [7556], [15], [7584], [], [], [], [], []
。有关详细信息, 请参阅 "(:CLSN00107:)" (位于 "F:\app\grid\Administrator\diag\crs\xff2\crs\trace\ohasd_oraagent_system.trc" 中)。
CRS-2674: 未能启动 'ora.asm' (在 'xff2' 上)
CRS-2679: 尝试清除 'ora.asm' (在 'xff2' 上)
CRS-5017: 资源操作 "ora.asm start" 遇到以下错误:
ORA-00600: internal error code, arguments: [kfrValAcd30], [OCR_VOTE], [1], [14], [7556], [15], [7584], [], [], [], [], []
。有关详细信息, 请参阅 "(:CLSN00107:)" (位于 "F:\app\grid\Administrator\diag\crs\xff1\crs\trace\ohasd_oraagent_system.trc" 中)。
CRS-2674: 未能启动 'ora.asm' (在 'xff1' 上)
CRS-2679: 尝试清除 'ora.asm' (在 'xff1' 上)
CRS-2681: 成功清除 'ora.asm' (在 'xff2' 上)
CRS-2673: 尝试停止 'ora.crf' (在 'xff2' 上)
CRS-2677: 成功停止 'ora.crf' (在 'xff2' 上)
CRS-2681: 成功清除 'ora.asm' (在 'xff1' 上)
CRS-2673: 尝试停止 'ora.crf' (在 'xff1' 上)
CRS-2677: 成功停止 'ora.crf' (在 'xff1' 上)
CRS-4705: 无法在节点 xff1 上启动集群件。
CRS-4705: 无法在节点 xff2 上启动集群件。
CRS-4000: 命令 Start 失败, 或已完成但出现错误。

因为是ocr磁盘组操作比较简单,直接重建该磁盘组,还原ocr等即可

C:\Users\Administrator>asmtool -list
NTFS                             \Device\Harddisk0\Partition1              300M
NTFS                             \Device\Harddisk0\Partition4           599472M
NTFS                             \Device\Harddisk0\Partition5          1000000M
ORCLDISKDATA0                    \Device\Harddisk1\Partition1          1048587M
ORCLDISKDATA1                    \Device\Harddisk2\Partition1          1048587M
ORCLDISKDATA2                    \Device\Harddisk3\Partition1          1048587M
ORCLDISKDATA3                    \Device\Harddisk4\Partition1          1048587M
ORCLDISKDATA4                    \Device\Harddisk6\Partition1           460797M

C:\Users\Administrator>crsctl start crs -excl -nocrs
CRS-4123: Oracle 高可用性服务已启动。
CRS-2672: 尝试启动 'ora.evmd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.mdnsd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.mdnsd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.evmd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.gpnpd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.gpnpd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.cssdmonitor' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.gipcd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.cssdmonitor' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.gipcd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.cssd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.cssd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.ctssd' (在 'xff2' 上)
CRS-2676: 成功启动 'ora.ctssd' (在 'xff2' 上)
CRS-2672: 尝试启动 'ora.asm' (在 'xff2' 上)
CRS-5017: 资源操作 "ora.asm start" 遇到以下错误:
ORA-00600: internal error code, arguments: [kfrValAcd30], [OCR_VOTE], [1], [14], [7556], [15], [7584], [], [], [], [], []
。有关详细信息, 请参阅 "(:CLSN00107:)" (位于 "F:\app\grid\Administrator\diag\crs\xff2\crs\trace\ohasd_oraagent_system.trc" 中)。
CRS-2674: 未能启动 'ora.asm' (在 'xff2' 上)
CRS-2679: 尝试清除 'ora.asm' (在 'xff2' 上)
CRS-2681: 成功清除 'ora.asm' (在 'xff2' 上)
CRS-2673: 尝试停止 'ora.ctssd' (在 'xff2' 上)
CRS-2677: 成功停止 'ora.ctssd' (在 'xff2' 上)
CRS-4000: 命令 Start 失败, 或已完成但出现错误。

C:\Users\Administrator>sqlplus / as sysasm

SQL*Plus: Release 12.1.0.2.0 Production on 星期四 5月 4 13:52:07 2023

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup nomount pfile='f:/pfile_asm.txt';
ASM 实例已启动

Total System Global Area 1140850688 bytes
Fixed Size                  3054680 bytes
Variable Size            1112630184 bytes
ASM Cache                  25165824 bytes

SQL>  create diskgroup OCR_VOTE  external redundancy disk '\\.\ORCLDISKDATA4' force  attribute 'COMPATIBLE.ASM' = '12.1.0';

Diskgroup created.

F:\>ocrconfig -restore backup00.ocr

F:\>crsctl replace votedisk +OCR_VOTE
已成功添加表决磁盘 e2b8fdbd05ae4f9fbf3531630853dbbc。
已成功将表决磁盘组替换为 +OCR_VOTE。
CRS-4266: 已成功替换表决文件

F:\>crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   e2b8fdbd05ae4f9fbf3531630853dbbc (\\.\ORCLDISKDATA4) [OCR_VOTE]
找到了 1 个表决磁盘。

F:\>ocrcheck
Oracle 集群注册表的状态如下:
         版本                  :          4
         总空间 (KB)     :     409568
         已用空间 (KB)      :       1348
         可用空间 (KB):     408220
         ID                       :  820087446
         设备/文件名         :  +OCR_VOTE
                                    设备/文件完整性检查成功

                                    设备/文件尚未配置

                                    设备/文件尚未配置

                                    设备/文件尚未配置

                                    设备/文件尚未配置

         集群注册表完整性检查成功

         逻辑损坏检查成功

mount其他磁盘组成功

SQL> alter diskgroup arch mount;

Diskgroup altered.

SQL>


SQL> alter diskgroup data mount;

Diskgroup altered.

尝试恢复数据库失败

C:\Users\Administrator>sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on 星期四 5月 4 14:09:39 2023

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup mount;
ORACLE 例程已经启动。

Total System Global Area 2.0992E+11 bytes
Fixed Size                  7797816 bytes
Variable Size            1.3798E+11 bytes
Database Buffers         7.1672E+10 bytes
Redo Buffers              260636672 bytes
数据库装载完毕。

SQL> recover database;
ORA-10562: Error occurred while applying redo to data block (file# 13, block#1033775)
ORA-10564: tablespace USERS
ORA-01110: 数据文件 13: '+DATA/XFF/users07.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 40396
ORA-00600: 内部错误代码, 参数: [kdolkr-2], [2], [1], [44], [], [], [], [], [],[], [], []


SQL> recover datafile 2;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 1 序列 60656 块 1150508 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: '+DATA/XFF/redo03.log'


SQL> recover datafile 1;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 1 序列 60656 块 1150508 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: '+DATA/XFF/redo03.log'

SQL> recover datafile 10;
ORA-00283: ??????????
ORA-10562: Error occurred while applying redo to data block (file# 10, block#
2899468)
ORA-10564: tablespace USERS
ORA-01110: ???? 10: '+DATA/XFF/users04.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 40396
ORA-00600: ??????, ??: [ktbair2: illegal  inheritance], [], [], [], [], [], [],[], [], [], [], []

除了ORA-00742,还有其他一些日志应用错误,比如:ORA-600 ktbair2: illegal inheritance,ORA-600 kdolkr-2等,无法正常应用日志,尝试强制打开库,报ORA-600 2662错误.

SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [8], [678024613], [8],
[678508930], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [8], [678024612], [8],
[678508930], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [8], [678024610], [8],
[678508930], [12583040], [], [], [], [], [], []
进程 ID: 4628
会话 ID: 996 序列号: 48547

通过自研的Patch_SCN工具快速解决该问题
20230507195805


open数据库成功,实现最大限度抢救客户数据.

Control file mount id mismatch!故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Control file mount id mismatch!故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

通过沟通确认客户由于存储双活异常,业务运行在主存储上,另外一套存储修复之后,进行存储双活同步,结果在这个过程中由于遭遇Control file mount id mismatch! 导致数据库crash了

2023-05-03T20:21:07.446873+08:00
Archived Log entry 491897 added for T-1.S-246903 ID 0x97d92f0b LAD:1
2023-05-03T20:47:53.902701+08:00
Error: 2141
Control file mount id mismatch!
fhmid: 2592441863, SGA mid: 2624617448
Requesting DIAG on each RAC instance to dump the control file header block
2023-05-03T20:47:55.906490+08:00
Errors in file /opt/rac/oracle/diag/rdbms/xff/xff1/trace/xff1_rms0_20989.trc:
2023-05-03T20:47:56.521500+08:00
RMS0 (ospid: 20989): terminating the instance
2023-05-03T20:47:56.610656+08:00
System state dump requested by (instance=1, osid=20989 (RMS0)), summary=[abnormal instance termination].
System State dumped to trace file /opt/rac/oracle/diag/rdbms/xff/xff1/trace/xff1_diag_20912_20230503204756.trc
2023-05-03T20:47:58.480397+08:00
License high water mark = 395
2023-05-03T20:48:02.600203+08:00
Instance terminated by RMS0, pid = 20989
2023-05-03T20:48:02.601563+08:00
Warning: 2 processes are still attach to shmid 393226:
 (size: 28672 bytes, creator pid: 19941, last attach/detach pid: 20912)
2023-05-03T20:48:03.481726+08:00
USER (ospid: 967): terminating the instance
2023-05-03T20:48:03.483351+08:00
Instance terminated by USER, pid = 967

节点自动重启报错ORA-600 kccsbck_first

2023-05-03T20:48:34.870435+08:00
NOTE: ASMB mounting group 2 (FRA)
NOTE: ASM background process initiating disk discovery for grp 2 (reqid:0)
NOTE: Assigning number (2,1) to disk (/dev/asm_data0g)
NOTE: Assigning number (2,0) to disk (/dev/asm_data0f)
SUCCESS: mounted group 2 (FRA)
NOTE: grp 2 disk 1: FRA_0001 path:/dev/asm_data0g
NOTE: grp 2 disk 0: FRA_0000 path:/dev/asm_data0f
2023-05-03T20:48:34.919965+08:00
NOTE: dependency between database xff and diskgroup resource ora.FRA.dg is established
2023-05-03T20:48:38.983416+08:00
Errors in file /opt/rac/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_2436.trc  (incident=1333249):
ORA-00600: ??????, ??: [kccsbck_first], [1], [2624617448], [], [], [], [], [], [], [], [], []
Incident details in: /opt/rac/oracle/diag/rdbms/xff/xff1/incident/incdir_1333249/xff1_ora_2436_i1333249.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE MOUNT /* db agent *//* {0:8:116} */...

再次重启数据库报错ORA-00742 ORA-00312

2023-05-04T08:18:59.635790+08:00
Aborting crash recovery due to error 742
2023-05-04T08:18:59.635897+08:00
Errors in file /opt/rac/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_80855.trc:
ORA-00742: ??????? 2 ?? 244996 ? 8262 ??????????
ORA-00312: ???? 7 ?? 2: '+FRA/xff/ONLINELOG/group_7.446.1059323695'
ORA-00312: ???? 7 ?? 2: '+DATA/xff/ONLINELOG/group_7.272.1059323695'
Abort recovery for domain 0, flags 4
2023-05-04T08:18:59.647994+08:00
Errors in file /opt/rac/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_80855.trc:
ORA-00742: ??????? 2 ?? 244996 ? 8262 ??????????
ORA-00312: ???? 7 ?? 2: '+FRA/xff/ONLINELOG/group_7.446.1059323695'
ORA-00312: ???? 7 ?? 2: '+DATA/xff/ONLINELOG/group_7.272.1059323695'
ORA-742 signalled during: ALTER DATABASE OPEN /* db agent *//* {2:37368:2} */...
2023-05-04T08:19:00.820708+08:00
License high water mark = 33
2023-05-04T08:19:00.820936+08:00
USER (ospid: 82788): terminating the instance
2023-05-04T08:19:01.827132+08:00
Instance terminated by USER, pid = 82788

明显数据库在启动的时候做实例恢复,发现redo写丢失,从而引起数据库无法正常open,对于此类故障,处理比较多
ORA-00742 ORA-00312 故障恢复-1
ORA-00742 ORA-00312故障恢复-2
ORA-00742: 日志读取在线程 %d 序列 %d 块 %d 中检测到写入丢失情况

ORA-600 kzrini:!uprofile处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kzrini:!uprofile处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友反馈,oracle expdp导出数据hang住,然后重启数据库,启动成功,但是执行导出的时候报ORA-600 kzrini:!uprofile错误
kzrini uprofile


因为有过大量类似故障的恢复case,所以第一反应这种故障很可能就是tab$被清空的故障,参见:警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703,对于这类故障,我们是国内第一个发现并且进行预警(在blog,qq/微信群,oracle技术大会等),但是还有很多朋友中招.今天这个发现故障之后没有重启,比较顺利,进行山删除插入即可,参考以前的处理:ORA-600 16703直接把orachk备份表插入到tab$恢复

数据库open报ORA-07445 kglsget错误处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:数据库open报ORA-07445 kglsget错误处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友反馈,数据库服务器断电之后,强制拉库无法open成功

Wed Apr 19 18:13:30 2023
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 35 KB redo, 20 data blocks need recovery
Started redo application at
 Thread 1: logseq 4, block 3
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
  Mem# 0: E:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Completed redo application of 0.02MB
Completed crash recovery at
 Thread 1: logseq 4, block 73, scn 157134992
 20 data blocks read, 20 data blocks written, 35 redo k-bytes read
Thread 1 advanced to log sequence 5 (thread open)
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: E:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Starting background process QMNC
Wed Apr 19 18:13:34 2023
QMNC started with pid=56, OS id=6380 
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x12] [PC:0x9068BE9, kglsget()+547]
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
dbgexProcessError()+193<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+1726
<-dbkePostKGE_kgsf()+75<-kgeade()+560<-kgerev()+125<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1869
<-sss_xcpt_EvalFilter()+174<-.1.6_8+59<-00007FFB13102316<-00007FFB1311398D<-00007FFB130D93A7
<-00007FFB13112B1A<-kglsget()+547<-qcdolci()+80<-qcdoloi()+167<-qcdlgcd()+940<-kkdlgcd()+129
<-PGOSF158_kkmfbtcn()+17<-qcsIsColInFro()+474<-qcsRslvColWithinQbc()+304<-qcsStrongColRslv()+632
<-qcsRslvName()+212<-qcsridn()+104<-qcsraic()+736<-qcspqbDescendents()+485<-qcspqb()+154<-kkmdrv()+200
<-opiSem()+2560<-opiDeferredSem()+565<-opitca()+391<-PGOSF525_kksFullTypeCheck()+27<-rpiswu2()+2757
<-kksLoadChild()+9357<-kxsGetRuntimeLock()+2320<-kksfbc()+15225<-kkspsc0()+2117<-kksParseCursor()+181
<-opiosq0()+2538<-opiall0()+6705<-opikpr()+699<-opiodr()+1662<-rpidrus()+862<-rpidru()+154<-rpiswu2()+2757
<-kprball()+1610<-qmCheckIfXdbInstalled()+1059<-qm_open_db()+11<-kscnfy()+778
Errors in file e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_6036.trc  (incident=113085):
ORA-07445: 出现异常错误: 核心转储 [kglsget()+547] [ACCESS_VIOLATION] [ADDR:0x12] [PC:0x9068BE9] [UNABLE_TO_READ] []
Incident details in: e:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_113085\orcl_ora_6036_i113085.trc
Wed Apr 19 18:13:36 2023
Trace dumping is performing id=[cdmp_20230419181336]
Wed Apr 19 18:13:39 2023
PMON (ospid: 6624): terminating the instance due to error 397
Instance terminated by PMON, pid = 6624

查看trace文件

*** 2023-04-21 17:29:42.077
*** SESSION ID:(898.359) 2023-04-21 17:29:42.077
*** CLIENT ID:() 2023-04-21 17:29:42.077
*** SERVICE NAME:(SYS$USERS) 2023-04-21 17:29:42.077
*** MODULE NAME:(sqlplus.exe) 2023-04-21 17:29:42.077
*** ACTION NAME:() 2023-04-21 17:29:42.077
 
Dump continued from file: e:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_9648.trc
ORA-07445: 出现异常错误: 核心转储 [kglsget()+547] [ACCESS_VIOLATION] [ADDR:0x12] [PC:0x9138BE9] [UNABLE_TO_READ] []

========= Dump for incident 444285 (ORA 7445 [kglsget()+547]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x12] [PC:0x9138BE9, kglsget()+547]

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
Process Id: 0x00000a8c  Thread Id : 0x000025b0    Time : Fri Apr 21 17:29:42 
Excp. Code: 0xc0000005  Excp. Type: ACCESS_VIO    Flags: 0x00000000

------------------- Registers ----------------------------
ip=0000000009138BE9 sp=00000000276FA470 rp=0000000A0BF280D0
r1=000000001089AFE0 r2=0000000000000000 r3=0000000000000101 
r4=00000009BDFE40B0 r5=00000000276FA470 r6=0000000A0BF280D0 r7=00000000000005C0 
r8=00000009B9DC4F48 r9=0000000A0BF280D0 r10=0000000000000200 r11=00000000276FA590 
r12=00000009ED85F4C0 r13=000000000CB3C070 r14=0000000000000001 r15=0000000000000001 
------------------- End of Registers ---------------------


*** 2023-04-21 17:29:42.077
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=gtf6tgc2ycgxx) -----
select count(*) from XDB.XDB$SCHEMA s where s.xmldata.schema_url = 'http://xmlns.oracle.com/xdb/XDBSchema.xsd'

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
kglsget()+547                 0000000000000000     009962700 00996270C 000000000
                                                   0276DCCA0
qcdolci()+80         CALL???  kglsget()+347        9C5F9E7D0 009138AE2 000000018
                                                   000000001
qcdoloi()+167        CALL???  qcdolci()            00000DC7E 008D52DEF 000000006
                                                   00CB5BEE8
qcdlgcd()+940        CALL???  qcdoloi()            000000000 000000000 000000000
                                                   000000000
kkdlgcd()+129        CALL???  qcdlgcd()            000000000 9B9DC6C78 000000000
                                                   01089AA20
__PGOSF158_kkmfbtcn  CALL???  kkdlgcd()            01089AFE0 00CB5BEE8 000000000
()+17                                              000000000
qcsIsColInFro()+474  CALL???  __PGOSF158_kkmfbtcn  000000001 9BDFE41E0 000000000
                              ()                   000000001
qcsRslvColWithinQbc  CALL???  qcsIsColInFro()      900000000 01089AFE0 0276FCFC0
()+304                                             742E204F1B71
qcsStrongColRslv()+  CALL???  qcsRslvColWithinQbc  000000010 27F00000000
632                           ()+73                000000000 000000000
qcsRslvName()+212    CALL???  qcsStrongColRslv()+  0276FA9E0 9C3F136C0 00764E4E0
                              516                  0093A505B
qcsridn()+104        CALL???  qcsRslvName()        101000000000000 000000000
                                                   000000000 0276FAE90
qcsraic()+736        CALL???  qcsridn()            0276FACB0 0276FACB0 000000000
                                                   000000001
qcspqbDescendents()  CALL???  qcsraic()            0276FAE90 01089AFE0 00CB5D030
+485                                               00CB5CEE8
qcspqb()+154         CALL???  qcspqbDescendents()  00000002A 008D53029 01089AFE0
                                                   000000000
kkmdrv()+200         CALL???  qcspqb()             000000029 00895C700 0276FAF90
                                                   00895C700
opiSem()+2560        CALL???  kkmdrv()             00CB31730 0108B7820 000000001
                                                   9C3F13730
opiDeferredSem()+56  CALL???  opiSem()             0276FCFC0 A09FA3D68 00000006E
5                                                  000000001
opitca()+391         CALL???  opiDeferredSem()     000000001 000000000 006F7AC18
                                                   000000000
__PGOSF525_kksFullT  CALL???  opitca()             037615F10 9C3F138D0 0277000A0
ypeCheck()+27                                      00895C700
rpiswu2()+2757       CALL???  __PGOSF525_kksFullT  0276FEB98 000004018 0108A04E0
                              ypeCheck()           0108A0700
kksLoadChild()+9357  CALL???  rpiswu2()            9F86A81C8 000000000 9C5F9D1E4
                                                   000000002
kxsGetRuntimeLock()  CALL???  kksLoadChild()       01089AFE0 A09FA39D8 0276FF400
+2320                                              A09FA39D8
kksfbc()+15225       CALL???  kxsGetRuntimeLock()  01089AFE0 037615F10 0276FF400
                                                   000000103
kkspsc0()+2117       CALL???  kksfbc()             037615F10 000000003 A00000108
                                                   0076D1C80
kksParseCursor()+18  CALL???  kkspsc0()            00CAD1DA8 0076D1C80 00000006F
1                                                  000000003
opiosq0()+2538       CALL???  kksParseCursor()     00CACF990 000000000 00895C700
                                                   027700940
opiall0()+6705       CALL???  opiosq0()            000000003 00000000E 0277010C0
                                                   000000020
opikpr()+699         CALL???  opiall0()            000000065 000000022 0277018C8
                                                   000000000
opiodr()+1662        CALL???  opikpr()             000000065 90000001C 027703208
                                                   01089AA20
rpidrus()+862        CALL???  opiodr()             000000065 00000001C 027703208
                                                   000000000
rpidru()+154         CALL???  rpidrus()            0277027F8 000000000 000000000
                                                   000000000
rpiswu2()+2757       CALL???  rpidru()             027703030 000000000 000000000
                                                   000000000
kprball()+1610       CALL???  rpiswu2()            9F86A81C8 000000000 027702E50
                                                   000000002
qmCheckIfXdbInstall  CALL???  kprball()            027703208 000000100 000000002
ed()+1059                                          00CAEC188
qm_open_db()+11      CALL???  qmCheckIfXdbInstall  000000000 000000000 000000000
                              ed()                 00895C700
kscnfy()+778         CALL???  qm_open_db()         657FC5CD00000019 000000000
                                                   000000000 100000000
adbdrv()+45489       CALL???  kscnfy()             000000019 000000000 000000000
                                                   000000000
opiexe()+20842       CALL???  adbdrv()             000000023 000000003 A00000102
                                                   000000000
opiosq0()+5129       CALL???  opiexe()+16981       000000004 000000000 02770A8C0
                                                   009121AB3
kpooprx()+357        CALL???  opiosq0()            000000003 00000000E 02770AB90
                                                   0000000A4
kpoal8()+940         CALL???  kpooprx()            01089AFE0 0090CC0A8 01089AFE0
                                                   000000001
opiodr()+1662        CALL???  kpoal8()             00000005E 00000001C 02770E100
                                                   00ABBF224
ttcpip()+1325        CALL???  opiodr()             00000005E 00000001C 02770E100
                                                   4100200000000000
opitsk()+2040        CALL???  ttcpip()             0108B4D00 000000000 000000000
                                                   000000000
opiino()+1258        CALL???  opitsk()             00000001E 000000000 000000000
                                                   02770F9F8
opiodr()+1662        CALL???  opiino()             00000003C 000000004 02770FAB0
                                                   000000000
opidrv()+864         CALL???  opiodr()             00000003C 000000004 02770FAB0
                                                   615C3A6500000000
sou2o()+98           CALL???  opidrv()+150         00000003C 000000004 02770FAB0
                                                   000000000
opimai_real()+158    CALL???  sou2o()              064425745 000000000 000000000
                                                   02770FBA4
opimai()+191         CALL???  opimai_real()        7FFA241388C0 7FFA24138A49
                                                   000000000 00895C700

*** 2023-04-21 17:29:42.343
OracleThreadStart()  CALL???  opimai()             000401452 000000002 0085DFE20
+724                                               0000025B0
00007FFA246713D2     CALL???  OracleThreadStart()  00FCAFF18 000000000 000000000
                                                   000000000
00007FFA26D403C4     CALL???  00007FFA246713B0     7FFA246713B0 000000000
                                                   000000000 000000000
 

--------------------- Binary Stack Dump ---------------------

确定报错是在select count(*) from XDB.XDB$SCHEMA s where s.xmldata.schema_url = ‘http://xmlns.oracle.com/xdb/XDBSchema.xsd’这个sql语句中,对其数据库启动过程进行跟踪,没有发现该语句而是只有报错

EXEC #6:c=0,e=181,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,plh=3755742892,tim=22655713465
FETCH #6:c=0,e=12,p=0,cr=3,cu=0,mis=0,r=1,dep=2,og=4,plh=3755742892,tim=22655713503
CLOSE #6:c=0,e=6,dep=2,type=3,tim=22655713532
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x12] [PC:0x9138BE9, kglsget()+547]
 dump file: e:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_452285\orcl_ora_9848_i452285.trc
ORA-07445: 出现异常错误: 核心转储 [kglsget()+547] [ACCESS_VIOLATION] [ADDR:0x12] [PC:0x9138BE9] [UNABLE_TO_READ] []

证明ORA-07445 kglsget错误的语句没有到sql执行阶段,而是在解析阶段就开始报错.通过对oracle启动过程进行处理(因为该语句不是数据库启动个必须的语句),规避掉该sql执行,实现数据库正常open
20230421223825