ORA-00742 ORA-00312 故障恢复

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:ORA-00742 ORA-00312 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

12.1.0.1的由于硬件故障,恢复文件之后,导致redo写丢失,数据库数据库无法正常启动
报错ORA-00742 ORA-00312

Mon Feb 22 17:07:48 2021
alter database open
Mon Feb 22 17:07:48 2021
Beginning crash recovery of 1 threads
 parallel recovery started with 3 processes
Mon Feb 22 17:07:48 2021
Started redo scan
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:49 2021
Slave encountered ORA-10388 exception during crash recovery
Mon Feb 22 17:07:51 2021
Aborting crash recovery due to error 742
Mon Feb 22 17:07:51 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4624.trc:
ORA-00742: 日志读取在线程 1 序列 4035 块 44165 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO03.LOG'
Mon Feb 22 17:07:51 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4624.trc:
ORA-00742: 日志读取在线程 1 序列 4035 块 44165 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO03.LOG'
ORA-742 signalled during: alter database open...

通过屏蔽一致性,强制resetlogs方式打开库报ORA-600 2662错误

Mon Feb 22 17:27:38 2021
Checker run found 17 new persistent data failures
alter database open resetlogs 
Mon Feb 22 17:27:54 2021
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 102879654
Resetting resetlogs activation ID 762781739 (0x2d77202b)
Mon Feb 22 17:27:59 2021
Setting recovery target incarnation to 4
Mon Feb 22 17:28:00 2021
Assigning activation ID 895702933 (0x35635795)
Starting background process TMON
Mon Feb 22 17:28:00 2021
TMON started with pid=26, OS id=4204 
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\ORCL12C\REDO01.LOG
Successful open of redo thread 1
Mon Feb 22 17:28:00 2021
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Feb 22 17:28:00 2021
SMON: enabling cache recovery
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc  (incident=21657):
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\incident\incdir_21657\orcl12c_ora_4804_i21657.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Feb 22 17:28:06 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Mon Feb 22 17:28:06 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_4804.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [102879661], [0], [102879857], [20971648], [], [], [], [], [], []
Mon Feb 22 17:28:06 2021
Error 600 happened during db open, shutting down database
USER (ospid: 4804): terminating the instance due to error 600

由于scn相差的不大,重启几次后,该问题解决,后续数据库启动报ORA-600 4193

Mon Feb 22 19:53:11 2021
Database Characterset is ZHS16GBK
Starting background process SMCO
Mon Feb 22 19:53:11 2021
SMCO started with pid=28, OS id=3236 
Mon Feb 22 19:53:15 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_smon_4460.trc:
ORA-01595: 释放区 (2) 回退段 (1) 时出错
ORA-00600: 内部错误代码, 参数: [4193], [15352], [18655], [], [], [], [], [], [], [], [], []
Mon Feb 22 19:53:18 2021
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl12c\orcl12c\trace\orcl12c_ora_1356.trc:
ORA-00600: 内部错误代码, 参数: [4193], [15352], [18655], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 1356): terminating the instance due to error 600
Mon Feb 22 19:53:21 2021
Instance terminated by USER, pid = 1356
ORA-1092 signalled during: ALTER DATABASE OPEN...

处理异常undo之后,数据库启动正常,完成数据库恢复

Oracle 19c故障恢复

联系:手机/微信(+86 13429648788) QQ(107644445)QQ咨询惜分飞

标题:Oracle 19c故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户找到我们,他们的oracle 19c数据库由于异常断电,导致启动异常,经过一系列恢复之后,依旧无法解决问题,请求我们给予支持.通过我们的Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check),获取数据库当前信息如下:
数据库版本为19C并且安装了19.5.0.0.191015 (30125133)补丁
20200310220453
20200310220748


数据库使用pdb
20200310220610

数据库启动成功后,一会就crash掉

2020-03-10T01:44:41.018032+08:00
Pluggable database RACBAK opened read write
2020-03-10T01:44:41.018996+08:00
Pluggable database RAC opened read write
2020-03-10T01:44:51.244050+08:00
Completed: ALTER PLUGGABLE DATABASE ALL OPEN
Starting background process CJQ0
Completed: ALTER DATABASE OPEN
2020-03-10T01:44:51.317085+08:00
CJQ0 started with pid=224, OS id=32581 
2020-03-10T01:44:56.067043+08:00
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_j001_32588.trc  (incident=1095281) (PDBNAME=RAC):
ORA-00600: internal error code, arguments: [4193], [27733], [27754], [], [], [], [], [], [], [], [], []
RAC(4):Incident details in: /opt/oracle/diag/rdbms/XFF/XFF/incident/incdir_1095281/XFF_j001_32588_i1095281.trc
RAC(4):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2020-03-10T01:44:56.073112+08:00
RAC(4):*****************************************************************
RAC(4):An internal routine has requested a dump of selected redo.
RAC(4):This usually happens following a specific internal error, when
RAC(4):analysis of the redo logs will help Oracle Support with the
RAC(4):diagnosis.
RAC(4):It is recommended that you retain all the redo logs generated (by
RAC(4):all the instances) during the past 12 hours, in case additional
RAC(4):redo dumps are required to help with the diagnosis.
RAC(4):*****************************************************************
2020-03-10T01:44:56.079228+08:00
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_j002_32590.trc  (incident=1095289) (PDBNAME=RAC):
ORA-00600: internal error code, arguments: [4193], [2633], [2638], [], [], [], [], [], [], [], [], []
RAC(4):Incident details in: /opt/oracle/diag/rdbms/XFF/XFF/incident/incdir_1095289/XFF_j002_32590_i1095289.trc
RAC(4):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2020-03-10T01:44:56.085068+08:00
RAC(4):*****************************************************************
RAC(4):An internal routine has requested a dump of selected redo.
RAC(4):This usually happens following a specific internal error, when
RAC(4):analysis of the redo logs will help Oracle Support with the
RAC(4):diagnosis.
RAC(4):It is recommended that you retain all the redo logs generated (by
RAC(4):all the instances) during the past 12 hours, in case additional
RAC(4):redo dumps are required to help with the diagnosis.
RAC(4):*****************************************************************
2020-03-10T01:44:56.115765+08:00
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_j004_32594.trc  (incident=1095305) (PDBNAME=RAC):
ORA-00600: internal error code, arguments: [4193], [63532], [63537], [], [], [], [], [], [], [], [], []
RAC(4):Incident details in: /opt/oracle/diag/rdbms/XFF/XFF/incident/incdir_1095305/XFF_j004_32594_i1095305.trc
RAC(4):Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2020-03-10T01:46:48.202213+08:00
RAC(4):Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
RAC(4):  Mem# 0: /opt/oracle/oradata/XFF/redo02.log
RAC(4):Block recovery completed at rba 0.0.0, scn 0x0000000d3675e48e
RAC(4):DDE: Problem Key 'ORA 600 [4193]' was completely flood controlled (0x6)
Further messages for this problem key will be suppressed for up to 10 minutes
2020-03-10T01:46:48.384040+08:00
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_clmn_31741.trc:
ORA-00600: internal error code, arguments: [4193], [27733], [27754], [], [], [], [], [], [], [], [], []
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_clmn_31741.trc  (incident=1093505) (PDBNAME=CDB$ROOT):
ORA-501 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /opt/oracle/diag/rdbms/XFF/XFF/incident/incdir_1093505/XFF_clmn_31741_i1093505.trc
2020-03-10T01:46:49.264624+08:00
USER (ospid: 31741): terminating the instance due to ORA error 501
2020-03-10T01:46:49.280664+08:00
System state dump requested by (instance=1, osid=31741 (CLMN)), summary=[abnormal instance termination].
System State dumped to trace file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_diag_31759.trc
2020-03-10T01:46:53.156926+08:00
ORA-00501: CLMN process terminated with error
2020-03-10T01:46:53.157103+08:00
Errors in file /opt/oracle/diag/rdbms/XFF/XFF/trace/XFF_diag_31759.trc:
ORA-00501: CLMN process terminated with error
2020-03-10T01:46:53.157211+08:00
Dumping diagnostic data in directory=[cdmp_20200310014649], requested by (instance=1, osid=31741 (CLMN)), 
summary=[abnormal instance termination].

通过报错信息判断,数据库open之后(特别是pdb 4 open之后),开始报ORA-600 4193错误.然后由于CLMN进程异常,最后数据库crash.对于这类故障,因为使用的pdb,而且是由于pdb的undo异常导致数据库启动之后crash,可以通过对于pdb进行特殊处理,从而实现数据库启动之后不再crash.

ORA-600 [4193]

ORA-600 4193 解释说明

ERROR:              

  Format: ORA-600 [4193] [a] [b]

VERSIONS:           
  versions 6.0 to 12.1

DESCRIPTION:        

  A mismatch has been detected between Redo records and Rollback (Undo) 
  records.

  We are validating the Undo block sequence number in the undo block against 
  the Redo block sequence number relating to the change being applied.

  This error is reported when this validation fails.

ARGUMENTS:
  Arg [a] Undo record seq number
  Arg [b] Redo record seq number

FUNCTIONALITY:
  KERNEL TRANSACTION UNDO





ORA-600 [4193] [a] [b] [ ] [ ]  [ ]        
Versions: 7.2.2  - 9.2.0                              Source: ktuc.c
===========================================================================
Meaning: seq# mismatch while adding an undo record to an undo block. This 
         is done by the application of redo. 
---------------------------------------------------------------------------
Argument Description:

    a. (ktubhseq): undo record seq# - this is the seq# of the block that 
                                      this undo record WILL BE APPLIED TO. 
                                      This is from the Undo Block. It is 
                                      NOT the seq# of the undo block itself.
                                      
    b. (ktudbseq): redo RECORD seq# - this is the seq# number in the block 
                                      that this redo WILL BE APPLIED TO. 
                                      This is from the Redo Record. 

---------------------------------------------------------------------------
Diagnosis:

    This error is raised in kturdb which handles the adding of undo records 
    by the application of redo. 
    
    When we try to apply redo to an undo block (forward changes are made by 
    the application of redo to a block) we check that the seq# in the undo 
    record matches the seq# in the redo record. These seq# should be the 
    same because when we apply a redo record we must apply it to the 
    correct version of the block. We can only apply a redo record to a 
    block that contains the same seq# as in the redo record. 

    If the seq# do not match then this error is raised. This implies some 
    kind of block corruption in either the redo or the undo block. 

7.3.x - 8.1.7.x
ASSERT2(ubh->ktubhseq == db->ktudbseq, OERI(4193), KSESVSGN,
            ubh->ktubhseq, db->ktudbseq);
9.2.x
ksesic2(OERI(4193), ksenrg(ubh->ktubhseq), ksenrg(db->ktudbseq));

struct ktubh
{
  kxid  ktubhxid;      /* txid of tx currently using or last used this block */
  ub2   ktubhseq;                              /* undo block sequence number */
  ub1   ktubhcnt;    /* high water mark record index, number of undo entries */
  ub1   ktubhirb;  /* rollback record index, rec index to start the rollback */
  ub1   ktubhicl;  /* collecting record index, rec index to start retrieving col info */
  ub1   ktubhflg;                                                 /* dummy */
  ub2   ktubhidx[1];     /* byte offset of record in block, grows at runtime */
};

struct ktudb   Kernel Transaction Undo Data operation Block (redo)
{
  ub2    ktudbsiz;                                          /* size of entry */
  ub2    ktudbspc;                 /* verification: space left in undo block */
  ub2    ktudbflg;            /* flag to indicate the kind of redo operation */
  kxid   ktudbxid;                                          /* current tx id */
  ub2    ktudbseq;                                  /* block sequence number */
  ub1    ktudbrec;                       /* new record index for this change */
};

ORA 600 4193 处理方法同How to resolve ORA-600 [4194] errors