ORA-600 3417和ORA-600 3005故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 3417和ORA-600 3005故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

运行中的数据库突然报ORA-600 3417错误,lgwr进程异常数据库crash

Thu Nov 17 03:00:14 2022
Archived Log entry 23860 added for thread 2 sequence 1958 ID 0x6200e2f5 dest 1:
Thu Nov 17 03:13:11 2022
Auto-tuning: Shutting down background process GTX1
Thu Nov 17 04:00:02 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_1740.trc  (incident=672186):
ORA-00600: 内部错误代码, 参数: [3417], [2], [0], [0], [0], [1], [2], [], [], [], [], []
Thu Nov 17 04:00:04 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_1740.trc:
ORA-00600: 内部错误代码, 参数: [3417], [2], [0], [0], [0], [1], [2], [], [], [], [], []
LGWR (ospid: 1740): terminating the instance due to error 470

重启之后报ORA-600 3005错误,数据库启动失败

Thu Nov 17 04:03:09 2022
Successful mount of redo thread 2, with mount id 1648753015
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {0:1:38} */
ALTER DATABASE OPEN /* db agent *//* {0:1:38} */
This instance was first to open
Beginning crash recovery of 1 threads
 parallel recovery started with 31 processes
Thu Nov 17 04:03:14 2022
Started redo scan
ORA-00600: ??????, ??: [3005], [1], [706], [10374], [0], [0], [], [], [], [], [], []
Thu Nov 17 04:03:15 2022
Reconfiguration started (old inc 14, new inc 16)
List of instances:
 1 2 (myinst: 2) 
 Global Resource Directory frozen
Thu Nov 17 04:03:15 2022
 Communication channels reestablished
Thu Nov 17 04:03:16 2022
 * domain 0 valid = 0 according to instance 1 
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Thu Nov 17 04:03:16 2022
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Thu Nov 17 04:03:16 2022
Sweep [inc][688298]: completed
Sweep [inc2][688298]: completed
Thu Nov 17 04:03:16 2022
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Thu Nov 17 04:03:16 2022
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Abort recovery for domain 0
Aborting crash recovery due to error 600
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_ora_15352.trc:
ORA-00600: ??????, ??: [3005], [1], [706], [10374], [0], [0], [], [], [], [], [], []
Abort recovery for domain 0
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_ora_15352.trc:
ORA-00600: ??????, ??: [3005], [1], [706], [10374], [0], [0], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN /* db agent *//* {0:1:38} */...

尝试人工进行recover恢复库

SQL> recover database;
ORA-00279: 更改 310644203 (在 11/17/2022 01:00:05 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+DATA/orcl/archivelog/2022_11_17/thread_2_seq_1956.22763.1120960801
ORA-00280: 更改 310644203 (用于线程 2) 在序列 #1956 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: 更改 310663747 (在 11/17/2022 02:00:01 生成) 对于线程 2 是必需的
ORA-00289: 建议:
+DATA/orcl/archivelog/2022_11_17/thread_2_seq_1957.22764.1120962585
ORA-00280: 更改 310663747 (用于线程 2) 在序列 #1957 中


ORA-10877: error signaled in parallel recovery slave


ORA-01112: 未启动介质恢复

通过查看alert日志确认由于ORA-00353错误导致recover database失败

Thu Nov 17 08:07:39 2022
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 32 slaves
Thu Nov 17 08:07:41 2022
Recovery of Online Redo Log: Thread 1 Group 1 Seq 705 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_1.261.1116409583
ORA-279 signalled during: ALTER DATABASE RECOVER  database  ...
Thu Nov 17 08:08:07 2022
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +DATA/orcl/archivelog/2022_11_17/thread_2_seq_1956.22763.1120960801
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log +DATA/orcl/archivelog/2022_11_17/thread_2_seq_1957.22764.1120962585
Thu Nov 17 08:08:14 2022
Recovery of Online Redo Log: Thread 2 Group 4 Seq 1958 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_4.266.1116409589
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_pr00_7116.trc  (incident=704315):
ORA-00353: 日志损坏接近块 20866 更改 310761542 时间 11/17/2022 03:00:04
ORA-00312: 联机日志 1 线程 1: '+DATA/orcl/onlinelog/group_1.261.1116409583'
Thu Nov 17 08:08:26 2022
Sweep [inc][704315]: completed
Thu Nov 17 08:08:27 2022
Media Recovery failed with error 354
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_pr00_7116.trc:
ORA-00283: 恢复会话因错误而取消
ORA-00354: 损坏重做日志块标头
ORA-00353: 日志损坏接近块 20866 更改 310761542 时间 11/17/2022 03:00:04
ORA-00312: 联机日志 1 线程 1: '+DATA/orcl/onlinelog/group_1.261.1116409583'
Thu Nov 17 08:08:27 2022
ORA-10877 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...

通过对redo进行处理顺利recover成功并完美open库

SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

ORA-600 3417故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 3417故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库突然报ORA-600 3417错误

Mon Sep 26 06:42:51 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc  (incident=176185):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\incident\incdir_176185\orcl2_lgwr_7984_i176185.trc
Mon Sep 26 06:42:54 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7984.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7984): terminating the instance due to error 470

节点2异常之后,节点1由于跨节点实例恢复导致异常

Mon Sep 26 06:44:26 2022
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 887 KB redo, 348 data blocks need recovery
Started redo application at
 Thread 2: logseq 9907, block 1980
Recovery of Online Redo Log: Thread 2 Group 3 Seq 9907 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.265.1078882689
  Mem# 1: +OCR/orcl/onlinelog/group_3.259.1078882689
Completed redo application of 0.32MB
Completed instance recovery at
 Thread 2: logseq 9907, block 3755, scn 231951271
 338 data blocks read, 348 data blocks written, 887 redo k-bytes read
Mon Sep 26 06:44:36 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc  (incident=208205):
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
Mon Sep 26 06:44:38 2022
Reconfiguration started (old inc 14, new inc 16)
List of instances:
 1 2 (myinst: 1) 
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 26 06:44:38 2022
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 26 06:44:38 2022
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 26 06:44:38 2022
 LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info 
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
ORA-600 occurred during recovery, instance will be terminated
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_smon_7972.trc:
ORA-00600: 内部错误代码, 参数: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
System state dump requested by (instance=1,osid=7972 (SMON)),summary=[abnormal instance termination].
SMON (ospid: 7972): terminating the instance due to error 600
Mon Sep 26 06:44:43 2022
ORA-1092 : opitsk aborting process
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl1\trace\orcl1_diag_6956_20220926064442.trc:
ORA-00601: ??????
Mon Sep 26 06:44:46 2022
opiodr aborting process unknown ospid (6688) as a result of ORA-1092

再次重启实例无法正常启动报ORA-600 3417错误

Completed: ALTER DATABASE   MOUNT
Mon Sep 26 08:08:34 2022
alter database open
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Mon Sep 26 08:08:36 2022
LGWR: STARTING ARCH PROCESSES
Mon Sep 26 08:08:36 2022
ARC0 started with pid=39, OS id=5004 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Sep 26 08:08:37 2022
ARC1 started with pid=38, OS id=3568 
Mon Sep 26 08:08:37 2022
ARC2 started with pid=41, OS id=3308 
Mon Sep 26 08:08:37 2022
ARC3 started with pid=42, OS id=8180 
Mon Sep 26 08:08:37 2022
ARC4 started with pid=43, OS id=7768 
Mon Sep 26 08:08:37 2022
ARC5 started with pid=44, OS id=4628 
Mon Sep 26 08:08:37 2022
ARC6 started with pid=45, OS id=6920 
Mon Sep 26 08:08:37 2022
ARC7 started with pid=46, OS id=7960 
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc(incident=400186):
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
ARC7: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Sep 26 08:08:39 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl2\trace\orcl2_lgwr_7924.trc:
ORA-00600: ??????, ??: [3417], [3], [0], [1], [0], [3], [3], [], [], [], [], []
LGWR (ospid: 7924): terminating the instance due to error 470
Mon Sep 26 08:08:44 2022
ORA-1092 : opitsk aborting process

故障比较明显,数据库两个节点故障之后,启动其中一个节点,已经完成了实例恢复,但是无法open,通过10046进行跟踪发下你open过程没有执行任何语句直接导致win服务异常,然后终止
111


通过进一步分析确认是redo组异常

SQL> select group#,sequence#,status,thread# from v$log;

    GROUP#  SEQUENCE# STATUS              THREAD#
---------- ---------- ---------------- ----------
         1      10837 CURRENT                   1
         2          0 UNUSED                    1
         3       9907 CURRENT                   2
         4          0 UNUSED                    2

进一步查询数据文件是否正常

SQL> set pages 10000
SQL> set numw 16
SQL> SELECT status,
  2  checkpoint_change#,
  3  checkpoint_time,last_change#,
  4  count(*) ROW_NUM
  5  FROM v$datafile
  6  GROUP BY status, checkpoint_change#, checkpoint_time,last_change#
  7  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME         LAST_CHANGE#          ROW_NUM

------- ------------------ ------------------- ---------------- ----------------

ONLINE           231971314 2022-09-26 06:44:37        231971314               44

SYSTEM           231971314 2022-09-26 06:44:37        231971314                1



SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,chec
kpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh
24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#
 ROW_NUM
------- ---------------------------------------- --- ------------------ --------
--------
ONLINE  2022-09-26 06:44:37                      NO           231971314
      45

基于上述情况,对于这个库,可以直接open库,实现数据0丢失