ORA-600 kfrValAcd30 恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kfrValAcd30 恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户由于存储控制器损坏,修复控制器之后,asm无法正常mount,报ORA-600 kfrValAcd30错误,让我们提供技术支持
kfrValAcd30


asm alert日志报错

Wed Apr 03 16:50:57 2019
SQL> alter diskgroup data mount 
NOTE: cache registered group DATA number=1 incarn=0x14248741
NOTE: cache began mount (first) of group DATA number=1 incarn=0x14248741
NOTE: Assigning number (1,0) to disk (ORCL:DATAVOL1)
Wed Apr 03 16:51:03 2019
NOTE: start heartbeating (grp 1)
kfdp_query(DATA): 15 
kfdp_queryBg(): 15 
NOTE: cache opening disk 0 of grp 1: DATAVOL1 label:DATAVOL1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 1/0x14248741 (DATA)
Wed Apr 03 16:51:04 2019
* allocate domain 1, invalid = TRUE 
Wed Apr 03 16:51:04 2019
NOTE: attached to recovery domain 1
NOTE: starting recovery of thread=1 ckpt=27.2697 group=1 (DATA)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_15951.trc  (incident=23394):
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM2/incident/incdir_23394/+ASM2_ora_15951_i23394.trc
Abort recovery for domain 1
NOTE: crash recovery signalled OER-600
ERROR: ORA-600 signalled during mount of diskgroup DATA
ORA-00600: internal error code, arguments: [kfrValAcd30], [DATA], [1], [27], [2697], [28], [2697], [], [], [], [], []
ERROR: alter diskgroup data mount
NOTE: cache dismounting (clean) group 1/0x14248741 (DATA) 
NOTE: lgwr not being msg'd to dismount
freeing rdom 1
Wed Apr 03 16:51:05 2019
Sweep [inc][23394]: completed
Sweep [inc2][23394]: completed
Wed Apr 03 16:51:05 2019
Trace dumping is performing id=[cdmp_20190403165105]
NOTE: detached from domain 1
NOTE: cache dismounted group 1/0x14248741 (DATA) 
NOTE: cache ending mount (fail) of group DATA number=1 incarn=0x14248741
kfdp_dismount(): 16 
kfdp_dismountBg(): 16 
NOTE: De-assigning number (1,0) from disk (ORCL:DATAVOL1)
ERROR: diskgroup DATA was not mounted
NOTE: cache deleting context for group DATA 1/337938241

mos相关记录
参考:ORA-600 [KFRVALACD30] in ASM (Doc ID 2123013.1)
kfrValAcd30-mos


ORA-00600: internal error code, arguments: [kfrValAcd30]可能是bug或者硬件故障导致.基于客户的情况,最大可能就是由于硬件故障导致asm 磁盘组的acd无法正常进行,从而无法mount成功.如果运气好,通过kfed相关修复可以正常mount成功,运气不好可以通过底层进行恢复数据文件,从而最大限度恢复数据.

ORA-600 kcffo_online_pdb_check: fno_system 和 ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 kcffo_online_pdb_check: fno_system 和 ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在做18c模拟故障测试中,经过自己一系列折腾,主要遭遇了ORA-600 kcffo_online_pdb_check: fno_system 和 ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt错误,这些都是pdb特有的,主要是由于一些bug引起,在非pdb环境中不太可能遇到.其实这也就是说明由于pdb机制的引入,使得后续的数据库异常恢复中会更加复杂.
18c数据库open ORA-00603 ORA-01092 ORA-00600报错

[oracle@ora11g tmp]$ ss

SQL*Plus: Release 18.0.0.0.0 - Production on Sat Apr 20 21:18:09 2019
Version 18.3.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.


Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
BANNER_FULL
--------------------------------------------------------------------------------
BANNER_LEGACY
--------------------------------------------------------------------------------
    CON_ID
----------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
         0

BANNER
--------------------------------------------------------------------------------
BANNER_FULL
--------------------------------------------------------------------------------
BANNER_LEGACY
--------------------------------------------------------------------------------
    CON_ID
----------

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [],
[], [], []
Process ID: 55775
Session ID: 135 Serial number: 49652

alert日志信息

Database Characterset is AL32UTF8
2019-04-20T18:20:54.256841+08:00
No Resource Manager plan active
2019-04-20T18:20:56.751241+08:00
replication_dependency_tracking turned off (no async multimaster replication found)
2019-04-20T18:20:57.862516+08:00
Starting background process AQPC
2019-04-20T18:20:58.341991+08:00
AQPC started with pid=45, OS id=55830 
2019-04-20T18:21:01.476252+08:00
PDB$SEED(2):Autotune of undo retention is turned on. 
2019-04-20T18:21:01.738732+08:00
Pdb PDB$SEED hit error 1157 during open read only (2) and will be closed.
2019-04-20T18:21:01.755310+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf'
PDB$SEED(2):JIT: pid 55775 requesting stop
PDB$SEED(2):Buffer Cache flush deferred for PDB 2
Could not open PDB$SEED error=1157
2019-04-20T18:21:01.887601+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf'
2019-04-20T18:21:03.385503+08:00
PDB1(3):Autotune of undo retention is turned on. 
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p000_55808.trc  (incident=66865) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:03.682428+08:00
PDB2(4):Autotune of undo retention is turned on. 
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66865/orcl18c_p000_55808_i66865.trc
2019-04-20T18:21:12.863880+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:12.879921+08:00
Pdb PDB1 hit error 600 during open read write (5) and will be closed.
2019-04-20T18:21:12.880506+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p000_55808.trc:
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
PDB1(3):JIT: pid 55808 requesting stop
2019-04-20T18:21:12.915407+08:00
Dumping diagnostic data in directory=[cdmp_20190420182112], requested by (instance=1, osid=55808 (P000)), summary=[incident=66865].
2019-04-20T18:21:12.989890+08:00
PDB1(3):Buffer Cache flush deferred for PDB 3
2019-04-20T18:21:13.004575+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p001_55810.trc  (incident=66873) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [4], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66873/orcl18c_p001_55810_i66873.trc
2019-04-20T18:21:17.218642+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:17.222057+08:00
Pdb PDB2 hit error 600 during open read write (5) and will be closed.
2019-04-20T18:21:17.222236+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_p001_55810.trc:
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [4], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:17.260941+08:00
Dumping diagnostic data in directory=[cdmp_20190420182117], requested by (instance=1, osid=55810 (P001)), summary=[incident=66873].
2019-04-20T18:21:17.262023+08:00
PDB2(4):JIT: pid 55810 requesting stop
PDB2(4):Buffer Cache flush deferred for PDB 4
2019-04-20T18:21:17.352939+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:17.483695+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66857) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcffo_online_pdb_check: fno_system], [3], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66857/orcl18c_ora_55775_i66857.trc
2019-04-20T18:21:22.612339+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2019-04-20T18:21:26.062635+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66858) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66858/orcl18c_ora_55775_i66858.trc
2019-04-20T18:21:26.506644+08:00
Dumping diagnostic data in directory=[cdmp_20190420182126], requested by (instance=1, osid=55775), summary=[incident=66857].
2019-04-20T18:21:30.119381+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2019-04-20T18:21:30.119505+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:30.119629+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
2019-04-20T18:21:30.119719+08:00
Error 600 happened during db open, shutting down database
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_55775.trc  (incident=66859) (PDBNAME=CDB$ROOT):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_66859/orcl18c_ora_55775_i66859.trc
2019-04-20T18:21:30.346811+08:00
Dumping diagnostic data in directory=[cdmp_20190420182130], requested by (instance=1, osid=55775), summary=[incident=66858].
2019-04-20T18:21:34.551418+08:00
opiodr aborting process unknown ospid (55775) as a result of ORA-603
2019-04-20T18:21:34.720885+08:00
ORA-603 : opitsk aborting process
License high water mark = 4
2019-04-20T18:21:34.754766+08:00
USER (ospid: 55775): terminating the instance due to ORA error 600
2019-04-20T18:21:35.839992+08:00
Instance terminated by USER, pid = 55775

alert日志提示文件不存在,实际上文件是存在的

[root@ora11g ~]# 
[root@ora11g ~]# ls -l /u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf
-rw-r-----. 1 oracle oinstall 283123712 4月  13 23:59 /u01/app/oracle/oradata/ORCL18C/pdbseed/system01.dbf

主要错误是ORA-600 kcffo_online_pdb_check: fno_system 查询mos发现主要是由于数据库bug导致,查询mos发现不少bug
KCFFO_ONLINE_PDB_CHECK FNO_SYSTEM]


官方解释,主要是由于kcffo_online_pdb函数执行异常导致

Function kcffo_online_pdb_check  Check if it ok to online the files in a 
pluggable database. An error is  signalled if it is not ok. This routine will 
grab a file enqueue for the relevant files and save it in the NULL-terminated 
array fenqsp. The caller must release these after kcffo_online_pdb() or if an 
error occurs.

通过人工修改文件状态,绕过该错误,cdb数据库open成功,但是pdb依旧无法正常open

SQL> alter database open;

Database altered.

SQL>  alter session set container=PDB1;

Session altered.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []

alert日志

PDB1(3):alter database open
PDB1(3):Autotune of undo retention is turned on. 
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_56178.trc  (incident=69185) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/incident/incdir_69185/orcl18c_ora_56178_i69185.trc
2019-04-20T18:31:41.761479+08:00
*****************************************************************
An internal routine has requested a dump of selected redo.
This usually happens following a specific internal error, when
analysis of the redo logs will help Oracle Support with the
diagnosis.
It is recommended that you retain all the redo logs generated (by
all the instances) during the past 12 hours, in case additional
redo dumps are required to help with the diagnosis.
*****************************************************************
2019-04-20T18:31:42.097465+08:00
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Pdb PDB1 hit error 600 during open read write (1) and will be closed.
2019-04-20T18:31:42.097847+08:00
Errors in file /u01/app/oracle/diag/rdbms/orcl18c/orcl18c/trace/orcl18c_ora_56178.trc:
ORA-00600: internal error code, arguments: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3], [1739494], [38655308813], [2], [], [], [], [], [], [], []
PDB1(3):JIT: pid 56178 requesting stop
2019-04-20T18:31:42.098808+08:00
Dumping diagnostic data in directory=[cdmp_20190420183142], requested by (instance=1, osid=56178), summary=[incident=69185].
2019-04-20T18:31:42.138818+08:00
PDB1(3):Buffer Cache flush deferred for PDB 3
PDB1(3):ORA-600 signalled during: alter database open...

主要错误是ORA-600 kcvfdb_pdb_set_clean_scn: cleanckpt,通过查询mos,依旧发现mos上有的主要可能的bug
kcvfdb_pdb_set_clean_scn cleanckpt


通过人工修改数据文件的checkpoint scn解决该问题,pdb open成功

tab$被恶意删除sys用户之外记录

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:tab$被恶意删除sys用户之外记录

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库open成功,但是alert日志报大量ORA-00600错误

Sun Apr 14 14:30:46 2019
SMCO started with pid=53, OS id=6761 
Completed: ALTER DATABASE OPEN /* db agent *//* {1:65047:2} */
Sun Apr 14 14:30:49 2019
Starting background process CJQ0
Sun Apr 14 14:30:49 2019
CJQ0 started with pid=54, OS id=6776 
Setting Resource Manager plan SCHEDULER[0x32DF]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sun Apr 14 14:30:52 2019
Starting background process VKRM
Sun Apr 14 14:30:52 2019
VKRM started with pid=37, OS id=6809 
Sun Apr 14 14:30:54 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_6811.trc  (incident=288633):
ORA-00600: internal error code, arguments: [kkpo_rcinfo_defstg:delseg], [84638], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288633/xifenfei1_j000_6811_i288633.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_6811.trc:
ORA-00600: internal error code, arguments: [kkpo_rcinfo_defstg:delseg], [84638], [], [], [], [], [], [], [], [], [], []
ORA-06512: at "APEX_030200.WWV_FLOW_MAIL", line 695
ORA-06512: at line 1
Sun Apr 14 14:30:57 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_7491.trc  (incident=288658):
ORA-00600: 内部错误代码, 参数: [16659], [kqldtu], [INS], [0], [206196], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288658/xifenfei1_j000_7491_i288658.trc
Sun Apr 14 14:34:10 2019
Dumping diagnostic data in directory=[cdmp_20190414143410], requested by (instance=1, osid=7491 (J000)), summary=[incident=288658].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_7491.trc:
ORA-00600: 内部错误代码, 参数: [16659], [kqldtu], [INS], [0], [206196], [], [], [], [], [], [], []
ORA-06512: 在 "WEBCSMS.P_YGERROR", line 3
Sun Apr 14 14:39:08 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j000_8515.trc  (incident=288593):
ORA-00600: 内部错误代码, 参数: [kdfReserveSingle_1], [0], [65280], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288593/xifenfei1_j000_8515_i288593.trc
ORA-06512: 在 line 1
Sun Apr 14 14:52:14 2019
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_j001_11062.trc  (incident=288705):
ORA-00600: internal error code, arguments: [16607], [0x3CFB04C90], [257], [9], [0x000000000], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288705/xifenfei1_j001_11062_i288705.trc
Errors in file /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_16712.trc  (incident=288597):
ORA-00600: 内部错误代码, 参数: [16607], [0x3C7CEA678], [1281], [9], [0x000000000], [], [], [], [], [], [], []
Incident details in: /oracle/oracle/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_288597/xifenfei1_ora_16712_i288597.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

报错比较多,客户还反馈登录数据库之后,发现所有的表都丢失。第一反应可能数据字典损坏了,然后让客户查看备库,现在dg的备库也一样表都丢失了,进一步确认字典可能异常,让客户提供system文件进行本地分析.发现DBMS_SUPPORT_DBMONITOR触发器调用DBMS_SUPPORT_DBMONITORP存储过程,和警告:互联网中有oracle介质被注入恶意程序导致—ORA-600 16703中的名称非常类似,但是有点不一样,以前的恶意脚本中都是被注入并且触发之后,数据库无法正常启动,这次数据库能够正常open成功.分析恶意脚本,确认原因
1
2
3
确实这次的恶意脚本是在2016年8月份被创建在库中,在600天之后重启被触发,而且是删除非sys的tab$中记录.知道了恶意脚本的源头,那恢复就比较容易,直接通过批量bbed程序对tab$反删除可以实现比较完美恢复.原则上这样的故障可以实现数据库完美恢复,原库继续使用.

ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

分享一次由于一个表异常导致数据库报类似:ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction故障的案例
一个10.2.0.4的数据库,正常运行的库突然出现如下错误

Sun Apr 07 11:07:12 2019
Thread 1 advanced to log sequence 602883 (LGWR switch)
  Current log# 3 seq# 602883 mem# 0: L:\ORADATA\XFF\REDO03.LOG
Sun Apr 07 11:10:38 2019
Thread 1 advanced to log sequence 602884 (LGWR switch)
  Current log# 1 seq# 602884 mem# 0: L:\ORADATA\XFF\REDO01.LOG
Sun Apr 07 11:11:56 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_22956.trc:
ORA-00600: 内部错误代码, 参数: [ktspgfb-1], [], [], [], [], [], [], []

Sun Apr 07 11:12:46 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_27408.trc:
ORA-00600: 内部错误代码, 参数: [kcbnew_3], [0], [1], [168354056], [], [], [], []

Sun Apr 07 11:13:57 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\udump\XFF_ora_6632.trc:
ORA-00600: 内部错误代码, 参数: [ktspgfb-1], [], [], [], [], [], [], []

过一段时间报,然后实例直接crash

Tue Apr 09 07:47:35 2019
ORACLE Instance XFF (pid = 18) - Error 600 encountered while recovering transaction (1, 1) on object 113718002.
Tue Apr 09 07:47:35 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_smon_12948.trc:
ORA-00600: internal error code, arguments: [kcbgcur_3], [168454497], [8], [4], [0], [], [], []

Tue Apr 09 07:55:23 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_pmon_22652.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:24 2019
PMON: terminating instance due to error 474
Tue Apr 09 07:55:24 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_lgwr_28608.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:34 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_psp0_12544.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:34 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_j000_5216.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:35 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_ckpt_28204.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:36 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_mman_9320.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:44 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_q002_24384.trc:
ORA-00474: SMON process terminated with error

Tue Apr 09 07:55:53 2019
Errors in file c:\oracle\product\10.2.0\admin\XFF\bdump\XFF_reco_24124.trc:
ORA-00474: SMON process terminated with error

根据以上报错,数据库crash的原因是由于undo异常导致,通过对undo进行重建,解决掉异常undo,但是业务运行之后,一样的问题又重现,最后通过分析确认是对象异常导致

SQL> create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI;
create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI
                                                           *
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcbz_check_objd_typ], [0], [0], [1], [], [], [], []

屏蔽相关block obj的check之后
SQL> create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI;
create table XFF.T_XIFENFEI_xff as select * from XFF.T_XIFENFEI
                                                           *
ERROR at line 1:
ORA-00600: internal error code, arguments: [ktspScanInit-l1], [], [], [], [],[], [], []

比较明显该表对象出现逻辑异常,通过基于rowid的方式对该表数据进行抽取

SQL> create table XFF.T_XIFENFEI_new
  2  as
  3  select * from XFF.T_XIFENFEI where 1=0;

Table created.

SQL> set serveroutput on
SQL> set concat off
SQL> DECLARE
  2   nrows number;
  3   rid rowid;
  4   dobj number;
  5   ROWSPERBLOCK number;
  6  BEGIN
  7   ROWSPERBLOCK:=1000;
  8   nrows:=0;
  9   select data_object_id  into dobj
 10   from dba_objects
 11   where owner = 'XFF'
 12   and object_name = 'T_XIFENFEI'
 13   ;
 14   for i in (select relative_fno, block_id, block_id+blocks-1 totblocks
 15             from dba_extents
 16             where owner = 'XFF'
 17               and segment_name = 'T_XIFENFEI'
 18            order by extent_id)
 19   loop
 20     for br in i.block_id..i.totblocks loop
 21      for j in 1..ROWSPERBLOCK loop
 22      begin
 23        rid := dbms_rowid.ROWID_CREATE(1,dobj,i.relative_fno, br , j-1);
 24        insert into XFF.T_XIFENFEI_NEW
 25        select /*+ ROWID(A) */ *
 26        from XFF.T_XIFENFEI A
 27        where rowid = rid;
 28        if sql%rowcount = 1 then nrows:=nrows+1; end if;
 29        if (mod(nrows,10000)=0) then commit; end if;
 30      exception when others then null;
 31      end;
 32      end loop;
 33    end loop;
 34   end loop;
 35   COMMIT;
 36   dbms_output.put_line('Total rows: '||to_char(nrows));
 37  END;
 38  /
Total rows: 227000

PL/SQL procedure successfully completed.

再次观察数据库恢复正常,也不再crash和报错,恢复完成

linux资源限制导致数据库异常

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:linux资源限制导致数据库异常

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一起由于liunx系统资源限制导致数据库无法启动案例分享
数据库启动报ORA-01157错

SQL> startup
ORACLE instance started.

Total System Global Area 3340451840 bytes
Fixed Size		    2217952 bytes
Variable Size		 1862273056 bytes
Database Buffers	 1459617792 bytes
Redo Buffers		   16343040 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'

该错误一般是由于文件丢失或者路径错误导致

alert日志显示

Sun Apr 07 20:57:03 2019
ALTER DATABASE OPEN
Sun Apr 07 20:57:03 2019
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_dbw0_2681.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'
ORA-27092: size of file exceeds file size limit of the process
Additional information: 262144
Additional information: 262145
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2802.trc:
ORA-01157: cannot identify/lock data file 5 - see DBWR trace file
ORA-01110: data file 5: '/home/oracle/oradata/XIFENFEI.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
Sun Apr 07 20:57:04 2019
Errors in file /dbdata/oracle/diag/rdbms/orcl/orcl/trace/orcl_m000_2804.trc  (incident=38578):
ORA-00600: internal error code, arguments: [kcidr_io_check_common_6], [10], 
     [/home/oracle/oradata/XIFENFEI.dbf], [8192], [2], [5], [], [], [], [], [], []
ORA-27092: size of file exceeds file size limit of the process

这里看到提示ORA-27092: size of file exceeds file size limit of the process
查看系统limit配置

[oracle@XFF ~]$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 640000
scheduling priority             (-e) 0
file size               (blocks, -f) 2097152
pending signals                 (-i) 128489
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

一般操作系统block size为1k,这里限制文件大小为2097152=(2G)
查看文件

[oracle@XFF ~]$ ls -l /home/oracle/oradata/XIFENFEI.dbf
-rw-r-----. 1 oracle oinstall 2147491840 Apr  7 19:04 /home/oracle/oradata/XIFENFEI.dbf

文件大小为2097160>2097152,导致异常

设置系统对文件大小限制2097152kb

[root@XFF ~]# ulimit -f 102400000
[root@XFF ~]# su - oracle
[oracle@XFF ~]$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) 640000
scheduling priority             (-e) 0
file size               (blocks, -f) 102400000
pending signals                 (-i) 128489
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

重启数据库,open成功

SQL> shutdown immediate;      
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 3340451840 bytes
Fixed Size		    2217952 bytes
Variable Size		 1862273056 bytes
Database Buffers	 1459617792 bytes
Redo Buffers		   16343040 bytes
Database mounted.
Database opened.