Oracle断电故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle断电故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

异常断电导致数据库异常恢复文件报ORA-00283 ORA-00742 ORA-00312

 
D:\check_db>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on 星期二 5月 31 00:38:42 2022

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover datafile 1;
ORA-00283: 恢复会话因错误而取消
ORA-00742: 日志读取在线程 %d 序列 %d 块 %d 中检测到写入丢失情况
ORA-00312: 联机日志 3 线程 1:
'D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG'

这个错误比较明显是由于异常断电引起的写丢失导致.而且这种故障在没有备份的情况下,没有什么好处理方法,只能屏蔽一致性强制拉库,尝试强制拉库报错如下

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
数据库装载完毕。
SQL> recover database until cancel;
ORA-00279: 更改 18755939194213 (在  生成) 对于线程 1 是必需的


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_3_HJ32KJD5_.LOG
ORA-00600: internal error code, arguments: [3020], [2], [78824], [8467432], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 78824, file
offset is 645726208 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: 'D:\ORADATA\ORCL\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 80834


ORA-01112: 未启动介质恢复


SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [krsi_al_hdr_update.15], [4294967295], [], [],[], [], [], [], [], [], [], []

ORA-600 krsi_al_hdr_update.15错误,主要是由于redo异常导致无法resetlogs成功,具体参考:Alter Database Open Resetlogs returns error ORA-00600: [krsi_al_hdr_update.15], (Doc ID 2026541.1)描述,处理这个问题之后,再次resetlogs库,报ORA-600 2662错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [4366], [4112122046],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [4366], [4112122045],
[4366], [4112228996], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [4366], [4112122040],
[4366], [4112228996], [12583040], [], [], [], [], [], []
进程 ID: 4644
会话 ID: 1701 序列号: 3

这个问题比较简单,通过修改scn即可绕过去,之后数据库open报ORA-600 4194等错误

SQL> alter database open ;
alter database open 
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [4194], [
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322982):
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322982\orcl_smon_5112_i322982.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
replication_dependency_tracking turned off (no async multimaster replication found)
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3340.trc  (incident=323030):
ORA-00600: 内部错误代码, 参数: [4194], [
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_323030\orcl_ora_3340_i323030.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:04 2022
Sweep [inc][322982]: completed
ORACLE Instance orcl (pid = 13) - Error 600 encountered while recovering transaction (10, 33).
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-00600: internal error code, arguments: [4137], [10.33.3070116], [0], [0], [], [], [], [], [], [], [], []
Checker run found 1 new persistent data failures
Tue May 31 09:05:05 2022
Sweep [inc][323030]: completed
Sweep [inc2][322982]: completed
Tue May 31 09:05:14 2022
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc  (incident=322983):
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [0], [], [], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_322983\orcl_smon_5112_i322983.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Tue May 31 09:05:14 2022
ORA-600 signalled during: alter database open...
Block recovery stopped at EOT rba 2.61.16
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Block recovery from logseq 2, block 60 to scn 18755939643393
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ONLINELOG\O1_MF_2_K9BSVC11_.LOG
Block recovery completed at rba 2.61.16, scn 4366.4112429058
Dumping diagnostic data in directory=[cdmp_2022053],requested by(instance=1,osid=5112(SMON)),summary=[incident=322983].
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_smon_5112.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00600: internal error code, arguments: [4193], [10.33.3070116], [3], [], [], [], [], [], [], [], [], []

对异常undo进行处理,数据库正常open成功

SQL> shutdown immediate;
ORA-00600: 内部错误代码, 参数: [4193], [


SQL> shutdown abort;
ORACLE 例程已经关闭。
SQL> startup mount
ORACLE 例程已经启动。

Total System Global Area 2.0310E+10 bytes
Fixed Size                  2290000 bytes
Variable Size            3690991280 bytes
Database Buffers         1.6576E+10 bytes
Redo Buffers               40837120 bytes
数据库装载完毕。
SQL> alter database open;

数据库已更改。

hcheck检测有一些字典不一致,建议客户逻辑导出,然后导入到新库中

HCheck Version 07MAY18 on 31-5月 -2022 09:12:22
----------------------------------------------
Catalog Version 11.2.0.4.0 (1102000400)
db_name: ORCL

                                   Catalog       Fixed
Procedure Name                     Version    Vs Release    Timestamp      Resul
t
------------------------------ ... ---------- -- ---------- -------------- -----
-
.- LobNotInObj                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingOIDOnObjCol          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- SourceNotInObj              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OversizedFiles              ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorDefaultStorage          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- PoorStorage                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- TabPartCountMismatch        ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- OrphanedTabComPart          ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingSum$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- MissingDir$                 ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- DuplicateDataobj            ... 1102000400 <=  *All Rel* 05/31 09:12:22 PASS
.- ObjSynMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjSeqMissing               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedUndo                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndex               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexPartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndexSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTable               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTablePartition      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedTableSubPartition   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- MissingPartCol              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedSeg$                ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- OrphanedIndPartObj#         ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0024: Orphaned Index Partition Obj# (no OBJ$) (Doc ID 1360935.1)
ORPHAN INDPART$: OBJ#=149167 BO#=6378 - no OBJ$ row
ORPHAN INDPART$: OBJ#=149168 BO#=6378 - no OBJ$ row

.- DuplicateBlockUse           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- FetUet                      ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- Uet0Check                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- SeglessUET                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadInd$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 FAIL

HCKE-0030: OBJ$ INDEX entry has no IND$ or INDPART$/INDSUBPART$ entry (Doc ID 13
60528.1)
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148278 SYS Name=WRH$_FILESTATXS
_PK PARTITION=WRH$_FILEST_1572571104_16462
OBJ$ INDEX PARTITION has no INDPART$ entry: Obj#=148920 SYS Name=WRH$_FILESTATXS
_PK PARTITION=WRH$_FILEST_1572571104_16678

.- BadTab$                     ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadIcolDepCnt               ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjIndDobj                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- TrgAfterUpgrade             ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjType0                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadOwner                    ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- StmtAuditOnCommit           ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadPublicObjects            ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadSegFreelist              ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadDepends                  ... 1102000400 <=  *All Rel* 05/31 09:12:23 WARN

HCKW-0016: Dependency$ p_timestamp mismatch for VALID objects (Doc ID 1361045.1)

[E] - P_OBJ#=6376 D_OBJ#=6765

.- CheckDual                   ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- ObjectNames                 ... 1102000400 <=  *All Rel* 05/31 09:12:23 PASS
.- BadCboHiLo                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ChkIotTs                    ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- NoSegmentIndex              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- BadNextObject               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DroppedROTS                 ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- FilBlkZero                  ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- DbmsSchemaCopy              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanedObjError            ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- ObjNotLob                   ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- MaxControlfSeq              ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- SegNotInDeferredStg         ... 1102000400 >  1102000000 05/31 09:12:24 PASS
.- SystemNotRfile1             ... 1102000400 >   902000000 05/31 09:12:24 PASS
.- DictOwnNonDefaultSYSTEM     ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- OrphanTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
.- ObjNotTrigger               ... 1102000400 <=  *All Rel* 05/31 09:12:24 PASS
---------------------------------------
31-5月 -2022 09:12:24  Elapsed: 2 secs
---------------------------------------
Found 4 potential problem(s) and 1 warning(s)
Contact Oracle Support with the output and trace file
to check if the above needs attention or not

PL/SQL 过程已成功完成。

commit_wait和commit_logging设置不当导致数据库无法正常启动

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:commit_wait和commit_logging设置不当导致数据库无法正常启动

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户数据库设置以下参数,突然掉电之后,数据库无法正常启动

  commit_wait              = "NOWAIT"
  commit_logging           = "BATCH"

数据库open报错

alter database open
Block change tracking file is current.
Beginning crash recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_28711.trc  (incident=8002955):
ORA-00353: log corruption near block 3 change 17372812227460 time 03/20/2022 00:11:51
ORA-00312: online log 12315  thread 1: '/media/oracle/redolog/redo05.log'
Aborting crash recovery due to error 399
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_28711.trc:
ORA-00399: corrupt change description in redo log
ORA-00353: log corruption near block 3 change 17372812227460 time 03/20/2022 00:11:51
ORA-00312: online log 12315 thread 1: '/media/oracle/redolog/redo05.log'
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_28711.trc:
ORA-00399: corrupt change description in redo log
ORA-00353: log corruption near block 3 change 17372812227460 time 03/20/2022 00:11:51
ORA-00312: online log 12315  thread 1: '/media/oracle/redolog/redo05.log'
ORA-399 signalled during: alter database open...

报错信息比较明显是由于redo损坏导致,尝试强制open库

Sun Mar 20 18:32:35 2022
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 17372812227456
Resetting resetlogs activation ID 1627598093 (0x61032d0d)
Sun Mar 20 18:34:08 2022
Setting recovery target incarnation to 2
Sun Mar 20 18:34:08 2022
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 17372812227459, threshold SCN value is 0
If you have not previously reported this warning on this database, 
please notify Oracle Support so that additional diagnosis can be performed.
Sun Mar 20 18:34:08 2022
Assigning activation ID 1627615603 (0x61037173)
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: /media/oracle/redolog/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Mar 20 18:34:08 2022
SMON: enabling cache recovery
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14369.trc  (incident=8003142):
ORA-00600: internal error code, arguments: [2662], [4044], [3964482439], [4044], [3964488833], [12669344], 
Incident details in: /media/oracle/diag/rdbms/orcl/orcl/incident/incdir_8003142/orcl_ora_14369_i8003142.trc
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14369.trc  (incident=8003143):
ORA-00353: log corruption near block 3 change 17372812227462 time 03/20/2022 18:34:10
ORA-00312: online log 2 thread 1: '/media/oracle/redolog/redo02.log'
ORA-00600: internal error code, arguments: [2662], [4044], [3964482439], [4044], [3964488833], [12669344], 
Incident details in: /media/oracle/diag/rdbms/orcl/orcl/incident/incdir_8003143/orcl_ora_14369_i8003143.trc
Errors in file /media/oracle/diag/rdbms/orcl/orcl/incident/incdir_8003142/orcl_ora_14369_i8003142.trc:
ORA-00399: corrupt change description in redo log
ORA-00353: log corruption near block 3 change 17372812227462 time 03/20/2022 18:34:10
ORA-00312: online log 2 thread 1: '/media/oracle/redolog/redo02.log'
ORA-00600: internal error code, arguments: [2662], [4044], [3964482439], [4044], [3964488833], [12669344], 
Errors in file /media/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14369.trc  (incident=8003144):
ORA-00353: log corruption near block 3 change 17372812227462 time 03/20/2022 18:34:10
ORA-00334: archived log: '/media/oracle/redolog/redo02.log'
ORA-00600: internal error code, arguments: [2662], [4044], [3964482439], [4044], [3964488833], [12669344], 
Incident details in: /media/oracle/diag/rdbms/orcl/orcl/incident/incdir_8003144/orcl_ora_14369_i8003144.trc
Sun Mar 20 18:34:11 2022
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [4044], [3964482444], [4044], [3964488833], [12669344], 
ORA-00600: internal error code, arguments: [2662], [4044], [3964482443], [4044], [3964488833], [12669344], 
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [4044], [3964482439], [4044], [3964488833], [12669344], 

强制open数据库报ora-600 2662错误,比较常见,通过修改scn再尝试open库

Sun Mar 20 18:38:43 2022
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 17372812227460
Resetting resetlogs activation ID 1627615603 (0x61037173)
Sun Mar 20 18:40:02 2022
Setting recovery target incarnation to 2
Sun Mar 20 18:40:02 2022
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 17372812227463, threshold SCN value is 0
If you have not previously reported this warning on this database,
 please notify Oracle Support so that additional diagnosis can be performed.
Sun Mar 20 18:40:02 2022
Assigning activation ID 1627669665 (0x610444a1)
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: /media/oracle/redolog/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Mar 20 18:40:02 2022
SMON: enabling cache recovery
Undo initialization finished serial:0 start:779809538 end:779809788 diff:250 (2 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is AL32UTF8
No Resource Manager plan active
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
If files cannot be cataloged, then manually delete them
using OS command.
One of the following events caused this:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
   then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sun Mar 20 18:40:04 2022
QMNC started with pid=55, OS id=16232 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Sun Mar 20 18:40:05 2022
db_recovery_file_dest_size of 3882 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Sun Mar 20 18:40:05 2022
Starting background process CJQ0
Sun Mar 20 18:40:05 2022
CJQ0 started with pid=58, OS id=16251 
Completed: alter database open resetlogs

后续增加temp,导出数据到新库,恢复完成

ora-600 2662和ora-600 kclchkblk_4恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ora-600 2662和ora-600 kclchkblk_4恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

这两天连续处理两个case,一个是12.1.0.2版本数据库屏蔽一致性,强制open之后,报ORA-600 2662故障
20210429220218


这个错误本身是一个非常常见的错误,直接推scn即可解决,但是问题是12.1.0.2版本,oracle不允许以前常规的操作方法,就连oradebug都报错oradebug poke ORA-32521/ORA-32519故障解决,而且可以是rac环境,bbed修改文件头也相当麻烦,最后我们使用patch方法轻松解决

另外一例是11.2.0.4版本,强制open库报ora-600 kclchkblk_4

Wed Apr 28 21:25:38 2021
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc  (incident=564430):
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Incident details in: /u01/app/oracle/diag/rdbms/dc/dc1/incident/incdir_564430/dc1_ora_27832_i564430.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Errors in file /u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_27832.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [kclchkblk_4], [2959], [904341694], [2959], [904131717], []
Error 704 happened during db open, shutting down database
USER (ospid: 27832): terminating the instance due to error 704
Instance terminated by USER, pid = 27832
ORA-1092 signalled during: alter database open resetlogs...

这个比较简单,参考redo异常 ORA-600 kclchkblk_4 故障恢复.

raid强制上线后数据库无法启动故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:raid强制上线后数据库无法启动故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于raid掉盘过多,强制raid上线,然后启动数据库报以下错误

Mon Apr 19 23:19:28 2021
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 106750 KB redo, 9080 data blocks need recovery
Mon Apr 19 23:19:45 2021
Slave exiting with ORA-1115 exception
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_p000_3277.trc:
ORA-01115: IO error reading block from file 9 (block # 339)
ORA-01110: data file 9: '/u01/app/oracle/oradata/orcl/dev02.dbf'
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I/O error
Additional information: 4
Additional information: 326
Additional information: 24576
ORA-27072: File I
Mon Apr 19 23:19:45 2021
Aborting crash recovery due to slave death, attempting serial crash recovery
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 106750 KB redo, 9080 data blocks need recovery
Aborting crash recovery due to error 1115
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3275.trc:
ORA-01115: IO error reading block from file 9 (block # 329)
ORA-01110: data file 9: '/u01/app/oracle/oradata/orcl/dev02.dbf'
ORA-1115 signalled during: ALTER DATABASE OPEN...

错误提示比较明显IO error,结合客户强行上线raid的操作,比较明显是由于底层io问题导致该错误,直接对此文件dbv检查

[oracle@database orcl]$ dbv file=dev02.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Mon Apr 19 23:59:03 2021

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/app/oracle/oradata/orcl/dev02.dbf

DBV-00600: Fatal Error - [28] [27061] [0] [0]

对于此类情况,通过工具进行处理

DUL> copy file from  /u01/app/oracle/oradata/orcl/dev02.dbf to /oradata/dev02.dbf
 
starting copy datafile '/u01/app/oracle/oradata/orcl/dev02.dbf' to '/oradata/dev02.dbf'
read data error from file '/u01/app/oracle/oradata/orcl/dev02.dbf'.error message:Input/output error
read block# error: 303
read data error from file '/u01/app/oracle/oradata/orcl/dev02.dbf'.error message:Input/output error
read block# error: 304
read data error from file '/u01/app/oracle/oradata/orcl/dev02.dbf'.error message:Input/output error
read block# error: 329
datafile copy completed with 2 block error.

dbv校验文件

[oracle@database oradata]$ dbv file=dev02.dbf 

DBVERIFY: Release 11.2.0.4.0 - Production on Tue Apr 20 00:28:31 2021

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata/dev02.dbf
Page 303 is marked corrupt
Corrupt block relative dba: 0x0240012f (file 9, block 303)
Completely zero block found during dbv: 

Page 304 is marked corrupt
Corrupt block relative dba: 0x02400130 (file 9, block 304)
Completely zero block found during dbv: 

Page 329 is marked corrupt
Corrupt block relative dba: 0x02400149 (file 9, block 329)
Completely zero block found during dbv: 



DBVERIFY - Verification complete

Total Pages Examined         : 3932160
Total Pages Processed (Data) : 3213723
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 714294
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 4139
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 3
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 85078875 (6.85078875)

通过对io error的文件进行处理,最终损坏三个block,最大限度抢救数据.使用被恢复出来的文件,尝试open库遭遇以下错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [6], [85035771], [6],
[85084136], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [6], [85035770], [6],
[85084136], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [6], [85035764], [6],
[85084136], [12583040], [], [], [], [], [], []
Process ID: 6733
Session ID: 570 Serial number: 3

ora-600 2662这个错误比较明显,处理文件头scn,继续open库

SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 6840
Session ID: 570 Serial number: 3

查看alert日志信息

Tue Apr 20 01:22:27 2021
alter database open upgrade
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Started redo scan
Completed redo scan
 read 1 KB redo, 3 data blocks need recovery
Started redo application at
 Thread 1: logseq 1, block 3
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/orcl/redo01.log
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 1, block 5, scn 25854859541
 3 data blocks read, 3 data blocks written, 1 redo k-bytes read
Tue Apr 20 01:22:28 2021
Thread 1 advanced to log sequence 2 (thread open)
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: /u01/app/oracle/oradata/orcl/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Apr 20 01:22:28 2021
SMON: enabling cache recovery
[6840] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:5902014 end:5905574 diff:3560 (35 seconds)
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_6824.trc  (incident=63970):
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_63970/orcl_smon_6824_i63970.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance orcl (pid = 14) - Error 600 encountered while recovering transaction (24, 2) on object 89023.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_6824.trc:
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Tue Apr 20 01:22:38 2021
ORACLE Instance orcl (pid = 14) - Error 600 encountered while recovering transaction (63, 3) on object 89023.
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_6824.trc:
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_smon_6824.trc  (incident=63974):
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl/incident/incdir_63974/orcl_smon_6824_i63974.trc
Tue Apr 20 01:22:55 2021
PMON (ospid: 6798): terminating the instance due to error 474

这个错误是比较常见的错误,参考:ORACLE Instance XFF (pid = 18) – Error 600 encountered while recovering transaction ,通过处理之后,数据库open成功

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 1603411968 bytes
Fixed Size                  2253664 bytes
Variable Size            1023413408 bytes
Database Buffers          570425344 bytes
Redo Buffers                7319552 bytes
Database mounted.
SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

后续安排逻辑导出,导入新库

硬件恢复之后,数据库无法open故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:硬件恢复之后,数据库无法open故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于硬件故障,客户恢复硬件之后,数据库无法正常启动,报ORA-00354 ORA-00353错误

/tmp/> sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Mar 1 17:10:30 2021

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> recover database;

ORA-00283: recovery session canceled due to errors
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 86088 change 16135545783340 time
02/23/2021 13:53:24
ORA-00312: online log 2 thread 1: '/oradata02/redo02b.log'
ORA-00312: online log 2 thread 1: '/oradata01/redo02a.log'

由于redo损坏,数据库无法继续正常恢复,通过屏蔽一致性,force open库

SQL> alter database open resetlogs;
alter database open resetlogs 
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [3756], [3571444619], [3756], [3648471803], [4194545]
Process ID: 5104
Session ID: 576 Serial number: 3

这个错误比较简单,是由于scn问题导致,修改数据库scn启动库

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], []
Process ID: 5536
Session ID: 576 Serial number: 1

这个错误比较明显,修改回滚段,尝试启动库

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 6033
Session ID: 576 Serial number: 3

数据库依旧无法正常open,alert日志报错如下

ARC3 started with pid=30, OS id=6078 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
Exception[type:SIGSEGV Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x0,count:1]
Exception[type:SIGSEGV,Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x2,count:2]
Exception[type:SIGSEGV,Address not mapped to object][ADDR:0x60173487F5][PC:0xC003B1C20,_memcpy()+64][flags:0x2,count:2]
Archived Log entry 2 added for thread 1 sequence 2 ID 0x506cafbb dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Mon Mar 01 17:44:44 2021
PMON (ospid: 5993): terminating the instance due to error 397
Mon Mar 01 17:44:45 2021
System state dump requested by (instance=1, osid=5993 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/diag/rdbms/xff/xff/trace/xff_diag_6001.trc
Instance terminated by PMON, pid = 5993

通过其启动过程分析,发现数据库卡在如下对象:

PARSING IN CURSOR #11529215044940435280 len=148 dep=1 uid=0 oct=6 lid=0 tim=223080942765 
hv=3540833987 ad='c000000d67a42778' sqlid='5ansr7r9htpq3'
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
END OF STMT
PARSE #11529215044940435280:c=10000,e=8182,p=6,cr=55,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=223080942764
BINDS #11529215044940435280:
 Bind#0
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=873 siz=32 off=0
  kxsbbbfp=c000000d5fd299aa  bln=32  avl=20  flg=09
  value="_SYSSMU1_3935275865$"
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e078  bln=24  avl=02  flg=05
  value=3
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e048  bln=24  avl=03  flg=05
  value=128
 Bind#3
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e010  bln=24  avl=02  flg=05
  value=5
 Bind#4
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dfe0  bln=24  avl=02  flg=05
  value=1
 Bind#5
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dfb0  bln=24  avl=04  flg=05
  value=28921
 Bind#6
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6df80  bln=24  avl=05  flg=05
  value=1245262
 Bind#7
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6df48  bln=24  avl=06  flg=05
  value=1217986655
 Bind#8
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc90  bln=24  avl=03  flg=05
  value=3621
 Bind#9
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc60  bln=24  avl=01  flg=05
  value=0
 Bind#10
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc30  bln=24  avl=02  flg=05
  value=2
 Bind#11
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6dc00  bln=24  avl=02  flg=05
  value=2
 Bind#12
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=9fffffffbcc6e0a8  bln=22  avl=02  flg=05
  value=1
WAIT #11529215044940435280: nam='db file sequential read' ela= 21 file#=1 block#=530 blocks=1 obj#=0 tim=223080944352
Incident 528204 created, dump file: /oracle/diag/rdbms/xff/xff/incident/incdir_528204/xff_ora_6593_i528204.trc
ORA-00600: internal error code, arguments: [4193], [], [], [], [], [], [], [], [], [], [], []

至此基本上可以确认是由于出现回滚段异常,继续查看日志发现

Error 600 in redo application callback
Dump of change vector:
TYP:0 CLS:16 AFN:1 DBA:0x00400212 OBJ:4294967295 SCN:0x0ea6.f4f2da14 SEQ:1 OP:5.1 ENC:0 RBL:0
ktudb redo: siz: 320 spc: 5892 flg: 0x0012 seq: 0x0072 rec: 0x08
            xid:  0x0000.004.000000bc  
ktubl redo: slt: 4 rci: 0 opc: 11.1 [objn: 15 objd: 15 tsn: 0]
Undo type:  Regular undo        Begin trans    Last buffer split:  No 
Temp Object:  No 
Tablespace Undo:  No 
             0x00000000  prev ctl uba: 0x00400212.0072.07 
prev ctl max cmt scn:  0x0eac.d42963be  prev tx cmt scn:  0x0eac.d4296f48 
txn start scn:  0xffff.ffffffff  logon user: 0  prev brb: 4194446  prev bcl: 0 BuExt idx: 0 flg2: 0
KDO undo record:
KTB Redo 
op: 0x04  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0000.060.000000bb uba: 0x00400212.0072.04
                      flg: C---    lkc:  0     scn: 0x0eac.d9736b46
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x004000e1  hdba: 0x004000e0
itli: 4  ispac: 0  maxfr: 4863
tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [20]  5f 53 59 53 53 4d 55 31 5f 33 39 33 35 32 37 35 38 36 35 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 04
col  4: [ 3]  c2 02 1d
col  5: [ 6]  c5 0d 12 63 43 38
col  6: [ 3]  c2 25 16
col  7: [ 5]  c4 02 19 35 3f
col  8: [ 4]  c3 03 5a 16
col  9: [ 1]  80
col 10: [ 2]  c1 04
col 11: [ 2]  c1 03
col 16: [ 2]  c1 03
Block after image is corrupt: 
buffer tsn: 0 rdba: 0x00400212 (1/530)
scn: 0x0ea6.f4f2da14 seq: 0x01 flg: 0x04 tail: 0xda140201
frmt: 0x02 chkval: 0x9dd8 type: 0x02=KTU UNDO BLOCK

使用bbed对file 1 block 530进行处理

   struct ktuxcscn, 8 bytes                 @4148    
      ub4 kscnbas                           @4148     0xd42963be
      ub2 kscnwrp                           @4152     0x0eac
   struct ktuxcuba, 8 bytes                 @4156    
      ub4 kubadba                           @4156     0x00400212
      ub2 kubaseq                           @4160     0x0072
      ub1 kubarec                           @4162     0x07
   sb2 ktuxcflg                             @4164     1 (KTUXCFSK)
   ub2 ktuxcseq                             @4166     0x0072
   sb2 ktuxcnfb                             @4168     1
   ub4 ktuxcinc                             @4172     0x00000000
   sb2 ktuxcchd                             @4176     4
   sb2 ktuxcctl                             @4178     3
   ub2 ktuxcmgc                             @4180     0x8002
   ub4 ktuxcopt                             @4188     0x7ffffffe

数据库顺利open成功
20210301210425


后续建议客户逻辑导出数据,导入到新库