断电引起redo和数据文件不一致故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:断电引起redo和数据文件不一致故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有些时候故障总是来的让人非常意外,这个在准备停机迁移数据库之前的几分钟由于某种原因直接导致主机掉电,再次开机数据库无法启动

Sat Aug 03 23:10:37 2024
Successful mount of redo thread 1, with mount id 3696805928
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount
Sat Aug 03 23:10:43 2024
alter database open
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_ora_6808.trc:
ORA-01113: 文件 21 需要介质恢复
ORA-01110: 数据文件 21: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\XIFENFEI.DBF'
ORA-1113 signalled during: alter database open...

尝试数据库恢复各种报错ORA-600 kdourp_inorder2,ORA-600 3020,ORA-7445 kdxlin等

ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_1159998_MBW605HP_.ARC
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT  
Media Recovery Log D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_1159999_MBW63QBY_.ARC
Sat Aug 03 23:22:10 2024
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xC] [PC:0x14306B54A, kdxlin()+4432]
Sat Aug 03 23:22:10 2024
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xC] [PC:0x14306B54A, kdxlin()+4432]
Sat Aug 03 23:22:10 2024
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr25_7740.trc  (incident=132557):
ORA-00600: internal error code, arguments: [kdourp_inorder2], [4], [22], [44], [44], [], [], [], [], [], [], []
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_132557\xff_pr25_7740_i132557.trc
ERROR: Unable to normalize symbol name for the following short stack (at offset 213):
dbgexProcessError()+200<-dbgeExecuteForError()+65<-dbgePostErrorKGE()+2269<-dbkePostKGE_kgsf()+77<-kgeade()+562
<-kgerelv()+151<-kgerev()+45<-kgerec5()+60<-sss_xcpt_EvalFilterEx()+1862<-sss_xcpt_EvalFilter()+174
<-.1.4_5+59<-00007FFCB5E2C92F<-00007FFCB5E3D82D<-00007FFCB5DE916B<-00007FFCB5E3C9EE<-kdxlin()+4432
<-kco_issue_callback()+196<-kcoapl()+746<-kcbr_apply_change()+6156<-kcbr_mapply_change()+1162
<-kcbrapply()+2297<-kcbr_apply_pending()+2931<-krp_slave_apply()+1155<-krp_slave_main()+4010<-ksvrdp()+2580
<-opirip()+904<-opidrv()+906<-sou2o()+98<-opimai_real()+280<-opimai()+191<-BackgroundThreadStart()+646
<-00007FFCB562168D<-00007FFCB5E14629
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr1w_6472.trc  (incident=132485):
ORA-07445: exception encountered: core dump [kdxlin()+4432] [ACCESS_VIOLATION] [ADDR:0xC] [PC:0x14306B54A]
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_132485\xff_pr1w_6472_i132485.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr2o_7472.trc  (incident=132709):
ORA-07445: exception encountered: core dump [kdxlin()+4432] [ACCESS_VIOLATION] [ADDR:0xC] [PC:0x14306B54A] 
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\incident\incdir_132709\xff_pr2o_7472_i132709.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sat Aug 03 23:22:11 2024
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr18_7812.trc  (incident=132301):
ORA-00600: internal error code, arguments: [3020], [62], [517633], [260564481], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 62, block# 517633, file offset is 4240449536 bytes)
ORA-10564: tablespace HSEMR_TAB
ORA-01110: data file 62: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\EMR006.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Sat Aug 03 23:22:56 2024
Slave exiting with ORA-10562 exception
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr25_7740.trc:
ORA-10562: Error occurred while applying redo to data block (file# 64, block# 508263)
ORA-10564: tablespace HSEMR_TAB
ORA-01110: data file 64: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\HSEMR_TAB008.DBF'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 467202
ORA-00600: internal error code, arguments: [kdourp_inorder2], [4], [22], [44], [44], [], [], []
Sat Aug 03 23:22:56 2024
Slave exiting with ORA-10562 exception
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr1w_6472.trc:
ORA-10562: Error occurred while applying redo to data block (file# 65, block# 498512)
ORA-10564: tablespace HSEMR_TAB
ORA-01110: data file 65: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\EMR009.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 467200
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4432] [ACCESS_VIOLATION] [ADDR:0xC] 
Sat Aug 03 23:22:57 2024
Media Recovery failed with error 448
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\xff\xff\trace\xff_pr00_6732.trc:
ORA-00283: recovery session canceled due to errors
ORA-00448: normal completion of background process
Sat Aug 03 23:22:57 2024
ORA-600 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL 
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...

通过分析确认有部分数据文件和redo信息不匹配,导致无法正常recover成功

SQL> recover datafile 77;
完成介质恢复。
SQL> recover datafile 78;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [78], [473221], [327628933], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 78, block# 473221, file
offset is 3876626432 bytes)
ORA-10564: tablespace HSEMR_TAB
ORA-01110: 数据文件 78: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\HIS23.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

SQL> recover datafile 66;
ORA-00279: 更改 6029114092 (在 08/03/2024 19:44:05 生成) 对于线程 1 是必需的
ORA-00289: 建议:
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_115999

9_MBW63QBY_.ARC
ORA-00280: 更改 6029114092 (用于线程 1) 在序列 #1159999 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
已应用的日志。
完成介质恢复。
SQL> recover datafile 65;
ORA-00279: 更改 6029114092 (在 08/03/2024 19:44:05 生成) 对于线程 1 是必需的
ORA-00289: 建议:
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_115999

9_MBW63QBY_.ARC
ORA-00280: 更改 6029114092 (用于线程 1) 在序列 #1159999 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00283: 恢复会话因错误而取消
ORA-10562: Error occurred while applying redo to data block (file# 65, block#
498544)
ORA-10564: tablespace HSEMR_TAB
ORA-01110: 数据文件 65: 'D:\APP\ADMINISTRATOR\ORADATA\XFF\EMR009.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 467200
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4432] [ACCESS_VIOLATION] [ADDR:0xC]
[PC:0x14306B54A] [UNABLE_TO_READ] []


ORA-01112: 未启动介质恢复

对于最终无法正常recover成功数据文件,使用Oracle数据库恢复利器:Oracle Recovery Tools工具快速调整scn
oracle-recovery-tools


然后重建ctl,recover 数据库并open成功

Sun Aug 04 01:01:51 2024
Successful mount of redo thread 1, with mount id 3696824638
Completed: CREATE CONTROLFILE REUSE DATABASE "XFF" NORESETLOGS FORCE LOGGING ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 200
    MAXINSTANCES 8
    MAXLOGXFFTORY 23360
LOGFILE
  GROUP 1 'D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 2 'D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG'  SIZE 50M BLOCKSIZE 512,
  GROUP 3 'D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO03.LOG'  SIZE 50M BLOCKSIZE 512
DATAFILE
  'D:\APP\ADMINISTRATOR\ORADATA\XFF\SYSTEM01.DBF',
  'D:\APP\ADMINISTRATOR\ORADATA\XFF\SYSAUX01.DBF',
……
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Only allocated 127 recovery slaves (requested 128)
Parallel Media Recovery started with 127 slaves
Sun Aug 04 01:01:56 2024
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1160002 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG
Completed: ALTER DATABASE RECOVER  database  
Sun Aug 04 01:02:20 2024
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 1946 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 1160002, block 2, scn 6029119350
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1160002 Reading mem 0
  Mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 1160002, block 3895, scn 6029139793
 0 data blocks read, 0 data blocks written, 1946 redo k-bytes read
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Sun Aug 04 01:02:21 2024
LGWR: STARTING ARCH PROCESSES
Sun Aug 04 01:02:21 2024
ARC0 started with pid=71, OS id=2772 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Sun Aug 04 01:02:22 2024
ARC1 started with pid=72, OS id=7996 
Sun Aug 04 01:02:22 2024
ARC2 started with pid=73, OS id=2900 
Sun Aug 04 01:02:22 2024
ARC3 started with pid=74, OS id=6856 
Archived Log entry 1 added for thread 1 sequence 1160000 ID 0xc4814d77 dest 1:
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Thread 1 advanced to log sequence 1160003 (thread open)
Thread 1 opened at log sequence 1160003
  Current log# 1 seq# 1160003 mem# 0: D:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Aug 04 01:02:23 2024
SMON: enabling cache recovery
Archived Log entry 2 added for thread 1 sequence 1160002 ID 0xc4814d77 dest 1:
Archived Log entry 3 added for thread 1 sequence 1160001 ID 0xc4814d77 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
[7808] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:7657234 end:7657703 diff:469 (4 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         Txff condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is ZHS16GBK
No Resource Manager plan active
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
If files cannot be cataloged, then manually delete them
using OS command.
One of the following events caused txff:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
   then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sun Aug 04 01:02:27 2024
QMNC started with pid=75, OS id=7884 
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open

后续处理异常表,lob,index等数据,客户业务测试都ok,完成本次恢复工作

ORA-03113: 通信通道的文件结尾

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-03113: 通信通道的文件结尾

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报:ORA-03113: 通信通道的文件结尾

PS C:\Users\Administrator> sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期六 8月 3 11:05:03 2024

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

已连接到空闲例程。

SQL> startup nomount
ORACLE 例程已经启动。

Total System Global Area 2.0510E+10 bytes
Fixed Size                  2184632 bytes
Variable Size            1.1476E+10 bytes
Database Buffers         8992587776 bytes
Redo Buffers               40046592 bytes
SQL> shutdown immediate
ORA-01507: ??????


ORACLE 例程已经关闭。
SQL> startup
ORACLE 例程已经启动。

Total System Global Area 2.0510E+10 bytes
Fixed Size                  2184632 bytes
Variable Size            1.1476E+10 bytes
Database Buffers         8992587776 bytes
Redo Buffers               40046592 bytes
数据库装载完毕。
ORA-03113: 通信通道的文件结尾
进程 ID: 4040
会话 ID: 1018 序列号: 7

这类错误,一般真正错误原因在alert日志中,查看alert日志

Sat Aug 03 08:15:12 2024
alter database mount exclusive
Successful mount of redo thread 1, with mount id 3557233552
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 11 processes
Started redo scan
Completed redo scan
 read 0 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 745, block 80599, scn 7100295
Recovery of Online Redo Log: Thread 1 Group 1 Seq 745 Reading mem 0
  Mem# 0: D:\ORACLE\ORADATA\XFF\REDO01.LOG
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 745, block 80599, scn 7120296
 0 data blocks read, 0 data blocks written, 0 redo k-bytes read
LGWR: STARTING ARCH PROCESSES
Sat Aug 03 08:15:19 2024
ARC0 started with pid=32, OS id=5496 
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Sat Aug 03 08:15:20 2024
ARC1 started with pid=33, OS id=3873072 
Sat Aug 03 08:15:20 2024
ARC2 started with pid=34, OS id=3873644 
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
sksasmowrt WriteConsole error 6
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_arc2_3873644.trc:
ORA-19815: 警告: db_recovery_file_dest_size 字节 (共 10737418240 字节) 已使用 100.00%, 尚有 0 字节可用。
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_arc2_3873644.trc:
ORA-19809: 超出了恢复文件数的限制
ORA-19804: 无法回收 13760000 字节磁盘空间 (从 10737418240 限制中)
ARC2: Error 19809 Creating archive log file to 'D:\FRA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_744_%U_.ARC'
Sat Aug 03 08:15:20 2024
ARC3 started with pid=35, OS id=3873424 
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_ora_3873352.trc:
ORA-19815: ??: db_recovery_file_dest_size ?? (? 10737418240 ??) ??? 100.00%, ?? 0 ?????
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_ora_3873352.trc:
ORA-19809: ???????????
ORA-19804: ???? 12296704 ?????? (? 10737418240 ???)
ARCH: Error 19809 Creating archive log file to 'D:\FRA\XFF\ARCHIVELOG\2024_08_03\O1_MF_1_743_%U_.ARC'
ARCH: Archival stopped, error occurred. Will continue retrying
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_arc2_3873644.trc:
ORA-16038: 日志 3 sequence# 744 无法归档
ORA-19809: 超出了恢复文件数的限制
ORA-00312: 联机日志 3 线程 1: 'D:\ORACLE\ORADATA\XFF\REDO03.LOG'
Errors in file d:\oracle\diag\rdbms\xff\xff\trace\xff_ora_3873352.trc:
ORA-16038: ?? 2 sequence# 743 ????
ORA-19809: ???????????
ORA-00312: ???? 2 ?? 1: 'D:\ORACLE\ORADATA\XFF\REDO02.LOG'
USER (ospid: 3873352): terminating the instance due to error 16038
Sat Aug 03 08:15:27 2024
Instance terminated by USER, pid = 3873352

是由于闪回区满了,导致redo无法归档,从而使得数据库无法正常open,解决办法:
1. 清理以前归档日志
2. 把闪回区调大一些

Oracle 支持GB18030-2022

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle 支持GB18030-2022

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

GB18030 是中国政府标准,定义了在中国软件所需的语言和字符集支持. 它是一种与 ISO 10646/Unicode 相匹配的 Unicode 转换格式,并提供涵盖所有 Unicode 的字符库.它最初于 2000 年发布 (GB18030-2000),随后在 2005 年 (GB18030-2005) 和 2022 年 (GB18030-2022) 发布了两次更新.GB18030-2022 在字符覆盖方面的内容等同于Unicode Version 11.0. GB18030-2022 标准的生效日期为 2023 年 8 月 1 日.Oracle 数据库通过 AL32UTF8 Unicode 数据库字符集支持 GB18030 字符的存储.同时还提供客户端专用字符集 ZHS32GB18030,使应用程序可以在客户端处理 GB18030 编码文本的输入/输出,并根据需要进行字符集转换. 截至 2023 年 3 月,现有 Oracle 数据库版本中 ZHS32GB18030 的实施是基于 GB18030-2005.
Oracle Database 23ai 版本中已实现对最新 GB18030-2022 标准的支持.具体来说,Oracle Database 23ai 支持 GB18030-2022 标准的级别 3 的实现,这是 GB18030 最广泛的支持级别. 与此同时,我们还为现有客户在 Oracle Database 19c 各 RU 版本之上通过的一次性补丁来支持 GB18030-2022.Oracle Database 19c 如果要支持 GB18030-2022 需要下载补丁 (#34994751).它可以应用于 Oracle 19c RU 版本 19.3 或更高版本.它需要应用于数据库服务器和客户端. 此补丁仅启用对 Oracle 数据库的 GB18030-2022 支持. 为了使应用程序完全支持 GB18030-2022,应用程序堆栈中的每个组件都需要能够确保处理 GB18030-2022 中的字符.
ZHS32GB18030字符集是仅限客户端的字符集,因此不支持将 ZHS32GB18030 用作NLS_CHARACTERSET.如果您的数据库使用ZHS32GB18030作为NLS_CHARACTERSET,那么强烈建议您尽快迁移到 AL32UTF8.
GB18030编码定义了完整的 Unicode 映射,这意味着如果在 GB18030客户端上使用设置为ZHS32GB18030的NLS_LANG并使用 AL32UTF8(或 UTF8)NLS_CHARACTERSET数据将相互转换和从 AL32UTF8 转换为 并完全保留.在使用非 Unicode GB18030编码和设置为 ZHS32GB18030 的 NLS_LANG 的 GB18030客户端上插入 GB18030数据时,Oracle 会将非 Unicode GB18030代码转换为AL32UTF8代码,并将其作为 AL32UTF8 代码存储在数据库中.在使用非 Unicode GB18030编码和设置为 ZHS32GB18030 的 NLS_LANG 的 GB18030客户端上选择 GB18030-2000 数据时,Oracle 会将 GB18030数据从 AL32UTF8 代码转换为非 Unicode GB18030代码提供给客户端,因此,如果此客户端使用非 Unicode GB18030代码,则数据库端的存储AL32UTF8对客户端完全透明.
通过以下sql验证你的数据库是否支持GB18030-2022
以下结果表示支持GB18030-2022

[oracle@xifenfei ~]$ sqlplus / as sysdba

SQL*Plus: Release 23.0.0.0.0 - for Oracle Cloud and Engineered Systems on Thu Aug 1 21:02:36 2024
Version 23.5.0.24.07

Copyright (c) 1982, 2024, Oracle.  All rights reserved.


Connected to:
Oracle Database 23ai Enterprise Edition Release 23.0.0.0.0 - for Oracle Cloud and Engineered Systems
Version 23.5.0.24.07

SQL> SELECT DUMP(CONVERT(UNISTR('\FE10\FFFD\E78D'), 'ZHS32GB18030', 'AL16UTF16'), 16) FROM DUAL;

DUMP(CONVERT(UNISTR('\FE10\FFFD\E78D'),'ZHS32GB18030','AL16UTF16'),16)
--------------------------------------------------------------------------------
Typ=1 Len=10: a6,d9,84,31,a4,37,84,31,82,36

以下结果不表示支持GB18030-2022

[oracle@iZbp11c0qyuuo1gr7j98upZ ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Thu Aug 1 20:46:12 2024

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> SELECT DUMP(CONVERT(UNISTR('\FE10\FFFD\E78D'), 'ZHS32GB18030', 'AL16UTF16'), 16) FROM DUAL;

DUMP(CONVERT(UNISTR('\FE10\FFFD\E78D
------------------------------------
Typ=1 Len=8: 84,31,82,36,a3,bf,a6,d9

参考文档:Oracle 数据库 GB18030-2022 支持方针 (Doc ID 2937409.1)

手工对multipath设备进行授权导致asm 磁盘组mount报ORA-15032-ORA-15131

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:手工对multipath设备进行授权导致asm 磁盘组mount报ORA-15032-ORA-15131

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户硬件通过底层重组raid,然后把lun进行到asm的机器上,在mount data_dg磁盘组的时候,报ORA-15032 ORA-15131错误,磁盘组无法正常mount,这种报错不太常见,一般要不直接报某个block无法访问,要不直接报缺少asm disk之类的.
ORA-15131


通过远程上去分析,发现alert日志如下

Wed Jul 31 04:55:17 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 04:55:17 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
WARNING: cache failed reading from group=1(DATA_DG) fn=1 blk=3 count=1 from disk= 0 
  (DATA_DG_0000) kfkist=0x20 status=0x02 osderr=0x0 file=kfc.c line=11596
Errors in file /oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lgwr_8681.trc:
ORA-15025: could not open disk "/dev/mapper/xffdb_data01_new"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 3
ORA-15080: synchronous I/O operation to a disk failed
ERROR: cache failed to read group=1(DATA_DG) fn=1 blk=3 from disk(s): 0(DATA_DG_0000)
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ORA-15080: synchronous I/O operation to a disk failed
NOTE: cache initiating offline of disk 0 group DATA_DG
NOTE: process _lgwr_+asm2 (8681) initiating offline of disk 0.3915927124 (DATA_DG_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe9684e54, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 42 for pid 15, osid 8681
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
WARNING: Offline for disk DATA_DG_0000 in mode 0x7f failed.
Wed Jul 31 04:55:17 2024
NOTE: halting all I/Os to diskgroup 1 (DATA_DG)
NOTE: LGWR caught ORA-15131 while mounting diskgroup 1
ERROR: ORA-15131 signalled during mount of diskgroup DATA_DG
NOTE: cache dismounting (clean) group 1/0xA868BD55 (DATA_DG)
NOTE: messaging CKPT to quiesce pins Unix process pid: 16915, image: oracle@xffdb2 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
Wed Jul 31 04:55:18 2024
List of instances:
 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 9)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 1 invalid = TRUE
 2 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 1
WARNING: dirty detached from domain 1
WARNING: thread recovery enqueue was not held for domain 1 when doing a dirty detach
NOTE: cache dismounted group 1/0xA868BD55 (DATA_DG)
NOTE: cache ending mount (fail) of group DATA_DG number=1 incarn=0xa868bd55
NOTE: cache deleting context for group DATA_DG 1/0xa868bd55
GMON dismounting group 1 at 43 for pid 29, osid 16915
NOTE: Disk DATA_DG_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0002 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0003 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0004 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_DG_0005 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA_DG was not mounted
ORA-15032: not all alterations performed
ORA-15131: block  of file  in diskgroup  could not be read
ERROR: alter diskgroup data_dg mount

基本上可以确认是由于访问/dev/mapper/xffdb_data01_new 磁盘权限不对导致读disk= 0 fn=1 blk=3失败(突然读这个block没有权限,而没有报最初的磁盘头无权限,有点不合常理),进一步分析确认是xffdb_data01_new 权限不对.

xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root      8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      7 Jul 31 04:55 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
xffdb2:/oracle/u01/app/grid/diag/asm/+asm/+ASM2/trace$ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:13 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk     253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 04:55 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 04:55 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 04:55 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:13 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 04:55 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 04:55 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:13 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 04:28 /dev/dm-9

再进一步确认xffdb_*_new三个磁盘是硬件恢复之后镜像过来的,然后现场工程师直接人工修改/dev/dm_[12-14]权限,再尝试mount磁盘组,结果发生该错误,通过v$asm_disk再次查询asm disk情况,发现xffdb_*_new的磁盘均不在列表中

GROUP_NUMBER DISK_NUMBER HEADER_STATUS         STATE          PATH
------------ ----------- --------------------- -------------- --------------------------
           0           2 MEMBER                NORMAL         /dev/mapper/xffdb_data03
           0           3 MEMBER                NORMAL         /dev/mapper/xffdb_data06
           0           4 MEMBER                NORMAL         /dev/mapper/xffdb_data04
           3           1 MEMBER                NORMAL         /dev/mapper/xffdb_vote2
           2           0 MEMBER                NORMAL         /dev/mapper/xffdb_log1
           3           2 MEMBER                NORMAL         /dev/mapper/xffdb_vote3
           2           1 MEMBER                NORMAL         /dev/mapper/xffdb_log2

7 rows selected.

进一步查看磁盘权限

xffdb2:/dev/mapper$ls -ltr
total 0
crw-rw---- 1 root root 10, 58 Jul 26 12:24 control
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 26 12:24 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      8 Jul 26 12:24 vg_xffdb2-LogVol02 -> ../dm-16
lrwxrwxrwx 1 root root      8 Jul 31 04:21 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 04:28 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      8 Jul 31 04:28 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 04:28 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 04:55 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      8 Jul 31 04:59 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 04:59 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      8 Jul 31 05:15 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 05:15 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      7 Jul 31 05:15 xffdb_data03 -> ../dm-2
xffdb2:/dev/mapper$ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 26 12:24 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 26 12:24 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:22 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 04:28 /dev/dm-11
brw-rw---- 1 root disk     253, 12 Jul 31 04:55 /dev/dm-12
brw-rw---- 1 root disk     253, 13 Jul 31 05:15 /dev/dm-13
brw-rw---- 1 root disk     253, 14 Jul 31 05:15 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 26 12:24 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 26 12:24 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 04:21 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 05:15 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 04:59 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:22 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 05:15 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 05:15 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 04:28 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:22 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 04:28 /dev/dm-9

发现进一步访问,这三个盘权限全部还原成root:disk,导致grid无法正常访问,到这一部分基本上可以判断恢复过来的多路径下面的三个磁盘,当被访问之时,权限会发生改变,一般发生该问题,是由于这些设备没有被udev进行绑定导致,使用udev对这三个磁盘进行权限和所有组相关信息进行绑定之后,磁盘权限不再变化,v$asm_disk中显示信息也正常

[root@xffdb2 rules.d]# ls -l /dev/dm*
brw-rw---- 1 root disk     253,  0 Jul 31 05:26 /dev/dm-0
brw-rw---- 1 root disk     253,  1 Jul 31 05:26 /dev/dm-1
brw-rw---- 1 grid asmadmin 253, 10 Jul 31 05:26 /dev/dm-10
brw-rw---- 1 root disk     253, 11 Jul 31 05:26 /dev/dm-11
brw-rw---- 1 grid asmadmin 253, 12 Jul 31 05:26 /dev/dm-12
brw-rw---- 1 grid asmadmin 253, 13 Jul 31 05:26 /dev/dm-13
brw-rw---- 1 grid asmadmin 253, 14 Jul 31 05:26 /dev/dm-14
brw-rw---- 1 root disk     253, 15 Jul 31 05:26 /dev/dm-15
brw-rw---- 1 root disk     253, 16 Jul 31 05:26 /dev/dm-16
brw-rw---- 1 root disk     253, 17 Jul 31 05:26 /dev/dm-17
brw-rw---- 1 grid asmadmin 253,  2 Jul 31 05:26 /dev/dm-2
brw-rw---- 1 grid asmadmin 253,  3 Jul 31 05:26 /dev/dm-3
brw-rw---- 1 grid asmadmin 253,  4 Jul 31 05:26 /dev/dm-4
brw-rw---- 1 grid asmadmin 253,  5 Jul 31 05:26 /dev/dm-5
brw-rw---- 1 grid asmadmin 253,  6 Jul 31 05:26 /dev/dm-6
brw-rw---- 1 root disk     253,  7 Jul 31 05:26 /dev/dm-7
brw-rw---- 1 grid asmadmin 253,  8 Jul 31 05:26 /dev/dm-8
brw-rw---- 1 root disk     253,  9 Jul 31 05:26 /dev/dm-9
[root@xffdb2 rules.d]# ls -l /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Jul 31 05:26 control
lrwxrwxrwx 1 root root      8 Jul 31 05:26 mpathe -> ../dm-17
lrwxrwxrwx 1 root root      7 Jul 31 05:26 mpathf -> ../dm-7
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data01_new -> ../dm-14
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data02_new -> ../dm-13
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data03 -> ../dm-2
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data04 -> ../dm-5
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data05_new -> ../dm-12
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data06 -> ../dm-6
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_data07 -> ../dm-11
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_data08 -> ../dm-9
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_log1 -> ../dm-4
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_log2 -> ../dm-3
lrwxrwxrwx 1 root root      7 Jul 31 05:26 xffdb_vote2 -> ../dm-8
lrwxrwxrwx 1 root root      8 Jul 31 05:26 xffdb_vote3 -> ../dm-10
lrwxrwxrwx 1 root root      8 Jul 31 05:26 vgdata-lv_data -> ../dm-15
lrwxrwxrwx 1 root root      7 Jul 31 05:26 vg_xffdb2-LogVol00 -> ../dm-1
lrwxrwxrwx 1 root root      7 Jul 31 05:26 vg_xffdb2-LogVol01 -> ../dm-0
lrwxrwxrwx 1 root root      8 Jul 31 05:26 vg_xffdb2-LogVol02 -> ../dm-16
[root@xffdb2 rules.d]# 
SQL> /

GROUP_NUMBER DISK_NUMBER HEADER_STATUS                        STATE                    PATH
------------ ----------- ------------------------------------ ------------------------ -----------------------------
           0           0 MEMBER                               NORMAL                   /dev/mapper/xffdb_data01_new
           0           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_data05_new
           0           2 MEMBER                               NORMAL                   /dev/mapper/xffdb_data03
           0           3 MEMBER                               NORMAL                   /dev/mapper/xffdb_data06
           0           4 MEMBER                               NORMAL                   /dev/mapper/xffdb_data04
           0           5 MEMBER                               NORMAL                   /dev/mapper/xffdb_data02_new
           3           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_vote2
           2           0 MEMBER                               NORMAL                   /dev/mapper/xffdb_log1
           3           2 MEMBER                               NORMAL                   /dev/mapper/xffdb_vote3
           2           1 MEMBER                               NORMAL                   /dev/mapper/xffdb_log2

10 rows selected.

mount磁盘组成功

SQL>  alter diskgroup data_dg mount 
NOTE: cache registered group DATA_DG number=1 incarn=0x4178bd5e
NOTE: cache began mount (first) of group DATA_DG number=1 incarn=0x4178bd5e
NOTE: Assigning number (1,0) to disk (/dev/mapper/xffdb_data01_new)
NOTE: Assigning number (1,4) to disk (/dev/mapper/xffdb_data05_new)
NOTE: Assigning number (1,2) to disk (/dev/mapper/xffdb_data03)
NOTE: Assigning number (1,5) to disk (/dev/mapper/xffdb_data06)
NOTE: Assigning number (1,3) to disk (/dev/mapper/xffdb_data04)
NOTE: Assigning number (1,1) to disk (/dev/mapper/xffdb_data02_new)
Wed Jul 31 05:27:47 2024
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 46 for pid 29, osid 26738
NOTE: cache opening disk 0 of grp 1: DATA_DG_0000 path:/dev/mapper/xffdb_data01_new
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 1: DATA_DG_0001 path:/dev/mapper/xffdb_data02_new
NOTE: cache opening disk 2 of grp 1: DATA_DG_0002 path:/dev/mapper/xffdb_data03
NOTE: cache opening disk 3 of grp 1: DATA_DG_0003 path:/dev/mapper/xffdb_data04
NOTE: cache opening disk 4 of grp 1: DATA_DG_0004 path:/dev/mapper/xffdb_data05_new
NOTE: cache opening disk 5 of grp 1: DATA_DG_0005 path:/dev/mapper/xffdb_data06
NOTE: cache mounting (first) external redundancy group 1/0x4178BD5E (DATA_DG)
Wed Jul 31 05:27:47 2024
* allocate domain 1, invalid = TRUE 
kjbdomatt send to inst 1
Wed Jul 31 05:27:47 2024
NOTE: attached to recovery domain 1
NOTE: cache recovered group 1 to fcn 0.1814063801
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Wed Jul 31 05:27:47 2024
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR found thread 1 closed at ABA 12401.4517
NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA_DG)
NOTE: LGWR opening thread 1 at fcn 0.1814063801 ABA 12402.4518
NOTE: cache mounting group 1/0x4178BD5E (DATA_DG) succeeded
NOTE: cache ending mount (success) of group DATA_DG number=1 incarn=0x4178bd5e
Wed Jul 31 05:27:47 2024
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
SUCCESS: diskgroup DATA_DG was mounted
SUCCESS:  alter diskgroup data_dg mount

重要提醒:手工直接对multipath设备权限所有者操作,当该设备被访问之时权限可能恢复成当初默认root:disk,对于这样的设备建议通过udev进行设置权限和所有者等信息

存储宕机导致Oracle异常故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储宕机导致Oracle异常故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

存储突然掉线,导致数据库crash,报大量ORA-00206 ORA-00202 ORA-15081以及Linux-x86_64 Error: 5: Input/output error之类的错误

Sun Jul 21 20:00:11 2024
Thread 1 advanced to log sequence 1594398 (LGWR switch)
  Current log# 5 seq# 1594398 mem# 0: +DATA/xff/onlinelog/group_5.412.906718739
Sun Jul 21 20:53:17 2024
WARNING: Write Failed. group:2 disk:0 AU:506916 offset:49152 size:16384
Sun Jul 21 20:53:17 2024
WARNING: Read Failed. group:2 disk:2 AU:506931 offset:49152 size:16384
WARNING: failed to read mirror side 1 of virtual extent 4 logical extent 0 of file 415 in group [2.34109396] 
from disk ORACLE_DATA_0002  allocation unit 506931 reason error; if possible, will try another mirror side
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:
ORA-15080: 与磁盘的同步 I/O 操作失败
ORA-27061: 异步 I/O 等待失败
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 16384
WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 
of file 415 in group 2 on disk 0 allocation unit 506916 
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:
ORA-00206: 写入控制文件时出错 (块 3, # 块 1)
ORA-00202: 控制文件: ''+DATA/xff/controlfile/current.415.906718737''
ORA-15081: 无法将 I/O 操作提交到磁盘
ORA-15081: 无法将 I/O 操作提交到磁盘
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:
ORA-00221: 写入控制文件时出错
ORA-00206: 写入控制文件时出错 (块 3, # 块 1)
ORA-00202: 控制文件: ''+DATA/xff/controlfile/current.415.906718737''
ORA-15081: 无法将 I/O 操作提交到磁盘
ORA-15081: 无法将 I/O 操作提交到磁盘
CKPT (ospid: 42142): terminating the instance due to error 221
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_lmon_42087.trc:
ORA-00202: 控制文件: ''+DATA/xff/controlfile/current.415.906718737''
ORA-15081: 无法将 I/O 操作提交到磁盘
ORA-27072: 文件 I/O 错误
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 1038194784
Additional information: -1
Sun Jul 21 20:53:19 2024
ORA-1092 : opitsk aborting process
Sun Jul 21 20:53:24 2024
ORA-1092 : opitsk aborting process
Sun Jul 21 20:53:24 2024
License high water mark = 59
Sun Jul 21 20:53:28 2024
Instance terminated by CKPT, pid = 42142
USER (ospid: 64660): terminating the instance
Instance terminated by USER, pid = 64660

存储恢复之后启动数据库报ORA-600 2131错误

Mon Jul 22 09:10:04 2024
ALTER DATABASE   MOUNT
This instance was first to mount
Mon Jul 22 09:10:04 2024
Sweep [inc][490008]: completed
Sweep [inc2][490008]: completed
NOTE: Loaded library: System 
SUCCESS: diskgroup ORACLE_DATA was mounted
NOTE: dependency between database rac and diskgroup resource ora.ORACLE_DATA.dg is established
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_14301.trc  (incident=492409):
ORA-00600: ??????, ??: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_492409/xff1_ora_14301_i492409.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE   MOUNT...

客户尝试重建ctl进行恢复,结果由于分析不正确,导致在重建ctl的时候,遗漏了3个数据文件,并且在屏蔽一致性的情况下,强制resetlogs操作,结果数据库没有被正常打开,而是报ORA-600 2662错误

alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 9965567206652
Clearing online redo logfile 1 +DATA/xff/onlinelog/group_1.414.906718739
Clearing online log 1 of thread 1 sequence number 0
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 +DATA/xff/onlinelog/group_2.413.906718739
Clearing online log 2 of thread 1 sequence number 0
Clearing online redo logfile 2 complete
Clearing online redo logfile 5 +DATA/xff/onlinelog/group_5.412.906718739
Clearing online log 5 of thread 1 sequence number 0
Clearing online redo logfile 5 complete
Expanded controlfile section 2 from 1 to 63 records
The number of logical blocks in section 2 remains the same
Expanded controlfile section 1 from 4 to 66 records
Requested to grow by 62 records; added 32 blocks of records
Expanded controlfile section 30 from 1 to 63 records
The number of logical blocks in section 30 remains the same
Expanded controlfile section 29 from 1 to 63 records
The number of logical blocks in section 29 remains the same
Control file has been expanded to support 63 threads
Mon Jul 22 23:04:07 2024
Redo thread 2 enabled by open resetlogs or standby activation
Online log +DATA/xff/onlinelog/group_1.414.906718739: Thread 1 Group 1 was previously cleared
Online log +DATA/xff/onlinelog/group_2.413.906718739: Thread 1 Group 2 was previously cleared
Online log +DATA/xff/onlinelog/group_3.501.1175036643: Thread 2 Group 3 was previously cleared
Online log +DATA/xff/onlinelog/group_4.502.1175036645: Thread 2 Group 4 was previously cleared
Online log +DATA/xff/onlinelog/group_5.412.906718739: Thread 1 Group 5 was previously cleared
Mon Jul 22 23:04:08 2024
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 9965567206655, threshold SCN value is 0
If you have not previously reported this warning on this database, 
please notify Oracle Support so that additional diagnosis can be performed.
Mon Jul 22 23:04:09 2024
Assigning activation ID 2763017873 (0xa4b04e91)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: +DATA/xff/onlinelog/group_1.414.906718739
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Jul 22 23:04:10 2024
SMON: enabling cache recovery
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc  (incident=624374):
ORA-00600: 内部错误代码, 参数: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []
Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_624374/xff1_ora_64210_i624374.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 64210): terminating the instance due to error 600
Instance terminated by USER, pid = 64210
ORA-1092 signalled during: alter database open resetlogs...

操作到这里,后续问题就比较麻烦了,因为在asm磁盘组中数据文件重建ctl的时候遗漏3个并且还被resetlogs操作过,导致这三个文件的resetlogs scn和其他数据文件不一致,对于这个问题,解决办法通过Oracle Recovery Tools工具或者bbed修改相关resetlogs scn,然后重建ctl

SQL> @rectl.sql

Control file created.

SQL> RECOVER DATABASE;
Media recovery complete

然后解决之前数据库启动报ORA-600 2662问题,通过修改数据库scn进行解决,可以使用Patch_SCN工具进行快速解决,然后open数据库成功

SQL> ALTER DATABASE OPEN;
 
Database altered.

但是查看alert日志数据库报大量ORA-600 4194、ORA-01595Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC21D511] [PC:0x97F4EFA, kgegpa()+40]之类错误

Wed Jul 24 15:24:21 2024
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 0 KB redo, 0 data blocks need recovery
…………
Database Characterset is ZHS16GBK
No Resource Manager plan active
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_smon_40279.trc  (incident=777938):
ORA-00600: 内部错误代码, 参数: [4194], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Wed Jul 24 15:24:40 2024
QMNC started with pid=79, OS id=54632 
Block recovery from logseq 2, block 74 to scn 9965587206835
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/redo02
LOGSTDBY: Validating controlfile with logical metadata
Wed Jul 24 15:24:40 2024
Block recovery stopped at EOT rba 2.82.16
Block recovery completed at rba 2.82.16, scn 2320.1263080114
Block recovery from logseq 2, block 74 to scn 9965587206833
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/redo02
Block recovery completed at rba 2.82.16, scn 2320.1263080114
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_smon_40279.trc:
ORA-01595: 释放区 (4) 回退段 (20) 时出错
ORA-00600: 内部错误代码, 参数: [4194], [], [], [], [], [], [], [], [], [], [], []
LOGSTDBY: Validation complete
Wed Jul 24 15:24:41 2024
Sweep [inc][777938]: completed
Sweep [inc2][777938]: completed
Wed Jul 24 15:24:41 2024
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_q001_54657.trc  (incident=778362):
ORA-00600: 内部错误代码, 参数: [4194], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Starting background process SMCO
Wed Jul 24 15:24:42 2024
SMCO started with pid=83, OS id=54691 
Block recovery from logseq 2, block 74 to scn 9965587206835
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/redo02
Block recovery completed at rba 2.82.16, scn 2320.1263080118
Block recovery from logseq 2, block 74 to scn 9965587206838
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/redo02
Block recovery completed at rba 2.83.16, scn 2320.1263080119
Error 600 in kwqmnpartition(), aborting txn 
Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_q001_54657.trc  (incident=778363):
ORA-25319: 队列表重新分区已中止
Completed: alter database open
Block recovery from logseq 2, block 74 to scn 9965587206835
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/rac/onlinelog/redo02
Block recovery completed at rba 2.82.16, scn 2320.1263080118
Block recovery from logseq 2, block 74 to scn 9965587207538
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: +DATA/rac/onlinelog/redo02
Block recovery completed at rba 2.1097.16, scn 2320.1263080819
Errors in file /users/oracle/app/db/diag/rdbms/rac/rac1/trace/rac1_cjq0_55657.trc  (incident=778427):
ORA-00600: 内部错误代码, 参数: [600], [ORA-00600: 内部错误代码, 参数: 
[4194], [], [], [], [], [], [], [], [], [], [], []], [], [], [], [], [], [], [], [], [], []
Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_778427/xff1_cjq0_55657_i778427.trc
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F4EFA, kgegpa()+40][flags: 0x0, count: 1]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F396E, kgebse()+776][flags: 0x2, count: 2]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F396E, kgebse()+776][flags: 0x2, count: 2]

从报错分析是由于undo异常导致,处理异常undo回滚段之后,数据库open正常,安排逻辑迁移数据,完成本次恢复