Oracle Recovery Tools更新—202304

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle Recovery Tools更新—202304

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近对Oracle Recovery Tools工具进行了升级,增加了在某些恢复情况下对于单个文件的CheckPoint Time的显示,便于判断文件的有效性(便于历史文件区分),该软件是一个恢复小工具,主要定位:
1. 对于某些恢复场景的快速判断,便于问题定位和后续恢复
2. 实现图形化快速修复损坏的oracle数据块和构造块
3. 实现图形化快速修改文件头信息,代替bbed解决文件头的各种scn等问题
4. 快速修改内存中记录,在某些强制拉库的过程中提供便利,特别是在win平台后续版本中无法通过oradebug等方式修改的情况下使用
20230410223727


把对Oracle内存修改功能集成到该软件中
20230410223826

增加了一些可能使用到的小功能
20230410223913

软件下载地址:OraRecovery下载
可以处理常见的oracle故障案例:
Oracle Recovery Tools修复空闲坏块
Oracle Recovery Tools实战批量坏块修复
Oracle Recovery Tools快速恢复ORA-19909
Oracle Recovery Tools 解决ORA-600 3020故障
Oracle Recovery Tools恢复MISSING00000文件故障
一键恢复ORA-01113 ORA-01110—Oracle Recovery Tools
Oracle Recovery Tools快速解决sysaux文件不能online问题
Oracle Recovery Tools恢复—ORA-00704 ORA-01555故障
Oracle Recovery Tools 解决ORA-01190 ORA-01248等故障
Oracle Recovery Tools解决ORA-00279 ORA-00289 ORA-00280故障

messages日志报Error:emcp:emcp_pseudo_ctl_ioctl错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:messages日志报Error:emcp:emcp_pseudo_ctl_ioctl错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在linux平台的rac中,当存储使用的是emc,而且多路径使用的是EMC PowerPath,我们有时候会在系统的/var/log/messages日志中看到类似这样的信息

Apr  2 18:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x1268, proc name: osysmond.bin
Apr  2 18:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x2285, proc name: osysmond.bin
Apr  2 19:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x80081272, proc name: osysmond.bin
Apr  2 19:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x1268, proc name: osysmond.bin
Apr  2 19:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x2285, proc name: osysmond.bin
Apr  2 20:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x80081272, proc name: osysmond.bin
Apr  2 20:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x1268, proc name: osysmond.bin
Apr  2 20:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x2285, proc name: osysmond.bin
Apr  2 21:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x80081272, proc name: osysmond.bin
Apr  2 21:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x1268, proc name: osysmond.bin
Apr  2 21:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x2285, proc name: osysmond.bin
Apr  2 22:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x80081272, proc name: osysmond.bin
Apr  2 22:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x1268, proc name: osysmond.bin
Apr  2 22:19:09 his02 kernel: Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0x2285, proc name: osysmond.bin

这些错误消息显示osysmond.bin已通过接口emcp_pseudo_ctl_ioctl发出具有无效命令的ioctl
osysmond.bin是Oracle RAC Cluster Health Monitor
emcp_pseudo_ctl_ioctl是EMC PowerPath内核驱动程序接口
在系统没有其他异常的情况下,可以忽略该报错.
参考:Oracle Linux: Error ” Error:emcp:emcp_pseudo_ctl_ioctl: Invalid cmd 0×80081272, proc name: osysmond.bin” (Doc ID 2152706.1)

Oracle 启动后一会儿就挂掉故障处理—ORA-600 17182

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Oracle 启动后一会儿就挂掉故障处理—ORA-600 17182

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一例正常运行的数据库突然节点不停重启(因为是rac,启动一会儿就crash,然后又被crs给启动起来,然后有crash,依次循环),告警日志类似:

Fri Mar 24 13:36:07 2023
QMNC started with pid=124, OS id=188397 
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Completed: ALTER DATABASE OPEN
Fri Mar 24 13:36:08 2023
minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:188028 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Fri Mar 24 13:36:08 2023
Starting background process CJQ0
Fri Mar 24 13:36:08 2023
CJQ0 started with pid=144, OS id=188451 
Fri Mar 24 13:36:09 2023
Redo thread 2 internally disabled at seq 44406 (CKPT)
Archived Log entry 135343 added for thread 2 sequence 44405 ID 0xcd7086e0 dest 1:
ARC0: Archiving disabled thread 2 sequence 44406
Archived Log entry 135344 added for thread 2 sequence 44406 ID 0xcd7086e0 dest 1:
Thread 1 advanced to log sequence 40030 (LGWR switch)
  Current log# 2 seq# 40030 mem# 0: +DATA/xff/onlinelog/group_2.310.1087136761
Archived Log entry 135345 added for thread 1 sequence 40029 ID 0xcd7086e0 dest 1:
Fri Mar 24 13:36:30 2023
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p200_188856.trc  (incident=1082418):
ORA-00600: internal error code, arguments: [17182], [0x7F4D2A13DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1082418/xff1_p200_188856_i1082418.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Mar 24 13:36:30 2023
Dumping diagnostic data in directory=[cdmp_20230324133630], requested by (instance=1, osid=188856 (P200)), summary=[incident=1082418].
Fri Mar 24 13:36:54 2023
Decreasing number of real time LMS from 6 to 0
Fri Mar 24 13:36:54 2023
Block recovery from logseq 40030, block 259 to scn 17199959182
Recovery of Online Redo Log: Thread 1 Group 2 Seq 40030 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_2.310.1087136761
Block recovery stopped at EOT rba 40030.317.16
Block recovery completed at rba 40030.317.16, scn 4.20089998
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E8579, kghrst()+1835] [flags: 0x0, count: 1]
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p200_188856.trc  (incident=1082419):
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97E8579] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F4D2A13DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1082419/xff1_p200_188856_i1082419.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p200_188856.trc  (incident=1082420):
ORA-00600: internal error code, arguments: [17147], [0x7F4D2A13DBD0], [], [], [], [], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97E8579] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F4D2A13DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1082420/xff1_p200_188856_i1082420.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p200_188856.trc  (incident=1082421):
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x7F4D2A13DBE8], [], [], [], [], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97E8579] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F4D2A13DBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1082421/xff1_p200_188856_i1082421.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Mar 24 13:36:56 2023
Dumping diagnostic data in directory=[cdmp_20230324133656], requested by (instance=1, osid=188856 (P200)), summary=[incident=1082420].
SMON: Restarting fast_start parallel rollback
Fri Mar 24 13:37:12 2023
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p000_188229.trc  (incident=1080530):
ORA-00600: internal error code, arguments: [17182], [0x7F3AB22ADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1080530/xff1_p000_188229_i1080530.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Mar 24 13:37:12 2023
Dumping diagnostic data in directory=[cdmp_20230324133712], requested by (instance=1, osid=188229 (P000)), summary=[incident=1080530].
Fri Mar 24 13:37:24 2023
Block recovery from logseq 40030, block 259 to scn 17199959182
Recovery of Online Redo Log: Thread 1 Group 2 Seq 40030 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_2.310.1087136761
Block recovery completed at rba 40030.317.16, scn 4.20089999
Fri Mar 24 13:37:37 2023
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E8579, kghrst()+1835] [flags: 0x0, count: 1]
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_p000_188229.trc  (incident=1080531):
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97E8579] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F3AB22ADBF8], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1080531/xff1_p000_188229_i1080531.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Mar 24 13:37:37 2023
Dumping diagnostic data in directory=[cdmp_20230324133737], requested by (instance=1, osid=188229 (P000)), summary=[incident=1080531].
Fri Mar 24 13:38:16 2023
SMON: slave died unexpectedly, downgrading to serial recovery
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_smon_188020.trc  (incident=1080418):
ORA-00600: internal error code, arguments: [17182], [0x7F9184B445C0], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1080418/xff1_smon_188020_i1080418.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 40030, block 259 to scn 17199959182
Recovery of Online Redo Log: Thread 1 Group 2 Seq 40030 Reading mem 0
  Mem# 0: +DATA/xff/onlinelog/group_2.310.1087136761
Block recovery completed at rba 40030.317.16, scn 4.20089999
ORACLE Instance xff1 (pid = 56) - Error 600 encountered while recovering transaction (10, 26) on object 242112.
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_smon_188020.trc:
ORA-00600: internal error code, arguments: [17182], [0x7F9184B445C0], [], [], [], [], [], [], [], [], [], []
Fri Mar 24 13:38:17 2023
Dumping diagnostic data in directory=[cdmp_20230324133817], requested by (instance=1, osid=188020 (SMON)), summary=[incident=1080418].
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x97E8579, kghrst()+1835] [flags: 0x0, count: 1]
Errors in file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_smon_188020.trc  (incident=1080419):
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97E8579] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x7F9184B445C0], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/database/diag/rdbms/xff/xff1/incident/incdir_1080419/xff1_smon_188020_i1080419.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Mar 24 13:38:20 2023
PMON (ospid: 187888): terminating the instance due to error 474
System state dump requested by (instance=1, osid=187888 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/database/diag/rdbms/xff/xff1/trace/xff1_diag_187902_20230324133820.trc
Fri Mar 24 13:38:21 2023
ORA-1092 : opitsk aborting process
Dumping diagnostic data in directory=[cdmp_20230324133820], requested by (instance=1, osid=187888 (PMON)), summary=[abnormal instance termination].
Instance terminated by PMON, pid = 187888

这类的故障在多年前处理过几次
ORA-600 17182导致oracle异常
ORA-00600[17182],ORA-00600[25027],ORA-00600[kghfrempty:ds]故障处理
这个故障的原因是由于block逻辑损坏,实例无法正常做回滚恢复,从而异常.处理异常回滚问题,就可以规避掉数据库启动后一会儿就crash问题.

断电引起的oracle数据库异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:断电引起的oracle数据库异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

服务器断电,数据库mount失败

SQL> startup mount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 1185853440 bytes
Fixed Size                  2175168 bytes
Variable Size             335548224 bytes
Database Buffers          838860800 bytes
Redo Buffers                9269248 bytes
ORA-00205: ?????????, ??????, ???????

alert日志报错信息

Sun Mar 19 20:18:29 2023
ALTER DATABASE   MOUNT
Errors in file d:\app\xifenfei\diag\rdbms\orcl\orcl\trace\orcl_ckpt_15064.trc  (incident=3697):
ORA-00227: ????????????: (? 1, # ? 1)
ORA-00202: ????: ''D:\BAIDUNETDISKDOWNLOAD\ORCL\CONTROL01.CTL''
Incident details in: d:\app\xifenfei\diag\rdbms\orcl\orcl\incident\incdir_3697\orcl_ckpt_15064_i3697.trc
Sun Mar 19 20:18:30 2023
Errors in file d:\app\xifenfei\diag\rdbms\orcl\orcl\trace\orcl_m000_18084.trc  (incident=3761):
ORA-00227: ????????????: (? 1, # ? 1)
ORA-00202: ????: ''D:\BAIDUNETDISKDOWNLOAD\ORCL\CONTROL01.CTL''
Incident details in: d:\app\xifenfei\diag\rdbms\orcl\orcl\incident\incdir_3761\orcl_m000_18084_i3761.trc
Sun Mar 19 20:18:29 2023
MMNL started with pid=16, OS id=9404 
ORA-00227: ????????????: (? 1, # ? 1)
ORA-00202: ????: ''D:\BAIDUNETDISKDOWNLOAD\ORCL\CONTROL01.CTL''
Checker run found 1 new persistent data failures
Trace dumping is performing id=[cdmp_20230319201831]
ORA-205 signalled during: ALTER DATABASE   MOUNT...

错误比较明显由于控制文件的block损坏导致数据库在mount的时候提示ORA-00205,重试重建ctl

SQL> CREATE CONTROLFILE REUSE DATABASE "orcl" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 50
  3      MAXLOGMEMBERS 5
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 226
  7  LOGFILE
  8    GROUP 1 'D:\BaiduNetdiskDownload\orcl/redo01.log'  SIZE 50M,
  9    GROUP 2 'D:\BaiduNetdiskDownload\orcl/redo02.log'  SIZE 50M,
 10    GROUP 3 'D:\BaiduNetdiskDownload\orcl/redo03.log'  SIZE 50M
 11  DATAFILE
 12  'D:\BaiduNetdiskDownload\orcl\EXAMPLE01.DBF',
 13  'D:\BaiduNetdiskDownload\orcl\GHZN.DBF',
 14  'D:\BaiduNetdiskDownload\orcl\GHZN2.DBF',
 15  'D:\BaiduNetdiskDownload\orcl\SYSAUX01.DBF',
 16  'D:\BaiduNetdiskDownload\orcl\SYSTEM01.DBF',
 17  'D:\BaiduNetdiskDownload\orcl\UNDOTBS01.DBF',
 18  'D:\BaiduNetdiskDownload\orcl\USERS01.DBF'
 19  CHARACTER SET ZHS16GBK
 20  ;
CREATE CONTROLFILE REUSE DATABASE "orcl" NORESETLOGS  NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01565: error in identifying file
'D:\BaiduNetdiskDownload\orcl\UNDOTBS01.DBF'
ORA-27041: unable to open file
OSD-04001: 逻辑块大小无效 (OS 2613931212)

由于undo文件异常(大小不是block size的整数倍),因此报OSD-04001: 逻辑块大小无效错误.对undo文件及其其他文件进行检查发现数据库文件有不少坏块,而且undo文件的文件头损坏
20230319202417


通过抛弃undo文件并进行一些处理,重建ctl成功,并且recover 数据库成功,顺利open数据库

SQL> CREATE CONTROLFILE REUSE DATABASE "orcl" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 50
  3      MAXLOGMEMBERS 5
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 226
  7  LOGFILE
  8    GROUP 1 'D:\BaiduNetdiskDownload\orcl/redo01.log'  SIZE 50M,
  9    GROUP 2 'D:\BaiduNetdiskDownload\orcl/redo02.log'  SIZE 50M,
 10    GROUP 3 'D:\BaiduNetdiskDownload\orcl/redo03.log'  SIZE 50M
 11  DATAFILE
 12  'D:\BaiduNetdiskDownload\orcl\EXAMPLE01.DBF',
 13  'D:\BaiduNetdiskDownload\orcl\GHZN.DBF',
 14  'D:\BaiduNetdiskDownload\orcl\GHZN2.DBF',
 15  'D:\BaiduNetdiskDownload\orcl\SYSAUX01.DBF',
 16  'D:\BaiduNetdiskDownload\orcl\SYSTEM01.DBF',
 17  'D:\BaiduNetdiskDownload\orcl\USERS01.DBF'
 18  CHARACTER SET ZHS16GBK
 19  ;

Control file created.

SQL> recover database;
Media recovery complete.
SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup mount pfile='d:/pfile.txt'
ORACLE instance started.

Total System Global Area 1185853440 bytes
Fixed Size                  2175168 bytes
Variable Size             335548224 bytes
Database Buffers          838860800 bytes
Redo Buffers                9269248 bytes
Database mounted.
SQL> alter database open;

Database altered.

然后使用逻辑方式导出数据,运气不错业务文件没有任何坏块,system坏块在aud$上,无任何业务数据丢失.

等保修改oracle SYS用户名要求的请注意—ORA-00600 kokasgi1

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:等保修改oracle SYS用户名要求的请注意—ORA-00600 kokasgi1

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在2019年处理过第一起ORA-00600 kokasgi1 故障到现在,已经过去了近4年,今天又有客户依旧因为修改sys重启库遇到该问题
ora-600-kokasgi1


通过分析确认客户那边在等保的时候要求修改oracle的SYS用户
20230319001510

然后重启数据库,数据库就开始报ORA-600 kokasgi1错误.
再次呼吁:
1. oracle的sys用户名不能修改,这个东西是写在oracle代码里面的,启动的时候会去读取
2.如果已经修改了sys用户名的,请在数据库重启之前一定修改回来