open只有system文件的库

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:open只有system文件的库

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有一个朋友自己想测试只用system文件open库,闲着没事给他测试了下,顺利open成功(主要还是经验比较多,规避了很多坑)
1. 准备参数文件

*.audit_file_dest='C:\app\XFF\admin\ORCL\adump'
*.audit_trail='none'
*.compatible='11.2.0.3.0'
*.control_files='H:\TEMP\11203\control01.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='DBM'
*.diagnostic_dest='C:\app\XFF'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=ORCLXDB)'
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.pga_aggregate_target=2147483648
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sessions=170
*.sga_target=6442450944
*.undo_tablespace='UNDOTBS1'
undo_management=MANUAL
_corrupted_rollback_segments=
_allow_resetlogs_corruption=true

2. 准备重建ctl语句

CREATE CONTROLFILE REUSE DATABASE "DBM" RESETLOGS  NOARCHIVELOG  
    MAXLOGFILES 50  
    MAXLOGMEMBERS 5  
    MAXDATAFILES 100  
    MAXINSTANCES 8  
    MAXLOGHISTORY 226  
LOGFILE  
  GROUP 1 'H:\TEMP\11203\redo01.log'  SIZE 50M,  
  GROUP 2 'H:\TEMP\11203\redo02.log'  SIZE 50M,  
  GROUP 3 'H:\TEMP\11203\redo03.log'  SIZE 50M  
DATAFILE  
'H:\TEMP\11203\system01.dbf'
CHARACTER SET ZHS16GBK  
;  

3. 重建ctl并且resetogs open库

SQL> recover database using backup controlfile until cancel;
ORA-00279: 更改 40438873410 (在 10/21/2022 14:06:16 生成) 对于线程 1 是必需的
ORA-00289: 建议:
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\RDBMS\ARC0000000093_1118545292.0001
ORA-00280: 更改 40438873410 (用于线程 1) 在序列 #93 中


指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'H:\TEMP\11203\SYSTEM01.DBF'


ORA-01112: 未启动介质恢复


SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-01176: data dictionary has more than the 100 files allowed by the
controlfie
进程 ID: 3952
会话 ID: 14 序列号: 3

MAXDATAFILES值不对修改正确值,重建ctl,open库

SQL> RECOVER DATABASE;
完成介质恢复。
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
进程 ID: 6916
会话 ID: 14 序列号: 1

alert日志内容

Database Characterset is ZHS16GBK
Errors in file C:\APP\XFF\diag\rdbms\dbm\test\trace\test_smon_9384.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号  (名称为 "") 过小
Errors in file C:\APP\XFF\diag\rdbms\dbm\test\trace\test_ora_6916.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号  (名称为 "") 过小
Errors in file C:\APP\XFF\diag\rdbms\dbm\test\trace\test_ora_6916.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01555: 快照过旧: 回退段号  (名称为 "") 过小
Error 604 happened during db open, shutting down database
USER (ospid: 6916): terminating the instance due to error 604
Errors in file C:\APP\XFF\diag\rdbms\dbm\test\trace\test_smon_9384.trc  (incident=2521):
ORA-00600: 内部错误代码, 参数: [2662], [9], [1784188335], [9], [1784216952], [6019273], [], [], [], [], [], []
Incident details in: C:\APP\XFF\diag\rdbms\dbm\test\incident\incdir_2521\test_smon_9384_i2521.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Tue Nov 01 10:17:49 2022
Instance terminated by USER, pid = 6916
ORA-1092 signalled during: ALTER DATABASE OPEN...

修改文件头scn,并正常open库

SQL> startup nomount pfile='d:/pfile.txt'
ORACLE 例程已经启动。

Total System Global Area 6413680640 bytes
Fixed Size                  2267184 bytes
Variable Size            1107298256 bytes
Database Buffers         5284823040 bytes
Redo Buffers               19292160 bytes
SQL> alter database mount;

数据库已更改。

SQL> set numw 16
SQL> col CHECKPOINT_TIME for a40
SQL> set lines 150
SQL> set pages 1000
SQL> SELECT status,
  2  to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss') checkpoint_time,FUZZY,checkpoint_change#,
  3  count(*) ROW_NUM
  4  FROM v$datafile_header
  5  GROUP BY status, checkpoint_change#, to_char(checkpoint_time,'yyyy-mm-dd hh24:mi:ss'),fuzzy
  6  ORDER BY status, checkpoint_change#, checkpoint_time;

STATUS  CHECKPOINT_TIME                          FUZ CHECKPOINT_CHANGE#          ROW_NUM
------- ---------------------------------------- --- ------------------ ----------------
OFFLINE                                                               0              121
ONLINE  2022-11-01 10:17:44                      YES        40438893615                1

20221101102342


SQL> alter database open;

数据库已更改。

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------
H:\TEMP\11203\SYSTEM01.DBF
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00002
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00003
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00004
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00005
………………
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00121
C:\APP\XFF\PRODUCT\11.2.0.3\DBHOME_1\DATABASE\MISSING00122

已选择122行。

恢复完成

11.2 crs启动超时dd npohasd 处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:11.2 crs启动超时dd npohasd 处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户由于光纤链路故障导致表决盘异常从而使得主机重启,主机重启之后,集群没有正常启动
操作系统和crs版本

[root@rac1 ~]# cat /etc/redhat-release 
CentOS release 6.9 (Final)
[root@rac1 ~]# sqlplus -v

SQL*Plus: Release 11.2.0.4.0 Production

人工启动crs hang住一段时间然后报错

[root@rac1 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

查看启动进程

[grid@rac1 ~]$ ps -ef|grep d.bin
root       7043      1  0 11:48 ?        00:00:00 /u01/app/grid/product/11.2.0/bin/ohasd.bin reboot
root       8311      1  0 11:53 ?        00:00:00 /u01/app/grid/product/11.2.0/bin/ohasd.bin reboot
grid      10984  10954  0 12:10 pts/2    00:00:00 grep d.bin

根据经验这个故障很可能就是BUG:17229230 – DURING REBOOT, “OHASD.BIN REBOOT” REMAINS SLEEPING,临时解决方案,一个会话启动crs,然后在另外一个会话发起

/bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1

后续crs启动正常

[root@rac1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 ~]# crsctl status res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown   
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                                                   
ora.crf
      1        ONLINE  ONLINE       rac1                                         
ora.crsd
      1        ONLINE  OFFLINE                                                   
ora.cssd
      1        ONLINE  OFFLINE                               STARTING            
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1                                         
ora.ctssd
      1        ONLINE  OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  OFFLINE                                                   
ora.gipcd
      1        ONLINE  ONLINE       rac1                                         
ora.gpnpd
      1        ONLINE  ONLINE       rac1                                         
ora.mdnsd
      1        ONLINE  ONLINE       rac1                                         

终止dd命令,集群启动正常

Patch SCN工具快速解决ORA-600 2662问题

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Patch SCN工具快速解决ORA-600 2662问题

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有一个数据库由于redo异常,强制拉库启动的时候报ORA-600 2662

Sun Oct 23 06:51:13 2022
SMON: enabling cache recovery
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_ora_5016.trc  (incident=264609):
ORA-00600: ??????, ??: [2662], [9], [1784167754], [9], [1784229886], [12583040], [], [], [], [], [], []
Incident details in: C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\incident\incdir_264609\xff01_ora_5016_i264609.trc
Sun Oct 23 06:51:17 2022
Dumping diagnostic data in directory=[cdmp_20221023065117],requested by (instance=1,osid=5016),summary=[incident=264609].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_ora_5016.trc:
ORA-00600: ??????, ??: [2662], [9], [1784167754], [9], [1784229886], [12583040], [], [], [], [], [], []
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_ora_5016.trc:
ORA-00600: ??????, ??: [2662], [9], [1784167754], [9], [1784229886], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 5016): terminating the instance due to error 600
Sun Oct 23 06:51:22 2022
Instance terminated by USER, pid = 5016
ORA-1092 signalled during: alter database open resetlogs...

报错比较明显由于scn问题导致,对于这个问题通过以前研发的Patch_SCN工具一键解决
20221023082341


解决给问题之后,open数据库遭遇ora-600 4194错误

Database Characterset is ZHS16GBK
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_smon_4388.trc  (incident=296569):
ORA-00600: 内部错误代码, 参数: [4194], [], [], [], [], [], [], [], [], [], [], []
Incident details in: C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\incident\incdir_296569\xff01_smon_4388_i296569.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
No Resource Manager plan active
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_ora_1628.trc  (incident=296617):
ORA-00600: 内部错误代码, 参数: [4193], [], [], [], [], [], [], [], [], [], [], []
Incident details in: C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\incident\incdir_296617\xff01_ora_1628_i296617.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Oct 23 08:23:02 2022
Block recovery from logseq 1, block 568 to scn 41438874500
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: E:\ORADATA\ONLINELOG\GROUP_1.261.840661629
  Mem# 1: E:\ORADATA\ONLINELOG\GROUP_1.269.840661631
Block recovery stopped at EOT rba 1.570.16
Block recovery completed at rba 1.570.16, scn 9.2784168835
Block recovery from logseq 1, block 568 to scn 41438874497
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: E:\ORADATA\ONLINELOG\GROUP_1.261.840661629
  Mem# 1: E:\ORADATA\ONLINELOG\GROUP_1.269.840661631
Block recovery completed at rba 1.568.16, scn 9.2784168834
Errors in file C:\APP\ADMINISTRATOR\diag\rdbms\dbm\xff01\trace\xff01_smon_4388.trc:
ORA-01595: 释放区 (2) 回退段 (1) 时出错
ORA-00600: 内部错误代码, 参数: [4194], [], [], [], [], [], [], [], [], [], [], []

处理异常undo问题,数据库open成功,建议业务安排导出数据导入新库,完成本次恢复
Patch_SCN下载:Patch_SCN下载
Patch_SCN使用说明:Patch_SCN使用说明

Controlfile sequence number in file header is different from the one in memory

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Controlfile sequence number in file header is different from the one in memory

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最近一段时间遇到多个客户类似如下错误,由于某种原因导致rac两个节点oracle实例一段时间之后就crash,然后重启,系统非常不稳定,严重影响业务使用,一个节点alert日志有类似日志

Sat Oct 22 13:01:59 2022
Instance recovery: looking for dead threads
********************* ATTENTION: ******************** 
 The controlfile header block returned by the OS
 has a sequence number that is too old. 
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely, 
 please do the following:
 (1) Save all copies of the controlfile for later 
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue: 
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance. 
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database. 
*****************************************************
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
SMON (ospid: 31905): terminating the instance
Sat Oct 22 13:02:03 2022
System state dump requested by (instance=1, osid=31905 (SMON)), summary=[abnormal instance termination].
System State dumped to trace file 
     /u02/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_31863_20221022130203.trc
Sat Oct 22 13:02:04 2022
ORA-1092 : opitsk aborting process
Instance terminated by SMON, pid = 31905

另外的一个节点错误日志如下:

Sat Oct 22 13:02:41 2022
[24610] Successfully onlined Undo Tablespace 5.
Undo initialization finished serial:0 start:152928776 end:152930636 diff:1860 (18 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Sat Oct 22 13:03:10 2022
Error: Controlfile sequence number in file header is different from the one in memory
       Please check that the correct mount options are used if controlfile is located on NFS
USER (ospid: 24610): terminating the instance
Sat Oct 22 13:03:10 2022
System state dump requested by (instance=2, osid=24610), summary=[abnormal instance termination].
System State dumped to trace file 
         /u02/app/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_24486_20221022130310.trc
Dumping diagnostic data in directory=[cdmp_20221022130310], 
requested by (instance=2, osid=24610), summary=[abnormal instance termination].
Instance terminated by USER, pid = 24610

通过节点的日志基本上可以确定是由于Controlfile sequence number异常导致,官方也有类似的文档描述:
Instance Crashed With “Controlfile sequence number in file header is different from the onein memory” (Doc ID 2884958.1)

oracle drop tablespace 恢复最后一招

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:oracle drop tablespace 恢复最后一招

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

客户由于不太熟悉oracle数据库,加入错误的数据文件到一个业务表空间,然后经过一系列操作,最终结果是做了drop tablespace xxx including contents and datafiles操作,导致表空间被删除,而且该数据库未做任何备份和归档.通过检查操作系统和数据库alert日志,确认文件已经从os层面彻底删除
20221022211123


对于这种情况,如果立即保护现场,然后通过反删除软件进行恢复,运气好还可以恢复出来被删除的数据文件,然后再通过dul之类的工具恢复其中数据,这个客户库一直没有关闭,而且尝试各种工具恢复,解决均没有正常恢复出来被删除的几个数据文件.对于这种情况,正常os层面的方法肯定无法恢复了,尝试使用基于block层面技术进行扫描磁盘恢复,结果发现运气还不错,绝大部分block都被找到,参考类似恢复方法:Oracle 数据文件大小为0kb或者文件丢失恢复通过类似分析,找出来绝大多数没有覆盖的block,恢复出来被删除的含数据的file 18,20,21,并通过检测整体恢复效果如下
20221022212822

通过dul工具结合客户提供的表定语以及获取到大表id信息,相互关联,快速恢复客户绝大多数数据,最大限度挽回客户损失.
20221022213506

对于oracle 删除表空间之类的操作,我们可以做到block层面深入恢复,理论上只要你被删除的数据文件在磁盘上还有一个block没有被覆盖,我们都可以把里面的数据恢复出来,最大限度的减少因为这种误操作而引起的损失.如果有类似需求无法自行解决,可以联系我们进行最大限度、最快速度的抢救数据.
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com