Patch_SCN快速解决ORA-600 2663故障

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:Patch_SCN快速解决ORA-600 2663故障

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个维保客户和我说他们测试库删除了日志文件导致库无法启动,让我帮忙看看
客户现场现况
1. 磁盘空间使用100%

[oracle@We1-db_Test ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  880K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vda1        40G   38G   92M 100% /
tmpfs           783M     0  783M   0% /run/user/0

2. 数据库redo被删除了部分,而且是active状态的被删除

[oracle@We1-db_Test ~]$ ls -l /opt/app/oracle/oradata/orcl/redo0*
-rw-r----- 1 oracle oinstall 52429312 Jan 15 15:29 /opt/app/oracle/oradata/orcl/redo04.log
-rw-r----- 1 oracle oinstall 52429312 Jan 15 16:26 /opt/app/oracle/oradata/orcl/redo05.log

SQL> select group#,SEQUENCE#,STATUS FROM V$lOG;

    GROUP#  SEQUENCE# STATUS
---------- ---------- ----------------
         1       8989 CURRENT
         2          0 UNUSED
         5          0 UNUSED
         4          0 UNUSED
         3       8988 ACTIVE

SQL> select member from v$logfile;

MEMBER
-----------------------------------------------------
/opt/app/oracle/oradata/orcl/redo03.log
/opt/app/oracle/oradata/orcl/redo02.log
/opt/app/oracle/oradata/orcl/redo01.log
/opt/app/oracle/oradata/orcl/redo04.log
/opt/app/oracle/oradata/orcl/redo05.log

基于当前情况,直接open库无望,但是空间不足问题需要先解决,不然恢复过程中创建redo空间不足依旧会报错卡死,所以先清理了监听和alert等日志,系统空闲了3G多空间,可以进行恢复操作

[oracle@We1-db_Test trace]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  880K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vda1        40G   34G  3.9G  90% /
tmpfs           783M     0  783M   0% /run/user/0

恢复数据库
1. 由于active redo丢失,毫无疑问,直接强制拉库,使用_allow_resetlogs_corruption参数开干

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 2455228416 bytes
Fixed Size                  2255712 bytes
Variable Size             905970848 bytes
Database Buffers         1526726656 bytes
Redo Buffers               20275200 bytes
Database mounted.
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


SQL> recover database using backup controlfile;
ORA-00279: change 311982775 generated at 12/31/2025 17:35:11 needed for thread
1
ORA-00289: suggestion :
/opt/app/oracle/fast_recovery_area/ORCL/archivelog/2026_01_16/o1_mf_1_8988_%u_.a
rc
ORA-00280: change 311982775 for thread 1 is in sequence #8988


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2663], [0], [311982792], [0],
[311982833], [], [], [], [], [], [], []
Process ID: 11917
Session ID: 576 Serial number: 3

alert日志报错

Fri Jan 16 21:25:31 2026
Assigning activation ID 1750515127 (0x6856bdb7)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /opt/app/oracle/oradata/orcl/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Jan 16 21:25:31 2026
SMON: enabling cache recovery
Errors in file /opt/app/oracle/diag/rdbms/orcl/we1db/trace/we1db_ora_11917.trc  (incident=81753):
ORA-00600: internal error code, arguments: [2663], [0], [311982792], [0], [311982833], [], []
Incident details in: /opt/app/oracle/diag/rdbms/orcl/we1db/incident/incdir_81753/we1db_ora_11917_i81753.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/app/oracle/diag/rdbms/orcl/we1db/trace/we1db_ora_11917.trc:
ORA-00600: internal error code, arguments: [2663], [0], [311982792], [0], [311982833], [], []
Errors in file /opt/app/oracle/diag/rdbms/orcl/we1db/trace/we1db_ora_11917.trc:
ORA-00600: internal error code, arguments: [2663], [0], [311982792], [0], [311982833], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 11917): terminating the instance due to error 600
Instance terminated by USER, pid = 11917
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (11917) as a result of ORA-1092
Fri Jan 16 21:25:33 2026
ORA-1092 : opitsk aborting process

不幸数据库遇到ORA-600 2663错误,这个故障在以前的文章中描述过,基本上和ORA-600 2662的处理思路类似,这里直接使用:Patch_SCN for Linux进行恢复
2. 使用Patch_SCN处理数据库SCN

SQL> startup nomount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 2455228416 bytes
Fixed Size                  2255712 bytes
Variable Size             905970848 bytes
Database Buffers         1526726656 bytes
Redo Buffers               20275200 bytes
SQL>@rectl

Control file created.

SQL> recover database;
Media recovery complete.

patch_scn


SQL> alter database open;

Database altered.
SQL> SELECT CURRENT_SCN FROM V$DATABASE;

     CURRENT_SCN
----------------
       322002903

到这里完成数据库open操作,后续逻辑导出完成恢复任务

在生产环境错误执行dd命令破坏asm磁盘故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:在生产环境错误执行dd命令破坏asm磁盘故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于ssh登录错误,客户对生产环境进行了误操作把系统的一块磁盘dd到另外两个磁盘上,由于及时发现立马进行了终止操作,但是还是分别破坏了一点数据(一块盘破坏了2G多,另外一块盘破坏了1G多)
dd


通过分析udev的绑定关系确认被破坏的asm disk名称
udev

再通过asm alert日志确认破坏磁盘在asm disk中情况
asmdisk

通过上述信息基本上可以确认,asmdisk13被分别dd到了asmdisk11和asmdisk26中了部分11
26

基于这种情况,由于asm disk被破坏了1-2G多,直接修复然后正常mount磁盘组基本上没有希望,经过分析以及与客户沟通,确认他们改系统是4节点组成的集群,1/2节点上面跑2套库,3/4节点上跑2套库,数据整体放在data_dg磁盘组中,需要恢复的库是第二个顺序创建的1套库(4套库中只需恢复一套即可),由于破坏的数据本身不多,而且需要恢复的数据不是最初写入asm磁盘组,基于这样的情况,需要恢复的数据库机会比较大.

由于现在三个磁盘头信息一致(一个磁盘被dd到另外两个磁盘上),因此第一步先把损坏的两个磁盘头进行简单修复,为了便于amdu(找回ASM中数据文件)等之类数据可以识别到正确的磁盘头信息,然后进行后续的数据文件提取恢复.使用工具对数据文件进行了批量提取,提取数据完成之后,尝试recover和open库
open

虽然数据库正常打开了,不过很不幸,后台还是有一些坏块报错,通过dbv检查发现有文件有一部分坏块,类似dbv报错信息
1_1

通过分析该文件在磁盘组中各个磁盘的分布情况
map

确认该文件确实有部分block分布在被dd磁盘的破坏的范围内这个部分的数据丢失无法挽回,只能是定位到具体对象然后由业务想办法处理.相对以往的各种dd破坏的案例恢复而言,这个应该是效果比较好的一个,而且也是恢复比较容易的一个,没有使用到asm disk 基于asm au/oracle block 扫描的级别,而且system表空间没有任何损坏,数据库甚至直接open成功了,以往的一些dd案例列举:
asm磁盘dd破坏恢复
dd破坏asm磁盘头恢复
asm disk 磁盘部分被清空恢复

obet快速修改scn/resetlogs恢复数据库(缺少归档,ORA-00308)

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:obet快速修改scn/resetlogs恢复数据库(缺少归档,ORA-00308)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

最初是由于业务系统报ORA-01578导致
blk


然后相关人员上来进行了一系列操作,包括但不限于offline文件,create datafile 44,recover datafile,还尝试做resetlogs,使用备份控制文件还原等等.我接手该故障的时候状态
resetlog
44号文件状态是12月1日的,而且resetlogs信息也不对.通过和客户沟通,确认是他们在没有备份44号文件的前提下直接执行了类似alter database create datafile 44的命令,但是在应用了写归档之后,发现提示有归档不存在

ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log /data/app/oracle/archive/orcl/1_181285_1103629544.dbf
ORA-279 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Media Recovery Log /data/app/oracle/archive/orcl/1_181286_1103629544.dbf
Tue Jan 06 12:48:12 2026
Errors with log /data/app/oracle/archive/orcl/1_181286_1103629544.dbf
Errors in file /data/app/oracle/diag/rdbms/orcl8/orcl/trace/orcl_pr00_8338.trc:
ORA-00308: cannot open archived log '/data/app/oracle/archive/orcl/1_181286_1103629544.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

分析现存归档,确实发现12月1日之后丢失了部分归档日志(归档不连续)
arch


基于这种情况(故障数据文件直接被覆盖,归档又出现了多天的连续断档,而且没有有效备份),只能先打开数据库,然后根据情况导出数据,然后导入到新库中.这里我直接使用obet来修改resetlogs和checkpoint 信息,实现数据库快速打开

OBET> help
OBET (Oracle Block Editor Tool) commands:
  open <config_file>    - Load file list from config file (format: <num> <path>)
  info                  - Show loaded file list (from open command)
  set filename <path>   - Set target file path (required)
  set file <num>        - Set filename using loaded file number (from open list)
  set blocksize <size>  - Set block size (2048,4096,[8192],16384,32768)
  set block <num>       - Set block number (starts from 0, default: 1)
  set offset <offset>   - Set offset within block (< blocksize, default: 0)
  set count <bytes>     - Set number of bytes to read (default: 32)
  set mode edit/browse  - Enable edit/browse mode
  d/dump [options]      - Display data (options: block X, offset Y, count N)
  m/modify <hex> [opts] - Modify data with hex (opts: block X, offset Y)
  undo                  - Undo last modification
  sum [block X]         - Calculate checksum for block (default: current block)
  sum apply [block X]   - Apply checksum: write calculated value to block
  tailchk [block X]     - Calculate tailchk for block (default: current block)
  tailchk apply [block X] - Apply tailchk: write calculated value to block
  repair [block X]      - Repair bad block (fix seq_kcbh, tailchk, checksum)
  copy data <src> to <dest>  - Copy data between files
    <src> format: file,block,offset,count (e.g., 1,1,10,64)
    <dest> format: file[,block][,offset] (e.g., 3 or 3,1 or 3,1,128)
  copy block file#,block# to file#,block#     - Copy entire data block
  copy chkscn file n to file m         - Copy datafile header checkpoint SCN info
  copy resetlogscn file n to file m    - Copy datafile header resetlogs info
  backup [file n] [block m] - Backup data block (optional file and block parameters)
  corrupt [block X]     - Mark block as corrupted (default: current block)
  show                  - Display current settings (filename, blocksize, block, offset, count, mode)
  license               - Show/manage software license (registration code required)
  version               - Show software version and developer information
  p/print <param>       - Print Oracle structure,Use the 'p/print' command to see details
  undo                  - Undo the last copy chkscn or copy resetlogscn operation
  spool <file>          - Start logging to file (spool off to stop)
  help                  - Show this help message
  quit/exit             - Exit OBET

OBET> open 1.txt                 
Loaded 2 files from config file '1.txt'.

OBET> info

Loaded files (2 total):
----------------------------------------
Number  Path
----------------------------------------
     1  /data/app/oracle/oradata/orcl/system01.dbf
    44  /data/app/oracle/oradata/orcl/xff26.dbf
----------------------------------------

OBET> set file 44
filename set to: /data/app/oracle/oradata/orcl/xff26.dbf (file#44)

OBET> d

File: /data/app/oracle/oradata/orcl/xff26.dbf
Block: 1                Offsets:     0 to    31
--------------------------------------------------------------------------------
0xifenfei000 0BA20000 0100000B 00000000 00000104 B05F0000 00000000 0004200B 65583D61 

<32 bytes read>


OBET> backup file 44 block 1
Created backup directory: backup_blk
Successfully backed up block 1 from file#44 to /tmp/backup_blk/xff26.dbf_1.20260106182405

OBET> copy resetlogscn file 1 to file 44
Error: Edit mode not enabled. Use 'set mode edit' first.

OBET> set mode edit     
mode set to: edit

OBET> copy resetlogscn file 1 to file 44

Confirm Modify resetlogscn:
Source: file#1 (/data/app/oracle/oradata/orcl/system01.dbf)
Target: file#44 (/data/app/oracle/oradata/orcl/xff26.dbf)
Proceed? (Y/YES to confirm): y
Successfully copied resetlog SCN information from file#1 to file#44.

OBET> copy chkscn file 1 to file 44

Confirm Modify chkscn:
Source: file#1 (/data/app/oracle/oradata/orcl/system01.dbf)
Target: file#44 (/data/app/oracle/oradata/orcl/xff26.dbf)
Proceed? (Y/YES to confirm): y
Successfully copied checkpoint SCN information from file#1 to file#44.

OBET> tailchk
Check tailchk for File /data/app/oracle/oradata/orcl/xff26.dbf, Block 1:
current = 0x010B0000, required = 0x010B0000

OBET> sum
Check value for File /data/app/oracle/oradata/orcl/xff26.dbf, Block 1:
current = 0xB05F, required = 0x6DA3

OBET> sum apply

Confirm applying checksum:
File: /data/app/oracle/oradata/orcl/xff26.dbf
Block: 1
Offset in block: 16 (file offset: 0x0xifenfei010)
Original value: 0xB05F
New value:      0x6DA3
Confirm? (Y/YES to proceed): y
Verification successful: Stored checksum matches calculated value (0x6DA3).
Checksum applied successfully.

然后重建ctl正常打开数据库

[oracle@xifenfei check_db]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Jan 6 18:29:22 2026

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 6.8413E+10 bytes
Fixed Size		    2270360 bytes
Variable Size		 3.3554E+10 bytes
Database Buffers	 3.4762E+10 bytes
Redo Buffers		   93585408 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01190: control file or data file 1 is from before the last RESETLOGS
ORA-01110: data file 1: '/data/app/oracle/oradata/orcl/system01.dbf'


SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> 
SQL> 
SQL> startup nomount pfile='/tmp/pfile
';
ORACLE instance started.

Total System Global Area 6.8413E+10 bytes
Fixed Size		    2270360 bytes
Variable Size		 3.3554E+10 bytes
Database Buffers	 3.4762E+10 bytes
Redo Buffers		   93585408 bytes
SQL> @ctl.sql

Control file created.

SQL> SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required


SQL> alter database open;

Database altered.

后续根据情况选择导出表,对于导出报错的表再进行二次处理.

使用DBMS_PDB.RECOVER抢救单个pdb

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:使用DBMS_PDB.RECOVER抢救单个pdb

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于存储异常导致数据库启动报ORA-1172错误
ORA-1172


尝试recover datafile 1 报ORA-600 kdBlkCheckError错误

Wed Jan 07 14:31:17 2026
Decreasing number of real time LMS from 5 to 0
ALTER DATABASE RECOVER  datafile 1
Wed Jan 07 14:31:21 2026
Media Recovery Start
Wed Jan 07 14:31:21 2026
Serial Media Recovery started
Wed Jan 07 14:31:21 2026
Recovery of Online Redo Log: Thread 2 Group 24 Seq 2584 Reading mem 0
  Mem# 0: +DATA/CDBDG/ONLINELOG/group_24.393.1122613933
  Mem# 1: +ARCH/CDBDG/ONLINELOG/group_24.276.1122613933
Wed Jan 07 14:31:21 2026
Recovery of Online Redo Log: Thread 1 Group 18 Seq 13823 Reading mem 0
  Mem# 0: +DATA/CDBDG/ONLINELOG/group_18.387.1122613923
  Mem# 1: +ARCH/CDBDG/ONLINELOG/group_18.270.1122613925
Wed Jan 07 14:31:22 2026
Recovery of Online Redo Log: Thread 1 Group 19 Seq 13824 Reading mem 0
  Mem# 0: +DATA/CDBDG/ONLINELOG/group_19.388.1122613925
  Mem# 1: +ARCH/CDBDG/ONLINELOG/group_19.271.1122613925
Errors in file /u01/app/oracle/diag/rdbms/cdbdg/cdb1/trace/cdb1_ora_2753010.trc (incident=806358) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [1], [105538], [6401], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/cdbdg/cdb1/incident/incdir_806358/cdb1_ora_2753010_i806358.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 07 14:31:25 2026
Media Recovery failed with error 10562
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 1  ...

通过dbv检查file 1(system)数据文件,发现大量坏块

bash-4.2$ dbv file=system.271.1122584815

DBVERIFY: Release 12.1.0.2.0 - Production on Wed Jan 7 20:15:26 2026

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /data/bak/system.271.1122584815

DBV-00200: Block, DBA 4195214, already marked corrupt

DBV-00200: Block, DBA 4210332, already marked corrupt

DBV-00200: Block, DBA 4210333, already marked corrupt
……………………
DBV-00200: Block, DBA 4312889, already marked corrupt

DBV-00200: Block, DBA 4313119, already marked corrupt

DBV-00200: Block, DBA 4313122, already marked corrupt


DBVERIFY - Verification complete

Total Pages Examined         : 524288
Total Pages Processed (Data) : 79989
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 20781
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 408840
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 14678
Total Pages Marked Corrupt   : 33
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 3135037281 (4374.3135037281)

分析发现这个库是包含了多个pdb,客户的业务主要在con_id=3的pdb中

bash-4.2$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Jan 7 20:27:42 2026

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       MOUNTED
         3 XFF                            MOUNTED
         4 DYCW                           MOUNTED
SQL>

尝试对除cdb的system/sysaux之外的文件进行recover操作均成功
2


对于这种情况,比较好的处理思路就是通过对单独的pdb文件使用DBMS_PDB.RECOVER处理,然后插入到一个新的cdb中,第一步使用dbca创建一个和当前库字符集一样的cdb

dbca -silent -ignorePreReqs -createDatabase -templateName $ORACLE_HOME/assistants/dbca/templates/General_Purpose.dbc \
-responseFile NO_VALUE \
-gdbname oratest  -sid oratest \
-createAsContainerDatabase TRUE \
-sysPassword oracle -systemPassword oracle -pdbAdminPassword oracle -dbsnmpPassword oracle \
-datafileDestination '/data/oradata' \
-redoLogFileSize 200 \
-characterset ZHS16GBK -nationalCharacterSet AL16UTF16 \
-totalMemory 4096 \
-databaseType MULTIPURPOSE \
-emConfiguration NONE

然后利用老的pdb生成需要的xml文件

BEGIN
DBMS_PDB.RECOVER(PDB_DESCR_FILE=> '/tmp/XFF.xml', PDB_NAME=>'XFF', 
FILENAMES=> '/data/bak/system.268.1122584817,
/data/bak/system.441.1125224053,
/data/bak/sysaux.269.1122584817,
…………
/data/bak/odin.312.1122593153,
/data/bak/dsg.314.1122593223');
END; 
/

利用该xml文件,直接把这个pdb插入到新cdb中

Wed Jan 07 14:40:52 2026
create pluggable database XFF using '/tmp/XFF.xml' nocopy
Wed Jan 07 14:41:29 2026
Session (221,18860): EXTERNAL SCN ALERT: Advanced SCN by 3185580 minutes worth to 0x1116.bade672c,
    by Pluggable Database plug-in
Session (221,18860): EXTERNAL SCN SOURCE: Inbound connection from client
Session (221,18860): EXTERNAL SCN SOURCE: DB Logon User: SYS, Remote Machine: p770a,
  Program: sqlplus@p770a (TNS V1-V3),OS User: oracle
****************************************************************
Pluggable Database XFF with pdb id - 3 is created as UNUSABLE.
If any errors are encountered before the pdb is marked as NEW,
then the pdb must be dropped
****************************************************************
Database Characterset for XFF is ZHS16GBK
Deleting old file#8 from file$ 
Deleting old file#9 from file$ 
Deleting old file#10 from file$ 
…………
Deleting old file#148 from file$ 
Deleting old file#199 from file$ 
Deleting old file#200 from file$ 
Adding new file#9 to file$(old file#8) 
Adding new file#8 to file$(old file#145) 
Adding new file#10 to file$(old file#9) 
…………
Adding new file#19 to file$(old file#148) 
Adding new file#18 to file$(old file#199) 
Adding new file#17 to file$(old file#200) 
Marking tablespace #2 invalid since it is not present   in the describe file 
Successfully created internal service XFF at open
ALTER SYSTEM: Flushing buffer cache inst=0 container=3 local
Wed Jan 07 14:41:49 2026
****************************************************************
Post plug operations are now complete.
Pluggable database XFF with pdb id - 3 is now marked as NEW.
****************************************************************
Completed: create pluggable database XFF using '/tmp/XFF.xml' nocopy
alter database open
Wed Jan 07 14:43:21 2026
Pluggable database XFF dictionary check beginning
Tablespace 'TEMP' #2 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Pluggable Database XFF Dictionary check complete
Database Characterset for XFF is ZHS16GBK
*********************************************************************
WARNING: The following temporary tablespaces in container(XFF)
         contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
 
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
 
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
***************************************************************
WARNING: Pluggable Database XFF with pdb id - 3 is
         altered with errors or warnings. Please look into
         PDB_PLUG_IN_VIOLATIONS view for more details.
***************************************************************
Opening pdb XFF (3) with no Resource Manager plan active
Pluggable database XFF opened read write
Completed: alter database open
alter tablespace temp add tempfile '/data/bak/temp01.dbf' size 16m autoextend on
Completed: alter tablespace temp add tempfile '/data/bak/temp01.dbf' size 16m autoextend on
ALTER PLUGGABLE DATABASE CLOSE IMMEDIATE
Wed Jan 07 14:46:08 2026
Immediate Kill Session#: 366, Serial#: 2995
Immediate Kill Session: sess: 7000100bafb5f58  OS pid: 5637924
ALTER SYSTEM: Flushing buffer cache inst=0 container=3 local
Wed Jan 07 14:46:22 2026
Pluggable database XFF closed
Completed: ALTER PLUGGABLE DATABASE CLOSE IMMEDIATE
alter database open
Wed Jan 07 14:46:48 2026
Database Characterset for XFF is ZHS16GBK
***************************************************************
WARNING: Pluggable Database XFF with pdb id - 3 is
         altered with errors or warnings. Please look into
         PDB_PLUG_IN_VIOLATIONS view for more details.
***************************************************************
Opening pdb XFF (3) with no Resource Manager plan active
Pluggable database XFF opened read write
Completed: alter database open

3


这次4T多的aix环境rac数据库的单个pdb通过DBMS_PDB.RECOVER方法非常完美的恢复,DBMS_PDB.RECOVER是拯救单个pdb的神兵利器.

数据库启动报 maximum number of processes () exceeded分析

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:数据库启动报 maximum number of processes () exceeded分析

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

比较少遇到打开一个数据库报ORA-00020: maximum number of processes () exceeded,导致open数据库失败

SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00450: background process 'QMNC' did not start
ORA-00444: background process "QMNC" failed while starting
ORA-00020: maximum number of processes () exceeded
进程 ID: 6340
会话 ID: 109 序列号: 3

看这个报错应该是由于进程数达到了限制,导致在open数据库过程中启动后台进程QMNC失败,导致open过程失败,分析进程数不足的原因,查看启动时候的参数配置

Fri Dec 26 16:48:27 2025
Adjusting the default value of parameter parallel_max_servers
from 2560 to 135 due to the value of parameter processes (150)
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Fri Dec 26 16:48:38 2025
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =28
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Windows NT Version V6.2  
CPU                 : 64 - type 8664, 32 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:82501M/130752M, Ph+PgF:103658M/150208M 
Using parameter settings in client-side pfile D:\PFILE.TXT on machine HISSERVER
System parameters with non-default values:
  processes                = 150
  sga_target               = 39296M
  control_files            = "D:\ORACLE64\ORADATA\ORA817\CONTROL01.CTL"
  control_files            = "D:\ORACLE64\RECOVERY_AREA\ORA817\CONTROL02.CTL"
  db_block_size            = 8192
  compatible               = "11.2.0.0.0"
  db_recovery_file_dest    = "D:\ORACLE64\recovery_area"
  db_recovery_file_dest_size= 4122M
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=ora817XDB)"
  audit_file_dest          = "D:\ORACLE64\ADMIN\ORA817\ADUMP"
  audit_trail              = "DB"
  db_name                  = "ora817"
  open_cursors             = 300
  pga_aggregate_target     = 13075M
  diagnostic_dest          = "D:\ORACLE64"
Fri Dec 26 16:49:10 2025
PMON started with pid=2, OS id=1992 
Fri Dec 26 16:49:10 2025
PSP0 started with pid=3, OS id=6744 
Fri Dec 26 16:49:12 2025
VKTM started with pid=4, OS id=7592 at elevated priority
Fri Dec 26 16:49:12 2025
GEN0 started with pid=5, OS id=7780 
Fri Dec 26 16:49:12 2025
DBRM started with pid=7, OS id=5372 
Fri Dec 26 16:49:12 2025
MMAN started with pid=9, OS id=6700 
Fri Dec 26 16:49:12 2025
DBW0 started with pid=10, OS id=1920 
Fri Dec 26 16:49:12 2025
DBW1 started with pid=11, OS id=4476 
Fri Dec 26 16:49:12 2025
DBW2 started with pid=12, OS id=6788 
Fri Dec 26 16:49:12 2025
DBW3 started with pid=13, OS id=4608 
Fri Dec 26 16:49:12 2025
DBW4 started with pid=14, OS id=6716 
Fri Dec 26 16:49:12 2025
DBW5 started with pid=15, OS id=6252 
Fri Dec 26 16:49:12 2025
DBW6 started with pid=16, OS id=7948 
Fri Dec 26 16:49:12 2025
DBW7 started with pid=17, OS id=5868 
Fri Dec 26 16:49:12 2025
LGWR started with pid=18, OS id=1644 
Fri Dec 26 16:49:12 2025
CKPT started with pid=19, OS id=5704 
Fri Dec 26 16:49:12 2025
SMON started with pid=20, OS id=4532 
Fri Dec 26 16:49:12 2025
RECO started with pid=21, OS id=380 
Fri Dec 26 16:49:12 2025
MMON started with pid=22, OS id=1316 
Fri Dec 26 16:49:12 2025
MMNL started with pid=23, OS id=6108 
Fri Dec 26 16:49:12 2025
DIAG started with pid=6, OS id=7904 
Fri Dec 26 16:49:12 2025
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = D:\ORACLE64

这里可以看出来数据库版本是11.2.0.3,processes配置为默认值150.进一步查看启动报错时候的alert日志

ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Fri Dec 26 17:03:44 2025
Media Recovery failed with error 264
Slave exiting with ORA-283 exception
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_pr00_344.trc:
ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
alter database open
Fri Dec 26 17:03:48 2025
Thread 1 opened at log sequence 37301
  Current log# 2 seq# 37301 mem# 0: D:\ORACLE64\ORADATA\ORA817\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Dec 26 17:03:49 2025
SMON: enabling cache recovery
[6340] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:5143437 end:5143718 diff:281 (2 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Fri Dec 26 17:03:49 2025
Database Characterset is ZHS16GBK
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
ORA-00020: maximum number of processes (150) exceeded
 ORA-20 errors will not be written to the alert log for
 the next minute. Please look at trace files to see all
 the ORA-20 errors.
Process P123 submission failed with error = 20
Starting background process QMNC
Process QMNC submission failed with error = 20
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_ora_6340.trc:
ORA-00444: 后台进程 "QMNC" 启动失败
ORA-00020: 超出最大进程数 ()
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_ora_6340.trc:
ORA-00450: 后台进程 'QMNC' 未启动
ORA-00444: 后台进程 "QMNC" 启动失败
ORA-00020: 超出最大进程数 ()
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_ora_6340.trc:
ORA-00450: 后台进程 'QMNC' 未启动
ORA-00444: 后台进程 "QMNC" 启动失败
ORA-00020: 超出最大进程数 ()
Error 450 happened during db open, shutting down database
USER (ospid: 6340): terminating the instance due to error 450
Fri Dec 26 17:03:50 2025
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_p036_8044.trc  (incident=4105):
ORA-01578: ORACLE data block corrupted (file # 9, block # 504643)
ORA-01110: data file 9: 'D:\ORACLE64\ORADATA\ORA817\HRPRUN_DATA05.DBF'
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_4105\ora817_p036_8044_i4105.trc
Fri Dec 26 17:03:50 2025
Process m000 submission failed with error = 20
Fri Dec 26 17:04:07 2025
Instance terminated by USER, pid = 6340
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (6340) as a result of ORA-1092
Fri Dec 26 17:04:09 2025
ORA-1092 : opitsk aborting process

看alert日志比较明显有Process P123 submission failed with error = 20提示,这里表示启动P123进程时候也遭遇到了进程数不足的限制导致该进程无法启动,而Pxxx进程是oracle的并行进程(而且在后面还报了由于坏块导致P036进程报错的信息),也就是说这个数据库在启动过程中启动了大量的并行进程(至少122个已经启动),这也就是说明了为什么150个进程不足的原因.在数据库完成实例恢复,完全open之前启动并行进程进行恢复的,很可能就是并行回滚操作.为了进一步验证猜想,把processes调整到1500,然后尝试启动库

Fri Dec 26 17:06:44 2025
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 20 KB redo, 5 data blocks need recovery
Started redo application at
 Thread 1: logseq 37301, block 18209
Recovery of Online Redo Log: Thread 1 Group 2 Seq 37301 Reading mem 0
  Mem# 0: D:\ORACLE64\ORADATA\ORA817\REDO02.LOG
Completed redo application of 0.02MB
Completed crash recovery at
 Thread 1: logseq 37301, block 18249, scn 327081807
 5 data blocks read, 5 data blocks written, 20 redo k-bytes read
Fri Dec 26 17:06:46 2025
Thread 1 advanced to log sequence 37302 (thread open)
Thread 1 opened at log sequence 37302
  Current log# 3 seq# 37302 mem# 0: D:\ORACLE64\ORADATA\ORA817\REDO03.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Dec 26 17:06:47 2025
SMON: enabling cache recovery
[7104] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:5321812 end:5322031 diff:219 (2 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Fri Dec 26 17:06:49 2025
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_p091_5924.trc  (incident=5745):
ORA-01578: ORACLE data block corrupted (file # 9, block # 504643)
ORA-01110: data file 9: 'D:\ORACLE64\ORADATA\ORA817\HRPRUN_DATA05.DBF'
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_5745\ora817_p091_5924_i5745.trc
Starting background process QMNC
Fri Dec 26 17:06:49 2025
QMNC started with pid=156, OS id=8012 
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_p091_5924.trc  (incident=5746):
ORA-01578: ORACLE data block corrupted (file # , block # )
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_5746\ora817_p091_5924_i5746.trc
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_smon_7424.trc  (incident=4961):
ORA-01578: ORACLE data block corrupted (file # 9, block # 504643)
ORA-01110: data file 9: 'D:\ORACLE64\ORADATA\ORA817\HRPRUN_DATA05.DBF'
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_4961\ora817_smon_7424_i4961.trc
Completed: alter database open
SMON: Restarting fast_start parallel rollback
Fri Dec 26 17:06:51 2025
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_p000_5472.trc  (incident=5009):
ORA-01578: ORACLE data block corrupted (file # 9, block # 504643)
ORA-01110: data file 9: 'D:\ORACLE64\ORADATA\ORA817\HRPRUN_DATA05.DBF'
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_5009\ora817_p000_5472_i5009.trc
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_p000_5472.trc  (incident=5010):
ORA-01578: ORACLE data block corrupted (file # , block # )
Errors in file D:\ORACLE64\diag\rdbms\ora817\ora817\trace\ora817_smon_7424.trc  (incident=4962):
ORA-01578: ORACLE data block corrupted (file # 9, block # 504643)
ORA-01110: data file 9: 'D:\ORACLE64\ORADATA\ORA817\HRPRUN_DATA05.DBF'
Incident details in: D:\ORACLE64\diag\rdbms\ora817\ora817\incident\incdir_5010\ora817_p000_5472_i5010.trc

这次就比较清晰了,数据库在open过程中由于smon启动了并行回滚并遇到了坏块,导致P091进程和smon进程错误。因为数据库已经open,这个坏块导致无法回滚,直接对这个对象进行处理即可解决掉这个回滚报错问题.