asm磁盘类似_DROPPED_0001_DATA名称故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:asm磁盘类似_DROPPED_0001_DATA名称故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

发现一客户数据库的asm磁盘组中有磁盘掉线(通过分析日志确认2016年就已经掉线,而且不在做rebalance)
20201205195855


20201205221937

进一步检查

SQL> /

NAME			       PATH		  GROUP_NUMBER DISK_NUMBER MOUNT_STATUS   HEADER_STATUS
------------------------------ --------------------- ------------ ----------- -------------- ------------------------
MODE_STATUS    STATE		FAILGROUP
-------------- ---------------- --------------------
			       ORCL:DATA2	  0		 0 CLOSED	  MEMBER
ONLINE	       NORMAL

			       ORCL:FLASH1	  0		 1 CLOSED	  MEMBER
ONLINE	       NORMAL

			       ORCL:GRID3	  0		 2 CLOSED	  MEMBER
ONLINE	       NORMAL

_DROPPED_0000_FLASH				  2		 0 MISSING	  UNKNOWN
OFFLINE        FORCING		FLASH1

_DROPPED_0001_DATA				  1		 1 MISSING	  UNKNOWN
OFFLINE        FORCING		DATA2

DATA1			       ORCL:DATA1	  1		 0 CACHED	  MEMBER
ONLINE	       NORMAL		DATA1

FLASH2			       ORCL:FLASH2	  2		 1 CACHED	  MEMBER
ONLINE	       NORMAL		FLASH2

GRID1			       ORCL:GRID1	  3		 0 CACHED	  MEMBER
ONLINE	       NORMAL		GRID1

GRID2			       ORCL:GRID2	  3		 1 CACHED	  MEMBER
ONLINE	       NORMAL		GRID2

GRID4			       ORCL:GRID4	  3		 3 CACHED	  MEMBER
ONLINE	       NORMAL		GRID4

GRID5			       ORCL:GRID5	  3		 4 CACHED	  MEMBER
ONLINE	       NORMAL		GRID5

GRID6			       ORCL:GRID6	  3		 5 CACHED	  MEMBER
ONLINE	       NORMAL		GRID6


12 rows selected.


SQL> select NAME,STATE,TYPE,OFFLINE_DISKS from v$asm_diskgroup;

NAME
------------------------------------------------------------
STATE		       TYPE	    OFFLINE_DISKS
---------------------- ------------ -------------
DATA
MOUNTED 	       NORMAL			1

FLASH
MOUNTED 	       NORMAL			1

GRID
MOUNTED 	       NORMAL			0

主要问题是由于ORCL:FLASH1和ORCL:DATA2磁盘掉线导致处于_DROPPED_0000_FLASH和_DROPPED_0001_DATA状态.底层检查,确定现在这些磁盘都正常.然后使用force命令进行强制增加掉线的磁盘到对应的磁盘组中

SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1'  force;

Diskgroup altered.

SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2'  force;

Diskgroup altered.

观察asm 日志,等rebalance完成

Sat Dec 05 16:48:10 2020
SQL> alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1'  force 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (2,2) to disk (ORCL:FLASH1)
NOTE: requesting all-instance membership refresh for group=2
NOTE: initializing header on grp 2 disk FLASH1
NOTE: requesting all-instance disk validation for group=2
Sat Dec 05 16:48:13 2020
NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance.
NOTE: requesting all-instance disk validation for group=2
NOTE: skipping rediscovery for group 2/0x58e713e7 (FLASH) on local instance.
Sat Dec 05 16:48:19 2020
GMON updating for reconfiguration, group 2 at 14 for pid 34, osid 12203
NOTE: group 2 PST updated.
NOTE: initiating PST update: grp = 2
GMON updating group 2 at 15 for pid 34, osid 12203
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 2 completed successfully 
NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH)
GMON querying group 2 at 16 for pid 18, osid 41180
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: cache opening disk 2 of grp 2: FLASH1 label:FLASH1
NOTE: Attempting voting file refresh on diskgroup FLASH
NOTE: Refresh completed on diskgroup FLASH. No voting file found.
GMON querying group 2 at 17 for pid 18, osid 41180
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
Sat Dec 05 16:48:25 2020
SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH)
Sat Dec 05 16:48:25 2020
SUCCESS: alter diskgroup FLASH add failgroup flg1 disk 'ORCL:FLASH1'  force
NOTE: starting rebalance of group 2/0x58e713e7 (FLASH) at power 1
Starting background process ARB0
Sat Dec 05 16:48:26 2020
ARB0 started with pid=36, OS id=12451 
NOTE: assigning ARB0 to group 2/0x58e713e7 (FLASH) with 1 parallel I/O
cellip.ora not found.
NOTE: F1X0 copy 2 relocating from 0:2 to 2:2 for diskgroup 2 (FLASH)
NOTE: Attempting voting file refresh on diskgroup FLASH
NOTE: Refresh completed on diskgroup FLASH. No voting file found.
Sat Dec 05 16:48:45 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group FLASH
Sat Dec 05 16:49:06 2020
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 2/0x58e713e7 (FLASH)
Sat Dec 05 16:49:08 2020
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=2
Sat Dec 05 16:49:11 2020
GMON updating for reconfiguration, group 2 at 18 for pid 36, osid 12681
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: group 2 PST updated.
SUCCESS: grp 2 disk _DROPPED_0000_FLASH going offline 
GMON updating for reconfiguration, group 2 at 19 for pid 36, osid 12681
NOTE: cache closing disk 0 of grp 2: (not open) _DROPPED_0000_FLASH
NOTE: group FLASH: updated PST location: disk 0001 (PST copy 0)
NOTE: group FLASH: updated PST location: disk 0002 (PST copy 1)
NOTE: group 2 PST updated.
NOTE: membership refresh pending for group 2/0x58e713e7 (FLASH)
GMON querying group 2 at 20 for pid 18, osid 41180
GMON querying group 2 at 21 for pid 18, osid 41180
NOTE: Disk _DROPPED_0000_FLASH in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 2/0x58e713e7 (FLASH)
Sat Dec 05 16:51:56 2020
SQL> alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2'  force 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (1,2) to disk (ORCL:DATA2)
NOTE: requesting all-instance membership refresh for group=1
NOTE: initializing header on grp 1 disk DATA2
NOTE: requesting all-instance disk validation for group=1
Sat Dec 05 16:51:57 2020
NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance.
NOTE: requesting all-instance disk validation for group=1
NOTE: skipping rediscovery for group 1/0x58d713e6 (DATA) on local instance.
Sat Dec 05 16:52:02 2020
GMON updating for reconfiguration, group 1 at 22 for pid 34, osid 12203
NOTE: group 1 PST updated.
NOTE: initiating PST update: grp = 1
GMON updating group 1 at 23 for pid 34, osid 12203
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
NOTE: PST update grp = 1 completed successfully 
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 24 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: cache opening disk 2 of grp 1: DATA2 label:DATA2
Sat Dec 05 16:52:08 2020
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
GMON querying group 1 at 25 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
Sat Dec 05 16:52:08 2020
SUCCESS: alter diskgroup data add failgroup dg2 disk 'ORCL:DATA2'  force
NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 1
Starting background process ARB0
Sat Dec 05 16:52:08 2020
ARB0 started with pid=37, OS id=13463 
NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 1 parallel I/O
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 16:52:44 2020
cellip.ora not found.
NOTE: F1X0 copy 2 relocating from 1:2 to 2:2 for diskgroup 1 (DATA)
Sat Dec 05 16:53:22 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 27 for pid 18, osid 41180
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
SUCCESS: alter diskgroup data rebalance power 11
NOTE: starting rebalance of group 1/0x58d713e6 (DATA) at power 11
Starting background process ARB0
Sat Dec 05 17:27:52 2020
ARB0 started with pid=35, OS id=23318 
NOTE: assigning ARB0 to group 1/0x58d713e6 (DATA) with 11 parallel I/Os
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 17:28:29 2020
cellip.ora not found.
Sat Dec 05 17:28:45 2020
NOTE: Rebalance has restored redundancy for any existing control file or redo log in disk group DATA
Sat Dec 05 18:48:10 2020
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Sat Dec 05 18:48:32 2020
GMON updating for reconfiguration, group 1 at 28 for pid 36, osid 47454
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
Sat Dec 05 18:48:32 2020
NOTE: group 1 PST updated.
SUCCESS: grp 1 disk _DROPPED_0001_DATA going offline 
GMON updating for reconfiguration, group 1 at 29 for pid 36, osid 47454
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DATA
NOTE: group DATA: updated PST location: disk 0000 (PST copy 0)
NOTE: group DATA: updated PST location: disk 0002 (PST copy 1)
NOTE: group 1 PST updated.
Sat Dec 05 18:48:32 2020
NOTE: membership refresh pending for group 1/0x58d713e6 (DATA)
GMON querying group 1 at 30 for pid 18, osid 41180
GMON querying group 1 at 31 for pid 18, osid 41180
NOTE: Disk _DROPPED_0001_DATA in mode 0x0 marked for de-assignment
SUCCESS: refreshed membership for 1/0x58d713e6 (DATA)
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Sat Dec 05 18:52:24 2020
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x58d713e6 (DATA)

查询磁盘状态,掉线磁盘已经被加入,asm磁盘组恢复正常
20201205201841


20201205201851
总结:对于normal磁盘组由于某种原因磁盘从磁盘组中掉,v$asm_disk.name类似_DROPPED_0001_DATA,v$asm_disk.state为FORCING,可以通过类似alter diskgroup data add failgroup dg2 disk ‘ORCL:DATA2′ force;方式强制增加掉线的磁盘进入磁盘组,然后待rebalance完成,问题修复

12C数据库报ORA-600 kcbzib_kcrsds_1故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:12C数据库报ORA-600 kcbzib_kcrsds_1故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友由于主机断电,导致数据库无法启动,需要我们提供恢复支持.通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)检查分析,发现存放在一个磁盘的数据文件scn过小,数据库无法正常open,需要历史redo(已经被覆盖,而且数据库未归档).
20201130202509
20201130202545


对于这种情况,选择强制拉库,然后数据库报ORA-600 kcbzib_kcrsds_1错误
20201130203023
20201130203204

对于此类故障参考以前处理过类似案例:ORA-600 kcbzib_kcrsds_1报错,对数据库scn进行修改,数据库顺利open
20201130203529

再次遇到ORA-600 kokasgi1故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:再次遇到ORA-600 kokasgi1故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个朋友数据库无法启动,报错ORA-00600: 内部错误代码, 参数: [kokasgi1]错误:

Sun Nov 29 14:23:15 2020
Thread 1 advanced to log sequence 28572 (thread open)
Thread 1 opened at log sequence 28572
  Current log# 3 seq# 28572 mem# 0: /u01/app/oracle/oradata/xifenfei/redo03.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Nov 29 14:23:15 2020
SMON: enabling cache recovery
[3505] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:135724 end:135754 diff:30 (0 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3505.trc:
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3505.trc:
ORA-00600: 内部错误代码, 参数: [kokasgi1], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3505): terminating the instance due to error 600
Instance terminated by USER, pid = 3505
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (3505) as a result of ORA-1092

对于这个错误,以前处理过一些类似故障,参考:
ORA-600 kokasgi1故障恢复
重命名sys用户引起数据库启动报ORA-01092 ORA-00600 kokasgi1错误
通过技术手段实现数据库open,查询user$信息
20201129175609


确认sys和system被分别被修改为sysa和systema,由于这个调整导致数据库无法正常启动.把sysa修改为sys,systema修改为system,数据库正常启动,业务测试正常.
20201129175803

sysaux表空间不足—WRH$_ACTIVE_SESSION_HISTORY

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:sysaux表空间不足—WRH$_ACTIVE_SESSION_HISTORY

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个老生常谈的问题遇到oracle sysaux表空间不足,通过awrinfo 分析,初步判断很可能是ash相关表没有正常清理分区导致
20201128205515


进一步分析确认异常对象
20201128205618

20201128205704

通过分析基本上可以确认是由于WRH$_ACTIVE_SESSION_HISTORY没有创建新分析,数据未被及时清理.根据官方WRH$_ACTIVE_SESSION_HISTORY Does Not Get Purged (Doc ID 387914.1)处理建议
alter session set “_swrf_test_action” = 72;
增加新分区,过一段时间之后,自动清理WRH$_ACTIVE_SESSION_HISTORY表历史数据.
20201128210106

对于sysaux表空间不足的情况,还遇到过类似情况:
sysaux中HEATMAP 对象较大
WRI$_ADV_OBJECTS表过大,导致SYSAUX表空间不足

模拟19c数据库root pdb undo异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:模拟19c数据库root pdb undo异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于19c在pdb情况下三种常见故障进行了模拟测试:
模拟19c数据库redo异常恢复
模拟19c数据库pdb undo异常恢复
模拟19c数据库root pdb undo异常恢复
模拟oracle 19c cdb模式下root pdb中undo丢失故障恢复
会话1,pdb中插入大量数据,未提交

SQL> alter session set container=pdb; 

Session altered.

SQL> alter database open;

Database altered.

SQL> create user xff identified by oracle default tablespace users;
grant dba to xff;
conn xff/oracle@127.0.0.1/pdb
create table t_xifenfei as select * from dba_objects;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;

User created.

SQL> 
Grant succeeded.

SQL> Connected.
SQL> 

Table created.

SQL> 
72351 rows created.

SQL> 
144702 rows created.

SQL> 
289404 rows created.

SQL> 
578808 rows created.

SQL> 

1157616 rows created.

SQL> SQL> SQL> 

会话2中root pdb模拟事务

[oracle@localhost ~]$ ss

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Nov 16 16:56:01 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

SQL> 
SQL> 
SQL> conn system/oracle
Connected.
SQL> create table t_xifenfei tablespace users as select * from dba_objects;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;
insert into t_xifenfei select * from t_xifenfei;

Table created.

SQL> 
72380 rows created.

SQL> 
144760 rows created.

SQL> 
289520 rows created.

SQL> 
579040 rows created.

SQL> 

1158080 rows created.

SQL> SQL> 

会话3 abort库并删除root pdb中undo文件

[oracle@localhost ~]$ ss

SQL*Plus: Release 19.0.0.0.0 - Production on Mon Nov 16 16:56:55 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

SQL> shutdown abort;
ORACLE instance shut down.
SQL> 

[oracle@localhost oradata]$ cd ORA19C
[oracle@localhost ORA19C]$ ls
control01.ctl  control02.ctl  pdb  pdbseed  redo01.log  redo02.log  redo03.log  sysaux01.dbf
system01.dbf  temp01.dbf  undotbs01.dbf  users01.dbf
[oracle@localhost ORA19C]$ rm -rf undotbs01.dbf 

启动数据库报ORA-01157 ORA-01110错误

SQL> alter database datafile 4 offline drop;

Database altered.

SQL> alter database open;

Database altered.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            MOUNTED
SQL> alter session set container=pdb;

Session altered.

SQL> alter database open;

Database altered.

SQL> conn / as sysdba
Connected.
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
 
SQL> SQL> select tablespace_name,segment_name,status from dba_rollback_segs;

TABLESPACE_NAME                SEGMENT_NAME                   STATUS
------------------------------ ------------------------------ --------------------------------
SYSTEM                         SYSTEM                         ONLINE
UNDOTBS1                       _SYSSMU1_1261223759$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU2_27624015$             NEEDS RECOVERY
UNDOTBS1                       _SYSSMU3_2421748942$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU4_625702278$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU5_2101348960$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU6_813816332$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU7_2329891355$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU8_399776867$            NEEDS RECOVERY
UNDOTBS1                       _SYSSMU9_1692468413$           NEEDS RECOVERY
UNDOTBS1                       _SYSSMU10_930580995$           NEEDS RECOVERY

本次测试比较幸运,虽然undo段状态为NEEDS RECOVERY,但是数据库直接open成功.实际生产情况,可能比这个要复杂很多