11gR2 RAC fails to start because of an improper ASM sga_target setting

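The first troubleshooting case of 2014: a colleague reported that a two-node 11.2 RAC on Solaris would not start and asked me to take a look. The analysis showed that an unreasonable sga_target setting prevented ASM from starting.
GI fails to start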

grid@zwq-rpt1:~$crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
grid@zwq-rpt1:~$crsctl status resource -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown   
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.crf
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.crsd
      1        ONLINE  OFFLINE                                                   
ora.cssd
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.cssdmonitor
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.ctssd
      1        ONLINE  ONLINE       zwq-rpt1                 ACTIVE:0            
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  INTERMEDIATE zwq-rpt1                                     
ora.gipcd
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.gpnpd
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.mdnsd
      1        ONLINE  ONLINE       zwq-rpt1                                     

ASM did not start.
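When the resource list is long, the rows whose target is ONLINE but whose state is not can be picked out mechanically. The following is a minimal sketch and was not part of the original troubleshooting; `list_not_online` is a hypothetical helper name, and it assumes the tabular layout printed by the commands above.

```shell
# list_not_online (hypothetical helper): reads the output of
# `crsctl status resource -t` or `crsctl status resource -t -init` on stdin
# and prints "<resource> <state>" for rows with TARGET=ONLINE but STATE!=ONLINE.
list_not_online() {
  awk '
    /^ora\./ { name = $1; next }                         # resource name line
    name != "" && NF >= 2 {
      if ($1 ~ /^[0-9]+$/) { target = $2; state = $3 }   # row with an instance number
      else                 { target = $1; state = $2 }   # local resource row
      if (target == "ONLINE" && state != "ONLINE") print name, state
    }
  '
}
```

A possible invocation is `crsctl status resource -t -init | list_not_online`; for the output above it would point at ora.asm, ora.crsd and ora.evmd.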

Errors in the GI log

2014-01-01 00:40:47.708
[cssd(1418)]CRS-1605:CSSD voting file is online: /dev/rdsk/emcpower0a; details in /export/home/app/grid/log/zwq-rpt1/cssd/ocssd.log.
2014-01-01 00:40:53.234
[cssd(1418)]CRS-1601:CSSD Reconfiguration complete. Active nodes are zwq-rpt1 zwq-rpt2 .
2014-01-01 00:40:56.659
[ctssd(1483)]CRS-2407:The new Cluster Time Synchronization Service reference node is host zwq-rpt2.
2014-01-01 00:40:56.661
[ctssd(1483)]CRS-2401:The Cluster Time Synchronization Service started on host zwq-rpt1.
2014-01-01 00:41:02.016
[ctssd(1483)]CRS-2408:The clock on host zwq-rpt1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2014-01-01 00:43:23.874
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:45:42.837
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:02.087
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 00:48:18.836
[ohasd(1083)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2014-01-01 00:48:18.837
[ohasd(1083)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2014-01-01 01:05:15.396
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:05:45.101
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".
2014-01-01 01:06:15.104
[/export/home/app/grid/bin/oraagent.bin(1348)]CRS-5019:All OCR locations are on ASM disk groups [CRSDG], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/export/home/app/grid/log/zwq-rpt1/agent/ohasd/oraagent_grid/oraagent_grid.log".

The log makes it fairly clear that the ASM disk group problem leaves the OCR inaccessible, which in turn prevents CRS from starting.

ORAAGENT log

2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection::connectInt (2) Exception OCIException
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] InstConnection:connect:excp OCIException OCI error 604
2014-01-01 00:43:23.870: [ora.asm][9] {0:0:2} [start] DgpAgent::queryDgStatus excp ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")

The agent log reports a clear ORA-04031 error, so the next step is to check the ASM log.

Errors in the ASM log

Wed Jan 01 00:47:33 2014
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /export/home/app/oracle
Wed Jan 01 00:47:39 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1728.trc  (incident=291447):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291447/+ASM1_ora_1728_i291447.trc
Wed Jan 01 00:47:48 2014
Dumping diagnostic data in directory=[cdmp_20140101004748], requested by (instance=1, osid=1728), summary=[incident=291447].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:47:53 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1730.trc  (incident=291448):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291448/+ASM1_ora_1730_i291448.trc
Wed Jan 01 00:48:01 2014
Dumping diagnostic data in directory=[cdmp_20140101004801], requested by (instance=1, osid=1730), summary=[incident=291448].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:07 2014
Errors in file /export/home/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_1732.trc  (incident=291449):
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^34f764db","kglHeapInitialize:temp")
Incident details in: /export/home/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_291449/+ASM1_ora_1732_i291449.trc
Wed Jan 01 00:48:16 2014
Dumping diagnostic data in directory=[cdmp_20140101004816], requested by (instance=1, osid=1732), summary=[incident=291449].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jan 01 00:48:16 2014
License high water mark = 1
USER (ospid: 1736): terminating the instance
Instance terminated by USER, pid = 1736

This shows clearly that the shared pool is too small: ASM raises ORA-04031 and therefore cannot start.
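To see at a glance which ORA errors dominate a log like the one above, the codes can be counted. This is only a sketch under assumptions: `count_ora_errors` is a hypothetical helper, and it expects a POSIX-compatible awk (one that provides `match`, `RSTART` and `RLENGTH`).

```shell
# count_ora_errors (hypothetical helper): reads log text on stdin and prints
# "<ORA code> <occurrences>", one line per distinct code, sorted by code.
count_ora_errors() {
  awk '
    {
      line = $0
      # Consume every ORA-nnnnn occurrence on the line, left to right.
      while (match(line, /ORA-[0-9]+/)) {
        seen[substr(line, RSTART, RLENGTH)]++
        line = substr(line, RSTART + RLENGTH)
      }
    }
    END { for (code in seen) print code, seen[code] }
  ' | sort
}
```

It would be fed the ASM alert log on stdin, for example `count_ora_errors < /path/to/alert.log` (the path is a placeholder).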

Root cause analysis

Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME = /export/home/app/grid
System name:	SunOS
Node name:	zwq-rpt1
Release:	5.11
Version:	11.1
Machine:	sun4v
Using parameter settings in server-side spfile +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.823992831
System parameters with non-default values:
  sga_max_size             = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

Here sga_target is set to 0 and no shared pool size is configured. The undersized shared pool raises ORA-04031, so CRS fails while starting ASM; the OCR then cannot be accessed, and CRS itself cannot start.
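The condition described here (no automatic sizing target and no explicit shared pool size) can be checked against a parameter listing before it causes trouble. The sketch below is a heuristic of my own, not an Oracle rule; `check_asm_memory` is a hypothetical helper that expects `name = value` lines such as those in the listing above.

```shell
# check_asm_memory (hypothetical helper): reads "name = value" parameter lines
# on stdin. It reports a risk when memory_target and sga_target are both unset
# or 0 while shared_pool_size is not set explicitly. Heuristic only.
check_asm_memory() {
  awk -F= '
    NF >= 2 {
      key = tolower($1); val = $2
      gsub(/[ \t"]/, "", key); gsub(/[ \t"]/, "", val)
      sub(/^.*\./, "", key)                  # drop a "*." or "SID." prefix
      p[key] = val
    }
    END {
      auto = (p["memory_target"] != "" && p["memory_target"] != "0") ||
             (p["sga_target"] != "" && p["sga_target"] != "0")
      if (!auto && p["shared_pool_size"] == "") {
        print "RISK: no automatic memory target and no shared_pool_size"
        exit 1
      }
      print "OK"
    }
  '
}
```

Fed the non-default parameters listed above, it prints the RISK line and returns a non-zero status; a listing that sets memory_target to a non-zero value prints OK.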

Fix
1. Edit the pfile

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$vi /tmp/asm.pfile

  memory_target = 2G
  large_pool_size          = 16M
  instance_type            = "asm"
  sga_target               = 0
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/rdsk/*"
  asm_diskgroups           = "FRADG"
  asm_diskgroups           = "DATADG"
  asm_power_limit          = 1
  diagnostic_dest          = "/export/home/app/oracle"

2. Start ASM

grid@zwq-rpt1:/export/home/app/oracle/diag/asm/+asm/+ASM1/trace$sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Wed Jan 1 01:04:10 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup pfile='/tmp/asm.pfile'
ASM instance started

Total System Global Area 2138521600 bytes
Fixed Size                  2161024 bytes
Variable Size            2102806144 bytes
ASM Cache                  33554432 bytes
ASM diskgroups mounted

3. Create the spfile

SQL> create spfile='+CRSDG' FROM PFILE='/tmp/asm.pfile';

File created.

--ASM alert log
Wed Jan 01 01:08:59 2014
NOTE: updated gpnp profile ASM SPFILE to 
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM diskstring: /dev/rdsk/*
NOTE: updated gpnp profile ASM SPFILE to +CRSDG/zwq-rpt-cluster/asmparameterfile/registry.253.835664939

4. Shut down ASM

SQL> shutdown immediate
ORA-15097: cannot SHUTDOWN ASM instance with connected client (process 1971)
SQL> shutdown abort
ASM instance shutdown

5. Restart CRS

root@zwq-rpt1:~# crsctl stop crs -f
root@zwq-rpt1:~# crsctl start crs

6. Restart CRS on the other node

root@zwq-rpt2:~# crsctl stop crs -f
root@zwq-rpt2:~# crsctl start crs

7. Check the result

root@zwq-rpt1:~# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
ora.DATADG.dg
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
ora.FRADG.dg
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
ora.LISTENER.lsnr
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
ora.asm
               ONLINE  ONLINE       zwq-rpt1                 Started             
               ONLINE  ONLINE       zwq-rpt2                 Started             
ora.gsd
               OFFLINE OFFLINE      zwq-rpt1                                     
               OFFLINE OFFLINE      zwq-rpt2                                     
ora.net1.network
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
ora.ons
               ONLINE  ONLINE       zwq-rpt1                                     
               ONLINE  ONLINE       zwq-rpt2                                     
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.cvu
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.oc4j
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.rptdb.db
      1        ONLINE  ONLINE       zwq-rpt1                 Open                
      2        ONLINE  ONLINE       zwq-rpt2                 Open                
ora.scan1.vip
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.zwq-rpt1.vip
      1        ONLINE  ONLINE       zwq-rpt1                                     
ora.zwq-rpt2.vip
      1        ONLINE  ONLINE       zwq-rpt2                  

Everything is back to normal, and the first incident of 2014 is resolved.

Converting between hub and leaf nodes in Oracle 12c RAC

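Thanks to Lunar for the guidance. This post converts an Oracle 12c RAC node between the hub and leaf roles, following the Oracle Flex Clusters section of the official documentation.
Current database state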

--Cluster status
[root@rac1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDB_NEW.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDG.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.net1.network
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.ons
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.proxy_advm
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                     STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac2                     169.254.177.226 10.1
                                                             .1.104,STABLE
ora.asm
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac2                     Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       rac2                     STABLE
ora.ora12c.db
      1        ONLINE  ONLINE       rac1                     Open,STABLE
      2        ONLINE  ONLINE       rac2                     Open,STABLE
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------

--The cluster is running in flex mode
[root@rac1 ~]#  crsctl get cluster mode status
Cluster is running in "flex" mode

--ASM is running in flex mode
[grid@rac1 ~]$ asmcmd
ASMCMD> showclustermode
ASM cluster : Flex mode enabled

--Node roles
[root@rac1 ~]# crsctl get node role config
Node 'rac1' configured role is 'hub'
[root@rac2 ~]# crsctl get node role config
Node 'rac2' configured role is 'hub'
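If the role needs to be checked from a script before and after a conversion, the message above can be reduced to a node/role pair. A minimal sketch, assuming the message format shown above; `node_role` is a hypothetical helper name.

```shell
# node_role (hypothetical helper): reads `crsctl get node role config` output
# on stdin and prints "<node> <role>" for each matching line.
node_role() {
  sed -n "s/^Node '\([^']*\)' configured role is '\([^']*\)'.*/\1 \2/p"
}
```

A possible invocation is `crsctl get node role config | node_role`, run as root on the node in question.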

Converting hub to leaf

--Convert the hub node to a leaf node
[root@rac1 ~]# crsctl set node role leaf
CRS-4408: Node 'rac1' configured role successfully changed; restart Oracle High Availability Services for new role to take effect.

--Stop the cluster stack
[root@rac1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.SYSDB_NEW.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.SYSDG.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.ora12c.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.proxy_advm' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.ora12c.db' on 'rac1' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac2'
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.SYSDB_NEW.dg' on 'rac1' succeeded
CRS-2676: Start of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.SYSDG.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.proxy_advm' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2677: Stop of 'ora.storage' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

--Start the cluster stack
[root@rac1 ~]# crsctl start crs -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2789: Cannot stop resource 'ora.diskmon' as it is not running on server 'rac1'
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac1'
CRS-2676: Start of 'ora.storage' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'rac1'
CRS-2676: Start of 'ora.crf' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
CRS-6017: Processing resource auto-start for servers: rac1
CRS-6016: Resource auto-start has completed for server rac1
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.

--Status after converting hub to leaf
[root@rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac2                     STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDB_NEW.dg
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDG.dg
               ONLINE  ONLINE       rac2                     STABLE
ora.net1.network
               ONLINE  ONLINE       rac2                     STABLE
ora.ons
               ONLINE  ONLINE       rac2                     STABLE
ora.proxy_advm
               ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac2                     169.254.177.226 10.1
                                                             .1.104,STABLE
ora.asm
      1        ONLINE  OFFLINE                               STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac2                     Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       rac2                     STABLE
ora.ora12c.db
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  ONLINE       rac2                     Open,STABLE
ora.rac1.vip
      1        ONLINE  INTERMEDIATE rac2                     FAILED OVER,STABLE
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------

--Node roles
[root@rac1 ~]# crsctl get node role config
Node 'rac1' configured role is 'leaf'
[root@rac2 ~]# crsctl get node role config
Node 'rac2' configured role is 'hub'

Converting leaf to hub

--Convert the leaf node to a hub node
[root@rac1 ~]# crsctl set node role hub
CRS-4408: Node 'rac1' configured role successfully changed; restart Oracle High Availability Services for new role to take effect.

--Stop the cluster stack
[root@rac1 ~]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2677: Stop of 'ora.storage' on 'rac1' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

--Start the cluster stack
[root@rac1 ~]# crsctl start crs -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2789: Cannot stop resource 'ora.diskmon' as it is not running on server 'rac1'
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac1'
CRS-2676: Start of 'ora.storage' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'rac1'
CRS-2676: Start of 'ora.crf' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
CRS-6017: Processing resource auto-start for servers: rac1
CRS-2672: Attempting to start 'ora.ons' on 'rac1'
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac2'
CRS-2672: Attempting to start 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac2'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac1'
CRS-2676: Start of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'rac1'
CRS-2676: Start of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2676: Start of 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1' succeeded
CRS-2676: Start of 'ora.ons' on 'rac1' succeeded
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.proxy_advm' on 'rac1'
CRS-2676: Start of 'ora.proxy_advm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ora12c.db' on 'rac1'
CRS-2676: Start of 'ora.ora12c.db' on 'rac1' succeeded
CRS-6016: Resource auto-start has completed for server rac1
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.

--Cluster status
[root@rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.DATA.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDB_NEW.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.SYSDG.dg
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.net1.network
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.ons
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
ora.proxy_advm
               ONLINE  ONLINE       rac1                     STABLE
               ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                     STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       rac2                     STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       rac2                     169.254.177.226 10.1
                                                             .1.104,STABLE
ora.asm
      1        ONLINE  ONLINE       rac1                     STABLE
      2        ONLINE  ONLINE       rac2                     STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns
      1        ONLINE  ONLINE       rac2                     STABLE
ora.gns.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       rac2                     Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       rac2                     STABLE
ora.ora12c.db
      1        ONLINE  ONLINE       rac1                     Open,STABLE
      2        ONLINE  ONLINE       rac2                     Open,STABLE
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                     STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       rac2                     STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------

--Node roles in the cluster
[root@rac1 ~]# crsctl get node role config
Node 'rac1' configured role is 'hub'
[root@rac2 ~]# crsctl get node role config
Node 'rac2' configured role is 'hub'

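This completes the conversion between the leaf and hub roles in Oracle 12c RAC. Before converting, confirm that both the cluster and ASM are running in flex mode; if they are not, switch them to flex mode first by following the relevant documentation.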
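
When checking roles on several nodes it can help to reduce the crsctl output to one "node role" pair per line. The sketch below only parses text in the format shown in the transcript above; the function name node_roles is my own, and nothing here calls crsctl itself.

```shell
# Print "<node> <role>" for each line of `crsctl get node role config`
# output read from standard input. node_roles is a hypothetical helper name.
node_roles() {
    sed -n "s/^Node '\([^']*\)' configured role is '\([^']*\)'.*/\1 \2/p"
}

# Sample input copied from the transcript above.
node_roles <<'EOF'
Node 'rac1' configured role is 'hub'
Node 'rac2' configured role is 'hub'
EOF
```

On a live node the input would come from a pipe instead, for example crsctl get node role config | node_roles.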

OLR maintenance

Official description of the OLR
OLR is a registry similar to OCR located on each node in a cluster, but contains information specific to each node. It contains manageability information about Oracle Clusterware, including dependencies between various services. Oracle High Availability Services uses this information. OLR is located on local storage on each node in a cluster. Its default location is in the path Grid_home/cdata/host_name.olr, where Grid_home is the Oracle Grid Infrastructure home, and host_name is the host name of the node.
In short, the OLR is the local counterpart of the OCR: each node of the cluster keeps its own copy on local storage.

Locating the OLR

[root@rac2 cdata]# cd /etc/oracle
[root@rac2 oracle]# ls -l
total 2868
drwxrwx--- 2 root oinstall    4096 Nov 24 20:00 lastgasp
drwxrwxrwt 2 root oinstall    4096 Dec 21 20:51 maps
-rw-r--r-- 1 root oinstall      96 Nov 25 18:38 ocr.loc
-rw-r--r-- 1 root root           0 Nov 24 19:58 ocr.loc.orig
-rw-r--r-- 1 root oinstall      80 Nov 24 19:58 olr.loc
-rw-r--r-- 1 root root           0 Nov 24 19:58 olr.loc.orig
drwxrwxr-x 5 root oinstall    4096 Nov 24 19:57 oprocd
drwxr-xr-x 3 root oinstall    4096 Nov 24 19:57 scls_scr
-rws--x--- 1 root oinstall 2904377 Nov 24 19:57 setasmgid
[root@rac2 oracle]# more olr.loc
olrconfig_loc=/u01/app/12.1.0/grid/cdata/rac2.olr
crs_home=/u01/app/12.1.0/grid
--On some platforms the olr.loc file may be under /var/opt/oracle/ instead

[root@rac2 oracle]#  ocrcheck -config -local
Oracle Local Registry configuration is :
         Device/File Name         : /u01/app/12.1.0/grid/cdata/rac2.olr

[root@rac2 oracle]# ocrcheck -local
Status of Oracle Local Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :        996
         Available space (kbytes) :     408572
         ID                       :  816087519
         Device/File Name         : /u01/app/12.1.0/grid/cdata/rac2.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded

         Logical corruption check succeeded

[root@rac2 oracle]# ls -l /u01/app/12.1.0/grid/cdata/rac2.olr
-rw------- 1 root oinstall 503484416 Dec 22 12:09 /u01/app/12.1.0/grid/cdata/rac2.olr
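
The olr.loc file is a plain key=value file, so a script can get the OLR path without running ocrcheck. A minimal sketch, assuming the file has the two-line layout shown above; olr_path_from_loc is a name I made up for illustration.

```shell
# Print the value of olrconfig_loc from olr.loc-style text on standard input.
# olr_path_from_loc is a hypothetical helper name.
olr_path_from_loc() {
    sed -n 's/^olrconfig_loc=//p'
}

# Sample input copied from the olr.loc shown above.
olr_path_from_loc <<'EOF'
olrconfig_loc=/u01/app/12.1.0/grid/cdata/rac2.olr
crs_home=/u01/app/12.1.0/grid
EOF
```

On a real node the call would be olr_path_from_loc < /etc/oracle/olr.loc (or the /var/opt/oracle/ path on platforms that use it).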

Listing OLR backups

[root@rac2 oracle]# ocrconfig -local -showbackup

rac2     2013/11/24 20:02:38     /u01/app/12.1.0/grid/cdata/rac2/backup_20131124_200238.olr

Backing up the OLR

[root@rac2 oracle]# ocrconfig -local -manualbackup

rac2     2013/12/22 12:09:33     /u01/app/12.1.0/grid/cdata/rac2/backup_20131222_120933.olr

rac2     2013/11/24 20:02:38     /u01/app/12.1.0/grid/cdata/rac2/backup_20131124_200238.olr

[root@rac2 oracle]# ls -l /u01/app/12.1.0/grid/cdata/rac2/
total 1908
-rw-r--r-- 1 root root  860160 Nov 24 20:02 backup_20131124_200238.olr
-rw-r--r-- 1 root root 1085440 Dec 22 12:09 backup_20131222_120933.olr
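
For a restore you usually want the newest backup in that list. The sketch below picks it out of -showbackup style output, assuming every entry has the four columns shown above (node, date, time, path) and that paths contain no spaces. latest_backup is a hypothetical helper name.

```shell
# Print the path of the most recent entry in `ocrconfig -local -showbackup`
# style output read from standard input. Assumes lines shaped like
#   <node> <YYYY/MM/DD> <HH:MM:SS> <path>
latest_backup() {
    awk 'NF == 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\// { print $2, $3, $4 }' |
        LC_ALL=C sort | tail -n 1 | awk '{ print $3 }'
}

# Sample input copied from the transcript above.
latest_backup <<'EOF'

rac2     2013/12/22 12:09:33     /u01/app/12.1.0/grid/cdata/rac2/backup_20131222_120933.olr

rac2     2013/11/24 20:02:38     /u01/app/12.1.0/grid/cdata/rac2/backup_20131124_200238.olr
EOF
```

Sorting the date and time as text works here because both columns are fixed-width with the most significant part first.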

Recovering a damaged OLR

--Damage the OLR
[root@rac2 oracle]# ls -l /u01/app/12.1.0/grid/cdata/rac2.olr
-rw------- 1 root oinstall 503484416 Dec 22 12:09 /u01/app/12.1.0/grid/cdata/rac2.olr
[root@rac2 oracle]# mv /u01/app/12.1.0/grid/cdata/rac2.olr /u01/app/12.1.0/grid/cdata/rac2.olr_bak

--Stop CRS
[root@rac2 oracle]# crsctl stop crs

--Starting CRS now fails
[root@rac2 oracle]# crsctl start crs
PROCL-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
CRS-4000: Command Start failed, or completed with errors.

--Trace the CRS startup
[root@rac2 oracle]# strace crsctl start crs
……
uname({sys="Linux", node="rac2", ...})  = 0
open("/etc/oracle/olr.loc", O_RDONLY)   = 14
fstat(14, {st_mode=S_IFREG|0644, st_size=80, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd8ac628000
read(14, "olrconfig_loc=/u01/app/12.1.0/gr"..., 4096) = 80
read(14, "", 4096)                      = 0
close(14)                               = 0
munmap(0x7fd8ac628000, 4096)            = 0
stat("/u01/app/12.1.0/grid/cdata/rac2.olr", 0x7fffa215a580) = -1 ENOENT (No such file or directory)
--Here crsctl first reads /etc/oracle/olr.loc and then fails to stat /u01/app/12.1.0/grid/cdata/rac2.olr
……
--Confirm that ohasd.bin is not running
[root@rac2 cdata]# ps -ef|grep ohasd
root     15715 31578  0 14:34 pts/3    00:00:00 grep ohasd

--Restore the OLR
[root@rac2 oracle]# ocrconfig -local -restore /u01/app/12.1.0/grid/cdata/rac2/backup_20131124_200238.olr
PROTL-35: The configured OLR location is not accessible
[root@rac2 oracle]# cd /u01/app/12.1.0/grid/cdata/
[root@rac2 cdata]# ls
localhost  rac12c-cluster  rac2  rac2.olr_bak
[root@rac2 cdata]# touch rac2.olr
[root@rac2 cdata]# chmod 600 rac2.olr
[root@rac2 cdata]# ocrconfig -local -restore /u01/app/12.1.0/grid/cdata/rac2/backup_20131124_200238.olr

--Confirm that the restore succeeded
[root@rac2 cdata]# ls -l
total 84200
drwxr-xr-x 2 grid oinstall      4096 Nov 24 19:37 localhost
drwxrwxr-x 2 grid oinstall      4096 Dec 22 09:07 rac12c-cluster
drwxr-xr-x 2 grid oinstall      4096 Dec 22 12:09 rac2
-rw------- 1 root root     503484416 Dec 22 14:29 rac2.olr
-rw------- 1 root oinstall 503484416 Dec 22 12:43 rac2.olr_bak

--Start CRS
[root@rac2 cdata]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
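
The step that got past PROTL-35 was creating an empty file at the configured OLR location before running the restore. The sketch below wraps just that step so it does nothing when the file already exists. ensure_olr_placeholder is a hypothetical name, and the demo runs against a scratch directory, not a real Grid home.

```shell
# Create an empty placeholder with mode 600 if the file at $1 is missing.
# Prints "created" or "exists" so the caller can log what happened.
# ensure_olr_placeholder is a hypothetical helper name.
ensure_olr_placeholder() {
    if [ -e "$1" ]; then
        echo "exists"
    else
        : > "$1" && chmod 600 "$1" && echo "created"
    fi
}

# Demo in a temporary directory so no real OLR is touched.
demo_dir=$(mktemp -d)
ensure_olr_placeholder "$demo_dir/rac2.olr"
ensure_olr_placeholder "$demo_dir/rac2.olr"
ls -l "$demo_dir/rac2.olr" | cut -c1-10
rm -rf "$demo_dir"
```

Once the placeholder exists and ohasd.bin is confirmed down, ocrconfig -local -restore is run as in the transcript. Note that the restored file in the listing above is owned by root:root while the original was root:oinstall; whether the group matters is not something this test shows.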

Other OLR commands

To export OLR to a file:
# ocrconfig -local -export file_name

To import a specified file to OLR:
# ocrconfig -local -import file_name

To view the contents of the OLR file:
ocrdump -local file_name

To view the contents of the OLR backup file:
ocrdump -local -backupfile olr_backup_file_name

To change the OLR backup location:
ocrconfig -local -backuploc new_olr_backup_path

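When the OLR is damaged, the RAC node cannot start. Unlike the OCR, the OLR is not backed up automatically on a schedule, so it is worth taking manual OLR backups at regular intervals.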

Oracle 12c RAC: renaming the disk group that holds the OCR, voting disk, and ASM spfile

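Looking at my single-node 12c RAC today, I decided I did not like the name of the disk group that holds the OCR, +DG_SYS, and wanted to change it to +SYS_DG. The approach is to move the OCR, voting disk, and ASM spfile to an existing ASM disk group, rename the old disk group, and then move everything into the renamed group (in this case +DG_SYS -> +DATA -> +SYS_DG).
Current state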

[grid@xifenfei ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.net1.network
               ONLINE  ONLINE       xifenfei                 STABLE
ora.ons
               ONLINE  ONLINE       xifenfei                 STABLE
ora.proxy_advm
               ONLINE  OFFLINE      xifenfei                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       xifenfei                 169.254.196.108 10.1
                                                             0.30.22,STABLE
ora.asm
      1        ONLINE  ONLINE       xifenfei                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cdb.db
      1        ONLINE  ONLINE       xifenfei                 Open,STABLE
ora.cvu
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.oc4j
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.xifenfei.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
--------------------------------------------------------------------------------

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                                0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DG_SYS                         MOUNTED
DATA                           MOUNTED

[grid@xifenfei ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   60a037da30714f6bbfe5d90206ff27a7 (/dev/sdc2) [DG_SYS]
Located 1 voting disk(s).

[grid@xifenfei ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1380
         Available space (kbytes) :     408188
         ID                       : 2132096904
         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user

SQL> show parameter spfile;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DG_SYS/xff-cluster/ASMPARAMET
                                                 ERFILE/registry.253.825640465

Moving the OCR
ocrconfig -add followed by ocrconfig -delete moves the OCR to another disk group; this can be done online.

[root@xifenfei ~]# ocrconfig -add +data
--alert log
2013-09-09 22:32:40.799: 
[crsd(5064)]CRS-1007:The OCR/OCR mirror location was replaced by +data.

[root@xifenfei ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1380
         Available space (kbytes) :     408188
         ID                       : 2132096904
         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded
         Device/File Name         :      +data
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

[root@xifenfei ~]# ocrconfig -delete +DG_SYS

--alert log
2013-09-09 22:35:53.585: 
[crsd(5064)]CRS-1010:The OCR mirror location +DG_SYS was removed.

[root@xifenfei ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1380
         Available space (kbytes) :     408188
         ID                       : 2132096904
         Device/File Name         :      +data
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
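
Between the -add and the -delete it is worth confirming that the new location really shows up in ocrcheck. The sketch below parses ocrcheck output in the format shown above; both function names are hypothetical, and the comparison ignores case because I typed +data in lower case.

```shell
# Print each configured OCR location from `ocrcheck` output on stdin.
ocr_locations() {
    awk -F: '/Device\/File Name/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Succeed only if location $1 (for example +DATA) is listed, ignoring case.
ocr_has_location() {
    want=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    ocr_locations | tr '[:lower:]' '[:upper:]' | grep -Fqx -- "$want"
}

# Sample lines copied from the ocrcheck output taken after `ocrconfig -add`.
ocr_sample='         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded
         Device/File Name         :      +data
                                    Device/File integrity check succeeded'

printf '%s\n' "$ocr_sample" | ocr_locations
if printf '%s\n' "$ocr_sample" | ocr_has_location +DATA; then
    echo "new location is registered"
fi
```

A wrapper could refuse to run ocrconfig -delete on the old location unless ocr_has_location succeeds for the new one.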

Moving the voting disk
This is done with the crsctl replace votedisk command.

[root@xifenfei ~]# crsctl replace votedisk +DATA
Successful addition of voting disk 161ddea0a5fe4f28bfb67536e6105122.
Successful deletion of voting disk 60a037da30714f6bbfe5d90206ff27a7.
Successfully replaced voting disk group with +DATA.
CRS-4266: Voting file(s) successfully replaced

--alert log
2013-09-09 22:38:15.259: 
[cssd(4685)]CRS-1605:CSSD voting file is online: /dev/sdb; details in /u01/app/12.1/grid/product/log/xifenfei/cssd/ocssd.log.
2013-09-09 22:38:15.259: 
[cssd(4685)]CRS-1626:A Configuration change request completed successfully
2013-09-09 22:38:15.285: 
[cssd(4685)]CRS-1601:CSSD Reconfiguration complete. Active nodes are xifenfei .

[root@xifenfei ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   161ddea0a5fe4f28bfb67536e6105122 (/dev/sdb) [DATA]
Located 1 voting disk(s).
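
To check from a script which disk group the voting files ended up in, the query output can be reduced to "diskgroup device" pairs. A minimal sketch, assuming each voting file line ends with "(device) [DISKGROUP]" as above; votedisk_groups is a hypothetical name.

```shell
# Print "<diskgroup> <device>" for each ONLINE voting file line in
# `crsctl query css votedisk` output read from standard input.
votedisk_groups() {
    sed -n 's/^.*ONLINE.*(\(.*\)) \[\(.*\)][[:space:]]*$/\2 \1/p'
}

# Sample input copied from the transcript above.
votedisk_groups <<'EOF'
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   161ddea0a5fe4f28bfb67536e6105122 (/dev/sdb) [DATA]
Located 1 voting disk(s).
EOF
```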

Moving the ASM spfile

[grid@xifenfei ~]$  gpnptool get -o-

Success.
…………
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/sd*" SPFile="+DG_SYS/xff-cluster/ASMPARAMETERFILE/registry.253.825640465" Mode="legacy"/>
…………

[grid@xifenfei ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Mon Sep 9 22:42:05 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create pfile='/tmp/pfile.asm' from spfile;

File created.

SQL> create spfile='+DATA' FROM PFILE='/tmp/pfile.asm';

File created.

[grid@xifenfei ~]$  gpnptool get -o-

Success.
…………
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/sd*" SPFile="+DATA/xff-cluster/ASMPARAMETERFILE/registry.253.825720159" Mode="legacy"/>
…………

This shows that creating the ASM spfile automatically updates the SPFile entry in the GPnP profile reported by gpnptool; no manual change is needed.
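
Since the full profile is long, a small filter makes the before/after comparison easier. The sketch below pulls the SPFile attribute out of the ASM-Profile element, assuming the element sits on one line as in the gpnptool output above; asm_spfile_from_profile is a hypothetical name.

```shell
# Print the SPFile attribute of the orcl:ASM-Profile element found in
# `gpnptool get -o-` output read from standard input.
asm_spfile_from_profile() {
    sed -n 's/.*<orcl:ASM-Profile[^>]* SPFile="\([^"]*\)".*/\1/p'
}

# Sample line copied from the second gpnptool output above.
profile_line='<orcl:ASM-Profile id="asm" DiscoveryString="/dev/sd*" SPFile="+DATA/xff-cluster/ASMPARAMETERFILE/registry.253.825720159" Mode="legacy"/>'

printf '%s\n' "$profile_line" | asm_spfile_from_profile
# The disk group is the first path component.
printf '%s\n' "$profile_line" | asm_spfile_from_profile | cut -d/ -f1
```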

Restarting CRS
This makes ASM start from the spfile in the new disk group.

[root@xifenfei ~]# crsctl stop crs
[root@xifenfei ~]# crsctl start crs

Verifying that the +DG_SYS disk group is no longer in use

[grid@xifenfei ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Mon Sep 9 22:59:49 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter spfile;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/xff-cluster/ASMPARAMETER
                                                 FILE/registry.253.825720159

ASMCMD> lsof
DB_Name  Instance_Name  Path                                                                      
+ASM     +ASM1          +DATA.255.819326577                                                       
cdb      cdb1           +DATA/CDB/CONTROLFILE/current.274.819356503                               
cdb      cdb1           +DATA/CDB/DATAFILE/sysaux.278.819355829                                   
cdb      cdb1           +DATA/CDB/DATAFILE/system.269.819356101                                   
cdb      cdb1           +DATA/CDB/DATAFILE/undotbs1.276.819356317                                 
cdb      cdb1           +DATA/CDB/DATAFILE/users.279.819356309                                    
cdb      cdb1           +DATA/CDB/DD7C48AA5A4404A2E04325AAE80A403C/DATAFILE/pdbseed_temp01.dbf    
cdb      cdb1           +DATA/CDB/DD7C48AA5A4404A2E04325AAE80A403C/DATAFILE/sysaux.272.819356709  
cdb      cdb1           +DATA/CDB/DD7C48AA5A4404A2E04325AAE80A403C/DATAFILE/system.271.819356709  
cdb      cdb1           +DATA/CDB/ONLINELOG/group_1.277.822736453                                 
cdb      cdb1           +DATA/CDB/ONLINELOG/group_2.280.822736461                                 
cdb      cdb1           +DATA/CDB/ONLINELOG/group_3.275.822736397                                 
cdb      cdb1           +DATA/CDB/TEMPFILE/temp.273.819356649  
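
Reading the lsof list by eye is fine for one instance, but the same check is easy to script. The sketch below reports whether any open file path belongs to a given disk group, assuming the three-column lsof layout shown above; dg_in_use is a hypothetical name.

```shell
# Exit 0 if any path in `asmcmd lsof` output (stdin) lives in disk group $1.
# A path counts when it starts with "+NAME/" or "+NAME.", compared
# case-insensitively, so +DATA does not match +DATA2.
dg_in_use() {
    awk -v dg="$1" '
        BEGIN { prefix = "+" toupper(dg) }
        {
            path = toupper($3)
            if (index(path, prefix "/") == 1 || index(path, prefix ".") == 1) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# A few lines copied from the lsof output above.
lsof_sample='DB_Name  Instance_Name  Path
+ASM     +ASM1          +DATA.255.819326577
cdb      cdb1           +DATA/CDB/CONTROLFILE/current.274.819356503'

if printf '%s\n' "$lsof_sample" | dg_in_use DG_SYS; then
    echo "DG_SYS still has open files"
else
    echo "DG_SYS has no open files"
fi
```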

Dismounting the +DG_SYS disk group

                                   
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     20480     3369                0            3369              0             Y  DATA/
MOUNTED  EXTERN  N         512   4096  1048576      5451     5231                0            5231              0             N  DG_SYS/
ASMCMD> umount dg_sys
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     20480     3369                0            3369              0             Y  DATA/ 
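
renamedg checks whether the disk group is mounted before it changes anything, so it is worth confirming the dismount first. The sketch below tests for a mounted disk group in lsdg output, assuming the state is the first column and the name (with a trailing slash) is the last, as above; dg_mounted is a hypothetical name.

```shell
# Exit 0 if disk group $1 is listed as MOUNTED in `asmcmd lsdg` output on stdin.
dg_mounted() {
    awk -v dg="$1" '
        $1 == "MOUNTED" && toupper($NF) == (toupper(dg) "/") { found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Sample copied from the lsdg output taken after the umount.
lsdg_sample='State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     20480     3369                0            3369              0             Y  DATA/'

if printf '%s\n' "$lsdg_sample" | dg_mounted dg_sys; then
    echo "dg_sys is still mounted"
else
    echo "dg_sys is not mounted"
fi
```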

Renaming the ASM disk group
Rename disk group +DG_SYS to +SYS_DG.

[grid@xifenfei ~]$ renamedg phase=both dgname=DG_SYS newdgname=SYS_DG verbose=true

Parsing parameters..

Parameters in effect:

         Old DG name       : DG_SYS 
         New DG name          : SYS_DG 
         Phases               :
                 Phase 1
                 Phase 2
         Discovery str        : (null) 
         Clean              : TRUE
         Raw only           : TRUE
renamedg operation: phase=both dgname=DG_SYS newdgname=SYS_DG verbose=true
Executing phase 1
Discovering the group
Performing discovery with string:
Identified disk UFS:/dev/sdc2 with disk number:0 and timestamp (32990496 1727895552)
Checking for hearbeat...
Re-discovering the group
Performing discovery with string:
Identified disk UFS:/dev/sdc2 with disk number:0 and timestamp (32990496 1727895552)
Checking if the diskgroup is mounted or used by CSS 
Checking disk number:0
Generating configuration file..
Completed phase 1
Executing phase 2
Looking for /dev/sdc2
Modifying the header
Completed phase 2
Terminating kgfd context 0x7fceeb02a0a0

mount +SYS_DG

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED
SYS_DG                         DISMOUNTED

SQL> alter diskgroup sys_dg mount;

Diskgroup altered.

SQL>  select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED
SYS_DG                         MOUNTED

Moving the ASM spfile, OCR, and voting disk from +DATA to +SYS_DG

SQL> create spfile='+SYS_DG' FROM pfile='/tmp/pfile.asm';

File created.

[root@xifenfei ~]# ocrconfig -add +SYS_DG
[root@xifenfei ~]# ocrconfig -DELETE +DATA
[root@xifenfei ~]# crsctl replace votedisk +SYS_DG
Successful addition of voting disk 9694a31053ea4ff4bfb57891461a1296.
Successful deletion of voting disk 161ddea0a5fe4f28bfb67536e6105122.
Successfully replaced voting disk group with +SYS_DG.
CRS-4266: Voting file(s) successfully replaced
[root@xifenfei ~]# crsctl stop crs
[root@xifenfei ~]# crsctl start crs
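
The root-side commands are the same in both directions of the move, only the disk group names change. As a reminder of the order, here is a dry-run sketch that prints the sequence used above without executing anything; plan_move is a hypothetical name, and the ASM spfile step (create spfile ... from pfile in SQL*Plus) is deliberately left out because it runs as the grid user.

```shell
# Print, in order, the root commands used above to move the OCR and voting
# files from disk group $1 to disk group $2. Nothing is executed.
plan_move() {
    printf '%s\n' \
        "ocrconfig -add +$2" \
        "ocrconfig -delete +$1" \
        "crsctl replace votedisk +$2" \
        "crsctl stop crs" \
        "crsctl start crs"
}

plan_move DATA SYS_DG
```

Printing the plan first makes it easy to review the names before typing the commands by hand.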

Removing the old disk group (+DG_SYS) resource from the OCR

[root@xifenfei ~]# crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DG_SYS.dg
               ONLINE  OFFLINE      xifenfei                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.SYS_DG.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.net1.network
               ONLINE  ONLINE       xifenfei                 STABLE
ora.ons
               ONLINE  ONLINE       xifenfei                 STABLE
ora.proxy_advm
               ONLINE  OFFLINE      xifenfei                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       xifenfei                 169.254.196.108 10.1
                                                             0.30.22,STABLE
ora.asm
      1        ONLINE  ONLINE       xifenfei                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cdb.db
      1        ONLINE  ONLINE       xifenfei                 Open,STABLE
ora.cvu
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.oc4j
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.xifenfei.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
--------------------------------------------------------------------------------

[root@xifenfei ~]# srvctl remove diskgroup -diskgroup dg_sys
[root@xifenfei ~]# crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.SYS_DG.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.net1.network
               ONLINE  ONLINE       xifenfei                 STABLE
ora.ons
               ONLINE  ONLINE       xifenfei                 STABLE
ora.proxy_advm
               ONLINE  OFFLINE      xifenfei                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       xifenfei                 169.254.196.108 10.1
                                                             0.30.22,STABLE
ora.asm
      1        ONLINE  ONLINE       xifenfei                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cdb.db
      1        ONLINE  ONLINE       xifenfei                 Open,STABLE
ora.cvu
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.oc4j
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.xifenfei.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
--------------------------------------------------------------------------------

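That completes renaming the disk group that holds the OCR, voting disk, and ASM spfile. This system is a single-node RAC, so CRS had to be restarted; with two or more nodes the change can be made without downtime by restarting the nodes one at a time. The procedure is exactly the same as on 11.2 RAC.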

Oracle 12.1 RAC: recovering from a damaged OCR disk group

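In 11.2 and 12.1 RAC the OCR and voting disk can be stored in ASM, and many installations put them in a dedicated ASM disk group. What if that disk group is damaged while the disk group holding the data is intact? There are two cases: recovery with an OCR backup and recovery without one. Because the OCR is normally backed up automatically every 4 hours, most systems will have a backup. This post covers the first case on Oracle 12c RAC: recovering the ASM disk group that stores the OCR and voting disk when an OCR backup exists.
Confirming that the OCR, voting disk, and ASM spfile are in a dedicated ASM disk group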

[grid@xifenfei ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1360
         Available space (kbytes) :     408208
         ID                       : 2132096904
         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user


SQL> show parameter spfile;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DG_SYS/xff-cluster/ASMPARAMET
                                                 ERFILE/registry.253.825628977

[grid@xifenfei ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   3e20d13ae98a4fcfbffa489ab4df68a3 (/dev/sdc2) [DG_SYS]
Located 1 voting disk(s).

ASMCMD>  lsdsk -t -G dg_sys
Create_Date  Mount_Date  Repair_Timer  Path
08-SEP-13    08-SEP-13   0             /dev/sdc2

Checking the current RAC state

[grid@xifenfei ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.net1.network
               ONLINE  ONLINE       xifenfei                 STABLE
ora.ons
               ONLINE  ONLINE       xifenfei                 STABLE
ora.proxy_advm
               ONLINE  OFFLINE      xifenfei                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       xifenfei                 169.254.196.108 10.1
                                                             0.30.22,STABLE
ora.asm
      1        ONLINE  ONLINE       xifenfei                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cdb.db
      1        ONLINE  ONLINE       xifenfei                 Open,STABLE
ora.cvu
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.oc4j
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.xifenfei.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
--------------------------------------------------------------------------------

Inspecting the disk header with kfed

[grid@xifenfei ~]$ kfed read /dev/sdc2
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2879801080 ; 0x00c: 0xaba646f8
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                202375168 ; 0x020: 0x0c100000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:             DG_SYS_0000 ; 0x028: length=11
kfdhdb.grpname:                  DG_SYS ; 0x048: length=6
kfdhdb.fgname:              DG_SYS_0000 ; 0x068: length=11
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             32990483 ; 0x0a8: HOUR=0x13 DAYS=0x8 MNTH=0x9 YEAR=0x7dd
kfdhdb.crestmp.lo:            303455232 ; 0x0ac: USEC=0x0 MSEC=0x197 SECS=0x21 MINS=0x4
kfdhdb.mntstmp.hi:             32990485 ; 0x0b0: HOUR=0x15 DAYS=0x8 MNTH=0x9 YEAR=0x7dd
kfdhdb.mntstmp.lo:           1776845824 ; 0x0b4: USEC=0x0 MSEC=0x221 SECS=0x1e MINS=0x1a
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact:                    113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize:                    5451 ; 0x0c4: 0x0000154b
kfdhdb.pmcnt:                         3 ; 0x0c8: 0x00000003
kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn:                     10 ; 0x0d4: 0x0000000a
kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]:                0 ; 0x0de: 0x0000
kfdhdb.dbcompat:              168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi:             32990483 ; 0x0e4: HOUR=0x13 DAYS=0x8 MNTH=0x9 YEAR=0x7dd
kfdhdb.grpstmp.lo:            301063168 ; 0x0e8: USEC=0x0 MSEC=0x77 SECS=0x1f MINS=0x4
kfdhdb.vfstart:                     224 ; 0x0ec: 0x000000e0
kfdhdb.vfend:                       256 ; 0x0f0: 0x00000100
kfdhdb.spfile:                      219 ; 0x0f4: 0x000000db   ---- start of the ASM spfile
kfdhdb.spfflg:                        1 ; 0x0f8: 0x00000001
kfdhdb.flags:                         1 ; 0x0fc: 0x00000001
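
The raw integers in the kfed dump above can be decoded by hand. The sketch below unpacks the packed crestmp/mntstmp words using the bit layout implied by kfed's own annotations on those lines (hi word: hour 5 bits, day 5 bits, month 4 bits, year in the remaining bits; lo word: microseconds 10 bits, milliseconds 10 bits, seconds 6 bits, minutes 6 bits). It also converts kfdhdb.spfile into a byte offset, assuming that field is an allocation-unit number. The function names are hypothetical, and the layout is inferred from this dump rather than taken from Oracle documentation.

```python
from datetime import datetime


def decode_kfed_timestamp(hi, lo):
    """Decode a kfed *.hi / *.lo pair (bit layout inferred from the dump)."""
    hour = hi & 0x1F
    day = (hi >> 5) & 0x1F
    month = (hi >> 10) & 0xF
    year = hi >> 14
    usec = lo & 0x3FF
    msec = (lo >> 10) & 0x3FF
    sec = (lo >> 20) & 0x3F
    minute = (lo >> 26) & 0x3F
    return datetime(year, month, day, hour, minute, sec, msec * 1000 + usec)


def spfile_byte_offset(spfile_au, ausize):
    """Byte offset of the spfile start, assuming kfdhdb.spfile is an AU number."""
    return spfile_au * ausize


if __name__ == "__main__":
    # Values copied from kfdhdb.crestmp, kfdhdb.mntstmp, kfdhdb.spfile, kfdhdb.ausize
    print("created :", decode_kfed_timestamp(32990483, 303455232))
    print("mounted :", decode_kfed_timestamp(32990485, 1776845824))
    print("spfile byte offset:", spfile_byte_offset(219, 1048576))
```

Decoding the two stamps gives a creation and last-mount time on 2013-09-08, which matches the HOUR/DAYS/MNTH/YEAR annotations printed by kfed.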

Back up the OCR

[root@xifenfei xff-cluster]# ocrconfig  -manualbackup

xifenfei     2013/09/08 23:48:57     /u01/app/12.1/grid/product/cdata/xff-cluster/backup_20130908_234857.ocr

[root@xifenfei xff-cluster]# ocrconfig -showbackup

xifenfei     2013/08/08 21:11:00     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup00.ocr

xifenfei     2013/08/08 17:10:56     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup01.ocr

xifenfei     2013/07/08 20:23:18     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup02.ocr

xifenfei     2013/08/08 17:10:56     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/day.ocr

xifenfei     2013/08/08 17:10:56     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/week.ocr

xifenfei     2013/09/08 23:48:57     /u01/app/12.1/grid/product/cdata/xff-cluster/backup_20130908_234857.ocr

xifenfei     2013/06/28 22:55:02     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup_20130628_225502.ocr
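
When there are several entries in the `ocrconfig -showbackup` listing, it helps to pick the newest one programmatically before a restore. This is a minimal sketch that parses the three-column text shown above (node, timestamp, path). The `OcrBackup` type and function names are hypothetical, and the sample is a subset of the listing in this post.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class OcrBackup(NamedTuple):
    node: str
    taken_at: datetime
    path: str


def parse_showbackup(text: str) -> List[OcrBackup]:
    """Parse 'node  YYYY/MM/DD HH:MM:SS  path' lines; skip anything else."""
    backups = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        node, day, clock, path = parts
        try:
            taken_at = datetime.strptime(day + " " + clock, "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue
        backups.append(OcrBackup(node, taken_at, path))
    return backups


def latest_backup(text: str) -> Optional[OcrBackup]:
    backups = parse_showbackup(text)
    return max(backups, key=lambda b: b.taken_at) if backups else None


SAMPLE = """\
xifenfei     2013/08/08 21:11:00     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup00.ocr
xifenfei     2013/09/08 23:48:57     /u01/app/12.1/grid/product/cdata/xff-cluster/backup_20130908_234857.ocr
xifenfei     2013/06/28 22:55:02     /u01/app/12.1/grid/product/cdata/xifenfe-cluster/backup_20130628_225502.ocr
"""

if __name__ == "__main__":
    print(latest_backup(SAMPLE).path)
```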

Corrupt the ASM disk

[grid@xifenfei ~]$ dd if=/dev/zero of=/dev/sdc2 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 6.6061e-05 seconds, 62.0 MB/s
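
The dd command overwrites exactly one 4096-byte block, which matches kfdhdb.blksize in the dump, so only the disk header block is wiped. The sketch below reproduces that effect on a throwaway file and classifies the first block before and after. It assumes the header block carries the marker ORCLDISK at byte offset 32 (after a 32-byte block header). Both the offset and the marker are assumptions for illustration, and nothing here touches a real device.

```python
import os
import tempfile

BLOCK_SIZE = 4096        # kfdhdb.blksize in the dump above
PROVSTR_OFFSET = 32      # assumption: kfdhdb follows a 32-byte block header
PROVSTR = b"ORCLDISK"    # assumption: marker at the start of kfdhdb


def header_state(path):
    """Classify the first block of a disk image file."""
    with open(path, "rb") as f:
        block = f.read(BLOCK_SIZE)
    if len(block) < BLOCK_SIZE:
        return "short"
    if not any(block):
        return "zeroed"
    if block[PROVSTR_OFFSET:PROVSTR_OFFSET + len(PROVSTR)] == PROVSTR:
        return "asm-header"
    return "unknown"


def make_scratch_image():
    """Create a throwaway file that mimics a header block plus one data block."""
    block = bytearray(BLOCK_SIZE)
    block[PROVSTR_OFFSET:PROVSTR_OFFSET + len(PROVSTR)] = PROVSTR
    fd, path = tempfile.mkstemp(suffix=".img")
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(block))
        f.write(b"\xff" * BLOCK_SIZE)
    return path


def zero_first_block(path):
    """Overwrite the first block in place, like the dd command, on the scratch file."""
    with open(path, "r+b") as f:
        f.write(b"\x00" * BLOCK_SIZE)


if __name__ == "__main__":
    image = make_scratch_image()
    try:
        print("before:", header_state(image))
        zero_first_block(image)
        print("after :", header_state(image))
    finally:
        os.remove(image)
```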

Stop CRS

[root@xifenfei xff-cluster]# crsctl stop crs

Start CRS

[root@xifenfei xff-cluster]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

[grid@xifenfei admin]$ crsctl status res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE                               STABLE
ora.crf
      1        ONLINE  OFFLINE                               STABLE
ora.crsd
      1        ONLINE  OFFLINE                               STABLE
ora.cssd
      1        ONLINE  OFFLINE      xifenfei                 STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.ctssd
      1        ONLINE  OFFLINE                               STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE xifenfei                 STABLE
ora.gipcd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.storage
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
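
To see at a glance which lower-stack resources failed to come up, the table above can be filtered for entries whose target is ONLINE but whose state is not. This is a rough sketch that parses the text of `crsctl status res -t -init` as printed in this post. The function names are hypothetical, and it keeps only one row per resource, which is enough for the -init view.

```python
from typing import Dict, List, Tuple

STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}


def parse_init_resources(text: str) -> Dict[str, Tuple[str, str]]:
    """Map resource name -> (target, state) from the -init table text."""
    resources = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0].startswith("ora."):
            current = tokens[0]
        elif (current and len(tokens) >= 3 and tokens[0].isdigit()
              and tokens[1] in STATES and tokens[2] in STATES):
            resources[current] = (tokens[1], tokens[2])
        # wrapped detail lines such as "ABLE" and separator lines fall through
    return resources


def not_online(resources: Dict[str, Tuple[str, str]]) -> List[str]:
    """Resources that should be ONLINE but are in some other state."""
    return sorted(name for name, (target, state) in resources.items()
                  if target == "ONLINE" and state != "ONLINE")


SAMPLE = """\
ora.asm
      1        ONLINE  OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.cssd
      1        ONLINE  OFFLINE      xifenfei                 STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE xifenfei                 STABLE
"""

if __name__ == "__main__":
    for name in not_online(parse_init_resources(SAMPLE)):
        print(name)
```

In the output above, ora.cssd stuck in STARTING is the interesting entry, since the resources that are still OFFLINE are the ones the exclusive-mode start later brings up only after ora.cssd succeeds.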

GI logs

--alert log
2013-09-08 23:53:37.662:
[gpnpd(1507)]CRS-2328:GPNPD started on node xifenfei.
2013-09-08 23:54:10.244:
[cssd(1584)]CRS-1713:CSSD daemon is started in hub mode
2013-09-08 23:54:10.915:
[cssd(1584)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1/grid/product/log/xifenfei/cssd/ocssd.log
2013-09-08 23:54:11.183:
[ohasd(1367)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2013-09-08 23:54:11.183:
[ohasd(1367)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2013-09-08 23:54:26.044:
[cssd(1584)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1/grid/product/log/xifenfei/cssd/ocssd.log
2013-09-08 23:54:41.146:
[cssd(1584)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1/grid/product/log/xifenfei/cssd/ocssd.log
2013-09-08 23:54:56.195:
[cssd(1584)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1/grid/product/log/xifenfei/cssd/ocssd.log

--ocssd log
2013-09-08 23:54:25.976: [    GPNP][1090226496]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2360] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_xifenfei" disco ""
2013-09-08 23:54:25.976: [    CSSD][1090226496]clssnmReadDiscoveryProfile: voting file discovery string(/dev/sd*)
2013-09-08 23:54:25.976: [    CSSD][1090226496]clssnmvDDiscThread: using discovery string /dev/sd* for initial discovery
2013-09-08 23:54:25.976: [   SKGFD][1090226496]Discovery with str:/dev/sd*:

2013-09-08 23:54:25.976: [   SKGFD][1090226496]UFS discovery with :/dev/sd*:

2013-09-08 23:54:26.032: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdc:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdc1:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdb:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdc2:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdd:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sdd1:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sda:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sda1:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]Fetching UFS disk :/dev/sda2:

2013-09-08 23:54:26.037: [   SKGFD][1090226496]OSS discovery with :/dev/sd*:

2013-09-08 23:54:26.042: [   SKGFD][1090226496]Handle 0x1d65c10 from lib :UFS:: for disk :/dev/sdb:

2013-09-08 23:54:26.043: [   SKGFD][1090226496]Handle 0x20c95a0 from lib :UFS:: for disk :/dev/sdc2:

2013-09-08 23:54:26.043: [   SKGFD][1090226496]Handle 0x20c9dd0 from lib :UFS:: for disk :/dev/sdd:

2013-09-08 23:54:26.044: [   SKGFD][1090226496]Lib :UFS:: closing handle 0x1d65c10 for disk :/dev/sdb:

2013-09-08 23:54:26.044: [   SKGFD][1090226496]Lib :UFS:: closing handle 0x20c95a0 for disk :/dev/sdc2:

2013-09-08 23:54:26.044: [   SKGFD][1090226496]Lib :UFS:: closing handle 0x20c9dd0 for disk :/dev/sdd:

2013-09-08 23:54:26.044: [    CSSD][1090226496]clssnmvDiskVerify: Successful discovery of 0 disks
2013-09-08 23:54:26.044: [    CSSD][1090226496]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2013-09-08 23:54:26.044: [    CSSD][1090226496]clssnmvFindInitialConfigs: No voting files found
2013-09-08 23:54:26.044: [    CSSD][1090226496](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
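
The ocssd trace is verbose, but three facts matter here: the discovery string, which devices were actually opened, and how many voting files were verified. The following sketch pulls those out with regular expressions written against the log lines shown above. The patterns and the function name are assumptions tailored to this excerpt, not a general ocssd.log parser.

```python
import re

DISCOVERY_RE = re.compile(r"voting file discovery string\((.*?)\)")
HANDLE_RE = re.compile(r"Handle \S+ from lib :\w+:: for disk :(\S+?):")
COUNT_RE = re.compile(r"Successful discovery of (\d+) disks")


def summarize_discovery(log_text):
    """Summarize one voting-file discovery pass from ocssd.log text."""
    disc = DISCOVERY_RE.search(log_text)
    counts = COUNT_RE.findall(log_text)
    return {
        "discovery_string": disc.group(1) if disc else None,
        "opened_disks": HANDLE_RE.findall(log_text),
        "voting_files_found": int(counts[-1]) if counts else None,
    }


SAMPLE = """\
2013-09-08 23:54:25.976: [    CSSD][1090226496]clssnmReadDiscoveryProfile: voting file discovery string(/dev/sd*)
2013-09-08 23:54:26.042: [   SKGFD][1090226496]Handle 0x1d65c10 from lib :UFS:: for disk :/dev/sdb:
2013-09-08 23:54:26.043: [   SKGFD][1090226496]Handle 0x20c95a0 from lib :UFS:: for disk :/dev/sdc2:
2013-09-08 23:54:26.043: [   SKGFD][1090226496]Handle 0x20c9dd0 from lib :UFS:: for disk :/dev/sdd:
2013-09-08 23:54:26.044: [    CSSD][1090226496]clssnmvDiskVerify: Successful discovery of 0 disks
"""

if __name__ == "__main__":
    print(summarize_discovery(SAMPLE))
```

Here /dev/sdc2 is still opened during discovery, yet zero voting files are verified, which is consistent with the device being readable but its header having been wiped.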

After we corrupted the ASM disk that holds the OCR, starting CRS clearly reports that the voting disk cannot be found.

Force-stop CRS

[root@xifenfei xff-cluster]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'xifenfei'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'xifenfei'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'xifenfei'
CRS-2677: Stop of 'ora.drivers.acfs' on 'xifenfei' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'xifenfei' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'xifenfei'
CRS-2673: Attempting to stop 'ora.gipcd' on 'xifenfei'
CRS-2673: Attempting to stop 'ora.evmd' on 'xifenfei'
CRS-2677: Stop of 'ora.gpnpd' on 'xifenfei' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'xifenfei' succeeded
CRS-2677: Stop of 'ora.evmd' on 'xifenfei' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'xifenfei' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Start CRS in exclusive mode

[root@xifenfei xff-cluster]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'xifenfei'
CRS-2677: Stop of 'ora.drivers.acfs' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'xifenfei'
CRS-2672: Attempting to start 'ora.mdnsd' on 'xifenfei'
CRS-2676: Start of 'ora.evmd' on 'xifenfei' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'xifenfei'
CRS-2676: Start of 'ora.gpnpd' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xifenfei'
CRS-2672: Attempting to start 'ora.gipcd' on 'xifenfei'
CRS-2676: Start of 'ora.cssdmonitor' on 'xifenfei' succeeded
CRS-2676: Start of 'ora.gipcd' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'xifenfei'
CRS-2672: Attempting to start 'ora.diskmon' on 'xifenfei'
CRS-2676: Start of 'ora.diskmon' on 'xifenfei' succeeded
CRS-2676: Start of 'ora.cssd' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'xifenfei'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'xifenfei'
CRS-2672: Attempting to start 'ora.ctssd' on 'xifenfei'
CRS-2676: Start of 'ora.ctssd' on 'xifenfei' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'xifenfei' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'xifenfei' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'xifenfei'
CRS-2676: Start of 'ora.asm' on 'xifenfei' succeeded


[grid@xifenfei xifenfei]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  INTERMEDIATE xifenfei                 OCR not started,STAB
                                                             LE
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.crf
      1        OFFLINE OFFLINE                               STABLE
ora.crsd
      1        OFFLINE OFFLINE                               STABLE
ora.cssd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.cssdmonitor
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.ctssd
      1        ONLINE  ONLINE       xifenfei                 ACTIVE:0,STABLE
ora.diskmon
      1        OFFLINE OFFLINE                               STABLE
ora.drivers.acfs
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.evmd
      1        ONLINE  INTERMEDIATE xifenfei                 STABLE
ora.gipcd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.gpnpd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.mdnsd
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.storage
      1        OFFLINE OFFLINE                               STABLE

Create the disk group

[grid@xifenfei xifenfei]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Mon Sep 9 00:23:40 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create diskgroup DG_OCR external redundancy disk '/dev/sdc2' attribute 'COMPATIBLE.ASM' = '12.1.0';

Diskgroup created.

[root@xifenfei xff-cluster]# ls -l
total 1472
-rw-r--r-- 1 root root 1503232 Sep  8 23:48 backup_20130908_234857.ocr
[root@xifenfei xff-cluster]# ocrconfig -restore backup_20130908_234857.ocr
PROT-35: The configured OCR locations are not accessible

SQL> conn / as sysasm
Connected.
SQL> drop diskgroup DG_OCR force including contents;

Diskgroup dropped.

SQL>  create diskgroup DG_SYS  external redundancy disk '/dev/sdc2' attribute 'COMPATIBLE.ASM' = '12.1.0';

Diskgroup created.

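To keep things simple, create the new disk group with the same name as the damaged disk group that used to hold the OCR. As the PROT-35 error above shows, the restore fails when the disk group is named DG_OCR, because the configured OCR location still points to +DG_SYS.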

Restore the OCR

[root@xifenfei xff-cluster]# ocrconfig -restore backup_20130908_234857.ocr

--alert log
2013-09-09 00:26:50.584:
[client(3015)]CRS-1002:The OCR was restored from file backup_20130908_234857.ocr.

Recreate the voting disk

[root@xifenfei xff-cluster]# crsctl replace votedisk +DG_SYS
Successful addition of voting disk 60a037da30714f6bbfe5d90206ff27a7.
Successfully replaced voting disk group with +DG_SYS.
CRS-4266: Voting file(s) successfully replaced

Create the ASM spfile

[grid@xifenfei dbs]$ vi /tmp/asm.txt 
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile= "EXCLUSIVE"
asm_diskstring           = "/dev/sd*"
asm_power_limit          = 1

[grid@xifenfei dbs]$ sqlplus '/ as sysasm'

SQL*Plus: Release 12.1.0.1.0 Production on Mon Sep 9 00:34:18 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL>  create spfile='+DG_SYS' FROM pfile='/tmp/asm.txt';

File created.
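
Before running create spfile, it can be worth sanity-checking the pfile text, since a bad ASM parameter is enough to keep the stack from starting. This is a minimal sketch that parses the name=value lines written to /tmp/asm.txt above and reports missing parameters. The function names and the choice of "required" parameters are illustrative assumptions, not an Oracle rule set.

```python
def parse_pfile(text):
    """Parse simple name=value pfile lines into a dict (quotes stripped)."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().lower()] = value.strip().strip("'\"")
    return params


def missing_params(params, required=("instance_type", "asm_diskstring")):
    """Return the required names that are absent (hypothetical check list)."""
    return [name for name in required if name not in params]


SAMPLE = """\
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile= "EXCLUSIVE"
asm_diskstring           = "/dev/sd*"
asm_power_limit          = 1
"""

if __name__ == "__main__":
    parsed = parse_pfile(SAMPLE)
    print(parsed)
    print("missing:", missing_params(parsed))
```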

Restart CRS

[root@xifenfei xff-cluster]# crsctl stop crs -f

[root@xifenfei xff-cluster]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

[grid@xifenfei dbs]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details      
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       xifenfei                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       xifenfei                 STABLE
ora.net1.network
               ONLINE  ONLINE       xifenfei                 STABLE
ora.ons
               ONLINE  ONLINE       xifenfei                 STABLE
ora.proxy_advm
               ONLINE  OFFLINE      xifenfei                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       xifenfei                 169.254.196.108 10.1
                                                             0.30.22,STABLE
ora.asm
      1        ONLINE  ONLINE       xifenfei                 Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cdb.db
      1        ONLINE  ONLINE       xifenfei                 Open,STABLE
ora.cvu
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.oc4j
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
ora.xifenfei.vip
      1        ONLINE  ONLINE       xifenfei                 STABLE
--------------------------------------------------------------------------------

CRS is back to normal at this point. Next, check the OCR, the voting disk, and the ASM spfile.

[grid@xifenfei ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1380
         Available space (kbytes) :     408188
         ID                       : 2132096904
         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user
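
The ocrcheck report can be checked mechanically: the location should be the recreated disk group, the integrity line should be present, and used plus available space should add up to the total. The sketch below parses the text printed above. The function name and the returned keys are hypothetical.

```python
import re

LABELS = {
    "Total space (kbytes)": "total_kb",
    "Used space (kbytes)": "used_kb",
    "Available space (kbytes)": "available_kb",
}


def parse_ocrcheck(text):
    """Extract sizes, location and the integrity verdict from ocrcheck output."""
    info = {}
    for label, key in LABELS.items():
        m = re.search(re.escape(label) + r"\s*:\s*(\d+)", text)
        if m:
            info[key] = int(m.group(1))
    m = re.search(r"Device/File Name\s*:\s*(\S+)", text)
    info["location"] = m.group(1) if m else None
    info["integrity_ok"] = "Cluster registry integrity check succeeded" in text
    return info


SAMPLE = """\
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1380
         Available space (kbytes) :     408188
         ID                       : 2132096904
         Device/File Name         :    +DG_SYS
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded
"""

if __name__ == "__main__":
    report = parse_ocrcheck(SAMPLE)
    print(report)
    print("sizes consistent:",
          report["used_kb"] + report["available_kb"] == report["total_kb"])
```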

[grid@xifenfei ~]$ sqlplus / as sysasm

SQL*Plus: Release 12.1.0.1.0 Production on Mon Sep 9 16:12:21 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter spfile; 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DG_SYS/xff-cluster/ASMPARAMET
                                                 ERFILE/registry.253.825640465
SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
[grid@xifenfei ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   60a037da30714f6bbfe5d90206ff27a7 (/dev/sdc2) [DG_SYS]
Located 1 voting disk(s).
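
One quick cross-check is that the File Universal Id reported here is the same one printed by crsctl replace votedisk earlier. The sketch below parses the `crsctl query css votedisk` rows into fields. The regular expression is an assumption written against the single row shown in this post.

```python
import re

VOTE_LINE = re.compile(
    r"^\s*(?P<index>\d+)\.\s+(?P<state>\S+)\s+(?P<fuid>[0-9a-fA-F]{32})"
    r"\s+\((?P<path>[^)]+)\)\s+\[(?P<diskgroup>[^\]]+)\]",
    re.MULTILINE,
)


def parse_votedisks(text):
    """Return one dict per voting file row; header and rule lines are skipped."""
    return [m.groupdict() for m in VOTE_LINE.finditer(text)]


SAMPLE = """\
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   60a037da30714f6bbfe5d90206ff27a7 (/dev/sdc2) [DG_SYS]
Located 1 voting disk(s).
"""

# FUID printed by "crsctl replace votedisk +DG_SYS" earlier in this post
REPLACED_FUID = "60a037da30714f6bbfe5d90206ff27a7"

if __name__ == "__main__":
    for disk in parse_votedisks(SAMPLE):
        print(disk, "matches replace output:", disk["fuid"] == REPLACED_FUID)
```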

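This completes the recovery for the case where the OCR disk group is damaged but an OCR backup exists. If there is no OCR backup, the only option is to rebuild the OCR. The rough steps are: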

--deconfigure(root)
remote node
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose

lastnode
# <$GRID_HOME>/crs/install/rootcrs.pl -deconfig -force -verbose -lastnode

--reconfigure to rebuild the OCR etc. (grid)
# $GRID_HOME/crs/config/config.sh
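
For reference, the with-backup procedure walked through in this post can be condensed into an ordered checklist. The sketch below only builds and prints the command strings used above and executes nothing. The function name and its parameters are hypothetical, and the commands should be reviewed against your own environment before being run by hand.

```python
def recovery_steps(diskgroup, disk, backup_file, pfile, compatible_asm="12.1.0"):
    """Ordered (who, command) pairs mirroring the steps shown in this post."""
    return [
        ("root", "crsctl stop crs -f"),
        ("root", "crsctl start crs -excl -nocrs"),
        ("grid/sqlplus",
         "create diskgroup {dg} external redundancy disk '{disk}' "
         "attribute 'COMPATIBLE.ASM' = '{compat}';".format(
             dg=diskgroup, disk=disk, compat=compatible_asm)),
        ("root", "ocrconfig -restore {0}".format(backup_file)),
        ("root", "crsctl replace votedisk +{0}".format(diskgroup)),
        ("grid/sqlplus",
         "create spfile='+{dg}' FROM pfile='{pfile}';".format(
             dg=diskgroup, pfile=pfile)),
        ("root", "crsctl stop crs -f"),
        ("root", "crsctl start crs"),
    ]


if __name__ == "__main__":
    steps = recovery_steps("DG_SYS", "/dev/sdc2",
                           "backup_20130908_234857.ocr", "/tmp/asm.txt")
    for number, (who, command) in enumerate(steps, start=1):
        print("{0}. [{1}] {2}".format(number, who, command))
```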