CLSRSC-400: A system reboot is required to continue installing.

Contact: phone/WeChat (+86 17813235971), QQ (107644445)

Title: CLSRSC-400: A system reboot is required to continue installing.

Author: 惜分飞 © All rights reserved [Reproduction in any form without my consent is prohibited; I reserve the right to pursue legal liability.]

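In an earlier post I described the CLSRSC-400 error that root.sh may report when installing Oracle RAC 12.2.0.1 on Linux 7.3. MOS has since published a patch (25078431) and a corresponding procedure, summarized here for reference.
Method given in 12.2.0.1 Cluster Installation Failure – ACFS-9459 With RH 7.3 (Doc ID 2273119.1):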
[Image: 25078431-1]


Method given in ALERT: root.sh Fails With "CLSRSC-400" While Installing GI 12.2.0.1 on RHEL or OL with RedHat Compatible Kernel (RHCK) 7.3 (Doc ID 2284463.1):
[Image: applyOneOffs]

Manually changing the operating system time on 12.1 causes a database failure

Title: Manually changing the operating system time on 12.1 causes a database failure

A customer running a 12.1.0.1 RAC database had an instance restart unexpectedly and asked for help finding the cause.
The database alert log reported ORA-15064:

Mon Apr 15 15:06:26 2019
WARNING: inbound connection timed out (ORA-3136)
Mon Apr 15 15:41:26 2019
NOTE: ASMB terminating
Mon Apr 15 15:41:26 2019
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_asmb_61426.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID: 
Session ID: 1892 Serial number: 29
Mon Apr 15 15:41:26 2019
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_asmb_61426.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
Process ID: 
Session ID: 1892 Serial number: 29
Mon Apr 15 15:41:26 2019
System state dump requested by (instance=1, osid=61426 (ASMB)), summary=[abnormal instance termination].
Mon Apr 15 15:41:26 2019
USER (ospid: 61426): terminating the instance due to error 15064
Mon Apr 15 15:41:26 2019
System State dumped to trace file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_61287.trc
Mon Apr 15 15:41:27 2019
opiodr aborting process unknown ospid (1171) as a result of ORA-1092
Mon Apr 15 15:41:27 2019
ORA-1092 : opitsk aborting process

The log shows that the ASMB process failed, so the database instance could no longer access ASM and crashed.

Analysis of the ASM alert log:

Mon Apr 15 15:41:26 2019
WARNING: client [+ASM1:+ASM] not responsive for 2069s; state=0x1. pid 23155
NOTE: umbilicus traces dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_gen0_23050.trc
WARNING: client [orcl1:orcl] not responsive for 2069s; state=0x1. killing pid 61436
NOTE: umbilicus traces dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_gen0_23050.trc
WARNING: fencing client [orcl1:orcl] after 2069 seconds (mbr 2)
WARNING: client [-MGMTDB:_mgmtdb] not responsive for 2070s; state=0x1. killing pid 24026
NOTE: umbilicus traces dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_gen0_23050.trc
WARNING: fencing client [-MGMTDB:_mgmtdb] after 2070 seconds (mbr 1)
Mon Apr 15 15:41:26 2019
NOTE: cleaned up ASM client -MGMTDB:_mgmtdb
NOTE: cleaned up ASM client orcl1:orcl
Mon Apr 15 15:41:43 2019
NOTE: Standard client -MGMTDB:_mgmtdb registered, osid 183707, mbr 0x1 (reg:1371965153)
Mon Apr 15 15:42:16 2019
NOTE: Standard client orcl1:orcl registered, osid 184063, mbr 0x2 (reg:2088418628)
Mon Apr 15 15:44:30 2019
Warning: VKTM detected a time drift.
Time drifts can result in an unexpected behavior such as time-outs. Please check trace file for more details.

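The ASM log closely matches the MOS note GEN0 terminating the ASM instance due to error 15082 (Doc ID 2096988.1). The customer reported that they had changed the system time using NTP, so the cause is almost certainly Oracle Bug 19032250 (fixed in 12.1.0.2), triggered when NTP adjusts the time by too large a step. Changing the time manually can cause similar problems.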
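As a rough aid for this kind of analysis, the sketch below pulls the "not responsive for Ns" durations out of ASM alert-log lines and computes when the reported interval would have started. The function names are hypothetical, and the timestamp format is assumed to match the excerpt above. When every client reports nearly the same large duration at the same moment, as here, that is at least consistent with a clock step rather than with each client hanging independently.

```python
import re
from datetime import datetime, timedelta

# Matches ASM alert-log lines such as:
# WARNING: client [orcl1:orcl] not responsive for 2069s; state=0x1. killing pid 61436
UNRESPONSIVE = re.compile(
    r"client \[(?P<client>[^\]]+)\] not responsive for (?P<secs>\d+)s"
)


def unresponsive_clients(log_lines):
    """Return (client, seconds) for every 'not responsive' warning found."""
    found = []
    for line in log_lines:
        match = UNRESPONSIVE.search(line)
        if match:
            found.append((match.group("client"), int(match.group("secs"))))
    return found


def apparent_start(timestamp, seconds):
    """Subtract the reported duration from an alert-log timestamp.

    Assumes the 'Mon Apr 15 15:41:26 2019' format seen in the excerpt
    and an English (C) locale for day and month names.
    """
    logged = datetime.strptime(timestamp, "%a %b %d %H:%M:%S %Y")
    return logged - timedelta(seconds=seconds)


sample = [
    "WARNING: client [+ASM1:+ASM] not responsive for 2069s; state=0x1. pid 23155",
    "WARNING: client [orcl1:orcl] not responsive for 2069s; state=0x1. killing pid 61436",
    "WARNING: client [-MGMTDB:_mgmtdb] not responsive for 2070s; state=0x1. killing pid 24026",
]
clients = unresponsive_clients(sample)
start = apparent_start("Mon Apr 15 15:41:26 2019", 2069)
```

For the excerpt above, the computed start falls at 15:06:57 on the log's own clock, about 35 minutes before the instance was terminated.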

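Recommendations for changing the time on RAC
1. If the system clock is behind, shut down the database and the cluster, set the clock forward directly, then start the cluster and the database.
2. If the system clock is ahead, shut down the database and the cluster, wait until the real time has passed the (system clock) time at which the cluster and database were shut down, then set the clock back and start the cluster and the database.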
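The two cases above can be sketched as a small decision helper. This is an illustration only: `plan_time_correction` is a hypothetical name, it calls no Oracle or operating system tools, and it assumes the cluster is shut down at the moment `system_now` is read. When the clock is ahead, the wait returned is the offset itself, since that is how long it takes for real time to pass the shutdown timestamp.

```python
from datetime import datetime, timedelta


def plan_time_correction(system_now, true_now):
    """Return (direction, minimum_wait_after_shutdown).

    direction is "forward" when the system clock is behind, "backward"
    when it is ahead, and "none" when the two already agree.
    """
    offset = system_now - true_now
    if offset < timedelta(0):
        # Clock is behind: stop the database and cluster, set the clock
        # forward, then restart. No waiting is needed.
        return ("forward", timedelta(0))
    if offset > timedelta(0):
        # Clock is ahead: stop the database and cluster, wait until real
        # time has passed the shutdown timestamp, then set the clock back.
        return ("backward", offset)
    return ("none", timedelta(0))


slow = plan_time_correction(
    datetime(2019, 4, 15, 15, 0, 0), datetime(2019, 4, 15, 15, 30, 0)
)
fast = plan_time_correction(
    datetime(2019, 4, 15, 15, 30, 0), datetime(2019, 4, 15, 15, 0, 0)
)
```

In practice a small safety margin on top of the returned wait would be reasonable, so that no timestamp written before shutdown lies in the future once the cluster restarts.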