OCR/Voting disk maintenance operations

Database version

SQL>  select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Prod
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

OCR test (can be done online)

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw11
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

rac2-> more /etc/oracle/ocr.loc 
ocrconfig_loc=/dev/raw/raw11
local_only=false

-- Add an OCR mirror
[root@rac2 bin]# ./ocrconfig -replace ocrmirror /dev/raw/raw12

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw11
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw12
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded

rac2-> more /etc/oracle/ocr.loc 
#Device/file  getting replaced by device /dev/raw/raw12 
ocrconfig_loc=/dev/raw/raw11
ocrmirrorconfig_loc=/dev/raw/raw12
local_only=false
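
The ocr.loc file shown above is a plain key=value file, so the configured OCR locations can be read back with a few lines of code. The following is a minimal sketch, not an Oracle tool: the function names parse_ocr_loc and ocr_locations are hypothetical, and the sample text is copied from the output above.

```python
def parse_ocr_loc(text):
    """Parse ocr.loc style key=value text, skipping blank and '#' comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def ocr_locations(settings):
    """Return the primary OCR location first, then the mirror if one is configured."""
    keys = ("ocrconfig_loc", "ocrmirrorconfig_loc")
    return [settings[k] for k in keys if k in settings]


# Sample copied from the ocr.loc output after the mirror was added.
SAMPLE = """#Device/file  getting replaced by device /dev/raw/raw12
ocrconfig_loc=/dev/raw/raw11
ocrmirrorconfig_loc=/dev/raw/raw12
local_only=false
"""

if __name__ == "__main__":
    print(ocr_locations(parse_ocr_loc(SAMPLE)))
```

A single entry in the returned list means no mirror is configured, which corresponds to the "Device/File not configured" line in the ocrcheck output.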

-- Remove the primary OCR (the mirror becomes the OCR location)
[root@rac2 bin]# ./ocrconfig -replace ocr

rac2-> more /etc/oracle/ocr.loc 
#Device/file /dev/raw/raw11 being deleted 
ocrconfig_loc=/dev/raw/raw12
local_only=false

rac2-> ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     160396
         Used space (kbytes)      :       4376
         Available space (kbytes) :     156020
         ID                       : 1302494786
         Device/File Name         : /dev/raw/raw12
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

-- Addendum: remove the OCR mirror
[root@rac2 bin]# ./ocrconfig -replace ocrmirror

Voting disk test (offline in 10g, online in 11g)

-- Stop CRS
[root@rac2 bin]# ./crsctl stop crs
[root@rac1 bin]# ./crsctl stop crs

-- Query the voting disks
rac2-> crsctl query css votedisk
 0.     0    /dev/raw/raw31

-- Add voting disks
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw23 -force
Now formatting voting disk: /dev/raw/raw23
successful addition of votedisk /dev/raw/raw23.
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw33 -force
Now formatting voting disk: /dev/raw/raw33
successful addition of votedisk /dev/raw/raw33.
[root@rac2 bin]# ./crsctl add css votedisk /dev/raw/raw32 -force
Now formatting voting disk: /dev/raw/raw32
successful addition of votedisk /dev/raw/raw32.

rac2-> crsctl query css votedisk
 0.     0    /dev/raw/raw31
 1.     0    /dev/raw/raw23
 2.     0    /dev/raw/raw33
 3.     0    /dev/raw/raw32

located 4 votedisk(s).

-- Delete a voting disk
[root@rac2 bin]# ./crsctl delete css votedisk /dev/raw/raw33 -force
successful deletion of votedisk /dev/raw/raw33.
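
Deleting one disk brings the count from four back to three. A node has to reach more than half of the voting disks to stay in the cluster, so an even count tolerates no more failures than the odd count below it. The sketch below only illustrates that arithmetic; the function name tolerated_failures is hypothetical.

```python
def tolerated_failures(voting_disks: int) -> int:
    """How many voting disks can be lost while a strict majority stays reachable."""
    if voting_disks < 1:
        raise ValueError("at least one voting disk is required")
    return (voting_disks - 1) // 2


if __name__ == "__main__":
    for count in (1, 3, 4, 5):
        print(count, "voting disk(s) tolerate", tolerated_failures(count), "failure(s)")
```

With four disks the cluster still tolerates only one failure, the same as with three.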

-- Start CRS
[root@rac2 bin]# ./crsctl start crs
[root@rac1 bin]# ./crsctl start crs

Addendum: official procedure, MOS note [ID 428681.1]

Upgrading RAC 10g to 10.2.0.5

1. Back up the database
Usually an RMAN backup.

2. Back up the OCR and voting disk

[root@rac2 bin]# ./ocrconfig -export /tmp/ocr_export.bak
[root@rac2 bin]# more /etc/oracle/ocr.loc 
ocrconfig_loc=/dev/raw/raw11
local_only=FALSE
[root@rac2 bin]# dd if=/dev/raw/raw11 of=/tmp/ocr_dd.bak
[root@rac2 bin]# dd if=/dev/raw/raw31 of=/tmp/vote_dd.bak

3.Update Oracle Time Zone Definitions
Actions for the DSTv4 update in the 10.2.0.5 patchset [ID 1086400.1]

4.Stopping All Processes
For a rolling upgrade, stop all processes on one node; for a non-rolling upgrade, stop all processes on every node.

$ isqlplusctl stop
$ emctl stop dbconsole
$ srvctl stop service -d db_name [-s service_name_list [-i inst_name]]
$ srvctl stop instance -d db_name -i inst_name
$ srvctl stop asm -n node
$ srvctl stop listener -n node [-l listenername]
$ srvctl stop nodeapps -n node
# CRS_home/bin/crsctl stop crs (run as root; not required for a rolling upgrade)

5.Back Up the System
Back up the files under $ORACLE_BASE, mainly the database and CRS installation files and the oraInventory files.

6. Upgrade the CRS software
Run ./runInstaller and select the CRS home.

Then run the following commands:
# CRS_home/bin/crsctl stop crs
# CRS_home/install/root102.sh

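7. Upgrade the database software
Stop all CRS and database processes (same as step 4).
Run ./runInstaller and select the database home.

Then run the following command: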
# ORACLE_HOME/root.sh

8. Upgrade the database
8.1) Check that the database meets the upgrade prerequisites and correct anything that does not comply
How to Download and Run Oracle’s Database Pre-Upgrade Utility [ID 884522.1]

SQL> STARTUP UPGRADE
SQL> SPOOL upgrade_info.log 
SQL> @?/rdbms/admin/utlu102i.sql
SQL> SPOOL OFF
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=spfile;
-- Adjust anything else flagged in upgrade_info.log
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP UPGRADE

8.2) Start the listener
srvctl start listener -n node

8.3) Upgrade the database

SQL> SPOOL patch.log
SQL> @?/rdbms/admin/catupgrd.sql
-- Check patch.log; if it contains errors, find the cause and rerun catupgrd.sql
SQL> SPOOL OFF
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
SQL> @?/rdbms/admin/utlrp.sql
SQL> ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=spfile;
-- Apply any other parameter changes here as well
SQL> SHUTDOWN IMMEDIATE
-- Use the RAC management commands (for example srvctl) to start the required resources

9. Fix the permissions on the relevant directories
# ORACLE_HOME/install/changePerm.sh

For the detailed steps, read README.html.

Why a temp segment can fail to extend in a permanent tablespace

Database version

SQL> select * from v$version;

BANNER
-----------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

Create a 5 MB test tablespace

SQL> CREATE TABLESPACE T_1652 DATAFILE '/tmp/t_1652_01.dbf' size 5M 
  2  AUTOEXTEND OFF LOGGING PERMANENT EXTENT MANAGEMENT LOCAL AUTOALLOCATE 
  3  SEGMENT SPACE MANAGEMENT AUTO blocksize 8192;

Tablespace created.

Test CTAS

SQL> create table CHF.T_XIFENFEI TABLESPACE T_1652 as
  2  select LPAD('XIFENFEI',1024,'F') "C_XFF" from  dual connect by level <=3500;
create table CHF.T_XIFENFEI TABLESPACE T_1652 as
*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace T_1652


SQL> create table CHF.T_XIFENFEI TABLESPACE T_1652 as
  2  select LPAD('XIFENFEI',1024,'F') "C_XFF" from  dual connect by level <=3000;

Table created.

Test CREATE INDEX

SQL> create index chf.i_xifenfei on chf.t_xifenfei(c_xff)
  2  tablespace t_1652;
create index chf.i_xifenfei on chf.t_xifenfei(c_xff)
                                   *
ERROR at line 1:
ORA-01658: unable to create INITIAL extent for segment in tablespace T_1652


SQL> Select MAX(d.bytes) total_bytes,
  2         nvl(SUM(f.Bytes), 0) free_bytes,
  3         d.file_name,
  4         MAX(d.bytes) - nvl(SUM(f.bytes), 0) used_bytes
  5  from   DBA_FREE_SPACE f , DBA_DATA_FILES d
  6  where  f.tablespace_name(+) = d.tablespace_name
  7  and    f.file_id(+) = d.file_id
  8  and    d.tablespace_name = 'T_1652'
  9  group by d.file_name;

TOTAL_BYTES FREE_BYTES FILE_NAME                 USED_BYTES  
----------- ---------- ------------------------- ---------- 
    5242880          0 /tmp/t_1652_01.dbf           5242880

SQL> drop table chf.t_xifenfei purge;

Table dropped.

SQL> create table CHF.T_XIFENFEI TABLESPACE T_1652 as
  2  select LPAD('XIFENFEI',1024,'F') "C_XFF" from  dual connect by level <=2000;

Table created.

SQL> Select MAX(d.bytes) total_bytes,
  2         nvl(SUM(f.Bytes), 0) free_bytes,
  3         d.file_name,
  4         MAX(d.bytes) - nvl(SUM(f.bytes), 0) used_bytes
  5  from   DBA_FREE_SPACE f , DBA_DATA_FILES d
  6  where  f.tablespace_name(+) = d.tablespace_name
  7  and    f.file_id(+) = d.file_id
  8  and    d.tablespace_name = 'T_1652'
  9  group by d.file_name;


TOTAL_BYTES FREE_BYTES FILE_NAME                 USED_BYTES   
----------- ---------- ------------------------- ---------- 
    5242880    1048576 /tmp/t_1652_01.dbf           4194304  

SQL> create index chf.i_xifenfei on chf.t_xifenfei(c_xff)
  2  tablespace t_1652;
create index chf.i_xifenfei on chf.t_xifenfei(c_xff)
                                   *
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace T_1652

SQL> ALTER DATABASE DATAFILE '/tmp/t_1652_01.dbf' RESIZE 10M;

Database altered.

SQL> create index chf.i_xifenfei on chf.t_xifenfei(c_xff)
  2  tablespace t_1652;

Index created.

Test MOVE

SQL> drop table chf.t_xifenfei purge;

Table dropped.

SQL> create table CHF.T_XIFENFEI TABLESPACE T_1652 as
  2  select LPAD('XIFENFEI',1024,'F') "C_XFF" from  dual connect by level <=3500;

Table created.

SQL> alter table chf.t_xifenfei move;
alter table chf.t_xifenfei move
*
ERROR at line 1:
ORA-01652: unable to extend temp segment by 128 in tablespace T_1652


SQL> Select MAX(d.bytes) total_bytes,
  2         nvl(SUM(f.Bytes), 0) free_bytes,
  3         d.file_name,
  4         MAX(d.bytes) - nvl(SUM(f.bytes), 0) used_bytes
  5  from   DBA_FREE_SPACE f , DBA_DATA_FILES d
  6  where  f.tablespace_name(+) = d.tablespace_name
  7  and    f.file_id(+) = d.file_id
  8  and    d.tablespace_name = 'T_1652'
  9  group by d.file_name;

TOTAL_BYTES FREE_BYTES FILE_NAME                 USED_BYTES  
----------- ---------- ------------------------- ---------- 
   10485760    4194304 /tmp/t_1652_01.dbf           6291456  

SQL>  ALTER DATABASE DATAFILE '/tmp/t_1652_01.dbf' RESIZE 15M;

Database altered.

SQL> alter table chf.t_xifenfei move;

Table altered.

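CTAS, CREATE INDEX, and MOVE have one thing in common: each has to build a fairly large segment in one operation, yet the database grows that segment extent by extent instead of allocating its full size up front.
In other words, Oracle first treats such a segment as a temporary segment and converts it into a permanent segment only after the operation completes. If the permanent tablespace runs out of space while this temporary segment (which lives in the permanent tablespace) is still being extended, Oracle raises ORA-01652 "unable to extend temp segment" and names a permanent tablespace.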
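
The "by 128" in the ORA-01652 message is a number of database blocks, so the size of the extent that could not be allocated follows from the block size. The sketch below does that conversion for the message seen in the tests; the function name temp_segment_request_bytes is hypothetical, and the 8192-byte block size is taken from the CREATE TABLESPACE statement above.

```python
import re


def temp_segment_request_bytes(message: str, block_size: int) -> int:
    """Convert the block count in an ORA-01652 message into bytes."""
    match = re.search(
        r"unable to extend temp segment by (\d+) in tablespace (\S+)", message
    )
    if match is None:
        raise ValueError("not an ORA-01652 message")
    return int(match.group(1)) * block_size


if __name__ == "__main__":
    msg = "ORA-01652: unable to extend temp segment by 128 in tablespace T_1652"
    # 128 blocks * 8192 bytes per block
    print(temp_segment_request_bytes(msg, 8192))  # 1048576
```

Each failed request was therefore for a 1 MB extent, which is a sizeable share of a 5 MB test tablespace.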

Handling ORA-00600 [kcbz_check_objd_typ] when bringing a tablespace online

ORA-00600 [kcbz_check_objd_typ] raised while bringing the tablespace online

Fri Mar 30 14:09:24 2012
alter tablespace xff offline
Fri Mar 30 14:09:28 2012
Completed: alter tablespace xff offline
Fri Mar 30 17:49:59 2012
alter tablespace xff rename datafile '/oradataa/xifenfei.dbf' to '/oradatab/xifenfei_bak.dbf'
Fri Mar 30 17:50:03 2012
Completed: alter tablespace xff rename datafile '/oradataa/xifenfei.dbf' to '/oradatab/xifenfei_bak.dbf'
Fri Mar 30 17:50:03 2012
alter tablespace coweb_bak_new online
Fri Mar 30 17:53:08 2012
Errors in file /oracle/ora10/admin/ora10g/udump/ora10g_ora_21275.trc:
ORA-00600: internal error code, arguments: [kcbz_check_objd_typ], [0], [0], [1], [], [], [], []

Analyzing the trace file

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/ora10/product
System name:	Linux
Node name:	FH-DB-01
Release:	2.6.18-92.el5
Version:	#1 SMP Tue Apr 29 13:16:15 EDT 2008
Machine:	x86_64
Instance name: ora10g
Redo thread mounted by this instance: 1
Oracle process number: 182
Unix process pid: 21275, image: oracle@FH-DB-01 (TNS V1-V3)

*** 2012-03-30 17:50:16.469
*** ACTION NAME:() 2012-03-30 17:50:15.939
*** MODULE NAME:(sqlplus@FH-DB-01 (TNS V1-V3)) 2012-03-30 17:50:15.939
*** SERVICE NAME:(SYS$USERS) 2012-03-30 17:50:15.939
*** SESSION ID:(921.23041) 2012-03-30 17:50:15.939
*** SESSION ID:(921.23041) 2012-03-30 17:50:15.939
OBJD MISMATCH typ=6, seg.obj=-2, diskobj=294051, dsflg=0, dsobj=294044, tid=294044, cls=1
Input data (nil), 0, 0
Formatted dump of block:
buffer tsn: 10 rdba: 0x0435c094 (1024/70631572)
scn: 0x0b2d.2b4f8874 seq: 0x01 flg: 0x04 tail: 0x88740601
frmt: 0x02 chkval: 0x5626 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x000000006DBB2000 to 0x000000006DBB4000
06DBB2000 0000A206 0435C094 2B4F8874 04010B2D  [......5.t.O+-...]
06DBB2010 00005626 00000002 00047CA3 2B4F8874  [&V.......|..t.O+]
06DBB2020 00000B2D 00320002 0435C091 00000000  [-.....2...5.....]
06DBB2030 00000000 00000000 00000000 00000000  [................]
        Repeat 2 times
06DBB2060 00000000 02800000 00000000 00240000  [..............$.]
06DBB2070 1F3C1F60 00000000 00000000 00000000  [`.<.............]
06DBB2080 00000000 00001F60 00001F5C 00000000  [....`...\.......]
06DBB2090 00000000 00000000 00000000 00000000  [................]
        Repeat 501 times
06DBB3FF0 00000000 00000000 00000000 88740601  [..............t.]
Block header dump:  0x0435c094
 Object id on Block? Y
 seg/obj: 0x47ca3  csc: 0xb2d.2b4f8874  itc: 2  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x435c091 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000

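Several conclusions can be drawn here:
1) Version and platform: Oracle 10.2.0.4.0 64-bit on Linux x86_64 (kernel 2.6.18-92.el5)
2) The information below shows that the object with data_object_id=294044 is abnormal (seg.obj=-2)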

OBJD MISMATCH typ=6, seg.obj=-2, diskobj=294051, dsflg=0, dsobj=294044, tid=294044, cls=1

Block header dump:  0x0435c094
 Object id on Block? Y
 seg/obj: 0x47ca3  csc: 0xb2d.2b4f8874  itc: 2  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x435c091 ver: 0x01 opc: 0
     inc: 0  exflg: 0

3) Looking the object up in dba_objects shows that the index xifenfei_index is the one affected
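
The two trace lines quoted above can be cross-checked mechanically: the seg/obj value in the block header is hexadecimal, while the OBJD MISMATCH line prints decimal ids. The sketch below parses both and compares them. The helper names are hypothetical, and reading diskobj as the id stored in the block and dsobj as the id the session expected is an assumption based on the field names, not on documented semantics.

```python
import re

# Both lines are copied from the trace file above.
MISMATCH_LINE = (
    "OBJD MISMATCH typ=6, seg.obj=-2, diskobj=294051, dsflg=0, "
    "dsobj=294044, tid=294044, cls=1"
)
HEADER_LINE = " seg/obj: 0x47ca3  csc: 0xb2d.2b4f8874  itc: 2  flg: E  typ: 2 - INDEX"


def parse_mismatch(line):
    """Return the name=value pairs of an OBJD MISMATCH line as integers."""
    return {key: int(value) for key, value in re.findall(r"([\w.]+)=(-?\d+)", line)}


def parse_seg_obj(line):
    """Return the seg/obj field of a block header dump line as a decimal integer."""
    match = re.search(r"seg/obj:\s*(0x[0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError("no seg/obj field found")
    return int(match.group(1), 16)


if __name__ == "__main__":
    fields = parse_mismatch(MISMATCH_LINE)
    on_block = parse_seg_obj(HEADER_LINE)
    print("seg/obj on block :", on_block)
    print("diskobj in trace :", fields["diskobj"])
    print("dsobj in trace   :", fields["dsobj"])
    print("block matches dsobj:", on_block == fields["dsobj"])
```

The hex value 0x47ca3 converts to 294051, the same number as diskobj, while dsobj and tid carry 294044, so the two ids that disagree are both visible in the trace.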

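Suspected cause
I could not find an authoritative explanation for ORA-00600 [kcbz_check_objd_typ] being raised when a tablespace is brought online. My guess is that during the online operation Oracle checks the data dictionary records for the objects in the tablespace against the data actually stored in it, finds a mismatch, and raises this error.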

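Workaround
Although the tablespace is offline, the index can still be dropped, which removes its records from the data dictionary. Bringing the tablespace online afterwards bypasses the comparison between the dictionary and the stored data. Once the tablespace is online, recreate the index.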

Cancelling a DML operation in an ASSM tablespace causes ORA-600 [ktspfupdst-1]

The error ORA-00600 [ktspfupdst-1] was found; the key part of the trace is as follows

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      zwq_crm1
Release:        3
Version:        5
Machine:        00C420B44C00
Instance name: crm1
Redo thread mounted by this instance: 1
Oracle process number: 389
Unix process pid: 1896900, image: oracle@zwq_crm1 (TNS V1-V3)

----------------------------------------------
*** 2012-03-31 02:50:48.509
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], []
ORA-00604: error occurred at recursive SQL level 1
ORA-01013: user requested cancel of current operation
Current SQL statement for this session:
INSERT INTO XIFENFEI 
SELECT * FROM T_XFF
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1029746FC ?
ksfdmp+0018          bl       01FD4014
kgerinv+00e8         bl       _ptrgl
kgeasnmierr+004c     bl       kgerinv              000000001 ? 000000000 ?
                                                   000000005 ? 000000001 ?
                                                   000000001 ?
ktspfupdst+0540      bl       kgeasnmierr          110006308 ? 1103994E8 ?
                                                   102A9239C ? 000000000 ?
                                                   000000005 ? 000000010 ?
                                                   000000020 ? 000000006 ?
ktspstchg+00e4       bl       ktspfupdst           000000060 ? 300000004 ?
                                                   FFFFFFFFFFF6E48 ?
                                                   50601CE000000ED ?
                                                   3B401B34C5D02F2A ?
                                                   B92000004000020 ?
kdoiur+062c          bl       ktspstchg            000000000 ? 700000C39D779E8 ?
                                                   000000000 ?
kcoubk+00e4          bl       _ptrgl
ktundo+0988          bl       kcoubk               1010CCD80 ? FFFFFFFFFFF76C0 ?
                                                   100ED51C0 ? FFFFFFFFFFF7150 ?
                                                   1101FAF78 ? 1102567C0 ?
                                                   700000C396A1300 ? 000000002 ?
ktubko+03bc          bl       ktundo               1840DFB30 ?
                                                   3B401B3400000002 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF85D8 ?
                                                   700000C80A1E880 ? 2FFFF8540 ?
                                                   FFFFFFFFFFF8780 ?
ktuabt+0638          bl       ktubko               DF000000DF ?
                                                   FFFFFFFFFFF8690 ? 000000000 ?
                                                   FFFFFFFFFFF85D8 ? 102973880 ?
                                                   700000C844FA418 ?
ktcrab+02b4          bl       ktuabt               700000C80A1E840 ? 200017CD8 ?
ktcrsp+026c          bl       ktcrab               100F698E4 ? 000000001 ?
ksures+0074          bl       ktcrsp               700000C844FA448 ?
opiexe+3380          bl       01FD4138
opiall0+102c         bl       opiexe               400000000 ? 110002A48 ?
                                                   FFFFFFFFFFFA0A0 ?
kpoal8+0a78          bl       opiall0              5EFFFFBED4 ? 22103A43F8 ?
                                                   FFFFFFFFFFFA5B8 ? 000000000 ?
                                                   FFFFFFFFFFFA508 ? 1103A4B00 ?
                                                   6FF00000738 ?
                                                   24000000007FFF ?
opiodr+08cc          bl       _ptrgl
ttcpip+0cc4          bl       _ptrgl
opitsk+0d60          bl       ttcpip               11000CF90 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
opiino+0758          bl       opitsk               000000000 ? 000000000 ?
opiodr+08cc          bl       _ptrgl
opidrv+032c          bl       opiodr               3C00000018 ? 4101FAF78 ?
                                                   FFFFFFFFFFFF7B0 ? 0A000F350 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A00E8B50 ?
                                                   FFFFFFFFFFFF7B0 ?
main+0138            bl       01FD3A28
__start+0098         bl       main                 000000000 ? 000000000 ?

--------------------- Binary Stack Dump ---------------------

The trace tells us the following:
1) The platform is AIX 5.3 and the database is 9.2.0.8 RAC
2) The error is probably related to an INSERT ... SELECT that was cancelled

Segment space management of the tablespace

SQL> SELECT tablespace_name, SEGMENT_SPACE_MANAGEMENT from dba_tablespaces   
  2  where tablespace_name=
  3  (select tablespace_name from dba_tables where table_name='XIFENFEI');

TABLESPACE_NAME                SEGMEN
------------------------------ ------
CUSTSERV                       AUTO

A search on MOS shows that this matches [ID 388599.1]
Cause:

1. An insert or update on a table causes the addition of a new extent 
   and the operation is cancelled. 

2. The segment uses Automatic Segment Space Management (ASSM). 

3. The call stack in the associated trace file resembles:
    ktspfupdst ktspstchg kdoiur kcoubk ktundo ktubko ktuabt ktcrab ktcrsp 
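
Condition 3 can be checked against a trace file mechanically. The sketch below pulls the function names out of the "location" column of a call stack and tests whether the sequence from the note appears as a contiguous run. The helper names are hypothetical, the excerpt is an abridged copy of the call stack shown earlier, and the regular expression assumes the name+offset layout of that trace.

```python
import re

# Function sequence quoted from the MOS note.
SIGNATURE = (
    "ktspfupdst ktspstchg kdoiur kcoubk ktundo ktubko ktuabt ktcrab ktcrsp".split()
)

# Abridged from the call stack in the trace above.
TRACE_EXCERPT = """\
ksedmp+0148          bl       ksedst               1029746FC ?
kgeasnmierr+004c     bl       kgerinv              000000001 ? 000000000 ?
                                                   000000005 ? 000000001 ?
ktspfupdst+0540      bl       kgeasnmierr          110006308 ? 1103994E8 ?
ktspstchg+00e4       bl       ktspfupdst           000000060 ? 300000004 ?
kdoiur+062c          bl       ktspstchg            000000000 ? 700000C39D779E8 ?
kcoubk+00e4          bl       _ptrgl
ktundo+0988          bl       kcoubk               1010CCD80 ? FFFFFFFFFFF76C0 ?
ktubko+03bc          bl       ktundo               1840DFB30 ?
ktuabt+0638          bl       ktubko               DF000000DF ?
ktcrab+02b4          bl       ktuabt               700000C80A1E840 ? 200017CD8 ?
ktcrsp+026c          bl       ktcrab               100F698E4 ? 000000001 ?
ksures+0074          bl       ktcrsp               700000C844FA448 ?
"""


def stack_functions(trace_text):
    """Return function names from lines that start with name+hexoffset."""
    names = []
    for line in trace_text.splitlines():
        match = re.match(r"([A-Za-z_]\w*)\+[0-9a-fA-F]+\s", line)
        if match:
            names.append(match.group(1))
    return names


def contains_signature(names, signature):
    """True if signature occurs in names as a contiguous subsequence."""
    size = len(signature)
    return any(
        names[i:i + size] == signature for i in range(len(names) - size + 1)
    )


if __name__ == "__main__":
    print(contains_signature(stack_functions(TRACE_EXCERPT), SIGNATURE))
```

Continuation lines that only carry argument values start with whitespace, so they are skipped and do not break the sequence.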

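Solution:
The official fix is to upgrade to a newer version. A workable compromise is to export or otherwise back up the object's data first, then truncate or rebuild the table and its indexes.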