Forcibly cancelling a DML operation on an ASSM tablespace causes ORA-600 [ktspfupdst-1]

The error ORA-00600 [ktspfupdst-1] was reported. The key part of the trace file is as follows:

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      zwq_crm1
Release:        3
Version:        5
Machine:        00C420B44C00
Instance name: crm1
Redo thread mounted by this instance: 1
Oracle process number: 389
Unix process pid: 1896900, image: oracle@zwq_crm1 (TNS V1-V3)

----------------------------------------------
*** 2012-03-31 02:50:48.509
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktspfupdst-1], [], [], [], [], [], [], []
ORA-00604: error occurred at recursive SQL level 1
ORA-01013: user requested cancel of current operation
Current SQL statement for this session:
INSERT INTO XIFENFEI 
SELECT * FROM T_XFF
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1029746FC ?
ksfdmp+0018          bl       01FD4014
kgerinv+00e8         bl       _ptrgl
kgeasnmierr+004c     bl       kgerinv              000000001 ? 000000000 ?
                                                   000000005 ? 000000001 ?
                                                   000000001 ?
ktspfupdst+0540      bl       kgeasnmierr          110006308 ? 1103994E8 ?
                                                   102A9239C ? 000000000 ?
                                                   000000005 ? 000000010 ?
                                                   000000020 ? 000000006 ?
ktspstchg+00e4       bl       ktspfupdst           000000060 ? 300000004 ?
                                                   FFFFFFFFFFF6E48 ?
                                                   50601CE000000ED ?
                                                   3B401B34C5D02F2A ?
                                                   B92000004000020 ?
kdoiur+062c          bl       ktspstchg            000000000 ? 700000C39D779E8 ?
                                                   000000000 ?
kcoubk+00e4          bl       _ptrgl
ktundo+0988          bl       kcoubk               1010CCD80 ? FFFFFFFFFFF76C0 ?
                                                   100ED51C0 ? FFFFFFFFFFF7150 ?
                                                   1101FAF78 ? 1102567C0 ?
                                                   700000C396A1300 ? 000000002 ?
ktubko+03bc          bl       ktundo               1840DFB30 ?
                                                   3B401B3400000002 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF85D8 ?
                                                   700000C80A1E880 ? 2FFFF8540 ?
                                                   FFFFFFFFFFF8780 ?
ktuabt+0638          bl       ktubko               DF000000DF ?
                                                   FFFFFFFFFFF8690 ? 000000000 ?
                                                   FFFFFFFFFFF85D8 ? 102973880 ?
                                                   700000C844FA418 ?
ktcrab+02b4          bl       ktuabt               700000C80A1E840 ? 200017CD8 ?
ktcrsp+026c          bl       ktcrab               100F698E4 ? 000000001 ?
ksures+0074          bl       ktcrsp               700000C844FA448 ?
opiexe+3380          bl       01FD4138
opiall0+102c         bl       opiexe               400000000 ? 110002A48 ?
                                                   FFFFFFFFFFFA0A0 ?
kpoal8+0a78          bl       opiall0              5EFFFFBED4 ? 22103A43F8 ?
                                                   FFFFFFFFFFFA5B8 ? 000000000 ?
                                                   FFFFFFFFFFFA508 ? 1103A4B00 ?
                                                   6FF00000738 ?
                                                   24000000007FFF ?
opiodr+08cc          bl       _ptrgl
ttcpip+0cc4          bl       _ptrgl
opitsk+0d60          bl       ttcpip               11000CF90 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
opiino+0758          bl       opitsk               000000000 ? 000000000 ?
opiodr+08cc          bl       _ptrgl
opidrv+032c          bl       opiodr               3C00000018 ? 4101FAF78 ?
                                                   FFFFFFFFFFFF7B0 ? 0A000F350 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A00E8B50 ?
                                                   FFFFFFFFFFFF7B0 ?
main+0138            bl       01FD3A28
__start+0098         bl       main                 000000000 ? 000000000 ?

--------------------- Binary Stack Dump ---------------------

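The trace gives us the following information:
1) The platform is AIX 5.3 and the database version is 9.2.0.8 RAC.
2) The error is probably related to an INSERT ... SELECT statement that was then cancelled (ORA-01013 appears in the error stack).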

Segment space management of the tablespace

SQL> SELECT tablespace_name, SEGMENT_SPACE_MANAGEMENT from dba_tablespaces   
  2  where tablespace_name=
  3  (select tablespace_name from dba_tables where table_name='XIFENFEI');

TABLESPACE_NAME                SEGMEN
------------------------------ ------
CUSTSERV                       AUTO
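
The scalar subquery above fails with ORA-01427 if tables named XIFENFEI exist in more than one schema. A minimal sketch of an equivalent join, using only the two dictionary views already queried above, which also shows the owner of each match:

```sql
-- Join form of the check above; returns one row per owner of XIFENFEI.
SELECT t.owner,
       t.table_name,
       ts.tablespace_name,
       ts.segment_space_management
  FROM dba_tables t
  JOIN dba_tablespaces ts
    ON ts.tablespace_name = t.tablespace_name
 WHERE t.table_name = 'XIFENFEI';
```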

A search on MOS shows that the problem matches [ID 388599.1].
Cause of the error:

1. An insert or update on a table causes the addition of a new extent 
   and the operation is cancelled. 

2. The segment uses Automatic Segment Space Management (ASSM). 

3. The call stack in the associated trace file resembles:
    ktspfupdst ktspstchg kdoiur kcoubk ktundo ktubko ktuabt ktcrab ktcrsp 

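Solution:
The official fix is to upgrade to a newer version. A compromise is also possible: first export or otherwise back up the data in the affected object, then truncate or rebuild the table and its indexes.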
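
The compromise described above could look roughly like the following. This is only a sketch: XIFENFEI is the table from the trace, XIFENFEI_BAK is a hypothetical name for the backup copy, and it assumes the table can still be read in full and that enough free space exists for the copy.

```sql
-- 1. Save the data (XIFENFEI_BAK is a hypothetical backup table name).
CREATE TABLE xifenfei_bak AS SELECT * FROM xifenfei;

-- 2. Compare row counts before touching the original table.
SELECT (SELECT COUNT(*) FROM xifenfei)     AS original_rows,
       (SELECT COUNT(*) FROM xifenfei_bak) AS backup_rows
  FROM dual;

-- 3. Truncate the table so its space is released, then reload it.
TRUNCATE TABLE xifenfei;
INSERT INTO xifenfei SELECT * FROM xifenfei_bak;
COMMIT;

-- 4. List the indexes that may need rebuilding
--    (ALTER INDEX <index_name> REBUILD; for each one).
SELECT index_name FROM user_indexes WHERE table_name = 'XIFENFEI';
```

Drop the backup table only after the reloaded table has been checked.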

DB2 Basic Operations, Part 2

Check the DB2 version
db2 => select * from sysibm.sysversions
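
As a possible complement to the query above, the SYSIBMADM.ENV_INST_INFO administrative view also reports the service level and fix pack of the instance. A sketch, assuming the CLP session is already connected to a database and that the administrative views are available on this DB2 release:

```sql
-- Run at the same db2 => prompt; kept on one line for CLP interactive mode.
SELECT inst_name, service_level, fixpack_num FROM sysibmadm.env_inst_info
```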

List all instances
[db2inst1@xifenfei ~]$ db2ilist
db2inst1

Show the current instance
[db2inst1@xifenfei ~]$ db2 get instance

 The current database manager instance is:  db2inst1

View the instance (database manager) configuration
[db2inst1@xifenfei ~]$ db2 get dbm cfg|more

View the database configuration parameters
[db2inst1@xifenfei ~]$ db2 get db cfg for TOOLSDB|more


List detailed information for all tablespaces
[db2inst1@xifenfei ~]$ db2 list tablespaces show detail|more

Connect to a database
[db2inst1@xifenfei ~]$ db2 connect to TOOLSDB

   Database Connection Information

 Database server        = DB2/LINUX 9.7.4
 SQL authorization ID   = DB2INST1
 Local database alias   = TOOLSDB

Run SQL against the database
[db2inst1@xifenfei ~]$ db2 "select * from t_xff"

Check the port number (SVCENAME)
[db2inst1@xifenfei ~]$ db2 get dbm cfg|grep SVCENAME

View the structure of a table
[db2inst1@xifenfei ~]$ db2 describe table t_xifenfei

View the indexes of a table
[db2inst1@xifenfei ~]$ db2 describe indexes for table t_xff

Show the currently active databases
[db2inst1@xifenfei ~]$ db2 list active databases

List all system tables
[db2inst1@xifenfei ~]$ db2 list tables for system

List the tablespaces
[db2inst1@xifenfei ~]$ db2 list tablespaces


Show the current user's database access authorizations
[db2inst1@xifenfei ~]$ db2 GET AUTHORIZATIONS

Check the DB2 database manager configuration
[db2inst1@xifenfei ~]$ db2 get dbm cfg

Getting a SQL statement's execution plan by hash_value

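When we have no privilege to access the business tables but need to see the execution plans of some SQL statements in the shared pool, querying v$sql_plan by hash_value works in principle. The result comes back as raw rows, though, which is hard to read and looks different from the execution plans we are used to. The two scripts below display the plan in the familiar format in this situation.
Oracle 9i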

[oracle@xifenfei ~]$ more get_plan.sql 
set pagesize 0
set linesize 150
set serveroutput on size 10000
col plan_table_output format a125
undefine hash_value
set verify off feedback off
var hash_value varchar2(20)
begin
  :hash_value := '&hash_value';
end;
/
insert into plan_table
      (statement_id,timestamp,operation,options,object_node,object_owner,object_name,
       optimizer,search_columns,id,parent_id,position,cost,cardinality,bytes,other_tag,
       partition_start,partition_stop,partition_id,other,distribution,
       cpu_cost,io_cost,temp_space,access_predicates,filter_predicates
      )
select distinct hash_value,sysdate,operation,options,object_node,object_owner,object_name,
       optimizer,search_columns,id,parent_id,position,cost,cardinality,bytes,other_tag,
       partition_start,partition_stop,partition_id,other,distribution,
       cpu_cost,io_cost,temp_space,access_predicates,filter_predicates
  from v$sql_plan
 where hash_value = :hash_value
/
col piece noprint
select distinct piece,sql_text from v$sqltext where hash_value = :hash_value order by piece
/
@?/rdbms/admin/utlxplp.sql
set linesize 80
set verify on feedback on pagesize 1000

Oracle 10g/11g

[oracle@xifenfei ~]$ more get_plan.sql 
set pagesize 0
set linesize 150
set serveroutput on size 10000
col plan_table_output format a125
undefine hash_value
set verify off feedback off
var hash_value varchar2(20)
begin
  :hash_value := '&hash_value';
end;
/
insert into plan_table
      (statement_id,timestamp,operation,options,object_node,object_owner,object_name,
       optimizer,search_columns,id,parent_id,position,cost,cardinality,bytes,other_tag,
       partition_start,partition_stop,partition_id,other,distribution,
       cpu_cost,io_cost,temp_space,access_predicates,filter_predicates,
       plan_id,OBJECT_ALIAS,DEPTH,PROJECTION,TIME,QBLOCK_NAME
      )
select distinct hash_value,sysdate,operation,options,object_node,object_owner,object_name,
       optimizer,search_columns,id,parent_id,position,cost,cardinality,bytes,other_tag,
       partition_start,partition_stop,partition_id,other,distribution,
       cpu_cost,io_cost,temp_space,access_predicates,filter_predicates,
       :hash_value,OBJECT_ALIAS,DEPTH,PROJECTION,TIME,QBLOCK_NAME
  from v$sql_plan
 where hash_value = :hash_value
/
col piece noprint
select distinct piece,sql_text from v$sqltext where hash_value = :hash_value order by piece
/
@?/rdbms/admin/utlxplp.sql
set linesize 80
set verify on feedback on pagesize 1000

Usage

SQL> SELECT hash_value FROM V$SQL WHERE SQL_TEXT 
  2  LIKE 'SELECT * FROM SYS.SMON_SCN_TIME';

HASH_VALUE
----------
3019898357

SQL> @get_plan.sql
Enter value for hash_value: 3019898357
SELECT * FROM SYS.SMON_SCN_TIME

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               |       |       |     3 (100)|          |
|   1 |  TABLE ACCESS FULL| SMON_SCN_TIME |     1 |  1163 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
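
Note that get_plan.sql inserts rows into plan_table and never removes them, so running it twice for the same hash_value leaves duplicate plan rows behind. A small cleanup sketch, assuming it is run in the same SQL*Plus session right after the script, while the :hash_value bind variable is still set:

```sql
-- Remove the rows that get_plan.sql inserted for this statement.
-- statement_id was filled with the hash_value by the script.
DELETE FROM plan_table WHERE statement_id = :hash_value;
```

The script does not commit, so the DELETE simply cancels its INSERT within the same transaction.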

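Additional notes
The difference between 9i and 10g/11g in producing the execution plan lies in the plan_table table.
In 9i, plan_table has to be created with a script and then granted to users: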

SQL> connect / as sysdba;
SQL> @?/rdbms/admin/utlxplan.sql;
SQL> create public synonym plan_table for plan_table; --create the synonym
SQL> grant all on plan_table to public; --grant access to all users

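In 10g/11g, plan_table ships with the database and does not need to be created. Because plan_table contains a plan_id column, and that column must not be null when the plan is displayed, the 10g/11g script above has to populate plan_id.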
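
On 10g and later there is also a different route that avoids plan_table altogether: DBMS_XPLAN.DISPLAY_CURSOR reads the plan straight from the cursor cache. This is not what the scripts above do; it is a sketch of an alternative, and it assumes the user has SELECT privilege on the V$ views that DISPLAY_CURSOR reads (v$sql and v$sql_plan among them).

```sql
-- Alternative for 10g/11g only: look up the cursor by hash_value and
-- let DBMS_XPLAN format the plan of every matching child cursor.
SELECT t.plan_table_output
  FROM v$sql s,
       TABLE(DBMS_XPLAN.DISPLAY_CURSOR(s.sql_id, s.child_number)) t
 WHERE s.hash_value = &hash_value;
```

DISPLAY_CURSOR does not exist in 9i, so the plan_table approach is still needed there.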

Turning a non-public dblink into a public one by modifying the base table (link$)

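Some users have created a non-public dblink, and now other users of the same database need to use it. Normally they have no access to it, so either a new dblink has to be created or the original one has to be recreated as public. However, the password of the account on the target side has been forgotten, so neither creating nor changing the dblink is possible. The problem is solved here by modifying the base table (link$).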

Create the dblink

SQL> show user;
USER is "SYS"
SQL> create database link "xff_dblink"
  2  connect to TEST
  3  identified by "test"
  4  using '11.1.1.1:1521/mcrm';

Database link created.

SQL> select * from dba_db_links where db_link like 'XFF_DBLINK%';

OWNER DB_LINK                                     USERN HOST               CREATED
----- ------------------------------------------- ----- ------------------ --------
SYS   XFF_DBLINK.REGRESS.RDBMS.DEV.US.ORACLE.COM  TEST  11.1.1.1:1521/mcrm 29-MAR-12

SQL> select sysdate from dual@xff_dblink;

SYSDATE
---------
29-MAR-12

SQL> CONN TEST/TEST
Connected.
SQL> SELECT SYSDATE FROM DUAL@XFF_DBLINK;
SELECT SYSDATE FROM DUAL@XFF_DBLINK
                         *
ERROR at line 1:
ORA-02019: connection description for remote database not found
--The dblink is not public, so user TEST cannot use it

Change the dblink to public

SQL> CONN / AS SYSDBA
Connected.
SQL> set long 1000
SQL> select  text from dba_views where view_name='DBA_DB_LINKS';

TEXT
-------------------------------------------------------------------
select u.name, l.name, l.userid, l.host, l.ctime
from sys.link$ l, sys.user$ u
where l.owner# = u.user#
--The base tables behind the dblink view are link$ and user$

SQL> desc sys.link$
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 OWNER#                        NOT NULL NUMBER
 NAME                          NOT NULL VARCHAR2(128)
 CTIME                         NOT NULL DATE
 HOST                                   VARCHAR2(2000)
 USERID                                 VARCHAR2(30)
 PASSWORD                               VARCHAR2(30)
 FLAG                                   NUMBER
 AUTHUSR                                VARCHAR2(30)
 AUTHPWD                                VARCHAR2(30)
 PASSWORDX                              RAW(128)
 AUTHPWDX                               RAW(128)

SQL> select owner# from sys.link$ where name like 'XFF_DBLINK%';

    OWNER#
----------
         0
--The owner of XFF_DBLINK is recorded in link$.owner#

SQL> SELECT USER#,NAME FROM USER$ WHERE name in ('SYS','PUBLIC');

     USER# NAME
---------- ------------------------------
         1 PUBLIC
         0 SYS
--link$.owner# is currently 0, which means the dblink is owned by SYS
--To make the dblink public, change link$.owner# to 1 (the user# of PUBLIC)

SQL> UPDATE LINK$ SET OWNER#=1 WHERE name like 'XFF_DBLINK%';

1 row updated.

SQL> COMMIT;

Commit complete.

--The shared pool must be flushed
SQL> ALTER SYSTEM FLUSH SHARED_POOL;

System altered.

--Check the owner of the dblink: it is now PUBLIC
SQL> select owner from dba_db_links where db_link like 'XFF_DBLINK%';

OWNER
----------
PUBLIC

--Test whether the dblink works
SQL> CONN TEST/TEST
Connected.
SQL> SELECT SYSDATE FROM DUAL@XFF_DBLINK;

SYSDATE
---------
29-MAR-12
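
Because the UPDATE above edits a SYS-owned base table directly, it may be worth saving the current rows and checking for a name clash first. A cautious sketch, run as SYSDBA before the UPDATE; LINK$_BAK is a hypothetical backup table name:

```sql
-- Keep a copy of the dictionary rows before changing them
-- (LINK$_BAK is a hypothetical name).
CREATE TABLE sys.link$_bak AS SELECT * FROM sys.link$;

-- Make sure PUBLIC (user# 1) does not already own a link with this name.
-- This should return no rows before the UPDATE is run.
SELECT name FROM sys.link$ WHERE owner# = 1 AND name LIKE 'XFF_DBLINK%';

-- To undo the change later, set the owner back to SYS (user# 0).
UPDATE sys.link$ SET owner# = 0 WHERE owner# = 1 AND name LIKE 'XFF_DBLINK%';
COMMIT;
ALTER SYSTEM FLUSH SHARED_POOL;
```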

ORA-27103 when Memory target parameter is set to more than 3 GB(11.1.0.7)

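After a friend upgraded the database software from 11.1.0.6 to 11.1.0.7, the database could not be opened, so the next step of the upgrade could not proceed.
Database startup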

SQL> startup upgrade
ORA-03113: end-of-file on communication channel

Alert log

Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =182
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 11.1.0.7.0.
Using parameter settings in server-side spfile /u01/app/oracle/product/11.1.0/db_1/dbs/spfilecenterdb.ora
System parameters with non-default values:
  processes                = 1500
  sessions                 = 1655
  memory_target            = 12864M
  control_files            = "/u01/app/oracle/oradata/centerdb/control01.ctl"
  control_files            = "/u01/app/oracle/oradata/centerdb/control02.ctl"
  control_files            = "/u01/app/oracle/oradata/centerdb/control03.ctl"
  db_block_size            = 8192
  compatible               = "11.1.0.0.0"
  db_recovery_file_dest    = "/u01/app/oracle/flash_recovery_area"
  db_recovery_file_dest_size= 2G
  undo_tablespace          = "UNDOTBS1"
  remote_login_passwordfile= "EXCLUSIVE"
  db_domain                = ""
  dispatchers              = "(PROTOCOL=TCP) (SERVICE=centerdbXDB)"
  audit_file_dest          = "/u01/app/oracle/admin/centerdb/adump"
  audit_trail              = "DB"
  db_name                  = "centerdb"
  open_cursors             = 300
  diagnostic_dest          = "/u01/app/oracle"
Thu Mar 29 15:47:06 2012
PMON started with pid=2, OS id=16324 
Thu Mar 29 15:47:06 2012
VKTM started with pid=3, OS id=16326 at elevated priority
VKTM running at (20)ms precision
Thu Mar 29 15:47:06 2012
DIAG started with pid=4, OS id=16330 
Thu Mar 29 15:47:06 2012
DBRM started with pid=5, OS id=16332 
Thu Mar 29 15:47:06 2012
PSP0 started with pid=6, OS id=16334 
Thu Mar 29 15:47:06 2012
DIA0 started with pid=7, OS id=16336 
Thu Mar 29 15:47:06 2012
MMAN started with pid=8, OS id=16338 
Thu Mar 29 15:47:06 2012
DBW0 started with pid=9, OS id=16340 
Thu Mar 29 15:47:06 2012
DBW1 started with pid=10, OS id=16342 
Thu Mar 29 15:47:06 2012
DBW2 started with pid=11, OS id=16344 
Thu Mar 29 15:47:06 2012
DBW3 started with pid=12, OS id=16346 
Thu Mar 29 15:47:06 2012
DBW4 started with pid=13, OS id=16348 
Thu Mar 29 15:47:06 2012
DBW5 started with pid=14, OS id=16350 
Thu Mar 29 15:47:06 2012
LGWR started with pid=15, OS id=16352 
Thu Mar 29 15:47:06 2012
CKPT started with pid=16, OS id=16354 
Thu Mar 29 15:47:06 2012
SMON started with pid=17, OS id=16356 
Thu Mar 29 15:47:06 2012
RECO started with pid=18, OS id=16358 
Thu Mar 29 15:47:06 2012
MMON started with pid=19, OS id=16360 
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Thu Mar 29 15:47:06 2012
MMNL started with pid=20, OS id=16362 
starting up 1 shared server(s) ...
Errors in file /u01/app/oracle/diag/rdbms/centerdb/centerdb/trace/centerdb_mman_16338.trc:
ORA-27103: internal error
Additional information: -1
Additional information: 1
MMAN (ospid: 16338): terminating the instance due to error 27103
Instance terminated by MMAN, pid = 16338

Here we can see that memory_target is set above 12 GB (12864M).

Trace file contents

[oracle@fcdb trace]$ more /u01/app/oracle/diag/rdbms/centerdb/centerdb/trace/centerdb_mman_16338.trc
Trace file /u01/app/oracle/diag/rdbms/centerdb/centerdb/trace/centerdb_mman_16338.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.1.0/db_1
System name:    Linux
Node name:      fcdb
Release:        2.6.18-164.el5
Version:        #1 SMP Tue Aug 18 15:51:48 EDT 2009
Machine:        x86_64
Instance name: centerdb
Redo thread mounted by this instance: 0 <none>
Oracle process number: 8
Unix process pid: 16338, image: oracle@fcdb (MMAN)


*** 2012-03-29 15:47:06.865
*** SESSION ID:(1648.1) 2012-03-29 15:47:06.865
*** CLIENT ID:() 2012-03-29 15:47:06.865
*** SERVICE NAME:() 2012-03-29 15:47:06.865
*** MODULE NAME:() 2012-03-29 15:47:06.865
*** ACTION NAME:() 2012-03-29 15:47:06.865
 
error 27103 detected in background process
ORA-27103: internal error
Additional information: -1
Additional information: 1

*** 2012-03-29 15:47:06.865
MMAN (ospid: 16338): terminating the instance due to error 27103

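Searching MOS with the contents of the alert log and trace file turns up "ORA-27103 when Memory target parameter is set to more than 3 GB" [ID 743012.1], which matches the symptoms. The cause is Bug:7272646.
Since my friend's database was still in the middle of the upgrade, my advice was to first set memory_target to 2.8G, finish the upgrade, and then apply Patch:7272646.
Oracle also offers another workaround: set SHMMAX below 4G. I do not recommend it, because on a system with a lot of memory the SGA would be split across multiple shared memory segments, which hurts performance.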
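
Since the instance cannot start with the current spfile, memory_target cannot be changed with ALTER SYSTEM. One way to lower it is through a temporary pfile. The following is a sketch run from SQL*Plus as SYSDBA against the idle instance; the path /tmp/initcenterdb.ora is a hypothetical scratch location and 2867M is just one value close to 2.8G.

```sql
-- Dump the current spfile to an editable text file
-- (/tmp/initcenterdb.ora is a hypothetical path).
CREATE PFILE='/tmp/initcenterdb.ora' FROM SPFILE;

-- Edit the pfile by hand so that it contains, for example:
--   *.memory_target=2867M
-- then start the instance with it and continue the upgrade.
STARTUP UPGRADE PFILE='/tmp/initcenterdb.ora'

-- After the upgrade and Patch:7272646 are in place, shut down,
-- restore the desired memory_target in the pfile, and rebuild the spfile.
CREATE SPFILE FROM PFILE='/tmp/initcenterdb.ora';
```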