在RAC中lsnrctl和srvctl操作监听区别

朋友今天询问了一个问题RAC中使用srvctl 操作监听和lsnrctl 操作监听结果不一样,下面我通过实验说明问题
0.listener.ora文件内容

LISTENER_RAC1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.11)(PORT = 1521)(IP = FIRST))
    )
  )

SID_LIST_LISTENER_RAC1 =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1)
      (PROGRAM = extproc)
    )
  )

1.srvctl 启动监听

rac1-> srvctl start listener -n rac1
rac1-> lsnrctl status

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 11-MAR-2012 22:09:34

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_RAC1
Version                   TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date                11-MAR-2012 22:07:21
Uptime                    0 days 0 hr. 2 min. 13 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/10.2.0/db_1/network/log/listener_rac1.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.21)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.11)(PORT=1521)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....XFF.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac2        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

srvctl操作监听,自动反馈到crs中

2.使用srvctl关闭监听

rac1-> srvctl stop listener -n rac1
rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....XFF.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac2        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    OFFLINE   OFFLINE               
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

3.使用lsnrctl查看监听状态

rac1-> lsnrctl status

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 11-MAR-2012 22:15:54

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521)) <--host为空
TNS-12541: TNS:no listener
 TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused



rac1-> lsnrctl 

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 11-MAR-2012 22:16:55

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Welcome to LSNRCTL, type "help" for information.

LSNRCTL> status listener_rac1
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1521)(IP=FIRST)))
TNS-12541: TNS:no listener
 TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.11)(PORT=1521)(IP=FIRST)))
TNS-12541: TNS:no listener
 TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused

这里可以发现问题:
1)如果当前没有监听在运行,使用lsnrctl status的时候,会去检查默认的监听名称为listener的监听,如果该监听不存在不会使用hostname填充到hostname项中(注意下面的启动默认监听过程)
2)lsnrctl查看指定监听为listener_rac1,发现和listener.ora中配置相同

4.lsnrctl 关闭监听

rac1-> srvctl start listener -n rac1
rac1-> lsnrctl stop

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 11-MAR-2012 22:43:14

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521)) <--host为空
The command completed successfully
rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....XFF.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac2        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    OFFLINE   OFFLINE               
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

这里可以说明问题:
1)lsnrctl stop虽然是要停止掉默认监听,但是也会停止掉非默认监听
2)lsnrctl stop如果默认监听不存在,那么注册host也为空

5.使用lsnrctl启动默认监听

rac1-> lsnrctl start

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 11-MAR-2012 22:17:37

Copyright (c) 1991, 2005, Oracle.  All rights reserved.

Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 10.2.0.1.0 - Production
System parameter file is /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rac1)(PORT=1521)))

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date                11-MAR-2012 22:17:37
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rac1)(PORT=1521))) <--主机名
The listener supports no services
The command completed successfully

rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....XFF.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac2        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    OFFLINE   OFFLINE               
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

这里发现问题:
1)监听的ip只有主机名的一个,和srvctl启动的监听不一样
2)虽然监听启动了,crs中依然显示为offline状态

6.使用lsnrctl启动listener_rac1监听

LSNRCTL> start listener_rac1
Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 10.2.0.1.0 - Production
System parameter file is /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener_rac1.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.21)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.11)(PORT=1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1521)(IP=FIRST)))
STATUS of the LISTENER
------------------------
Alias                     listener_rac1
Version                   TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date                11-MAR-2012 22:19:04
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/10.2.0/db_1/network/log/listener_rac1.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.21)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.11)(PORT=1521)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
  Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....XFF.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac2        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

这里可以说明两个问题:
1)使用lsnrctl启动监听和srvctl启动一样
2)启动listener_rac1后,crs中监听资源变成online

7.问题原因分析

rac1-> srvctl config listener -n rac1
rac1 LISTENER_RAC1

通过这里可以发现,其实srvctl操作的监听就是LISTENER_RAC1,所以当我使用lsnrctl 操作LISTENER_RAC1监听时候crs会自动offline或者online,而lsnrctl 操作默认监听时crs不会online

误删除dual表恢复

1.10G中删除dual表恢复

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select object_type,owner from dba_objects where object_name='DUAL';

OBJECT_TYPE         OWNER
------------------- ------------------------------
TABLE               SYS
SYNONYM             PUBLIC

SQL> drop table sys.dual;

Table dropped.

SQL> select object_type from dba_objects where object_name='DUAL';

OBJECT_TYPE
-------------------
SYNONYM

SQL> SELECT SYSDATE FROM dual;
SELECT SYSDATE FROM dual
                    *
ERROR at line 1:
ORA-01775: looping chain of synonyms


SQL> CREATE TABLE XFF AS SELECT * from dba_objects;

Table created.

SQL> drop table xff purge;
drop table xff purge
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01775: looping chain of synonyms


设置10046跟踪会话发现,在每次删除表操作的时候发现如下错误
select dummy from dual where  ora_dict_obj_type = 'TABLE'

其实这里错误都很明显,是因为dual表不存在了,表对应的同义词还存在,当查询dual的时候,会去查询同义词,然后该同义词去找表,而表不存在,所以出现上述的ORA-01775: looping chain of synonyms错误

2.解决方法

SQL> CREATE TABLE "SYS"."DUAL" 
  2     (       "DUMMY" VARCHAR2(1)
  3     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
  4    STORAGE(INITIAL 16384 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  5    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  6    TABLESPACE "SYSTEM" ;

Table created.

SQL> GRANT SELECT ON "SYS"."DUAL" TO PUBLIC WITH GRANT OPTION;

Grant succeeded.

SQL> insert into dual values('X');

1 row created.

SQL> COMMIT;

Commit complete.

--编译对象
SQL> @?/rdbms/admin/utlrp.sql

3.测试结果


SQL> select sysdate from dual;

SYSDATE
------------
13-MAR-12

SQL> drop table xff purge;

Table dropped.

rman 备份出现ORA-00245/RMAN-08132

rman备份出现如下错误

RMAN> backup full  tag 'dbfull' format '/jfkdata1/rman_bak/full_%d_%u' database 
 2>   include current controlfile  plus archivelog  format '/jfkdata1/rman_bak/arch_%d_%u'; 


Starting backup at 13-MAR-12
current log archived
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=297 instance=ykcdb2 device type=DISK
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set

input archived log thread=1 sequence=29 RECID=43 STAMP=777808820
input archived log thread=1 sequence=30 RECID=46 STAMP=777816035
input archived log thread=2 sequence=21 RECID=45 STAMP=777816033
channel ORA_DISK_1: starting piece 1 at 13-MAR-12
channel ORA_DISK_1: finished piece 1 at 13-MAR-12
piece handle=/jfkdata1/rman_bak/arch_YKCDB_01n5p1vf tag=DBFULL comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
Finished backup at 13-MAR-12

Starting backup at 13-MAR-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=/jfkdata1/oradata/ykcdb/sysaux01.dbf
input datafile file number=00001 name=/jfkdata1/oradata/ykcdb/system01.dbf
input datafile file number=00003 name=/jfkdata1/oradata/ykcdb/undotbs01.dbf
input datafile file number=00005 name=/jfkdata1/oradata/ykcdb/undotbs02.dbf
input datafile file number=00004 name=/jfkdata1/oradata/ykcdb/users01.dbf
channel ORA_DISK_1: starting piece 1 at 13-MAR-12
channel ORA_DISK_1: finished piece 1 at 13-MAR-12
piece handle=/jfkdata1/rman_bak/full_YKCDB_02n5p209 tag=DBFULL comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 13-MAR-12
channel ORA_DISK_1: finished piece 1 at 13-MAR-12
piece handle=/jfkdata1/rman_bak/full_YKCDB_03n5p213 tag=DBFULL comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 13-MAR-12

Starting backup at 13-MAR-12
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=2 sequence=22 RECID=48 STAMP=777816102
input archived log thread=1 sequence=31 RECID=47 STAMP=777816101
channel ORA_DISK_1: starting piece 1 at 13-MAR-12
channel ORA_DISK_1: finished piece 1 at 13-MAR-12
piece handle=/jfkdata1/rman_bak/arch_YKCDB_04n5p217 tag=DBFULL comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 13-MAR-12
ORA-00245: control file backup operation failed

RMAN-08132: WARNING: cannot update recovery area reclaimable file list

RMAN> 

从这里可以看出数据库文件,归档日志,spfile和controlfile都备份完成,后面又出现一个controlfile失败,应该是控制文件快照备份失败。

错误原因

RMAN creates a copy of the control file for read consistency, this is the snapshot controlfile. 
Due to the changes made to the controlfile backup mechanism in 11gR2 any instances 
in the cluster may write to the snapshot controlfile. Therefore, 
the snapshot controlfile file needs to be visible to all instances.

The same happens when a backup of the controlfile is created directly from 
sqlplus any instance in the cluster may write to the backup controfile file.

In 11gR2 onwards, the controlfile backup happens without holding the control file enqueue. 
For non-RAC database, this doesn't change anything. 

But, for RAC database, the snapshot controlfile location 
must be in a shared file system that will be accessible from all the nodes.

The snapshot controlfile MUST be  accessible by all nodes of a RAC database. 

大致意思就是rman的snapshot controlfile必须放在共享存储之上。

解决方法

The snapshot controlfile MUST be accessible by all nodes of a RAC database, 
if the snapshot controlfile does not reside in on a shared device this error will raise.

1. Check the snapshot controlfile location:
RMAN> show all;

2. Configure the snapshot controlfile to a shared disk:
RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '<shared_disk>/snapcf_<DBNAME>.f';

记录一次ORA-600[13013]处理过程

在一次数据库的异常处理完成后,发现alert日志中出现ORA-600[13013]错误

Thu Mar 08 23:29:37 2012
Errors in file /opt/oracle/diag/rdbms/chf/chf/trace/chf_smon_24137.trc  (incident=38681):
ORA-00600: internal error code, arguments: [13013], [5001], [518], [4198427], [170], [4198427], [17], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/chf/chf/incident/incdir_38681/chf_smon_24137_i38681.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

trace文件中信息
从这里可以看出是对sys.col_usage$表进行update操作导致该错误发生

Dump continued from file: /opt/oracle/diag/rdbms/chf/chf/trace/chf_smon_24137.trc
ORA-00600: internal error code, arguments: [13013], [5001], [518], [4198427], [170], [4198427], [17], [], [], [], [], []

========= Dump for incident 38681 (ORA 600 [13013]) ========

*** 2012-03-08 23:29:37.400
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=3c1kubcdjnppq) -----
update sys.col_usage$ set   equality_preds    = equality_preds    + decode(bitand(:flag,1),0,0,1),   equijoin_preds    = equijoin_preds    + decode(bitand(:flag,2),0,0
,1),   nonequijoin_preds = nonequijoin_preds + decode(bitand(:flag,4),0,0,1),   range_preds       = range_preds       + decode(bitand(:flag,8),0,0,1),   like_preds    
    = like_preds        + decode(bitand(:flag,16),0,0,1),   null_preds        = null_preds        + decode(bitand(:flag,32),0,0,1),  timestamp = :time where obj# = :ob
jn and intcol# = :coln

MOS中关于ORA-600 [13013]描述

Format: ORA-600 [13013] [a] [b] {c} [d] [e] [f]
Arg [a] Passcount
Arg [b] Data Object number
Arg {c} Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated
Arg [d] Row Slot number
Arg [e] Decimal RDBA of block being updated (Typically same as {c})
Arg [f] Code

验证MOS中描述

SQL> select dbms_utility.data_block_address_file(4198427) rfile,
  2  dbms_utility.data_block_address_block(4198427) blocks
  3  from dual;

     RFILE     BLOCKS
---------- ----------
         1       4123

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 1
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 1
Enter value for block_id: 4123
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 4123 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1

OWNER SEGMENT_NAME SEGMENT_TY TABLESPACE PARTI
----- ------------ ---------- ---------- -----
SYS   COL_USAGE$   TABLE      SYSTEM
--和trace文件中异常表一致

SQL>  select object_type,object_name from dba_objects where object_id=518;

OBJECT_TYPE         OBJECT_NAME
------------------- ------------------------------
TABLE               COL_USAGE$
--也和trace文件中异常表一致

分析异常表

SQL> ANALYZE TABLE sys.COL_USAGE$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.COL_USAGE$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

SQL> select index_name,COLUMN_NAME,COLUMN_POSITION FROM DBA_IND_COLUMNS 
2    WHERE TABLE_NAME='COL_USAGE$';

INDEX_NAME      COLUMN_NAM COLUMN_POSITION
--------------- ---------- ---------------
I_COL_USAGE$    OBJ#                     1
I_COL_USAGE$    INTCOL#                  2

SQL> set autot trace exp
SQL>  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t1
  3   MINUS
  4   SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null;

no rows selected
--无记录返回

Execution Plan
----------------------------------------------------------
Plan hash value: 399371572

------------------------------------------------------------------------------------

| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time
   |

------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |              |  4262 | 76716 |    27  (71)| 00:00:01 |
|   1 |  MINUS              |              |       |       |            |          |
|   2 |   SORT UNIQUE       |              |  4262 | 38358 |     9  (12)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| COL_USAGE$   |  4262 | 38358 |     8   (0)| 00:00:01 |
|   4 |   SORT UNIQUE NOSORT|              |  4262 | 38358 |    18   (6)| 00:00:01 |
|*  5 |    INDEX FULL SCAN  | I_COL_USAGE$ |  4262 | 38358 |    17   (0)| 00:00:01 |

------------------------------------------------------------------------------------
--验证表两个sql是否正确(一个全表扫描,另个index 快速扫描)


SQL>  SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  3   MINUS
  4  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t1;

      OBJ#    INTCOL#
---------- ----------
4294951004          2
4294951004          3
4294951004          4
4294951004         26
4294951004         27
4294951037          4
4294951037          5
4294951037          6
4294951037          9
4294951037         10
4294951840         11

      OBJ#    INTCOL#
---------- ----------
4294951840         12
4294951906          4
4294952709          3
4294952867          4
4294952867          9

16 rows selected.
--证明index中的记录比表中多了16条

解决问题并验证

SQL> alter index sys.I_COL_USAGE$ rebuild online;

Index altered.

SQL>  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
 FROM sys.COL_USAGE$ t1
  2    3   MINUS
  4   SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  6  ;

no rows selected

SQL>  SELECT /*+ index(t I_COL_USAGE$) */ OBJ#,INTCOL#
  2   FROM sys.COL_USAGE$ t where OBJ# is not null or INTCOL# is not null
  3   MINUS
  4  SELECT /*+ FULL(t1) */ OBJ#,INTCOL#
  5   FROM sys.COL_USAGE$ t1;

no rows selected

这次出现此问题的原因是因为在更新语句中使用索引找到一条记录,然后到表中去查询时该记录不存在,出现此错误,一般解决方法是重建索引
官方关于ORA-600[13013]说明

使用_allow_resetlogs_corruption打开无归档日志rman备份库

rman还原恢复操作

--还原数据库
RMAN> restore database;
--恢复数据库
RMAN> recover database;

Starting recover at 2012-03-08 21:20:45
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=65 device type=DISK

starting media recovery

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/08/2012 21:20:47
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of archived log for thread 1 with sequence 2936 and starting SCN of 25991695 found to restore
RMAN-06025: no backup of archived log for thread 1 with sequence 2935 and starting SCN of 25991652 found to restore
RMAN-06025: no backup of archived log for thread 1 with sequence 2934 and starting SCN of 25991649 found to restore
……………………
RMAN-06025: no backup of archived log for thread 1 with sequence 2902 and starting SCN of 25991156 found to restore
这里报日志缺少,实际上是备份的数据库文件后,没有备份归档日志,归档日志全部丢失

进行不完全恢复

SQL> recover database until cancel;
ORA-00279: change 25991194 generated at 03/08/2012 20:33:58 needed for thread 1
ORA-00289: suggestion : /opt/oracle/oradata/archivelog/chf/1_2902_752334071.dbf
ORA-00280: change 25991194 for thread 1 is in sequence #2902


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/chf/system01.dbf'


ORA-01112: media recovery not started


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/chf/system01.dbf'

查看相关SCN

SQL> select file#,to_char(checkpoint_change#,'999999999999') from v$datafile;

     FILE# TO_CHAR(CHECK
---------- -------------
         1      25992214
         2      25992214
         3      25992214
         4      25992214
         5      25992214
         6      25992214
         7      25992214
         8      25992214
         9      25992214
        10      25992214
        11      25992214

     FILE# TO_CHAR(CHECK
---------- -------------
        13      25992214
        14      25992214

13 rows selected.

SQL> select file#,online_status,to_char(change#,'999999999999') from v$recover_file;

     FILE# ONLINE_ TO_CHAR(CHANG
---------- ------- -------------
         1 ONLINE       25991194
         2 ONLINE       25991194
         3 ONLINE       25991194
         4 ONLINE       25991194
         5 ONLINE       25991194
         6 ONLINE       25991194
         7 ONLINE       25991194
         8 ONLINE       25991194
         9 ONLINE       25991194
        10 ONLINE       25991194
        11 ONLINE       25991194

     FILE# ONLINE_ TO_CHAR(CHANG
---------- ------- -------------
        13 ONLINE       25991194
        14 ONLINE       25991194

13 rows selected.

SQL> select file#,to_char(checkpoint_change#,'999999999999') from v$datafile_header;

     FILE# TO_CHAR(CHECK
---------- -------------
         1      25991194
         2      25991194
         3      25991194
         4      25991194
         5      25991194
         6      25991194
         7      25991194
         8      25991194
         9      25991194
        10      25991194
        11      25991194

     FILE# TO_CHAR(CHECK
---------- -------------
        13      25991194
        14      25991194

13 rows selected.

--发现数据文件scn和控制文件不一致,重建控制文件,然后查询相关scn
SQL> select file#,to_char(checkpoint_change#,'999999999999') from v$datafile;

     FILE# TO_CHAR(CHECK
---------- -------------
         1      25991194
         2      25991194
         3      25991194
         4      25991194
         5      25991194
         6      25991194
         7      25991194
         8      25991194
         9      25991194
        10      25991194
        11      25991194

     FILE# TO_CHAR(CHECK
---------- -------------
        13      25991194
        14      25991194

13 rows selected.

SQL> select file#,online_status,to_char(change#,'999999999999') from v$recover_file;

     FILE# ONLINE_ TO_CHAR(CHANG
---------- ------- -------------
         1 ONLINE       25991194
         2 ONLINE       25991194
         3 ONLINE       25991194
         4 ONLINE       25991194
         5 ONLINE       25991194
         6 ONLINE       25991194
         7 ONLINE       25991194
         8 ONLINE       25991194
         9 ONLINE       25991194
        10 ONLINE       25991194
        11 ONLINE       25991194

     FILE# ONLINE_ TO_CHAR(CHANG
---------- ------- -------------
        13 ONLINE       25991194
        14 ONLINE       25991194

13 rows selected.

SQL> select file#,to_char(checkpoint_change#,'999999999999') from v$datafile_header;

     FILE# TO_CHAR(CHECK
---------- -------------
         1      25991194
         2      25991194
         3      25991194
         4      25991194
         5      25991194
         6      25991194
         7      25991194
         8      25991194
         9      25991194
        10      25991194
        11      25991194

     FILE# TO_CHAR(CHECK
---------- -------------
        13      25991194
        14      25991194

13 rows selected.
--此时所有scn均一致

尝试打开数据库

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/chf/system01.dbf'


SQL> recover database using backup controlfile until cancel;
ORA-00279: change 25991194 generated at 03/08/2012 20:33:58 needed for thread 1
ORA-00289: suggestion : /opt/oracle/oradata/archivelog/chf/1_2902_752334071.dbf
ORA-00280: change 25991194 for thread 1 is in sequence #2902


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/chf/system01.dbf'


ORA-01112: media recovery not started

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/opt/oracle/oradata/chf/system01.dbf'

使用隐含参数打开数据库

SQL> create pfile='/tmp/pfile' from spfile;

File created.

-------/tmp/pfile中加上----------
_allow_resetlogs_corruption= TRUE
---------------------------------

SQL> startup mount pfile='/tmp/pfile' force
ORACLE instance started.

Total System Global Area  622149632 bytes
Fixed Size                  2230912 bytes
Variable Size             419431808 bytes
Database Buffers          192937984 bytes
Redo Buffers                7548928 bytes
Database mounted.
SQL> alter database open resetlogs;

Database altered.

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ WRITE

总结
这次的试验没有多少实际意义,但是可以说明几个问题:
1.所有的数据文件的scn都一致,甚至和控制文件的也一致,数据库不一定可以open成功
(怀疑是数据文件中的scn大于data header scn)
2.对于这样的问题,如果使用bbed修改所有数据文件header的scn不知道是否可以解决
3.如果rman只备份了数据文件而没有任何一个归档日志,数据库通过隐含参数还是可以open,抢救数据