Troubleshooting Oracle Distributed Transaction Failures

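Put simply, a distributed transaction is one that executes on both a local and a remote database; the local side must wait for confirmation that the remote part has finished before it proceeds with its next local step. An example is updating a row in a remote database over a dblink. If the network fails during execution, or some other event leaves the local database unable to learn the outcome on the remote database, an in-doubt error occurs. A DBA then has to step in, and the handling differs from case to case. When a distributed transaction commits, it goes through three phases: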
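1.PREPARE PHASE
1.1 Determine which database is the commit point site. (Note: the database with the highest commit_point_strength value in its parameter file becomes the commit point site.)
1.2 The Global Coordinator asks all nodes (except the commit point site) to prepare to commit or roll back. At this point the tables involved in the distributed transaction are locked.
1.3 Every node in the distributed transaction reports its SCN to the Global Coordinator.
1.4 The Global Coordinator takes the highest SCN among all nodes as the SCN of the distributed transaction. (eygle has also tested this in an article.)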

At this point all nodes have finished preparing and we enter the COMMIT PHASE. All nodes other than the commit point site are now in the in-doubt state and remain so until the COMMIT PHASE ends.
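
The two selection rules of the prepare phase (the highest commit_point_strength decides the commit point site, the highest SCN becomes the transaction SCN) can be illustrated with a minimal Python sketch. The Node class and the sample values are hypothetical and only model the rules described above; they are not how Oracle implements them internally.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    commit_point_strength: int  # value of the commit_point_strength parameter
    scn: int                    # SCN the node reports in step 1.3


def pick_commit_point_site(nodes):
    """Step 1.1: the node with the highest commit_point_strength."""
    return max(nodes, key=lambda n: n.commit_point_strength)


def global_scn(nodes):
    """Step 1.4: the highest SCN reported by any node."""
    return max(n.scn for n in nodes)


# Hypothetical two-node example: a local coordinator and one remote database.
nodes = [
    Node("local", commit_point_strength=1, scn=1000),
    Node("remote", commit_point_strength=5, scn=1200),
]
site = pick_commit_point_site(nodes)
print(site.name, global_scn(nodes))
```

With these made-up values the remote database becomes the commit point site, which matches the assumption used in the queries that follow.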

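If the database runs into a problem during this phase, we query (assuming the remote database is the commit point site and the local database is the Global Coordinator):

Local: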

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
2.12.64845              collecting

Remote:

select local_tran_id,state from dba_2pc_pending;
no rows selected

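This means the local database has asked the other nodes to prepare to commit or roll back and is now "collecting" their responses, while the state of the remote database is unknown (in doubt). We need to clear the entry on the local Global Coordinator: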

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

Or the query may show the following state:

Local:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
2.12.64845             prepared

Remote:

select local_tran_id,state from dba_2pc_pending;
no rows selected

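This means the local Global Coordinator is prepared and the distributed locks have been placed on the tables involved, but the state of the remote database is again unknown (in doubt). We need to roll back the local transaction manually and clear the distributed transaction information:
Local: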

rollback force 'local_tran_id';
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

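2.COMMIT PHASE
2.1 The Global Coordinator sends the highest SCN to the commit point site and asks it to commit.
2.2 The commit point site attempts to commit or roll back. The distributed transaction locks are released.
2.3 The commit point site notifies the Global Coordinator that it has committed.
2.4 The Global Coordinator notifies all nodes of the distributed transaction to commit.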

If the database runs into a problem during this phase, we query:

Local:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
2.12.64845             prepared

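Remote:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
1.92.66874             committed

This means the remote database may already have committed, but the local Global Coordinator does not know the remote state and is still in the prepared state. We handle it as follows:

Local: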

commit force 'local_tran_id';
execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

Remote:

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

Or the query may show the following state:

Local:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
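2.12.64845             committed

Remote:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
1.92.66874             committed

This means both the remote and the local database have completed the commit, but the distributed transaction information has not been cleared yet. We need to run the following on both sides:

Local: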

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

Remote:

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

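3.FORGET PHASE
3.1 The participating nodes notify the commit point site that they have committed, so the commit point site can forget the transaction.
3.2 The commit point site clears the distributed transaction information on the remote database.
3.3 The commit point site notifies the Global Coordinator that it can clear its local distributed transaction information.
3.4 The Global Coordinator clears the distributed transaction information.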

If a problem occurs at this point, we query:

Local:

select local_tran_id,state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
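2.12.64845             committed

Remote:

select local_tran_id,state from dba_2pc_pending;
no rows selected

This means the remote commit point site has committed and told the Global Coordinator to clear its local distributed transaction information, but the Global Coordinator never received that message. We handle it like this:

Local: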

execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('local_tran_id');

In summary, the states of a distributed transaction, in order, are:

phase       local_state    remote_state    action
----------- -------------- --------------- ----------------------------------------------------------
prepare     collecting     /               local: DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY
            prepared       /               local: rollback force, then PURGE_LOST_DB_ENTRY

commit      prepared       committed       local: commit force, then PURGE on both local and remote
            committed      committed       PURGE_LOST_DB_ENTRY on both local and remote

forget      committed      /               local: PURGE_LOST_DB_ENTRY
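
The same table can be written down as a small lookup, which is handy when scripting a first diagnosis. This is only a sketch in Python: the names ACTIONS and recovery_steps are made up for illustration, "purge" stands for DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY, and None stands for "no rows selected" in dba_2pc_pending on the remote side.

```python
# (local state, remote state) -> ordered list of (site, action).
# None as the remote state means dba_2pc_pending returned no rows there.
ACTIONS = {
    ("collecting", None): [("local", "purge")],
    ("prepared", None): [("local", "rollback force"), ("local", "purge")],
    ("prepared", "committed"): [
        ("local", "commit force"),
        ("local", "purge"),
        ("remote", "purge"),
    ],
    ("committed", "committed"): [("local", "purge"), ("remote", "purge")],
    ("committed", None): [("local", "purge")],
}


def recovery_steps(local_state, remote_state=None):
    """Return the manual steps listed in the summary table for a state pair."""
    key = (local_state, remote_state)
    if key not in ACTIONS:
        raise ValueError(f"state combination not covered by the table: {key}")
    return ACTIONS[key]


print(recovery_steps("prepared", "committed"))
```

Any combination that is not in the table raises an error on purpose, since such a case needs to be analysed by hand rather than guessed.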

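In addition, when rollback/commit force cannot clear the distributed transaction information normally and fails with ORA-02058: no prepared transaction found with ID X.XX.XXXXX, we have to clear the information manually.
(Note: the methods below modify the data dictionary and are risky. Back up your database before using them.)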

Case 1: dba_2pc_pending still contains a record of the transaction, but the transaction itself no longer exists:

select local_tran_id, state from dba_2pc_pending;
LOCAL_TRAN_ID          STATE
---------------------- ----------------
1.92.66874             prepared

(Note: '1.92.66874' has the structure rbs#, slot#, wrap#, so this transaction is in rollback segment #1.)
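
As a small illustration of that structure, the ID can be split into its three parts before it is compared with the KTUXEUSN, KTUXESLT and KTUXESQN columns of x$ktuxe. The helper below is a hypothetical Python sketch, not part of any Oracle tool.

```python
from typing import NamedTuple


class TranId(NamedTuple):
    rbs: int   # rollback segment number, compared with KTUXEUSN
    slot: int  # slot number, compared with KTUXESLT
    wrap: int  # wrap number, compared with KTUXESQN


def parse_local_tran_id(local_tran_id: str) -> TranId:
    """Split a local_tran_id such as '1.92.66874' into rbs#, slot#, wrap#."""
    parts = local_tran_id.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected rbs#.slot#.wrap#, got {local_tran_id!r}")
    return TranId(*(int(p) for p in parts))


tid = parse_local_tran_id("1.92.66874")
print(tid.rbs, tid.slot, tid.wrap)
```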

We then use the following statement to look for transactions in rollback segment #1 whose status is not INACTIVE:

SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
       KTUXESTA Status,
       KTUXECFL Flags
FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
      AND ktuxeusn= 1; /* <== the rollback segment number, i.e. rbs# */
no rows selected

So rollback force fails with:

ORA-02058: no prepared transaction found with ID 1.92.66874

We need to handle it as follows:

set transaction use rollback segment SYSTEM;
delete from sys.pending_trans$
  where local_tran_id = '1.92.66874';
delete from sys.pending_sessions$ where local_tran_id = '1.92.66874';
delete from sys.pending_sub_sessions$ where local_tran_id = '1.92.66874';
commit;

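Case 2: this one is less common. dba_2pc_pending shows no information about the distributed transaction, but the transaction actually exists:

In the alert log we can see:

ORA-1591: lock held by in-doubt distributed transaction 1.92.66874

Querying dba_2pc_pending returns no distributed transaction information: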

select local_tran_id, state from dba_2pc_pending 
where local_tran_id='1.92.66874';
no rows selected

But checking the actual rollback segment information shows a distributed transaction in the PREPARED state:

SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
       KTUXESTA Status,
       KTUXECFL Flags
FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
      AND ktuxeusn= 1;
  KTUXEUSN   KTUXESLT   KTUXESQN STATUS           FLAGS
---------- ---------- ---------- ---------------- ------------------------
         1         92      66874 PREPARED         SCO|COL|REV|DEAD

We cannot do commit force or rollback force:

rollback force '1.92.66874';
ORA-02058: no prepared transaction found with ID 1.92.66874

We clean it up manually as follows:

alter system disable distributed recovery;
 
insert into pending_trans$ (
    LOCAL_TRAN_ID,
    GLOBAL_TRAN_FMT,
    GLOBAL_ORACLE_ID,
    STATE,
    STATUS,
    SESSION_VECTOR,
    RECO_VECTOR,
    TYPE#,
    FAIL_TIME,
    RECO_TIME)
values( '1.92.66874', /* <== your local tran id goes here */
    306206,                  /*                                         */
    'XXXXXXX.12345.1.2.3',   /*  These values need not be changed;      */
    'prepared','P',          /*  they are static parameters and can     */
    hextoraw( '00000001' ),  /*  be used as-is in this SQL              */
    hextoraw( '00000000' ),  /*                                         */
    0, sysdate, sysdate );
 
insert into pending_sessions$
values( '1.92.66874',/* <== your local tran id goes here */
    1, hextoraw('05004F003A1500000104'),
    'C', 0, 30258592, '',
    146
  );
 
commit;
 
commit force '1.92.66874';

If commit force still fails at this point, continue with:

delete from pending_trans$ where local_tran_id='1.92.66874';
delete from pending_sessions$ where local_tran_id='1.92.66874';
commit;
alter system enable distributed recovery;

If there is no error, run the following statements, which re-enable distributed recovery and purge the dummy entry from the dictionary:

alter system enable distributed recovery;
connect / as sysdba
alter session set "_smu_debug_mode" = 4;
--Note: this step is needed when automatic undo management is used. It avoids
--ORA-01453 in the subsequent purge_lost_db_entry call; see Bug 2191458 for details
commit; 
exec dbms_transaction.purge_lost_db_entry('1.92.66874');

Original article: http://www.dbifan.com/200812/how-to-deal-with-distributed-transaction.html

Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable

1. Errors found in the alert log

Wed Nov 30 13:36:47 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:48 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:48 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:49 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:50 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:51 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:52 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:52 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:53 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:54 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:55 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:56 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:56 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:57 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:57 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:36:58 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 13:37:12 2011
Some indexes or index [sub]partitions of table VAS.TAB_PUB_CALLLOG have been marked unusable
Wed Nov 30 22:00:09 2011
…………
Wed Nov 30 22:00:15 2011
GATHER_STATS_JOB encountered errors.  Check the trace file.
Wed Nov 30 22:00:15 2011
Errors in file /opt/oracle/admin/ora9i/bdump/ora9i_j001_21372.trc:
ORA-20000: index "VAS"."XN_CALLLOG_ANALYSIS_PK"  or partition of such index is in unusable state

2. Check the trace file

/opt/oracle/admin/ora9i/bdump/ora9i_j001_21372.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /opt/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.18-92.el5
Version:        #1 SMP Tue Apr 29 13:16:15 EDT 2008
Machine:        x86_64
Instance name: ora9i
Redo thread mounted by this instance: 1
Oracle process number: 42
Unix process pid: 21372, image: oracle@localhost.localdomain (J001)

*** 2011-11-30 22:00:15.782
*** ACTION NAME:(GATHER_STATS_JOB) 2011-11-30 22:00:15.763
*** MODULE NAME:(DBMS_SCHEDULER) 2011-11-30 22:00:15.763
*** SERVICE NAME:(SYS$USERS) 2011-11-30 22:00:15.763
*** SESSION ID:(532.60095) 2011-11-30 22:00:15.763
ORA-20000: index "VAS"."XN_CALLLOG_ANALYSIS_PK"  or partition of such index is in unusable state
*** 2011-11-30 22:00:15.782
GATHER_STATS_JOB: GATHER_TABLE_STATS('"VAS"','"TAB_XN_CALLLOG_ANALYSIS"','""', ...)
ORA-20000: index "VAS"."XN_CALLLOG_ANALYSIS_PK"  or partition of such index is in unusable state

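3. Preliminary conclusion from the logs
The alert log suggests that an operation on a partitioned table left the index "VAS"."XN_CALLLOG_ANALYSIS_PK" in an unusable state, and the error was then raised when the database gathered statistics automatically (most likely because of a global index).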

4. Verify the hypothesis

SQL> SELECT owner,index_name,table_name,status FROM DBA_indexes 
   2 WHERE index_name='XN_CALLLOG_ANALYSIS_PK' AND owner='VAS';
 
OWNER                          INDEX_NAME                     TABLE_NAME                     STATUS
------------------------------ ------------------------------ ------------------------------ --------
VAS                            XN_CALLLOG_ANALYSIS_PK         TAB_XN_CALLLOG_ANALYSIS        UNUSABLE

5. Fix the problem

SELECT 'ALTER INDEX ' || INDEX_OWNER || '.' || INDEX_NAME ||
' REBUILD PARTITION ' || PARTITION_NAME || ' NOLOGGING online;'
FROM DBA_IND_PARTITIONS
WHERE INDEX_OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
AND STATUS = 'UNUSABLE'
UNION ALL
SELECT 'alter index ' ||OWNER || '.' || A.INDEX_NAME || ' REBUILD online nologging;'
FROM DBA_INDEXES A
WHERE OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
AND STATUS = 'UNUSABLE';

Run the generated SQL to fix the unusable indexes.
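
The string concatenation in the generator query is easy to get wrong (a missing space before REBUILD produces an invalid statement), so it can help to see the intended output format in isolation. The Python sketch below only mirrors the two statement shapes; the function name rebuild_statement and the partitioned sample index are hypothetical.

```python
def rebuild_statement(owner, index_name, partition_name=None):
    """Build one ALTER INDEX ... REBUILD statement.

    partition_name is passed for an unusable index partition (a row from
    DBA_IND_PARTITIONS) and omitted for a row from DBA_INDEXES.
    """
    if partition_name:
        return (
            f"ALTER INDEX {owner}.{index_name} "
            f"REBUILD PARTITION {partition_name} NOLOGGING ONLINE;"
        )
    return f"ALTER INDEX {owner}.{index_name} REBUILD ONLINE NOLOGGING;"


# The index from the trace file above.
print(rebuild_statement("VAS", "XN_CALLLOG_ANALYSIS_PK"))
# A made-up partitioned index, for the second statement shape.
print(rebuild_statement("VAS", "IDX_DEMO", "P_201111"))
```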

The Relationship Between V$PWFILE_USERS and the Password File

I. Definition of V$PWFILE_USERS

V$PWFILE_USERS lists all users in the password file, and indicates whether the user has been granted the SYSDBA, SYSOPER, and SYSASM privileges.
Column      Datatype      Description
USERNAME    VARCHAR2(30)  Name of the user that is contained in the password file
SYSDBA      VARCHAR2(5)   Indicates whether the user can connect with SYSDBA privileges (TRUE) or not (FALSE)
SYSOPER     VARCHAR2(5)   Indicates whether the user can connect with SYSOPER privileges (TRUE) or not (FALSE)
SYSASM      VARCHAR2(5)   Indicates whether the user can connect with SYSASM privileges (TRUE) or not (FALSE)

II. How v$pwfile_users relates to the password file

[oracle@node1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sun Dec 4 19:08:06 2011

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options

--Check the privileges of the password file users
SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE

SQL> show parameter instance_name;        

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------
instance_name                        string      ora11g

--Inspect the password file contents at the OS level
SQL> !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC

--Create a new sysdba user and watch how the view and the password file change
SQL> create user xff01 identified by xifenfei;

User created.

SQL> grant sysdba to xff01;

Grant succeeded.

SQL>  select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
XFF01                          TRUE  FALSE FALSE

SQL>  !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC
XFF01
D32693095588EF4F

--Remove the password file (renamed here so that it can be restored)
SQL> ! mv $ORACLE_HOME/dbs/orapwora11g $ORACLE_HOME/dbs/orapwora11g_bak

SQL> !ls $ORACLE_HOME/dbs/orapwora11g
ls: /opt/oracle/product/11.2.0/db_1/dbs/orapwora11g: No such file or directory

--Query the view
SQL>  select * from v$pwfile_users;

no rows selected

SQL>  ! mv $ORACLE_HOME/dbs/orapwora11g_bak $ORACLE_HOME/dbs/orapwora11g

SQL> !ls $ORACLE_HOME/dbs/orapwora11g
/opt/oracle/product/11.2.0/db_1/dbs/orapwora11g

SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
XFF01                          TRUE  FALSE FALSE

--Change the sysdba user's privileges and watch the view contents change
SQL> grant sysoper to xff01;

Grant succeeded.

SQL>  select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
XFF01                          TRUE  TRUE  FALSE

--The password file contents are unchanged
SQL> !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC
XFF01
D32693095588EF4F

SQL> revoke sysdba from xff01;

Revoke succeeded.

SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
XFF01                          FALSE TRUE  FALSE

SQL> revoke sysoper  from xff01;

Revoke succeeded.

SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE
XFF01                          FALSE FALSE FALSE

SQL> revoke sysoper  from xff01;

Revoke succeeded.

--After sysdba and sysoper are revoked, the record disappears from the view
SQL> select * from v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS                            TRUE  TRUE  FALSE

--The password file contents are unchanged
SQL>  !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC
XFF01
D32693095588EF4F

--Drop the former sysdba user
SQL> drop user xff01;

User dropped.

--The password file contents are still unchanged
SQL>  !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC
XFF01
D32693095588EF4F

--After the database is shut down, the password file is still unchanged
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> !strings $ORACLE_HOME/dbs/orapwora11g
]\[Z
ORACLE Remote Password file
INTERNAL
A1174901D667F113
18698BFD1A045BCC
XFF01
D32693095588EF4F

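1. If the password file does not exist or is misnamed, querying v$pwfile_users returns no rows
2. Granting a privilege such as sysdba to a user records the user in both the password file and v$pwfile_users
3. After such privileges are revoked, the user's entry remains in the password file, but v$pwfile_users no longer has a matching record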

III. Remote login test

--The password file entry exists but the view has no record: login fails
[oracle@node1 ~]$ sqlplus xff01/xifenfei@ora11g as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Sun Dec 4 19:42:26 2011

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

ERROR:
ORA-01031: insufficient privileges


Enter user-name

--Both the password file entry and the view record exist: login succeeds
[oracle@node1 ~]$ sqlplus sys/xifenfei@ora11g as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Sun Dec 4 19:42:10 2011

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

Whether a user can log in remotely is determined by v$pwfile_users
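
That rule can be stated as a tiny check. The Python sketch below is only a model of the conclusion reached in this test, with a hypothetical function name and with the view contents copied from the final query output above; it does not talk to a database.

```python
def can_connect_as(view_rows, username, privilege):
    """Return True if v$pwfile_users lists the user with the given privilege.

    view_rows maps USERNAME to a dict of privilege flags, mirroring the
    SYSDBA / SYSOPER / SYSASM columns of the view.
    """
    row = view_rows.get(username.upper())
    return bool(row and row.get(privilege.upper(), False))


# Contents of v$pwfile_users after the privileges of XFF01 were revoked.
view_rows = {"SYS": {"SYSDBA": True, "SYSOPER": True, "SYSASM": False}}

print(can_connect_as(view_rows, "sys", "sysdba"))
print(can_connect_as(view_rows, "xff01", "sysdba"))
```

Note that the leftover XFF01 entry in the password file plays no part in the check, which is exactly the point of the test.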

IV. Creating the password file

win:
orapwd file=%ORACLE_HOME%\database\PWD%ORACLE_SID%.ora entries=3 password=manager force=y
linux:
orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID entries=3 password=manager force=y

Next post: The Relationship Between V$PWFILE_USERS and the Password File (continued)

Analyzing and Resolving ORA-1502

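1. ORA-1502 appears
The developers reported that our reporting program failed with SQLCODE=[-1502] and that the SQL did not execute. From this message I guessed it was an ORA-1502 error, so I looked it up: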

[oracle@node1 ~]$ oerr ora 1502
01502, 00000, "index '%s.%s' or partition of such index is in unusable state"
// MERGE: 1489 RENUMBERED TO 1502
// *Cause: An attempt has been made to access an index or index partition
//         that has been marked unusable by a direct load or by a DDL
//         operation
// *Action: DROP the specified index, or REBUILD the specified index, or
//         REBUILD the unusable index partition

According to this message, the error occurs because an index has become unusable.

--Check for UNUSABLE partitions of partitioned indexes
SQL> col index_owner for a20
SQL> col index_name for a30
SQL> col partition_name for a20    
SQL>  SELECT INDEX_OWNER, INDEX_NAME, PARTITION_NAME
  2     FROM DBA_IND_PARTITIONS
  3    WHERE INDEX_OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
  4     AND STATUS = 'UNUSABLE';

INDEX_OWNER          INDEX_NAME                     PARTITION_NAME
-------------------- ------------------------------ --------------------
STAT_YDZJ            IND_STAT_DAY_COMPANY_DAY        P_201111
STAT_YDZJ            IND_STAT_DAY_COMPANY_MOBILE     P_201111
STAT_YDZJ            IND_STAT_DAY_COMPANY_AREA       P_201111

--Check for unusable global and ordinary indexes
SQL>  SELECT OWNER, a.index_name
  2     FROM Dba_Indexes  a
  3    WHERE OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
  4     AND STATUS = 'UNUSABLE';

OWNER                          INDEX_NAME
------------------------------ ------------------------------
STAT_YDZJ                      PK_ND_STAT_DAY_COMPANY

2. Fixing the problem
Generate SQL statements in bulk to rebuild the unusable indexes

SELECT 'ALTER INDEX  ' || INDEX_OWNER || '.' || INDEX_NAME ||
       ' REBUILD PARTITION ' || PARTITION_NAME || ' NOLOGGING online;'
  FROM DBA_IND_PARTITIONS
 WHERE INDEX_OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
   AND STATUS = 'UNUSABLE'
UNION ALL
SELECT 'alter index  ' ||OWNER || '.' || A.INDEX_NAME || ' REBUILD online nologging;'
  FROM DBA_INDEXES A
 WHERE OWNER NOT IN ('SYS', 'SYSTEM', 'PUBLIC')
 AND  STATUS = 'UNUSABLE';

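After running the statements generated by the SQL above, I had the developers test the program, and they reported that it worked normally.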

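3. Analysis
Yesterday I added a partition to the STAT_DAY_COMPANY table (by SPLIT on MAXVALUE). I had assumed every index on this table was a local index (company policy says partitioned tables may only have local indexes), so I did not check. Somehow there was one global index, and that is what led to the unusable index this time. As for why three local index partitions also became unusable, my guess is that large batch append inserts caused it. The developers happened to be batch-inserting into STAT_DAY_COMPANY with append at the time, which triggered the ORA-1502 error.
Recommendation: when working on a partitioned table, check its indexes carefully instead of relying on policy or habit.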

OGG Deployment Failure Caused by Unsynchronized Clocks

I. Checking the error

[oracle@srtcreen ~]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700 
Linux, x64, 64bit (optimized), Oracle 10g on Jul 29 2011 19:43:29

Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.



GGSCI (srtcreen) 1> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     R_1         00:00:00      00:00:07    
REPLICAT    RUNNING     R_2         00:00:00      00:00:05    
REPLICAT    ABENDED     R_3         19:19:34      00:17:33    
REPLICAT    STOPPED     R_4         00:00:00      19:44:24    
REPLICAT    STOPPED     R_5         00:00:00      19:44:13    

GGSCI (srtcreen) 2> view report r_3


***********************************************************************
                 Oracle GoldenGate Delivery for Oracle
     Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700 
   Linux, x64, 64bit (optimized), Oracle 10g on Sep 13 2011 21:33:03
 
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.


                    Starting at 2011-12-02 16:36:58
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Fri Apr 2 14:58:14 EDT 2010, Release 2.6.18-194.el5
Node: srtcreen
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited

Process id: 13398

Description: 

***********************************************************************
**            Running with the following parameters                  **
***********************************************************************
replicat r_3
ASSUMETARGETDEFS
HANDLECOLLISIONS
SETENV (NLS_LANG =AMERICAN_AMERICA.ZHS16GBK)
Set environment variable (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
userid it1ogg, password ********
--file for dicarded transaction --
discardfile /opt/OGG/discard/R_3.txt, append, megabytes 100
DDL
MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);


CACHEMGR virtual memory values (may have been adjusted)
CACHEBUFFERSIZE:                         64K
CACHESIZE:                              512M
CACHEBUFFERSIZE (soft max):               4M
CACHEPAGEOUTSIZE (normal):                4M
PROCESS VM AVAIL FROM OS (min):           1G
CACHESIZEMAX (strict force to disk):    881M

Database Version:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

Database Language and Character Set:
NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK" 
NLS_LANGUAGE     = "AMERICAN" 
NLS_TERRITORY    = "AMERICA" 
NLS_CHARACTERSET = "ZHS16GBK" 

For further information on character set settings, please refer to user manual.

***********************************************************************
**                     Run Time Messages                             **
***********************************************************************

Opened trail file /opt/OGG/dirdat/U3000000 at 2011-12-02 16:36:58

MAP resolved (entry CSCNEW.TAB_CS_CALL_PICKUP):
  MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
Using following columns in default map by name:
  PICKUP_ID, CALL_ID, CALL_SERIAL, USER_ID, PICKUP_TIME, CALL_RESULT, 
  FAIL_REASON, CALL_TIME, CALL_DURA, END_TIME

Using the following key columns for target table SRT_CREEN.TAB_CS_CALL_PICKUP: CALL_ID.


2011-12-02 16:36:58  WARNING OGG-00869  OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.

2011-12-02 16:36:58  WARNING OGG-01004  Aborted grouped transaction on 'SRT_CREEN.TAB_CS_CALL_PICKUP', Database error 1407 (OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>).

2011-12-02 16:36:58  WARNING OGG-01003  Repositioning to rba 924 in seqno 0.

2011-12-02 16:36:59  WARNING OGG-01154  SQL error 1407 mapping CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.

2011-12-02 16:36:59  WARNING OGG-01003  Repositioning to rba 924 in seqno 0.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************

Last record for the last committed transaction is the following: 
___________________________________________________________________
Trail name :  /opt/OGG/dirdat/U3000000
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04) 
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41) 
RecLength  :   157 (x009d)    IO Time    : 2011-12-01 21:17:24.084108  
IOType     :    15  (x0f)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :       3225       AuditPos   : 42227728
Continued  :     N  (x00)     RecCount   :     1  (x01)

2011-12-01 21:17:24.084108 FieldComp          Len   157 RBA 924
Name: CSCNEW.TAB_CS_CALL_PICKUP
___________________________________________________________________

Reading /opt/OGG/dirdat/U3000000, current RBA 924, 0 records

Report at 2011-12-02 16:36:59 (activity since 2011-12-02 16:36:58)

From Table CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP:
       #                   inserts:         0
       #                   updates:         0
       #                   deletes:         0
       #                  discards:         1


DDL replication statistics:

                    Operations:         0
             Mapped operations:         0
           Unmapped operations:         0
              Other operations:         0
           Excluded operations:         0
                        Errors:         0
                Retried errors:         0
              Discarded errors:         0
                Ignored errors:         0




Last log location read:
     FILE:      /opt/OGG/dirdat/U3000000
     SEQNO:     0
     RBA:       924
     TIMESTAMP: 2011-12-01 21:17:24.084108
     EOF:       NO
     READERR:   0

2011-12-02 16:36:59  ERROR   OGG-01668  PROCESS ABENDING.

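--Something odd: I was checking this on the morning of 2011-12-02, yet the report shows 2011-12-02 16:36:59. My first reaction was that the system clock was wrong, so I kept checking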

GGSCI (srtcreen) 2> info r_2

REPLICAT   R_2       Last Started 2011-12-01 21:09   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:05 ago)
Log Read Checkpoint  File /opt/OGG/dirdat/U2000000
                     2011-12-02 17:01:29.927591  RBA 6234

GGSCI (srtcreen) 3> exit
[oracle@srtcreen OGG]$ cd dirdat
[oracle@srtcreen dirdat]$ ll
total 396
-rw-rw-rw- 1 oracle oinstall      0 Dec  1 21:10 U1000000
-rw-rw-rw- 1 oracle oinstall   5984 Dec  2 05:36 U2000000
-rw-rw-rw- 1 oracle oinstall 392258 Dec  2 16:52 U3000000
[oracle@srtcreen dirdat]$ stat U3000000
  File: `U3000000'
  Size: 392258          Blocks: 776        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 213844220   Links: 1
Access: (0666/-rw-rw-rw-)  Uid: (  501/  oracle)   Gid: (  501/oinstall)
Access: 2011-12-02 16:36:59.000000000 +0800
Modify: 2011-12-02 16:52:55.000000000 +0800
Change: 2011-12-02 16:52:55.000000000 +0800

###############Check the time on the source database server#####################
[oracle@tykf ~]$ date
Fri Dec  2 08:50:10 CST 2011
[oracle@tykf ~]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.0.0 Build 078
Linux, x64, 64bit (optimized), Oracle 10 on Jul 28 2010 13:21:11

Copyright (C) 1995, 2010, Oracle and/or its affiliates. All rights reserved.



GGSCI (tykf) 1> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT_1       00:00:00      00:00:05    
EXTRACT     RUNNING     EXT_2       00:00:00      00:00:01    
EXTRACT     RUNNING     EXT_3       00:00:00      00:00:05    
EXTRACT     STOPPED     EXT_4       00:00:00      21:59:56    
EXTRACT     STOPPED     EXT_5       00:00:00      21:59:28    
EXTRACT     RUNNING     P_1         00:00:00      00:00:04    
EXTRACT     RUNNING     P_2         00:00:00      00:00:04    
EXTRACT     RUNNING     P_3         00:00:00      00:00:05    
EXTRACT     STOPPED     P_4         00:00:00      21:56:42    
EXTRACT     STOPPED     P_5         00:00:00      21:56:11    


GGSCI (tykf) 2> info ext_3

EXTRACT    EXT_3     Last Started 2011-12-01 13:11   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint  Oracle Redo Logs
                     2011-12-02 08:50:06  Seqno 3233, RBA 32267264


GGSCI (tykf) 3> info p_3

EXTRACT    P_3       Last Started 2011-12-01 13:11   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint  File /opt/OGG/dirdat/extract/A3000000
                     2011-12-02 08:50:15.000000  RBA 393705

--ext_3 and p_3 are the source-side processes that correspond to r_3 on the target
#######################################################################

Sure enough, the system time was wrong: the source and the target differed by nearly eight hours
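
The size of the skew can be worked out from two clock readings taken at about the same moment. The Python sketch below uses the target's date output (16:55:55) and the time the clock was reset to moments later (08:58:20), both taken from the transcript in the next section; the helper name clock_skew is made up, and the result is only approximate because the readings are a few seconds apart.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def clock_skew(correct_time: str, wrong_time: str):
    """Return how far the wrong clock is ahead of the correct one."""
    return datetime.strptime(wrong_time, FMT) - datetime.strptime(correct_time, FMT)


skew = clock_skew("2011-12-02 08:58:20", "2011-12-02 16:55:55")
print(skew)  # 7:57:35
```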

II. Correcting the time on the target

[oracle@srtcreen dirdat]$ date
Fri Dec  2 16:55:55 CST 2011
[oracle@srtcreen OGG]$ su - root
Password: 
[root@srtcreen ~]# date -s 08:58:20
Fri Dec  2  08:58:20 CST
[root@srtcreen ~]# clock -w
[root@srtcreen ~]# date
Fri Dec  2  08:58:28 CST

[root@srtcreen ~]# su - oracle
[oracle@srtcreen dirdat]$ ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700 
Linux, x64, 64bit (optimized), Oracle 10g on Jul 29 2011 19:43:29

Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.



GGSCI (srtcreen) 1> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     R_1         00:00:00      unknown     
REPLICAT    RUNNING     R_2         00:00:00      unknown     
REPLICAT    ABENDED     R_3         19:19:34      unknown     
REPLICAT    STOPPED     R_4         00:00:00      11:54:31    
REPLICAT    STOPPED     R_5         00:00:00      11:54:20    
[oracle@srtcreen OGG]$ cd dirdat
[oracle@srtcreen dirdat]$ ll
total 412
-rw-rw-rw- 1 oracle oinstall      0 Dec  1 21:10 U1000000
-rw-rw-rw- 1 oracle oinstall   6485 Dec  2  2011 U2000000
-rw-rw-rw- 1 oracle oinstall 407519 Dec  2 09:04 U3000000

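The data most recently written to the target now carries the corrected current time (this is the system time, so it was bound to change)
For Time Since Chkpt showing unknown, see the post "Resolving Time Since Chkpt showing unknown in ogg"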

III. Resetting the start point of r_3

GGSCI (srtcreen) 7> alter r_3,begin 2011-12-02 09:00:00
REPLICAT altered.


GGSCI (srtcreen) 9> start r_3

Sending START request to MANAGER ...
REPLICAT R_3 starting


GGSCI (srtcreen) 10> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     R_1         00:00:00      00:00:05      
REPLICAT    RUNNING     R_2         00:00:00      00:00:02     
REPLICAT    ABENDED     R_3         unknown       00:00:00    
REPLICAT    STOPPED     R_4         00:00:00      11:55:40    
REPLICAT    STOPPED     R_5         00:00:00      11:55:29    


GGSCI (srtcreen) 11> view report r_3


***********************************************************************
                 Oracle GoldenGate Delivery for Oracle
     Version 11.1.1.1.1 OGGCORE_11.1.1.1.1_PLATFORMS_110729.1700 
   Linux, x64, 64bit (optimized), Oracle 10g on Sep 13 2011 21:33:03
 
Copyright (C) 1995, 2011, Oracle and/or its affiliates. All rights reserved.


                    Starting at 2011-12-02 09:09:31
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Fri Apr 2 14:58:14 EDT 2010, Release 2.6.18-194.el5
Node: srtcreen
Machine: x86_64
                         soft limit   hard limit
Address Space Size   :    unlimited    unlimited
Heap Size            :    unlimited    unlimited
File Size            :    unlimited    unlimited
CPU Time             :    unlimited    unlimited

Process id: 13629

Description: 

***********************************************************************
**            Running with the following parameters                  **
***********************************************************************
replicat r_3
ASSUMETARGETDEFS
HANDLECOLLISIONS
SETENV (NLS_LANG =AMERICAN_AMERICA.ZHS16GBK)
Set environment variable (NLS_LANG=AMERICAN_AMERICA.ZHS16GBK)
userid it1ogg, password ********
--file for dicarded transaction --
discardfile /opt/OGG/discard/R_3.txt, append, megabytes 100
DDL
MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);


CACHEMGR virtual memory values (may have been adjusted)
CACHEBUFFERSIZE:                         64K
CACHESIZE:                              512M
CACHEBUFFERSIZE (soft max):               4M
CACHEPAGEOUTSIZE (normal):                4M
PROCESS VM AVAIL FROM OS (min):           1G
CACHESIZEMAX (strict force to disk):    881M

Database Version:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

Database Language and Character Set:
NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK" 
NLS_LANGUAGE     = "AMERICAN" 
NLS_TERRITORY    = "AMERICA" 
NLS_CHARACTERSET = "ZHS16GBK" 

For further information on character set settings, please refer to user manual.

***********************************************************************
**                     Run Time Messages                             **
***********************************************************************

Opened trail file /opt/OGG/dirdat/U3000000 at 2011-12-02 09:09:32

MAP resolved (entry CSCNEW.TAB_CS_CALL_PICKUP):
  MAP CSCNEW.TAB_CS_CALL_PICKUP, TARGET SRT_CREEN.TAB_CS_CALL_PICKUP, KEYCOLS (CALL_ID);
Using following columns in default map by name:
  PICKUP_ID, CALL_ID, CALL_SERIAL, USER_ID, PICKUP_TIME, CALL_RESULT, 
  FAIL_REASON, CALL_TIME, CALL_DURA, END_TIME

Using the following key columns for target table SRT_CREEN.TAB_CS_CALL_PICKUP: CALL_ID.


2011-12-02 09:09:32  WARNING OGG-00869  OCI Error ORA-01407: cannot update ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDAT
E "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CAL
L_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.

2011-12-02 09:09:32  WARNING OGG-01004  Aborted grouped transaction on 'SRT_CREEN.TAB_CS_CALL_PICKUP', Database error 1407 (OCI Error ORA-01407: cannot updat
e ("SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = 
:a2,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>).

2011-12-02 09:09:32  WARNING OGG-01003  Repositioning to rba 375833 in seqno 0.

2011-12-02 09:09:32  WARNING OGG-01154  SQL error 1407 mapping CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP OCI Error ORA-01407: cannot update (
"SRT_CREEN"."TAB_CS_CALL_PICKUP"."PICKUP_ID") to NULL (status = 1407), SQL <UPDATE "SRT_CREEN"."TAB_CS_CALL_PICKUP" SET "PICKUP_ID" = :a0,"CALL_SERIAL" = :a2
,"USER_ID" = :a3,"PICKUP_TIME" = :a4,"CALL_RESULT" = :a5,"FAIL_REASON" = :a6,"CALL_TIME" = :a7,"CALL_DURA" = :a8,"END_T>.

2011-12-02 09:09:32  WARNING OGG-01003  Repositioning to rba 375833 in seqno 0.

Source Context :
  SourceModule            : [er.main]
  SourceID                : [/scratch/angorant/view_storage/angorant_ogg_12978807_x64/oggcore/OpenSys/src/app/er/rep.c]
  SourceFunction          : [take_rep_err_action]
  SourceLine              : [16134]
  ThreadBacktrace         : [8] elements
                          : [/opt/OGG/replicat(CMessageContext::AddThreadContext()+0x26) [0x5ef8b6]]
                          : [/opt/OGG/replicat(CMessageFactory::CreateMessage(CSourceContext*, unsigned int, ...)+0x7b2) [0x5e6382]]
                          : [/opt/OGG/replicat(_MSG_ERR_MAP_TO_TANDEM_FAILED(CSourceContext*, DBString<777> const&, DBString<777> const&, CMessageFactory::Me
ssageDisposition)+0x9b) [0x5c4bcb]]
                          : [/opt/OGG/replicat [0x81ac2f]]
                          : [/opt/OGG/replicat [0x8f73e2]]
                          : [/opt/OGG/replicat(main+0x84b) [0x50764b]]
                          : [/lib64/libc.so.6(__libc_start_main+0xf4) [0x3e2f41d994]]
                          : [/opt/OGG/replicat(__gxx_personality_v0+0x1da) [0x4e3c2a]]

2011-12-02 09:09:32  ERROR   OGG-01296  Error mapping from CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************

Last record for the last committed transaction is the following: 
___________________________________________________________________
Trail name :  /opt/OGG/dirdat/U3000000
Hdr-Ind    :     E  (x45)     Partition  :     .  (x04) 
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41) 
RecLength  :   157 (x009d)    IO Time    : 2011-12-02 11:52:49.559112  
IOType     :    15  (x0f)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :       3233       AuditPos   : 8194064
Continued  :     N  (x00)     RecCount   :     1  (x01)

2011-12-02 11:52:49.559112 FieldComp          Len   157 RBA 375833
Name: CSCNEW.TAB_CS_CALL_PICKUP
___________________________________________________________________

Reading /opt/OGG/dirdat/U3000000, current RBA 375833, 0 records

Report at 2011-12-02 09:09:32 (activity since 2011-12-02 09:09:32)

From Table CSCNEW.TAB_CS_CALL_PICKUP to SRT_CREEN.TAB_CS_CALL_PICKUP:
       #                   inserts:         0
       #                   updates:         0
       #                   deletes:         0
       #                  discards:         1


DDL replication statistics:

                    Operations:         0
             Mapped operations:         0
           Unmapped operations:         0
              Other operations:         0
           Excluded operations:         0
                        Errors:         0
                Retried errors:         0
              Discarded errors:         0
                Ignored errors:         0




Last log location read:
     FILE:      /opt/OGG/dirdat/U3000000
     SEQNO:     0
     RBA:       375833
     TIMESTAMP: 2011-12-02 11:52:49.559112
     EOF:       NO
     READERR:   0


2011-12-02 09:09:32  ERROR   OGG-01668  PROCESS ABENDING.
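-- The report shows that the record at the position I reset to has an IO time of 2011-12-02 11:52:49.559112. The (corrected) system clock has not yet reached that time, and the process failed to start.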
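
The comparison made in the comment above (the record's IO Time against the current system clock) can be sketched as a small shell helper. This is only an illustration: the function name `io_time_lead_seconds` is hypothetical, it assumes GNU `date` (as found on this Linux host), and the two timestamps are copied by hand from the report rather than parsed from it.

```shell
# Hypothetical helper: print how many seconds the trail record's IO Time ($1)
# is ahead of the clock time ($2). A positive result means the record is
# stamped "in the future" relative to the clock. Assumes GNU date.
io_time_lead_seconds() {
    # Drop fractional seconds such as ".559112" before parsing.
    local io_ts="${1%%.*}"
    local now_ts="${2%%.*}"
    local io_epoch now_epoch
    # Parse both values as UTC so the local time zone cannot skew the difference.
    io_epoch=$(date -u -d "$io_ts" +%s) || return 1
    now_epoch=$(date -u -d "$now_ts" +%s) || return 1
    echo $(( io_epoch - now_epoch ))
}

# IO Time and report time taken from the r_3 report above.
io_time_lead_seconds "2011-12-02 11:52:49.559112" "2011-12-02 09:09:32"   # prints 9797
```

A lead of 9797 seconds (2 h 43 min 17 s) matches the observation that the record time had not yet been reached when the replicat was started.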

GGSCI (srtcreen) 2> alter r_3,begin 2011-12-02 18:00:00
REPLICAT altered.
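-- Reset the begin time to a point later than the incorrect timestamps that were written before I corrected the clock this morning.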

GGSCI (srtcreen) 3> start r_3

Sending START request to MANAGER ...
REPLICAT R_3 starting


GGSCI (srtcreen) 4> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     R_1         00:00:00      00:00:06    
REPLICAT    RUNNING     R_2         00:00:00      00:00:01    
REPLICAT    RUNNING     R_3         unknown       00:00:08    
REPLICAT    RUNNING     R_4         00:00:00      00:00:09    
REPLICAT    RUNNING     R_5         00:00:00      00:00:09    


GGSCI (srtcreen) 5> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     R_1         00:00:00      00:00:09    
REPLICAT    RUNNING     R_2         00:00:00      00:00:03    
REPLICAT    RUNNING     R_3         00:00:00      00:00:00    
REPLICAT    RUNNING     R_4         00:00:00      00:00:02    
REPLICAT    RUNNING     R_5         00:00:00      00:00:01    


GGSCI (srtcreen) 6> stats r_3

Sending STATS request to REPLICAT R_3 ...

No active replication maps
DDL replication statistics:

*** Total statistics since replicat started     ***
        Operations                                   0.00
        Mapped operations                            0.00
        Unmapped operations                          0.00
        Other operations                             0.00
        Excluded operations                          0.00
        Errors                                       0.00
        Retried errors                               0.00
        Discarded errors                             0.00
        Ignored errors                               0.00
-- The process has started and is working normally.

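IV. Additional notes
1. Before deploying OGG, verify that the system clocks on both sides match. To avoid unnecessary trouble, configure NTP to keep them synchronized.
2. In this case the replication had no strict go-live deadline, so resetting the r_3 begin time was enough to get it working again. If the situation is more urgent, it is better to delete the trail files and rebuild the pump and replicat processes.
3. This case shows that OGG replication is not entirely independent of system time.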
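
As a rough pre-deployment check for the first note, the clock difference between the two servers can be measured before any OGG process is started. The sketch below is an assumption-laden illustration: the function name `clock_skew_seconds`, the host name `ogg-source`, and the 5-second threshold are all hypothetical and not taken from the environment described here.

```shell
# Hypothetical helper: absolute difference, in seconds, between two epoch values.
clock_skew_seconds() {
    local a="$1" b="$2" diff
    diff=$(( a - b ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    echo "$diff"
}

# Usage sketch (assumes passwordless ssh to a hypothetical source host "ogg-source"):
#   skew=$(clock_skew_seconds "$(date +%s)" "$(ssh ogg-source date +%s)")
#   [ "$skew" -le 5 ] || echo "clock skew is ${skew}s; fix NTP before starting OGG"
```

The ssh round trip adds a small delay of its own, so this only catches gross differences such as the multi-hour offset seen in this case; NTP remains the actual remedy.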