查看Aix进程占用内存大小

内存从大到小排序

-bash-3.00$ ps aux | head -1 ; ps aux | sort -rn +3 | head -10
USER         PID %CPU %MEM   SZ  RSS    TTY STAT    STIME  TIME COMMAND
root     1908868  0.1  2.0 30508 29724      - A      Dec 21 235:34 /usr/sbin/secld
zhldang  2728188  0.0  0.0 1208 1288 pts/15 A    11:20:20  0:00 -bash 
zhldang  1421376  0.0  0.0 2036 1056      - A    11:20:15  0:00 sshd: zhldang@p
zerbo625 2482224  0.0  0.0 1164 1244 pts/24 A    11:04:35  0:00 -bash 
zerbo625 1523730  0.0  0.0 1796 1036      - A    11:04:29  0:00 sshd: zerbo625@
yxq198sm 2429092  0.0  0.0  152  156      - A    12:24:48  0:00 sleep 30 
yxq198sm 1327254  0.0  0.0 1600 1872      - A      Feb 01  0:06 /usr/bin/bsh /u
yanmotia 1274070  0.0  0.0 1164  768 pts/22 A      Dec 23  0:00 -bash 
xinximai 2658402  0.0  0.0 1156  760 pts/83 A      Dec 28  0:00 -bash 
xifengke 2330668  0.0  0.0 1160  756 pts/80 A      Dec 05  0:00 -bash 

内存从大到小排序(详细执行命令)

-bash-3.00$ ps -ealf | head -1 ; ps -ealf | sort -rn +9 | head  -10
       F S      UID     PID    PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD
  240001 A     root  188582       1   0  60 20 2823c400 36708        *   Jan 03      - 676:39 /usr/sbin/secldapclntd 
  240001 A     root 1908868       1   0  60 22 6648400 30508        *   Dec 21      - 235:40 /usr/sbin/secldapclntd 
  240001 A     root  331840   86176   0  60 20 205f8400  4088            Sep 15      - 11:04 /usr/lpp/X11/bin/X -D /usr/lib/X11//rgb -T -force :0 -auth /var/dt/A:0-0xukaa 
  240001 A     root 2216054  233648   0  60 20 142a400  3992 f100060002c9e8c8 11:19:59      -  0:00 sshd: zhldang [priv] 
  240001 A     root 1515620  233648   0  60 20 2075f400  3992 f100060003bb0cc8 12:21:48      -  0:00 sshd: undeadbird [priv] 
  240001 A     root 2719848  233648   0  60 20 916a400  3988 f100060002b068c8 13:07:55      -  0:00 sshd: wangyh0313 [priv] 
  240001 A     root 2003106  233648   0  60 20 3428400  3876 f1000600038f94c8 11:02:58      -  0:00 sshd: intrepid [priv] 
  240001 A     root 2560234  233648   0  60 20 44a8400  3816 f1000600018264c8 11:01:54      -  0:00 sshd: xifenfei [priv] 
  240001 A     root 2846944  233648   0  60 20 22c7b400  3784 f100060003b050c8 11:04:12      -  0:00 sshd: zerbo625 [priv] 
  200001 A undeadbi 2220232 1007670   0  60 20 2355f400  3104 f100060001c2c878 13:03:08  pts/9  0:00 /usr/bin/smitty 

查找某个进程使用内存

-bash-3.00$ ps aux | head -1 ; ps aux | grep xifenfei
USER         PID %CPU %MEM   SZ  RSS    TTY STAT    STIME  TIME COMMAND
xifenfei 2494532  0.0  0.0 1836 1044      - A    11:02:05  0:00 sshd: xifenfei@
xifenfei 2666546  0.0  0.0 1148 1228 pts/20 A    11:02:10  0:00 -bash 
root     2560234  0.0  0.0 3816 3024      - A    11:01:54  0:00 sshd: xifenfei 
xifenfei 2220230  0.0  0.0  912  940 pts/20 A    13:03:06  0:00 ps aux 
xifenfei 1310908  0.0  0.0  228  240 pts/20 A    13:03:06  0:00 grep xifenfei 

查找某个进程使用内存(详细执行命令)

-bash-3.00$ ps -ealf | head -1 ;ps -ealf|grep xifenfei
       F S      UID     PID    PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD
   40001 A xifenfei 2494532 2560234   0  60 20 37575400  1836          11:02:05      -  0:00 sshd: xifenfei@pts/20 
  200001 A xifenfei 2510946 2666546   1  60 20 14045400   960          12:57:25 pts/20  0:00 ps -ealf 
  200001 A xifenfei 2543730 2666546   0  60 20 10cc6400   228 f100010017976ca0 12:57:25 pts/20  0:00 grep xifenfei 
  240001 A     root 2560234  233648   0  60 20 44a8400  3816 f1000600018264c8 11:01:54      -  0:00 sshd: xifenfei [priv] 
Posted in AIX |

ORA-00600[ktspNextL1:4]

在检查某运营商的客服数据库时发现如下错误

Tue Jan 31 22:00:50 2012
Errors in file /oracle10/admin/ahunicom/bdump/ahunicom1_j005_24445074.trc:
ORA-00600: internal error code, arguments: [ktspNextL1:4], [], [], [], [], [], [], []
Tue Jan 31 22:01:18 2012
Trace dumping is performing id=[cdmp_20120131220118]
Tue Jan 31 22:01:18 2012
Errors in file /oracle10/admin/ahunicom/bdump/ahunicom1_j005_24445074.trc:
ORA-00600: internal error code, arguments: [ORA-00600: internal error code, arguments: [ktspNextL1:4], [], [], [], [], [], [], []
ORA-06512: at "SYS.PRVT_ADVISOR", line 1624
ORA-06512: at "SYS.DBMS_ADVISOR", line 186
ORA-06512: at "SYS.DBMS_SPACE", line 1500
ORA-06512: at "SYS.DBMS_SPACE", line 1566
], [], [], [], [], [], [], []

查看trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle10/app/product/db/10.2.0
System name:    AIX
Node name:      zwq_kfdb1
Release:        1
Version:        6
Machine:        00C5C4764C00
Instance name: ahunicom1
Redo thread mounted by this instance: 1
Oracle process number: 192
Unix process pid: 24445074, image: oracle@zwq_kfdb1 (J005)

*** ACTION NAME:(AUTO_SPACE_ADVISOR_JOB) 2012-01-31 22:00:50.874
*** MODULE NAME:(DBMS_SCHEDULER) 2012-01-31 22:00:50.874
*** SERVICE NAME:(SYS$USERS) 2012-01-31 22:00:50.874
*** SESSION ID:(454.44574) 2012-01-31 22:00:50.874
*** 2012-01-31 22:00:50.873
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktspNextL1:4], [], [], [], [], [], [], []
Current SQL statement for this session:
insert into wri$_adv_objspace_trend_data select timepoint,  space_usage, space_alloc, quality from  table(dbms_space.object_growth_trend(:1, :2, :3, :4, NULL
, NULL,  NULL, 'FALSE', :5, 'FALSE'))
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
70000010d9a65e0      1834  package body SYS.DBMS_SPACE
70000010d9a65e0      3922  package body SYS.DBMS_SPACE
70000010d9a65e0      4233  package body SYS.DBMS_SPACE
70000011a7d9c88         1  anonymous block
700000180678048       344  SYS.WRI$_ADV_OBJSPACE_TREND_T
700000180678048      1485  SYS.WRI$_ADV_OBJSPACE_TREND_T
7000001334867d0      1535  package body SYS.PRVT_ADVISOR
7000001334867d0      1618  package body SYS.PRVT_ADVISOR
70000011f9f64a8       186  package body SYS.DBMS_ADVISOR
70000010d9a65e0      1500  package body SYS.DBMS_SPACE
70000010d9a65e0      1566  package body SYS.DBMS_SPACE
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              70000017E9975D0 ? 100000001 ?
ksedmp+0290          bl       ksedst               104A2CDB0 ?
ksfdmp+0018          bl       03F2735C             
kgerinv+00dc         bl       _ptrgl               
kgeasnmierr+004c     bl       kgerinv              000000000 ? 000000001 ?
                                                   000000005 ? 7000000E19760FC ?
                                                   7000000E1976014 ?
ktspGetNextL1ForSca  bl       01F94828             
n+0104                                             
ktspScanInit+026c    bl       ktspGetNextL1ForSca  A0A0E1B89 ? 10564BCF4 ?
                              n                    80003000804DC ?
ktspGenExtentMap1+0  bl       ktspScanInit         000000000 ?
0e8                                                
kteinmap1+00bc       bl       ktspGenExtentMap1    000000000 ? 000000001 ?
                                                   FFFFFFFFFFE3A80 ?
kteinmap+0010        bl       kteinmap1            000000000 ? 00000000D ?
kdgini+036c          bl       kteinmap             FFFFFFFFFFE3DA0 ?
kdg_block_auto+018c  bl       kdgini               000000000 ? A0B4A708C ?
                                                   5676A0005676A ?
                                                   80003000804DC ? 000000000 ?
                                                   70000016AD82170 ? 110195498 ?
                                                   70000016AD82148 ?
ktsa_object_space_u  bl       kdg_block_auto       FFFFFFFFFFE4C50 ?
sage+0950                                          FFFFFFFFFFE4CA0 ? 200000780 ?
                                                   70000011A7D9C88 ?

由trace文件中的insert语句可以知道,这个错误是DBMS_SPACE.OBJECT_GROWTH_TREND进行空间分析时被触发
查询MOS[ID 841158.1],发现这个是一个没有公布的bug(5649098),在11.1.0.7中被修复

解决方案:
1.忽略这个错误,因为这个错误是不可重复的,发生的概率不大
2.升级到11.1.0.7及其以上版本
3.如果遇到这个错误,可以手工执行dbms_space.auto_space_advisor_job_proc();

老版本PL/SQL Developer操作数据库导致ORA-00600[17113]

在巡检某运营商的计费库时,发现alert日志中发现如下错误

Thu Feb  2 13:54:52 2012
Errors in file /oracle9/app/admin/bill/udump/bill1_ora_35651918.trc:
ORA-00600: internal error code, arguments: [17113], [0x000000000], [], [], [], [], [], []

查看trace文件

Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      zwq_bill_1
Release:        1
Version:        6
Machine:        00F64FF34C00
Instance name: bill1
Redo thread mounted by this instance: 1
Oracle process number: 200
Unix process pid: 35651918, image: oracle@zwq_bill_1 (TNS V1-V3)

*** 2012-02-02 13:54:52.169
*** SESSION ID:(210.1380) 2012-02-02 13:54:52.150
********** Internal heap ERROR 17113 addr=0 *********
******************************************************
HEAP DUMP heap name=""  desc=0
 extent sz=0x0 alt=0 het=0 rec=0 flg=0 opc=0
 parent=0 owner=0 nex=0 xsz=0x0
 Hla: -1
*** 2012-02-02 13:54:52.169
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17113], [0x000000000], [], [], [], [], [], []
Current SQL statement for this session:
declare
  t_owner varchar2(30);
  t_name  varchar2(30);
  procedure check_mview is
    dummy integer;
  begin
    if :object_type = 'TABLE' then
      select 1 into dummy
      from sys.all_objects
      where owner = :object_owner
      and object_name = :object_name
      and object_type = 'MATERIALIZED VIEW'
      and rownum = 1;
      :object_type := 'MATERIALIZED VIEW';
    end if;
  exception
    when others then null;
  end;
begin
  :sub_object := null;
  if :deep != 0 then
    begin
      if :part2 is null then
        select constraint_type, owner, constraint_name
          into :object_type, :object_owner, :object_name
          from sys.all_constraints c
         where c.constraint_name = :part1 and c.owner = user
           and rownum = 1;
      else
        select constraint_type, owner, constraint_name, :part3
          into :object_type, :object_owner, :object_name, :sub_object
          from sys.all_constraints c
         where c.constraint_name = :part2 and c.owner = :part1
           and rownum = 1;
      end if;
      if :object_type = 'P' then :object_type := 'PRIMARY KEY'; end if;
      if :object_type = 'U' then :object_type := 'UNIQUE KEY'; end if;
      if :object_type = 'R' then :object_type := 'FOREIGN KEY'; end if;
      if :object_type = 'C' then :object_type := 'CHECK CONSTRAINT'; end if;        
      return;
    exception
      when no_data_found then null;
    end;
  end if;
  :sub_object := :part2;
  if (:part2 is null) or (:part1 != user) then
    begin
      select object_type, user, :part1
      into :object_type, :object_owner, :object_name
      from sys.all_objects
      where owner = user
      and object_name = :part1
      and object_type in ('MATERIALIZED VIEW', 'TABLE', 'VIEW', 'SEQUENCE', 'PROCEDURE', 'FUNCTION', 'PACKAGE', 'TYPE', 'TRIGGER', 'SYNONYM')
      and rownum = 1;
      if :object_type = 'SYNONYM' then
        select s.table_owner, s.table_name
          into t_owner, t_name
          from sys.all_synonyms s
         where s.synonym_name = :part1
           and s.owner = user
           and rownum = 1;
        select o.object_type, o.owner, o.object_name
          into :object_type, :object_owner, :object_name
          from sys.all_objects o 
         where o.owner = t_owner
           and o.object_name = t_name
           and object_type in ('MATERIALIZED VIEW', 'TABLE', 'VIEW', 'SEQUENCE', 'PROCEDURE', 'FUNCTION', 'PACKAGE', 'TYPE', 'TRIGGER', 'SYNONYM')
           and rownum = 1;
      end if;
      :sub_object := :part2;
      if :part3 is not null then
        :sub_object := :sub_object || '.' || :part3;
      end if;
      check_mview;
      return;
    exception
      when no_data_found then null;
    end;
  end if;
  begin
    select s.table_owner, s.table_name
      into t_owner, t_name
      from sys.all_synonyms s
     where s.synonym_name = :part1
       and s.owner = 'PUBLIC'
       and rownum = 1;
    select o.object_type, o.owner, o.object_name
      into :object_type, :object_owner, :object_name
      from sys.all_objects o 
     where o.owner = t_owner
       and o.object_name = t_name
       and object_type in ('MATERIALIZED VIEW', 'TABLE', 'VIEW', 'SEQUENCE', 'PROCEDURE', 'FUNCTION', 'PACKAGE', 'TYPE', 'TRIGGER', 'SYNONYM')
       and rownum = 1;
    check_mview;
    return;
  exception
    when no_data_found then null;
  end;
  :sub_object := :part3;
  begin
    select o.object_type, o.owner, o.object_name
      into :object_type, :object_owner, :object_name
      from sys.all_objects o
     where o.owner = :part1
       and o.object_name = :part2
       and object_type in ('MATERIALIZED VIEW', 'TABLE', 'VIEW', 'SEQUENCE', 'PROCEDURE', 'FUNCTION', 'PACKAGE', 'TYPE', 'TRIGGER', 'SYNONYM')
       and rownum = 1;
    check_mview;
    return;
  exception
    when no_data_found then null;
  end;
  begin
    if :part2 is null and :part3 is null
    then
      select 'USER', null, :part1
      into :object_type, :object_owner, :object_name
      from sys.all_users u
      where u.username = :part1
      and rownum = 1;
      return;
    end if;
  exception
    when no_data_found then null;
  end;
  begin
    if :part2 is null and :part3 is null and :deep != 0
    then
      select 'ROLE', null, :part1
      into :object_type, :object_owner, :object_name
      from sys.session_roles r
      where r.role = :part1
      and rownum = 1;
      return;
    end if;
  exception
    when no_data_found then null;
  end;
  :object_owner := null;
  :object_type := null;
  :object_name := null;
  :sub_object := null;
end;

通过观察trace的这部分可以知道,是执行这条sql语句导致了ora-600[17113]错误的产生,而这条语句是查询系统对象,初步怀疑是oracle客户端上产生,继续阅读trace文件发现

    SO: 7000000b926f188, type: 4, owner: 7000000bd298b00, flag: INIT/-/-/0x00
    (session) trans: 0, creator: 7000000bd298b00, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-00C8-0009F6EF, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 47, prv: 0, sql: 7000000d481ee78, psql: 7000000e086d8d8, user: 567/IBILLAPP
    O/S info: user: Administrator, term: WWW-39A255460E8, ospid: 784:2080, machine: WORKGROUP\WWW-39A255460E8
              program: plsqldev.exe
    application name: PL/SQL Developer, hash value=1190136663
    action name: Main session, hash value=1773317990
    last wait for 'SQL*Net more data from client' blocking sess=0x0 seq=78 wait_time=4
                driver id=54435000, #bytes=34, =0
    temporary object counter: 0

从这里进一步确认是有人使用PL/SQL Developer从WORKGROUP\WWW-39A255460E8的机器上操作该数据库导致,查询mos,果然发现该问题记录[ID 396326.1]。
解决办法:使用新版本的plsql dev工具即可解决问题。

ORA-19583/ORA-27206/ORA-06512

检查alert日志发现近期数据库出现了ORA-19583/ORA-27206/ORA-06512错误

Fri Dec 30 12:22:36 2011
Errors in file /oracle9/app/admin/ykcdb/udump/ykcdb_ora_5390564.trc:
ORA-19583: conversation terminated due to error
ORA-27206: requested file not found in media management catalog
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 625

进一步查看trace文件发现

Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.7.0 - Production
ORACLE_HOME = /oracle9/app/product/9.2.0
System name:    AIX
Node name:      OFFON2
Release:        3
Version:        5
Machine:        000A36B8D600
Instance name: ykcdb
Redo thread mounted by this instance: 1
Oracle process number: 79
Unix process pid: 5390564, image: oracle@OFFON2 (TNS V1-V3)

*** 2011-12-30 12:22:36.051
*** SESSION ID:(32.62184) 2011-12-30 12:22:36.023
FATAL ERROR IN TWO-TASK SERVER: error = 12152
*** 2011-12-30 12:22:36.051
ksedmp: internal or fatal error
ORA-19583: conversation terminated due to error
ORA-27206: requested file not found in media management catalog
ORA-06512: at "SYS.DBMS_BACKUP_RESTORE", line 625
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               1025AE55C ?
opitsk+0968          bl       ksedmp               20289E71C ?
opiino+0798          bl       opitsk               000000000 ? 000000000 ?
opiodr+08c8          bl       _ptrgl               
opidrv+032c          bl       opiodr               3C00000018 ? 4101C7F40 ?
                                                   FFFFFFFFFFFF810 ? 0A03547C8 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A039C2D0 

查看mos发现[ID 558993.1]提供了错误原因和解决方案
错误原因:RMAN backup to tape using veritas netbackup,The MML parameters like NB_ORA_CLIENT and NB_ORA_SERV were not given in proper case while configuring channel in RMAN backup.
解决方法:The NB_ORA_CLIENT and NB_ORA_SERV MML parameters are case sensitive. So make sure to give these parameters in same case while allocating or configuring RMAN channel as it is configure in Veritas server.

使用dblink导致scn发现改变

1.dblink目标端

[oracle@node1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jan 21 15:25:59 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options

SQL> select dbms_flashback.GET_SYSTEM_CHANGE_NUMBER scn from dual;

       SCN
----------
  17454270

SQL> select file#,CHECKPOINT_CHANGE# scn from v$datafile;

     FILE#        SCN
---------- ----------
         1   17454265
         2   17454265
         3   17454265
         4   17454265
         5   17454265
         6   17454265
         7   17454265
         8   17454265
         9   17454265
        10   17454265
        11   17454265

     FILE#        SCN
---------- ----------
        13   17454265
        14   17454265

13 rows selected.

通过这里可以知道,该数据库目前的scn是小于18000000。

2.dblink操作

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Sat Jan 21 15:49:39 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select dbms_flashback.GET_SYSTEM_CHANGE_NUMBER scn from dual;

       SCN
----------
  20359364

SQL> select dbms_flashback.GET_SYSTEM_CHANGE_NUMBER scn from dual@chf;

       SCN
----------
  20359367

通过这步操作发现目标端的scn发生改变,和原端库scn一致(因为先后关系,有小出入)

3.再次查看dblink目标端

SQL>  alter system checkpoint;

System altered.

SQL> select file#,CHECKPOINT_CHANGE# scn from v$datafile;

     FILE#        SCN
---------- ----------
         1   20359409
         2   20359409
         3   20359409
         4   20359409
         5   20359409
         6   20359409
         7   20359409
         8   20359409
         9   20359409
        10   20359409
        11   20359409

     FILE#        SCN
---------- ----------
        13   20359409
        14   20359409

13 rows selected.

通过checkpoint之后,发现目标库的scn已经发现改变(向前推荐),从而证明使用dblink,确实使得scn小的库和scn大的库一致。