Using procmap on AIX to see how much system memory an Oracle process uses

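procmap displays the address space of a process. The regions it marks "read/write" are the process's private memory. For an Oracle LOCAL process, this is the operating-system memory taken by the Oracle session process itself. It is unrelated to the SGA and PGA; in other words, it is extra system memory consumed by the Oracle database processes. When calculating how much memory an Oracle database consumes, count sga + pga + the memory used by the processes.
Using the procmap command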

$procmap 7931354
7931354 : oracleccicdx (LOCAL=NO) 
100000000            95504K  read/exec         oracle
110000035             2399K  read/write        oracle
9fffffff0000000         51K  read/exec         /usr/ccs/bin/usla64
9fffffff000cfe2          0K  read/write        /usr/ccs/bin/usla64
900000000b05930          2K  read/exec         /usr/lib/libC.a[shr3_64.o]
9001000a0122930          0K  read/write        /usr/lib/libC.a[shr3_64.o]
900000000ae6b00        118K  read/exec         /usr/lib/libC.a[shrcore_64.o]
9001000a030a100         12K  read/write        /usr/lib/libC.a[shrcore_64.o]
900000000ac8000        118K  read/exec         /usr/lib/libC.a[ansicore_64.o]
9001000a0300e00         36K  read/write        /usr/lib/libC.a[ansicore_64.o]
900000000411468          0K  read/exec         /usr/lib/libicudata.a[shr_64.o]
9001000a0121468          0K  read/write        /usr/lib/libicudata.a[shr_64.o]
90000000040f738          2K  read/exec         /usr/lib/libC.a[shr2_64.o]
9001000a0314738          0K  read/write        /usr/lib/libC.a[shr2_64.o]
9000000008dd800       1699K  read/exec         /usr/lib/libC.a[ansi_64.o]
9001000a0315a00        277K  read/write        /usr/lib/libC.a[ansi_64.o]
9000000008bab00        135K  read/exec         /usr/lib/libC.a[shr_64.o]
9001000a030eb00         19K  read/write        /usr/lib/libC.a[shr_64.o]
900000000708180       1732K  read/exec         /usr/lib/libicuuc.a[shr_64.o]
9001000a035cdac        180K  read/write        /usr/lib/libicuuc.a[shr_64.o]
900000000493d80       2510K  read/exec         /usr/lib/libicui18n.a[shr_64.o]
9001000a038a148        270K  read/write        /usr/lib/libicui18n.a[shr_64.o]
900000000473200         91K  read/exec         /usr/lib/libsrc.a[shr_64.o]
9001000a01127a8         55K  read/write        /usr/lib/libsrc.a[shr_64.o]
90000000045a300         98K  read/exec         /usr/lib/libcorcfg.a[shr_64.o]
9001000a04147c8         18K  read/write        /usr/lib/libcorcfg.a[shr_64.o]
900000000b16200        750K  read/exec         /usr/lib/liblvm.a[shr_64.o]
9001000a03dd028        219K  read/write        /usr/lib/liblvm.a[shr_64.o]
900000000444f00         82K  read/exec         /usr/lib/libcfg.a[shr_64.o]
9001000a03d58f0         26K  read/write        /usr/lib/libcfg.a[shr_64.o]
90000000040e3a0          2K  read/exec         /usr/lib/libcrypt.a[shr_64.o]
9001000a0106948          0K  read/write        /usr/lib/libcrypt.a[shr_64.o]
90000001615d860          5K  read/exec         /usr/lib/libc.a[aio_64.o]
9001000a3aed568          0K  read/write        /usr/lib/libc.a[aio_64.o]
9000000003efc00        120K  read/exec         /usr/lib/libodm.a[shr_64.o]
9001000a0107cc8         40K  read/write        /usr/lib/libodm.a[shr_64.o]
900000000bd2c80        147K  read/exec         /usr/lib/libperfstat.a[shr_64.o]
9001000a041a960         14K  read/write        /usr/lib/libperfstat.a[shr_64.o]
9000000017d7000          0K  read/exec         /usr/lib/libdl.a[shr_64.o]
9001000a0517000          0K  read/write        /usr/lib/libdl.a[shr_64.o]
9000000158ed100       8636K  read/exec         /oracle/product/db10gr2/lib/libjox10.a[shr.o]
8001000a0000b78        587K  read/write        /oracle/product/db10gr2/lib/libjox10.a[shr.o]
900000000a87000        257K  read/exec         /usr/lib/libpthreads.a[shr_xpg5_64.o]
9001000a0274000        559K  read/write        /usr/lib/libpthreads.a[shr_xpg5_64.o]
900000000000800       4025K  read/exec         /usr/lib/libc.a[shr_64.o]
9001000a0000020       1047K  read/write        /usr/lib/libc.a[shr_64.o]
         Total      121863K

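A simplified command lists only the private-memory sizes: procmap 7931354 | grep "read/write" | awk '{print $2}'. Adding up those values shows that, on this operating system and database version, one LOCAL=NO process uses 5758KB of system memory.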
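The pipeline above only prints the individual sizes, so the addition still has to be done by hand. A minimal sketch that also does the sum is shown here; the helper name sum_rw_kb is made up for illustration, and it assumes the three-column procmap layout shown in the listing above.

```shell
# Sum the size column (e.g. "2399K") of every read/write line read from stdin.
# sum_rw_kb is a hypothetical helper name, not part of AIX.
sum_rw_kb() {
    awk '$3 == "read/write" { sub(/K$/, "", $2); total += $2 }
         END { printf "%dK\n", total }'
}

# Usage on AIX (PID taken from the listing above):
#   procmap 7931354 | sum_rw_kb
```

Fed the listing above, the read/write sizes add up to the 5758K quoted in the text.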

Additional notes
1. Operating system version

$oslevel -r
6100-06

2. Database version

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 - Production

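3. Tracing several LOCAL=NO processes showed that they all use about the same amount of system memory, so this value can be used for a rough estimate of the memory taken by the Oracle processes on a system
4. This confirms that the memory ORACLE uses is not just sga+pga as previously assumed; it is actually sga+pga+the memory used by all oracle processes
5. On linux, use pmap to get the same information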
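As a rough illustration of note 3, the per-process figure can be multiplied by the number of LOCAL=NO processes. This is only a sketch: the helper name estimate_proc_mem_kb and the count of 200 sessions are hypothetical, and 5758 is the value measured above for this particular OS and database version.

```shell
# Rough total in KB: number of server processes times the per-process figure.
# estimate_proc_mem_kb is a hypothetical helper name.
estimate_proc_mem_kb() {
    echo $(( $1 * $2 ))
}

# Counting the LOCAL=NO processes on the host
# (the [L] bracket keeps grep from counting itself):
#   count=$(ps -ef | grep -c "[L]OCAL=NO")
#   estimate_proc_mem_kb "$count" 5758

estimate_proc_mem_kb 200 5758   # hypothetical 200 sessions, prints 1151600
```

The result is the process part only; sga and pga still have to be added to it.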

Why ORACLE on AIX produces the SOFTWARE PROGRAM ABNORMALLY TERMINATED warning

The following error was found in the database
Solution for this error: ORA-07445[dbgrmqmqpk_query_pick_key()+0f88]

Dump file /oracle/diag/rdbms/sgerp5/sgerp5/incident/incdir_579300/sgerp5_m000_7602504_i579300.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/11.1.0/db_1
System name:    AIX
Node name:  sgerp5
Release:    1
Version:    6
Machine:    00C8F0564C00
Instance name: sgerp5
Redo thread mounted by this instance: 1
Oracle process number: 138
Unix process pid: 7602504, image: oracle@sgerp5 (m000)
 
*** 2012-05-11 03:52:35.200
*** SESSION ID:(752.5029) 2012-05-11 03:52:35.200
*** CLIENT ID:() 2012-05-11 03:52:35.200
*** SERVICE NAME:(SYS$BACKGROUND) 2012-05-11 03:52:35.200
*** MODULE NAME:(MMON_SLAVE) 2012-05-11 03:52:35.200
*** ACTION NAME:(Auto-Purge Slave Action) 2012-05-11 03:52:35.200
 
Dump continued from file: /oracle/diag/rdbms/sgerp5/sgerp5/trace/sgerp5_m000_7602504.trc
ORA-07445: exception encountered: core dump [dbgrmqmqpk_query_pick_key()+0f88] 
[SIGSEGV] [ADDR:0xB38F0000000049][PC:0x100213C08] [Address not mapped to object] []

errpt error details
At the time the 7445 error occurred, the AIX system error log also recorded a SOFTWARE PROGRAM ABNORMALLY TERMINATED error

sgerp5_[oracle]-->errpt -aj A924A5FC
---------------------------------------------------------------------------
LABEL:          CORE_DUMP
IDENTIFIER:     A924A5FC

Date/Time:       Fri May 11 03:52:55 BEIST 2012
Sequence Number: 471
Machine Id:      00C8F0564C00
Node Id:         sgerp5
Class:           S
Type:            PERM
WPAR:            Global
Resource Name:   SYSPROC         

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

        Recommended Actions
        CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

        Recommended Actions
        RERUN THE APPLICATION PROGRAM
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
           6
USER'S PROCESS ID:
               7602504
FILE SYSTEM SERIAL NUMBER
          14
INODE NUMBER
           0      367648
CORE FILE NAME
/oracle/diag/rdbms/sgerp5/sgerp5/cdump/core_7602504/core
PROGRAM NAME
oracle
STACK EXECUTION DISABLED
           0
COME FROM ADDRESS REGISTER
sskgmcrea 0

PROCESSOR ID
  hw_fru_id: 1
  hw_cpu_id: 2

ADDITIONAL INFORMATION
skgdbgcra 224
??
ksdbgcra 3D0
ssexhd 978
??

Symptom Data
REPORTABLE
1
INTERNAL ERROR
0
SYMPTOM CODE
PCSS/SPI2 FLDS/oracle SIG/6 FLDS/skgdbgcra VALU/224

Cause of the error

This error is logged when a software program abnormally ends and causes a core dump. Users might
not be exiting applications correctly, the system might have been shut down while users were
working in application, or the user's terminal might have locked up and the application stopped
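1) In other words, if an oracle process on an aix machine terminates abnormally and produces a core dump file,
  the SOFTWARE PROGRAM ABNORMALLY TERMINATED warning is logged
2) A user did not exit an application correctly, or the system was shut down while users were still working in it
3) A user's terminal locked up and the application stopped

Cause of the AIX log warning in this case: process 7602504 terminated abnormally (the ORA-07445 error) and produced the core dump file /oracle/diag/rdbms/sgerp5/sgerp5/cdump/core_7602504/core, which is why AIX logged the SOFTWARE PROGRAM ABNORMALLY TERMINATED warning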
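The match between the errpt entry and the Oracle trace was made on the process ID and the core file path. A small sketch that pulls those two fields out of errpt -a output is given here; the helper name errpt_core_info is hypothetical, and it assumes the Detail Data layout shown above, where each value sits on the line after its label.

```shell
# Read `errpt -a` output on stdin and print "<pid> <core file>" per entry.
# errpt_core_info is a hypothetical helper name.
errpt_core_info() {
    awk '
        /PROCESS ID:/     { getline; pid = $1 }     # value is on the next line
        /^CORE FILE NAME/ { getline; print pid, $1 }
    '
}

# Usage on AIX (identifier taken from the entry above):
#   errpt -aj A924A5FC | errpt_core_info
```

The PID printed this way can then be compared with the "Unix process pid" line in the Oracle trace file.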

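ksh: recalling previous/next commands and auto-completion

AIX installs ksh by default. For anyone used to bash, having no tab completion and no way to scroll up/down through earlier commands is inconvenient. ksh does not offer these out of the box, but they can be had in other ways
Option 1: install bash, and everything then works as it does in bash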

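Option 2: do it within ksh by other means
Recalling previous/next commands
1. In the home directory, run vi .profile
2. Add the line: export EDITOR=vi
3. Save .profile and log in again; or run . ~/.profile
To recall commands, press the esc key and then use k/j to move to the previous/next command; to go back to typing, press i and carry on typing. All of these keys behave just as they do in vi.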

Auto-completion
Press esc, then \
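The .profile change described above can be sketched as follows. The set -o vi line is an addition not in the original steps: it is the standard ksh option for the same vi editing mode and takes effect in the current shell without logging in again.

```shell
# ~/.profile fragment (sketch)
export EDITOR=vi   # ksh turns on vi-style line editing when EDITOR ends in "vi"
set -o vi          # assumption: enable the same mode explicitly as well
```

With either line in effect, esc followed by k, j or \ works as described above.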


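Finding the process behind an AIX port with netstat + rmsock

rmsock removes sockets that have no file descriptor. It accepts a socket, tcpcb, inpcb, ripcb or rawcb address and converts it to a socket address. It then checks every open file of every process for a match with that socket. If no match is found, it aborts the socket regardless of any socket linger option, and the port number held by the socket is released. If a match is found, the file descriptor and the state of the owning process are shown to the user.
Command format: rmsock Address TypeofAddress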

[zwq:/]netstat -Aan|grep 6200|grep LISTEN
f1000e0000307bb0 tcp4       0      0  *.6200             *.*                LISTEN
-- f1000e0000307bb0 is the kernel address

[zwq:/]rmsock f1000e0000307bb0 tcpcb
The socket 0x307808 is being held by proccess 5701830 (ons).

[zwq:/]ps -ef|grep 5701830|grep -v grep
oracle10  5701830  5112098   0   Apr 21      -  7:17 /oracle10/app/product/crs/10.2.0/opmn/bin/ons -d 
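The three manual steps above can be chained. The sketch below only parses the output formats shown above; the helper names listen_pcb_addr and rmsock_pid are hypothetical. Matching on the exact local port avoids the false hits that grep 6200 would give for a port such as 16200. As described earlier, rmsock removes the socket when no process holds it, so use it with care.

```shell
# Print the address (column 1) of sockets listening on port $1,
# reading `netstat -Aan` output from stdin. Hypothetical helper name.
listen_pcb_addr() {
    awk -v port="$1" '$NF == "LISTEN" && $5 ~ ("\\." port "$") { print $1 }'
}

# Print the PID from an rmsock message such as
# "The socket 0x307808 is being held by proccess 5701830 (ons)."
rmsock_pid() {
    sed -n 's/.*held by proc[a-z]* \([0-9][0-9]*\).*/\1/p'
}

# Usage on AIX:
#   addr=$(netstat -Aan | listen_pcb_addr 6200)
#   pid=$(rmsock "$addr" tcpcb | rmsock_pid)
#   ps -fp "$pid"
```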

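nmon usage notes

Nmon is a free tool from IBM for monitoring system resources on AIX and Linux. It can monitor system performance in real time, or collect the server's resource usage into an output file that can then be analysed statistically with an excel-based analysis tool, which makes it very useful for performance analysis on UNIX or Linux systems.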

1. Download
nmon official site
NMON_Analyser official site
Local download: nmon
Local download: nmon_analyser

2. Installing nmon
Find the binary for your platform in the archive, upload it to the server, and grant it execute permission

3. Main keys

+-HELP---------most-keys-toggle-on/off------------------------------------------+
|h = Help information     q = Quit nmon             0 = reset peak counts       |
|+ = double refresh time  - = half refresh          r = ResourcesCPU/HW/MHz/AIX |
|c = CPU by processor     C=upto 128 CPUs           p = LPAR Stats (if LPAR)    |
|l = CPU avg longer term  k = Kernel Internal       # = PhysicalCPU if SPLPAR   |
|m = Memory & Paging      M = Multiple Page Sizes   P = Paging Space            |
|d = DiskI/O Graphs       D = DiskIO +Service times o = Disks %Busy Map         |
|a = Disk Adapter         e = ESS vpath stats       V = Volume Group stats      |
|^ = FC Adapter (fcstat)  O = VIOS SEA (entstat)    v = Verbose=OK/Warn/Danger  |
|n = Network stats        N=NFS stats (NN for v4)   j = JFS Usage stats         |
|A = Async I/O Servers    w = see AIX wait procs   "="= Net/Disk KB<-->MB       |
|b = black&white mode     g = User-Defined-Disk-Groups (see cmdline -g)         |
|t = Top-Process --->     1=basic 2=CPU-Use 3=CPU(default) 4=Size 5=Disk-I/O    |
|u = Top+cmd arguments    U = Top+WLM Classes       . = only busy disks & procs |
|W = WLM Section          S = WLM SubClasses)                                   |

4. Real-time monitoring results
1) Memory usage

| Memory -----------------------------------------------------------------------|
|          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache |
|% Used       93.8%     34.3%  | to Paging Space   0.0    0.0 | (numperm) 44.3% |
|% Free        6.2%     65.7%  | to File System    0.0  257.9 | Process   18.2% |
|MB Used    1786.0MB   175.8MB | Page Scans        0.0        | System    31.4% |
|MB Free     118.0MB   336.2MB | Page Cycles       0.0        | Free       6.2% |
|Total(MB)  1904.0MB   512.0MB | Page Steals       0.0        |           ------|
|                              | Page Faults     279.9        | Total    100.0% |
|------------------------------------------------------------ | numclient 44.3% |
|Min/Maxperm     361MB( 19%)  1443MB( 76%) <--% of RAM        | maxclient 75.8% |
|Min/Maxfree     960   1088       Total Virtual    2.4GB      | User      58.4% |
|Min/Maxpgahead    2      8    Accessed Virtual    0.9GB 40.1%| Pinned    28.6% |
|-------------------------------------------------------------------------------|

2) CPU usage

|                           0----------25-----------50----------75----------100 
|CPU User%  Sys% Wait% Idle%|           |            |           |            | 
|  0   0.0   0.0   0.0 100.0|>                                                | 
|  1   0.0   0.0   0.0 100.0|>                                                |
|  2   0.0   0.0   0.0 100.0|>                                                |
|  3   0.0   0.0   0.0 100.0|>                                                |
|Physical Averages          +-----------|------------|-----------|------------+
|All   0.2   2.5   0.7  96.6|>                                                |
|                           +-----------|------------|-----------|------------+

3) Process status

| Top-Processes-(147) -----Mode=3  [1=Basic 2=CPU 3=Perf 4=Size 5=I/O 6=Cmds]-----------------------------|
|  PID       %CPU     Size      Res     Res      Res     Char    RAM      Paging         Command          |
|            Used       KB      Set     Text     Data     I/O     Use   io   other repage                 |
| 1908868     0.8    30508    29764      132    29632        2    2%      0      3      0 secldapclntd    |
| 2306196     0.7      512      512        0      512        0    0%      1      8      0 trclogio        |
| 2732116     0.6     2520        0        0        0        0    0%      0     33      0 <defunct Zombie>|
|  340036     0.2     1416      296       72      224        0    0%      0      0      0 dtgreet         |

5. Monitoring performance over a period of time

-f            spreadsheet output format [note: default -s300 -c288]
optional
 -s <seconds>  between refreshing the screen [default 2]
 -c <number>   of refreshes [default millions]
 -t            spreadsheet includes top processes
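For details, run nmon -h

For example: nmon -f -t -s 30 -c 120
-s 30: take a snapshot every 30 seconds
-c 120: take 120 snapshots in total, so this run covers 30 x 120 = 3600 seconds (one hour)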
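The -s and -c values together fix how long nmon records: duration = interval x count. A small sketch of the arithmetic for picking -c from a target duration follows; the helper name nmon_count is made up for illustration.

```shell
# Number of snapshots (-c) needed to cover total_seconds at interval_seconds (-s).
# nmon_count is a hypothetical helper name.
nmon_count() {
    echo $(( $1 / $2 ))
}

nmon_count 3600 30     # one hour at 30-second intervals, prints 120
nmon_count 86400 300   # 24 hours at 300-second intervals, prints 288
```

The second line matches the -s300 -c288 default that the help text lists for -f, which is a full day of data.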

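6. Analysing the data
Open nmon analyser and set the macro security level to low. Then click the Analyser NMON data button, choose the input file, and save the result in excel format.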