Exadata不是万能的 一直觉得xd DB端的服务器太差,但是oracle总是可以理直气壮的说我们卖的不是硬件是软件.垃圾SQL压死Exadata还是很容易的

系统为X2-2 Quarter rack Red Hat Enterprise Linux Server release 5.7 (Tikanga)

[root@dm01cel02 ~]# imageinfo 

Kernel version: 2.6.18-274.18.1.0.1.el5 #1 SMP Thu Feb 9 19:07:16 EST 2012 x86_64
Cell version: OSS_11.2.3.1.0_LINUX.X64_120304
Cell rpm version: cell-11.2.3.1.0_LINUX.X64_120304-1

Active image version: 11.2.3.1.0.120304
Active image activated: 2012-05-09 14:12:31 -0700
Active image status: success
Active system partition on device: /dev/md5
Active software partition on device: /dev/md7

In partition rollback: Impossible

Cell boot usb partition: /dev/sdm1
Cell boot usb version: 11.2.3.1.0.120304

Inactive image version: undefined
Rollback to the inactive partitions: Impossible

BI Exadata 下午节点2出现hang 死

Node1 log:

2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(2) using key(2) for grock(CLSN.ONSPROC.MASTER)
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSPROC.MASTER), commissioner(2/2)
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSPROC.MASTER) count(2) master(1) event(18), incarn 2, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.299: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(25), status(0), sendresp(1)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(25) msgseq(26), lastupdt<0x2aaab45140e0>, ignoreseq(0)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(-1) using key(2) for grock(CLSN.ONSPROC.MASTER)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSPROC.MASTER), commissioner(-1/0)
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSPROC.MASTER) count(2) master(1) event(18), incarn 0, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.399: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSPROC.MASTER), updateseq(26), status(0), sendresp(1)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(25) msgseq(26), lastupdt<0x2aaab809aca0>, ignoreseq(0)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmAddMember: member (2/0x2aaaaca7bd60) added. pbsz(0) prsz(0) flags 0x0 to grock (0x2aaab807bc60/CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(1), incarn 8, mbrc 2, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmCommonAddMember: global group grock CLSN.ONSNETPROC.MASTER member(2/Remote) node(2) flags 0x0 0xaca7bd60
2012-11-02 13:23:46.529: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(26), status(0), sendresp(1)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(26) msgseq(27), lastupdt<0x2aaaaca468a0>, ignoreseq(0)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(2) using key(2) for grock(CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(2/2)
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(18), incarn 2, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.627: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(27), status(0), sendresp(1)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(27) msgseq(28), lastupdt<0x14eba350>, ignoreseq(0)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), private data(2052), incarn(7)
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(8), incarn 7, mbrc 0, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.882: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(28), status(0), sendresp(1)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(28) msgseq(29), lastupdt<0x14e56670>, ignoreseq(0)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmGrockOpTagProcess: Request to commission member(-1) using key(2) for grock(CLSN.ONSNETPROC.MASTER)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmQueueGrockEvent: groupName(CLSN.ONSNETPROC.MASTER) count(2) master(1) event(18), incarn 0, mbrc 1, to member 1, events 0xa0, state 0x0
2012-11-02 13:23:46.886: [    CSSD][1094125888]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(29), status(0), sendresp(1)

同时node 1 因为得不到相应的资源而hang住:

2:  0: waiting for 'enq: TO - contention' 
3:  0: waiting for 'rdbms ipc message'  
4:  0: waiting for 'VKTM Logical Idle Wait' 
5:  0: waiting for 'rdbms ipc message'  
6:  0: waiting for 'DIAG idle wait'     
7:  0: waiting for 'rdbms ipc message'  
8:  0: waiting for 'PING'               
9:  0: waiting for 'rdbms ipc message'  
10: 0: waiting for 'rdbms ipc message'  
11: 0: waiting for 'DIAG idle wait'     
12: 0: waiting for 'rdbms ipc message'  
13: 0: waiting for 'ges remote message' 
14: 0: waiting for 'gcs remote message' 
15: 0: waiting for 'gcs remote message' 
16: 0: waiting for 'rdbms ipc message'  
17: 0: waiting for 'GCR sleep'          
18: 0: waiting for 'rdbms ipc message'  
19: 0: waiting for 'rdbms ipc message'  
20: 0: waiting for 'rdbms ipc message'  
21: 0: waiting for 'rdbms ipc message'  
22: 0: waiting for 'rdbms ipc message'  
23: 0: waiting for 'ges inquiry response' 
24: 0: waiting for 'DFS lock handle'    
。。。。。。。。。。
158:0: waiting for 'library cache lock' 
159:0: waiting for 'library cache lock' 
160:0: waiting for 'library cache lock' 
162:0: waiting for 'LNS ASYNC end of log' 
163:0: waiting for 'rdbms ipc message'  
164:0: waiting for 'wait for unread message on broadcast channel' 
165:0: waiting for 'Streams AQ: qmn coordinator idle wait' 
166:0: waiting for 'Streams AQ: qmn slave idle wait' 
167:1: waited for 'Streams AQ: waiting for time management or cleanup tasks' 
168:0: waiting for 'rdbms ipc message'  
170:0: waiting for 'class slave wait'   
173:0: waiting for 'library cache pin'  
     Cmd: Select
174:0: waiting for 'SQL*Net message from client' 
175:0: waiting for 'SQL*Net message from client' 
     Cmd: PL/SQL Execute
188:0: waiting for 'library cache lock' 
208:0: waiting for 'SQL*Net message from client' 
241:0: waiting for 'enq: PS - contention'[Enqueue PS-00000001-00000E0F] 
247:0: waiting for 'enq: FB - contention' 
     Cmd: Insert
261:0: waiting for 'SQL*Net message from client' 
292:0: waiting for 'library cache lock' 
293:0: waiting for 'library cache lock' 
294:0: waiting for 'library cache lock' 
295:0: waiting for 'library cache lock' 
296:0: waiting for 'library cache lock' 
297:0: waiting for 'library cache lock' 
298:0: waiting for 'library cache lock' 
299:0: waiting for 'library cache lock' 
Blockers
~~~~~~~~

        Above is a list of all the processes. If they are waiting for a resource
        then it will be given in square brackets. Below is a summary of the
        waited upon resources, together with the holder of that resource.
        Notes:
        ~~~~~
         o A process id of '???' implies that the holder was not found in the
           systemstate.

                    Resource Holder State
Enqueue DR-00000000-00000000    ??? Blocker
Enqueue CF-00000000-00000000    23: 0: waiting for 'ges inquiry response'
Enqueue TT-00000001-00000000    ??? Blocker
Enqueue PS-00000001-00000E0F    ??? Blocker

Object Names
~~~~~~~~~~~~
Enqueue DR-00000000-00000000                                  
Enqueue CF-00000000-00000000                                  
Enqueue TT-00000001-00000000                                  
Enqueue PS-00000001-00000E0F

—————————————————————————————————————-

Node 2: 瞬间负载到665

-bash-3.2# uptime
13:49:56 up 93 days, 3:50, 4 users, load average: 665.00, 585.62, 416.01
-bash-3.2# su – oracle
[oracle@dm01db02 ~]$ ps -ef |grep -i local=no |awk ‘{print $2}’|xargs kill -9
kill 104896: No such process
kill 104996: No such process
kill 116568: No such process
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$ uptime
13:52:09 up 93 days, 3:52, 12 users, load average: 542.61, 584.73, 438.93
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$
[oracle@dm01db02 ~]$ ora active
13:52:12 up 93 days, 3:52, 12 users, load average: 499.96, 575.19, 436.62
select b.sid,substr(b.username,1,10) username,decode(program, null, machine,replace(program,’ (TNS V1-V3)’,”)||decode(machine,null,’@’||terminal)) machine,
*

Node 2 log:

Nov  2 13:07:55 dm01db02 kernel: osysmond.bin: page allocation failure. order:5, mode:0xd0
Nov  2 13:07:55 dm01db02 kernel:
Nov  2 13:07:55 dm01db02 kernel: Call Trace:
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8000f6ee>] __alloc_pages+0x2ef/0x308
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800179d2>] cache_grow+0x139/0x3c7
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8005bdfe>] cache_alloc_refill+0x138/0x188
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800deaa4>] __kmalloc+0x95/0x9f
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8002de21>] __alloc_skb+0x5c/0x12e
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80025f4d>] tcp_sendmsg+0x184/0xb07
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80063002>] thread_return+0x62/0xfe
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff80054f00>] sock_sendmsg+0xf8/0x14a
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff800a2f69>] autoremove_wake_function+0x0/0x2e
Nov  2 13:07:55 dm01db02 kernel:  [<ffffffff8003e10e>] do_futex+0x2c2/0xce5
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff8022e041>] sys_sendto+0x131/0x164
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff800b9042>] audit_filter_syscall+0x87/0xad
Nov  2 13:08:28 dm01db02 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0




已经拿不到内存了


[root@dm01db02 log]# cat /proc/meminfo 
MemTotal:     98848188 kB
MemFree:      23464372 kB
Buffers:         22128 kB
Cached:       28832676 kB
SwapCached:     171616 kB
Active:       31461904 kB
Inactive:      1366720 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     98848188 kB
LowFree:      23464372 kB
SwapTotal:    25165816 kB
SwapFree:     24766720 kB
Dirty:            1044 kB

13:07 分的系统负载

top - 13:07:30 up 93 days,  3:07,  0 users,  load average: 18.21, 9.87, 7.92
Tasks: 1029 total,  36 running, 991 sleeping,   0 stopped,   2 zombie
Cpu(s): 20.6%us, 73.9%sy,  0.0%ni,  5.1%id,  0.2%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  98848188k total, 98193652k used,   654536k free,      972k buffers
Swap: 25165816k total,   673764k used, 24492052k free, 30687764k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                  
114139 oracle    18   0 33.3g  10g 9.1g R 100.0 10.8   1:48.71 oracleedw12 (LOCAL=NO)                                                                  
109482 oracle    25   0 32.9g  16g  16g R 100.0 17.8 134:19.39 oracleedw12 (LOCAL=NO)                                                                  
  1403 root      19  -5     0    0    0 R 100.0  0.0  11:46.46 [kswapd0]     ---------------> SWAP                                                                             
105153 oracle    25   0 32.3g 134m  89m R 100.0  0.1   0:08.46 oracleedw12 (LOCAL=NO)                                                                  
122184 oracle    16   0 32.2g 8.1g 8.0g S 100.0  8.5   0:45.68 oracleedw12 (LOCAL=NO)                                                                  
105965 root      24   0     0    0    0 Z 100.0  0.0   0:05.62 [rds-ping] <defunct>                                                                    
105068 oracle    25   0 32.6g 1.3g 1.0g R 100.0  1.4   0:22.82 oracleedw12 (LOCAL=NO)                                                                  
104996 oracle    25   0 32.6g 1.7g 1.3g R 100.0  1.8   0:31.10 oracleedw12 (LOCAL=NO)                                                                  
 67163 oracle    25   0 33.4g  15g  14g R 99.4 16.1 167:26.91 oracleedw12 (LOCAL=NO)      

这几个top session 的CPU 使用都为100%

Nov  2 13:33:39 dm01db02 kernel: Node 0 DMA: 6*4kB 2*8kB 3*16kB 3*32kB 3*64kB 3*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 2*4096kB = 9720kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 DMA32: 0*4kB 0*8kB 3*16kB 1*32kB 0*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 92*4096kB = 380752kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 Normal: 105*4kB 33*8kB 0*16kB 1*32kB 11*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 11*4096kB = 50444kB
Nov  2 13:33:39 dm01db02 kernel: Node 0 HighMem: empty
Nov  2 13:33:39 dm01db02 kernel: 7498545 pagecache pages
Nov  2 13:33:39 dm01db02 kernel: Swap cache: add 3728475, delete 3674006, find 632575/817933, race 97+346
Nov  2 13:33:39 dm01db02 kernel: Free swap  = 21173180kB
Nov  2 13:33:39 dm01db02 kernel: Total swap = 25165816kB
Nov  2 13:33:39 dm01db02 kernel: Free swap:       21173140kB
Nov  2 13:33:39 dm01db02 kernel: 25690112 pages of RAM
Nov  2 13:33:39 dm01db02 kernel: 978065 reserved pages
Nov  2 13:33:39 dm01db02 kernel: 555281842 pages shared
Nov  2 13:33:39 dm01db02 kernel: 54880 pages swap cached

DRM 信息:

2012-11-02 12:58:21.409428 :
End DRM(3959) for pkey transfer request(s) from 1
* received DRM start msg from 1 (cnt 3, last 1, rmno 3960)

*** 2012-11-02 13:02:45.382
Rcvd DRM(3960) AFFINITY Transfer pkey 2220292.0 to 2 oscan 1.1
Rcvd DRM(3960) AFFINITY Transfer pkey 2221641.0 to 1 oscan 1.1
Rcvd DRM(3960) AFFINITY Transfer pkey 2221642.0 to 1 oscan 1.1
ftd (30) received from node 1 (8 0.30/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later
ftd (33) received from node 1 (8 0.33/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later

*** 2012-11-02 13:02:45.561
ftd (35) received from node 1 (8 0.35/0.0)
all ftds received

* kjxftdn: break from kjxftdn, post lmon later
ftd (37) received from node 1 (8 0.37/0.0)
all ftds received
ASH WORST MINUTES FOR DRM FREEZE WAITS:

APPROACH: These are the minutes where the avg drm freeze time
was the highest (in milliseconds).

MINUTE              INST_ID EVENT                            TOTAL_WAIT_TIME      WAITS   AVG_TIME_WAITED
---------------- ---------- ------------------------------ ----------------- ---------- -----------------
Nov02_1359                2 gcs drm freeze in enter server         10757.718         30           358.591
Nov02_1429                2 gcs drm freeze in enter server          2495.686          8           311.961


DYNAMIC_REMASTER_STATS
This shows where time is spent during DRM operations.

Instance: 1
Remaster Ops: 7
Remaster Time: 2275
Remastered Objects: 127
Quiesce Time: 165
Freeze Time: 10
Cleanup Time: 131
Replay Time: 108
Fixwrite Time: 184
Sync Time: 1674
Resources Cleaned: 0
Replayed Locks Sent: 822
Replayed Locks Received: 1299002
Current Objects: 39

Instance: 2
Remaster Ops: 7
Remaster Time: 2263
Remastered Objects: 127
Quiesce Time: 739
Freeze Time: 13
Cleanup Time: 218
Replay Time: 829
Fixwrite Time: 426
Sync Time: 37
Resources Cleaned: 0
Replayed Locks Sent: 91404
Replayed Locks Received: 571022
Current Objects: 114

附一份简化的CHM 信息 使用OCLUMON 得到

system_dump_chm.txt

通过kill process之前的netstat 找到对应的IP:PORT

tcp        0      0 10.0.1.204:1522             10.1.0.90:50956             ESTABLISHED 27295/oracleedw12  

通过IP 找到对应机器 10.1.0.90(xen19-vm05)

通过machine_name 找到sql_id

总结

12:30左右 QRY的用户发起了大量操作 累积时间1小时 ,可能导致了大量DRM, 一个长时间的过程 cpu idle 降至0.x% ,memory被吃光 ,由于exadata的DB端服务器并不是很强劲,导致 hang死.另外注意大量processes 导致的PTE 内存消耗 同样会耗尽内存。

请注意即使XD使用了RDS还是请调整节点分配策略, DBA会商议是否关闭 DRM和RML, 后续SQL 会跟BI 商议 可能会开启RM 控制QRY 操作