Today one of the Linux server reported high CPU% utilization and then it was time to work on it.The server showed 96-99% CPU utilization and 5 PIDs eating the maximum, so it was time to login to the database (10.2.0.5 version) and start checking whats going on.
00:40:26 SQL> select nvl(s.username, '(oracle)') AS username,s.sid,s.serial#,p.spid,sw.event,sw.wait_time,sw.seconds_in_wait,sw.state from v$session_wait sw,v$session s,V$process p 00:40:26 2 where s.sid=sw.sid and s.paddr=p.addr and s.status='ACTIVE' AND s.username not in 'oracle' order by seconds_in_wait; USERNAME SID SERIAL# SPID EVENT WAIT_TIME SECONDS_IN_WAIT STATE ------------------------------ ---------- ---------- ------------ ---------------------------------------- ---------- --------------- ------------------- SYS 766 5818 9643 SQL*Net message to client -1 0 WAITED SHORT TIME ABCD 801 11249 23920 db file sequential read -1 18 WAITED SHORT TIME ABCD 773 10953 30486 local write wait -1 117 WAITED SHORT TIME ABCD 811 8161 29946 local write wait -1 153 WAITED SHORT TIME ABCD 770 63261 24011 local write wait -1 171 WAITED SHORT TIME ABCD 822 16487 31327 local write wait -1 550 WAITED SHORT TIME
Using Tanel’s session wait (sw.sql) script i tried to dig a little more
00:47:39 SQL> @sw 822 old 38: sid IN (&1) new 38: sid IN (822) SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------ ------------------ ------------------ ------------------------------------------ 822 WORKING On CPU / runqueue 82 75 file#= 1025 block#= 2 0 00:47:44 SQL> @sw 773 old 38: sid IN (&1) new 38: sid IN (773) SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------ ------------------ ------------------ ------------------------------------------ 773 WORKING On CPU / runqueue 60 99 file#= 1025 block#= 2 0 00:47:48 SQL> @sw 801 old 38: sid IN (&1) new 38: sid IN (801) SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------ ------------------ ------------------ ------------------------------------------ 801 WORKING On CPU / runqueue 8182 154 file#= 407 block#= 678 blocks= 1 00:47:52 SQL> @sw 770 old 38: sid IN (&1) new 38: sid IN (770) SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------ ------------------ ------------------ ------------------------------------------ 770 WORKING On CPU / runqueue 117 3 file#= 1025 block#= 2 0 00:47:55 SQL> @sw 811 old 38: sid IN (&1) new 38: sid IN (811) SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------ ------------------ ------------------ ------------------------------------------ 811 WORKING On CPU / runqueue 52 934 file#= 1025 block#= 2 0
The session wait for all the sessions waiting on “local write wait” event showed P1 as file#=1025 and P2 as block#=2.
00:49:14 SQL> select NAME from v$datafile where FILE#=1025; no rows selected
Thats interesting.On the file# for the session showing ‘db file sequential read’ events showed a datafile.All the ABCD’s sessions showed
insert into xyz_tmp (id, b_id, c_type) select id, table_id, 'COL' from tab_col c1 where search(c1.desc, :search, null) > 0
Table “xyz_tmp” is a temporary table where data is getting inserted based on condition from tab_col table.
The “local write wait” event was something i saw for the first time and started searching GOOGLE :), and found the below link
http://www.freelists.org/post/oracle-l/local-write-wait-event
Few points from the link to check
1. From Matthew
“local write wait” refers to the wait in a session to reuse the same buffer, while dbwr is trying to flush the same dirty buffer.There are variety of bugs listed in Metalink refering to this on the 10G version related to truncates. Could be one of the bugs or an OS problem with the disk dbwr is trying to write to.
2. Reply from K Gopalakrishnan
Basically ‘local write’ wait happens (as the name indicates) when the session is waiting for its local (means writes pending because of its own operation) write operation. This could happen typically if the underlying disc has some serious problems (one of the member disk crash in RAID-05 – for example, or a controller failure). That is why I might have said ‘ you never see this wait in the normal databases!’. You may see this during (rarely) Truncating a large table while most of the buffers of that table in cache. During TRUNCATEs the session has to a local checkpoint and during this process, the session may wait for ‘local write’ wait.
3. Jonathan Lewis’s Experience
The only time I have seen local write wait and tracked it back to source, it has been the query co-ordinator cleaning up after parallel execution slaves have been involved in parallel create table / index. Each slave thinks it owns a segment, so creates a segment header as the first block of the data it generates. The QC wipes all but one of them and puts them into the segment free list.
There may be other reasons that I’ve not yet come across.
Also useful http://sites.google.com/site/embtdbo/wait-event-documentation/oracle-local-write-wait
As the query was coming from Middle Tier, the team checked and said its spinning and hence killed the session finally.I feel its still not resolved, killing the session is not the solution.But sometimes,you just do it.