This is a single-processor system. When the system is idle (0 runnable threads), vmstat shows 1 blocked thread and an I/O wait of 99%. As expected, iostat and filemon show no disk activity. In other words, the system is really doing nothing. If a CPU-intensive thread is run, the I/O wait figure drops. There is no performance problem. However, the high reported wait value affects the reporting to our customer.

So far I've refused to reboot. I need to find the thread or process responsible for this 1 blocked I/O. What approach can I use? What commands besides vmstat, iostat, filemon, netpmon, and tprof are available for this purpose?

Thanks for your help.
Raix
| |||
Raix RS wrote:
> So far I've refused to reboot. I need to find the thread/process
> responsible causing this 1 blocked I/O. What is the approach I can
> use.

On AIX 4:

    echo "th | grep ' r '" | crash

Dunno how to do that on AIX 5.
| |||
That will get you the runnable thread? If the thread is waiting for an I/O to complete, it will not be runnable.

The problem with the wait I/O figure, as you quite rightly pointed out, is that a single-processor system with 1 thread that has an I/O outstanding and nothing else runnable will show wait I/O as 100%. It's working as designed.

Also note that a kwrite or kread to an NFS-mounted filesystem will not show up in iostat as disk usage. A while ago I also came across a proxy app that was writing logs to the proxy server over the network, causing high wait I/O that did not show up in iostat or filemon.

The customer will have to live with that, or buy another system.

I would be tempted to install the latest nmon, run it with the "c" and "t" flags, then select "5" to list the processes with I/O. That's easier than mucking around in kdb or crash.

Rgds
Mark

-- Posted via http://dbforums.com
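The accounting rule Mark describes can be sketched as a toy model. This is only an illustration of the reasoning in the post, not AIX kernel code; the function names `classify_tick` and `wait_percent` are made up for this example. Each clock tick on the single CPU is counted as busy if anything is runnable, as wait if nothing is runnable but an I/O is outstanding, and as idle otherwise.

```shell
# Illustrative model only; this is not AIX kernel code.
# classify_tick RUNNABLE PENDING_IO
#   RUNNABLE   = number of runnable threads on the CPU
#   PENDING_IO = number of threads with an I/O outstanding
classify_tick() {
    if [ "$1" -gt 0 ]; then
        echo busy          # something can run, so the tick is not wait or idle
    elif [ "$2" -gt 0 ]; then
        echo wait          # nothing to run, but an I/O is outstanding
    else
        echo idle          # nothing to run and nothing outstanding
    fi
}

# wait_percent TICKS RUNNABLE PENDING_IO
# Percentage of TICKS that the model charges to wait I/O.
wait_percent() {
    ticks=$1 waits=0 i=0
    while [ "$i" -lt "$ticks" ]; do
        [ "$(classify_tick "$2" "$3")" = wait ] && waits=$((waits + 1))
        i=$((i + 1))
    done
    echo $((100 * waits / ticks))
}
```

Under this model, `wait_percent 100 0 1` prints 100 (one blocked thread, nothing runnable), while `wait_percent 100 1 1` prints 0, which matches the observation that the wait figure drops as soon as a CPU-intensive thread runs.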
| |||
I ran the nmon command:

           Used      K    Set   Text   Data  I/O  Use   io  other  repage
    27048   5.0   6040   5336    472   4864    0   1%   11      0       0  fileserver
    24784   1.5  40320  40172      4  40168    0   8%    0      0       0  java
     3140   0.5    412     36      4     32    0   0%    0      0       0  syncd
     1916   0.0    504     68     20     48    0   0%    0      0       0  qdaemon

I looked at the fileserver process (27048), which has considerable io:

    ps -ef|grep 27048
         UID   PID  PPID   C    STIME    TTY    TIME CMD
        root 27048  8884  10   Aug 17      -  233:10 /usr/afs/bin/fileserver

Looking at this line, I can see that this process is CPU-intensive and has a scheduling penalty of C=10. This process also belongs to AFS. This one could very well be it.

A software call with IBM support has also been made. I'm curious to see how they plan to tackle this problem, so I will wait before restarting this process.

I will post the outcome once I have it.

Raix
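Picking the suspect out of the nmon rows can be done mechanically. The sketch below assumes the row layout shown in the post, with the PID in field 1, the "io" column in field 9, and the command name in field 12; the function name `top_io` is invented for this example, and the field positions would need checking against the nmon version in use.

```shell
# top_io: read nmon process rows on stdin and print "PID COMMAND"
# for the row with the largest value in field 9 (the "io" column,
# assuming the 12-field layout from the post).
top_io() {
    awk '
        NF >= 12 && $1 ~ /^[0-9]+$/ {
            if (best == "" || $9 + 0 > max) {
                max = $9 + 0
                best = $1 " " $12
            }
        }
        END { if (best != "") print best }
    '
}
```

Feeding it the four rows from the post selects 27048 fileserver, the same process picked by eye.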
| ||||
You could use kill -17 to stop it (prevent it from being scheduled) and kill -19 to wake it up again. Then you can see whether the wait time drops.
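The stop-and-resume experiment could be wrapped up roughly as follows. This is a sketch under several assumptions: the function name `pause_and_sample` and the `SAMPLER` variable are invented here, signal names are used instead of numbers because the numbers are platform-specific, and pausing a file server will stall its clients for the duration of the sample.

```shell
# pause_and_sample PID [INTERVAL] [COUNT]
# Stop a suspect process, sample system statistics while it is stopped,
# then resume it. SAMPLER is a hypothetical override for the sampling
# command; it defaults to vmstat and is called as: SAMPLER INTERVAL COUNT.
pause_and_sample() {
    pid=$1
    interval=${2:-5}
    count=${3:-3}
    sampler=${SAMPLER:-vmstat}

    kill -STOP "$pid" || return 1

    # Resume the process even if the sampling is interrupted.
    trap 'kill -CONT "$pid" 2>/dev/null' INT TERM

    "$sampler" "$interval" "$count"

    kill -CONT "$pid"
    status=$?
    trap - INT TERM
    return "$status"
}
```

For the process in this thread, that would be `pause_and_sample 27048 5 3`. If the wait column falls while the process is stopped and returns after it is resumed, that points at the process as the source of the outstanding I/O.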