04-07-2008, 08:35 AM
James_Szabadics
 
Re: help analyzing low system (with sar/vmstat/u386mon/sarcheck data)

It could be network or disk or both! Having Bela in here is truly
awesome. I am not in Bela's league when it comes to understanding the
inner workings of SCO, but I have some practical, generic advice for
you to consider.

The buffer cache flush daemon "bdflush" flushes at regular intervals,
and each time it does, it writes the dirty part of your (huge) buffer
cache to disk. This could be responsible for the surges in disk I/O
that you see.

Making your buffer cache smaller, flushing more frequently, or a
combination of both could help to smooth the big data write tsunamis
into smaller waves, but looking at the underlying disk and/or RAID
architecture is also important. If the activity generating the I/O is
very bursty and infrequent, then a bigger cache could help you cope
with a slow disk, but if the activity is frequent or continuous then
you really need a faster disk. The frequency of the I/O bursts and the
timing of bdflush both matter, but faster disks always help.
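To put rough numbers on the "tsunami versus smaller waves" idea, here is a minimal Python sketch. It is a back-of-envelope model only, not how the SCO kernel actually schedules writes: it assumes a steady application write rate, a fixed flush interval, and that every flush writes everything dirty. The function and parameter names are hypothetical, made up for illustration.

```python
def flush_burst_kb(write_rate_kb_s, flush_interval_s, cache_kb):
    """Approximate data written per flush, in KB.

    Assumption: dirty data builds up at a steady rate between flushes
    and can never exceed the size of the buffer cache.
    """
    return min(write_rate_kb_s * flush_interval_s, cache_kb)


def drain_seconds(burst_kb, disk_throughput_kb_s):
    """Approximate time the disk stays busy writing one burst."""
    return burst_kb / disk_throughput_kb_s


# Illustrative figures only (not taken from the original poster's system).
big_wave = flush_burst_kb(500, 30, 100000)    # long flush interval
small_wave = flush_burst_kb(500, 10, 100000)  # same load, flushed 3x as often
print(big_wave, drain_seconds(big_wave, 3000))
print(small_wave, drain_seconds(small_wave, 3000))
```

In this toy model the total data written is the same either way. A shorter interval or a smaller cache only cuts each burst into smaller pieces, which is why a disk that cannot keep up with the average write rate still needs replacing.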

You have a lot of collisions on your network, and you need to deal
with that too. There could be a range of causes, so work through a
process of elimination, looking at things like the following:

a. Switch configuration (assuming you have managed switches with
some layer 3 capabilities). I have found that turning IGMP snooping ON
helps to manage multicast traffic, which the switch would otherwise
flood to every port like broadcast traffic.
b. Check the event logs on the switch, looking at error counts by
port, and follow the trail to track down the source of the noise where
the counts are highest.
c. Beef up the server-to-switch connection. Make sure the data
pipes are fat where data converges! Upgrade the server NIC to gigabit
and plug it into a gigabit port on your switch. Make sure the backbone
of your network linking your LAN segments together has fat pipes
too.
d. Look at implementing some QoS for your telnet traffic if all of
the above are fine.
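
For the collision figure and for step b, a small Python sketch of the arithmetic may help. It assumes you have already copied the counters out by hand (for example from the interface statistics on the server and the per-port error counters on the switch). The function names and the sample port names and counts are hypothetical, and nothing here parses real command output.

```python
def collision_rate_percent(collisions, output_packets):
    """Collisions as a percentage of packets transmitted."""
    if output_packets == 0:
        return 0.0
    return 100.0 * collisions / output_packets


def ports_by_error_count(errors_by_port):
    """Return port names sorted from most errors to fewest."""
    return sorted(errors_by_port, key=errors_by_port.get, reverse=True)


# Made-up sample counters, for illustration only.
print(collision_rate_percent(1200, 40000))
print(ports_by_error_count({"port1": 12, "port7": 950, "port3": 88}))
```

Start with the port at the top of the sorted list and follow the cable from there.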

