utime system call is slow (Sun Solaris Administration forums, Solaris Operating System category)
I have two Suns, an 8-processor 1280 and a dual-processor Netra 240. We primarily run compiles and a lot of in-house development tools on them. Performance on the 1280 has been degrading, and when I was asked to look into it, I saw that operations that use the utime system call across NFS take significantly longer on the 1280 than on the 240, as seen by truss:

From the 1280:
syscall    seconds   calls
utime       466.91    3202

From the 240:
syscall    seconds   calls
utime        25.37    3202

I have rerun my tests several times over the course of several days and continue to see the same behavior. I dusted off my copy of System Performance Tuning and started going down the list to see if there were any significant bottlenecks:

Network:
netstat -i shows no errors or collisions for the client or for the NFS server I am updating timestamps on. Did I mention that the slow server was running with IP load balancing on two gigabit network interfaces and the fast one was running at 100M? It isn't network I/O.

Memory:
With 20G of RAM, there isn't any significant paging or swapping.

CPU:
A lot of time is spent in system CPU according to the output of sar, often two to three times as much in system as in user time. Also, Virtual Adrian from the SE Toolkit frequently reports mutex contention on the 1280.

NFS:
I'd love to blame it, but why do the same operations perform speedily on the other server? (And on a few others too, but let's leave them out to keep things simple.) nfsstat -c does show that a lot of calls are for getattr, but given that this machine's job is to iterate over large NFS filesystems repeatedly to run compiles, a high getattr count seems reasonable. I did play with the file attribute caching mount options, but they haven't helped, and I think getattr may be a red herring. Retransmissions were less than 0.01%.

The DNLC hit ratio is 99% according to vmstat -s.

So, what tools are out there and available to go deeper into the system than truss?

tia,
Matt
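Dividing the truss seconds by the call counts above gives roughly 146 ms per utime call on the 1280 (466.91 / 3202) against roughly 7.9 ms on the 240 (25.37 / 3202), a factor of about 18. One way to reproduce that comparison outside truss is to time the calls directly. The following is a minimal sketch, assuming a Python 3 interpreter is available on both clients; the function names `time_utime_calls` and `summarize` are made up for illustration, and the sketch measures wall-clock time per call, which may not be the same quantity truss reports.

```python
import os
import time


def time_utime_calls(paths):
    """Call os.utime() on each path; return per-call wall-clock durations in seconds."""
    durations = []
    for path in paths:
        start = time.perf_counter()
        os.utime(path, None)  # None sets atime and mtime to the current time
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (calls, total_seconds, mean_seconds, max_seconds) for a list of durations."""
    calls = len(durations)
    total = sum(durations)
    mean = total / calls if calls else 0.0
    worst = max(durations) if durations else 0.0
    return calls, total, mean, worst
```

Running `summarize(time_utime_calls(paths))` over the same list of NFS-mounted files on each host gives a per-call mean and worst case that can be compared side by side.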
| |||
On Mon, 31 Jul 2006, Matt wrote:
> So, what tools are out there and available to go deeper into the system
> than truss?

DTrace.

--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member
President, Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
| |||
In article <1154381552.055550.121000@s13g2000cwa.googlegroups.com>, "Matt" <matt.keeton@gmail.com> wrote:
> Rich Teer wrote:
> > On Mon, 31 Jul 2006, Matt wrote:
> >
> > > So, what tools are out there and available to go deeper into the system
> > > than truss?
> >
> > DTrace.
>
> Right, critical piece I forgot to mention. Solaris 2.8

Time for an upgrade.

--
DeeDee, don't press that button! DeeDee! NO! Dee...
| |||
Matt wrote:
> CPU:
> a lot of time spent in system CPU according to the output of sar, often
> two to three times as much spent in system as in user time. Also,
> virtual adrian from the se toolkit frequently reports mutex contention
> on the 1280

If mpstat(1M) is reporting high values of "smtx" (spins on mutexes), you probably have software running on the machine that is not multithreaded enough for an 8-CPU system. It may be that the impact is not noticeable on your 2-CPU system. Also, the CPUs in the 240 have a higher clock speed and fly through the critical code sections faster.

smtx values above 200 were considered "high" for the US-II. I once had a couple of servers with smtx values above 1000, and they were slow...

//Lars
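The smtx check Lars describes can be sketched as a small filter over captured mpstat output. This is a minimal sketch in Python, assuming the output includes a header line with "CPU" and "smtx" column names; the function name `high_smtx` is made up for illustration, and the default threshold of 200 is only the rule of thumb quoted above for the US-II, not a tuned value for other hardware.

```python
SMTX_THRESHOLD = 200  # rule of thumb quoted in the thread for US-II; an assumption elsewhere


def high_smtx(mpstat_text, threshold=SMTX_THRESHOLD):
    """Return a list of (cpu_id, smtx) pairs whose smtx value exceeds threshold."""
    flagged = []
    cpu_col = smtx_col = None
    for line in mpstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "CPU" in fields and "smtx" in fields:
            # Header line: find the columns by name instead of assuming positions.
            cpu_col = fields.index("CPU")
            smtx_col = fields.index("smtx")
            continue
        if smtx_col is None or len(fields) <= max(cpu_col, smtx_col):
            continue  # no header seen yet, or a short/malformed row
        try:
            smtx = int(fields[smtx_col])
        except ValueError:
            continue  # skip rows whose smtx field is not a number
        if smtx > threshold:
            flagged.append((fields[cpu_col], smtx))
    return flagged
```

Feeding it the text saved from an mpstat run returns the CPUs, per sample, whose smtx count is over the threshold, which makes it easier to see whether the contention is constant or bursty.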
| ||||
> If mpstat(1M) is reporting high values of "smtx" (spins on mutexes),
> you probably have software running on the machine that is not
> multithreaded enough for an 8-CPU system.
> It may be that the impact is not noticeable on your 2-CPU system.
> Also, the CPUs in the 240 have a higher clock speed and fly through
> the critical code sections faster.
>
> smtx values above 200 were considered "high" for the US-II.
> I once had a couple of servers with smtx values above 1000,
> and they were slow...

Thanks, Lars. Over what interval should I check?

mpstat 1
mpstat 5
mpstat 10