01-05-2008, 11:33 AM
Michael E. Thomadakis
 
Re: SMT on IBM Power5+

Jan-Frode Myklebust wrote:
> On 2007-04-05, Michael E. Thomadakis <miket@sc.tamu.edu> wrote:
>> Jan thanks for your reply. We have a 40 node p5-575 cluster with 32GB
>> DRAM/node. Nodes are attached to 2 planes of HPS. I was contending that
>> SMT ON should benefit the cluster since all of the cluster control
>> processes (RSCT, GPFS, HPS assistance, etc.) are heavily multi-threaded
>> and, not being compute-intensive, should benefit from SMT ON.

>
> I believe these cluster control processes will have a negligible
> impact (*). It's the applications you'll be running that will decide
> if you should use SMT or not. We let the users decide on a job
> by job basis if they want SMT by specifying this in their LoadLeveler
> scripts, and then have LL pre/post scripts turn SMT on/off.


I believe that, under reasonable caveats, SMT ON will benefit MPI code
that issues many non-blocking I/O and aggregate/reduction calls with
relatively small amounts of data per call.

This is particularly true for MPI code that is compute-intensive (CPUs at
~100% utilization) when I/O arrives or has to be checked for (with the
dreadful polling). With SMT OFF, the HPS assist threads have to swap out
the compute-intensive ones, which implies a save/restore of their state
and the loss of part of their working set from the cache(s). You may say
this is negligible, but I have seen polling threads doing almost nothing
but nanosleep() in a tight loop, waking up hundreds of times per second
while waiting for data to come in. I believe that this rate of thread
swapping is far beyond the annoyance level and is detrimental to the
compute threads.

With SMT ON, these I/O threads should not impact the compute ones much.
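
To make the polling pattern concrete, here is a minimal Python sketch of a
sleep-and-check loop of the kind described above. It is not the HPS assist
code; poll_loop, its parameters and the ready() callback are hypothetical
names used only for illustration. It simply counts how often the poller
wakes up, which is a rough proxy for how often a compute thread sharing the
same CPU could be displaced.

```python
import time


def poll_loop(duration_s, interval_s, ready=lambda: False):
    """Sleep-and-check loop; returns the number of wakeups performed.

    duration_s : how long to keep polling, in seconds
    interval_s : sleep length between checks (stands in for nanosleep())
    ready      : hypothetical callback that reports whether data arrived
    """
    wakeups = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        time.sleep(interval_s)   # give up the CPU, then get rescheduled
        wakeups += 1             # each wakeup may preempt a compute thread
        if ready():
            break
    return wakeups
```

Calling poll_loop(1.0, 0.002) and dividing the result by the duration gives
the wakeup rate on the machine at hand; running it next to a CPU-bound job
with SMT on and then off is one way to look at the interference.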

I am in the process of putting together my own SMT benchmarking suite,
but I was hoping to find an existing one to get some quick first-cut
results.
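
As a starting point for such a first cut, the sketch below times a batch of
independent CPU-bound processes. It is an assumption-laden illustration, not
an established benchmark: run_batch, throughput and the busy-loop workload
are made-up names and choices. The idea is to run it with as many tasks as
there are logical CPUs, once with SMT on and once with SMT off, and compare
the tasks-per-second figures.

```python
import subprocess
import sys
import time

# Hypothetical CPU-bound workload: a plain integer loop run in a child process.
BUSY_LOOP = "x = 0\nfor i in range({n}):\n    x += i * i\n"


def run_batch(n_tasks, iterations):
    """Start n_tasks independent CPU-bound processes; return wall-clock seconds."""
    code = BUSY_LOOP.format(n=iterations)
    start = time.perf_counter()
    procs = [subprocess.Popen([sys.executable, "-c", code])
             for _ in range(n_tasks)]
    for proc in procs:
        if proc.wait() != 0:
            raise RuntimeError("worker exited with an error")
    return time.perf_counter() - start


def throughput(n_tasks, iterations):
    """Completed tasks per second of wall-clock time for one batch."""
    return n_tasks / run_batch(n_tasks, iterations)
```

For example, throughput(32, 20_000_000) on a node that shows 32 logical CPUs
with SMT on, versus the same call with SMT off, gives a crude throughput
comparison. Process start-up time is included in the measurement, so the
iteration count should be large enough to dominate it.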


>
>> Are there any public benchmarks that I can run to demonstrate the
>> benefits of SMT ?

>
> Don't know, but it will probably be easy to demonstrate higher
> throughput by running 32 separate cpu intensive tasks on a system
> with SMT=on compared to SMT=off. And then you could probably also
> experience the negative impact of SMT by running a 16 task
> MPI-job on a system with SMT=on -- where you might not get all
> tasks running on separate physical cpus...
>
> * BTW: do make sure that there's nothing causing any load on a
> system that's idle. An idle system should show a load of
> 0.00 in "uptime". We had an "xmtrend" system process that
> was generating a load of about 0.05-0.10 on our nodes,
> and that was killing the parallel benchmark performance
> on fully loaded systems.



I have also seen these pesky 'xmtrend' processes running on some of the
nodes at 100% CPU utilization (spinning on a failing syscall). I went ahead
and killed those that were molesting the processors with a load of > 10%...
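
A pre-benchmark check along the lines Jan-Frode suggests could be scripted
roughly as follows. This is only a sketch: node_is_idle and the 0.02
threshold are hypothetical, and it assumes a Unix-like system where Python's
os.getloadavg() is available. It reads the same 1-minute load average that
"uptime" reports.

```python
import os


def node_is_idle(threshold=0.02, loadavg=None):
    """Return True if the 1-minute load average is at or below threshold.

    loadavg may be passed in as a (1min, 5min, 15min) tuple for testing;
    otherwise the current values are read from the operating system.
    """
    one_minute = (loadavg if loadavg is not None else os.getloadavg())[0]
    return one_minute <= threshold
```

A job prolog could refuse to start a benchmark run while node_is_idle()
returns False, so that stray processes such as the xmtrend ones mentioned
above get noticed before they skew the numbers.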



>
> -jf


Michael