| hello group, I have a Power4 (p630) and a Power5 (p520) box. Running some number crunching code, compiled in "common" mode, reveals that the relative performance gain is hardly above the ratio to be expected from the GHz ratio (1.2 vs 1.5). Shouldn't one expect a bit more "intrinsic" gain when going from P4 to P5 ? |
| |||
| Michael Kraemer wrote:
> hello group,
>
> I have a Power4 (p630) and a Power5 (p520) box.
> Running some number crunching code, compiled in "common" mode,
> reveals that the relative performance gain
> is hardly above the ratio to be expected from the GHz ratio (1.2 vs 1.5).
> Shouldn't one expect a bit more "intrinsic" gain when
> going from P4 to P5 ?

Can you explain in a little more detail ...

p630 4 Way ?
p520 2 Way ?
Running AIX 5.3 ?
smt on/off ?

What are the exact models you are using? The p630 1.2GHz rPerf ranges from around 2.5 to 8.0'ish depending on model, and the p520 ranges from approx 3.0 to 9.0 ish.

Rgds
Mark Taylor |
| |||
| In article <1133197000.391261.268420@g47g2000cwa.googlegroups.com>, "Mark Taylor" <mky@talk21.com> writes:
> Can you explain in a little more detail ...
> p630 4 Way ?

yes

> p520 2 Way ?

yes, but I wanted to test the single CPU performance first, because I guess this determines how fast a job can be done. The "way-ness" determines how many of them I can do at the same time.

> Running AIX 5.3 ?

5.2

> smt on/off ?

should it make a difference between P4 and P5 ? I'm running 5.2 "off-the-shelf" (CD)

> What are the exact models you are using? as the p630 1.2GHz rPerf
> ranges from around 2.5 to 8.0'ish depending on model and the p520
> ranges from approx 3.0 to 9.0 ish

What's the big difference here, if one just compares single CPU performance ? |
| |||
| >> should it make a difference between P4 and P5 ?
>> I'm running 5.2 "off-the-shelf" (CD)

5.2 doesn't support smt.

>> What's the big difference here, if one just compares single CPU performance ?

You didn't state this in your original post.

>> Shouldn't one expect a bit more "intrinsic" gain when
>> going from P4 to P5 ?

That would wholly depend on your application. What were you expecting ? |
| |||
| In article <1133373979.773585.275720@z14g2000cwz.googlegroups.com>, "Mark Taylor" <mky@talk21.com> writes:
> >> should it make a difference between P4 and P5 ?
> >> I'm running 5.2 "off-the-shelf" (CD)
>
> 5.2 doesn't support smt.

So should I expect significantly better performance with 5.3/smt ?

> >> What's the big difference here, if one just compares single CPU performance ?
>
> you didn't state this in your original post
>
> >> Shouldn't one expect a bit more "intrinsic" gain when
> >> going from P4 to P5 ?
>
> That would wholly depend on your application, what were you expecting ?

Well, I don't think my app(s) are so very special: some double precision float, some integer, addressing, branching. No Linpack type stuff, however. I was expecting at least some performance gain, let's say around 1.5.

When I compare the very same code on e.g. an ancient F50 ( 333 MHz PPC 604e ) vs a less ancient 44P ( 375 MHz Power3 ) I measure a speed gain of 1.5, which is well above the clock ratio. Comparing that Power3 box with a p630 or p615 ( 1.2 GHz Power4 ) I see a gain of 2.4, which is well below the clock ratio, but since there's a net gain I wouldn't complain. Now comparing that Power4 box with a shiny new p520 ( 1.5 GHz Power5 ) the gain is a less-than-stellar 15%. From clock ratio alone I would have expected 25%. My other p615 ( 1.4 GHz Power4 ) even beats the p520 by 5% ! So where's the incentive to go Power5 ?

And yes, I've tried different compiler versions as well as tuning options; the differences are always negligible. Which is consistent with what I've seen in the past: all my codes seem to be resistant against -qtune & friends :-( |
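The comparison in the post above comes down to dividing each measured speed gain by the corresponding clock ratio. A minimal sketch of that arithmetic follows, using only the clock speeds and gains quoted in the post. The function name `per_clock_gain` and the term "per-clock gain" are illustrative labels, not anything used by the posters, and the 15% gain is entered as a factor of 1.15.

```python
# Clock speeds (MHz) and measured speed gains as quoted in the post above.
# Labels and the helper name are illustrative assumptions.
comparisons = [
    ("PPC 604e (F50) -> Power3 (44P)", 333, 375, 1.5),
    ("Power3 (44P) -> Power4 (p630/p615)", 375, 1200, 2.4),
    ("Power4 (p630) -> Power5 (p520)", 1200, 1500, 1.15),
]


def per_clock_gain(old_mhz, new_mhz, measured_gain):
    """Return (clock ratio, measured gain divided by clock ratio).

    A second value above 1.0 means the newer box does more work per
    clock cycle on this code; below 1.0 means it does less.
    """
    clock_ratio = new_mhz / old_mhz
    return clock_ratio, measured_gain / clock_ratio


for label, old_mhz, new_mhz, gain in comparisons:
    ratio, per_clock = per_clock_gain(old_mhz, new_mhz, gain)
    print(f"{label}: clock x{ratio:.2f}, measured x{gain:.2f}, "
          f"per-clock x{per_clock:.2f}")
```

With these inputs the per-clock figures come out at roughly 1.33, 0.75 and 0.92, which is the pattern the post describes: the Power3 step beats its clock ratio, while the Power4 and Power5 steps both fall short of theirs.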
| |||
| >> So should I expect significantly better performance with 5.3/smt ?

up to 30% with some applications

>> Now comparing that Power4 box with a shiny new p520 ( 1.5GHz Power5 )
>> the gain is a less-than-stellar 15%. From clock ratio alone I would
>> have expected 25%

I think you have answered your own question really. p5 are still dual core the same as p4 .. p5+ however are quad core with larger on board cache, so there will be some latency/access benefits to be had...

Check out the rperf figures for each of the systems to see if they tie in with what you are seeing..

Rgds
Mark Taylor |
| |||
| Mark Taylor wrote:
>>> So should I expect significantly better performance with 5.3/smt ?
>
> up to 30% with some applications
>
>>> Now comparing that Power4 box with a shiny new p520 ( 1.5GHz Power5 )
>>> the gain is a less-than-stellar 15%. From clock ratio alone I would
>>> have expected 25%
>
> i think you have answered your own question really. p5 are still dual
> core the same as p4 .. p5+ however are quad core with larger on board
> cache, so there will be some latency/access benefits to be had...
>
> check out the rperf figures for each of the systems to see if they tie
> in with what you are seeing..

Well, I'd rather refer to IBM's own published Spec2000 values. They show that even with smt off a Power5 box should give a significant performance advantage over Power4(+), even for single threaded execution. Unfortunately at least two of my number crunching codes do not follow these rules.

I did systematic comparisons on various IBM platforms, ranging from a lowly 43P-140@166MHz up to a p520@1.5GHz. There seems to be a linear relationship between execution speed and clock frequency, regardless of architecture, with two little exceptions: some Power3 boxes (44Ps) perform better than that, and the Power5 boxes perform significantly worse. These findings hold across compilers (C version 3, 6, and 8), AIX versions (4.3, 5.1, 5.2, 5.3 including smt on) and -qarch/-qtune options; none of these variations had any effect.

So, based on my experience I'm inclined to say that Power5 isn't worth looking at :-( I would be glad if somebody would convince me of the opposite. |
| |||
| In article <1140091982.452220.133010@z14g2000cwz.googlegroups.com>, thu.nnguyen@gmail.com writes:
> Hmm, thats strange, our p550 1.65Ghz runs our codes approx. 1.5-1.7x
> faster than the p630 1.45Ghz (there's a even bigger speedup if using
> SMT and multithreaded code).
>
> And this correlates well with the benchmarks IBM publishes.

That's what I expected too. Unfortunately it isn't true in my environment. I'm really stuck at this point. I need some smitty - System Environments - Change Characteristics of Operating System - Release Brakes and Set System Speed to Advertised Warp Factor :-) |