HadSM3MH Performance

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2169
Credit: 64,555,907
RAC: 5,858
Message 36532 - Posted: 28 Mar 2009, 19:15:23 UTC - in response to Message 36524.  

Looks like I may have answered my own question. My Core i7 920 is up and running a HadSM3MH model at 0.5134 s/TS. Note that this model was 2/3 done by the older computer. This is slightly faster than my old Core2Duo E8400 (which ran 333 MHz faster).

The durations in CPU time for your last several trickles indicate about 0.45 s/TS for that model since the 920 took over.
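
A rough sketch of how an s/TS figure can be derived from trickle durations, assuming each trickle reports cumulative CPU time and that a fixed number of timesteps separates consecutive trickles. The function name, the sample CPU times, and the 10000-timestep interval are made up for illustration and are not taken from the project.

```python
def seconds_per_timestep(cpu_seconds, timesteps_per_trickle):
    """Average seconds per timestep (s/TS) across consecutive trickles.

    cpu_seconds: cumulative CPU time in seconds at each trickle, oldest first.
    timesteps_per_trickle: timesteps between trickles (assumed constant).
    """
    if len(cpu_seconds) < 2:
        raise ValueError("need at least two trickles")
    if timesteps_per_trickle <= 0:
        raise ValueError("timesteps_per_trickle must be positive")
    elapsed = cpu_seconds[-1] - cpu_seconds[0]   # CPU time spent over the span
    intervals = len(cpu_seconds) - 1             # number of trickle-to-trickle gaps
    return elapsed / (intervals * timesteps_per_trickle)


# Hypothetical numbers: three trickles, 4500 CPU seconds apart.
print(seconds_per_timestep([0.0, 4500.0, 9000.0], 10000))  # 0.45
```

Only trickles uploaded after the hardware change should go into the list, otherwise the older machine's speed is averaged in.
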
Are CPDN models floating-point bound processes? Or would it make more sense to have HT disabled for best performance? By performance, I mean RAC.

I was actually hoping someone would experiment with that. I would suggest running 4 HadSM3s with HT off, then 8 with HT on, and seeing whether it takes less than twice as much time to complete them. Or one could get an estimate by running 4 with HT off through several trickles, writing down the average s/TS for those 4 models, then turning HT on and letting it download 4 more and run them through several trickles along with the first 4. Then see what the average s/TS values were for the 4 new models. If the average s/TS of the last 4 models downloaded is less than twice the average of the first 4 when they were running by themselves, HT would help throughput.

We have a Core i7 920 at work that runs a high-resolution meteorological model. Running one model with 8 threads with HT on, or with 4 threads with HT off, takes almost exactly the same amount of time. But running 4 vs. 8 instances of CPDN would no doubt tax the memory bandwidth in different ways.
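
The "less than twice the average" test above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample s/TS values are hypothetical, and it assumes exactly twice as many models run with HT on as with HT off.

```python
from statistics import mean


def ht_helps_throughput(s_per_ts_ht_off, s_per_ts_ht_on):
    """Apply the rule of thumb: HT helps if s/TS grows by less than 2x.

    s_per_ts_ht_off: s/TS of each model while 4 ran alone with HT off.
    s_per_ts_ht_on:  s/TS of the newly downloaded models with HT on and
                     twice as many models running.
    Returns (helps, ratio), where ratio = mean(HT on) / mean(HT off).
    """
    ratio = mean(s_per_ts_ht_on) / mean(s_per_ts_ht_off)
    # Twice the models at less than twice the time per timestep
    # means more timesteps finished per second overall.
    return ratio < 2.0, ratio


# Hypothetical measurements, not real results.
helps, ratio = ht_helps_throughput([0.5, 0.5, 0.5, 0.5], [0.8, 0.8, 0.8, 0.8])
print(f"HT helps: {helps}, slowdown ratio: {ratio:.2f}")
```

A ratio of exactly 2 would mean no gain, since each model then takes twice as long while twice as many are running.
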
DJStarfox

Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 36540 - Posted: 29 Mar 2009, 4:52:20 UTC - in response to Message 36532.  

One of my concerns is that I run multiple projects. With no way to map projects to CPU cores in a logical fashion, I could not guarantee that two climate models wouldn't fight for the same floating-point units. Also, another project's code may compete with Climate. To make matters worse, I have to run my memory at 1066MHz (CAS 7) because the cheap DDR3 I bought wants more than the allowed voltage for Core i7's memory controller (1.65V). Therefore, the reduced memory bandwidth may limit performance when running several models at once.

I'm having a hard time getting BOINC to download more than two models at once. When I have some more time, I'll try editing BOINC's long/short term debts to force more models to download. It took nearly 1 hour to download the latest model type, HadAM3P.
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,114,703
RAC: 2,578
Message 36554 - Posted: 29 Mar 2009, 18:57:16 UTC - in response to Message 36540.  

I had the same experience with the HadAM3P. It took me 44 minutes to download the first WU on a 15 Mbps cable connection. This very long download seems to happen only on the first download of the HadAM3P. The second WU downloaded in only about 20 minutes.


DJStarfox

Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 36730 - Posted: 17 Apr 2009, 19:32:37 UTC

Running four HadAM3P models at the same time resulted in speeds between 3.2 and 3.7 s/TS. Running one model w/ other tasks such as SETI@Home resulted in a speed of 2.3 s/TS. That's quite a difference.
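
To put those figures in perspective, here is a small sketch of the arithmetic, using only the s/TS values quoted above. The helper names are made up for illustration, and the combined rate assumes all four models ran at the same s/TS, which the quoted range shows is only approximately true.

```python
def slowdown(loaded_s_per_ts, baseline_s_per_ts):
    """How many times slower each model runs compared with the baseline."""
    return loaded_s_per_ts / baseline_s_per_ts


def combined_rate(n_models, s_per_ts):
    """Timesteps per second summed over n_models running at the same s/TS."""
    return n_models / s_per_ts


baseline = 2.3          # s/TS for one HadAM3P alongside other projects
for loaded in (3.2, 3.7):  # s/TS range seen with four HadAM3P models
    print(
        f"{loaded} s/TS: {slowdown(loaded, baseline):.2f}x slower per model, "
        f"{combined_rate(4, loaded):.2f} TS/s across four models"
    )
```

Each model runs roughly 1.4 to 1.6 times slower when four run together, but the four combined still complete between about 1.08 and 1.25 timesteps per second, against about 0.43 for the single model.
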

I'm migrating my system over to the new Fedora 11 Beta, so it'll be June before I can test with 8 CPDN models at once (w/ HT on).

©2024 climateprediction.net