HADAM3P's too much RAC weight?

Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36552 - Posted: 29 Mar 2009, 17:54:50 UTC

Two months ago 4300 RAC points would put a machine in the top ten. Now it won't even get you into the top 100. Glancing around at the tasks of the top 100 machines, it looks like they're mostly running HADAM3P's.
ID: 36552
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 36553 - Posted: 29 Mar 2009, 18:53:40 UTC

Hi Belfry

HadAM3Ps do receive rather more credits per hour than the other models. However, the slightly higher credit level has to compensate for the massive uploads and downloads and the fact that if you run several of them on a multicore computer they slow each other down. If I run two together on my Core2Duo 6600 I get no more credit per hour than when running other model types. They seem to swap a lot of memory and contend for this resource. So members running these should ideally check what they download and run in tandem.

We've seen a case of spectacularly low running speed with HadAM3P where a member with a 6-core hyperthreaded computer is (or certainly was) running 6 of them together.

Other model types do also vary in speed according to the computer and OS. It's a good idea for everybody to keep an eye on the forum News thread where we warn about potential problems.
Cpdn news
ID: 36553
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36555 - Posted: 29 Mar 2009, 19:52:17 UTC
Last modified: 29 Mar 2009, 19:57:22 UTC

It looks like 2~3x the RAC of the "legacy" apps. Here are some machines that look like they're honest and crunching 24/7:

2P/quad-core Xeon, XPsp3 running legacy apps: 539345

Core 2 Duo, XPsp3 running HADAM3P's: 950312. Note how this one is knocking the Xeon out.

Core 2 Duo, XPsp2 running HADAM3P's: 957143. This machine has only been crunching for twenty days and is also knocking the Xeon out. Its total credit is 33211 (1660 credit per day) and RAC is 4269!

In a couple months every top 1000 machine will be exclusively crunching HADAM3P's. It doesn't much matter to me, I'll still be poking along with my old machine.
ID: 36555
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 36561 - Posted: 29 Mar 2009, 21:28:41 UTC
Last modified: 29 Mar 2009, 21:55:46 UTC

Hi again

I've looked at those computers. The two C2Duos are faster than mine.

Anonymous's computer is by my calculations getting about 33 credits/hour for its HadAM3Ps. When it ran Mid-Holocenes it got about 34 credits/hour for them. It would probably run the HadAM3Ps faster if it didn't have a full load of them.

The first C2Duo you link to is getting about 31 credits/hour for its HadAM3Ps. For a fast machine that isn't much more than one would expect for other model types. One of its Mid-Holocenes got 33.7 credits/hour and its HadAM got 29.3.

The real comparison between models must be by credits/hour.

Members have commented recently that our RAC on CPDN is artificially high at the moment. This seems to be because on several days our credits haven't been exported to the stats sites. When 2 or 3 days' worth of credits are exported all together it pushes the RAC up abnormally. We'll have to keep an eye on this to see whether our RAC does return to normal after a few weeks as it should if the CPDN server behaves for Milo.

My own RAC is showing as about 1880. But my slower computer has earned almost nothing for the last 10 days because it's been crunching for another project and CPDN Beta tasks which generated little or no credit. My C2D has earned less than usual because until about 3 days ago it was also testing Beta models that didn't trickle, i.e. generated no credits. The most my two computers can earn per day together is about 1300 - 1350 credits.
Cpdn news
ID: 36561
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36563 - Posted: 29 Mar 2009, 22:10:25 UTC
Last modified: 29 Mar 2009, 22:56:12 UTC

Actually this thread is mistitled, it should read "HADAM3P's too much RAC weight?".

The RAC calculation for HADAM3P is as accurate as an Enron accountant.

Here's two twenty-day-old machines (manually searched them out, not a lot going on today): An eight-core Xeon and the last Core 2 Duo example in my previous post. If you scroll to the bottom of the pages you can see how many credits each machine has produced in the last twenty days. The Xeon produces around 3200 per day, the Core 2 Duo around 1600. The RAC's are 1491 and 4269 respectively. The Xeon hasn't reported today and I know there have been some problems with the CPDN servers which hurts the Xeon RAC right now, but there is no way the Core 2 Duo should have that much RAC--it hasn't even produced 2000 credits in a single day.
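As a rough sanity check on the two hosts above, here is a minimal sketch. It assumes RAC behaves like an exponentially weighted average with a 7-day half-life, updated once per day from a starting value of zero. The helper name and the once-a-day schedule are illustrative assumptions, not the CPDN server's actual code.

```python
# Sketch only: assumes one RAC update per day and a 7-day half-life.
HALF_LIFE_DAYS = 7.0
DAILY_WEIGHT = 0.5 ** (1.0 / HALF_LIFE_DAYS)  # share of the old RAC kept each day


def rac_after_steady_days(daily_credit, days, start_rac=0.0):
    """Hypothetical helper: RAC after `days` of constant daily credit."""
    rac = start_rac
    for _ in range(days):
        # Old average decays; today's credit fills in the remainder.
        rac = rac * DAILY_WEIGHT + (1.0 - DAILY_WEIGHT) * daily_credit
    return rac


# Twenty days of steady output, using the daily figures quoted above.
xeon = rac_after_steady_days(3200, 20)
core2 = rac_after_steady_days(1600, 20)
print(round(xeon), round(core2))
```

Under these assumptions a host earning a steady amount can approach its daily credit but never exceed it, so a RAC of 4269 on roughly 1600 credits per day would not be reachable.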

I also noticed iansm posted about this earlier. If we look at his/her BoincStats page, the third chart down indicates a very steady daily credit contribution (apart from the server hiccups) but the fourth chart shows his/her RAC skyrocketing.

I am relieved this calculation only affects RAC and not total credit. But it does make shopping for a new computer harder, as RAC is now more dependent on the application being run than on the underlying hardware.
ID: 36563
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 36566 - Posted: 30 Mar 2009, 1:40:17 UTC
Last modified: 30 Mar 2009, 1:48:42 UTC

Geophi and I have looked at your links and think you're right - there's something wrong with the RAC of members like iansm who've run HadAM3P models. He started his first Enron models on the day they were launched, 11 March. I think the first two rises in Ian's RAC on 17 and 19 March may be normal, but there should have been no further rises between 20 and 23 March incl. And Ian's credit spike on 25 March produced no rise in RAC, which also seems abnormal.

Geophi has started a thread about this in the hidden moderators' forum section, which Milo will see. I've posted in detail there about Ian's graphs. For comparison I've also linked to the graphs of Neil Hunter who hasn't run any Enrons.

The HadAM3P models themselves are innocent. I wonder whether there could be an error in one of the trickle, credit or export scripts and it only relates to this model type.

Could everyone please note that all model types are generating the correct amount of credit.

I hope iansm and Neil Hunter don't mind that they've been mentioned by name. I'll send them both PMs on Monday so they do at least know.

Belfry, thanks for delving into the problem for us. I've edited the thread title to what you suggested.
Cpdn news
ID: 36566
old_user81594
Joined: 11 Jun 05
Posts: 67
Credit: 1,222,916
RAC: 0
Message 36570 - Posted: 30 Mar 2009, 19:18:40 UTC - in response to Message 36566.  

Could everyone please note that all model types are generating the correct amount of credit.

I hope iansm and Neil Hunter don't mind that they've been mentioned by name. I'll send them both PMs on Monday so they do at least know.

Belfry, thanks for delving into the problem for us. I've edited the thread title to what you suggested.


Hi everyone,

Well this is perfect timing to see this thread (and the PM from Mo).
I have just started a few of the new HADAM3P models just 48 hours ago. I grabbed three for my quad-core machine and two for my dual-core.
Firstly, one model crashed after just 11 hours, which forced me to do a Windows repair install which was annoying to say the least.

Secondly, they are certainly slowing down. On the Q6600 machine, they are currently running at around 5.1s/TS each, and 6.8s/TS on my 950D. I'll monitor how much slower they get. On the Q6600, I am running World Community Grid and Rosetta as well as two HADAM3P models plus a mid-Holocene. I hope a combination of these will be OK, as I read that running four HADAM3 models will slow the s/TS dramatically - there must be a FSB/RAM bottleneck.

Lastly, the RAC issue is weird!! The credits seem perfectly normal - but the RAC has jumped far too much in too short a space of time.
See my BoincStats page - http://boincstats.com/stats/user_graph.php?pr=cpdn&id=81594

You'll see my credit is normal for today (my PCs run other projects too, so credit is 1267 today), but the RAC has gone from 358 on Saturday to over 1100 in just 48 hours, despite the credits being just 79 yesterday and 1267 today!!! I would have expected a jump to maybe 450-500 maximum after just this amount of credit; it should take about 6 weeks to get the RAC up to your daily credit level, I believe.
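The expectation in the paragraph above can be checked with a small sketch. It assumes a once-per-day RAC update with a 7-day half-life (an assumption for illustration; the real update schedule on the server may differ) and uses only the figures quoted in this post.

```python
import math

# Sketch only: assumes one RAC update per day and a 7-day half-life.
HALF_LIFE_DAYS = 7.0
DAILY_WEIGHT = 0.5 ** (1.0 / HALF_LIFE_DAYS)


def next_rac(rac, credit_today):
    """Hypothetical daily update: decay the old RAC, blend in today's credit."""
    return rac * DAILY_WEIGHT + (1.0 - DAILY_WEIGHT) * credit_today


# Start from Saturday's RAC and apply the two daily credit figures quoted above.
rac = 358.0
for credit in (79.0, 1267.0):
    rac = next_rac(rac, credit)

# Days needed to close 99% of the gap to a new steady daily credit level.
days_to_99_percent = HALF_LIFE_DAYS * math.log2(100)

print(round(rac), round(days_to_99_percent, 1))
```

With these assumptions the RAC after the two days comes out at roughly 420, nowhere near 1100, and closing 99% of the gap takes about 46.5 days, which is in line with the "about 6 weeks" estimate.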

Anyway, I'll keep an eye on all of this, and will let you know my findings in a couple of days.

Best regards,

Neil.
ID: 36570
old_user81594
Joined: 11 Jun 05
Posts: 67
Credit: 1,222,916
RAC: 0
Message 36572 - Posted: 30 Mar 2009, 19:32:20 UTC - in response to Message 36561.  



....When 2 or 3 days\' worth of credits are exported all together it pushes the RAC up abnormally. We\'ll have to keep an eye on this to see whether our RAC does return to normal after a few weeks as it should if the CPDN server behaves for Milo....



I have to disagree here because if the credits aren't exported for a few days, then your RAC will drop over that period, only to return to the level it would have been at when the credits are finally exported.

The RAC is just a measure of average performance over time, as you know, so if your PC is capable of RAC=1000 for example, then you let it run 24/7 but disconnect from the internet for a week, your RAC will drop exponentially over the time disconnected to around 500 or so.
When it is reconnected, you may get 7000 credits or something crazy, but your RAC should only pop back to 1000 again; no more, no less.
Or is the RAC equation more complex than this???

Regards,

Neil
ID: 36572
old_user105019
Joined: 30 Oct 05
Posts: 15
Credit: 96,265
RAC: 0
Message 36576 - Posted: 31 Mar 2009, 8:51:13 UTC - in response to Message 36572.  

The RAC is just a measure of average performance over time, as you know, so if your PC is capable of RAC=1000 for example, then you let it run 24/7 but disconnect from the internet for a week, your RAC will drop exponentially over the time disconnected to around 500 or so.
When it is reconnected, you may get 7000 credits or something crazy, but your RAC should only pop back to 1000 again; no more, no less.
Or is the RAC equation more complex than this???

Assuming that it is still the same formula used as listed here, then there is a higher weight attached to recent credit than older credit. If you trust my maths and the simplified cases, the following examples can show this. I've assumed that credit and RAC are only calculated once per day, and that RAC was stable at the start.

Case 1 - 1000 credits per day added. RAC stays permanently at 1000

Case 2 - Upload turned off for a week. After 7 days the RAC will have dropped to 500. On day 8, 8000 credits will be added (1000 per day of running time). After the calculation on day 8 though, the RAC is now 1207.07 even though you have only produced 1000 credits per day. This will obviously decay back to 1000 over a couple of weeks.
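The two cases above can be reproduced with a short sketch. It uses the same simplification stated in this post (credit and RAC calculated once per day, RAC stable at the start) and a 7-day half-life; the function name is made up for illustration and this is not the server's actual code.

```python
# Sketch only: one RAC update per day, 7-day half-life, as assumed in the post.
HALF_LIFE_DAYS = 7.0
DAILY_WEIGHT = 0.5 ** (1.0 / HALF_LIFE_DAYS)


def daily_update(rac, credit_today):
    """Hypothetical daily update: decay the old RAC, blend in today's credit."""
    return rac * DAILY_WEIGHT + (1.0 - DAILY_WEIGHT) * credit_today


# Case 1: 1000 credits added every day, starting from a stable RAC of 1000.
case1 = 1000.0
for _ in range(30):
    case1 = daily_update(case1, 1000.0)

# Case 2: no credit for 7 days, then 8000 credits arrive at once on day 8.
case2 = 1000.0
for _ in range(7):
    case2 = daily_update(case2, 0.0)
after_outage = case2
case2 = daily_update(case2, 8000.0)

print(round(case1, 2), round(after_outage, 2), round(case2, 2))
```

Run as written, this gives a RAC that stays at 1000 in Case 1, drops to 500 after the week-long outage, and lands at about 1207.07 on day 8, matching the figures above.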

Hope this helps

Michael
ID: 36576
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 36599 - Posted: 1 Apr 2009, 22:02:43 UTC
Last modified: 1 Apr 2009, 22:03:09 UTC

Hi Michael

Yes, the explanation's useful, but I'm afraid the RAC of members with the new tasks isn't following these calculations. The RAC of people with the new models has shot up while other members not crunching them have normal RAC.

Milo doesn't think it can be a bug or misentered value in any of the CPDN credit scripts. He and Tolu have done things with these scripts recently, but the scripts themselves haven't been altered. Milo will look into the problem when he has time.

The problem could lie in how CPDN exports its stats. You'd think all our credits would go into a single pot and be exported to the stats sites as a single homogeneous quantity. However, we already knew that this isn't always the case. Those of us who crunch CPDN beta models have our Beta credits transferred to CPDN and added to our CPDN accounts. But the credit originating from the Beta project doesn't show up on the stats sites in our RAC. Willy's BoincStats RAC figures include it, but the usual RAC figures don't. It doesn't get completely assimilated with our CPDN credit and somewhere along the line is treated differently.

Richard Haselgrove thinks something similar could be happening to CPDN credit from the HadAM3P models. At some stage it's being treated differently from credit generated by other model types.

I think Richard's probably right about this but I'm not sure that his theory is a sufficient explanation.

I've posted an alternative explanation here.
Cpdn news
ID: 36599
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 36600 - Posted: 2 Apr 2009, 5:01:03 UTC

I see what you mean by the way the new Hadam3p WU’s are affecting the RAC. Running 3 WU’s (usually 1 Hadcm3 and 2 Hadsm3mh) on 2 middling level laptops (1 dual core, 1 single core) I used to usually be somewhere around 17th or 18th on the team USA listings with an RAC in the low 600’s. For the past week I have been running 1 Hadam3p in place of one of the Hadsm3mh. My RAC is now 905(!), but I have slipped to 29th in the rankings. I am about to add another Hadam3p (on the machine that doesn't have one). I wonder how high my RAC will go. :)

Oh well, I don’t really run the models for the credits or the team rankings anyway.

ID: 36600
Milo Thurston
Volunteer moderator
Volunteer developer
Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 36601 - Posted: 2 Apr 2009, 9:21:22 UTC - in response to Message 36599.  
Last modified: 2 Apr 2009, 9:36:52 UTC


Richard Haselgrove thinks something similar could be happening to CPDN credit from the HadAM3P models. At some stage it's being treated differently from credit generated by other model types.

I think Richard's probably right about this but I'm not sure that his theory is a sufficient explanation.


It is indeed a most perplexing mystery; I can't think of anywhere where hadam3p would be treated differently by BOINC as entries will be in the same tables as models that aren't causing a problem, and as I understand it it's BOINC code that should be dealing with stats exports and calculating RAC.
ID: 36601
Richard Haselgrove
Joined: 1 Jan 07
Posts: 925
Credit: 34,100,818
RAC: 11,270
Message 36603 - Posted: 2 Apr 2009, 11:03:51 UTC - in response to Message 36601.  


Richard Haselgrove thinks something similar could be happening to CPDN credit from the HadAM3P models. At some stage it's being treated differently from credit generated by other model types.

I think Richard's probably right about this but I'm not sure that his theory is a sufficient explanation.

It is indeed a most perplexing mystery; I can't think of anywhere where hadam3p would be treated differently by BOINC as entries will be in the same tables as models that aren't causing a problem, and as I understand it it's BOINC code that should be dealing with stats exports and calculating RAC.

Whatever the problem is, I think we have to look close to home. When I post this message, you should see a RAC of 7,947 below my name - well, you shouldn't see it, because it's far too high, but that's the figure that CPDN will display for you. And if you look at my computers, you'll see that the sum of the four individual RACs adds up to about the same: I should be nearer 2,000 per day for all hosts combined.

So I'm pretty certain that the errors are not occurring at the 'export' stage. The faulty figures are in our local database long before the exports are run: the problem has to be in the RAC calculation which is performed locally on the CPDN server (as per Milo's link - minor typo corrected in the quote above).

CPDN must do things differently from the standard BOINC code, because of the daily 'recalculate all credit from first principles', and I don't know how you handle the 'aging' process (7 day half-life) used to define RAC. But because the issue has arisen so suddenly since the introduction of the HadAM3P models, I can only surmise that the credits from these tasks are being fed into the calculations (both host and user) at a different phase in relation to the RAC update from the inputs from other model types.
ID: 36603
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36604 - Posted: 2 Apr 2009, 12:07:05 UTC
Last modified: 2 Apr 2009, 12:10:31 UTC

Thanks mo.v, geophi, Milo & Richard for delving into this.

It looks like HADAM3P is being treated like HADAM3H in the RAC calculation. If you look at the Core 2 examples I cited above, RAC is too high by a factor of approximately 2.61, which is the ratio of HADAM3H to HADAM3P credits--5184/1983.

I've contacted my Las Vegas bookie and adjusted all my bets to account for the Enron-like RAC's--so you can take your time fixing this.
ID: 36604
Richard Haselgrove
Joined: 1 Jan 07
Posts: 925
Credit: 34,100,818
RAC: 11,270
Message 36608 - Posted: 2 Apr 2009, 15:24:50 UTC

OK, here's my betting slip for the Enron Bookies - if you could drop it in the next time you're passing, please? :-)

I invoke the principle of Murphy's Razor - anything which can get this many people this perplexed for so long must be simple and obvious.

I think there must be a script which examines every reported trickle to establish its credit-worthiness. It must make two calculations:

a) What is the absolute credit value of this trickle? (to be added to host, user and team totals)
b) What is the discounted (npv) credit value of the trickle [reduced by e^(-ln(2)*t / 604800)]? (to be added to the host, user and team RACs)

When the script segment for HadAM3P models was added, it was copied and pasted from the HadAM3H template. Calculation (a) was updated to reflect the base trickle value for the new model type, calculation (b) wasn't.
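To make the guess above concrete, here is a minimal sketch of the two calculations. Everything in it is hypothetical: the function names, the idea that (a) and (b) take separate base values, and the use of the 5184 and 1983 figures quoted earlier in the thread as those base values. It illustrates the copy-and-paste hypothesis only and is not the actual CPDN script.

```python
import math

HALF_LIFE_SECONDS = 604800  # 7 days, the constant in the discount formula above


def discount(age_seconds):
    """Weight e^(-ln(2) * t / 604800) for a trickle that is t seconds old."""
    return math.exp(-math.log(2) * age_seconds / HALF_LIFE_SECONDS)


def trickle_contributions(total_base, rac_base, age_seconds):
    """Hypothetical script step: (a) absolute credit, (b) discounted credit."""
    absolute = total_base                          # (a) added to the totals
    discounted = rac_base * discount(age_seconds)  # (b) added to the RACs
    return absolute, discounted


# Assumed base values, taken from the ratio quoted earlier in the thread.
HADAM3H_BASE, HADAM3P_BASE = 5184.0, 1983.0
ONE_DAY = 86400

# Correct script: both calculations use the HadAM3P value.
_, correct = trickle_contributions(HADAM3P_BASE, HADAM3P_BASE, ONE_DAY)
# Hypothesised bug: (b) still uses the value copied from the HadAM3H template.
_, buggy = trickle_contributions(HADAM3P_BASE, HADAM3H_BASE, ONE_DAY)

inflation = buggy / correct
print(round(discount(HALF_LIFE_SECONDS), 3), round(inflation, 2))
```

If the hypothesis held, total credit would be unaffected while RAC would be inflated by a constant factor of about 2.61, whatever the age of the trickle. Later posts in the thread report RAC continuing to climb, which a constant factor would not explain.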
ID: 36608
old_user81594
Joined: 11 Jun 05
Posts: 67
Credit: 1,222,916
RAC: 0
Message 36709 - Posted: 13 Apr 2009, 6:16:25 UTC - in response to Message 36603.  


So I'm pretty certain that the errors are not occurring at the 'export' stage. The faulty figures are in our local database long before the exports are run: the problem has to be in the RAC calculation which is performed locally on the CPDN server......


Has something been fixed, as I see that my RAC from yesterday in my BoincSynergy signature has dropped like a stone, or was that simply because there was a server-outage yesterday which was only fixed late on in the evening??
Assuming the source of the RAC problem is found, I guess it'll take another 6 weeks or so for everyone to get back to their normal (appropriate) RAC?

Best regards,

Neil.
ID: 36709
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36761 - Posted: 21 Apr 2009, 19:24:25 UTC
Last modified: 21 Apr 2009, 19:43:01 UTC

Runaway RAC Effect!

After running a single HADAM3P for a couple weeks, I'm throwing out my 2.6 factor idea. My RAC hovers around 470 for one slab model, but it's now up to 1500--with no signs of leveling off. Maybe there's no linear relationship! Maybe it's logarithmic. Has anyone running HADAM3P's seen the RAC level off?

The floor of the top 100 is now 6700, whereas it was 4300 when I started this thread. Pretty soon we'll need scientific notation to depict these numbers.
ID: 36761
Belfry
Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 36765 - Posted: 21 Apr 2009, 22:00:06 UTC
Last modified: 21 Apr 2009, 22:02:06 UTC

Maybe RAC values aren't being depreciated by days: while a HADAM3P project is running its accumulated credit is calculated as though it was all reported yesterday. RAC is depreciated again after project completion.

I didn't do any calculations, so this is merely another guess.
ID: 36765
Billy
Joined: 23 Jan 07
Posts: 26
Credit: 852,233
RAC: 0
Message 36807 - Posted: 25 Apr 2009, 14:03:28 UTC

Yes, the RAC is still being reported incorrectly, very high, especially on the Macs. It is many times higher than the actual output.
ID: 36807
Virtual Boss*
Joined: 14 May 08
Posts: 29
Credit: 776,852
RAC: 0
Message 37021 - Posted: 31 May 2009, 11:50:04 UTC

Is there any progress on this problem?

While I am aware that the proper credit is being given, the false RAC is skewing all stats re 'Position by RAC'.

This has affected CPDN stats drastically, but has also affected Total Boinc stats as well, making this stat almost meaningless.

Obviously this problem is not a high priority for CPDN staff but has a fairly high annoyance rating for any crunchers trying to track their progress through the stats sites.

Note: This is not a gripe, only a query on resolution progress, and a reminder it does affect more than just this project's crunchers.
ID: 37021

©2024 climateprediction.net