climateprediction.net home page
FAMOUS SUCCESS/FAILURE RATIO


JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 39440 - Posted: 1 Apr 2010, 17:32:06 UTC

I started this thread because I have been wondering what the success-to-failure ratio is with the new FAMOUS WUs. Please post how many FAMOUS WUs you have run to completion and how many have crashed along the way. It would also be useful to include the OS, the type of processor (Intel or AMD) and the processor speed.

I just succeeded in finishing 1 FAMOUS model, but 2 others crashed along the way. This makes my success-to-failure ratio 1:2 so far. OS is Windows 7 64-bit and processor is an Intel Core 2 Duo at 2.2 GHz.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 39441 - Posted: 1 Apr 2010, 18:37:29 UTC
Last modified: 1 Apr 2010, 18:52:41 UTC

You should also include the 1st part of the model's name, e.g. r100_599, as some of the 1st parts are known to be more reliable than others (e.g. r109), and the start year affects how erratic the model is. Start-year 599 is a 'spinup', and can be worse than a start year further along.

My only mainsite model, a r185_799, is a little over halfway, with about 4 days to go.

edit
I forgot about this one:
r219_599, Intel P4 @3.2GHz, XP Pro.
Failed with the expected P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1079
Credit: 6,902,393
RAC: 6,787
Message 39443 - Posted: 1 Apr 2010, 19:02:04 UTC

... and don't worry about failures: the purpose of this group of FAMOUS work units is to separate those that lead to stable climates from those that don't.

Lockleys
Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 39445 - Posted: 1 Apr 2010, 21:40:03 UTC

My 1 failure (negative pressure): r150_799.
My 2 completed successfully: r152_1199 and r152_1399.
PC is Intel Core 2 Quad @ 2.8 GHz, Win7 Home Premium.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 39457 - Posted: 2 Apr 2010, 19:55:47 UTC

I don’t know if this means much now that the present version of the “FAMOUS” model has been withdrawn, but model
Famous_r125_1399_200_006632634_4 crashed at approx. 30% completion.

Windows7 64 bit running on an Intel Core2Duo T6600 2.2 GHz chip (4 GB of RAM).

3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 39463 - Posted: 2 Apr 2010, 23:50:57 UTC

2 success r212_599, r182_599
0 failure
2 in progress
Phenom II X4 955, Win7 64

old_user596405
Joined: 4 Oct 09
Posts: 73
Credit: 7,242,427
RAC: 0
Message 39466 - Posted: 3 Apr 2010, 8:14:33 UTC

This system has now finished processing 15 models - Intel Q6600 @ 3.2, 4GB RAM, Win 7 Home x64.

2 successes - r112_1399, r193_999

13 failures, listed below with the key stderr output message lines. 8 were Theta related.

The popular reason...
r157_799, r168_1599, r168_1599 (a different one), r174_1599, r179_1399
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

Variation?
r119_1399, r156_599, r175_1799
Model crashed: ATM_DYN : INVALID THETA DETECTED.

The remainder were likely caused by reboots, caused by power supply blips one night and flaky PSU in aftermath (sorted by clearing a build-up of static).
Error messages seem to suggest this kind of event.

Anyway, to complete the record for this machine...
r118_1199, r176_799, r185_799, r117_999, r215_599

Am maintaining similar logs for another 2 systems (4 + 15 models) and will post scores when all have finished in 3 days' time.
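
For anyone keeping similar logs, the tally above can be scripted. What follows is only a minimal Python sketch: the model names and error strings are copied from this post, the "reboot/PSU (suspected)" label is shorthand made up for the example, and reading the number after the underscore as the start year is an assumption based on the naming described earlier in the thread.

```python
from collections import Counter

# Outcomes transcribed from the post above. The "reboot/PSU (suspected)"
# label is shorthand introduced here, not an actual error message.
outcomes = {
    "success": ["r112_1399", "r193_999"],
    "P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED": [
        "r157_799", "r168_1599", "r168_1599", "r174_1599", "r179_1399",
    ],
    "ATM_DYN : INVALID THETA DETECTED": ["r119_1399", "r156_599", "r175_1799"],
    "reboot/PSU (suspected)": [
        "r118_1199", "r176_799", "r185_799", "r117_999", "r215_599",
    ],
}


def start_year(model: str) -> int:
    """Return the start year, assumed to be the number after the underscore."""
    return int(model.split("_")[1])


total = sum(len(models) for models in outcomes.values())
successes = len(outcomes["success"])
failures = total - successes

# Count failures per start year to see whether any year stands out.
failures_by_year = Counter(
    start_year(model)
    for reason, models in outcomes.items()
    if reason != "success"
    for model in models
)

print(f"{successes} successes, {failures} failures out of {total}")
for year, count in sorted(failures_by_year.items()):
    print(f"start year {year}: {count} failed")
```

With only 15 models the per-year counts are too small to draw conclusions from, but the same structure works if more reports from the thread are added.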

DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 39469 - Posted: 4 Apr 2010, 5:12:07 UTC

1 success
5 failures

[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 39474 - Posted: 5 Apr 2010, 13:45:30 UTC

One is still running.
6835949

One failure
Three successes
6835757

6835847

6836187

I notice my failed model has one success noted.


peterfilla
Joined: 27 Sep 04
Posts: 27
Credit: 11,115,003
RAC: 0
Message 39475 - Posted: 6 Apr 2010, 8:54:07 UTC

Is the problem OS-related? For WU 6835571, the Windows hosts are crashing while Linux is still running;
my model crashed too (Win XP Pro).

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39477 - Posted: 6 Apr 2010, 9:38:14 UTC
Last modified: 6 Apr 2010, 10:07:02 UTC

If a model becomes unstable on one OS (Windows, Linux or Mac) plus processor type (Intel or AMD) it is likely to develop exactly the same instability at the same moment on other computers with the same OS + processor type combination. There are 5 combinations:

Windows + Intel
Windows + AMD
Linux + Intel
Linux + AMD
Mac + Intel

I've looked through quite a few CPDN FAMOUS WUs to see the situation and have noticed that in a small number of cases one or more computers with Windows + Intel develop an instability but another computer with the same combination processes past that point. In at least one case the other computer developed an instability later. This is rare.

HadSM iceworlds also depend on this OS + processor type combination. But in one case I saw 3 computers with Linux and a particular processor develop an iceworld while a fourth with the same combination completed the model normally. This is also rare.

The processor type matters because each deals with a particular aspect of the arithmetic differently. I think the difference lies in how each rounds off the last digit after the decimal point, i.e. the treatment of rounding errors.

[Edit: I didn't look into the likelihood that computers continuing past an expected instability point were overclocked. Insufficiently tested overclocking could generate processing differences.]
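
To illustrate why a difference in the last digit can matter, here is a small Python sketch. It is not FAMOUS code and does not reproduce any real Intel/AMD difference; it uses the logistic map, a standard toy chaotic system, as a stand-in, and the parameter value 3.9 and starting value 0.4 are arbitrary choices for the example. Two runs start from values that differ by roughly one part in 10^16, which is about the size of a single rounding error in double precision.

```python
import sys


def logistic_orbit(x0, steps, r=3.9):
    """Iterate the logistic map x -> r * x * (1 - x) and return every value."""
    x = x0
    orbit = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


steps = 200
eps = sys.float_info.epsilon  # about 2.2e-16 for double precision

run_a = logistic_orbit(0.4, steps)
run_b = logistic_orbit(0.4 + eps, steps)  # tiny "rounding-sized" nudge

# Absolute difference between the two runs at each step.
gap = [abs(a - b) for a, b in zip(run_a, run_b)]

for step in (0, 25, 50, 75, 100):
    print(f"step {step + 1:3d}: difference = {gap[step]:.3e}")
```

The difference starts far below anything visible and grows until the two runs are unrelated. A climate model is vastly more complex, but the same sensitivity would explain why machines that round differently can crash at different points, or not at all.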

old_user596405
Joined: 4 Oct 09
Posts: 73
Credit: 7,242,427
RAC: 0
Message 39482 - Posted: 6 Apr 2010, 13:05:24 UTC

My other two systems have now finished their batches of Famous models.
Results a bit better than the first one which had only 2 passes from 15!

Links below are to Task details.

Intel Q6600 @ 2.4 stock, 3GB RAM, Win XP Pro SP3 (32-bit).

2 passed - r100_599, r185_799

2 failed -
r149_599
r186_799

Intel i7 920 @ 3.0, 6GB RAM, Win 7 Home (64-bit) - i.e. slightly overclocked.

8 passed - r107_1799, r144_799, r146_1199, r147_799, r148_599, r152_1199, r153_1399, r197_599

7 failed -
r145_999
r149_599
r151_999
r155_1799
r156_599
r158_999
r218_599

Meantime, back to running only SM3 and AM3P models. :)
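
Putting the three systems side by side, the pass rates can be worked out as below. This is just a sketch in Python: the counts come from this post and the earlier one (2 passes from 15), and the dictionary keys are informal labels made up for the example.

```python
# (passed, failed) per system, taken from the two posts by this user.
systems = {
    "Q6600 @ 3.2, Win 7 x64": (2, 13),
    "Q6600 @ 2.4, XP Pro SP3": (2, 2),
    "i7 920 @ 3.0, Win 7 x64": (8, 7),
}


def pass_rate(passed, failed):
    """Fraction of finished models that completed successfully."""
    return passed / (passed + failed)


for name, (passed, failed) in systems.items():
    print(f"{name}: {passed}/{passed + failed} = {pass_rate(passed, failed):.0%}")

total_passed = sum(passed for passed, _ in systems.values())
total_failed = sum(failed for _, failed in systems.values())
overall = pass_rate(total_passed, total_failed)
print(f"overall: {total_passed}/{total_passed + total_failed} = {overall:.0%}")
```

Bear in mind that 5 of the 13 failures on the first system were put down to reboots and a flaky PSU, so its rate says as much about the hardware as about the models.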

[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 39486 - Posted: 6 Apr 2010, 18:58:32 UTC

The last model finished successfully
6835949
So 4 out of 5

Details for this machine:

Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Intel64 Family 6 Model 26 Stepping 4] Microsoft Windows Vista Ultimate x64 Edition, Service Pack 2, (06.00.6002.00)
No overclocking
Running 24/7 alongside Milkyway on the GPU when available.
Also used for daily work and surfing and games...

FYI I've just checked, and the only machine on which my one failed model succeeded was a Xeon running Darwin.
Hope this helps


JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 39487 - Posted: 6 Apr 2010, 20:29:28 UTC - in response to Message 39486.  

[[B^S] mavau wrote:] The last model finished successfully
6835949
So 4 out of 5


Either you have an incredibly stable computer, or really great luck. Ever thought of betting on horse races? ;-)


[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 39497 - Posted: 7 Apr 2010, 18:45:32 UTC

It's true I haven't had many issues with my models.
This one is an XPS MT. Cleaning it up after 8 months has cut the fan noise down.
Otherwise, I think the Vista Service Packs have helped with the occasional power outage.
I remember when you had to be extra careful shutting down, particularly on my laptop, probably due to delayed disk activity. But this desktop has only had one possibly related iceworld.
One thing I do is get Windows Update to ask me when to install patches, so I can shut down BOINC first (OTOH, I keep everything updated).
Another is to check temperatures. (I have to take my laptop apart every 6-8 months to clean it up.)
Finally, sorry to say that with the 8 models running, I've let go of regular backups.


Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1079
Credit: 6,902,393
RAC: 6,787
Message 39499 - Posted: 7 Apr 2010, 22:01:41 UTC - in response to Message 39497.  

[[B^S] mavau wrote:] One thing I do is get Windows Update to ask me when to install patches, so I can shut down BOINC first (OTOH, I keep everything updated).

Actually, that's a very good tip that we don't mention enough. Installing Windows updates (particularly if an automatic re-boot is triggered) has certainly caused problems for models I've had running. Keeping the update warnings on and choosing when to download and install keeps things running smoothly.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 39533 - Posted: 12 Apr 2010, 3:58:14 UTC
Last modified: 12 Apr 2010, 3:59:37 UTC

Famous r131_1399_200_00632156_1 finished successfully. Windows 7 64-bit, Intel Core2Duo 2.2 GHz processor with 4 GB of RAM. That is my last FAMOUS WU from the first batch.

Does anyone know when the next batch will be released?

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 39534 - Posted: 12 Apr 2010, 8:43:47 UTC

At the moment on the Beta project we're testing 6.04, which has quite a high crash rate during the early years. Some of these crashes are caused by deliberately wild perturbations. Hiro's talking about another version, presumably beta, in which a filtering mechanism will prevent some of the crashes caused by wild parameter value perturbations. He and Tolu tried this before but it didn't work on the earlier version.

So it doesn't look as if a release on the main CPDN site is imminent.

If anyone with plenty of experience with CPDN model types + a willingness to look at their progress regularly + ability to report experiences on the forum wants to join Beta, send me a private message and I'll explain how to attach.

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 39843 - Posted: 3 Jun 2010, 18:18:56 UTC

Hi, everyone:

I see that the FAMOUS models are back so I am reactivating this thread and asking people to report their successful completions and failures with this type of model. Please include processor type (Intel v. AMD), OS version, and amount of RAM. You might also include the s/TS and total time to complete the WU.

Hopefully this batch will be more stable than the last one was.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 39846 - Posted: 3 Jun 2010, 21:46:53 UTC

Not much more stable, because of the science behind the modelling.
You WILL get failures, especially with the 'spinups'.
It's much like the early days of the project, 2003-2005, where the object is to find which parts of parameter space work and which don't.
