FAMOUS SUCCESS/FAILURE RATIO

Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 40389 - Posted: 19 Aug 2010, 20:00:53 UTC

Just had one of mine fail at about 34%, with a different error this time - i.e. not "invalid theta": famous_ubod_599_200_006647976_2.

The error was

SETPOS: Seek Failed: Invalid argument
SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1

Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1

repeated 6 times. Same exit code 22, though.

This breaks a run of 7 successes. Totals so far: 17 completed, 9 failed (plus 3 "download errors" from the server glitch back in June).
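
As a rough illustration of the ratio in the thread title, the totals above can be worked out as follows. This is only a sketch: the variable names are made up, and the 3 "download errors" are left out on the assumption that those tasks never actually ran.

```python
# Totals quoted in the post above; the names are illustrative only.
completed = 17
failed = 9  # model crashes; the 3 "download errors" are excluded (assumption)

attempted = completed + failed
success_rate = completed / attempted
ratio = completed / failed

print(f"{completed} of {attempted} finished: {success_rate:.1%} success")
print(f"success/failure ratio: {ratio:.2f} to 1")
```

Counting the download errors as failures instead would lower the success rate, so the choice should be stated whenever figures are compared.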
ID: 40389

Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 40390 - Posted: 19 Aug 2010, 20:10:04 UTC - in response to Message 40389.  

Just a note on those 3 "download errors": Two of them didn't get processed at all:

famous_uopf_1599_200_006664862 and famous_uopj_1799_200_006664866

I wonder how many more work units are like that, and whether it will be a problem for the experiment?
ID: 40390

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 40392 - Posted: 19 Aug 2010, 20:50:18 UTC

Greg

Your recent failure was Invalid theta. The other messages are most likely what happened when the program was suddenly diverted to a different (incorrect) area of code by the failure. The researchers will pick it up when looking through the lists, so not a problem for you.

The models that didn't arrive due to download errors are called phantom models.
And they are a problem to the project, because there's less chance of that particular combination getting processed by someone else. (No chance, if all of the batch failed to download.)

If the area of parameter space involved with the download problems at that time is important enough to whichever physicists are running those models, then they'll request that they be included again at some point.

ID: 40392

Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 40394 - Posted: 20 Aug 2010, 5:28:56 UTC - in response to Message 40392.  

Les - well, maybe. The model ran for about 20 hours after the third and last "Invalid Theta" message appeared in stderr.txt. (Note to programmers: it'd be handy if error messages were timestamped.)

All of my Famous models have logged at least one "invalid theta" message, but the majority go on to completion. I guess the code's "back up and re-try" works ;-).

As well as the "download error" models, I have two "normal" phantoms: famous_u0ch_1999_200_006633300_5 and famous_ulrv_799_200_006661062_0

These phantoms are "In Progress" according to the web site, but never made it to my machine. I recall watching (in the BOINC Manager) one of the download files, for u0ch, get to about 90% downloaded - and then just vanish. Not to worry: someone else managed a complete run for that work unit.
ID: 40394

old_user92639
Joined: 13 Aug 05
Posts: 54
Credit: 117,227
RAC: 0
Message 40401 - Posted: 22 Aug 2010, 18:02:19 UTC

ID: 40401

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,094,802
RAC: 2,822
Message 40434 - Posted: 27 Aug 2010, 0:37:45 UTC

Famous_ueet_999_200_006651520_4 failed. Reason: Model crashed: ATM_DYN : INVALID THETA DETECTED. The computer runs Windows 7 64 bit on an Intel Core 2 Duo 2.2 GHz processor with 4 GB of RAM.


ID: 40434

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 40437 - Posted: 27 Aug 2010, 20:36:48 UTC

Model crashed: ATM_DYN : INVALID THETA DETECTED. Three results of that WU have already failed that way.

I still have 7 active Famous 6.11 and a bunch of finished ones on that box. Besides the one mentioned here, no errors so far.
ID: 40437

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,094,802
RAC: 2,822
Message 40446 - Posted: 29 Aug 2010, 3:10:29 UTC

Famous_u9rf_1599_200_006645494_3 finished successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

ID: 40446

old_user92639
Joined: 13 Aug 05
Posts: 54
Credit: 117,227
RAC: 0
Message 40447 - Posted: 29 Aug 2010, 4:34:35 UTC

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11515402

22 error ^^
The device does not recognize the command. (0x16) - exit code 22 (0x16)


28-Aug-2010 22:01:05 [climateprediction.net] Started upload of famous_ufhh_1599_200_006652912_4_8.zip
28-Aug-2010 22:01:06 [climateprediction.net] Sending scheduler request: To send trickle-up message.
28-Aug-2010 22:01:06 [climateprediction.net] Not reporting or requesting tasks
28-Aug-2010 22:01:12 [climateprediction.net] Scheduler request completed
28-Aug-2010 22:04:20 [climateprediction.net] Finished upload of famous_ufhh_1599_200_006652912_4_8.zip
28-Aug-2010 23:10:23 [climateprediction.net] Computation for task famous_ufhh_1599_200_006652912_4 finished
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_9.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_10.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_11.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_12.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_13.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_14.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_15.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_16.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_17.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_18.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_19.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_20.zip for task famous_ufhh_1599_200_006652912_4 absent
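
For anyone wanting to check a longer message log for the same pattern, here is a minimal sketch that collects the "Output file ... absent" lines per task. It assumes only the line layout visible in the log above; the function name and the regular expression are hypothetical and not part of BOINC. On the full log above it would list 12 missing files, _9 through _20.

```python
import re

# Matches the "Output file <name> for task <task> absent" lines shown above.
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Return {task name: [missing output file names]} for a message log."""
    missing = {}
    for line in log_text.splitlines():
        match = ABSENT.search(line)
        if match:
            filename, task = match.groups()
            missing.setdefault(task, []).append(filename)
    return missing


# A few lines copied from the log above, used as sample input.
sample = """\
28-Aug-2010 22:04:20 [climateprediction.net] Finished upload of famous_ufhh_1599_200_006652912_4_8.zip
28-Aug-2010 23:10:23 [climateprediction.net] Computation for task famous_ufhh_1599_200_006652912_4 finished
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_9.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_10.zip for task famous_ufhh_1599_200_006652912_4 absent
"""

for task, files in absent_outputs(sample).items():
    print(task, "is missing", len(files), "output files")
```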

ID: 40447

Ananas
Volunteer moderator
Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 40451 - Posted: 29 Aug 2010, 20:39:16 UTC - in response to Message 40447.  

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11515402

22 error ^^
The device does not recognize the command. (0x16) - exit code 22 (0x16)


...


This is a "Theta" issue too; the file transfer errors are just a result of that Theta problem.
ID: 40451

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 40625 - Posted: 8 Sep 2010, 18:09:34 UTC
Last modified: 8 Sep 2010, 18:14:13 UTC

Hypothetical question. In general, the researchers don't know which combinations of perturbed parameters are plausible until they're tried and have identical failures, or similar completions, within a Task. (We're still testing this in Beta.) The range of possible parameter combinations and perturbations is vast.

The Models we run are not untested. They were developed by the U.K. Met Office and are used in regular weather and climate applications; our task in Beta is to test the envelope that allows a SuperComputer Model to run on a PC, as well as parameter ranges. (CPDN's goal is not "the" solution for the "climate problem." Rather, it is to understand a reasonable range. There is quite a bit of Project and science background information on the other Boards, starting with the home page. http://climateprediction.net/)

Edit: Added hot link.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 40625

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 40631 - Posted: 8 Sep 2010, 22:30:42 UTC - in response to Message 40630.  

The model being validated just means that the program software is OK as far as is known. But that's with the combinations of hardware and software that the testers used.

All 'climate' parameters/values can fail if used in certain combinations. Or if the models were to be run for longer periods.

If the models DON'T fail from instability, then they can still do so because of the hardware/software used on the computer running the model.
E.g., some people overclock their computers and say that they're still stable. But the Floating Point Unit (FPU), which is used for lots of the calculations, may have trouble providing data at the faster rate, and give values that make the model's results slightly different from what they would be if the computer wasn't overclocked. And, over time, these slight differences add up.
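
To show how a tiny numerical difference can grow, here is a toy sketch. It is not the climate model: it uses the logistic map, a standard textbook example of a chaotic iteration, and the perturbation of 1e-15 is simply an assumed stand-in for one slightly wrong floating-point result. All names are illustrative.

```python
def logistic_run(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x), a toy chaotic system."""
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values


# Two runs whose starting values differ by roughly one rounding error.
reference = logistic_run(0.4, 100)
perturbed = logistic_run(0.4 + 1e-15, 100)

# Absolute difference between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(reference, perturbed)]
for step in (0, 20, 40, 60):
    print(f"step {step:3d}: difference = {gaps[step]:.3e}")
print(f"largest difference over the run: {max(gaps):.3f}")
```

The two runs start out indistinguishable and end up unrelated, which is the same kind of drift described above, only on a much smaller system.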


Backups: Here
ID: 40631

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 40633 - Posted: 8 Sep 2010, 23:35:55 UTC

Back when we ran the original 200-year ocean Spinups for the 180-year HadCM3 Tasks, there was a baseline, unperturbed Task thrown into the mix. On the other hand, none of the Spinups had particularly aggressive parameters, because the goal was a set of ocean files to put into HadCM3 Tasks, so that not every participant would have to run nearly four months of work to get to the three-plus-month Task at hand. If I recall correctly, the Spinups didn't crash unless the computer did (as one of mine did, within hours of completion after nearly four months on a Pentium-4, thanks to a power glitch that found its way to the machine despite a UPS unit; fortunately, I made daily backups).

Except for the aside about my machine, is that within range of what you are getting at? (I confess to not understanding what you really want to know.)
ID: 40633

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2169
Credit: 64,550,109
RAC: 6,649
Message 40635 - Posted: 9 Sep 2010, 2:51:08 UTC - in response to Message 40623.  

Success/failure ratio rises as 'no go' parameter space is identified and avoided, but if combinations of physically-plausible parameter values fail then does this suggest that the general model is not robust?

It is sometimes challenging to state what a physically plausible parameter value is. Processes (like thunderstorms or individual clouds) that are too small in scale to resolve on the model's large grid have to be parameterized. The following text, from the basic experiment strategy for older models, describes the parameters. Individual links within this text take you to further explanations of parameters:
Parameters

Every climate model has to make a number of approximations, called parameterisations. To read more about these, click here. Basically this means that there are numbers in the model which are given a certain, fixed value, but this value is not known for sure and a range of values could be equally realistic. The experiments will investigate the effect on the modelled climate of varying the value of 20 of the most poorly understood parameters in the model - such as the relationship between the number of raindrops in a cloud and how much it actually rains (to see what they are, click here). It is possible that some combinations of parameters may replicate the past climate equally well, but produce widely different forecasts for what might happen in the future. Some combinations of parameters will not work at all, producing a completely unrealistic climate (for example an Earth that boils or freezes, or oscillates between very hot and very cold every couple of years) and probably crashing the model. It is not possible for us to tell beforehand what these combinations will be.
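
To give a feel for why the combinations cannot all be checked beforehand, here is a back-of-envelope sketch. The figure of 20 parameters comes from the text above; the assumption that each parameter takes just 3 values (say low, default, high) is purely hypothetical and chosen only to keep the arithmetic simple.

```python
import itertools

N_PARAMETERS = 20         # from the quoted description above
VALUES_PER_PARAMETER = 3  # hypothetical: e.g. low / default / high

# Every parameter can be set independently, so the counts multiply.
total = VALUES_PER_PARAMETER ** N_PARAMETERS
print(f"{total:,} possible combinations")

# Listing them all is only practical for a very small subset of parameters.
subset = list(itertools.product(("low", "default", "high"), repeat=3))
print(len(subset), "combinations for just 3 parameters")
```

Even under this deliberately small assumption the total runs into the billions, so only a sample of the space can ever be run.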


And this is a very good description of the millennium experiment which talks about why some models in this experiment are expected to fail.
ID: 40635

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,094,802
RAC: 2,822
Message 40646 - Posted: 9 Sep 2010, 18:27:14 UTC

Famous_u9d4_599_200_006644979_1 completed successfully.
OS is Win7 32 bit running on a Core 2 Duo 1.5 GHz processor with 2 GB of RAM.

ID: 40646

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,094,802
RAC: 2,822
Message 40655 - Posted: 11 Sep 2010, 5:03:09 UTC

Famous_u9no_1399_200_006645359_3 finished successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

ID: 40655

Strathpeffer
Joined: 9 Jan 07
Posts: 497
Credit: 342,899
RAC: 0
Message 40698 - Posted: 17 Sep 2010, 18:43:51 UTC

Sorry to report that my Famous_ubdx_599_200_006647600_0 has crashed with an "unrecoverable error" :-(
Visit the Scotland team
ID: 40698

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 40700 - Posted: 17 Sep 2010, 20:07:39 UTC - in response to Message 40698.  

Or more explicitly, with: INVALID THETA

ID: 40700

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,094,802
RAC: 2,822
Message 40709 - Posted: 18 Sep 2010, 4:06:39 UTC

Famous_ufb3_999_200_006652682_2 completed successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.
ID: 40709

Strathpeffer
Joined: 9 Jan 07
Posts: 497
Credit: 342,899
RAC: 0
Message 40744 - Posted: 22 Sep 2010, 16:46:41 UTC - in response to Message 40700.  
Last modified: 22 Sep 2010, 16:46:55 UTC

Or more explicitly, with: INVALID THETA

Thanks Les, that info wasn't yet showing when I first posted. When the "Invalid Theta" message did appear, I meant to come back and amend my post but got kinda sidetracked, as happens around here! Thanks for clarifying. ;-)
ID: 40744
