climateprediction.net home page

FAMOUS SUCCESS/FAILURE RATIO


JIM
Joined: Dec 31 07
Posts: 682
Credit: 4,224,379
RAC: 2,953
Message 39440 - Posted 1 Apr 2010 17:32:06 UTC

    I started this thread because I have been wondering what the success-to-failure ratio is with the new FAMOUS WUs. Please post how many FAMOUS WUs you have run to completion and how many have crashed along the way. It would also be useful to include the OS, the processor type (Intel or AMD) and the processor speed.

    I just succeeded in finishing 1 FAMOUS model, but 2 others crashed along the way. This makes my success-to-failure ratio 1:2 so far. OS is Windows 7 64-bit and processor is an Intel Core2Duo 2.2 GHz.

    ____________

    Les Bayliss
    Forum moderator
    Joined: Sep 5 04
    Posts: 5428
    Credit: 9,074,925
    RAC: 1,934
    Message 39441 - Posted 1 Apr 2010 18:37:29 UTC

      Last modified: 1 Apr 2010 18:52:41 UTC

      You should also include the 1st part of the model's name, e.g. r100_599, as some of the 1st parts are known to be more reliable than others (e.g. r109), and the start year affects how erratic the model is. Start-year 599 is a 'spinup', and can be worse than a start-year further along.
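For anyone keeping a log, here is a minimal Python sketch of splitting a model name into the two parts mentioned above. It assumes names are underscore-separated, with an optional leading "famous", then the first part, then the start year. The names `parse_model_name`, `ModelName` and `SPINUP_START_YEAR` are made up for this example, and any further fields in a full task name are deliberately left uninterpreted.

```python
from typing import NamedTuple

# Per the post above, start-year 599 is a 'spinup'.
SPINUP_START_YEAR = 599


class ModelName(NamedTuple):
    first_part: str   # e.g. "r100"
    start_year: int   # e.g. 599
    is_spinup: bool   # True when the start year is 599


def parse_model_name(name: str) -> ModelName:
    """Split a name such as 'r100_599' or 'Famous_r125_1399_...' (hypothetical helper)."""
    parts = name.strip().split("_")
    # Drop an optional leading "famous" prefix, whatever its capitalisation.
    if parts and parts[0].lower() == "famous":
        parts = parts[1:]
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError(f"unrecognised model name: {name!r}")
    year = int(parts[1])
    # Any remaining fields are ignored: their meaning is not stated in this thread.
    return ModelName(parts[0], year, year == SPINUP_START_YEAR)


short = parse_model_name("r100_599")
full = parse_model_name("Famous_r125_1399_200_006632634_4")
```

Grouping a log by `first_part` and `is_spinup` makes it easy to see whether failures cluster in particular first parts or in the spinups.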

      My only mainsite model, a r185_799, is a little over halfway, with about 4 days to go.

      edit
      I forgot about this one:
      r219_599, Intel P4 @3.2GHz, XP Pro.
      Failed with the expected P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED

      Iain Inglis
      Forum moderator
      Joined: Jan 16 10
      Posts: 522
      Credit: 9,532
      RAC: 0
      Message 39443 - Posted 1 Apr 2010 19:02:04 UTC

        ... and don't worry about failures: the purpose of this group of FAMOUS work units is to separate those that lead to stable climates from those that don't.

        Lockleys
        Joined: Jan 13 07
        Posts: 126
        Credit: 4,385,572
        RAC: 3,108
        Message 39445 - Posted 1 Apr 2010 21:40:03 UTC

          My 1 failure (negative pressure): r150_799.
          My 2 completed successfully: r152_1199 and r152_1399.
          PC is Intel Core 2 Quad @ 2.8GHz, Win7 Home Premium.

          JIM
          Joined: Dec 31 07
          Posts: 682
          Credit: 4,224,379
          RAC: 2,953
          Message 39457 - Posted 2 Apr 2010 19:55:47 UTC

            I don’t know if this means much now that the present version of the “FAMOUS” model has been withdrawn, but model
            Famous_r125_1399_200_006632634_4 crashed at approx. 30% completion.

            Windows7 64 bit running on an Intel Core2Duo T6600 2.2 GHz chip (4 GB of RAM).

            ____________

            3rkko
            Joined: Feb 12 08
            Posts: 54
            Credit: 4,247,765
            RAC: 0
            Message 39463 - Posted 2 Apr 2010 23:50:57 UTC

              2 success r212_599, r182_599
              0 failure
              2 in progress
              Phenom II X4 955, Win7 64

              ian.sm
              Joined: Oct 4 09
              Posts: 73
              Credit: 7,242,427
              RAC: 0
              Message 39466 - Posted 3 Apr 2010 8:14:33 UTC

                This system has now finished processing 15 models - Intel Q6600 @ 3.2, 4GB RAM, Win 7 Home x64.

                2 successes - r112_1399, r193_999

                13 failures, with key stderr output message lines. 8 were Theta related.

                The popular reason...
                r157_799, r168_1599, r168_1599 (a different one), r174_1599, r179_1399
                Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

                Variation?
                r119_1399, r156_599, r175_1799
                Model crashed: ATM_DYN : INVALID THETA DETECTED.

                The remainder were likely caused by reboots, caused by power supply blips one night and flaky PSU in aftermath (sorted by clearing a build-up of static).
                Error messages seem to suggest this kind of event.

                Anyway, to complete the record for this machine...
                r118_1199, r176_799, r185_799, r117_999, r215_599

                Am maintaining similar logs for another 2 systems (4 + 15 models) and will post scores when all finished in 3 days time.

                DJStarfox
                Joined: Jan 27 07
                Posts: 262
                Credit: 1,178,267
                RAC: 127
                Message 39469 - Posted 4 Apr 2010 5:12:07 UTC

                  1 success
                  5 failures

                  [B^S] mavau
                  Joined: Aug 30 04
                  Posts: 142
                  Credit: 8,238,300
                  RAC: 3,469
                  Message 39474 - Posted 5 Apr 2010 13:45:30 UTC

                    One is still running.
                    6835949

                    One failure
                    Three successes
                    6835757

                    6835847

                    6836187

                    I notice my failed model has one success noted.

                    ____________

                    Forum search Site search

                    peterfilla
                    Joined: Sep 27 04
                    Posts: 24
                    Credit: 10,775,173
                    RAC: 575
                    Message 39475 - Posted 6 Apr 2010 8:54:07 UTC

                      Problem OS-related ? WU 6835571 -> Windows is crashing, Linux running ;
                      my Model crashed too (Win XP Pro)
                      ____________

                      mo.v
                      Forum moderator
                      Joined: Sep 29 04
                      Posts: 2359
                      Credit: 7,177,874
                      RAC: 1,646
                      Message 39477 - Posted 6 Apr 2010 9:38:14 UTC

                        Last modified: 6 Apr 2010 10:07:02 UTC

                        If a model becomes unstable on one OS (Windows, Linux or Mac) plus processor type (Intel or AMD) it is likely to develop exactly the same instability at the same moment on other computers with the same OS + processor type combination. There are 5 combinations:

                        Windows + Intel
                        Windows + AMD
                        Linux + Intel
                        Linux + AMD
                        Mac + Intel

                        I've looked through quite a few CPDN FAMOUS WUs to see the situation and have noticed that in a small number of cases one or more computers with Windows + Intel develop an instability but another computer with the same combination processes past that point. In at least one case the other computer developed an instability later. This is rare.

                        HadSM iceworlds also depend on this OS + processor type combination. But in one case I saw 3 computers with Linux and a particular processor develop an iceworld while a fourth with the same combination completed the model normally. This is also rare.

                        The processor type matters because each deals with a particular aspect of the arithmetic differently. I think the difference lies in how each deals with rounding off the last digit after the decimal point, i.e. the treatment of rounding errors.

                        [Edit: I didn't look into the likelihood that computers continuing past an expected instability point were overclocked. Insufficiently tested overclocking could generate processing differences.]
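To make the rounding point concrete, here is a toy Python sketch. It is not the FAMOUS model and says nothing about how any particular processor rounds. It only shows two generic facts: floating-point results can depend on the order of operations, and a chaotic iteration (a logistic map, chosen purely as a stand-in) amplifies a difference in the last digits until the two runs no longer resemble each other. The function name and all the numbers are assumptions for illustration.

```python
from typing import List


def logistic_orbit(x0: float, r: float = 3.9, steps: int = 200) -> List[float]:
    """Iterate x -> r * x * (1 - x), a standard toy chaotic system."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# 1. Floating-point addition is not associative: grouping changes the last digit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
order_matters = left != right

# 2. Two runs that start a tiny distance apart drift until they are unrelated.
run_a = logistic_orbit(0.3)
run_b = logistic_orbit(0.3 + 1e-12)
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]

# Index of the first step at which the two runs differ by more than 0.1.
first_big_gap = next(i for i, gap in enumerate(gaps) if gap > 0.1)

print("order of additions matters:", order_matters)
print("runs differ by more than 0.1 from step", first_big_gap)
```

In this picture, two machines that round identically follow the same path and hit the same instability at the same moment, while a machine that rounds even slightly differently can end up on a different path.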
                        ____________
                        Cpdn news

                        ian.sm
                        Joined: Oct 4 09
                        Posts: 73
                        Credit: 7,242,427
                        RAC: 0
                        Message 39482 - Posted 6 Apr 2010 13:05:24 UTC

                          My other two systems have now finished their batches of Famous models.
                          Results a bit better than the first one which had only 2 passes from 15!

                          Links below are to Task details.

                          Intel Q6600 @ 2.4 stock, 3GB RAM, Win XP Pro SP3 (32-bit).

                          2 passed - r100_599, r185_799

                          2 failed -
                          r149_599
                          r186_799

                          Intel i7 920 @ 3.0, 6GB RAM, Win 7 Home (64-bit) - i.e. slightly overclocked.

                          8 passed - r107_1799, r144_799, r146_1199, r147_799, r148_599, r152_1199, r153_1399, r197_599

                          7 failed -
                          r145_999
                          r149_599
                          r151_999
                          r155_1799
                          r156_599
                          r158_999
                          r218_599

                          Meantime, back to running only SM3 and AM3P models. :)
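For the record, the pass/fail counts posted for these three machines can be tallied with a few lines of Python. This is only a convenience sketch using the numbers given in this thread. The labels and the helper name `success_rate` are made up for the example, and the first machine's 13 failures include the ones attributed to reboots rather than to model instability.

```python
# (passed, failed) per machine, as posted in this thread.
reports = {
    "Q6600 @ 3.2, Win 7 x64": (2, 13),
    "Q6600 @ 2.4, Win XP SP3": (2, 2),
    "i7 920 @ 3.0, Win 7 x64": (8, 7),
}


def success_rate(passed: int, failed: int) -> float:
    """Fraction of finished models that completed successfully."""
    total = passed + failed
    if total == 0:
        raise ValueError("no finished models to rate")
    return passed / total


for machine, (passed, failed) in reports.items():
    print(f"{machine}: {passed} of {passed + failed} passed "
          f"({success_rate(passed, failed):.0%})")

total_passed = sum(p for p, _ in reports.values())
total_failed = sum(f for _, f in reports.values())
overall = success_rate(total_passed, total_failed)
print(f"overall: {total_passed} of {total_passed + total_failed} passed ({overall:.0%})")
```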

                          [B^S] mavau
                          Joined: Aug 30 04
                          Posts: 142
                          Credit: 8,238,300
                          RAC: 3,469
                          Message 39486 - Posted 6 Apr 2010 18:58:32 UTC

                            The last model finished successfully
                            6835949
                            So 4 out of 5

                            Details for this machine:

                            Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Intel64 Family 6 Model 26 Stepping 4] Microsoft Windows Vista Ultimate x64 Edition, Service Pack 2, (06.00.6002.00)
                            No overclocking
                            Running 24/7 alongside Milkyway on the GPU when available.
                            Also used for daily work and surfing and games...

                            FYI I've just checked that my one failure was only a success on a Xeon running Darwin.
                            Hope this helps

                            ____________

                            Forum search Site search

                            JIM
                            Joined: Dec 31 07
                            Posts: 682
                            Credit: 4,224,379
                            RAC: 2,953
                            Message 39487 - Posted 6 Apr 2010 20:29:28 UTC - in response to Message 39486.

                              The last model finished successfully
                              6835949
                              So 4 out of 5

                              Details for this machine:

                              Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Intel64 Family 6 Model 26 Stepping 4] Microsoft Windows Vista Ultimate x64 Edition, Service Pack 2, (06.00.6002.00)
                              No overclocking
                              Running 24/7 alongside Milkyway on the GPU when available.
                              Also used for daily work and surfing and games...

                              FYI I've just checked that my one failure was only a success on a Xeon running Darwin.
                              Hope this helps


                              Either you have an incredibly stable computer, or really great luck. Ever thought of betting on horse races? ;-)


                              ____________

                              [B^S] mavau
                              Joined: Aug 30 04
                              Posts: 142
                              Credit: 8,238,300
                              RAC: 3,469
                              Message 39497 - Posted 7 Apr 2010 18:45:32 UTC

                                It's true I haven't had many issues with my models.
                                This one is an XPS MT. Cleaning it up after 8 months has cut the fan noise down.
                                Otherwise, I think the Vista Service Packs have helped with the occasional power outage.
                                I remember when you had to be extra careful shutting down, particularly on my laptop, probably due to delayed disk activity. But this desktop has only had one possibly related iceworld.
                                One thing I do is get Windows Update to ask me when to install patches, so I can shut down BOINC first (OTOH, I keep everything updated).
                                Another is check temperature. (I have to take my laptop apart every 6-8 months to clean it up)
                                Finally, sorry to say that with the 8 models running, I've let go of regular backups.

                                ____________

                                Forum search Site search

                                Iain Inglis
                                Forum moderator
                                Joined: Jan 16 10
                                Posts: 522
                                Credit: 9,532
                                RAC: 0
                                Message 39499 - Posted 7 Apr 2010 22:01:41 UTC - in response to Message 39497.

                                  [[B^S] mavau wrote:] One thing I do is get Windows Update to ask me when to install patches, so I can shut down BOINC first (OTOH, I keep everything updated).

                                  Actually, that's a very good tip that we don't mention enough. Installing Windows updates (particularly if an automatic re-boot is triggered) has certainly caused problems for models I've had running. Keeping the update warnings on and choosing when to download and install keeps things running smoothly.

                                  JIM
                                  Joined: Dec 31 07
                                  Posts: 682
                                  Credit: 4,224,379
                                  RAC: 2,953
                                  Message 39533 - Posted 12 Apr 2010 3:58:14 UTC

                                    Last modified: 12 Apr 2010 3:59:37 UTC

                                    Famous r131_1399_200_00632156_1 finished successfully. Windows 7 64-bit, Intel Core2Duo 2.2 GHz processor with 4 GB of RAM. That is my last FAMOUS WU from the first batch.

                                    Does anyone know when the next batch will be released?
                                    ____________

                                    mo.v
                                    Forum moderator
                                    Joined: Sep 29 04
                                    Posts: 2359
                                    Credit: 7,177,874
                                    RAC: 1,646
                                    Message 39534 - Posted 12 Apr 2010 8:43:47 UTC

                                      At the moment on the Beta project we're testing 6.04, which has quite a high crash rate during the early years. Some of these crashes are caused by deliberately wild perturbations. Hiro's talking about another version, presumably beta, in which a filtering mechanism will prevent some of the crashes caused by wild parameter value perturbations. He and Tolu tried this before but it didn't work on the earlier version.

                                      So it doesn't look as if a release on the main CPDN site is imminent.

                                      If anyone with plenty of experience with CPDN model types + a willingness to look at their progress regularly + ability to report experiences on the forum wants to join Beta, send me a private message and I'll explain how to attach.
                                      ____________
                                      Cpdn news

                                      JIM
                                      Joined: Dec 31 07
                                      Posts: 682
                                      Credit: 4,224,379
                                      RAC: 2,953
                                      Message 39843 - Posted 3 Jun 2010 18:18:56 UTC

                                        Hi, everyone:

                                        I see that the FAMOUS models are back so I am reactivating this thread and asking people to report their successful completions and failures with this type of model. Please include processor type (Intel v. AMD), OS version, and amount of RAM. You might also include the s/TS and total time to complete the WU.

                                        Hopefully this batch will be more stable than the last one was.

                                        ____________

                                        Les Bayliss
                                        Forum moderator
                                        Joined: Sep 5 04
                                        Posts: 5428
                                        Credit: 9,074,925
                                        RAC: 1,934
                                        Message 39846 - Posted 3 Jun 2010 21:46:53 UTC

                                          Not much more stable, because of the science behind the modelling.
                                          You WILL get failures, especially with the 'spinups'.
                                          It's much like the early days of the project, 2003-2005, where the object is to find which parts of parameter space work and which don't.

                                          JIM
                                          Joined: Dec 31 07
                                          Posts: 682
                                          Credit: 4,224,379
                                          RAC: 2,953
                                          Message 39849 - Posted 4 Jun 2010 0:21:06 UTC

                                            I am afraid that Les is right. One WU already failed on my faster machine. It ran less than 1 hour of CPU time. It was gone so fast that I didn’t even get a chance to write down its designation. All I remember is that it started in 1799. I guess there is no reason to make backups, as the WUs will just fail again at the same point if restored.

                                            Computer has an Intel 2.2 GHz processor running Windows 7 64 bit with 4 GB of RAM.

                                            In science they say that even negative results are results. If it helps to weed out bad starting parameters from the good ones it’s worth the computer time.

                                            ____________

                                            JIM
                                            Joined: Dec 31 07
                                            Posts: 682
                                            Credit: 4,224,379
                                            RAC: 2,953
                                            Message 39879 - Posted 8 Jun 2010 3:44:55 UTC

                                              The WU famous_u0of_1599_200_006633730_6 crashed on 07/06/2010 at approx. 37% completion. OS is Windows 7 64-bit running on an Intel Core2Duo 2.2 GHz processor with 4 GB RAM.


                                              ____________

                                              [B^S] mavau
                                              Joined: Aug 30 04
                                              Posts: 142
                                              Credit: 8,238,300
                                              RAC: 3,469
                                              Message 39880 - Posted 8 Jun 2010 6:12:09 UTC

                                                Two crashes so far:
                                                famous_u0nl_1599_200_006633700_4
                                                famous_u0mr_1599_200_006633670_1
                                                Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Intel64 Family 6 Model 26 Stepping 4] Microsoft Windows Vista Ultimate x64 Edition, Service Pack 2, (06.00.6002.00)
                                                ____________

                                                Forum search Site search

                                                astroWX
                                                Forum moderator
                                                Joined: Aug 5 04
                                                Posts: 1304
                                                Credit: 38,908,080
                                                RAC: 23,052
                                                Message 39892 - Posted 8 Jun 2010 17:59:42 UTC

                                                  Last modified: 8 Jun 2010 18:16:51 UTC

                                                  'NEGATIVE PRESSURE VALUE CREATED' on Q9300 in Vista_x64 after T.S. 140,426 (7.5%): famous_u0wv_1799_200_006634034_6

                                                  Edit: Three other Tasks in the Work Unit crashed with 'NEGATIVE PRESSURE'; three Tasks continue in progress.
                                                  ____________
                                                  "We have met the enemy and he is us." -- Pogo
                                                  Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                  Lockleys
                                                  Joined: Jan 13 07
                                                  Posts: 126
                                                  Credit: 4,385,572
                                                  RAC: 3,108
                                                  Message 39911 - Posted 10 Jun 2010 10:14:54 UTC

                                                    Model famous_u0ct_999_200_006633312_5 crashed on Q9550 Quad Intel Win7 x64 at about 3.1% complete.

                                                    astroWX
                                                    Forum moderator
                                                    Joined: Aug 5 04
                                                    Posts: 1304
                                                    Credit: 38,908,080
                                                    RAC: 23,052
                                                    Message 39922 - Posted 12 Jun 2010 5:05:16 UTC

                                                      Last modified: 12 Jun 2010 5:09:10 UTC

                                                      Completed on Q9300 in XP_x64: famous_u0tv_1799_200_006633926_5. Temperature curves reach for the stars at the end.
                                                      Includes Tambora, Krakatoa, Katmai, Pinatubo volcanic events.
                                                      Edit: That leaves me at 50% for v.6.10.
                                                      ____________
                                                      "We have met the enemy and he is us." -- Pogo
                                                      Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                      Lockleys
                                                      Joined: Jan 13 07
                                                      Posts: 126
                                                      Credit: 4,385,572
                                                      RAC: 3,108
                                                      Message 39923 - Posted 12 Jun 2010 10:21:25 UTC

                                                        Model famous_u0o9_1599_200_006633724_6 crashed on C2D 6400 @ 2.13GHz WIN XP at about 38.25% complete. INVALID THETA DETECTED.

                                                        ian.sm
                                                        Joined: Oct 4 09
                                                        Posts: 73
                                                        Credit: 7,242,427
                                                        RAC: 0
                                                        Message 39936 - Posted 14 Jun 2010 9:58:20 UTC

                                                          Today - famous_upgs_1799_200_006665847_6 - Invalid Theta Detected.
                                                          18.5% completed on i7 920 @ 3.4, Win 7.
                                                          This machine has successfully completed 2 with 8 still running.

                                                          3 June - famous_u0ny_1799_200_006633713_1 - Invalid Theta Detected.
                                                          Failed before first trickle (after less than 1 hour) on a Q6600 @ 3.2, Win 7.
                                                          This machine has successfully completed 3 with one still running.

                                                          [B^S] mavau
                                                          Joined: Aug 30 04
                                                          Posts: 142
                                                          Credit: 8,238,300
                                                          RAC: 3,469
                                                          Message 39937 - Posted 14 Jun 2010 11:22:48 UTC

                                                            Latest results:
                                                            famous_u0pt_1999_200_006633780
                                                            I don't understand the difference in credit.
                                                            famous_u0pq_1999_200_006633777_3
                                                            That one completed with a workunit error?
                                                            ____________

                                                            Forum search Site search

                                                            mo.v
                                                            Forum moderator
                                                            Joined: Sep 29 04
                                                            Posts: 2359
                                                            Credit: 7,177,874
                                                            RAC: 1,646
                                                            Message 39938 - Posted 14 Jun 2010 12:18:33 UTC

                                                              I think you're referring to the phrase 'Workunit error - check skipped'. This line is really for Boinc projects that compare two or more completed tasks to validate them and decide which will be the canonical result, which I think means the definitive result for the researchers.

                                                              CPDN doesn't validate results by this method. Almost every completed result is used. So this line is irrelevant for CPDN.

                                                              There are often also red lines on workunit pages that are irrelevant, such as 'Too many results' or 'Too many errors - may have a bug'(!!). I don't know whether it would be possible for CPDN to hide these lines. Milo's too busy still fixing problems from the Boinc upgrade to ask at the moment.

                                                              It would definitely be better if CPDN members never saw these Boinc phrases.

                                                              If you look at the News thread post about FAMOUS (top of this Number Crunching section) it may explain what you need to know about the credits.
                                                              ____________
                                                              Cpdn news

                                                              Thyme Lawn
                                                              Forum moderator
                                                              Joined: Aug 5 04
                                                              Posts: 1232
                                                              Credit: 10,449,542
                                                              RAC: 1,317
                                                              Message 39939 - Posted 14 Jun 2010 12:34:24 UTC - in response to Message 39937.

                                                                Latest results:
                                                                famous_u0pt_1999_200_006633780
                                                                I don't understand the difference in credit.

                                                                Granted credits are recalculated once a day. The 2 completed tasks should both have 6,176.41 credits tomorrow.
                                                                ____________
                                                                "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer

                                                                [B^S] mavau
                                                                Joined: Aug 30 04
                                                                Posts: 142
                                                                Credit: 8,238,300
                                                                RAC: 3,469
                                                                Message 39941 - Posted 14 Jun 2010 18:31:15 UTC

                                                                  Thanks for the info.
                                                                  Another completed model with:'Workunit error - check skipped'.
                                                                  famous_u0my_1799_200_006633677
                                                                  Looking at the Work unit page, it seems connected with "Too many total results" as you say.
                                                                  To sum up, those two messages are just BOINC artefacts. They don't concern CPDN and are nothing to worry about:

                                                                    Workunit error - check skipped
                                                                    Too many total results



                                                                  ____________

                                                                  Forum search Site search

                                                                  Profile JIM
                                                                  Send message
                                                                  Joined: Dec 31 07
                                                                  Posts: 682
                                                                  Credit: 4,224,379
                                                                  RAC: 2,953
                                                                  Message 39947 - Posted 15 Jun 2010 19:34:43 UTC

                                                                    Famous_u0mw1999_200_006633675_1 completed successfully. OS is Windows 7 64 bit running on an Intel Core 2 Duo 2.2 GHz with 4 GB of RAM. s/TS is 0.46.

                                                                    ____________

                                                                    Profile astroWX
                                                                    Forum moderator
                                                                    Send message
                                                                    Joined: Aug 5 04
                                                                    Posts: 1304
                                                                    Credit: 38,908,080
                                                                    RAC: 23,052
                                                                    Message 39955 - Posted 17 Jun 2010 21:07:52 UTC

                                                                      Summary for recent time away from home (since Friday):
                                                                      Seven successful completions, seven crashes. (All in 64-bit Windows, Vista/W7/XP.) So, I remain at 50% for v.6.10.
                                                                      ____________
                                                                      "We have met the enemy and he is us." -- Pogo
                                                                      Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                      [B^S] mavau
                                                                      Send message
                                                                      Joined: Aug 30 04
                                                                      Posts: 142
                                                                      Credit: 8,238,300
                                                                      RAC: 3,469
                                                                      Message 39956 - Posted 17 Jun 2010 21:28:23 UTC

                                                                        So far 5 completed, two errors.
                                                                        I'm changing preferences to run only Famous to see what happens.
                                                                        ____________

                                                                        Forum search Site search

                                                                        [B^S] mavau
                                                                        Send message
                                                                        Joined: Aug 30 04
                                                                        Posts: 142
                                                                        Credit: 8,238,300
                                                                        RAC: 3,469
                                                                        Message 39970 - Posted 19 Jun 2010 5:55:58 UTC

                                                                          Another crash, so 7 completed 3 crashes.
                                                                          ____________

                                                                          Forum search Site search

                                                                          Les Bayliss
                                                                          Forum moderator
                                                                          Send message
                                                                          Joined: Sep 5 04
                                                                          Posts: 5428
                                                                          Credit: 9,074,925
                                                                          RAC: 1,934
                                                                          Message 39971 - Posted 19 Jun 2010 6:01:48 UTC

                                                                            2 completed, no failures.
                                                                            Win XP Pro, 4 Gigs RAM.

                                                                            [B^S] mavau
                                                                            Send message
                                                                            Joined: Aug 30 04
                                                                            Posts: 142
                                                                            Credit: 8,238,300
                                                                            RAC: 3,469
                                                                            Message 39974 - Posted 19 Jun 2010 11:04:24 UTC

                                                                              Last modified: 19 Jun 2010 11:04:46 UTC

                                                                              I thought this one would complete :-)
                                                                              Inspiron laptop with Windows 7 Pro 64, 2Gigs RAM.
                                                                              ____________

                                                                              Forum search Site search

                                                                              ian.sm
                                                                              Send message
                                                                              Joined: Oct 4 09
                                                                              Posts: 73
                                                                              Credit: 7,242,427
                                                                              RAC: 0
                                                                              Message 39983 - Posted 20 Jun 2010 15:30:27 UTC - in response to Message 39936.

                                                                                On 14 June, iansm wrote:
                                                                                Today - famous_upgs_1799_200_006665847_6 - Invalid Theta Detected.
                                                                                18.5% completed on i7 920 @ 3.4, Win 7.
                                                                                This machine has successfully completed 2 with 8 still running.

                                                                                3 June - famous_u0ny_1799_200_006633713_1 - Invalid Theta Detected.
                                                                                Failed before first trickle (after less than 1 hour) on a Q6600 @ 3.2, Win 7.
                                                                                This machine has successfully completed 3 with one still running.

                                                                                Another crashed in the 920 system today at 39.5%.
                                                                                famous_upfc_1599_200_006665795_3 - Invalid Theta Detected.
                                                                                3 completed, 2 crashed and 6 still running.

                                                                                The Q6600 system finished its 4th and final model (meantime) with just the one failure.

                                                                                Profile [B@H] Ray
                                                                                Avatar
                                                                                Send message
                                                                                Joined: Aug 19 05
                                                                                Posts: 104
                                                                                Credit: 1,753,585
                                                                                RAC: 201
                                                                                Message 39990 - Posted 21 Jun 2010 20:04:02 UTC

                                                                                  I had THIS ONE crash today.

                                                                                  Completed TS 1,141,946
                                                                                  Average time per TS 0.4505

                                                                                  System
                                                                                  AMD Athlon II 235e
                                                                                  6 Gigs ram.

                                                                                  Today I can't get a new one: they all give download errors. I checked, and other users get the same errors on the same units. I'll run HADAM3P until I can get one tomorrow, as I've already used my quota of one per day.
                                                                                  ____________
                                                                                  Keep on crunching Pizza@Home

                                                                                  Profile Greg van Paassen
                                                                                  Send message
                                                                                  Joined: Nov 17 07
                                                                                  Posts: 142
                                                                                  Credit: 4,271,370
                                                                                  RAC: 0
                                                                                  Message 39991 - Posted 21 Jun 2010 22:27:39 UTC

                                                                                    Core i3 530 2.93GHz, 2GB Kingston valueRAM, Gigabyte H55M UD2H mo'board, Linux Arch 2.6.33, 100% CPDN.

                                                                                    Crashed 3:

                                                                                    u0d9_0599 neg. press. 42,999 sec

                                                                                    upij_0799 theta 271,931 sec

                                                                                    u0s5_1999 neg. press. 155,591 sec


                                                                                    Completed 2:
                                                                                    u0s4_1799 1,029,859 sec

                                                                                    u0sp_1799 1,029,843 sec


                                                                                    In progress 1: u089_0599 - 90%

                                                                                    Mystery (says in progress on web page, but isn't on PC) 1: u0ch_1999

                                                                                    [B^S] mavau
                                                                                    Send message
                                                                                    Joined: Aug 30 04
                                                                                    Posts: 142
                                                                                    Credit: 8,238,300
                                                                                    RAC: 3,469
                                                                                    Message 40007 - Posted 24 Jun 2010 12:00:08 UTC

                                                                                      One mistake in my previous post: only 6 completed models.
                                                                                      And 7 crashes.
                                                                                      The latest:
                                                                                      famous_uow2_1799_200_006665101

                                                                                      famous_uoxh_1799_200_006665152

                                                                                      famous_uowz_1799_200_006665134

                                                                                      ____________

                                                                                      Forum search Site search

                                                                                      Profile JIM
                                                                                      Send message
                                                                                      Joined: Dec 31 07
                                                                                      Posts: 682
                                                                                      Credit: 4,224,379
                                                                                      RAC: 2,953
                                                                                      Message 40028 - Posted 26 Jun 2010 17:09:15 UTC

                                                                                        Last modified: 26 Jun 2010 17:12:42 UTC

                                                                                        Famous_u0na_1799_200_006633689_6 crashed at 96% completion. OS is Windows 7 32 bit running on an Intel Core 2 Duo 1.5 GHz processor with 2 GB of RAM. 1.06s/TS RIP :(
                                                                                        ____________

                                                                                        Profile JIM
                                                                                        Send message
                                                                                        Joined: Dec 31 07
                                                                                        Posts: 682
                                                                                        Credit: 4,224,379
                                                                                        RAC: 2,953
                                                                                        Message 40033 - Posted 27 Jun 2010 14:30:31 UTC

                                                                                          Last modified: 27 Jun 2010 14:34:04 UTC

                                                                                          Famous_u0mw_1799_200_006634055_6 completed successfully. OS is Windows 7 32 bit running on an Intel Core 2 Duo 1.5 GHz processor with 2 GB of RAM. 1.05s/TS. :) I seem to be running at about a 50% success rate on this type.
                                                                                          ____________

                                                                                          Profile mo.v
                                                                                          Forum moderator
                                                                                          Avatar
                                                                                          Send message
                                                                                          Joined: Sep 29 04
                                                                                          Posts: 2359
                                                                                          Credit: 7,177,874
                                                                                          RAC: 1,646
                                                                                          Message 40034 - Posted 27 Jun 2010 15:30:49 UTC

                                                                                            Last modified: 27 Jun 2010 15:46:44 UTC

                                                                                            I've looked at how some of the top computers are doing, adding together results for FAMOUS 6.10 and 6.11. I've not counted models with downloading errors as that was a server problem.

                                                                                            Peter, Linux: 6 completed, 5 errored

                                                                                            Ian Rees, Windows: 5 completed, 5 errored

                                                                                            Montes, Mac: 2 completed, 7 errored

                                                                                            Mike Koehler, Mac: 2 completed, 6 errored

                                                                                            Anonymous, Windows: 1 completed, 6 errored


                                                                                            This is less than the approx 50% success rate you estimate, but two factors make the above figures not entirely reliable.

                                                                                            * Models that crash take less computing time than completions.

                                                                                            * The list doesn't include partly processed models, and the further a model has progressed, the less likely it must be to crash, i.e. the more likely to succeed.

                                                                                            So I think the success ratio of these computers will probably increase as they have time to finish more models.
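As a rough check on the figures listed above, here is a minimal Python sketch that pools the five computers' counts into one success rate. The counts are copied from this post; the dictionary layout and the `success_rate` helper are only illustrative, and pooling treats every model as equally weighted, which is an assumption rather than anything the project itself does.

```python
# Completed/errored counts copied from the list above (FAMOUS 6.10 + 6.11).
results = {
    "Peter (Linux)": (6, 5),
    "Ian Rees (Windows)": (5, 5),
    "Montes (Mac)": (2, 7),
    "Mike Koehler (Mac)": (2, 6),
    "Anonymous (Windows)": (1, 6),
}


def success_rate(completed, errored):
    """Fraction of finished models that completed (hypothetical helper)."""
    total = completed + errored
    return completed / total if total else float("nan")


# Per-computer rates, then one pooled rate over all finished models.
for name, (completed, errored) in results.items():
    print(f"{name}: {success_rate(completed, errored):.0%}")

total_completed = sum(c for c, _ in results.values())
total_errored = sum(e for _, e in results.values())
pooled = success_rate(total_completed, total_errored)
print(f"Pooled: {total_completed}/{total_completed + total_errored} = {pooled:.1%}")
```

The pooled figure only covers models that have already finished or crashed, so it is subject to the same two caveats given above.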

                                                                                            A more accurate estimate could be obtained by trawling through many workunits to see how many succeed on all platforms and how many crash on one, two or three. But this would be extraordinarily time-consuming. Because some computers crash models for non-model-related reasons one would need to look at the stderr of every model failure apart from those that couldn't get started because of a computer misconfiguration.

                                                                                            I will not be doing this.

                                                                                            The % of workunits that complete on all platforms must be lower than the average success % on members' computers.

                                                                                            One of us could look at those very stable top computers again after say another month.
                                                                                            ____________
                                                                                            Cpdn news

                                                                                            ian.sm
                                                                                            Send message
                                                                                            Joined: Oct 4 09
                                                                                            Posts: 73
                                                                                            Credit: 7,242,427
                                                                                            RAC: 0
                                                                                            Message 40039 - Posted 28 Jun 2010 6:28:23 UTC

                                                                                              One more crash in my i7 920 system (@3.4 with Win 7 Home x64) at 51.5%.
                                                                                              famous_upfd_1799_200_006665796_0 - Invalid Theta Detected.

                                                                                              3 completed, 3 crashed and 5 still running in this machine.

                                                                                              Profile Iain Inglis
                                                                                              Forum moderator
                                                                                              Send message
                                                                                              Joined: Jan 16 10
                                                                                              Posts: 522
                                                                                              Credit: 9,532
                                                                                              RAC: 0
                                                                                              Message 40040 - Posted 28 Jun 2010 8:30:21 UTC - in response to Message 40039.

                                                                                                Invalid Theta Detected.

                                                                                                Just in case anyone is wondering what 'theta' is: potential temperature.

                                                                                                [B^S] mavau
                                                                                                Send message
                                                                                                Joined: Aug 30 04
                                                                                                Posts: 142
                                                                                                Credit: 8,238,300
                                                                                                RAC: 3,469
                                                                                                Message 40051 - Posted 29 Jun 2010 12:53:07 UTC

                                                                                                  Two more successes:

                                                                                                  famous_up1h_1399_200_006665296

                                                                                                  famous_uoxz_1799_200_006665170

                                                                                                  8 completed models, 7 crashes, 8 running on the corei7 and 1 on the Inspiron.

                                                                                                  ____________

                                                                                                  Forum search Site search

                                                                                                  Profile genes
                                                                                                  Avatar
                                                                                                  Send message
                                                                                                  Joined: Aug 9 04
                                                                                                  Posts: 25
                                                                                                  Credit: 4,756,980
                                                                                                  RAC: 0
                                                                                                  Message 40052 - Posted 29 Jun 2010 14:36:18 UTC

                                                                                                    Last modified: 29 Jun 2010 14:37:18 UTC

                                                                                                    Invalid Theta on this task: famous_r100_799_200_006666899_1.

                                                                                                    So far, five completions, and one other Invalid Theta. All on Win7_x64.

                                                                                                    [B^S] mavau
                                                                                                    Send message
                                                                                                    Joined: Aug 30 04
                                                                                                    Posts: 142
                                                                                                    Credit: 8,238,300
                                                                                                    RAC: 3,469
                                                                                                    Message 40080 - Posted 5 Jul 2010 8:23:23 UTC

                                                                                                      I'm now on 13 completed models and 10 crashes.
                                                                                                      ____________

                                                                                                      Forum search Site search

                                                                                                      Profile JIM
                                                                                                      Send message
                                                                                                      Joined: Dec 31 07
                                                                                                      Posts: 682
                                                                                                      Credit: 4,224,379
                                                                                                      RAC: 2,953
                                                                                                      Message 40095 - Posted 9 Jul 2010 18:11:23 UTC

                                                                                                        Famous_u0qu_1799_200_006667114_2 completed successfully. OS is Windows 7 64 bit running on a Intel Core 2 Duo 2.2 GHz with 4 GB’s of RAM.

                                                                                                        ____________

                                                                                                        Profile Greg van Paassen
                                                                                                        Send message
                                                                                                        Joined: Nov 17 07
                                                                                                        Posts: 142
                                                                                                        Credit: 4,271,370
                                                                                                        RAC: 0
                                                                                                        Message 40112 - Posted 11 Jul 2010 22:28:38 UTC - in response to Message 40034.

                                                                                                          I've had a look in a little more depth at the FAMOUS success/failure stats from the first two pages of the 'Top Computers' list.

                                                                                                          I tried to pick computers with at least 700,000 credits, so not "drive-bys". Compute errors only, as before.

                                                                                                          Computer  OS               Pend+Invalid  Error  Error%  Overall Fail%
                                                                                                          976458    Darwin                     11     29      73
                                                                                                          1013254   Darwin                      4     29      88
                                                                                                          1001600   Darwin                      0      9     ALL
                                                                                                          978938    Darwin                      4     12      75
                                                                                                          1063866   Darwin                      3     27      90  83% Darwin
                                                                                                                                                                  excluding 1001600: 82% Darwin
                                                                                                          1000554   W7                          2      3      60
                                                                                                          961681    WSv2008                     7     12      63
                                                                                                          882224    WXP X64                     5      2      29  55% Windows
                                                                                                          1036870   Lin 2.6.16                 16      8      33
                                                                                                          1072992   Lin 2.6.32                  6      7      54
                                                                                                          1047400   Lin 2.6.32 FC12             7      6      46  42% Linux
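The overall failure percentages in the table can be rechecked with a short Python sketch. It assumes that Error% means Error / (Pend+Invalid + Error), which appears to match the listed figures, and that the three OS groups are Darwin, Windows and Linux; the `os_family` labels and variable names are only illustrative.

```python
# Rows copied from the table: (computer ID, OS family, Pend+Invalid, Error).
# The OS family grouping is an assumption made for this sketch.
rows = [
    ("976458", "Darwin", 11, 29),
    ("1013254", "Darwin", 4, 29),
    ("1001600", "Darwin", 0, 9),
    ("978938", "Darwin", 4, 12),
    ("1063866", "Darwin", 3, 27),
    ("1000554", "Windows", 2, 3),
    ("961681", "Windows", 7, 12),
    ("882224", "Windows", 5, 2),
    ("1036870", "Linux", 16, 8),
    ("1072992", "Linux", 6, 7),
    ("1047400", "Linux", 7, 6),
]

# Sum non-error and error counts per OS family.
totals = {}
for _computer, os_family, pend_invalid, error in rows:
    ok, err = totals.get(os_family, (0, 0))
    totals[os_family] = (ok + pend_invalid, err + error)

# Assumed definition: failure % = errors / (non-errors + errors).
fail_pct = {
    os_family: 100 * err / (ok + err)
    for os_family, (ok, err) in totals.items()
}

for os_family, pct in fail_pct.items():
    print(f"{os_family}: {pct:.0f}% failed")
```

Under that assumed definition the pooled figures round to the same overall percentages as in the table.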


                                                                                                          Of course this is a snapshot, so you won't get these numbers now, or not all of them anyway. And early days, and all that. However.

                                                                                                          Is it possible there is a problem with the MacOS code? Especially since most of the Darwin computers have relatively few failures with the other types of models.

                                                                                                          Edit: will cross-post on CPDN board as this board seems to ignore the "pre" tag, so the table is not easy to follow.

                                                                                                          Profile geophi
                                                                                                          Forum moderator
                                                                                                          Send message
                                                                                                          Joined: Aug 7 04
                                                                                                          Posts: 1478
                                                                                                          Credit: 23,133,788
                                                                                                          RAC: 11,481
                                                                                                          Message 40115 - Posted 11 Jul 2010 23:22:57 UTC - in response to Message 40112.

                                                                                                            Last modified: 11 Jul 2010 23:24:42 UTC

                                                                                                            On my systems here at cpdn...

                                                                                                            Core i7 920 in Linux
                                                                                                            6 completed, 7 failed, 4 in progress
                                                                                                            Phenom II X4 940 in Linux
                                                                                                            7 completed, 5 failed, 4 in progress
                                                                                                            Core 2 E6420 in Windows
                                                                                                            2 completed, 0 failed, 1 in progress

                                                                                                            Les Bayliss
                                                                                                            Forum moderator
                                                                                                            Joined: Sep 5 04
                                                                                                            Posts: 5428
                                                                                                            Credit: 9,074,925
                                                                                                            RAC: 1,934
                                                                                                            Message 40116 - Posted 12 Jul 2010 0:05:27 UTC - in response to Message 40112.

                                                                                                              There's always the possibility of faulty data files, but ALL types of climate model are tested for months on our beta site.

                                                                                                              It's possible that your comparisons are too simplistic.

                                                                                                              As I said near the start of this thread, it's known that some of the series of models with "early label names" were being "pushed hard" with their forcing values, making them more unstable. (Some of the models that I have now, are up to the "u" series.)

                                                                                                              And I also said there that the models with a start year of 599 are 'spinups', which are also more unstable than those with any of the subsequent start years. As these later start years use data from models of the previous start year that completed (which will allow the 2 runs to be "stitched together" to form a longer run), it's more likely that the parameter values used are from a stable part of parameter space.
                                                                                                              And they will definitely be using a spinup that was stable. :)

                                                                                                              So your comparison would need to take into account these 2 items: the series name, and the start year of the models.
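One way to do that split, as a rough Python sketch: pull the series and the start year out of each task name and tally successes and failures per group. The name layout `famous_<series>_<start year>_...` is an assumption based on the task names quoted in this thread, and `series_and_start_year` and `tally` are hypothetical helper names. The three sample results are ones reported by posters in this thread.

```python
import re
from collections import Counter

# Assumed layout: famous_<series>_<start year>_<rest>; only the first two fields are used.
NAME_RE = re.compile(r"^famous_([a-z0-9]+)_(\d+)_", re.IGNORECASE)


def series_and_start_year(task_name):
    """Return (series, start_year), or None if the name does not look like a FAMOUS task."""
    m = NAME_RE.match(task_name)
    if m is None:
        return None
    return m.group(1).lower(), int(m.group(2))


def tally(results):
    """results: iterable of (task_name, succeeded).

    Returns a Counter keyed by (series letter, is_spinup, succeeded),
    where is_spinup means the start year is 599.
    """
    counts = Counter()
    for name, ok in results:
        parsed = series_and_start_year(name)
        if parsed is None:
            continue  # not a FAMOUS task name, skip it
        series, year = parsed
        counts[(series[0], year == 599, ok)] += 1
    return counts


# Sample data: outcomes reported in this thread.
sample = [
    ("Famous_r149_799_200_006666483_5", True),
    ("Famous_u0il_1799_200_006667077_3", True),
    ("famous_u01x_1799_200_006632920_5", False),
]
counts = tally(sample)
for key, n in sorted(counts.items()):
    print(key, n)
```

Grouping on the first letter of the series is only a guess at what "the r series" and "the u series" mean; a finer split would key on the whole series string.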


                                                                                                              ____________
                                                                                                              Backups: Here

                                                                                                              Profile Greg van Paassen
                                                                                                              Joined: Nov 17 07
                                                                                                              Posts: 142
                                                                                                              Credit: 4,271,370
                                                                                                              RAC: 0
                                                                                                              Message 40117 - Posted 12 Jul 2010 0:22:29 UTC - in response to Message 40115.

                                                                                                                On my own machine, Core i3 Linux, I have had 3 complete and 5 failed, a failure rate of 63%.

                                                                                                                I have my suspicions about my computer's memory (Kingston valueRAM), even though it passes the memtest86+ test. I have underclocked the memory by 10% and the latest 4 models are running fine so far. Time will tell.

                                                                                                                In case you can't decipher the messed-up table below, the essence was

                                                                                                                Darwin failure rate 82%, Windows failure rate 55%, Linux failure rate 42%. Darwin seems to be an outlier.

                                                                                                                Profile geophi
                                                                                                                Forum moderator
                                                                                                                Joined: Aug 7 04
                                                                                                                Posts: 1478
                                                                                                                Credit: 23,133,788
                                                                                                                RAC: 11,481
                                                                                                                Message 40120 - Posted 12 Jul 2010 1:07:42 UTC

                                                                                                                  Last modified: 12 Jul 2010 1:12:08 UTC

                                                                                                                  If I recall correctly from beta, the FAMOUS application for Darwin uses a higher optimization level because they couldn't compile it otherwise. That may or may not have anything to do with the failure rate.

                                                                                                                  As Les said, however, some of these sets will be inherently more unstable than others due to parameter choices. It's difficult to accept only a 50% success rate when it's previously been > 95%, but that's the nature of running this FAMOUS experiment.

                                                                                                                  Profile JIM
                                                                                                                  Joined: Dec 31 07
                                                                                                                  Posts: 682
                                                                                                                  Credit: 4,224,379
                                                                                                                  RAC: 2,953
                                                                                                                  Message 40121 - Posted 12 Jul 2010 1:53:45 UTC

                                                                                                                    Famous_r149_799_200_006666483_5 completed successfully. OS is Windows 7 64 bit running on Intel Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                    I don’t know if it is just luck, but, this is 2 for 2 with the Famous models with the new graphics.


                                                                                                                    ____________

                                                                                                                    Profile Greg van Paassen
                                                                                                                    Joined: Nov 17 07
                                                                                                                    Posts: 142
                                                                                                                    Credit: 4,271,370
                                                                                                                    RAC: 0
                                                                                                                    Message 40125 - Posted 12 Jul 2010 3:47:16 UTC - in response to Message 40116.

                                                                                                                      More detailed investigation as suggested by Les.

                                                                                                                      Ignoring anything that is not "famous_uxxx_", and all with _599_ start year, i.e. looking at just "u series and not 599":-

                                                                                                                      Darwin Xeon (3 computers): 20 succeeded, 70 failed.

                                                                                                                      Darwin i7 (1 computer): 9 succeeded, 7 failed.

                                                                                                                      Win Opteron (1 computer): 6 succeeded, 6 failed.

                                                                                                                      Linux Xeon (2 computers): 15 succeeded, 9 failed.

                                                                                                                      Linux i7 (1 computer): 5 succeeded, 7 failed.

                                                                                                                      All of these are compatible with the "about fifty-fifty chance of failure" warning, except for Darwin Xeon. It could be just chance... but it might not.

                                                                                                                      (And actually, the r series and the "599s" don't make much difference to the percentages, in the tiny sample of computers I looked at.)

                                                                                                                      I'm not comparing the failure rate to anything--I've been away from the project for a few years, and only had about 10 SM3s before starting on famouses. I don't have Darwin, or a Xeon--more's the pity ;-). I'm just saying that there might be something to look into, using proper statistical methods.
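As a first rough screen, the counts above can be put through an exact two-sided binomial test against the "about fifty-fifty" baseline. This is only a sketch using the Python standard library: it treats every model on a platform as an independent trial with the same 50% failure chance, which is not really true here because the models differ in their parameters, so a small p-value flags something worth a proper look rather than proving a fault. The function name `binom_two_sided_p` is made up for illustration.

```python
from math import comb


def binom_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under success probability p.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so that ties are not lost to rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# (succeeded, failed) per platform, "u series and not 599", from the figures above.
COUNTS = {
    "Darwin Xeon": (20, 70),
    "Darwin i7": (9, 7),
    "Win Opteron": (6, 6),
    "Linux Xeon": (15, 9),
    "Linux i7": (5, 7),
}

p_values = {
    name: binom_two_sided_p(ok, ok + failed) for name, (ok, failed) in COUNTS.items()
}
for name, p_val in p_values.items():
    print(f"{name}: p = {p_val:.3g}")
```

With five platforms tested at once, a stricter threshold than the usual 0.05 would be sensible before reading anything into a single result.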

                                                                                                                      Geophi - compiler (option) problems was my first guess. Famous models seem to be smaller than others, only about 30 MB resident rather than 100+ MB -- CPUs seem to spend less time moving data in and out from memory, and more time computing. Maybe the famous code has flushed out a very obscure intermittent bug.

                                                                                                                      And maybe it's just chance.

                                                                                                                      This is about as much investigation as I'm prepared to do without writing scripts, and it'd be better for someone who has direct access to the database to do that. So: leaving it there, thanks for listening. ;-)

                                                                                                                      Profile Iain Inglis
                                                                                                                      Forum moderator
                                                                                                                      Joined: Jan 16 10
                                                                                                                      Posts: 522
                                                                                                                      Credit: 9,532
                                                                                                                      RAC: 0
                                                                                                                      Message 40128 - Posted 12 Jul 2010 8:05:58 UTC

                                                                                                                        On the Darwin thing: I have 5 succeeded and 3 failed on beta. On main-project Windows, 1 succeeded and 3 failed. (Plus, the current beta WUs are apparently exploring a different parameter range - just to add to the confusion over success/failure ratios.)

                                                                                                                        Profile JIM
                                                                                                                        Joined: Dec 31 07
                                                                                                                        Posts: 682
                                                                                                                        Credit: 4,224,379
                                                                                                                        RAC: 2,953
                                                                                                                        Message 40195 - Posted 21 Jul 2010 0:34:50 UTC

                                                                                                                          Famous_u0il_1799_200_006667077_3 finished successfully.
                                                                                                                          OS is Windows 7 64 bit running on Intel Core 2 Duo 2.2 GHz processor with 4 GB of RAM.
                                                                                                                          THREE IN A ROW AND COUNTING.
                                                                                                                          ____________

                                                                                                                          Profile geophi
                                                                                                                          Forum moderator
                                                                                                                          Joined: Aug 7 04
                                                                                                                          Posts: 1478
                                                                                                                          Credit: 23,133,788
                                                                                                                          RAC: 11,481
                                                                                                                          Message 40202 - Posted 22 Jul 2010 3:49:39 UTC - in response to Message 40115.

                                                                                                                            Updated as of July 21, on my systems here at cpdn...

                                                                                                                            Core i7 920 in Linux
                                                                                                                            8 completed, 10 failed, 4 in progress
                                                                                                                            Phenom II X4 940 in Linux
                                                                                                                            11 completed, 7 failed, 4 in progress
                                                                                                                            Core 2 E6420 in Windows
                                                                                                                            3 completed, 1 failed, 1 in progress

                                                                                                                            Profile Mike Francis
                                                                                                                            Joined: Nov 24 05
                                                                                                                            Posts: 1
                                                                                                                            Credit: 612,262
                                                                                                                            RAC: 0
                                                                                                                            Message 40208 - Posted 22 Jul 2010 13:32:36 UTC

                                                                                                                              Q6600 2.4gig running stock.
                                                                                                                              Windows XP 64 bit.
                                                                                                                              i Famous run to completion;
                                                                                                                              and then on second Famous;
                                                                                                                              7/22/2010 7:02:13 AM climateprediction.net Started upload of famous_u01x_1799_200_006632920_5_6.zip
                                                                                                                              7/22/2010 7:02:36 AM climateprediction.net Finished upload of famous_u01x_1799_200_006632920_5_6.zip
                                                                                                                              7/22/2010 8:12:12 AM climateprediction.net Sending scheduler request: To send trickle-up message.
                                                                                                                              7/22/2010 8:12:12 AM climateprediction.net Not reporting or requesting tasks
                                                                                                                              7/22/2010 8:12:14 AM climateprediction.net Scheduler request completed
                                                                                                                              7/22/2010 8:38:35 AM climateprediction.net Resuming task famous_u01x_1799_200_006632920_5 using famous version 611
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Computation for task famous_u01x_1799_200_006632920_5 finished
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_7.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_8.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_9.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_10.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_11.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_12.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_13.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_14.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_15.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_16.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_17.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_18.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_19.zip for task famous_u01x_1799_200_006632920_5 absent
                                                                                                                              7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_20.zip for task famous_u01x_1799_200_006632920_5 absent

                                                                                                                              ____________

                                                                                                                              Profile mo.v
                                                                                                                              Forum moderator
                                                                                                                              Joined: Sep 29 04
                                                                                                                              Posts: 2359
                                                                                                                              Credit: 7,177,874
                                                                                                                              RAC: 1,646
                                                                                                                              Message 40210 - Posted 22 Jul 2010 13:49:04 UTC

                                                                                                                                Last modified: 22 Jul 2010 13:50:06 UTC

                                                                                                                                Mike, all those messages about the missing files just mean that the model crashed before it could generate those files.
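If you want to see at a glance which tasks ended this way, a small Python sketch like the following can pull the "absent" lines out of a saved copy of the BOINC messages. The pattern is based only on the wording of the log lines quoted above, so treat it as an assumption about the format; `absent_outputs` is a made-up helper name.

```python
import re
from collections import defaultdict

# Matches the message wording seen in the quoted log; the timestamp and project prefix are ignored.
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the list of output files reported absent."""
    missing = defaultdict(list)
    for line in log_text.splitlines():
        m = ABSENT_RE.search(line)
        if m:
            missing[m.group(2)].append(m.group(1))
    return dict(missing)


# A few lines copied from the log excerpt above.
sample = """\
7/22/2010 7:02:36 AM climateprediction.net Finished upload of famous_u01x_1799_200_006632920_5_6.zip
7/22/2010 9:10:01 AM climateprediction.net Computation for task famous_u01x_1799_200_006632920_5 finished
7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_7.zip for task famous_u01x_1799_200_006632920_5 absent
7/22/2010 9:10:01 AM climateprediction.net Output file famous_u01x_1799_200_006632920_5_8.zip for task famous_u01x_1799_200_006632920_5 absent
"""

result = absent_outputs(sample)
for task, files in result.items():
    print(f"{task}: {len(files)} output file(s) absent")
```

A task with absent output files has stopped early, but the log alone does not say why; the reason is in the stderr output on the task's web page.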

                                                                                                                                Here's the crashed model's web page. If you click on stderr out + you'll see that it crashed because of NEGATIVE THETA, i.e. the crash was caused by the model's initial parameters. Nothing to worry about. The researchers want us to run them whether they crash or complete.
                                                                                                                                ____________
                                                                                                                                Cpdn news

                                                                                                                                Profile JIM
                                                                                                                                Joined: Dec 31 07
                                                                                                                                Posts: 682
                                                                                                                                Credit: 4,224,379
                                                                                                                                RAC: 2,953
                                                                                                                                Message 40213 - Posted 22 Jul 2010 18:37:08 UTC

                                                                                                                                  Famous_up5n_1599_200_00665446_1 and Famous_umvv_1999_200_006662502_5 both completed successfully.

                                                                                                                                  Famous_up5n_1599_200_00665446_1 OS is Windows 7 32 bit on a Core 2 Duo 1.5 GHz processor with 2 GB of RAM.

                                                                                                                                  Famous_umvv_1999_200_006662502_5 was run on Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                  This makes 5 successful completions in a row. Have they done something to improve stability, or did the scientists just front-load most of the WUs with extreme parameters (that are more likely to fail) into the very early batches?

                                                                                                                                  ____________

                                                                                                                                  Les Bayliss
                                                                                                                                  Forum moderator
                                                                                                                                  Joined: Sep 5 04
                                                                                                                                  Posts: 5428
                                                                                                                                  Credit: 9,074,925
                                                                                                                                  RAC: 1,934
                                                                                                                                  Message 40215 - Posted 22 Jul 2010 21:42:55 UTC - in response to Message 40213.

                                                                                                                                    As I said somewhere, the new model type takes us back to 2003-4, when the original 'slab' model was used.
                                                                                                                                    The only way to find out which values led to a long run was to try them, 'mark off' the values that caused early failures, and keep those that lasted the distance.
                                                                                                                                    And this is what is happening with these totally different Millennium models: try everything and see what happens.

                                                                                                                                    As it says here:

                                                                                                                                    Slogan : Historical climate records tell various stories — Let's test them all.

                                                                                                                                    And it also says:
                                                                                                                                    In addition to perturbations for internal physics parameters of the model and initial condition, this experiment requires a large number of forcing perturbations to deal with the large uncertainty in the historical forcings.


                                                                                                                                    The very first versions were more unstable, so more testing was done to find failure points, and compiler options were also changed.
                                                                                                                                    And each model 'name' series uses a different range of parameter values, and this affects the stability.

                                                                                                                                    Remember, this is a short term project, and lots of climatologists are poring over the results as they come in. On beta, Hiro is watching as each new trickle arrives. Well, several times a day. :)
                                                                                                                                    The current 'test' version, which has a name starting with s2..., is producing 'hot' results, and Hiro knows about these before they fail/complete.
                                                                                                                                    No doubt something similar is happening on this main site as well.


                                                                                                                                    ____________
                                                                                                                                    Backups: Here

                                                                                                                                    Profile JIM
                                                                                                                                    Send message
                                                                                                                                    Joined: Dec 31 07
                                                                                                                                    Posts: 682
                                                                                                                                    Credit: 4,224,379
                                                                                                                                    RAC: 2,953
                                                                                                                                    Message 40227 - Posted 23 Jul 2010 21:03:06 UTC

                                                                                                                                      Famous_up23_1999_200_006665318 finished successfully.
                                                                                                                                      OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.


                                                                                                                                      ____________

                                                                                                                                      Profile JIM
                                                                                                                                      Send message
                                                                                                                                      Joined: Dec 31 07
                                                                                                                                      Posts: 682
                                                                                                                                      Credit: 4,224,379
                                                                                                                                      RAC: 2,953
                                                                                                                                      Message 40269 - Posted 29 Jul 2010 15:05:05 UTC

                                                                                                                                        Famous_uky9_1599_200_006659996_3 failed at approx. 80% completion. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                        ____________

                                                                                                                                        [AF>france>pas-de-calais]symaski62
                                                                                                                                        Send message
                                                                                                                                        Joined: Aug 13 05
                                                                                                                                        Posts: 54
                                                                                                                                        Credit: 117,227
                                                                                                                                        RAC: 0
                                                                                                                                        Message 40289 - Posted 3 Aug 2010 7:16:44 UTC

                                                                                                                                          http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11460092

                                                                                                                                          process exited with code 22 (0x16, -234)

                                                                                                                                          Suspended CPDN Monitor - Suspend request from BOINC...

                                                                                                                                          Model crashed: ATM_DYN : INVALID THETA DETECTED.

                                                                                                                                          error
                                                                                                                                          ____________

                                                                                                                                          Profile Overtonesinger
                                                                                                                                          Send message
                                                                                                                                          Joined: Dec 30 05
                                                                                                                                          Posts: 5
                                                                                                                                          Credit: 940,357
                                                                                                                                          RAC: 0
                                                                                                                                          Message 40294 - Posted 3 Aug 2010 9:33:19 UTC

                                                                                                                                            I have also had many FAMOUS models crashing in the last few days, on an Intel Pentium Dual CPU "E2200" at 2.2 GHz (native).
                                                                                                                                            I have never had any models crash before and nothing has changed in the computer. It is perfectly stable.
                                                                                                                                            Is there some workaround for those crashes?

                                                                                                                                            computer:
                                                                                                                                            http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1051527

                                                                                                                                            Peace and Love!
                                                                                                                                            Filip
                                                                                                                                            ____________

                                                                                                                                            Profile mo.v
                                                                                                                                            Forum moderator
                                                                                                                                            Avatar
                                                                                                                                            Send message
                                                                                                                                            Joined: Sep 29 04
                                                                                                                                            Posts: 2359
                                                                                                                                            Credit: 7,177,874
                                                                                                                                            RAC: 1,646
                                                                                                                                            Message 40296 - Posted 3 Aug 2010 11:13:13 UTC

                                                                                                                                              Last modified: 3 Aug 2010 11:14:28 UTC

                                                                                                                                              Hello Overtonesinger

                                                                                                                                              I've looked at the results for computer 1051527 which has an excellent list of model completions.

                                                                                                                                              If you look at the web pages for the crashed FAMOUS models here and here and for each model click on + beside stderr you will see extra messages. Both models have exit code 22 and messages including NEGATIVE PRESSURE or INVALID THETA.

                                                                                                                                              FAMOUS models are experimenting with some very extreme parameter values. In some cases this causes model crashes. It is not the fault of the computer; it's part of the experiment and even the crashed models are useful for Hiro, the researcher. If the crash is caused by the model parameter values you usually see NEGATIVE PRESSURE or INVALID THETA messages.

                                                                                                                                              If you look at the workunit page for each crashed model (each model/task belongs to a workunit containing several copies of the same task) you see for example this. The processing of models depends on a combination of the computer's CPU (Intel or AMD) and its operating system (Windows, Linux or Mac/Darwin). Computers with the same combination usually all complete or all crash at the same processing moment. You will see that the two computers with Darwin crashed at the same moment, but not at the same moment as your computer which has Windows. The computer with Linux may complete the model.

                                                                                                                                              But if we look at the other workunit we find two computers that couldn't start the model. Their models have -226 and -185 exit codes. These mean there's a problem in those computers. Their firewall or antivirus is probably blocking Boinc.

                                                                                                                                              Don't try to back up or restore FAMOUS models that crash on a stable computer. They would crash again at the same processing moment.
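A minimal sketch of the triage described in this post, for anyone who wants to sort their own failed tasks. It assumes you have copied the exit code and the stderr text from the task's web page by hand; the function name and the three category labels are hypothetical, and the rules are only the rough ones given in this thread, not an official CPDN tool.

```python
# Hypothetical helper: the rules below are the rough guide from this thread,
# not an official CPDN or BOINC classification.

# stderr fragments that usually indicate a crash caused by the model's parameter values
MODEL_CRASH_MARKERS = ("NEGATIVE PRESSURE", "INVALID THETA")

# exit codes reported in this thread for tasks that could not start on the host
HOST_PROBLEM_EXIT_CODES = {-226, -185}


def classify_failure(exit_code: int, stderr_text: str) -> str:
    """Return a rough guess at why a FAMOUS task failed."""
    text = stderr_text.upper()
    if exit_code in HOST_PROBLEM_EXIT_CODES:
        # Probably a firewall or antivirus blocking BOINC on that computer.
        return "host problem"
    if exit_code == 22 and any(marker in text for marker in MODEL_CRASH_MARKERS):
        # Part of the experiment; restoring from a backup would crash again.
        return "model parameters"
    return "unknown"


if __name__ == "__main__":
    print(classify_failure(22, "Model crashed: ATM_DYN : INVALID THETA DETECTED."))
    print(classify_failure(-185, ""))
```

Anything that comes back as "unknown" still needs a human to read the full stderr output.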
                                                                                                                                              ____________
                                                                                                                                              Cpdn news

                                                                                                                                              Profile JIM
                                                                                                                                              Send message
                                                                                                                                              Joined: Dec 31 07
                                                                                                                                              Posts: 682
                                                                                                                                              Credit: 4,224,379
                                                                                                                                              RAC: 2,953
                                                                                                                                              Message 40322 - Posted 6 Aug 2010 19:20:50 UTC

                                                                                                                                                Famous_u1eh-1199_200_006634660_4 completed successfully.

                                                                                                                                                Famous_u42q_1799_200_006638125_5 failed and Famous_u57z_1799_200_ 006639610_3 failed at approx. 45% completion. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                                ____________

                                                                                                                                                [AF>france>pas-de-calais]symaski62
                                                                                                                                                Send message
                                                                                                                                                Joined: Aug 13 05
                                                                                                                                                Posts: 54
                                                                                                                                                Credit: 117,227
                                                                                                                                                RAC: 0
                                                                                                                                                Message 40330 - Posted 8 Aug 2010 22:08:02 UTC - in response to Message 40289.

                                                                                                                                                  http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11460092

                                                                                                                                                  process exited with code 22 (0x16, -234)

                                                                                                                                                  Suspended CPDN Monitor - Suspend request from BOINC...

                                                                                                                                                  Model crashed: ATM_DYN : INVALID THETA DETECTED.

                                                                                                                                                  error


                                                                                                                                                  Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy

                                                                                                                                                  update ^^

                                                                                                                                                  ____________

                                                                                                                                                  Profile astroWX
                                                                                                                                                  Forum moderator
                                                                                                                                                  Send message
                                                                                                                                                  Joined: Aug 5 04
                                                                                                                                                  Posts: 1304
                                                                                                                                                  Credit: 38,908,080
                                                                                                                                                  RAC: 23,052
                                                                                                                                                  Message 40335 - Posted 9 Aug 2010 3:56:25 UTC

                                                                                                                                                    At risk of jinxing myself (Superstitious? Who? Me?), I've had more FAMOUS successes than failures lately, both here and on Beta. (Fingers crossed ...) May that be, or soon become, true for everyone.
                                                                                                                                                    ____________
                                                                                                                                                    "We have met the enemy and he is us." -- Pogo
                                                                                                                                                    Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                                                                                                    Profile JIM
                                                                                                                                                    Send message
                                                                                                                                                    Joined: Dec 31 07
                                                                                                                                                    Posts: 682
                                                                                                                                                    Credit: 4,224,379
                                                                                                                                                    RAC: 2,953
                                                                                                                                                    Message 40336 - Posted 9 Aug 2010 5:09:17 UTC - in response to Message 40335.

                                                                                                                                                      I hate to say this but I felt the same way a little while back. Had 5 successes in a row. Since then I have had 3 crashes, with only 1 successful completion. I guess the law of averages is catching up with me.
                                                                                                                                                      ____________

                                                                                                                                                      littleBouncer
                                                                                                                                                      Avatar
                                                                                                                                                      Send message
                                                                                                                                                      Joined: Oct 21 04
                                                                                                                                                      Posts: 24
                                                                                                                                                      Credit: 207,633
                                                                                                                                                      RAC: 0
                                                                                                                                                      Message 40350 - Posted 11 Aug 2010 7:46:28 UTC

                                                                                                                                                        Last modified: 11 Aug 2010 7:54:21 UTC

                                                                                                                                                        First task was a success

                                                                                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11432527

                                                                                                                                                        but the second crashed:

                                                                                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11474733

                                                                                                                                                        I had the message that a certain .zip-file wasn't there (I can't remember the full file-name :( )

                                                                                                                                                        greetz from Switzerland
                                                                                                                                                        littleBouncer
                                                                                                                                                        ____________

                                                                                                                                                        Les Bayliss
                                                                                                                                                        Forum moderator
                                                                                                                                                        Send message
                                                                                                                                                        Joined: Sep 5 04
                                                                                                                                                        Posts: 5428
                                                                                                                                                        Credit: 9,074,925
                                                                                                                                                        RAC: 1,934
                                                                                                                                                        Message 40351 - Posted 11 Aug 2010 9:33:01 UTC

                                                                                                                                                          Missing zip file messages are normal when a model crashes - if the model hasn't progressed to the point where the file is created, then BOINC can't find it to upload it.

                                                                                                                                                          The real messages about the failure are on the web page for the model; click the + sign alongside stderr to see them.

                                                                                                                                                          DaveG27
                                                                                                                                                          Send message
                                                                                                                                                          Joined: Nov 8 06
                                                                                                                                                          Posts: 18
                                                                                                                                                          Credit: 2,425,895
                                                                                                                                                          RAC: 0
                                                                                                                                                          Message 40352 - Posted 11 Aug 2010 22:11:33 UTC

                                                                                                                                                            So far I have 21 successes, 10 failures (all "negative theta"), 6 in progress and 4 waiting to run (a reserve supply because of download problems).
                                                                                                                                                            Because failures run for a shorter time, they skew the results in the short term: the failure rate appears higher than it actually is, and the true ratio will only become apparent over the longer term.
                                                                                                                                                            Two of my machines run Linux and one runs Windows 7; the failure rates seem about the same.
                                                                                                                                                            When checking the other results in the work units of my failed models, I was surprised by the differences on Windows systems between XP, Vista and 7, both in whether the model failed and in how far it got. One would expect them all to fail at the same point, which they generally did when running the same OS.
                                                                                                                                                            Two of the work units had a lot of Linux machines: in one they all failed at the same point, in the other they were all different (you can't win).
                                                                                                                                                            Perhaps more research needs to be done to see whether this is a real effect and not just a coincidence in the ones I looked at.
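A rough sketch of the short-term skew described above, in Python. The model is deliberately simple and every number in it is an illustrative assumption, not project data: one model is started per day, a failing model crashes after `fail_days`, and a good model finishes after `success_days`.

```python
def observed_failure_fraction(day, true_fail_rate, fail_days, success_days):
    """Fraction of *finished* models that failed, as seen on a given day.

    Assumes one model is started per day from day 0. A failing model
    crashes `fail_days` after it starts; a good one finishes after
    `success_days`. All values are hypothetical.
    """
    finished_failures = true_fail_rate * max(0.0, day - fail_days)
    finished_successes = (1.0 - true_fail_rate) * max(0.0, day - success_days)
    finished = finished_failures + finished_successes
    if finished == 0:
        return None  # nothing has finished yet
    return finished_failures / finished


for day in (5, 20, 100, 1000):
    frac = observed_failure_fraction(
        day, true_fail_rate=0.3, fail_days=2, success_days=12
    )
    print(f"day {day:4d}: observed failure fraction = {frac:.3f}")
```

Early on only failures have had time to finish, so the observed fraction starts at 1.0 and then drifts down towards the assumed true rate of 0.3 as more successes complete.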

                                                                                                                                                            Profile JIM
                                                                                                                                                            Send message
                                                                                                                                                            Joined: Dec 31 07
                                                                                                                                                            Posts: 682
                                                                                                                                                            Credit: 4,224,379
                                                                                                                                                            RAC: 2,953
                                                                                                                                                            Message 40377 - Posted 17 Aug 2010 17:53:10 UTC

                                                                                                                                                              Famous_u6f5_1399_200_006641826_6 failed at appromx. 95 % on a machine running Windows7 64 bit with 2.2 GHz Core 2 Duo processor and 4 GB of RAM.

                                                                                                                                                              Famous_1399_200_006641826_6 completed successfully on a machine running Windows7 64 bit with 2.2 GHz Core 2 Duo processor and 4 GB of RAM.

                                                                                                                                                              Famous_u34a_1799_200_006636885_2 completed successfully on machine running Windows7 32 bit with Core 2 Duo 1.5 GHz processor and 2 GB of RAM.

                                                                                                                                                              Famous_u34a_1799_200_006636891_1 completed successfully on a machine running Windows7 32 bit with Core 2 Duo 1.5 GHz processor and 2 GB of RAM.

                                                                                                                                                              ____________

                                                                                                                                                              Profile Strathpeffer
                                                                                                                                                              Send message
                                                                                                                                                              Joined: Jan 9 07
                                                                                                                                                              Posts: 497
                                                                                                                                                              Credit: 342,899
                                                                                                                                                              RAC: 179
                                                                                                                                                              Message 40378 - Posted 17 Aug 2010 18:02:17 UTC

                                                                                                                                                                Last modified: 17 Aug 2010 18:03:33 UTC

                                                                                                                                                                Don't know if this is of any interest but it might be, because team Scotland members have quite a good record of completing long models, from BBC onwards. From this page of Iansm's brilliant stats for the team, it can be seen that, of 484 FAMOUS models issued to team members to date, 210 have completed and 170 have failed.
                                                                                                                                                                ____________
                                                                                                                                                                Visit the Scotland team
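For anyone wanting to turn the team figures quoted above (484 issued, 210 completed, 170 failed) into ratios, here is a minimal Python sketch. The function name and the choice of ratios are my own; the models not yet counted in either column are simply reported as "unfinished", since the post does not say what state they are in.

```python
def summarize(issued, completed, failed):
    """Basic ratios from issued/completed/failed counts."""
    finished = completed + failed
    return {
        "unfinished": issued - finished,             # neither completed nor failed yet
        "success_rate": completed / finished,        # share of finished models that completed
        "failures_per_success": failed / completed,  # failures for every completed model
    }


stats = summarize(issued=484, completed=210, failed=170)
print(f"unfinished:           {stats['unfinished']}")
print(f"success rate:         {stats['success_rate']:.1%}")
print(f"failures per success: {stats['failures_per_success']:.2f}")
```

Note that the success rate is taken over finished models only, so it is subject to the same short-term skew as any running tally: failures finish sooner than successes.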

                                                                                                                                                                Profile JIM
                                                                                                                                                                Send message
                                                                                                                                                                Joined: Dec 31 07
                                                                                                                                                                Posts: 682
                                                                                                                                                                Credit: 4,224,379
                                                                                                                                                                RAC: 2,953
                                                                                                                                                                Message 40385 - Posted 19 Aug 2010 7:02:42 UTC

                                                                                                                                                                  Famous_ua0y_1799_200_006636885_2 failed at appromx. 12% running on 2.2 GHz Core 2 Duo processor running Windows 7 64 bit. At least this one had the good grace to fail early (36 hours) and not after 11 days (95%) of running.

                                                                                                                                                                  ____________

                                                                                                                                                                  Profile Greg van Paassen
                                                                                                                                                                  Send message
                                                                                                                                                                  Joined: Nov 17 07
                                                                                                                                                                  Posts: 142
                                                                                                                                                                  Credit: 4,271,370
                                                                                                                                                                  RAC: 0
                                                                                                                                                                  Message 40389 - Posted 19 Aug 2010 20:00:53 UTC

                                                                                                                                                                    Just had one of mine fail at about 34%, with a different error this time - i.e. not "invalid theta": famous_ubod_599_200_006647976_2.

                                                                                                                                                                    The error was

                                                                                                                                                                    SETPOS: Seek Failed: Invalid argument
                                                                                                                                                                    SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1

                                                                                                                                                                    Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1

                                                                                                                                                                    repeated 6 times. Same exit code 22, though.

                                                                                                                                                                    This breaks a run of 7 successes. Totals so far: 17 completed, 9 failed (plus 3 "download errors" from the server glitch back in June).

                                                                                                                                                                    Profile Greg van Paassen
                                                                                                                                                                    Send message
                                                                                                                                                                    Joined: Nov 17 07
                                                                                                                                                                    Posts: 142
                                                                                                                                                                    Credit: 4,271,370
                                                                                                                                                                    RAC: 0
                                                                                                                                                                    Message 40390 - Posted 19 Aug 2010 20:10:04 UTC - in response to Message 40389.

                                                                                                                                                                      Just a note on those 3 "download errors": Two of them didn't get processed at all:

                                                                                                                                                                      famous_uopf_1599_200_006664862 and famous_uopj_1799_200_006664866

                                                                                                                                                                      I wonder how many more work units are like that, and whether it will be a problem for the experiment?

                                                                                                                                                                      Les Bayliss
                                                                                                                                                                      Forum moderator
                                                                                                                                                                      Send message
                                                                                                                                                                      Joined: Sep 5 04
                                                                                                                                                                      Posts: 5428
                                                                                                                                                                      Credit: 9,074,925
                                                                                                                                                                      RAC: 1,934
                                                                                                                                                                      Message 40392 - Posted 19 Aug 2010 20:50:18 UTC

                                                                                                                                                                        Greg

                                                                                                                                                                        Your recent failure was Invalid theta. The other messages are most likely what happened when the program was suddenly diverted to a different (incorrect) area of code by the failure. The researchers will pick it up when looking through the lists, so not a problem for you.

                                                                                                                                                                        The models that didn't arrive due to download errors are called phantom models.
                                                                                                                                                                        And they are a problem to the project, because there's less chance of that particular combination getting processed by someone else. (No chance, if all of the batch failed to download.)

                                                                                                                                                                        If the area of parameter space involved with the download problems at that time is important enough to which ever physicists are running those models, then they'll request that they be included again at some point.

                                                                                                                                                                        Profile Greg van Paassen
                                                                                                                                                                        Send message
                                                                                                                                                                        Joined: Nov 17 07
                                                                                                                                                                        Posts: 142
                                                                                                                                                                        Credit: 4,271,370
                                                                                                                                                                        RAC: 0
                                                                                                                                                                        Message 40394 - Posted 20 Aug 2010 5:28:56 UTC - in response to Message 40392.

                                                                                                                                                                          Les - well, maybe. The model ran for about 20 hours after the third and last "Invalid Theta" message appeared in stderr.txt. (Note to programmers: it'd be handy if error messages were timestamped.)

                                                                                                                                                                          All of my Famous models have logged at least one "invalid theta" message, but the majority go on to completion. I guess the code's "back up and re-try" works ;-).

                                                                                                                                                                          As well as the "download error" models, I have two "normal" phantoms: famous_u0ch_1999_200_006633300_5 and famous_ulrv_799_200_006661062_0

                                                                                                                                                                          These phantoms are "In Progress" according to the web site, but never made it to my machine. I recall watching (in the Boinc Manager) one of the download files, for u0ch, get to about 90% downloaded - and then just vanish. Not to worry: someone else managed a complete run for that work unit.
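On the timestamp suggestion above: the idea is just to prefix each message with the time it was written, so that the gap between an "Invalid Theta" message and the eventual crash can be read straight from stderr.txt. A minimal sketch in Python follows; this is not the project's code (the function name `log_error` and the timestamp format are my own assumptions), only an illustration of the idea.

```python
import sys
from datetime import datetime, timezone


def log_error(message, stream=sys.stderr):
    """Write one error line prefixed with the current UTC time."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    stream.write(f"{stamp} {message}\n")


# Example: the message text is the one quoted in this thread.
log_error("ATM_DYN : INVALID THETA DETECTED")
```

With lines stamped like this, a 20-hour gap between the last warning and the crash would be visible without having to watch the run.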

                                                                                                                                                                          [AF>france>pas-de-calais]symaski62
                                                                                                                                                                          Send message
                                                                                                                                                                          Joined: Aug 13 05
                                                                                                                                                                          Posts: 54
                                                                                                                                                                          Credit: 117,227
                                                                                                                                                                          RAC: 0
                                                                                                                                                                          Message 40401 - Posted 22 Aug 2010 18:02:19 UTC

                                                                                                                                                                            http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6856284

                                                                                                                                                                            hello ^^

                                                                                                                                                                            error 6

                                                                                                                                                                            O_o
                                                                                                                                                                            ____________

                                                                                                                                                                            Profile JIM
                                                                                                                                                                            Send message
                                                                                                                                                                            Joined: Dec 31 07
                                                                                                                                                                            Posts: 682
                                                                                                                                                                            Credit: 4,224,379
                                                                                                                                                                            RAC: 2,953
                                                                                                                                                                            Message 40434 - Posted 27 Aug 2010 0:37:45 UTC

                                                                                                                                                                              Famous_ueet_999_200_006651520_4 failed. Reason: Model crashed: ATM_DYN : INVALID THETA DETECTED. Computer is Windows 7 64 bit with Intel Core 2 DUO 2.2 GHz processor with 4 GB of RAM.


                                                                                                                                                                              ____________

                                                                                                                                                                              Profile Ananas
                                                                                                                                                                              Forum moderator
                                                                                                                                                                              Send message
                                                                                                                                                                              Joined: Oct 31 04
                                                                                                                                                                              Posts: 336
                                                                                                                                                                              Credit: 3,316,482
                                                                                                                                                                              RAC: 0
                                                                                                                                                                              Message 40437 - Posted 27 Aug 2010 20:36:48 UTC

                                                                                                                                                                                Model crashed: ATM_DYN : INVALID THETA DETECTED. Three results of that WU have already done that.

                                                                                                                                                                                I still have 7 active Famous 6.11 and a bunch of finished ones on that box. Besides the one mentioned here no errors so far.

                                                                                                                                                                                Profile JIM
                                                                                                                                                                                Send message
                                                                                                                                                                                Joined: Dec 31 07
                                                                                                                                                                                Posts: 682
                                                                                                                                                                                Credit: 4,224,379
                                                                                                                                                                                RAC: 2,953
                                                                                                                                                                                Message 40446 - Posted 29 Aug 2010 3:10:29 UTC

                                                                                                                                                                                  Famous_u9rf_1599_200_006645494_3 finished successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                                                                  ____________

                                                                                                                                                                                  [AF>france>pas-de-calais]symaski62
                                                                                                                                                                                  Send message
                                                                                                                                                                                  Joined: Aug 13 05
                                                                                                                                                                                  Posts: 54
                                                                                                                                                                                  Credit: 117,227
                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                  Message 40447 - Posted 29 Aug 2010 4:34:35 UTC

                                                                                                                                                                                    http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11515402

                                                                                                                                                                                    22 error ^^
                                                                                                                                                                                    The device does not recognize the command. (0x16) - exit code 22 (0x16)



                                                                                                                                                                                    28-Aug-2010 22:01:05 [climateprediction.net] Started upload of famous_ufhh_1599_200_006652912_4_8.zip
                                                                                                                                                                                    28-Aug-2010 22:01:06 [climateprediction.net] Sending scheduler request: To send trickle-up message.
                                                                                                                                                                                    28-Aug-2010 22:01:06 [climateprediction.net] Not reporting or requesting tasks
                                                                                                                                                                                    28-Aug-2010 22:01:12 [climateprediction.net] Scheduler request completed
                                                                                                                                                                                    28-Aug-2010 22:04:20 [climateprediction.net] Finished upload of famous_ufhh_1599_200_006652912_4_8.zip
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Computation for task famous_ufhh_1599_200_006652912_4 finished
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_9.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_10.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_11.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_12.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_13.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_14.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_15.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_16.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_17.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_18.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_19.zip for task famous_ufhh_1599_200_006652912_4 absent
                                                                                                                                                                                    28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_20.zip for task famous_ufhh_1599_200_006652912_4 absent
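For anyone who wants to tally the "absent" lines in a log like the one above rather than count them by eye, here is a minimal Python sketch. The regular expression is only an assumption based on the wording of the lines quoted in this post, and `absent_outputs` and `SAMPLE_LOG` are hypothetical names made up for illustration; none of this is part of BOINC or the CPDN software.

```python
import re
from collections import defaultdict

# Assumed pattern, derived only from the "Output file ... absent" lines
# quoted in this post; other client versions may word the message differently.
ABSENT_RE = re.compile(
    r"Output file (?P<file>\S+\.zip) for task (?P<task>\S+) absent"
)


def absent_outputs(log_text):
    """Map each task name to the list of output files reported absent."""
    missing = defaultdict(list)
    for line in log_text.splitlines():
        match = ABSENT_RE.search(line)
        if match:
            missing[match.group("task")].append(match.group("file"))
    return dict(missing)


# A short excerpt of the log above, used as sample input.
SAMPLE_LOG = """\
28-Aug-2010 22:04:20 [climateprediction.net] Finished upload of famous_ufhh_1599_200_006652912_4_8.zip
28-Aug-2010 23:10:23 [climateprediction.net] Computation for task famous_ufhh_1599_200_006652912_4 finished
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_9.zip for task famous_ufhh_1599_200_006652912_4 absent
28-Aug-2010 23:10:23 [climateprediction.net] Output file famous_ufhh_1599_200_006652912_4_10.zip for task famous_ufhh_1599_200_006652912_4 absent
"""

if __name__ == "__main__":
    for task, files in absent_outputs(SAMPLE_LOG).items():
        print(f"{task}: {len(files)} output file(s) absent")
```

Run against the full excerpt above, this would list files 9 through 20 as absent for the one task, i.e. every output after the last finished upload.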

                                                                                                                                                                                    ____________

                                                                                                                                                                                    Profile Ananas
                                                                                                                                                                                    Forum moderator
                                                                                                                                                                                    Joined: Oct 31 04
                                                                                                                                                                                    Posts: 336
                                                                                                                                                                                    Credit: 3,316,482
                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                    Message 40451 - Posted 29 Aug 2010 20:39:16 UTC - in response to Message 40447.

                                                                                                                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11515402

                                                                                                                                                                                      22 error ^^
                                                                                                                                                                                      The device does not recognize the command. (0x16) - exit code 22 (0x16)


                                                                                                                                                                                      ...


                                                                                                                                                                                      This is a "Theta" issue too; the file transfer errors are just a result of that Theta thing.

                                                                                                                                                                                      Misty
                                                                                                                                                                                      Joined: Feb 14 06
                                                                                                                                                                                      Posts: 24
                                                                                                                                                                                      Credit: 2,842,313
                                                                                                                                                                                      RAC: 1,510
                                                                                                                                                                                      Message 40623 - Posted 8 Sep 2010 16:07:08 UTC

                                                                                                                                                                                        Success/failure ratio rises as 'no go' parameter space is identified and avoided, but if combinations of physically-plausible parameter values fail then does this suggest that the general model is not robust?

                                                                                                                                                                                        Profile astroWX
                                                                                                                                                                                        Forum moderator
                                                                                                                                                                                        Joined: Aug 5 04
                                                                                                                                                                                        Posts: 1304
                                                                                                                                                                                        Credit: 38,908,080
                                                                                                                                                                                        RAC: 23,052
                                                                                                                                                                                        Message 40625 - Posted 8 Sep 2010 18:09:34 UTC

                                                                                                                                                                                          Last modified: 8 Sep 2010 18:14:13 UTC

                                                                                                                                                                                          Hypothetical question. In general, the researchers don't know which combinations of perturbed parameters are plausible until they're tried and have identical failures, or similar completions, within a Task. (We're still testing this in Beta.) The range of possible parameter combinations and perturbations is vast.

                                                                                                                                                                                          The Models we run are not untested. They were developed by the U.K. MetOffice and are used in regular weather and climate applications; our task in Beta is to test the envelope that allows a SuperComputer Model to run on a PC, as well as parameter ranges. (CPDN's goal is not "the" solution for the "climate problem." Rather, it is to understand a reasonable range. There is quite a bit of Project and science background information on the other Boards, starting with the home page. http://climateprediction.net/)

                                                                                                                                                                                          Edit: Added hot link.
                                                                                                                                                                                          ____________
                                                                                                                                                                                          "We have met the enemy and he is us." -- Pogo
                                                                                                                                                                                          Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                                                                                                                                          Misty
                                                                                                                                                                                          Joined: Feb 14 06
                                                                                                                                                                                          Posts: 24
                                                                                                                                                                                          Credit: 2,842,313
                                                                                                                                                                                          RAC: 1,510
                                                                                                                                                                                          Message 40630 - Posted 8 Sep 2010 22:18:07 UTC

                                                                                                                                                                                            Thanks for your response. I posted here because it related to the thread subject, albeit with wider implications.

                                                                                                                                                                                            Maybe it's pointless to pursue this if the question is considered naive, ill-informed or trivial. Do you mean that because the model is validated already then combinations of physically-plausible parameter values (i.e. values that are realistic on the basis of physical measurements) will not fail, or are some model parameters not directly related to physical measurements?

                                                                                                                                                                                            Les Bayliss
                                                                                                                                                                                            Forum moderator
                                                                                                                                                                                            Joined: Sep 5 04
                                                                                                                                                                                            Posts: 5428
                                                                                                                                                                                            Credit: 9,074,925
                                                                                                                                                                                            RAC: 1,934
                                                                                                                                                                                            Message 40631 - Posted 8 Sep 2010 22:30:42 UTC - in response to Message 40630.

                                                                                                                                                                                              The model being validated just means that the program software is OK as far as is known. But that's with the combinations of hardware and software that the testers used.

                                                                                                                                                                                              All 'climate' parameters/values can fail if used in certain combinations. Or if the models were to be run for longer periods.

                                                                                                                                                                                              If the models DON'T fail from instability, then they can still do so because of the hardware/software used on the computer running the model.
                                                                                                                                                                                              e.g. Some people overclock their computers and say that they're still stable. But the Floating Point Unit, (FPU), that is used for lots of calculations may have trouble providing data at the faster rate, and give values that cause the model to be slightly different to what it would be if the computer wasn't overclocked. And, over time, these slight differences add up.
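To illustrate how slight differences can add up over time, here is a toy Python sketch. It is not the climate model and says nothing about any particular FPU: it uses the logistic map, a standard textbook example of a chaotic system, and the tiny `perturbation` merely stands in for a calculation that came out very slightly different. The names `logistic_step` and `first_divergence`, and the values chosen for `r`, the starting point and the threshold, are all assumptions made for illustration.

```python
def logistic_step(x, r=3.9):
    """One step of the logistic map, a simple chaotic toy system."""
    return r * x * (1.0 - x)


def first_divergence(x0, perturbation, threshold=0.1, max_steps=1000):
    """Return the first step at which two runs differ by more than threshold.

    One run starts at x0, the other at x0 + perturbation.
    Returns None if the runs never separate within max_steps.
    """
    a, b = x0, x0 + perturbation
    for step in range(1, max_steps + 1):
        a, b = logistic_step(a), logistic_step(b)
        if abs(a - b) > threshold:
            return step
    return None


if __name__ == "__main__":
    # A difference in the twelfth decimal place is enough to separate the runs.
    print("perturbed run diverges at step:", first_divergence(0.4, 1e-12))
    # Identical starting values stay identical, so this prints None.
    print("identical run diverges at step:", first_divergence(0.4, 0.0))
```

The point of the sketch is only that two runs starting almost identically end up visibly different after a modest number of steps, while two runs with bit-identical arithmetic never separate.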


                                                                                                                                                                                              ____________
                                                                                                                                                                                              Backups: Here

                                                                                                                                                                                              Misty
                                                                                                                                                                                              Joined: Feb 14 06
                                                                                                                                                                                              Posts: 24
                                                                                                                                                                                              Credit: 2,842,313
                                                                                                                                                                                              RAC: 1,510
                                                                                                                                                                                              Message 40632 - Posted 8 Sep 2010 23:10:42 UTC

                                                                                                                                                                                                Perhaps I'm not making myself clear. By 'validated' I mean that given 'sensible' inputs the model generates sensible outputs, for example making accurate predictions from historical data sets. Repeat runs of a parameter combination will presumably identify variation due to software-hardware interaction. But is there a straightforward answer to my question?

                                                                                                                                                                                                Profile astroWX
                                                                                                                                                                                                Forum moderator
                                                                                                                                                                                                Joined: Aug 5 04
                                                                                                                                                                                                Posts: 1304
                                                                                                                                                                                                Credit: 38,908,080
                                                                                                                                                                                                RAC: 23,052
                                                                                                                                                                                                Message 40633 - Posted 8 Sep 2010 23:35:55 UTC

                                                                                                                                                                                                  Back when we ran the original 200-year ocean Spinups for the 180-year HadCM3 Tasks, there was a baseline, unperturbed, Task thrown into the mix. On the other hand, none of the Spinups had particularly aggressive parameters because the goal was a set of ocean files to put into HadCM3 Tasks, so every participant wouldn't have to run that nearly four months of work to get to the three-plus-month Task at hand. If I recall correctly, the Spinups didn't crash - unless the computer did it (as one of mine did, within hours of completion after nearly four months on a Pentium-4, thanks to a power glitch that found its way to the machine despite a UPS unit [fortunately, I made daily backups]).

                                                                                                                                                                                                  Except for the aside about my machine, is that within range of what you are getting at? (I confess to not understanding what you really want to know.)
                                                                                                                                                                                                  ____________
                                                                                                                                                                                                  "We have met the enemy and he is us." -- Pogo
                                                                                                                                                                                                  Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                                                                                                                                                  Profile geophi
                                                                                                                                                                                                  Forum moderator
                                                                                                                                                                                                  Joined: Aug 7 04
                                                                                                                                                                                                  Posts: 1478
                                                                                                                                                                                                  Credit: 23,133,788
                                                                                                                                                                                                  RAC: 11,481
                                                                                                                                                                                                  Message 40635 - Posted 9 Sep 2010 2:51:08 UTC - in response to Message 40623.

                                                                                                                                                                                                    Success/failure ratio rises as 'no go' parameter space is identified and avoided, but if combinations of physically-plausible parameter values fail then does this suggest that the general model is not robust?

                                                                                                                                                                                                    It is sometimes challenging to state what a physically plausible parameter value is. Processes (like thunderstorms or individual clouds) that are too small in scale to model on the large grid scale of the model have to be parameterized. The text below, from the basic experiment strategy for older models, describes these parameters. Individual links within this text take you to further explanations of parameters:
                                                                                                                                                                                                    Parameters

                                                                                                                                                                                                    Every climate model has to make a number of approximations, called parameterisations. To read more about these, click here. Basically this means that there are numbers in the model which are given a certain, fixed value, but this value is not known for sure and a range of values could be equally realistic. The experiments will investigate the effect on the modelled climate of varying the value of 20 of the most poorly understood parameters in the model - such as the relationship between the number of raindrops in a cloud and how much it actually rains (to see what they are, click here). It is possible that some combinations of parameters may replicate the past climate equally well, but produce widely different forecasts for what might happen in the future. Some combinations of parameters will not work at all, produce a completely unrealistic climate (for example an Earth that boils or freezes, or oscillates between very hot and very cold every couple of years) and probably crash the model. It is not possible for us to tell beforehand what these combinations will be.


                                                                                                                                                                                                    And this is a very good description of the millennium experiment which talks about why some models in this experiment are expected to fail.
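The idea in the quoted strategy (try many parameter combinations, and see which runs complete and which crash) can be sketched in a few lines of Python. Everything here is hypothetical: `toy_model` is a two-parameter stand-in that has nothing to do with the real FAMOUS or HadCM3 code, and the parameter names `gain` and `forcing`, their candidate values, and the crash threshold are invented for illustration.

```python
import itertools


class ModelCrash(Exception):
    """Raised when the toy model becomes numerically unstable."""


def toy_model(gain, forcing, steps=200):
    """Relax a temperature anomaly toward `forcing`.

    Each step moves the anomaly by gain * (forcing - anomaly). The run
    settles down when 0 < gain < 2 and blows up when gain > 2.
    """
    anomaly = 0.0
    for _ in range(steps):
        anomaly = anomaly + gain * (forcing - anomaly)
        if abs(anomaly) > 100.0:
            raise ModelCrash(f"unstable with gain={gain}, forcing={forcing}")
    return anomaly


def sweep(gains, forcings):
    """Run every (gain, forcing) combination and sort it by outcome."""
    completed, crashed = [], []
    for gain, forcing in itertools.product(gains, forcings):
        try:
            toy_model(gain, forcing)
            completed.append((gain, forcing))
        except ModelCrash:
            crashed.append((gain, forcing))
    return completed, crashed


if __name__ == "__main__":
    completed, crashed = sweep([0.5, 1.5, 2.5], [1.0, 2.0, 4.0])
    print(f"completed: {len(completed)}, crashed: {len(crashed)}")
```

In this toy the unstable region is known in advance, which is exactly what is not true of the real model. As a rough sense of scale, if each of the 20 parameters mentioned above took just three values (an assumed count, not a figure from the project), there would be 3**20 = 3,486,784,401 combinations, so not every one can be tried.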

                                                                                                                                                                                                    Misty
                                                                                                                                                                                                    Joined: Feb 14 06
                                                                                                                                                                                                    Posts: 24
                                                                                                                                                                                                    Credit: 2,842,313
                                                                                                                                                                                                    RAC: 1,510
                                                                                                                                                                                                    Message 40643 - Posted 9 Sep 2010 13:33:24 UTC

                                                                                                                                                                                                      Thank you all. I'm a bit closer to understanding now.

                                                                                                                                                                                                      Profile JIM
                                                                                                                                                                                                      Joined: Dec 31 07
                                                                                                                                                                                                      Posts: 682
                                                                                                                                                                                                      Credit: 4,224,379
                                                                                                                                                                                                      RAC: 2,953
                                                                                                                                                                                                      Message 40646 - Posted 9 Sep 2010 18:27:14 UTC

                                                                                                                                                                                                        Famous_u9d4_599_200_006644979_1 completed successfully.
                                                                                                                                                                                                        OS is Win7 32 bit running on a Core 2 Duo 1.5 GHz processor with 2 GB of RAM.

                                                                                                                                                                                                        ____________

                                                                                                                                                                                                        Profile JIM
                                                                                                                                                                                                        Joined: Dec 31 07
                                                                                                                                                                                                        Posts: 682
                                                                                                                                                                                                        Credit: 4,224,379
                                                                                                                                                                                                        RAC: 2,953
                                                                                                                                                                                                        Message 40655 - Posted 11 Sep 2010 5:03:09 UTC

                                                                                                                                                                                                          Famous_u9no_1399_200_006645359_3 finished successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                                                                                          ____________

                                                                                                                                                                                                          Profile Strathpeffer
                                                                                                                                                                                                          Joined: Jan 9 07
                                                                                                                                                                                                          Posts: 497
                                                                                                                                                                                                          Credit: 342,899
                                                                                                                                                                                                          RAC: 179
                                                                                                                                                                                                          Message 40698 - Posted 17 Sep 2010 18:43:51 UTC

                                                                                                                                                                                                            Sorry to report that my Famous_ubdx_599_200_006647600_0 has crashed with an "unrecoverable error" :-(
                                                                                                                                                                                                            ____________
                                                                                                                                                                                                            Visit the Scotland team

                                                                                                                                                                                                            Les Bayliss
                                                                                                                                                                                                            Forum moderator
                                                                                                                                                                                                            Joined: Sep 5 04
                                                                                                                                                                                                            Posts: 5428
                                                                                                                                                                                                            Credit: 9,074,925
                                                                                                                                                                                                            RAC: 1,934
                                                                                                                                                                                                            Message 40700 - Posted 17 Sep 2010 20:07:39 UTC - in response to Message 40698.

                                                                                                                                                                                                              Or more explicitly, with: INVALID THETA

                                                                                                                                                                                                              ____________
                                                                                                                                                                                                              Backups: Here

                                                                                                                                                                                                              Profile JIM
                                                                                                                                                                                                              Joined: Dec 31 07
                                                                                                                                                                                                              Posts: 682
                                                                                                                                                                                                              Credit: 4,224,379
                                                                                                                                                                                                              RAC: 2,953
                                                                                                                                                                                                              Message 40709 - Posted 18 Sep 2010 4:06:39 UTC

                                                                                                                                                                                                                Famous_ufb3_999_200_006652682_2 completed successfully. OS is Windows 7 64 bit running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                                                                                                Profile Strathpeffer
                                                                                                                                                                                                                Joined: Jan 9 07
                                                                                                                                                                                                                Posts: 497
                                                                                                                                                                                                                Credit: 342,899
                                                                                                                                                                                                                RAC: 179
                                                                                                                                                                                                                Message 40744 - Posted 22 Sep 2010 16:46:41 UTC - in response to Message 40700.

                                                                                                                                                                                                                  Last modified: 22 Sep 2010 16:46:55 UTC

                                                                                                                                                                                                                  Or more explicitly, with: INVALID THETA

                                                                                                                                                                                                                  Thanks Les, that info wasn't yet showing when I first posted. When the "Invalid Theta" message did appear, I meant to come back and amend my post but got kinda sidetracked, as happens around here! Thanks for clarifying. ;-)
                                                                                                                                                                                                                  ____________
                                                                                                                                                                                                                  Visit the Scotland team

                                                                                                                                                                                                                  Keith Scott
                                                                                                                                                                                                                  Joined: Feb 20 06
                                                                                                                                                                                                                  Posts: 158
                                                                                                                                                                                                                  Credit: 1,251,176
                                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                                  Message 40745 - Posted 22 Sep 2010 20:16:07 UTC

                                                                                                                                                                                                                    27 tasks finished by MacBookPro Intel Core Duo 2.16 GHz running Darwin 9.8.0

                                                                                                                                                                                                                    Result                   u series   v series
                                                                                                                                                                                                                    Completed                    6          1
                                                                                                                                                                                                                    Error while computing       11          9
                                                                                                                                                                                                                    Totals                      17         10

                                                                                                                                                                                                                    Only one was for year 599, and was a completed u series task.

                                                                                                                                                                                                                    2 v series In progress have been excluded as also have v series 2 ghosts, which are "in progress" due to a resetting of CPDN.

                                                                                                                                                                                                                    Keith
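For anyone wanting to turn a tally like the one above into a success rate, here is a minimal Python sketch. The counts are copied from the table in this post; the names `tally`, `success_rate` and `summarize` are made up for illustration and are not part of any CPDN or BOINC tool.

```python
# Counts copied from the post above: completed and failed tasks per series.
tally = {
    "u": {"completed": 6, "failed": 11},
    "v": {"completed": 1, "failed": 9},
}


def success_rate(completed, failed):
    """Return the fraction of finished tasks that completed successfully."""
    total = completed + failed
    return completed / total if total else 0.0


def summarize(counts_by_series):
    """Return the success rate in percent per series, plus an overall figure."""
    rows = {}
    for series, counts in counts_by_series.items():
        rate = success_rate(counts["completed"], counts["failed"])
        rows[series] = round(100 * rate, 1)
    all_completed = sum(c["completed"] for c in counts_by_series.values())
    all_failed = sum(c["failed"] for c in counts_by_series.values())
    rows["all"] = round(100 * success_rate(all_completed, all_failed), 1)
    return rows


if __name__ == "__main__":
    for series, percent in summarize(tally).items():
        print(f"{series}: {percent}% completed")
```

In-progress tasks and ghosts are left out, as in the post, so the rates only cover tasks that actually finished one way or the other.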

                                                                                                                                                                                                                    Profile John Hunt
                                                                                                                                                                                                                    Joined: Mar 5 05
                                                                                                                                                                                                                    Posts: 64
                                                                                                                                                                                                                    Credit: 452,382
                                                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                                                    Message 40770 - Posted 25 Sep 2010 6:39:06 UTC

                                                                                                                                                                                                                      Famous_uiav_599_200_006656562_1 completed on Core2Quad Q6600 @2.4GHz Windows XP Home.

                                                                                                                                                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11533655

                                                                                                                                                                                                                      Profile JIM
                                                                                                                                                                                                                      Joined: Dec 31 07
                                                                                                                                                                                                                      Posts: 682
                                                                                                                                                                                                                      Credit: 4,224,379
                                                                                                                                                                                                                      RAC: 2,953
                                                                                                                                                                                                                      Message 40771 - Posted 25 Sep 2010 7:39:48 UTC

                                                                                                                                                                                                                        Last modified: 25 Sep 2010 7:40:26 UTC

                                                                                                                                                                                                                        Famous_v0ta_1799_200_006686828_3 failed. Reason “Invalid Theta”. OS is Windows 7 32 bit running on a Core 2 Duo 1.5 GHz processor with 2 GB of RAM.

                                                                                                                                                                                                                        Profile JIM
                                                                                                                                                                                                                        Joined: Dec 31 07
                                                                                                                                                                                                                        Posts: 682
                                                                                                                                                                                                                        Credit: 4,224,379
                                                                                                                                                                                                                        RAC: 2,953
                                                                                                                                                                                                                        Message 40773 - Posted 25 Sep 2010 17:19:07 UTC

                                                                                                                                                                                                                          Last modified: 25 Sep 2010 17:33:00 UTC

                                                                                                                                                                                                                          Famous_ubr6_1799_200_006648077_5 failed. Reason invalid theta. OS is Windows 7 64 bit SP1 beta running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.
                                                                                                                                                                                                                          ____________

                                                                                                                                                                                                                          Profile Greg van Paassen
                                                                                                                                                                                                                          Joined: Nov 17 07
                                                                                                                                                                                                                          Posts: 142
                                                                                                                                                                                                                          Credit: 4,271,370
                                                                                                                                                                                                                          RAC: 0
                                                                                                                                                                                                                          Message 40774 - Posted 25 Sep 2010 22:38:57 UTC

                                                                                                                                                                                                                            famous_ue4u_799_200_006651161_6 failed at 84%, win7-intel. Nothing unusual about the temperature chart.[/url]

                                                                                                                                                                                                                            [AF>france>pas-de-calais]symaski62
                                                                                                                                                                                                                            Joined: Aug 13 05
                                                                                                                                                                                                                            Posts: 54
                                                                                                                                                                                                                            Credit: 117,227
                                                                                                                                                                                                                            RAC: 0
                                                                                                                                                                                                                            Message 40795 - Posted 29 Sep 2010 7:23:15 UTC

                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11696619


                                                                                                                                                                                                                              BUFFIN: Read Failed: No such file or directory
                                                                                                                                                                                                                              BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

                                                                                                                                                                                                                              BUFFIN: Read Failed: No such file or directory
                                                                                                                                                                                                                              BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

                                                                                                                                                                                                                              BUFFIN: Read Failed: No such file or directory
                                                                                                                                                                                                                              BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

                                                                                                                                                                                                                              BUFFIN: Read Failed: No such file or directory
                                                                                                                                                                                                                              BUFFIN: C I/O Error feof - Unit 69 - Return code = 16


                                                                                                                                                                                                                              ____________

                                                                                                                                                                                                                              Profile Thyme Lawn
                                                                                                                                                                                                                              Forum moderator
                                                                                                                                                                                                                              Joined: Aug 5 04
                                                                                                                                                                                                                              Posts: 1232
                                                                                                                                                                                                                              Credit: 10,449,542
                                                                                                                                                                                                                              RAC: 1,317
                                                                                                                                                                                                                              Message 40798 - Posted 29 Sep 2010 10:00:17 UTC - in response to Message 40795.

                                                                                                                                                                                                                                The BUFFIN errors happen when a FAMOUS task is removed from memory between generating a trickle and the next checkpoint. The task is restarted from the checkpoint before the trickle and the error is generated when a second attenmpt is made to post-process the data for the previous year. The errors are harmless.
                                                                                                                                                                                                                                ____________
                                                                                                                                                                                                                                "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
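The sequence described in the post above can be pictured with a toy Python model. This is only a sketch and not the FAMOUS code: it assumes, purely for illustration, that post-processing consumes the year's scratch file, so a second pass after a restart finds nothing to read. The names `post_process_year` and `simulate_restart` and the file name are hypothetical.

```python
import os
import tempfile


def post_process_year(workdir, year, log):
    """Read the year's scratch file, then delete it (the deletion is an assumption)."""
    path = os.path.join(workdir, f"year_{year}.tmp")
    try:
        with open(path) as handle:
            handle.read()
        os.remove(path)
        log.append(f"year {year}: post-processed")
    except FileNotFoundError:
        # Second pass after a restart: the data was already handled, so nothing is lost.
        log.append(f"year {year}: read failed, already post-processed")


def simulate_restart():
    """Replay one model year twice, as happens after a restart from an older checkpoint."""
    log = []
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "year_1.tmp"), "w") as handle:
            handle.write("model output")
        # A checkpoint is written here, before the trickle.
        post_process_year(workdir, 1, log)  # trickle generated for year 1
        # Task is removed from memory before the next checkpoint, then restarted
        # from the earlier checkpoint, so the same year is post-processed again.
        post_process_year(workdir, 1, log)
    return log


if __name__ == "__main__":
    for line in simulate_restart():
        print(line)
```

The second call is the stand-in for the "Read Failed: No such file or directory" lines: the work for that year was already done, which is why the message can be ignored.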

                                                                                                                                                                                                                                Profile John Hunt
                                                                                                                                                                                                                                Joined: Mar 5 05
                                                                                                                                                                                                                                Posts: 64
                                                                                                                                                                                                                                Credit: 452,382
                                                                                                                                                                                                                                RAC: 0
                                                                                                                                                                                                                                Message 40803 - Posted 29 Sep 2010 22:10:47 UTC

                                                                                                                                                                                                                                  Famous_uiau_1999_200_006656561_2 completed on Core2quad Q6600 Windows XP Home.
                                                                                                                                                                                                                                  Workunit error - check skipped.

                                                                                                                                                                                                                                  http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11533651

                                                                                                                                                                                                                                  Profile mo.v
                                                                                                                                                                                                                                  Forum moderator
                                                                                                                                                                                                                                  Joined: Sep 29 04
                                                                                                                                                                                                                                  Posts: 2359
                                                                                                                                                                                                                                  Credit: 7,177,874
                                                                                                                                                                                                                                  RAC: 1,646
                                                                                                                                                                                                                                  Message 40804 - Posted 29 Sep 2010 22:58:58 UTC

                                                                                                                                                                                                                                    This 'Workunit error - check skipped' message means nothing for CPDN because our models aren't validated in the same way as tasks from other projects. It's a confounded nuisance and must put some people off. I don't know whether Milo could get rid of it.

                                                                                                                                                                                                                                    Similarly, on your model's workunit page we see 'Too many total results' which is another irrelevant message.
                                                                                                                                                                                                                                    ____________
                                                                                                                                                                                                                                    Cpdn news

                                                                                                                                                                                                                                    Les Bayliss
                                                                                                                                                                                                                                    Forum moderator
                                                                                                                                                                                                                                    Joined: Sep 5 04
                                                                                                                                                                                                                                    Posts: 5428
                                                                                                                                                                                                                                    Credit: 9,074,925
                                                                                                                                                                                                                                    RAC: 1,934
                                                                                                                                                                                                                                    Message 40805 - Posted 29 Sep 2010 23:09:24 UTC

                                                                                                                                                                                                                                      John

                                                                                                                                                                                                                                      The only messages relevant to climate models are found in stderr, which is on the main model page.
                                                                                                                                                                                                                                      As there aren't any error messages, and it says further up the page: Over Success Done, that particular model is just that; a success.


                                                                                                                                                                                                                                      ____________
                                                                                                                                                                                                                                      Backups: Here

                                                                                                                                                                                                                                      [B^S] mavau
                                                                                                                                                                                                                                      Joined: Aug 30 04
                                                                                                                                                                                                                                      Posts: 142
                                                                                                                                                                                                                                      Credit: 8,238,300
                                                                                                                                                                                                                                      RAC: 3,469
                                                                                                                                                                                                                                      Message 40976 - Posted 5 Nov 2010 20:16:21 UTC

                                                                                                                                                                                                                                        My latest stats:
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Completed UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.10
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11
                                                                                                                                                                                                                                        Error while computing UK Met Office FAMOUS v6.11

                                                                                                                                                                                                                                        63 completed, 35 errors, not counting the phantoms.

                                                                                                                                                                                                                                        ____________


                                                                                                                                                                                                                                        Profile mo.v
                                                                                                                                                                                                                                        Forum moderator
                                                                                                                                                                                                                                        Joined: Sep 29 04
                                                                                                                                                                                                                                        Posts: 2359
                                                                                                                                                                                                                                        Credit: 7,177,874
                                                                                                                                                                                                                                        RAC: 1,646
                                                                                                                                                                                                                                        Message 40979 - Posted 6 Nov 2010 9:05:32 UTC

                                                                                                                                                                                                                                          Thank you for the results of such a large number of models. Superficially this appears to mean a success rate of about 64% and a failure rate of about 36%. However, as the failures take less time to run because they crash before the end, the failure rate must be lower (if we mean the probability that any model will complete or fail).

                                                                                                                                                                                                                                          I'm not sure how to calculate this.

                                                                                                                                                                                                                                          Ideally the calculation would need to take into account whether on average the crashes occur at 50% completion (ie are equally likely to happen at any processing moment). I don't know this.
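A rough sketch of one way to set up this calculation, using the 63 completed and 35 errors reported above. It assumes every completed model takes the same run time, and `mean_crash_fraction` is a hypothetical input (the average fraction of a full run that a failing model gets through before it crashes), not a measured value. The 0.5 used here is only the "equally likely at any moment" guess.

```python
def failure_shares(completed, errors, mean_crash_fraction):
    """Return (per-model failure rate, share of run time spent on failures).

    Assumes each completed model costs one unit of run time and each
    failed model costs mean_crash_fraction units (a hypothetical figure).
    """
    if completed + errors <= 0:
        raise ValueError("need at least one model")
    if not 0.0 <= mean_crash_fraction <= 1.0:
        raise ValueError("mean_crash_fraction must be between 0 and 1")

    # Probability that any one model fails, ignoring how long it ran.
    per_model = errors / (completed + errors)

    # Share of total run time that went into models which later crashed.
    failed_time = errors * mean_crash_fraction
    total_time = completed + failed_time
    time_share = failed_time / total_time if total_time > 0 else 0.0
    return per_model, time_share


per_model, time_share = failure_shares(63, 35, 0.5)
print(f"{per_model:.1%} of models failed")
print(f"{time_share:.1%} of run time went to models that failed")
```

Under these assumptions the per-model failure rate stays at about 36%, while the share of run time lost to failures drops to about 22%. The two numbers answer different questions, so which one is "the" failure rate depends on what is being asked.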
                                                                                                                                                                                                                                          ____________

                                                                                                                                                                                                                                          transient
                                                                                                                                                                                                                                          Joined: Oct 3 06
                                                                                                                                                                                                                                          Posts: 42
                                                                                                                                                                                                                                          Credit: 2,411,041
                                                                                                                                                                                                                                          RAC: 1,126
                                                                                                                                                                                                                                          Message 40981 - Posted 6 Nov 2010 10:33:09 UTC

                                                                                                                                                                                                                                            Last modified: 6 Nov 2010 10:33:40 UTC

                                                                                                                                                                                                                                            Maybe if you incorporate the CPU-time a better idea of failure rate can be gotten.

                                                                                                                                                                                                                                            Looking at the stats for the first host on [B^S] mavau's list, the total CPU time spent on FAMOUS models comes to approximately 61 million seconds (60785865.05). About 48 million seconds of that (48398651.3) was spent on successfully completed models. Maybe it is fair to say that makes for an 80% success rate for that particular host? Those numbers are based on 87 models (55 completed, 32 failed).
                                                                                                                                                                                                                                            Or would you have to take into account the time that would have been spent if all models had completed successfully? In that case you get a success rate of about 63%.
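As a quick check of the two figures above, here is a small sketch that computes both from the numbers quoted in this post. The second rate assumes every model would need the same CPU time for a full run, in which case it reduces to the plain count of completed models over all models. The function and variable names are only illustrative.

```python
def success_rates(cpu_ok, cpu_total, n_ok, n_failed):
    """Return (success share by CPU time spent, success share by model count)."""
    # Fraction of the CPU time actually spent that went into completed models.
    by_cpu_time = cpu_ok / cpu_total
    # If every model needed the same time for a full run, the share of
    # "full-run time" that succeeded is just the fraction of models completed.
    by_count = n_ok / (n_ok + n_failed)
    return by_cpu_time, by_count


by_cpu_time, by_count = success_rates(48398651.3, 60785865.05, 55, 32)
print(f"by CPU time spent: {by_cpu_time:.1%}")
print(f"by model count:    {by_count:.1%}")
```

The first figure comes out just under 80% and the second at about 63%, matching the estimates above.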

                                                                                                                                                                                                                                            [B^S] mavau
                                                                                                                                                                                                                                            Joined: Aug 30 04
                                                                                                                                                                                                                                            Posts: 142
                                                                                                                                                                                                                                            Credit: 8,238,300
                                                                                                                                                                                                                                            RAC: 3,469
                                                                                                                                                                                                                                            Message 40985 - Posted 6 Nov 2010 14:17:38 UTC

                                                                                                                                                                                                                                              I've been through my results in more detail.
                                                                                                                                                                                                                                              First, some of my errors were successes on other platforms.
                                                                                                                                                                                                                                              Error while computing Darwin success
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6923876
                                                                                                                                                                                                                                              Error while computing Linux success
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6922152
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6919805
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6867265
                                                                                                                                                                                                                                              Error while computing Darwin and Linux success
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6869834
                                                                                                                                                                                                                                              Error while computing Windows 7 64-bit AMD success
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6870035
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868920
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6867868
                                                                                                                                                                                                                                              Error while computing XP AMD success
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868473

                                                                                                                                                                                                                                              And here is a look at my completed models failing on other platforms/combinations.
                                                                                                                                                                                                                                              This is not a full list. I've tried to exclude computers with constant failures, immediate failures...
                                                                                                                                                                                                                                              I haven't checked every failure. Towards the end of the list I've noted a few disk errors on Windows 7 that I hadn't seen before.
                                                                                                                                                                                                                                              Note the large number of invalid thetas on Darwin.

                                                                                                                                                                                                                                              Linux AMD
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6837149
                                                                                                                                                                                                                                              Linux Xeon
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6870087
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6870008
                                                                                                                                                                                                                                              Darwin
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6867021
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6865840
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6889698
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868148
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868668
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6918367
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6918280
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6894568
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6922925
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6895570
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6893598
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6918731
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6918391
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6921744
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6920379
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6935286
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6938524
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6940555

                                                                                                                                                                                                                                              Linux and Darwin
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6869410
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6919132

                                                                                                                                                                                                                                              XP AMD
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6865756
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6889575
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868483

                                                                                                                                                                                                                                              XP Intel
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6865602
                                                                                                                                                                                                                                              Windows 7 AMD and Vista AMD
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6837049
                                                                                                                                                                                                                                              Server 2003 AMD Linux Intel
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6868542
                                                                                                                                                                                                                                              Windows 7 64 AMD Darwin
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6921166
                                                                                                                                                                                                                                              Windows 7 64 Intel disk error
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6918496
                                                                                                                                                                                                                                              Windows 7 64 Intel disk error and Darwin
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6921700

                                                                                                                                                                                                                                              XP AMD Darwin
                                                                                                                                                                                                                                              http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6921012

                                                                                                                                                                                                                                              ____________


                                                                                                                                                                                                                                              [B^S] mavau
                                                                                                                                                                                                                                              Joined: Aug 30 04
                                                                                                                                                                                                                                              Posts: 142
                                                                                                                                                                                                                                              Credit: 8,238,300
                                                                                                                                                                                                                                              RAC: 3,469
                                                                                                                                                                                                                                              Message 40986 - Posted 6 Nov 2010 19:41:20 UTC

                                                                                                                                                                                                                                                An early failure I'd missed (application doesn't show in the right column).
                                                                                                                                                                                                                                                Windows machines fail at the same point, Linux a little bit later, and Darwin succeeds.
                                                                                                                                                                                                                                                http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=6835898
                                                                                                                                                                                                                                                ____________


                                                                                                                                                                                                                                                Les Bayliss
                                                                                                                                                                                                                                                Forum moderator
                                                                                                                                                                                                                                                Joined: Sep 5 04
                                                                                                                                                                                                                                                Posts: 5428
                                                                                                                                                                                                                                                Credit: 9,074,925
                                                                                                                                                                                                                                                RAC: 1,934
                                                                                                                                                                                                                                                Message 40987 - Posted 6 Nov 2010 20:07:09 UTC

The reason for the larger proportion of Darwin failures is that the compiler used couldn't be set to not use SSE2 on Macs.
So, while Windows and Linux were eventually set to not use SSE2, and are therefore more stable (but slower), Macs weren't.
                                                                                                                                                                                                                                                  (All of this was during testing on the beta site.)

                                                                                                                                                                                                                                                  Statistics is beset with problems.




                                                                                                                                                                                                                                                  ____________
                                                                                                                                                                                                                                                  Backups: Here

                                                                                                                                                                                                                                                  [AF>france>pas-de-calais]symaski62
                                                                                                                                                                                                                                                  Joined: Aug 13 05
                                                                                                                                                                                                                                                  Posts: 54
                                                                                                                                                                                                                                                  Credit: 117,227
                                                                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                                                                  Message 41031 - Posted 14 Nov 2010 23:40:52 UTC

                                                                                                                                                                                                                                                    http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12006782

                                                                                                                                                                                                                                                    famous_w9iy_599_200_006759034_0

                                                                                                                                                                                                                                                    FAIL cold temperature °C


                                                                                                                                                                                                                                                    ____________

                                                                                                                                                                                                                                                    Profile mo.v
                                                                                                                                                                                                                                                    Forum moderator
                                                                                                                                                                                                                                                    Joined: Sep 29 04
                                                                                                                                                                                                                                                    Posts: 2359
                                                                                                                                                                                                                                                    Credit: 7,177,874
                                                                                                                                                                                                                                                    RAC: 1,646
                                                                                                                                                                                                                                                    Message 41032 - Posted 15 Nov 2010 0:19:28 UTC

                                                                                                                                                                                                                                                      Yes, what a cold graph.
                                                                                                                                                                                                                                                      ____________
                                                                                                                                                                                                                                                      Cpdn news

                                                                                                                                                                                                                                                      Profile JIM
                                                                                                                                                                                                                                                      Joined: Dec 31 07
                                                                                                                                                                                                                                                      Posts: 682
                                                                                                                                                                                                                                                      Credit: 4,224,379
                                                                                                                                                                                                                                                      RAC: 2,953
                                                                                                                                                                                                                                                      Message 41034 - Posted 15 Nov 2010 0:56:15 UTC - in response to Message 41031.

I don’t think that I have ever seen a graph like that before. That is more than just a cold snap; it is more of a glacial age. It looks as though the entire Earth was entering a snowball phase, like the one geologists think happened about 700 million years ago.

                                                                                                                                                                                                                                                        ____________

                                                                                                                                                                                                                                                        Profile JIM
                                                                                                                                                                                                                                                        Joined: Dec 31 07
                                                                                                                                                                                                                                                        Posts: 682
                                                                                                                                                                                                                                                        Credit: 4,224,379
                                                                                                                                                                                                                                                        RAC: 2,953
                                                                                                                                                                                                                                                        Message 41057 - Posted 17 Nov 2010 5:31:43 UTC

Famous_v1eo_999_200_00672929-1 failed at about 56 years. Invalid theta. OS is Windows 7 64-bit SP1 RC running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

                                                                                                                                                                                                                                                          ____________

                                                                                                                                                                                                                                                          [B^S] mavau
                                                                                                                                                                                                                                                          Joined: Aug 30 04
                                                                                                                                                                                                                                                          Posts: 142
                                                                                                                                                                                                                                                          Credit: 8,238,300
                                                                                                                                                                                                                                                          RAC: 3,469
                                                                                                                                                                                                                                                          Message 41067 - Posted 17 Nov 2010 19:54:12 UTC

                                                                                                                                                                                                                                                            Here's one model I'm curious about (getting very cold).
                                                                                                                                                                                                                                                            Let's see how it develops:
                                                                                                                                                                                                                                                            http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12000683
Another note: this w series seems to run much slower (.55 v .48 on my machine).

                                                                                                                                                                                                                                                            ____________


                                                                                                                                                                                                                                                            Les Bayliss
                                                                                                                                                                                                                                                            Forum moderator
                                                                                                                                                                                                                                                            Joined: Sep 5 04
                                                                                                                                                                                                                                                            Posts: 5428
                                                                                                                                                                                                                                                            Credit: 9,074,925
                                                                                                                                                                                                                                                            RAC: 1,934
                                                                                                                                                                                                                                                            Message 41069 - Posted 17 Nov 2010 20:48:01 UTC - in response to Message 41067.

                                                                                                                                                                                                                                                              From a post by Hiro on the beta site:

On the main site, we have just started the famous_w series of experiments using the same version of Famous.

                                                                                                                                                                                                                                                              The initial workunits are spin up runs with a wider range of parameters, including a new parameter for the number of dynamic sweeps. Actually, we perturbed the sweep parameter before, but only for a few work units.


                                                                                                                                                                                                                                                              and later:

To add a bit of background, we started using 2 sweep dynamics to stabilize the model. This effectively cuts the time step of the atmospheric _dynamics_ by half. However, the run speed hardly changes because the atmospheric dynamics (excluding what we call "physics" and radiative transfer) is very small in terms of CPU time.

According to my 5 or so cluster runs for the millennium and some results from the Bristol group, this eliminates most of the cold crashes (still not perfect, though).
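
To put rough numbers on why a second dynamics sweep costs so little, here is a minimal sketch. It assumes that only the atmospheric dynamics cost scales with the number of sweeps and that everything else (physics, radiative transfer and so on) is unchanged. The function name and the dynamics shares used are hypothetical illustrations, not measured FAMOUS figures.

```python
def relative_run_time(dynamics_fraction: float, sweeps: int) -> float:
    """Run time relative to a 1-sweep run.

    Assumption: only the dynamics share of CPU time is multiplied by the
    number of sweeps; the remaining share is unaffected.
    """
    if not 0.0 <= dynamics_fraction <= 1.0:
        raise ValueError("dynamics_fraction must be between 0 and 1")
    if sweeps < 1:
        raise ValueError("sweeps must be at least 1")
    other_fraction = 1.0 - dynamics_fraction
    return other_fraction + dynamics_fraction * sweeps


# Hypothetical dynamics shares of total CPU time (not measured values).
for fraction in (0.05, 0.10, 0.15):
    factor = relative_run_time(fraction, 2)
    print(f"dynamics share {fraction:.0%}: 2 sweeps -> {factor:.2f}x run time")
```

Under this assumption the slowdown from going to 2 sweeps equals the dynamics share itself, so a small share means a small slowdown.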


                                                                                                                                                                                                                                                              I think that the 2nd post also refers to the "w" series models.


                                                                                                                                                                                                                                                              ____________
                                                                                                                                                                                                                                                              Backups: Here

                                                                                                                                                                                                                                                              [B^S] mavau
                                                                                                                                                                                                                                                              Joined: Aug 30 04
                                                                                                                                                                                                                                                              Posts: 142
                                                                                                                                                                                                                                                              Credit: 8,238,300
                                                                                                                                                                                                                                                              RAC: 3,469
                                                                                                                                                                                                                                                              Message 41113 - Posted 20 Nov 2010 19:31:17 UTC

                                                                                                                                                                                                                                                                I had missed this cold failure:
                                                                                                                                                                                                                                                                http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11997733
And this cold model is still running:
                                                                                                                                                                                                                                                                http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12000683
                                                                                                                                                                                                                                                                ____________

                                                                                                                                                                                                                                                                Forum search Site search

                                                                                                                                                                                                                                                                [B^S] mavau
                                                                                                                                                                                                                                                                Send message
                                                                                                                                                                                                                                                                Joined: Aug 30 04
                                                                                                                                                                                                                                                                Posts: 142
                                                                                                                                                                                                                                                                Credit: 8,238,300
                                                                                                                                                                                                                                                                RAC: 3,469
                                                                                                                                                                                                                                                                Message 41154 - Posted 24 Nov 2010 15:01:55 UTC

                                                                                                                                                                                                                                                                  That second one took some time dying. Very cold.
                                                                                                                                                                                                                                                                  ____________

                                                                                                                                                                                                                                                                  Forum search Site search

                                                                                                                                                                                                                                                                  3rkko
                                                                                                                                                                                                                                                                  Send message
                                                                                                                                                                                                                                                                  Joined: Feb 12 08
                                                                                                                                                                                                                                                                  Posts: 54
                                                                                                                                                                                                                                                                  Credit: 4,247,765
                                                                                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                                                                                  Message 41155 - Posted 24 Nov 2010 22:33:13 UTC

                                                                                                                                                                                                                                                                    Another cold failure:
                                                                                                                                                                                                                                                                    http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12234436

                                                                                                                                                                                                                                                                    Profile JIM
                                                                                                                                                                                                                                                                    Send message
                                                                                                                                                                                                                                                                    Joined: Dec 31 07
                                                                                                                                                                                                                                                                    Posts: 682
                                                                                                                                                                                                                                                                    Credit: 4,224,379
                                                                                                                                                                                                                                                                    RAC: 2,953
                                                                                                                                                                                                                                                                    Message 41176 - Posted 28 Nov 2010 14:18:11 UTC

                                                                                                                                                                                                                                                                      Famous_ kHz_1999_200_006712331_3 completed successfully. OS is Windows 7 64 bit SP1 RC running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

A very warm one. Average temp. rises steadily throughout the 21st and 22nd centuries, from 17.2 to 22.9 degrees C. The rise is greatest in the Northern Hemisphere, where it tops out at 24.4 C. Solar constant is at default.




                                                                                                                                                                                                                                                                      ____________

                                                                                                                                                                                                                                                                      Profile Greg van Paassen
                                                                                                                                                                                                                                                                      Send message
                                                                                                                                                                                                                                                                      Joined: Nov 17 07
                                                                                                                                                                                                                                                                      Posts: 142
                                                                                                                                                                                                                                                                      Credit: 4,271,370
                                                                                                                                                                                                                                                                      RAC: 0
                                                                                                                                                                                                                                                                      Message 41180 - Posted 28 Nov 2010 20:05:09 UTC

                                                                                                                                                                                                                                                                        Last modified: 28 Nov 2010 20:07:31 UTC

                                                                                                                                                                                                                                                                        In the famous_w0xx_599 series, I've had two fail and one succeed, so far.

                                                                                                                                                                                                                                                                        One of the failures was a runaway, reaching 38.5 Celsius before crashing. The other was a cold world, crashing at 8.7 C.

                                                                                                                                                                                                                                                                        The one that succeeded had quite extreme-looking values for ice fall speed, entrainment coefficient, and temp range of ice albedo variation. You just can't tell.

                                                                                                                                                                                                                                                                        Back on "v series" famouses now - the luck of the draw.

                                                                                                                                                                                                                                                                        Profile geophi
                                                                                                                                                                                                                                                                        Forum moderator
                                                                                                                                                                                                                                                                        Send message
                                                                                                                                                                                                                                                                        Joined: Aug 7 04
                                                                                                                                                                                                                                                                        Posts: 1478
                                                                                                                                                                                                                                                                        Credit: 23,133,788
                                                                                                                                                                                                                                                                        RAC: 11,481
                                                                                                                                                                                                                                                                        Message 41319 - Posted 18 Dec 2010 15:21:14 UTC

                                                                                                                                                                                                                                                                          Last modified: 18 Dec 2010 15:28:25 UTC

Overall: 130 success, 73 failures while computing (64% success ratio)
w-series: 2 success, 12 failures while computing (14% success ratio)

Per host (figures are All/w-series):
Core i7 920 Linux: Success 52/0, Computing Failure 30/2
Phenom II X4 940 Linux: Success 50/0, Computing Failure 27/6
Phenom II X6 1090T Linux: Success 12/2, Computing Failure 7/1
Phenom II X2 B93 Windows: Success 10/0, Computing Failure 5/1
Core2 E8600 Windows: Success 6/0, Computing Failure 4/2
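For anyone who wants to check the arithmetic or add their own hosts, here is a rough Python sketch of how the overall and w-series ratios follow from the per-host tallies above. The dictionary layout and the `success_ratio` helper are just my own illustration, not anything from the project; the numbers are copied from the post.

```python
# Per-host tallies copied from the post: each pair is (all models, w-series only).
# The structure and names here are illustrative assumptions, not a CPDN format.
tallies = {
    "Core i7 920 Linux":        {"success": (52, 0), "failure": (30, 2)},
    "Phenom II X4 940 Linux":   {"success": (50, 0), "failure": (27, 6)},
    "Phenom II X6 1090T Linux": {"success": (12, 2), "failure": (7, 1)},
    "Phenom II X2 B93 Windows": {"success": (10, 0), "failure": (5, 1)},
    "Core2 E8600 Windows":      {"success": (6, 0),  "failure": (4, 2)},
}


def success_ratio(successes, failures):
    """Fraction of models that ran to completion (0.0 if none were run)."""
    total = successes + failures
    return successes / total if total else 0.0


# Index 0 is the "All" column, index 1 is the "w-series" column.
all_ok = sum(host["success"][0] for host in tallies.values())
all_bad = sum(host["failure"][0] for host in tallies.values())
w_ok = sum(host["success"][1] for host in tallies.values())
w_bad = sum(host["failure"][1] for host in tallies.values())

print(f"Overall: {all_ok} success, {all_bad} failures "
      f"({success_ratio(all_ok, all_bad):.0%} success ratio)")
print(f"w-series: {w_ok} success, {w_bad} failures "
      f"({success_ratio(w_ok, w_bad):.0%} success ratio)")
```

The same helper can be applied to a single host's pair of numbers if you want a per-machine ratio, though with counts this small the per-host figures are not worth reading much into.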

                                                                                                                                                                                                                                                                          [B^S] mavau
                                                                                                                                                                                                                                                                          Send message
                                                                                                                                                                                                                                                                          Joined: Aug 30 04
                                                                                                                                                                                                                                                                          Posts: 142
                                                                                                                                                                                                                                                                          Credit: 8,238,300
                                                                                                                                                                                                                                                                          RAC: 3,469
                                                                                                                                                                                                                                                                          Message 41494 - Posted 17 Jan 2011 19:47:27 UTC

                                                                                                                                                                                                                                                                            Had a big crash (6 models) 12 days ago, due to disk issues (bad sectors).
                                                                                                                                                                                                                                                                            Early symptom: McAfee check took ages to complete.
                                                                                                                                                                                                                                                                            I eventually noticed all the disk error messages in Event Viewer.
This fix should hold for some time:
                                                                                                                                                                                                                                                                            Have chkdsk identify the bad sectors once in a while (second checkbox).

                                                                                                                                                                                                                                                                            The new batch has been successful, except for:
                                                                                                                                                                                                                                                                            http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12400747

                                                                                                                                                                                                                                                                            I hadn't met the error before.
                                                                                                                                                                                                                                                                            Some links below for everybody's information:
                                                                                                                                                                                                                                                                            http://ncas-cms.nerc.ac.uk/trac/UMHelpdesk/ticket/399
                                                                                                                                                                                                                                                                            http://cpdnbeta.oerc.ox.ac.uk/forum_thread.php?id=229

                                                                                                                                                                                                                                                                            Happy crunching for 2011.
                                                                                                                                                                                                                                                                            ____________


                                                                                                                                                                                                                                                                            Les Bayliss
                                                                                                                                                                                                                                                                            Forum moderator
                                                                                                                                                                                                                                                                            Send message
                                                                                                                                                                                                                                                                            Joined: Sep 5 04
                                                                                                                                                                                                                                                                            Posts: 5428
                                                                                                                                                                                                                                                                            Credit: 9,074,925
                                                                                                                                                                                                                                                                            RAC: 1,934
                                                                                                                                                                                                                                                                            Message 41496 - Posted 17 Jan 2011 20:17:37 UTC - in response to Message 41494.

                                                                                                                                                                                                                                                                              I've passed this on to the project person for FAMOUS.


                                                                                                                                                                                                                                                                              ____________
                                                                                                                                                                                                                                                                              Backups: Here

                                                                                                                                                                                                                                                                              Les Bayliss
                                                                                                                                                                                                                                                                              Forum moderator
                                                                                                                                                                                                                                                                              Send message
                                                                                                                                                                                                                                                                              Joined: Sep 5 04
                                                                                                                                                                                                                                                                              Posts: 5428
                                                                                                                                                                                                                                                                              Credit: 9,074,925
                                                                                                                                                                                                                                                                              RAC: 1,934
                                                                                                                                                                                                                                                                              Message 41503 - Posted 18 Jan 2011 19:52:09 UTC - in response to Message 41496.

                                                                                                                                                                                                                                                                                Mavau
                                                                                                                                                                                                                                                                                and others who get a Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH error message:

                                                                                                                                                                                                                                                                                This is caused by one of the auxiliary files not having sufficient data to cover the full modelling period.
                                                                                                                                                                                                                                                                                Those data sets still in the queue that were affected have now been removed.

                                                                                                                                                                                                                                                                                Apologies for the mix up.

                                                                                                                                                                                                                                                                                Also, Kraken has filled up yet again, and data needs to be moved off to storage.
                                                                                                                                                                                                                                                                                Milo is not available to do this, so we wait in hope. :)


                                                                                                                                                                                                                                                                                ____________
                                                                                                                                                                                                                                                                                Backups: Here

                                                                                                                                                                                                                                                                                dajashby
                                                                                                                                                                                                                                                                                Send message
                                                                                                                                                                                                                                                                                Joined: Sep 1 04
                                                                                                                                                                                                                                                                                Posts: 55
                                                                                                                                                                                                                                                                                Credit: 11,632,870
                                                                                                                                                                                                                                                                                RAC: 858
                                                                                                                                                                                                                                                                                Message 41505 - Posted 19 Jan 2011 0:07:35 UTC

I have a MacBook Air (1114220) that's been crunching since the middle of December. It has received nothing but Famous models, and has completed 6 successfully out of 48 downloaded. I haven't checked them all, but the ones I have looked at state "INVALID THETA DETECTED". My impression from a search of the boards is that Famous models are fairly prone to failing, but the failure rate on this machine seems way too high.

My question is: is there a convenient way to exclude this machine from receiving Famous models, and if there is, should I do so?

Of the 6 PCs I have on the project, this one is far and away the most "productive" when measured by credits received. Over the last 5 days it has averaged 2,308.42 credits (1,154.21 per CPU core). My 2 quad-core Windows machines averaged 550 and 599 per core over the same period. The other two Core2 Duo machines (both Windows boxes) managed slightly less.
                                                                                                                                                                                                                                                                                  ____________
                                                                                                                                                                                                                                                                                  Derrick Ashby

                                                                                                                                                                                                                                                                                  Profile Greg van Paassen
                                                                                                                                                                                                                                                                                  Send message
                                                                                                                                                                                                                                                                                  Joined: Nov 17 07
                                                                                                                                                                                                                                                                                  Posts: 142
                                                                                                                                                                                                                                                                                  Credit: 4,271,370
                                                                                                                                                                                                                                                                                  RAC: 0
                                                                                                                                                                                                                                                                                  Message 41506 - Posted 19 Jan 2011 1:03:14 UTC - in response to Message 41505.

                                                                                                                                                                                                                                                                                    Yes, you can exclude the Air from getting Famouses.

                                                                                                                                                                                                                                                                                    To do so:

                                                                                                                                                                                                                                                                                    (1) Go into "Your account" - see the blue menu on the left.

(2) Scroll down to "Computers", go into this, and then into "Details" for the Air. Set the Air to be in a different 'Location' from your other computers -- say, School.

                                                                                                                                                                                                                                                                                    (3) Back on the "Your account" page, go into "climateprediction.net preferences". Find the link for "Add preferences for School", and in there, select the applications that you want to allow, and de-select "accept work from other applications?"

                                                                                                                                                                                                                                                                                    ------------

The reason for both the high daily credit and the frequent Famous failures on Macs is that the CPDN programmers could not get the Famous application to compile without extra optimizations. As a result, Famouses run very fast on Macs but also crash more often than on other platforms.

                                                                                                                                                                                                                                                                                    HTH

                                                                                                                                                                                                                                                                                    3rkko
                                                                                                                                                                                                                                                                                    Send message
                                                                                                                                                                                                                                                                                    Joined: Feb 12 08
                                                                                                                                                                                                                                                                                    Posts: 54
                                                                                                                                                                                                                                                                                    Credit: 4,247,765
                                                                                                                                                                                                                                                                                    RAC: 0
                                                                                                                                                                                                                                                                                    Message 41509 - Posted 19 Jan 2011 16:48:50 UTC - in response to Message 41506.

Unfortunately, Famous is currently the only model type available for Mac and Linux, so if you exclude Famous you will get no work at all.

                                                                                                                                                                                                                                                                                      Profile Dave Jackson
                                                                                                                                                                                                                                                                                      Send message
                                                                                                                                                                                                                                                                                      Joined: May 15 09
                                                                                                                                                                                                                                                                                      Posts: 870
                                                                                                                                                                                                                                                                                      Credit: 657,884
                                                                                                                                                                                                                                                                                      RAC: 201
                                                                                                                                                                                                                                                                                      Message 41510 - Posted 19 Jan 2011 18:57:00 UTC

64-bit Linux on dual-core Intel.
10 errored out altogether, 2 probably due to reboot issues. 2 are 599 models, which are known to be more prone to crashing.
u4pe1999 74gg999 ugyf1799 v3pc1899 vhcx1199 vizg1599 vizh1799 w56v599 w8y4599 w158599
Some failed with invalid theta, the rest with negative pressure values.
Completed:
v3cz1799 v1b01799 va9x1799 ubdw1999 uh8d1799
That makes 5 completed, or 1/3.
On my partner's box (WinXP, AMD), vnt18199, the only Famous unit started, completed.
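
For anyone wanting to compare the figures reported in the posts above on the same footing, here is a minimal Python sketch that turns completed/total counts into a success rate. The counts come from the posts themselves (6 of 48 on the MacBook Air; 5 completed plus 10 errored on the 64-bit Linux box); the function name `success_rate` and the dictionary labels are illustrative only and not part of any CPDN or BOINC tool.

```python
def success_rate(completed: int, total: int) -> float:
    """Return the fraction of work units that ran to completion."""
    if total <= 0:
        raise ValueError("total must be a positive number of work units")
    if not 0 <= completed <= total:
        raise ValueError("completed must be between 0 and total")
    return completed / total


# (completed, total) as reported in the thread; labels are illustrative.
reports = {
    "MacBook Air": (6, 48),
    "64-bit Linux, dual-core Intel": (5, 5 + 10),
}

for machine, (completed, total) in reports.items():
    rate = success_rate(completed, total)
    print(f"{machine}: {completed}/{total} completed ({rate:.1%})")
```

Such small samples say little on their own, especially since start-year 599 models are expected to fail more often, so the rates are best read alongside the model names.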





                                                                                                                                                                                                                                                                                        Copyright © 2002-2014 climateprediction.net