Linux Sulphur 4.23 Unstable


geophi
Forum moderator
Joined: Aug 7 04
Posts: 1475
Credit: 22,606,103
RAC: 2,242
Message 19144 - Posted 10 Jan 2006 15:56:31 UTC

    Last modified: 26 May 2006 4:39:36 UTC

    Edit...** It appears that the sulphur 4.23 Linux model is completely unstable. The programmers know about it, but their time is consumed with finishing up development of the next experiment. Suggestion is to run other BOINC projects until this is fixed, or until the next experiment is released (Beta IS stable on Linux). Those running Linux sulphur 4.22 and previous should be relatively stable. **

    Edit 2...** LS Diseño has figured out a way to revert from 4.23 to 4.21 and has instructions for doing so below. You may want to give this a try if you want to continue crunching sulphur with Linux. **

    Edit 3...** The coupled model, hadcm3l, may now be downloaded from the climateprediction.net site. The model is stable, compared to sulphur 4.23 anyway, but some users still have trouble with it. It is essentially the same model as the one being run at the BBC site, and you may want to peruse the BBC Linux board if you run into problems. **

    A dual Xeon 2.8 GHz running FC3 was running two sulphur 4.23 models. Each of them crashed during June of 1818, shortly after the halfway point of phase 1. Nothing intelligible was found at the end of yabsd.out in either model to help determine why they crashed.

    This PC has been very stable up until this point. The Results for this hostID are 1614429 and 1614592. Below is the terminal log of the last crash. There doesn't appear to be anything helpful in this log.

    This looks like a problem with 4.23 or the WUs.

    sulphur_ilwx_000868209 - PH 1 TS 0130189 A - 13/06/1818 06:30 - H:M:S=0139:40:28 AVG= 3.86 DLT= 2.00
    sulphur_ixod_000883453 - PH 1 TS 0008121 A - 20/05/1811 04:30 - H:M:S=0008:02:09 AVG= 3.56 DLT= 2.00
    sulphur_ixod_000883453 - PH 1 TS 0008122 A - 20/05/1811 05:00 - H:M:S=0008:02:12 AVG= 3.56 DLT= 3.00
    Preparing for restart...
    Error: Restart files for not found
    Giving up, this result exceeded crash count for available restart files.
    deflating : restart.day
    deflating : yabsd.out
    sulphur_ixod_000883453 - PH 1 TS 0008123 A - 20/05/1811 05:30 - H:M:S=0008:02:14 AVG= 3.56 DLT= 2.00
    2006-01-10 09:57:27 [---] request_reschedule_cpus: process exited
    2006-01-10 09:57:27 [climateprediction.net] Computation for result sulphur_ilwx_000868209_0 finished
    sulphur_ixod_000883453 - PH 1 TS 0008124 A - 20/05/1811 06:00 - H:M:S=0008:02:16 AVG= 3.56 DLT= 2.00
    2006-01-10 09:57:28 [climateprediction.net] Unrecoverable error for result sulphur_ilwx_000868209_0 ({file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_1.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_2.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_3.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_4.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_5.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    )
    2006-01-10 09:57:28 [climateprediction.net] Unrecoverable error for result sulphur_ilwx_000868209_0 ({file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_1.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_2.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_3.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_4.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    {file_xfer_error}
    {file_name}sulphur_ilwx_000868209_0_5.zip{/file_name}
    {error_code}-161{/error_code}
    {error_message}{/error_message}
    {/file_xfer_error}
    )

    Les Bayliss
    Forum moderator
    Joined: Sep 5 04
    Posts: 5367
    Credit: 8,876,229
    RAC: 549
    Message 19148 - Posted 10 Jan 2006 17:56:57 UTC

      Last modified: 10 Jan 2006 17:59:20 UTC

      I notice this near the start of the log:

      quote

      Error: Restart files for not found
      unquote

      I wonder if this is just a grammatical mistake, or does it mean that some part of the error code can't find the model name?

      It sure is a hard slog to get Linux to complete a sulphur model. :(

      edit
      It looks like one has to include the word quote oneself.

      geophi
      Forum moderator
      Joined: Aug 7 04
      Posts: 1475
      Credit: 22,606,103
      RAC: 2,242
      Message 19154 - Posted 10 Jan 2006 19:09:53 UTC - in response to Message 19148.

        I notice this near the start of the log:

        quote
        Error: Restart files for not found
        unquote

        I wonder if this is just a grammatical mistake, or does it mean that some part of the error code can't find the model name?

        Typically it has something like "dataout/restart.year" after the "for". But it was obvious that it had already rewound previously because it was moving along at 3.86 s/TS, when yesterday it was doing 3.45 s/TS.
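
The s/TS reasoning above can be checked against the status lines themselves. What follows is a minimal sketch, assuming (from the log lines in this thread, not from any project documentation) that the AVG field is the total elapsed H:M:S in seconds divided by the current timestep number. If that holds, a rewind lowers the timestep while the elapsed time keeps growing, so AVG goes up. The function name `parse_status` and the regular expression are illustrative, not part of the client.

```python
import re

# Pattern for the status lines printed to the terminal in this thread.
# The field layout is inferred from the posted logs only.
STATUS_RE = re.compile(
    r"(?P<model>\S+) - PH\s*(?P<phase>\d+) TS\s*(?P<ts>\d+) \S+ - "
    r"(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}) - "
    r"H:M:S=(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) "
    r"AVG=\s*(?P<avg>[\d.]+) DLT=\s*(?P<dlt>[\d.]+)"
)


def parse_status(line):
    """Return the parsed fields of one status line, or None if it is not one."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    elapsed = int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
    timestep = int(match["ts"])
    return {
        "model": match["model"],
        "timestep": timestep,
        "elapsed_s": elapsed,
        "reported_avg": float(match["avg"]),
        # Assumption: AVG = elapsed seconds / current timestep number.
        "computed_avg": elapsed / timestep,
    }


if __name__ == "__main__":
    sample = ("sulphur_ilwx_000868209 - PH 1 TS 0130189 A - 13/06/1818 06:30"
              " - H:M:S=0139:40:28 AVG= 3.86 DLT= 2.00")
    fields = parse_status(sample)
    print("%.2f reported, %.2f computed"
          % (fields["reported_avg"], fields["computed_avg"]))
```

For the line shown, 139:40:28 is 502,828 seconds, and 502,828 / 130,189 rounds to 3.86, which agrees with the printed AVG.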

        Melvyn Bobo Slacke
        Joined: Aug 16 04
        Posts: 124
        Credit: 4,307,602
        RAC: 934
        Message 19181 - Posted 11 Jan 2006 20:17:32 UTC

          Last modified: 11 Jan 2006 20:57:47 UTC

          My first two models on an AMD XP also crashed.
          The first one gave up after TS 113528 26/6/1817 04:00. It also had crashes and rewinds after trickle 8 (TS 86416). The terminal log looks similar to geophi's, and nothing looks special at the end of yabsd.out.
          Model 1617143

          The second one has had three crashes and rewinds now but is still running.
          It looked like this on the terminal:

          sulphur_irud_000875893 - PH 1 TS 0113997 A - 05/07/1817 22:30 - H:M:S=0110:41:48 AVG= 3.50 DLT= 3.00
          sulphur_irud_000875893 - PH 1 TS 0113998 A - 05/07/1817 23:00 - H:M:S=0110:41:50 AVG= 3.50 DLT= 2.00
          Preparing for restart...
          Rewinding a model-day...
          Starting model ID sulphur_irud_000875893 Phase 1
          Getting pthread attributes - retval=0
          Setting pthread size (66560000 bytes) - retval=0
          Waiting for model startup, this may take a minute...
          sulphur_irud_000875893 - PH 1 TS 0113905 A - 04/07/1817 00:30 - H:M:S=0110:41:52 AVG= 3.50 DLT= 0.00
          sulphur_irud_000875893 - PH 1 TS 0113906 A - 04/07/1817 01:00 - H:M:S=0110:42:02 AVG= 3.50 DLT= 9.94
          .
          .
          sulphur_irud_000875893 - PH 1 TS 0113996 A - 05/07/1817 22:00 - H:M:S=0110:47:18 AVG= 3.50 DLT=10.00
          sulphur_irud_000875893 - PH 1 TS 0113997 A - 05/07/1817 22:30 - H:M:S=0110:47:20 AVG= 3.50 DLT= 1.99
          Preparing for restart...
          Rewinding a model-month...
          Copying restart files for model retry...
          Starting model ID sulphur_irud_000875893 Phase 1
          Getting pthread attributes - retval=0
          Setting pthread size (66560000 bytes) - retval=0
          Waiting for model startup, this may take a minute...
          sulphur_irud_000875893 - PH 1 TS 0113761 A - 01/07/1817 00:30 - H:M:S=0110:47:23 AVG= 3.51 DLT= 0.00
          sulphur_irud_000875893 - PH 1 TS 0113762 A - 01/07/1817 01:00 - H:M:S=0110:47:34 AVG= 3.51 DLT=10.50
          .
          .
          sulphur_irud_000875893 - PH 1 TS 0113997 A - 05/07/1817 22:30 - H:M:S=0111:01:09 AVG= 3.51 DLT= 1.91
          sulphur_irud_000875893 - PH 1 TS 0113998 A - 05/07/1817 23:00 - H:M:S=0111:01:11 AVG= 3.51 DLT= 1.91
          Preparing for restart...
          Rewinding a model-year...
          Copying restart files for model retry...
          Starting model ID sulphur_irud_000875893 Phase 1
          Getting pthread attributes - retval=0
          Setting pthread size (66560000 bytes) - retval=0
          Waiting for model startup, this may take a minute...
          sulphur_irud_000875893 - PH 1 TS 0103681 A - 01/12/1816 00:30 - H:M:S=0111:01:14 AVG= 3.85 DLT= 0.00
          sulphur_irud_000875893 - PH 1 TS 0103682 A - 01/12/1816 01:00 - H:M:S=0111:01:24 AVG= 3.85 DLT=10.28

          This is HostID=3880, nice and reliable.
          I took a copy of yabsd.out, but I don't know if there's anything special in it.
          Two warning messages look like this:
          INITTIME: Warning- New STEP doesn't match old value
          Internal model id 1 Old= 103536 New= 103680

          So guess this one will finish too in a couple of hours. Something special I should monitor?
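
The timestep numbers in the log above line up with the model dates if one assumes a 30-minute timestep, a 360-day calendar (twelve 30-day months), and a run that starts at 01/12/1810 00:00. None of these is stated in the thread; they are inferred from the printed lines. A minimal sketch under those assumptions (the function name is illustrative):

```python
# All constants below are assumptions inferred from the log lines in this
# thread, not values taken from the model documentation.
MINUTES_PER_STEP = 30          # consecutive status lines are 30 model-minutes apart
DAYS_PER_MONTH = 30            # 360-day model calendar
MONTHS_PER_YEAR = 12
START_YEAR, START_MONTH = 1810, 12   # timestep 0 taken to be 01/12/1810 00:00


def timestep_to_model_date(timestep):
    """Convert a timestep number to a DD/MM/YYYY HH:MM model date string."""
    minutes = timestep * MINUTES_PER_STEP
    days, minute_of_day = divmod(minutes, 24 * 60)
    months, day_index = divmod(days, DAYS_PER_MONTH)
    month_index = (START_MONTH - 1) + months
    year = START_YEAR + month_index // MONTHS_PER_YEAR
    month = month_index % MONTHS_PER_YEAR + 1
    return "%02d/%02d/%04d %02d:%02d" % (
        day_index + 1, month, year, minute_of_day // 60, minute_of_day % 60)


if __name__ == "__main__":
    # Crash point, then the restart points after the day, month and year rewinds.
    for ts in (113998, 113905, 113761, 103681):
        print(ts, timestep_to_model_date(ts))
```

Under these assumptions the three restart points come out as 04/07/1817 00:30 (day rewind), 01/07/1817 00:30 (first step of the month) and 01/12/1816 00:30 (first step of the model year, which seems to begin in December), matching the dates printed after each rewind.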


          ____________

          Melvyn Bobo Slacke
          Joined: Aug 16 04
          Posts: 124
          Credit: 4,307,602
          RAC: 934
          Message 19202 - Posted 12 Jan 2006 11:27:18 UTC

            The model on host 3880 gave up at TS 112422 03/06/1817 03:00, model 1622417.

            The bottom of yabsd.out looks like this:
            Model aborted with error code - 1 Routine and message:-
            ATM_DYN : NEGATIVE THETA DETECTED.
            ____________

            Melvyn Bobo Slacke
            Joined: Aug 16 04
            Posts: 124
            Credit: 4,307,602
            RAC: 934
            Message 19301 - Posted 14 Jan 2006 17:31:48 UTC

              My third model also crashed.
              ____________

              Stefan Mathe
              Joined: Sep 28 04
              Posts: 36
              Credit: 268,150
              RAC: 0
              Message 19347 - Posted 15 Jan 2006 22:29:35 UTC - in response to Message 19301.


                Got the exact same error as you guys did. My model (running on Red Hat Linux) died around timestep 113,000. Same error log as mentioned by you.

                My last backup was before the crash. Now I have a backup with the crashed model. The crash seems to be reproducible, so running from the first backup you can reproduce the crash very easily.

                So...

                I would like to announce that I am keeping both archives (from before and after the crash), and if the developers need them for reproducing and debugging the error, I would gladly upload the data. If it could help to take a look at them, just give me an FTP connection on some server or something similar, and I shall upload the two backups. They are around 160 Mb each.

                Cheers,
                Stefan.
                ____________

                geophi
                Forum moderator
                Joined: Aug 7 04
                Posts: 1475
                Credit: 22,606,103
                RAC: 2,242
                Message 19362 - Posted 16 Jan 2006 14:25:31 UTC

                  Two more sulphur 4.23 models have failed on the dual Xeon between 10,000 and 14,000 timesteps into the first phase. Not good. I'm suspending sulphur on that PC now.

                  geophi
                  Forum moderator
                  Joined: Aug 7 04
                  Posts: 1475
                  Credit: 22,606,103
                  RAC: 2,242
                  Message 19392 - Posted 17 Jan 2006 14:15:04 UTC

                    Another crash after about 10 trickles on one of my AMD PCs.

                    http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=1632117

                    Stefan Mathe
                    Joined: Sep 28 04
                    Posts: 36
                    Credit: 268,150
                    RAC: 0
                    Message 19399 - Posted 17 Jan 2006 22:35:33 UTC - in response to Message 19392.

                      Last modified: 17 Jan 2006 23:18:22 UTC

                      Judging by the crashes everyone seems to be getting, it seems that sulphur no longer works on Linux.

                      Thus I have suspended all of my Linux workstations until the developers find the problem. Switched these machines to LHC@Home in the meantime...

                      Hope the problem will be fixed soon. :((

                      Stefan.
                      ____________

                      Melvyn Bobo Slacke
                      Joined: Aug 16 04
                      Posts: 124
                      Credit: 4,307,602
                      RAC: 934
                      Message 19463 - Posted 20 Jan 2006 13:29:59 UTC

                        Last modified: 20 Jan 2006 13:54:48 UTC

                        My fourth model, 1638004, ended on TS 114329 12/07/1817 20:30.
                        The fifth model, 1631205, gave up on TS 113686 29/06/1817 11:00. This was on a nice XP 2500+ at stock speed.

                        All of my misbehaving models were created 23 Dec.
                        Got me a newer one now from 16 Jan.
                        ____________

                        geophi
                        Forum moderator
                        Joined: Aug 7 04
                        Posts: 1475
                        Credit: 22,606,103
                        RAC: 2,242
                        Message 19468 - Posted 20 Jan 2006 15:21:16 UTC

                          Last modified: 4 Feb 2006 14:17:13 UTC

                          I've e-mailed Tolu about it and he knows there's a problem. When it will be fixed...given the upcoming coupled launch...??

                          Melvyn Bobo Slacke
                          Joined: Aug 16 04
                          Posts: 124
                          Credit: 4,307,602
                          RAC: 934
                          Message 19470 - Posted 20 Jan 2006 15:55:07 UTC

                            Oh..
                            I wish they could switch to slabs then, just for Linux?
                            I only have 1.5 old slabs here now to feed three machines until the HadCM3L launch. (No, I can't switch to Windows)
                            ____________

                            Desti
                            Joined: Aug 6 04
                            Posts: 110
                            Credit: 7,123,965
                            RAC: 965
                            Message 19473 - Posted 20 Jan 2006 18:20:21 UTC

                              Last modified: 20 Jan 2006 18:22:25 UTC

                              My sulphur 4.23 crashed too :-( (A64 X2 x86_64 Gentoo)


                              sulphur_ioa2_000871274 - PH 1 TS 0129201 A - 22/05/1818 16:30 - H:M:S=0103:50:34 AVG= 2.89 DLT= 1.00
                              sulphur_ioa2_000871274 - PH 1 TS 0129202 A - 22/05/1818 17:00 - H:M:S=0103:50:36 AVG= 2.89 DLT= 2.00
                              sulphur_ioa2_000871274 - PH 1 TS 0129203 A - 22/05/1818 17:30 - H:M:S=0103:50:38 AVG= 2.89 DLT= 2.00
                              sulphur_ioa2_000871274 - PH 1 TS 0129204 A - 22/05/1818 18:00 - H:M:S=0103:50:40 AVG= 2.89 DLT= 1.98
                              sulphur_ioa2_000871274 - PH 1 TS 0129205 A - 22/05/1818 18:30 - H:M:S=0103:50:41 AVG= 2.89 DLT= 0.95
                              sulphur_ioa2_000871274 - PH 1 TS 0129206 A - 22/05/1818 19:00 - H:M:S=0103:50:49 AVG= 2.89 DLT= 7.99
                              Preparing for restart...
                              Error: Restart files for not found
                              Giving up, this result exceeded crash count for available restart files.
                              deflating : restart.day
                              deflating : yabsd.out
                              2006-01-20 15:56:06 [---] request_reschedule_cpus: process exited
                              2006-01-20 15:56:06 [climateprediction.net] Computation for result sulphur_ioa2_000871274_0 finished
                              2006-01-20 15:56:06 [LHC@home] Starting result wjan1A_v6s4hvnom_mqx_nc__17__64.269_59.279__4_6__6__85_1_sixvf_boinc81550_3 using sixtrack version 466
                              2006-01-20 15:56:07 [climateprediction.net] Unrecoverable error for result sulphur_ioa2_000871274_0 (<file_xfer_error>
                              <file_name>sulphur_ioa2_000871274_0_1.zip</file_name>
                              <error_code>-161</error_code>
                              <error_message></error_message>
                              </file_xfer_error>

                              2006-01-20 15:56:07 [climateprediction.net] Unrecoverable error for result sulphur_ioa2_000871274_0 (<file_xfer_error>





                              2006-01-20 16:42:37 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
                              2006-01-20 16:42:37 [LHC@home] Reason: To report results
                              2006-01-20 16:42:37 [LHC@home] Reporting 1 results
                              2006-01-20 16:42:42 [LHC@home] Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
                              2006-01-20 17:20:23 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
                              2006-01-20 17:20:23 [LHC@home] Reason: To fetch work
                              2006-01-20 17:20:23 [LHC@home] Requesting 12 seconds of new work
                              2006-01-20 17:20:28 [LHC@home] Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
                              2006-01-20 17:20:29 [LHC@home] Started download of woct1_v6s4hvnom_mqx-oct1__5__64.202_59.212__4_6__6__55_1_sixvf_boinc11300.zip
                              2006-01-20 17:20:31 [LHC@home] Finished download of woct1_v6s4hvnom_mqx-oct1__5__64.202_59.212__4_6__6__55_1_sixvf_boinc11300.zip
                              2006-01-20 17:20:31 [LHC@home] Throughput 33359 bytes/sec
                              2006-01-20 17:20:32 [---] request_reschedule_cpus: files downloaded
                              2006-01-20 17:20:32 [Einstein@Home] Pausing result z1_0361.0__48_S4R2a_2 (removed from memory)
                              2006-01-20 17:20:32 [LHC@home] Starting result woct1_v6s4hvnom_mqx-oct1__5__64.202_59.212__4_6__6__55_1_sixvf_boinc11300_1 using sixtrack version 466
                              2006-01-20 17:20:33 [---] request_reschedule_cpus: process exited
                              2006-01-20 18:20:10 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
                              2006-01-20 18:20:10 [climateprediction.net] Reason: To report results
                              2006-01-20 18:20:10 [climateprediction.net] Reporting 1 results
                              2006-01-20 18:20:15 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded


                              (doh, tags are filtered)
                              ____________
                              Linux Users Everywhere @ BOINC

                              Jan
                              Joined: Nov 1 05
                              Posts: 2
                              Credit: 80,395
                              RAC: 0
                              Message 19554 - Posted 22 Jan 2006 21:32:08 UTC

                                Same problem for the third time with Sulphur on my Opteron244 SuSE 10.0 64-bit.

                                The problem arose after the 10th trickle the first 2 times and after the 9th the third time.

                                A fourth file is still running, but if it crashes as well, I'll stop calculating Climateprediction files for the moment.

                                Report of the last crash

                                2006-01-22 21:13:44 [climateprediction.net] Unrecoverable error for result sulphur_i80w_100850208_0 (<file_xfer_error>
                                <file_name>sulphur_i80w_100850208_0_1.zip</file_name>
                                <error_code>-161</error_code>
                                <error_message></error_message>
                                </file_xfer_error>
                                <file_xfer_error>
                                <file_name>sulphur_i80w_100850208_0_2.zip</file_name>
                                <error_code>-161</error_code>
                                <error_message></error_message>
                                </file_xfer_error>
                                <file_xfer_error>
                                <file_name>sulphur_i80w_100850208_0_3.zip</file_name>
                                <error_code>-161</error_code>
                                <error_message></error_message>
                                </file_xfer_error>
                                <file_xfer_error>
                                <file_name>sulphur_i80w_100850208_0_4.zip</file_name>
                                <error_code>-161</error_code>
                                <error_message></error_message>
                                </file_xfer_error>
                                <file_xfer_error>
                                <file_name>sulphur_i80w_100850208_0_5.zip</file_name>
                                <error_code>-161</error_code>
                                <error_message></error_message>
                                </file_xfer_error>
                                )

                                Beer for Linux Users Everywhere
                                ____________

                                Les Bayliss
                                Forum moderator
                                Joined: Sep 5 04
                                Posts: 5367
                                Credit: 8,876,229
                                RAC: 549
                                Message 19555 - Posted 22 Jan 2006 21:45:00 UTC

                                  Beer
                                  The 161 errors are a 'red herring'. If there is any record of the REAL reason for the failure, it will be near the bottom of the file yabsd.out, which is in the dataout folder of the model.
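
A minimal sketch of that check: read the last few lines of yabsd.out and look for an abort message. The "Model aborted" wording is taken from the excerpt posted earlier in this thread. The helper names, the 40-line window and the example path are illustrative assumptions, since the exact location of the dataout folder depends on the installation.

```python
from collections import deque


def tail_lines(path, count=40):
    """Return the last `count` lines of a text file without trailing newlines."""
    # errors="replace" keeps going if the file holds bytes that are not valid text.
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def find_abort_message(lines):
    """Return the lines from the first 'Model aborted' line onward, or []."""
    for index, line in enumerate(lines):
        if "Model aborted" in line:
            return lines[index:]
    return []


if __name__ == "__main__":
    import sys

    # Usage (path is hypothetical): python tail_yabsd.py <model folder>/dataout/yabsd.out
    if len(sys.argv) > 1:
        for text in find_abort_message(tail_lines(sys.argv[1])):
            print(text)
```

If nothing is printed, the file may simply not record a reason, as in the first crash reported at the top of this thread.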

                                  geophi
                                  Forum moderator
                                  Joined: Aug 7 04
                                  Posts: 1475
                                  Credit: 22,606,103
                                  RAC: 2,242
                                  Message 19560 - Posted 23 Jan 2006 0:22:21 UTC - in response to Message 19554.

                                    The problem arose after the 10th trickle the first 2 times and after the 9th the third time

                                    Yep, that sounds like the 4.23 problem. All five of my failures crashed between 9 and 13 trickles into the run.
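
To compare crash points between posts, a timestep can be turned into a trickle count. This is a rough sketch that assumes one trickle every 10,802 timesteps, a figure inferred only from the earlier report of "trickle 8 (TS 86416)" (86416 / 8 = 10802), not from any project documentation.

```python
# Assumption: trickles are evenly spaced; the interval is inferred from
# "trickle 8 (TS 86416)" reported earlier in this thread.
TIMESTEPS_PER_TRICKLE = 86416 // 8   # 10802


def trickles_completed(timestep):
    """Number of whole trickle intervals completed at a given timestep."""
    return timestep // TIMESTEPS_PER_TRICKLE


if __name__ == "__main__":
    # Crash timesteps quoted in earlier posts of this thread.
    for ts in (113528, 112422, 114329, 113686, 129206, 130189):
        print(ts, trickles_completed(ts))
```

With that interval, the crash timesteps reported so far fall after 10 to 12 completed trickles, consistent with the 9 to 13 range mentioned above.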

                                    Franko30
                                    Joined: Jan 10 06
                                    Posts: 3
                                    Credit: 259,190
                                    RAC: 0
                                    Message 19851 - Posted 1 Feb 2006 10:28:58 UTC - in response to Message 19555.

                                      Last modified: 1 Feb 2006 10:31:26 UTC

                                      Hi,

                                      same errors here on an Athlon 1500+ and 3000+ both running Ubuntu Breezy 5.10.

                                      Can't post the actual 161 error message, as the pasted text confuses the BBcode so only part of the error message gets displayed...

                                      The yabsd.out file contains a lot of cryptic stuff, but what I can make out are warnings about computations giving negative values...

                                      Melvyn Bobo Slacke
                                      Joined: Aug 16 04
                                      Posts: 124
                                      Credit: 4,307,602
                                      RAC: 934
                                      Message 19974 - Posted 4 Feb 2006 13:41:25 UTC

                                        Last modified: 4 Feb 2006 13:50:30 UTC

                                        Three more crashed models around trickle 10.
                                        That's a total of eight now on four different machines.

                                        Edit: got a new one created today, but this is the last try.

                                        geophi
                                        Forum moderator
                                        Joined: Aug 7 04
                                        Posts: 1475
                                        Credit: 22,606,103
                                        RAC: 2,242
                                        Message 19976 - Posted 4 Feb 2006 14:15:31 UTC

                                          Although Tolu and Carl know about this problem, I have a feeling it won't be fixed in a new sulphur version until after the launch of the coupled model experiment. That is where their time is going now.

                                          Melvyn Bobo Slacke
                                          Joined: Aug 16 04
                                          Posts: 124
                                          Credit: 4,307,602
                                          RAC: 934
                                          Message 19977 - Posted 4 Feb 2006 15:21:46 UTC

                                            Geophi, yes I know and I'm looking forward to the coupled model, fast and stable as it seems.
                                            Anyway one box here was shut down since LHC has no units, a second one restarted a Spinup model, the third is chewing on an old slab model, a fourth went to GIMPS, and this one does sulphur and the coupled beta.

                                            Scarf
                                            Joined: Aug 25 04
                                            Posts: 9
                                            Credit: 10,927,709
                                            RAC: 4,526
                                            Message 20292 - Posted 15 Feb 2006 18:26:03 UTC

                                              Isn't it possible to manually run the old hadsm project on these PCs?

                                              Les Bayliss
                                              Forum moderator
                                              Joined: Sep 5 04
                                              Posts: 5367
                                              Credit: 8,876,229
                                              RAC: 549
                                              Message 20297 - Posted 15 Feb 2006 18:46:58 UTC

                                                Scarf
                                                Which old Project?
                                                If you mean slab models, they finished last year.

                                                Jean-David Beyer
                                                Joined: Aug 5 04
                                                Posts: 96
                                                Credit: 1,819,031
                                                RAC: 284
                                                Message 20348 - Posted 16 Feb 2006 14:21:28 UTC - in response to Message 19468.

                                                  I've e-mailed Tolu about it and he knows there's a problem. When it will be fixed...given the upcoming coupled launch...??


                                                  I have a 4.22 one that is still working, but all the 4.23 ones fail after a little while (as is well known, now).

                                                  But should we have boincmgr boycott all new climateprediction work units for a while? Is there any benefit for computing part way through phase 1 and bombing? I know I get credit for doing that, but if it does no good, I would rather use the processor time for something else.
                                                  ____________

                                                  Les Bayliss
                                                  Forum moderator
                                                  Joined: Sep 5 04
                                                  Posts: 5367
                                                  Credit: 8,876,229
                                                  RAC: 549
                                                  Message 20355 - Posted 16 Feb 2006 19:33:21 UTC

                                                    If you can finish phase 1 and upload it, then it will be very useful, as it contains extra data that the researchers need for starting the TCM here.

                                                    Jean-David Beyer
                                                    Joined: Aug 5 04
                                                    Posts: 96
                                                    Credit: 1,819,031
                                                    RAC: 284
                                                    Message 20358 - Posted 16 Feb 2006 19:47:52 UTC - in response to Message 20355.

                                                      If you can finish phase 1 and upload it, then it will be very useful, as it contains extra data that the researchers need for starting the TCM here.



                                                      Nope: most recent failure got up only to phase 1 timestep 108020 in work unit sulphur_hbbt_100807833_0 a.k.a. Result # 1792194
                                                      ____________

                                                      Thyme Lawn
                                                      Forum moderator
                                                      Joined: Aug 5 04
                                                      Posts: 1232
                                                      Credit: 10,354,096
                                                      RAC: 1,273
                                                      Message 20360 - Posted 16 Feb 2006 20:37:11 UTC

                                                        You could always set CPDN to get no new work and connect to the BBC coupled model project (http://bbc.cpdn.org) instead.
                                                        ____________
                                                        "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer

                                                        Jean-David Beyer
                                                        Joined: Aug 5 04
                                                        Posts: 96
                                                        Credit: 1,819,031
                                                        RAC: 284
                                                        Message 20366 - Posted 16 Feb 2006 22:18:42 UTC - in response to Message 20360.

                                                          You could always set CPDN to get no new work and connect to the BBC coupled model project (http://bbc.cpdn.org) instead.


                                                          OK. I hope someone posts here when climateprediction starts sending out work units that I can process on Linux when the time comes.
                                                          ____________

                                                          cjrt
                                                          Joined: Sep 19 05
                                                          Posts: 4
                                                          Credit: 1,215,336
                                                          RAC: 0
                                                          Message 20489 - Posted 19 Feb 2006 14:05:06 UTC - in response to Message 20366.

                                                            You could always set CPDN to get no new work and connect to the BBC coupled model project (http://bbc.cpdn.org) instead.


                                                            Incidentally, as far as I can see, the BBC one, which starts at 1920, is in fact the beginning of the new, whizzy, exciting next-gen experiment 2, for which various people have been working their systems so hard doing the spin-up. It's not a cut-down, super-stable cpdn, as I have seen suggested in one or two threads (not this one); it's the future, man! (Please, if I'm wrong about this, someone let me know.)

                                                            As I have one Windows box on normal cpdn and one Linux box now on bbc.cpdn, they're technically on two different projects, which offends my tidy mind, but at least they're both working now!

                                                            Chris
                                                            ____________

                                                            Les Bayliss
                                                            Forum moderator
                                                            Joined: Sep 5 04
                                                            Posts: 5367
                                                            Credit: 8,876,229
                                                            RAC: 549
                                                            Message 20509 - Posted 19 Feb 2006 19:42:17 UTC

                                                              Last modified: 19 Feb 2006 19:43:31 UTC

                                                              cjrt
                                                              You got it. Welcome to the future.
                                                              Coming soon to this site as well. (As soon as the dust settles.)

                                                              If you're a multi-project person, the way to run the BBC project is to use a normal 5.2.13 version of BOINC and just attach to the BBC project, the same as any other.

                                                              This new model type starts at the end of 1920, just before New Year's.

                                                              Scarf
                                                              Joined: Aug 25 04
                                                              Posts: 9
                                                              Credit: 10,927,709
                                                              RAC: 4,526
                                                              Message 20620 - Posted 22 Feb 2006 16:02:32 UTC - in response to Message 20297.

                                                                Scarf
                                                                Which old Project?
                                                                If you mean slab models, they finished last year.


                                                                Yes, I mean the slab models; hadsm was the client name for that, as far as I know.
                                                                I guess I have to wait for a new sulphur client or the start of the new model for regular cpdn.

                                                                cjrt
                                                                Joined: Sep 19 05
                                                                Posts: 4
                                                                Credit: 1,215,336
                                                                RAC: 0
                                                                Message 20657 - Posted 23 Feb 2006 10:28:59 UTC - in response to Message 20509.

                                                                  cjrt
                                                                  If you're a multi-project person, the way to run the BBC project is to use a normal 5.2.13 version of BOINC and just attach to the BBC project, the same as any other.

                                                                  Yes, that's what I did, but I've still ended up with one (Windows) system running cpdn and one (Linux) system attached to the bbc.cpdn.org website. They appear to me to be running as totally separate projects, though both under BOINC. Did I do it wrong?

                                                                  Chris
                                                                  ____________

                                                                  Les Bayliss
                                                                  Forum moderator
                                                                  Joined: Sep 5 04
                                                                  Posts: 5367
                                                                  Credit: 8,876,229
                                                                  RAC: 549
                                                                  Message 20665 - Posted 23 Feb 2006 14:19:35 UTC

                                                                    They are indeed, totally separate projects, with totally separate listings for credits on stats sites.

                                                                    LS Diseño
                                                                    Joined: Nov 4 04
                                                                    Posts: 16
                                                                    Credit: 11,577,003
                                                                    RAC: 0
                                                                    Message 20673 - Posted 23 Feb 2006 16:24:02 UTC

                                                                      I have decided to process the sulphur workunits using 4.21 until the coupled model becomes available or a new/old version of the linux application is released as 4.24+. I do not know if I will be able to upload the result, but I can always revert to 4.23 for doing that.

                                                                      Let's see what happens...


                                                                      ____________

                                                                      Profile geophi
                                                                      Forum moderator
                                                                      Joined: Aug 7 04
                                                                      Posts: 1475
                                                                      Credit: 22,606,103
                                                                      RAC: 2,242
                                                                      Message 20697 - Posted 24 Feb 2006 2:36:36 UTC - in response to Message 20673.

                                                                        I have decided to process the sulphur workunits using 4.21 until the coupled model becomes available or a new/old version of the linux application is released as 4.24+. I do not know if I will be able to upload the result, but I can always revert to 4.23 for doing that.

                                                                        I wish there was an easy way to revert to 4.21.

                                                                        LS Diseño
                                                                        Joined: Nov 4 04
                                                                        Posts: 16
                                                                        Credit: 11,577,003
                                                                        RAC: 0
                                                                        Message 20713 - Posted 24 Feb 2006 9:01:47 UTC

                                                                          Last modified: 24 Feb 2006 9:22:07 UTC

                                                                          I just dropped the 4.21 executables in the climateprediction.net directory and edited the client_state.xml file to add the (file_info) and (app_version) bits. Then I changed the version of the application in each of the (workunit) entries. When I restarted the client, it complained about a couple of shared memory symbols (I believe this is because of the graphics, which I do not use), but it continued running using 4.21 all right.

                                                                          So far, the clients I modified have trickled, but I have not yet reached the point where I have to upload a file.

                                                                          Edit: Replaced angle brackets with parentheses (), as angle brackets do not show.
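The workunit step described above could be sketched in Python roughly as follows. This is only an illustration, not the procedure LS Diseño actually used: it assumes each workunit entry in client_state.xml carries an app_name and a version_num element, and that versions are stored as integers such as 423 and 421. The sample XML, the workunit names and the function name retarget_workunits are all made up for the example. Stop the BOINC client and back up client_state.xml before trying anything like this on a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for client_state.xml.
# The element names are an assumption about the file layout.
SAMPLE_STATE = """<client_state>
  <workunit>
    <name>sulphur_example_1</name>
    <app_name>sulphur</app_name>
    <version_num>423</version_num>
  </workunit>
  <workunit>
    <name>other_example_1</name>
    <app_name>other_app</app_name>
    <version_num>423</version_num>
  </workunit>
</client_state>"""


def retarget_workunits(xml_text, app_name, old_version, new_version):
    """Point every matching workunit at new_version.

    Returns the modified XML text and the number of workunits changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for wu in root.iter("workunit"):
        app = (wu.findtext("app_name") or "").strip()
        version = wu.find("version_num")
        if version is None:
            continue
        current = (version.text or "").strip()
        # Only touch workunits of the named app that use the old version.
        if app == app_name and current == str(old_version):
            version.text = str(new_version)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


new_state, changed = retarget_workunits(SAMPLE_STATE, "sulphur", 423, 421)
print("workunits changed:", changed)
```

Only workunits whose application name matches are touched, so entries belonging to other projects stay as they were.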
                                                                          ____________

                                                                          LS Diseño
                                                                          Joined: Nov 4 04
                                                                          Posts: 16
                                                                          Credit: 11,577,003
                                                                          RAC: 0
                                                                          Message 20714 - Posted 24 Feb 2006 9:12:57 UTC

                                                                            To follow up with this issue, I remember the team re-released hadsm3 (slab) 4.04 as 4.13 when 4.10, 4.11 and 4.12 were found unstable, much to my disappointment because I never had problems with them and 4.1x were much faster on Athlon 64 processors.

                                                                            Do you know if re-releasing sulphur 4.21 as 4.24 involves a lot of work? Does it require regenerating the workunits that are queued to be processed? The answer is probably yes :-(

                                                                            ____________

                                                                            cjrt
                                                                            Joined: Sep 19 05
                                                                            Posts: 4
                                                                            Credit: 1,215,336
                                                                            RAC: 0
                                                                            Message 20745 - Posted 25 Feb 2006 14:25:04 UTC - in response to Message 20665.

                                                                              They are indeed, totally separate projects, with totally separate listings for credits on stats sites.

                                                                              Thanks for the info, Les. Good thing I'm not in it for huge credit scores, then :-) (Never going to happen with an Athlon XP 3200 & an XP 2600 anyway!)

                                                                              Chris
                                                                              ____________

                                                                              LS Diseño
                                                                              Joined: Nov 4 04
                                                                              Posts: 16
                                                                              Credit: 11,577,003
                                                                              RAC: 0
                                                                              Message 20850 - Posted 28 Feb 2006 13:39:28 UTC - in response to Message 20673.

                                                                                I have decided to process the sulphur workunits using 4.21 until the coupled model becomes available or a new/old version of the linux application is released as 4.24+. I do not know if I will be able to upload the result, but I can always revert to 4.23 for doing that.

                                                                                Let's see what happens...



                                                                                It seems to work. I just uploaded the intermediate xx...xxxx_1.zip file at the end of phase 1.

                                                                                ____________

                                                                                Profile geophi
                                                                                Forum moderator
                                                                                Joined: Aug 7 04
                                                                                Posts: 1475
                                                                                Credit: 22,606,103
                                                                                RAC: 2,242
                                                                                Message 20851 - Posted 28 Feb 2006 16:54:15 UTC - in response to Message 20850.

                                                                                  Last modified: 28 Feb 2006 17:16:46 UTC

                                                                                  I have decided to process the sulphur workunits using 4.21 until the coupled model becomes available or a new/old version of the linux application is released as 4.24+. I do not know if I will be able to upload the result, but I can always revert to 4.23 for doing that.

                                                                                  Let's see what happens...



                                                                                  It seems to work. I just uploaded the intermediate xx...xxxx_1.zip file at the end of phase 1.

                                                                                  In order for others who are very persistent to be able to do this, they would need access to the 4.21 executables, zip files, and the associated changes in the snippets of code in the xml files. I know Honza wrote something on upgrading here so I imagine one can attempt to follow that (with appropriate changes) to downgrade, but the files are needed (downloadable) and the file signatures.

                                                                                  LS Diseño
                                                                                  Joined: Nov 4 04
                                                                                  Posts: 16
                                                                                  Credit: 11,577,003
                                                                                  RAC: 0
                                                                                  Message 20853 - Posted 28 Feb 2006 18:22:33 UTC - in response to Message 20851.


                                                                                    In order for others who are very persistent to be able to do this, they would need access to the 4.21 executables, zip files, and the associated changes in the snippets of code in the xml files. I know Honza wrote something on upgrading here so I imagine one can attempt to follow that (with appropriate changes) to downgrade, but the files are needed (downloadable) and the file signatures.


                                                                                    Certainly. Do you know where I can upload the files?

                                                                                    ____________

                                                                                    Profile geophi
                                                                                    Forum moderator
                                                                                    Joined: Aug 7 04
                                                                                    Posts: 1475
                                                                                    Credit: 22,606,103
                                                                                    RAC: 2,242
                                                                                    Message 20857 - Posted 28 Feb 2006 19:01:56 UTC - in response to Message 20853.

                                                                                      Last modified: 1 Mar 2006 15:18:29 UTC

                                                                                      Certainly. Do you know where I can upload the files?

                                                                                      I found these files at the URLs below:

                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu
                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu.so
                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_data_4.21_i686-pc-linux-gnu.zip
                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_um_4.21_i686-pc-linux-gnu.zip
                                                                                      http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_se_4.21_i686-pc-linux-gnu.zip

                                                                                      but we'll need the file_signature information for those mentioned in the xml file.
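For anyone collecting these files, a small Python sketch along the following lines might save some typing. It only rebuilds the five URLs listed above from their shared base path and fetches whichever files are not already present. The function names and the target directory are hypothetical, and the server may no longer serve these files, so nothing is downloaded unless fetch_missing is called explicitly.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base path and file names copied from the URLs listed in the post above.
BASE_URL = "http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/"
FILES = [
    "sulphur_4.21_i686-pc-linux-gnu",
    "sulphur_4.21_i686-pc-linux-gnu.so",
    "sulphur_data_4.21_i686-pc-linux-gnu.zip",
    "sulphur_um_4.21_i686-pc-linux-gnu.zip",
    "sulphur_se_4.21_i686-pc-linux-gnu.zip",
]


def download_plan(target_dir):
    """Return (url, local_path) pairs without touching the network."""
    target = Path(target_dir)
    return [(BASE_URL + name, target / name) for name in FILES]


def fetch_missing(target_dir):
    """Download only the files that are not already in target_dir."""
    fetched = []
    for url, path in download_plan(target_dir):
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            urlretrieve(url, str(path))
            fetched.append(path.name)
    return fetched


# "project_dir" is a placeholder; substitute the real project directory.
plan = download_plan("project_dir")
for url, path in plan:
    print(url, "->", path)
```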

                                                                                      LS Diseño
                                                                                      Joined: Nov 4 04
                                                                                      Posts: 16
                                                                                      Credit: 11,577,003
                                                                                      RAC: 0
                                                                                      Message 20859 - Posted 28 Feb 2006 19:19:48 UTC - in response to Message 20857.

                                                                                        Last modified: 28 Feb 2006 19:45:36 UTC


                                                                                        I found these files at the URLs below:

                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu
                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu.so
                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_data_4.21_i686-pc-linux-gnu.zip
                                                                                        http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_um_4.21_i686-pc-linux-gnu.zip

                                                                                        but we'll need the file_signature information for those mentioned in the xml file.


                                                                                        You also need sulphur_se_4.21_i686-pc-linux-gnu.zip

                                                                                        The signature is:

                                                                                        (file_info)
                                                                                        (name)sulphur_4.21_i686-pc-linux-gnu(/name)
                                                                                        (nbytes)4259364.000000(/nbytes)
                                                                                        (max_nbytes)0.000000(/max_nbytes)
                                                                                        (status)1(/status)
                                                                                        (executable/)
                                                                                        (signature_required/)
                                                                                        (file_signature)
                                                                                        765a986e0c67b95cab0da0605be738abdeaa9791abeb36b20e596ff344044513
                                                                                        ed1a2cfa86bec894646149fda4b5da473b56d8d6b60acbb435fa46aa43787e2b
                                                                                        cf7e7d3e0cf0e8e00338257752e9f1e67363d3597d6b26414122d0cb89a7a6ce
                                                                                        0d3c6fb8b82ce3144a5f3762872eace49007cecbd04f7439d4891a7dca32a07b
                                                                                        .
                                                                                        (/file_signature)
                                                                                        (url)http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu(/url)
                                                                                        (/file_info)
                                                                                        (file_info)
                                                                                        (name)sulphur_4.21_i686-pc-linux-gnu.so(/name)
                                                                                        (nbytes)8281400.000000(/nbytes)
                                                                                        (max_nbytes)0.000000(/max_nbytes)
                                                                                        (status)1(/status)
                                                                                        (executable/)
                                                                                        (signature_required/)
                                                                                        (file_signature)
                                                                                        91f02c74278d1d3da254aaf0d33d3141ace1ea503ace9f52e3e8ea9436985048
                                                                                        c8d05408802b1c232303929358442549cb817f3b49cb59b8fc24ae7479d1ae31
                                                                                        d50102c97c4048faeb0424fad737790d95268885a627204c18a6537be8f6de83
                                                                                        eb4f2daa311af2512a1d94a2a332d9932a4af6a319d26358109ccec9f996f805
                                                                                        .
                                                                                        (/file_signature)
                                                                                        (url)http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_4.21_i686-pc-linux-gnu.so(/url)
                                                                                        (/file_info)
                                                                                        (file_info)
                                                                                        (name)sulphur_data_4.21_i686-pc-linux-gnu.zip(/name)
                                                                                        (nbytes)24047629.000000(/nbytes)
                                                                                        (max_nbytes)0.000000(/max_nbytes)
                                                                                        (status)1(/status)
                                                                                        (signature_required/)
                                                                                        (file_signature)
                                                                                        91e4e438a17842b73758007a92c4ffea63b9ade84211fd34038689b7b672eee8
                                                                                        d80e2a6c146465ce74c6aad4edab79c539a9dfddecbbc9ed93385bcfe8c533cd
                                                                                        3d2ff4f996e980262cfb5cd523b7acdec0352d19b46545407a3a67de6edf2c6e
                                                                                        8b63a505d5e2a9b89a2e6f6167a49cac0d0b04ce93772e25802807b0dc997dc1
                                                                                        .
                                                                                        (/file_signature)
                                                                                        (url)http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_data_4.21_i686-pc-linux-gnu.zip(/url)
                                                                                        (/file_info)
                                                                                        (file_info)
                                                                                        (name)sulphur_um_4.21_i686-pc-linux-gnu.zip(/name)
                                                                                        (nbytes)4563697.000000(/nbytes)
                                                                                        (max_nbytes)0.000000(/max_nbytes)
                                                                                        (status)1(/status)
                                                                                        (signature_required/)
                                                                                        (file_signature)
                                                                                        44040e3d2012e67ece56d07b9ec38910d5e903eeb0093bbda9a7cc7047c00a8d
                                                                                        7f8ec82724b0fc1f2cc3f66f274c5bfe90f10397b18972e71af3111ff025037f
                                                                                        50b5dbad2ea09bd16505c580aace29a846bcb110b547bbd173cb6c8aefad5a4a
                                                                                        e6dd8c33064330452ecaaa42aad23c7daceb2234b5221637454391829091338f
                                                                                        .
                                                                                        (/file_signature)
                                                                                        (url)http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_um_4.21_i686-pc-linux-gnu.zip(/url)
                                                                                        (/file_info)
                                                                                        (file_info)
                                                                                        (name)sulphur_se_4.21_i686-pc-linux-gnu.zip(/name)
                                                                                        (nbytes)5639582.000000(/nbytes)
                                                                                        (max_nbytes)0.000000(/max_nbytes)
                                                                                        (status)1(/status)
                                                                                        (signature_required/)
                                                                                        (file_signature)
                                                                                        09a841d58f5030fbd709c9027d05789a4ea1b2d9e2b3f5d9f5bb51c99128a918
                                                                                        a1c4e005635588ad44301705c947678ee3022053eb9661bd07d759ea9a9fe979
                                                                                        ef154e7b1041ec7bb51f96ed1482a35f8c5591eb9b78da92401b7c14600d40a4
                                                                                        e615c73c55843b82d3c17a3a3a4e2f5b09e61333aeaff1e79a3a4c3bf75d43e0
                                                                                        .
                                                                                        (/file_signature)
                                                                                        (url)http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/sulphur_se_4.21_i686-pc-linux-gnu.zip(/url)
                                                                                        (/file_info)

                                                                                        And you also have to add the (app_version):

                                                                                        (app_version)
                                                                                        (app_name)sulphur_cycle(/app_name)
                                                                                        (version_num)421(/version_num)
                                                                                        (file_ref)
                                                                                        (file_name)sulphur_4.21_i686-pc-linux-gnu(/file_name)
                                                                                        (main_program/)
                                                                                        (/file_ref)
                                                                                        (file_ref)
                                                                                        (file_name)sulphur_4.21_i686-pc-linux-gnu.so(/file_name)
                                                                                        (open_name)sulphur_4.21_i686-pc-linux-gnu.so(/open_name)
                                                                                        (/file_ref)
                                                                                        (file_ref)
                                                                                        (file_name)sulphur_data_4.21_i686-pc-linux-gnu.zip(/file_name)
                                                                                        (open_name)sulphur_data_4.21_i686-pc-linux-gnu.zip(/open_name)
                                                                                        (/file_ref)
                                                                                        (file_ref)
                                                                                        (file_name)sulphur_um_4.21_i686-pc-linux-gnu.zip(/file_name)
                                                                                        (open_name)sulphur_um_4.21_i686-pc-linux-gnu.zip(/open_name)
                                                                                        (/file_ref)
                                                                                        (file_ref)
                                                                                        (file_name)sulphur_se_4.21_i686-pc-linux-gnu.zip(/file_name)
                                                                                        (open_name)sulphur_se_4.21_i686-pc-linux-gnu.zip(/open_name)
                                                                                        (/file_ref)
                                                                                        (/app_version)
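A minimal sketch, not part of the original instructions, for sanity-checking the edit before restarting BOINC: every file named in a file_ref of the new (app_version) should also have its own (file_info) entry. It assumes the real client_state.xml uses angle brackets where this post shows parentheses. The helper name `missing_file_infos` and the `SAMPLE` fragment are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET


def missing_file_infos(client_state_xml, version_num):
    """Return file names referenced by the given app_version that have no file_info entry."""
    root = ET.fromstring(client_state_xml)
    # Collect every file name that has a file_info block anywhere in the document.
    known = {fi.findtext("name", "").strip() for fi in root.iter("file_info")}
    missing = []
    for app_version in root.iter("app_version"):
        if app_version.findtext("version_num", "").strip() != version_num:
            continue
        for ref in app_version.iter("file_ref"):
            name = ref.findtext("file_name", "").strip()
            if name not in known:
                missing.append(name)
    return missing


# Hypothetical, heavily shortened fragment; a real client_state.xml holds far more.
SAMPLE = """<client_state>
  <file_info><name>sulphur_um_4.21_i686-pc-linux-gnu.zip</name></file_info>
  <app_version>
    <app_name>sulphur_cycle</app_name>
    <version_num>421</version_num>
    <file_ref><file_name>sulphur_um_4.21_i686-pc-linux-gnu.zip</file_name></file_ref>
    <file_ref><file_name>sulphur_se_4.21_i686-pc-linux-gnu.zip</file_name></file_ref>
  </app_version>
</client_state>"""

if __name__ == "__main__":
    for name in missing_file_infos(SAMPLE, "421"):
        print("no file_info for", name)
```

Run against a copy of client_state.xml while the client is stopped, an empty result would suggest that no (file_info) block was left out for version 421.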


                                                                                        Written in anger: BBCode tags suck!

                                                                                        Anyway, I have an HTML page with the downgrade instructions and the XML changes to the client_state.xml file listed above.

                                                                                        ____________

                                                                                        LS Diseño
                                                                                        Joined: Nov 4 04
                                                                                        Posts: 16
                                                                                        Credit: 11,577,003
                                                                                        RAC: 0
                                                                                        Message 20862 - Posted 28 Feb 2006 20:14:20 UTC - in response to Message 20859.

                                                                                          OK. Try:

                                                                                          Instructions to downgrade from sulphur 4.23 to 4.21 on Linux

                                                                                          ____________

geophi
Forum moderator
                                                                                          Joined: Aug 7 04
                                                                                          Posts: 1475
                                                                                          Credit: 22,606,103
                                                                                          RAC: 2,242
                                                                                          Message 20889 - Posted 1 Mar 2006 5:30:27 UTC

I'll give it a try later today. Thanks for the effort and I'll let you know how it goes.

geophi
Forum moderator
                                                                                            Joined: Aug 7 04
                                                                                            Posts: 1475
                                                                                            Credit: 22,606,103
                                                                                            RAC: 2,242
                                                                                            Message 20939 - Posted 1 Mar 2006 21:37:29 UTC

                                                                                              LS Diseño

                                                                                              Thanks. Worked great. There were some complications because I had both 4.22 and 4.23 entries in my client_state file, but I figured it out. Running much faster than the 4.22 executables that were crunching before.

                                                                                              LS Diseño
                                                                                              Joined: Nov 4 04
                                                                                              Posts: 16
                                                                                              Credit: 11,577,003
                                                                                              RAC: 0
                                                                                              Message 20943 - Posted 1 Mar 2006 22:17:10 UTC - in response to Message 20939.

                                                                                                Glad it worked for you! I also have a host with 4.22, and I left it at that as it was crunching through phase 3 already.
                                                                                                ____________

                                                                                                Steve Ross
                                                                                                Joined: Sep 12 04
                                                                                                Posts: 7
                                                                                                Credit: 515,736
                                                                                                RAC: 0
                                                                                                Message 20949 - Posted 2 Mar 2006 0:54:25 UTC - in response to Message 20943.

I have tried the method you described, but each time CPDN downloads a new workunit it switches back to 4.23. What am I missing?

                                                                                                  Please help.



                                                                                                  ____________

geophi
Forum moderator
                                                                                                  Joined: Aug 7 04
                                                                                                  Posts: 1475
                                                                                                  Credit: 22,606,103
                                                                                                  RAC: 2,242
                                                                                                  Message 20953 - Posted 2 Mar 2006 3:46:41 UTC - in response to Message 20949.

                                                                                                    Last modified: 2 Mar 2006 3:47:01 UTC

                                                                                                    What error messages do you get in the Messages tab when this occurs?

                                                                                                    Steve Ross
                                                                                                    Joined: Sep 12 04
                                                                                                    Posts: 7
                                                                                                    Credit: 515,736
                                                                                                    RAC: 0
                                                                                                    Message 20956 - Posted 2 Mar 2006 5:29:57 UTC - in response to Message 20953.

                                                                                                      I was overwriting the 4.23 sections rather than adding in new 4.21 sections in client_state.xml. I have got it running, thanks for reply.

But now the GUI is not updating CPU time and progress. The client app is running and showing no errors, but I have noticed that when BOINC cycles to another app and suspends CPDN, it rewinds to the beginning of the last saved timestep (the archived BOINC copy) when starting CPDN again, as if it were not doing data dumps and checkpoints.

Help appreciated.



                                                                                                      ____________

                                                                                                      Steve Ross
                                                                                                      Joined: Sep 12 04
                                                                                                      Posts: 7
                                                                                                      Credit: 515,736
                                                                                                      RAC: 0
                                                                                                      Message 20957 - Posted 2 Mar 2006 5:37:08 UTC - in response to Message 20956.

Sorry geophi, my last post was incorrect: the checkpoints and timesteps are working correctly, and only the GUI is not updating. I use a Windows 2K box to monitor the other boxes running BOINC, and its CPU time, progress and to completion fields are not updating.

Regards


                                                                                                        ____________

                                                                                                        Steve Ross
                                                                                                        Joined: Sep 12 04
                                                                                                        Posts: 7
                                                                                                        Credit: 515,736
                                                                                                        RAC: 0
                                                                                                        Message 20970 - Posted 2 Mar 2006 12:48:18 UTC - in response to Message 20957.

                                                                                                          to add to this...

It seems that 4.21 has a problem running on Red Hat 8.0 (gcc v3.22): I get a sulphur-4.21__um zombie when the process starts, and the WU gets stuck starting up the model. I have tried using sulphur-4.23_um instead and the WU runs (15 hours so far), but this stops the GUI updating (I think because client_state.xml is not being updated periodically?).
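A rough sketch for confirming the zombie from a script rather than by eye. It reads the standard Linux /proc/<pid>/stat files, where the third field is the process state and "Z" marks a zombie. The function names and the "sulphur" search fragment are illustrative assumptions, not anything shipped with CPDN or BOINC.

```python
import os


def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, command, state)."""
    # The command sits in parentheses and may itself contain spaces or
    # parentheses, so cut at the first "(" and the last ")".
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    command = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, command, state


def find_zombies(name_fragment, proc_root="/proc"):
    """List (pid, command) for zombie processes whose name contains name_fragment."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # the process went away while we were scanning
        if state == "Z" and name_fragment in command:
            zombies.append((pid, command))
    return zombies


if __name__ == "__main__":
    for pid, command in find_zombies("sulphur"):
        print(pid, command)
```

The kernel truncates the command name in this file to 15 characters, so searching for a short fragment such as "sulphur" is safer than searching for the full executable name.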

                                                                                                          Is there a problem with the libraries with 4.21 and RH8.0 ?

                                                                                                          I just wish I could get some wu done on linux with sulphur !!!! 8((

                                                                                                          regards

                                                                                                          Steve R

                                                                                                          ____________

                                                                                                          LS Diseño
                                                                                                          Joined: Nov 4 04
                                                                                                          Posts: 16
                                                                                                          Credit: 11,577,003
                                                                                                          RAC: 0
                                                                                                          Message 21053 - Posted 4 Mar 2006 17:58:11 UTC - in response to Message 20970.

                                                                                                            (I was away on travel and I did not read the posts above until now)

I believe the problem with the GUI not updating might be related to the warning messages 4.21 produces on startup when it tries to set pointers to the shared memory segment it allocates. At first I thought the only side effect would be a lack of graphics, but I think the GUI also uses this memory segment to exchange information with the BOINC client. As I do not use the GUI much, because I find it faster to edit the client_state.xml file, I had not noticed this problem.

On the other hand, my experience is that the client seems to crunch workunits fine on Red Hat Linux (so Fedora should be all right too) and on MEPIS. I have no direct experience with other Linux distributions besides Knoppix, which seems to be all right too. However, I have noticed odd behaviour from other CPDN applications in the past. For example, the hadcm3 spinup application produces a lot of zombie processes when running on Fedora, something that does not happen on other Linux distributions.

                                                                                                            Sorry I cannot be of more help.
                                                                                                            ____________

                                                                                                            Melvyn Bobo Slacke
                                                                                                            Joined: Aug 16 04
                                                                                                            Posts: 124
                                                                                                            Credit: 4,307,602
                                                                                                            RAC: 934
                                                                                                            Message 21234 - Posted 13 Mar 2006 16:28:17 UTC

                                                                                                              Sorrow days are over, 2006-03-13 16:17:48 [climateprediction.net] Started download of hadcm3trans_5.08_i686-pc-linux-gnu
                                                                                                              :)

                                                                                                              arcturus
                                                                                                              Joined: Aug 31 04
                                                                                                              Posts: 6
                                                                                                              Credit: 229,431
                                                                                                              RAC: 0
                                                                                                              Message 21346 - Posted 16 Mar 2006 18:42:53 UTC - in response to Message 21234.

                                                                                                                Last modified: 16 Mar 2006 18:43:51 UTC

                                                                                                                What exactly does this mean? Is it ok to run climate on linux once again? Is there official word? Was there any official notice that the linux client was hosed other than having to rummage through the forums?

                                                                                                                ____________

                                                                                                                Jean-David Beyer
                                                                                                                Joined: Aug 5 04
                                                                                                                Posts: 96
                                                                                                                Credit: 1,819,031
                                                                                                                RAC: 284
                                                                                                                Message 21350 - Posted 16 Mar 2006 20:43:14 UTC - in response to Message 21346.


                                                                                                                  I am not sure. I re-enabled climateprediction downloads, and it downloaded me three homework assignments. Two were hadsm3lb 5.08 (these may be good ones) and one sulfur_cycle 4.23 (that I think is a bad one).
                                                                                                                  ____________

                                                                                                                  Melvyn Bobo Slacke
                                                                                                                  Joined: Aug 16 04
                                                                                                                  Posts: 124
                                                                                                                  Credit: 4,307,602
                                                                                                                  RAC: 934
                                                                                                                  Message 21361 - Posted 16 Mar 2006 22:08:08 UTC

                                                                                                                    Last modified: 16 Mar 2006 22:13:04 UTC

It's the new HadCM3L Coupled Model Experiment and it popped up in the Applications tab on Monday.
                                                                                                                    Been running fine on my Fedora boxen for a couple of weeks as a beta and in the BBC project.

                                                                                                                    copycat
                                                                                                                    Joined: Feb 24 05
                                                                                                                    Posts: 28
                                                                                                                    Credit: 87,911
                                                                                                                    RAC: 32
                                                                                                                    Message 22360 - Posted 24 Apr 2006 22:05:27 UTC

I'm just wondering, is this

                                                                                                                      sulphur_ilto_000868092 - PH 1 TS 0103459 A - 26/11/1816 09:30 - H:M:S=0078:58:21 AVG= 2.75 DLT= 1.88
                                                                                                                      2006-04-22 23:23:00 [climateprediction.net] Unrecoverable error for result sulphur_ilto_000868092_0 (process got signal 11)

peculiar behavior also caused by the unstable version 4.23? I've never had a full run, due to a) memory problems and b) trying to debug in wine whilst CPDN was running, but with this one I've gotten farther than before. This one never rewound though; just out of the blue a signal 11, bam!
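For anyone comparing crash points across hosts, here is a small sketch that pulls the numbers out of the two log lines quoted above. Reading PH as the phase and TS as the timestep is an assumption taken from how they are discussed in this thread; the AVG and DLT fields are left uninterpreted. The function names are made up for illustration. Signal 11 is SIGSEGV (a segmentation fault) on Linux.

```python
import re
import signal

# Matches the model status line, e.g.
# "sulphur_ilto_000868092 - PH 1 TS 0103459 A - 26/11/1816 09:30 - H:M:S=0078:58:21 ..."
STATUS_RE = re.compile(
    r"(?P<workunit>\S+) - PH\s*(?P<phase>\d+) TS\s*(?P<timestep>\d+) \S+ - "
    r"(?P<model_date>\d{2}/\d{2}/\d{4} \d{2}:\d{2}) - "
    r"H:M:S=(?P<elapsed>\d+:\d{2}:\d{2})"
)
SIGNAL_RE = re.compile(r"process got signal (\d+)")


def parse_status(line):
    """Return the fields of a model status line as a dict, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.group("elapsed").split(":"))
    return {
        "workunit": match.group("workunit"),
        "phase": int(match.group("phase")),
        "timestep": int(match.group("timestep")),
        "model_date": match.group("model_date"),
        "elapsed_seconds": hours * 3600 + minutes * 60 + seconds,
    }


def signal_name(line):
    """Return the symbolic name of the signal in a BOINC error line, or None."""
    match = SIGNAL_RE.search(line)
    if match is None:
        return None
    try:
        return signal.Signals(int(match.group(1))).name
    except ValueError:
        return None  # number is not a signal known on this platform


STATUS_LINE = ("sulphur_ilto_000868092 - PH 1 TS 0103459 A - 26/11/1816 09:30 - "
               "H:M:S=0078:58:21 AVG= 2.75 DLT= 1.88")
ERROR_LINE = ("2006-04-22 23:23:00 [climateprediction.net] Unrecoverable error for "
              "result sulphur_ilto_000868092_0 (process got signal 11)")

if __name__ == "__main__":
    print(parse_status(STATUS_LINE))
    print(signal_name(ERROR_LINE))
```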
                                                                                                                      ____________





                                                                                                                      Copyright © 2002-2014 climateprediction.net