
HadAM3P-PNW disappeared?


Greg van Paassen
Joined: Nov 17 07
Posts: 142
Credit: 4,271,370
RAC: 79
Message 42008 - Posted 22 Apr 2011 20:50:21 UTC

    As of yesterday the "Server Status" page has been showing 0 HadAM3P-PNW tasks. The day before, there were 50,000-odd.

    Should there be an announcement?

    BTW I have just received 6 HadAM3P-PNWs, at least two of which are 'new' - first task for the work unit was issued after the Server Status changed to 0.

    Les Bayliss
    Forum moderator
    Joined: Sep 5 04
    Posts: 5338
    Credit: 8,876,229
    RAC: 549
    Message 42009 - Posted 22 Apr 2011 21:39:34 UTC - in response to Message 42008.

      No news at the moment.
      It's the Easter long weekend, so most of Oxford would be closed, and I've been sleeping for the last few hours.
      I've asked about it, but it may take a while for a reply.

      The EU pool is down to 9 at the moment as well.



      ____________
      Backups: Here

      Thyme Lawn
      Forum moderator
      Joined: Aug 5 04
      Posts: 1231
      Credit: 10,354,096
      RAC: 1,273
      Message 42010 - Posted 22 Apr 2011 22:15:50 UTC

        Last modified: 22 Apr 2011 22:39:15 UTC

        It looks like there were download problems earlier today. hadam3p_pnw_yyam_2005_1_006899510_0 (from the same WU as one of Greg's tasks) reported the following error at 22 Apr 2011 20:14:20 UTC:

        app_version download error: couldn't get input files:
        <file_xfer_error>
        <file_name>hadam3p_pnw_graphics_6.09_i686-pc-linux-gnu</file_name>
        <error_code>-224</error_code>
        <error_message>file not found</error_message>
        </file_xfer_error>

        I've run a browser check on http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadam3p_pnw_graphics_6.09_i686-pc-linux-gnu and it now seems to be available on the 3 mirror servers I'm aware of (http://uploader1.atm.ox.ac.uk, http://climateprediction.net and http://climateapps2.oucs.ox.ac.uk).
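For anyone who wants to pull the details out of an error report like the one quoted above, here is a minimal Python sketch. It assumes only that the report contains a well-formed <file_xfer_error> element as shown; the function name parse_file_xfer_error is made up for illustration and is not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the error report quoted in this post.
FRAGMENT = """<file_xfer_error>
<file_name>hadam3p_pnw_graphics_6.09_i686-pc-linux-gnu</file_name>
<error_code>-224</error_code>
<error_message>file not found</error_message>
</file_xfer_error>"""


def parse_file_xfer_error(fragment: str) -> dict:
    """Return the file name, numeric code and message from one error element.

    Hypothetical helper: it only reads the three child elements that
    appear in the report above.
    """
    root = ET.fromstring(fragment)
    return {
        "file_name": root.findtext("file_name"),
        "error_code": int(root.findtext("error_code")),
        "error_message": root.findtext("error_message"),
    }


if __name__ == "__main__":
    print(parse_file_xfer_error(FRAGMENT))
```

The file name it returns is the one to look for on the mirror servers.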
        ____________
        "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer

        Greg van Paassen
        Joined: Nov 17 07
        Posts: 142
        Credit: 4,271,370
        RAC: 79
        Message 42012 - Posted 23 Apr 2011 4:16:16 UTC

          Last modified: 23 Apr 2011 4:17:06 UTC

          Oh, OK. The drop from 50,000-plus down to 0 was so sudden, I thought that someone had "pulled the plug" on the PNW project. But I'd expect that they would tell the moderators if so. :-) If you guys don't know anything about the project being cancelled, it must be just a glitch in the server status page.

          Nigel Garvey
          Joined: May 5 10
          Posts: 33
          Credit: 578,760
          RAC: 234
          Message 42013 - Posted 23 Apr 2011 7:04:48 UTC

            The PNW app has also disappeared from the Applications page.

            http://climateapps2.oucs.ox.ac.uk/cpdnboinc/apps.php

            Greg van Paassen
            Joined: Nov 17 07
            Posts: 142
            Credit: 4,271,370
            RAC: 79
            Message 42028 - Posted 26 Apr 2011 19:44:55 UTC

              Last modified: 26 Apr 2011 19:48:06 UTC

              Now all my HadAM3P-PNWs have been marked "Didn't need". What's going on?

              Edit: correction - two of them are still "in progress", but the other four are "didn't need". Do I cancel the "didn't need"s?

              I see HadCM3N is back, on the "server status" page. Did I miss the memo?

              Les Bayliss
              Forum moderator
              Joined: Sep 5 04
              Posts: 5338
              Credit: 8,876,229
              RAC: 549
              Message 42029 - Posted 26 Apr 2011 21:32:01 UTC - in response to Message 42028.

                Presumably, when the data sets were removed from the download data pool, the BOINC server software took that to mean that none of the unreturned results were needed, and wrote the Not needed message into everyone's model pages.

                Note however, that Not needed isn't the same as not wanted by the researchers, who would still like to get their hands on as much data as possible, please.

                So if the models are still running, and haven't been killed off by some downloaded signal from Oxford, you should continue to crunch them.

                ******************

                There's been no memo, possibly because it was still the 'weekend' in the UK.
                What's going on is anyone's guess.

                There is, however, THIS memo about upgraded security measures on the alternative PHP board.


                ____________
                Backups: Here

                astroWX
                Forum moderator
                Joined: Aug 5 04
                Posts: 1294
                Credit: 37,520,270
                RAC: 17,812
                Message 42033 - Posted 27 Apr 2011 1:57:54 UTC

                  I received this message after #13 upload to Oxford. None of the earlier uploads (to U.Oregon) triggered a red message.

                  ____________
                  "We have met the enemy and he is us." -- Pogo
                  Greetings from coastal Washington state, the scenic US Pacific Northwest.

                  Jonathan Miller
                  Forum moderator
                  Project administrator
                  Project developer
                  Volunteer developer
                  Joined: Mar 28 11
                  Posts: 35
                  Credit: 82,588
                  RAC: 0
                  Message 42044 - Posted 27 Apr 2011 12:40:43 UTC

                    Last modified: 27 Apr 2011 13:15:00 UTC

                    DIDN'T NEED flag

                    Hi Everyone,

                    Some of the work units that get processed contain particular parameters that are of interest to the CPDN project. The BOINC system has a method for allowing us to gather more info on certain parameter sets by resubmitting a work unit to the pool of available work units.

                    The DIDN'T NEED flag means that the CPDN project did/do not need to resubmit the work unit for additional processing.

                    The flag can mean a number of things, and is combined with other flags in the database to determine exactly why we don't need to reprocess it. One of the common reasons is that the current run gives us exactly the info we need.

                    It is unfortunate that the flag gives the impression that we are not interested in the work unit - we certainly are interested.
                    We are looking into how we can make this more clear on the work unit info pages.

                    Please accept our apologies for any confusion or consternation this may have caused.

                    EDIT: We have now altered this flag to read "No Resubmission" which is a more accurate reflection of the status of the work unit.

                    Jonathan
                    CPDN SysAdmin

                    Greg van Paassen
                    Joined: Nov 17 07
                    Posts: 142
                    Credit: 4,271,370
                    RAC: 79
                    Message 42056 - Posted 28 Apr 2011 23:44:20 UTC

                      Re: Didn't need / No resubmission:

                      Fri 29 Apr 2011 10:02:11 NZST climateprediction.net Started upload of hadam3p_pnw_zjca_1969_1_006969986_2_13.zip
                      Fri 29 Apr 2011 10:02:13 NZST climateprediction.net Computation for task hadam3p_pnw_zjca_1969_1_006969986_2 finished
                      Fri 29 Apr 2011 10:10:59 NZST climateprediction.net Finished upload of hadam3p_pnw_zjca_1969_1_006969986_2_13.zip
                      Fri 29 Apr 2011 11:19:59 NZST climateprediction.net Sending scheduler request: To send trickle-up message.
                      Fri 29 Apr 2011 11:19:59 NZST climateprediction.net Reporting 1 completed tasks, not requesting new tasks
                      Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Scheduler request completed
                      Fri 29 Apr 2011 11:20:04 NZST climateprediction.net Message from server: Completed result hadam3p_pnw_zjca_1969_1_006969986_2 refused: this result wasn't sent (not needed)

                      That suggests that completed work is not getting through, whether or not the scientists want it. I'm still confused. I think I'd rather crunch something else.
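To check a longer client log for the same refusal, a rough Python sketch along these lines could be used. It assumes the log lines follow the format pasted above; the pattern and the function name find_refused_results are illustrative only and are not part of the BOINC client.

```python
import re

# Assumed message format, taken from the log lines pasted in this post.
REFUSED_RE = re.compile(
    r"Message from server: Completed result (?P<task>\S+) refused: (?P<reason>.+)$"
)


def find_refused_results(lines):
    """Return (task name, reason) pairs for every refused result in the log."""
    refused = []
    for line in lines:
        match = REFUSED_RE.search(line)
        if match:
            refused.append((match.group("task"), match.group("reason").strip()))
    return refused


# Two of the lines from the log above, used as sample input.
SAMPLE_LOG = [
    "Fri 29 Apr 2011 11:19:59 NZST climateprediction.net "
    "Reporting 1 completed tasks, not requesting new tasks",
    "Fri 29 Apr 2011 11:20:04 NZST climateprediction.net "
    "Message from server: Completed result "
    "hadam3p_pnw_zjca_1969_1_006969986_2 refused: "
    "this result wasn't sent (not needed)",
]

if __name__ == "__main__":
    for task, reason in find_refused_results(SAMPLE_LOG):
        print(task, "->", reason)
```

This only lists which tasks drew the message; it says nothing about whether the uploaded data reached the servers.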

                      Les Bayliss
                      Forum moderator
                      Joined: Sep 5 04
                      Posts: 5338
                      Credit: 8,876,229
                      RAC: 549
                      Message 42059 - Posted 29 Apr 2011 4:22:11 UTC - in response to Message 42056.

                        I've just received the same message.
                        It looks like the attempt to introduce a new label into the BOINC system hasn't worked. :(

                        I'll inform the project people.


                        ____________
                        Backups: Here

                        DaveG27
                        Joined: Nov 8 06
                        Posts: 18
                        Credit: 2,425,895
                        RAC: 0
                        Message 42062 - Posted 29 Apr 2011 17:05:23 UTC

                          I have had the same.
                          When I look at my tasks, instead of saying "completed" they say "No Resubmission". I get the feeling I am completely wasting my computer time. This can be seen on other work units too.

                          Les Bayliss
                          Forum moderator
                          Joined: Sep 5 04
                          Posts: 5338
                          Credit: 8,876,229
                          RAC: 549
                          Message 42063 - Posted 29 Apr 2011 17:30:23 UTC

                            Those people who feel that this new message means that they're wasting their time should stop crunching climate models, leave the project, and not come back!

                            For the rest of us, the data is stored on the servers, but it's a 4 day long weekend in the UK, so we have to wait until Tuesday morning UK time for the project people to return and kick the BOINC code until it behaves. :)


                            ____________
                            Backups: Here

                            DaveG27
                            Joined: Nov 8 06
                            Posts: 18
                            Credit: 2,425,895
                            RAC: 0
                            Message 42064 - Posted 29 Apr 2011 17:53:59 UTC

                              "Those people who feel that this new message means that they're wasting their time should stop crunching climate models, leave the project, and not come back!"

                              I have crunched this project since the BBC days, but if this is the new attitude I will take your advice, as the messages are no longer in plain English.
                              I have had very few failures, most of my models have been successful, and I have put up with some of the project's problems.

                              Greg van Paassen
                              Joined: Nov 17 07
                              Posts: 142
                              Credit: 4,271,370
                              RAC: 79
                              Message 42066 - Posted 29 Apr 2011 21:58:48 UTC - in response to Message 42064.

                                Dave, it's likely that this is just a 'learning the ropes' problem for the new project staff. And possibly Les was short of painkillers when he wrote that.

                                The HadCM3N models are working well. Just a couple of niggles: the initial duration estimate is about double the true figure (530 - 600 hours for your C2Qs), and they've a short deadline, which is really only indicative - the researchers will still use models that finish after the deadline. I'm crunching them, now.

                                DaveG27
                                Joined: Nov 8 06
                                Posts: 18
                                Credit: 2,425,895
                                RAC: 0
                                Message 42067 - Posted 29 Apr 2011 22:19:12 UTC

                                  Greg, I've calmed down now, but I was annoyed.
                                  I had noticed that I had 3 completed models with downloads stuck in the transfer tab; I do not check this often.
                                  I quickly realised it was due to a PNW running under Linux with the handler problem. I edited the client_state.xml file, which solved the problem but took hours off download. Anyway, when they were reported as completed I got that message in the messages, so I went to my account to see if they were completed, and that left me confused.

                                  astroWX
                                  Forum moderator
                                  Joined: Aug 5 04
                                  Posts: 1294
                                  Credit: 37,520,270
                                  RAC: 17,812
                                  Message 42068 - Posted 30 Apr 2011 2:17:00 UTC

                                    Last modified: 30 Apr 2011 2:27:17 UTC

                                    PNW's first twelve uploads go to the science database at U.Oregon. No red messages from them, eh?

                                    #13 upload, after task completion and full credits are awarded, is a restart dump sent to Oxford so the next segment of the sequence can start, supposedly where the last one ended. (Work was chopped into segments because people accustomed to tasks taking from minutes to a few hours elsewhere whined at length across the boards. Hence some of the current difficulties. [I don't envy the scientists working to understand segment differences run on different CPUs and OSs ...])
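Going by the file names in Greg's log earlier in the thread, the upload number appears to be the last part of the zip name (hadam3p_pnw_zjca_1969_1_006969986_2_13.zip for task hadam3p_pnw_zjca_1969_1_006969986_2). A small Python sketch under that assumption follows. The naming pattern is inferred from that one log, the destinations simply restate the description above, and describe_upload is a made-up name.

```python
import re
from typing import Tuple

# Assumed naming pattern: <task name>_<upload number>.zip
UPLOAD_RE = re.compile(r"^(?P<task>.+)_(?P<seq>\d+)\.zip$")


def describe_upload(file_name: str) -> Tuple[str, int, str]:
    """Split an upload file name into task name, upload number and destination.

    The destination follows the description in this post: uploads 1-12 go
    to U.Oregon and upload 13 (the restart dump) goes to Oxford.
    """
    match = UPLOAD_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected upload file name: {file_name}")
    seq = int(match.group("seq"))
    if 1 <= seq <= 12:
        destination = "U.Oregon"
    elif seq == 13:
        destination = "Oxford"
    else:
        destination = "unknown"
    return match.group("task"), seq, destination


if __name__ == "__main__":
    print(describe_upload("hadam3p_pnw_zjca_1969_1_006969986_2_13.zip"))
```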

                                    As I understand it, the new red message is a consequence of recent security changes to the boards to inhibit spam registrations. Not a secret, explanations were posted.

                                    I've been with CPDN since Original Beta, July 2003, and though, early-on (pre-boinc), we had to do some manual uploads, I don't recall any work being lost. (Early 14-day boinc timeout is another thing.)

                                    Despite a long history of under-staffing and a plethora of problems, many boinc-related, the Project has a good record of saving all our work.

                                    Hang in with Oxford's new IT team through its learning curve or bail out of a wounded but still-flying bird. Your choice.
                                    ____________
                                    "We have met the enemy and he is us." -- Pogo
                                    Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                    BigMike
                                    Joined: Apr 6 05
                                    Posts: 17
                                    Credit: 744,057
                                    RAC: 0
                                    Message 42116 - Posted 5 May 2011 4:36:33 UTC

                                      Messages and server problems aside, I'm still wondering why there is no work for PNW, while there is for the other regionals.

                                      Any explanation from the science group?

                                      =Mike

                                      astroWX
                                      Forum moderator
                                      Joined: Aug 5 04
                                      Posts: 1294
                                      Credit: 37,520,270
                                      RAC: 17,812
                                      Message 42117 - Posted 5 May 2011 5:35:15 UTC

                                        I have no new skinny but there was an issue with Linux tasks. Perhaps the new support team felt it safer to throttle PNW flavor (the one for my area of the planet) rather than tweak possibilities. I have all confidence that Andy and Jonathan will sort it all out in due time. Please hang in there!


                                        ____________
                                        "We have met the enemy and he is us." -- Pogo
                                        Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                        MacRonin
                                        Joined: Apr 24 08
                                        Posts: 6
                                        Credit: 176,830
                                        RAC: 0
                                        Message 42125 - Posted 5 May 2011 20:54:56 UTC - in response to Message 42117.

                                          Just verifying, but it sounds like the msg I got is not quite accurate and is being updated for items downloaded in the future. But in the meantime, scary msg or not, the work is useful to you and should still be allowed to run if it's already going.

                                          After 24-48 hours of being unable to upload this task (other items uploaded OK, trickles I think) I finally got:

                                          [code]Thu May 5 16:21:52 2011 climateprediction.net Message from server: Completed result hadam3p_eu_wczh_1988_1_006821781_0 refused: this result wasn't sent (not needed)[/code]

                                          And I should let my 3 other tasks go through to completion and not get nervous if I get the same (or a similar) error msg? The website doesn't show anything labeled "In progress" but does have some (4) labeled "no re-submission".

                                          BTW, referring back to a comment earlier in the thread. I personally have absolutely no problem with really long tasks, as long as they are labeled as such and have appropriate deadlines.

                                          Now another project I worked with for a bit would give estimates of 5-7 hours and deadlines of a week and then give you tasks that literally ran for weeks with systems dedicated 100% to them and didn't take any checkpoints for most of that time. But you guys don't do that :-)

                                          Les Bayliss
                                          Forum moderator
                                          Joined: Sep 5 04
                                          Posts: 5338
                                          Credit: 8,876,229
                                          RAC: 549
                                          Message 42127 - Posted 5 May 2011 22:01:14 UTC - in response to Message 42125.

                                            I posted a message about the 'scary message' in the News thread, which is at the top of this section of the board.
                                            This News thread can be subscribed to, and you'll then get an email whenever there's a new post there.

                                            The project people are still looking into the matter. Along with all of their other work.


                                            ____________
                                            Backups: Here





                                            Copyright © 2002-2014 climateprediction.net