climateprediction.net home page

The world's largest climate forecasting experiment for the 21st century.

Upload problem


Profile salogel
Joined: Aug 31 04
Posts: 27
Credit: 1,506,224
RAC: 717
Message 40250 - Posted 27 Jul 2010 18:47:39 UTC

    For several days I have had problems uploading data.
    I get these messages when uploading three different results:

    27.07.2010 19:51:06 climateprediction.net Started upload of hadsm3dhet2_jyjn_006607765_4_1.zip
    27.07.2010 19:51:06 climateprediction.net Started upload of hadam3p_nb1w_1983_2_006162814_2_2.zip
    27.07.2010 19:52:08 Project communication failed: attempting access to reference site
    27.07.2010 19:52:08 climateprediction.net Temporarily failed upload of hadsm3dhet2_jyjn_006607765_4_1.zip: HTTP error
    27.07.2010 19:52:08 climateprediction.net Backing off 3 hr 36 min 8 sec on upload of hadsm3dhet2_jyjn_006607765_4_1.zip
    27.07.2010 19:52:08 climateprediction.net Started upload of famous_u4hm_1799_200_006638661_0_1.zip
    27.07.2010 19:52:09 Internet access OK - project servers may be temporarily down.
    27.07.2010 19:52:09 climateprediction.net Temporarily failed upload of hadam3p_nb1w_1983_2_006162814_2_2.zip: HTTP error
    27.07.2010 19:52:09 climateprediction.net Backing off 1 hr 49 min 22 sec on upload of hadam3p_nb1w_1983_2_006162814_2_2.zip
    27.07.2010 19:52:35 climateprediction.net Temporarily failed upload of famous_u4hm_1799_200_006638661_0_1.zip: HTTP error
    27.07.2010 19:52:35 climateprediction.net Backing off 3 hr 21 min 15 sec on upload of famous_u4hm_1799_200_006638661_0_1.zip

    Any ideas?
    The messages have been the same for a week or more. On other servers there are no problems.
    Uploads for other projects also work without problems.
    I'm not using a proxy.
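
For anyone comparing retry delays across files, the "Backing off" lines in a log like the one above can be converted to seconds. This is a minimal Python sketch; the function name `backoff_seconds` and the regular expression are illustrative assumptions based only on the message format quoted in this thread, not part of BOINC.

```python
import re

# Matches the duration in messages such as
# "Backing off 3 hr 36 min 8 sec on upload of <file>".
# The pattern is an assumption derived from the log lines in this thread.
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec on upload of (\S+)"
)

def backoff_seconds(line):
    """Return (file_name, delay_in_seconds), or None if this is not a backoff line."""
    match = BACKOFF_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds, file_name = match.groups()
    total = int(hours or 0) * 3600 + int(minutes or 0) * 60 + int(seconds)
    return file_name, total

line = ("27.07.2010 19:52:08 climateprediction.net Backing off 3 hr 36 min 8 sec "
        "on upload of hadsm3dhet2_jyjn_006607765_4_1.zip")
print(backoff_seconds(line))
```

Lines that are not backoff messages return `None`, so the function can be applied to every line of a saved message log.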

    ____________

    Profile mo.v
    Forum moderator
    Joined: Sep 29 04
    Posts: 2354
    Credit: 6,493,435
    RAC: 2,039
    Message 40262 - Posted 28 Jul 2010 22:31:38 UTC

      Last modified: 28 Jul 2010 22:32:08 UTC

      Hi Legolas

      I had a lot of files to upload yesterday. Most uploaded immediately but some produced messages like:

      28/07/2010 00:05:32|climateprediction.net|Temporarily failed upload of famous_u0d0_599_200_006633319_3_9.zip: HTTP error

      But the files uploaded successfully 5 or 10 minutes later.

      Have you got files stuck in the Transfers tab that cannot upload?
      ____________
      Cpdn news
      5 CPDN READMEs

      Profile salogel
      Joined: Aug 31 04
      Posts: 27
      Credit: 1,506,224
      RAC: 717
      Message 40264 - Posted 29 Jul 2010 5:36:00 UTC - in response to Message 40262.

        At the moment there are 5 files in the Transfers tab waiting to be uploaded.
        Each upload reaches at most about 1 MB and then gets stuck.
        There are 3 hadam and 2 famous WUs to upload.

        I have had the same problem from time to time, but in the past the upload succeeded after several tries.

        ____________

        Profile salogel
        Joined: Aug 31 04
        Posts: 27
        Credit: 1,506,224
        RAC: 717
        Message 40265 - Posted 29 Jul 2010 8:38:18 UTC - in response to Message 40264.

          One point:
          I have two other machines, running client version 6.10.56, where the upload is OK.
          The machine that fails has 6.10.58.
          Could that be the reason?
          ____________

          Profile mo.v
          Forum moderator
          Joined: Sep 29 04
          Posts: 2354
          Credit: 6,493,435
          RAC: 2,039
          Message 40266 - Posted 29 Jul 2010 8:59:31 UTC

            Last modified: 29 Jul 2010 8:59:53 UTC

            I don't think the computer's Boinc version can be the problem. I have computers with 6.2.19, 6.10.56 and 6.10.58; they all upload files. But on all three computers there are occasional problems with CPDN HTTP and other server errors. They could be various messages for the same server problem.

            CPDN server status is all green. Check that the computer has Network activity always available in the Activity menu. Instead of waiting for Boinc to try again press the Try again button in the Transfers section and see what happens. I would use this button at intervals but not dozens of times repeatedly. Once each time.
            ____________
            Cpdn news
            5 CPDN READMEs

            Profile salogel
            Joined: Aug 31 04
            Posts: 27
            Credit: 1,506,224
            RAC: 717
            Message 40267 - Posted 29 Jul 2010 9:22:07 UTC - in response to Message 40266.

              That machine is at home - at the moment I'm not there.
              But I did that yesterday evening, and each time the upload started at the expected speed of my internet connection and then suddenly got stuck after about 1 MB.
              Is it possible to restart the upload servers even though the status is green?
              I read in another thread that this solved the problem a few months ago.
              ____________

              Profile salogel
              Joined: Aug 31 04
              Posts: 27
              Credit: 1,506,224
              RAC: 717
              Message 40287 - Posted 3 Aug 2010 6:23:38 UTC - in response to Message 40267.

                The number of results to be uploaded is still growing: I now have 8 files waiting to upload. They all get stuck after 1 MB.
                What can be done? :-(
                ____________

                Profile JIM
                Joined: Dec 31 07
                Posts: 609
                Credit: 3,346,966
                RAC: 4,732
                Message 40288 - Posted 3 Aug 2010 6:31:00 UTC

                  Does anyone know when the uploader.oerc will be back up? I already have 2 zip files stuck in the Transfers tab, and will have 2 more by morning. In the meantime I have suspended network communication. I guess that all we can do is wait for the data transfer to be complete.


                  ____________

                  Profile apohawk
                  Joined: Sep 14 08
                  Posts: 1
                  Credit: 80,761
                  RAC: 0
                  Message 40290 - Posted 3 Aug 2010 7:29:16 UTC - in response to Message 40288.

                    I've got two files stuck:

                    2010-08-03 09:20:33 climateprediction.net Started upload of famous_ulnt_1999_200_006660916_2_18.zip
                    2010-08-03 09:20:40 Project communication failed: attempting access to reference site
                    2010-08-03 09:20:40 climateprediction.net Temporarily failed upload of famous_ulnt_1999_200_006660916_2_18.zip: connect() failed
                    2010-08-03 09:20:40 climateprediction.net Backing off 1 hr 22 min 28 sec on upload of famous_ulnt_1999_200_006660916_2_18.zip
                    2010-08-03 09:20:41 Internet access OK - project servers may be temporarily down.

                    and

                    2010-08-03 09:25:38 climateprediction.net Started upload of famous_ulns_1799_200_006660915_1_15.zip
                    2010-08-03 09:25:40 Project communication failed: attempting access to reference site
                    2010-08-03 09:25:40 climateprediction.net Temporarily failed upload of famous_ulns_1799_200_006660915_1_15.zip: connect() failed
                    2010-08-03 09:25:40 climateprediction.net Backing off 3 hr 58 min 29 sec on upload of famous_ulns_1799_200_006660915_1_15.zip
                    2010-08-03 09:25:41 Internet access OK - project servers may be temporarily down.

                    Those have been lingering for a few days now. Is it because uploader.oerc is down, or are those tasks just messed up?
                    Other uploads went ok.
                    ____________

                    Profile mo.v
                    Forum moderator
                    Joined: Sep 29 04
                    Posts: 2354
                    Credit: 6,493,435
                    RAC: 2,039
                    Message 40291 - Posted 3 Aug 2010 7:36:45 UTC

                      Please read the post about uploader.oerc in the forum News thread at the top of this Number Crunching section. As soon as we know more that is where it will be announced.
                      ____________
                      Cpdn news
                      5 CPDN READMEs

                      Profile salogel
                      Joined: Aug 31 04
                      Posts: 27
                      Credit: 1,506,224
                      RAC: 717
                      Message 40304 - Posted 5 Aug 2010 20:17:22 UTC - in response to Message 40291.

                        The uploading is not really any better, although all upload servers seem to be working.
                        In the meantime two FAMOUS 6.11 files were uploaded, but 5 files are still stuck after about 1 MB.
                        I also still have 2 HADAM files and 1 HADSM file to upload, also stuck after about 1 MB.

                        Here are the messages with xfer-debug:

                        05.08.2010 20:34:07 climateprediction.net [fxd] starting upload, upload_offset -1
                        05.08.2010 20:34:07 climateprediction.net Started upload of hadam3p_nb1w_1983_2_006162814_2_1.zip
                        05.08.2010 20:34:07 climateprediction.net [file_xfer_debug] URL: http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler
                        05.08.2010 20:34:08 climateprediction.net [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
                        05.08.2010 20:34:08 climateprediction.net [file_xfer_debug] parsing upload response: <data_server_reply> <status>0</status> <file_size>1130358</file_size></data_server_reply>
                        05.08.2010 20:34:08 climateprediction.net [file_xfer_debug] parsing status: 0
                        05.08.2010 20:34:08 climateprediction.net [fxd] starting upload, upload_offset 1130358
                        05.08.2010 20:34:38 Project communication failed: attempting access to reference site
                        05.08.2010 20:34:38 climateprediction.net [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval -184
                        05.08.2010 20:34:38 climateprediction.net [file_xfer_debug] file transfer status -184
                        05.08.2010 20:34:38 climateprediction.net Temporarily failed upload of hadam3p_nb1w_1983_2_006162814_2_1.zip: HTTP error
                        05.08.2010 20:34:38 climateprediction.net Backing off 2 hr 18 min 40 sec on upload of hadam3p_nb1w_1983_2_006162814_2_1.zip
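
The debug lines above show a two-step exchange: the client first asks with `upload_offset -1`, the server answers with a `<data_server_reply>` containing the `<file_size>` it already holds, and the client then resumes from that offset (here 1130358 bytes, roughly 1.1 MB). As a minimal sketch, the reply can be parsed like this in Python; the helper name `parse_upload_reply` is hypothetical, and the reply string is copied from the log above.

```python
import xml.etree.ElementTree as ET

def parse_upload_reply(reply_xml):
    """Return (status, file_size) from a <data_server_reply> string.

    file_size is None when the reply carries no <file_size> element.
    """
    root = ET.fromstring(reply_xml)
    status = int(root.findtext("status"))
    size_text = root.findtext("file_size")
    file_size = int(size_text) if size_text is not None else None
    return status, file_size

# Reply text as it appears in the file_xfer_debug line above.
reply = ("<data_server_reply> <status>0</status> "
         "<file_size>1130358</file_size></data_server_reply>")
status, offset = parse_upload_reply(reply)
print(status, offset)
```

A status of 0 with a non-zero size matches what the log shows next: the upload restarts at `upload_offset 1130358` rather than from the beginning.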


                        ____________

                        Les Bayliss
                        Forum moderator
                        Joined: Sep 5 04
                        Posts: 5131
                        Credit: 8,469,789
                        RAC: 6,762
                        Message 40305 - Posted 5 Aug 2010 21:21:30 UTC - in response to Message 40304.

                          Last modified: 5 Aug 2010 21:22:50 UTC

                          It may be something external to your computer:
                          ISP limit
                          Business/Educational institute file size limit, or data amount limit.
                          Proxy server problem.

                          Or if you have your own router/network, something with that.
                          ____________
                          Backups: Here

                          Profile mo.v
                          Forum moderator
                          Joined: Sep 29 04
                          Posts: 2354
                          Credit: 6,493,435
                          RAC: 2,039
                          Message 40306 - Posted 5 Aug 2010 21:58:14 UTC

                            Last modified: 5 Aug 2010 22:22:05 UTC

                            I don't know much about network problems but wonder whether trying this could help:

                            http://setiathome.berkeley.edu/forum_thread.php?id=47636&sort=6#768589

                            [Edit: Don't do that yet; just read it. I've sent a message to Gundolf on Seti asking whether he thinks trying that would be a good idea.]
                            ____________
                            Cpdn news
                            5 CPDN READMEs

                            Les Bayliss
                            Forum moderator
                            Joined: Sep 5 04
                            Posts: 5131
                            Credit: 8,469,789
                            RAC: 6,762
                            Message 40307 - Posted 5 Aug 2010 23:10:11 UTC - in response to Message 40306.

                              That's what I had in mind, but was going to wait and see, before explaining about cc_config.

                              Profile mo.v
                              Forum moderator
                              Joined: Sep 29 04
                              Posts: 2354
                              Credit: 6,493,435
                              RAC: 2,039
                              Message 40311 - Posted 6 Aug 2010 6:54:30 UTC

                                Last modified: 6 Aug 2010 6:54:48 UTC

                                There may be another member with the same problem plus a person who seems to have the same upload problem on the CPDN Beta project.

                                Legolas, we don't need to know the model names or task numbers or workunit numbers for the files that cannot upload. But could you please tell us for each file:

                                * the model type (eg HadAM3P, HadSM, FAMOUS)
                                * the version (eg 6.10, 6.11)
                                * the file number (that's the number like _11, _13 at the end of each file name in the Transfers tab. Not the number at the end of the name in the Tasks tab.)

                                When we know that we can look at our list of which files upload to which server to see if it's a problem with one upload server or more. Please list all the details for all the files because some models upload files to 3 different servers(!).
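
The model prefix and the file number can be read straight from the names shown in the Transfers tab. This is a small Python sketch under the assumption that every upload name looks like the ones quoted in this thread (`<model>_..._<file number>.zip`); the function name `describe_upload` is made up for illustration. The application version (6.10, 6.11 and so on) is not part of the file name and still has to be looked up separately.

```python
import re

# Assumed pattern: <model>_<other fields>_<file number>.zip,
# based only on the file names quoted in this thread.
UPLOAD_NAME_RE = re.compile(r"^(?P<model>[a-z0-9]+)_.*_(?P<file_no>\d+)\.zip$")

def describe_upload(file_name):
    """Return (model_prefix, file_number) for a Transfers-tab file name."""
    match = UPLOAD_NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unrecognised upload name: {file_name}")
    return match.group("model"), int(match.group("file_no"))

for name in [
    "hadsm3dhet2_jyjn_006607765_4_1.zip",
    "hadam3p_nb1w_1983_2_006162814_2_2.zip",
    "famous_ulnt_1999_200_006660916_2_18.zip",
]:
    print(describe_upload(name))
```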
                                ____________
                                Cpdn news
                                5 CPDN READMEs

                                Profile mo.v
                                Forum moderator
                                Joined: Sep 29 04
                                Posts: 2354
                                Credit: 6,493,435
                                RAC: 2,039
                                Message 40313 - Posted 6 Aug 2010 8:32:43 UTC

                                  Last modified: 6 Aug 2010 8:34:19 UTC

                                  I've had a reply on Seti from Gundolf who says:

                                  The symptoms Legolas describes are different from what I have experienced, but he could try the other protocol nonetheless. I can't imagine any problems being introduced by doing so.

                                  I have been running HTTP 1.0 for several years now without problems (SETI and Einstein). My problems were probably caused by my ISP.

                                  Legolas should also try a reboot, to flush all (DNS) buffers of unwanted information.


                                  * If you haven't rebooted since this upload problem began try that first.

                                  * If the files still don't upload use Gundolf's suggestion in the Seti forum thread. Before you edit the cc_config file you should completely exit from Boinc by right-clicking on the Boinc icon and selecting Exit.

                                  * If that still doesn't work it would be useful if you could give us the information I listed in my last post.
                                  ____________
                                  Cpdn news
                                  5 CPDN READMEs

                                  Profile salogel
                                  Joined: Aug 31 04
                                  Posts: 27
                                  Credit: 1,506,224
                                  RAC: 717
                                  Message 40314 - Posted 6 Aug 2010 8:51:50 UTC

                                    Thanks for your answers.
                                    I will try your suggestions this evening.

                                    I have already rebooted since the problem first occurred, and I also rebooted my router. That hasn't helped so far.
                                    ____________

                                    Profile salogel
                                    Joined: Aug 31 04
                                    Posts: 27
                                    Credit: 1,506,224
                                    RAC: 717
                                    Message 40317 - Posted 6 Aug 2010 16:54:46 UTC - in response to Message 40311.

                                      I tested the HTTP option and it had no effect :-(
                                      I also rebooted my machine.

                                      Here are the models, versions and file numbers (3 different WUs):

                                      1. HadSM3 6.07 File _1
                                      2. HADAM3P 6.14 Files _1 and _2
                                      3. FAMOUS 6.11 Files _2, _3, _5, _6, _8

                                      The HadSM3 and HADAM3P WUs have already finished.
                                      The FAMOUS files not in the list had the same problem but at least they were uploaded.
                                      Can you see an error on the server when files are uploaded?
                                      ____________

                                      Profile mo.v
                                      Forum moderator
                                      Joined: Sep 29 04
                                      Posts: 2354
                                      Credit: 6,493,435
                                      RAC: 2,039
                                      Message 40321 - Posted 6 Aug 2010 18:24:00 UTC

                                        Last modified: 6 Aug 2010 18:25:08 UTC

                                        Well, you are trying everything.

                                        I can't see the server, nor can any of the other moderators. None of us are in Oxford where CPDN is. Milo, the programmer, can see the servers. But now it's his weekend.

                                        I was hoping that all the files would be allocated to one upload server and Milo would find a problem on it.

                                        But if our upload server lists are correct, HadSM, HadAM3P file _1 and FAMOUS files _3 and _6 go to uploader.oerc, while HadAM3P _2 and FAMOUS _2, _5 & _8 go to upload.1.comlab. If the FAMOUS 6.11 is from a new enough batch all its files will go to kraken. So two or three upload servers need to accept your files.
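
The allocation described above can be written down as a lookup table to see how the stuck files split across servers. This is only a sketch in Python: the table contains nothing beyond what this post states for the eight reported files, the server names are abbreviated exactly as written here, and it ignores the caveat that a new enough FAMOUS 6.11 batch would send everything to kraken. The names `SERVER_FOR_FILE` and `group_by_server` are invented for the example.

```python
from collections import defaultdict

# (model, file number) -> upload server, as stated in the post above.
# Assumption: the FAMOUS files are NOT from a batch that uploads to kraken.
SERVER_FOR_FILE = {
    ("HadSM3", 1): "uploader.oerc",
    ("HadAM3P", 1): "uploader.oerc",
    ("HadAM3P", 2): "upload.1.comlab",
    ("FAMOUS", 2): "upload.1.comlab",
    ("FAMOUS", 3): "uploader.oerc",
    ("FAMOUS", 5): "upload.1.comlab",
    ("FAMOUS", 6): "uploader.oerc",
    ("FAMOUS", 8): "upload.1.comlab",
}

def group_by_server(stuck_files):
    """Group (model, file number) pairs by the server they are expected to use."""
    groups = defaultdict(list)
    for model, file_no in stuck_files:
        server = SERVER_FOR_FILE.get((model, file_no), "unknown")
        groups[server].append((model, file_no))
    return dict(groups)

stuck = [("HadSM3", 1), ("HadAM3P", 1), ("HadAM3P", 2),
         ("FAMOUS", 2), ("FAMOUS", 3), ("FAMOUS", 5), ("FAMOUS", 6), ("FAMOUS", 8)]
for server, files in group_by_server(stuck).items():
    print(server, files)
```

With the reported files, both servers end up with four stuck files each, which fits the reasoning that the fault is unlikely to sit on a single upload server.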

                                        I think this must be some connection problem at your end. I would be surprised if it's a bug on two or three upload servers. While we think again about this, don't constantly retry the transfers in the Transfers tab. I think the number of upload attempts allowed for each file is 100.

                                        But other files have uploaded OK?

                                        Have you got another computer there with internet access? If so, does it crunch CPDN and do its files all upload?
                                        ____________
                                        Cpdn news
                                        5 CPDN READMEs

                                        Profile salogel
                                        Joined: Aug 31 04
                                        Posts: 27
                                        Credit: 1,506,224
                                        RAC: 717
                                        Message 40323 - Posted 6 Aug 2010 19:30:28 UTC - in response to Message 40321.

                                          I have 17 other projects running here all of them without upload problems.
                                          And I have no other computers here, only at work, and those upload without problems too.
                                          I upgraded to Boinc client 6.10.58 a few days ago.
                                          I will go back to 6.10.56, but I don't think that will help.
                                          It's worth a try.
                                          Most of the time the upload reaches 1.09 MB and then gets stuck, so smaller zip files upload without problems.
                                          I have crunched a lot of CPDN WUs without any upload problems.

                                          It doesn't matter if the problem is not solved over the weekend, as long as we find a solution eventually.

                                          Have a nice weekend.
                                          ____________

                                          Profile Gundolf Jahn
                                          Joined: Aug 31 04
                                          Posts: 1
                                          Credit: 36,628
                                          RAC: 0
                                          Message 40324 - Posted 6 Aug 2010 20:04:13 UTC - in response to Message 40323.

                                            Last modified: 6 Aug 2010 20:37:31 UTC

                                            Perhaps it could help the debugging process if you (temporarily) turn on some other logging flags. I'm afraid I won't be able to help analyze the output, but perhaps someone else can. I used the following flags:

                                            <log_flags>
                                            <file_xfer_debug>1</file_xfer_debug>
                                            <http_debug>1</http_debug>
                                            <http_xfer_debug>1</http_xfer_debug>
                                            </log_flags>
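
In a cc_config.xml file the `<log_flags>` block sits inside a `<cc_config>` root element. For anyone who prefers to generate the file rather than type it, here is a minimal Python sketch that builds that structure for the three flags listed above; the function name `build_cc_config` is an assumption for illustration, and the sketch only prints the XML instead of writing it anywhere.

```python
import xml.etree.ElementTree as ET

def build_cc_config(flags):
    """Return cc_config XML text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")

xml_text = build_cc_config(["file_xfer_debug", "http_debug", "http_xfer_debug"])
print(xml_text)
```

The printed text would be saved as cc_config.xml in the BOINC data directory, whose location depends on the installation. As advised earlier in this thread, exit Boinc completely before editing the file.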

                                            Regards,
                                            Gundolf
                                            PS: you can use the "quote" link to get the formatting. [pre] and [code] don't work correctly here at CPDN ;-)[edit]Turned the flags on[/edit]
                                            ____________
                                            Computers aren't everything in life. (Just kidding)

                                            Profile genes
                                            Joined: Aug 9 04
                                            Posts: 25
                                            Credit: 4,756,980
                                            RAC: 0
                                            Message 40325 - Posted 6 Aug 2010 20:20:18 UTC

                                              Most of the time the upload reaches 1.09 MB and then gets stuck, so smaller zip files upload without problems.


                                              I don't know if this will help, but I had a similar problem a year or two ago. It turned out to be with a device called a Riverbed, which is a network caching server, which needed to be reset. This was installed to cut down on interoffice network traffic (a lot of it repetitive), but it was also set inadvertently to cache internet traffic, and started causing trouble with CPDN uploads. If you have one of these at your location, maybe it could be the problem.

                                              Profile salogel
                                              Joined: Aug 31 04
                                              Posts: 27
                                              Credit: 1,506,224
                                              RAC: 717
                                              Message 40328 - Posted 7 Aug 2010 20:07:04 UTC - in response to Message 40324.

                                                Here is the debug info for one upload attempt with all debugging switches on.
                                                Perhaps someone can read this?
                                                The error occurs at 21:57:41, after 30 seconds and 1.09 MB of data transferred.
                                                The whole transfer volume should be 5,503,377 bytes.

                                                07.08.2010 21:57:09 climateprediction.net [fxd] starting upload, upload_offset -1
                                                07.08.2010 21:57:09 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle 'C:\Program Files\BOINC\ca-bundle.crt'
                                                07.08.2010 21:57:09 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
                                                07.08.2010 21:57:09 climateprediction.net Started upload of famous_u4hm_1799_200_006638661_0_3.zip
                                                07.08.2010 21:57:09 climateprediction.net [file_xfer_debug] URL: http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: timeout on name lookup is not supported
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: About to connect() to uploader.oerc.ox.ac.uk port 80 (#0)
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: Trying 163.1.124.170...
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: Connected to uploader.oerc.ox.ac.uk (163.1.124.170) port 80 (#0)
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: POST /cpdn_cgi/file_upload_handler HTTP/1.0
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.10.58)
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: Host: uploader.oerc.ox.ac.uk
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: Accept: */*
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: Accept-Encoding: deflate, gzip
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: Content-Type: application/x-www-form-urlencoded
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server: Content-Length: 292
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Sent header to server:
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server: HTTP/1.1 200 OK
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server: Date: Sat, 07 Aug 2010 19:57:15 GMT
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server: Server: Apache/2.2.13 (Linux/SUSE)
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server: Connection: close
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server: Content-Type: text/plain
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Received header from server:
                                                07.08.2010 21:57:10 [http_xfer_debug] [ID#30] HTTP: wrote 93 bytes
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: Expire cleared
                                                07.08.2010 21:57:10 [http_debug] [ID#30] Info: Closing connection #0
                                                07.08.2010 21:57:11 climateprediction.net [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval 0
                                                07.08.2010 21:57:11 climateprediction.net [file_xfer_debug] parsing upload response: <data_server_reply> <status>0</status> <file_size>0</file_size></data_server_reply>
                                                07.08.2010 21:57:11 climateprediction.net [file_xfer_debug] parsing status: 0
                                                07.08.2010 21:57:11 climateprediction.net [fxd] starting upload, upload_offset 0
                                                07.08.2010 21:57:11 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Info: timeout on name lookup is not supported
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Info: About to connect() to uploader.oerc.ox.ac.uk port 80 (#0)
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Info: Trying 163.1.124.170...
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Info: Connected to uploader.oerc.ox.ac.uk (163.1.124.170) port 80 (#0)
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: POST /cpdn_cgi/file_upload_handler HTTP/1.0
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.10.58)
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: Host: uploader.oerc.ox.ac.uk
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: Accept: */*
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: Accept-Encoding: deflate, gzip
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: Content-Type: application/x-www-form-urlencoded
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server: Content-Length: 5503377
                                                07.08.2010 21:57:11 [http_debug] [ID#30] Sent header to server:
                                                07.08.2010 21:57:41 [http_debug] [ID#30] Info: Expire cleared
                                                07.08.2010 21:57:41 [http_debug] [ID#30] Info: Closing connection #0
                                                07.08.2010 21:57:41 [http_debug] HTTP error: Failure when receiving data from the peer
                                                07.08.2010 21:57:42 Project communication failed: attempting access to reference site
                                                07.08.2010 21:57:42 [http_debug] HTTP_OP::init_get(): http://www.google.com/
                                                07.08.2010 21:57:42 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
                                                07.08.2010 21:57:42 climateprediction.net [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval -184
                                                07.08.2010 21:57:42 climateprediction.net [file_xfer_debug] file transfer status -184
                                                07.08.2010 21:57:42 climateprediction.net Temporarily failed upload of famous_u4hm_1799_200_006638661_0_3.zip: HTTP error
                                                07.08.2010 21:57:42 climateprediction.net [file_xfer_debug] project-wide xfer delay for 4627.819721 sec
                                                07.08.2010 21:57:42 climateprediction.net Backing off 2 hr 47 min 46 sec on upload of famous_u4hm_1799_200_006638661_0_3.zip

                                                ____________

                                                Profile salogel
                                                Joined: Aug 31 04
                                                Posts: 27
                                                Credit: 1,506,224
                                                RAC: 717
                                                Message 40344 - Posted 10 Aug 2010 5:38:34 UTC

                                                  Is there anybody who can have a look at the debug output for the upload error, please?
                                                  There are now 10 files stuck. :-(
                                                  ____________

                                                  transient
                                                  Joined: Oct 3 06
                                                  Posts: 42
                                                  Credit: 2,120,513
                                                  RAC: 1,973
                                                  Message 40346 - Posted 10 Aug 2010 15:43:05 UTC

                                                    This thread suggests it could be a server problem:

                                                    http://128.32.18.189/dev/forum_thread.php?id=5541

                                                    Les Bayliss
                                                    Forum moderator
                                                    Joined: Sep 5 04
                                                    Posts: 5131
                                                    Credit: 8,469,789
                                                    RAC: 6,762
                                                    Message 40349 - Posted 10 Aug 2010 21:21:54 UTC

                                                      The debug messages don't mean anything to me, but I'm not too familiar with network problems.
                                                      I'm not having any problems uploading zips to cpdn.

                                                      Profile salogel
                                                      Joined: Aug 31 04
                                                      Posts: 27
                                                      Credit: 1,506,224
                                                      RAC: 717
                                                      Message 40359 - Posted 12 Aug 2010 17:45:24 UTC - in response to Message 40349.

                                                        Now I tried the same with my VPN connection and a proxy server defined:
                                                        Same result :-(

                                                        12.08.2010 19:13:15 climateprediction.net [fxd] starting upload, upload_offset 2932378
                                                        12.08.2010 19:13:15 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
                                                        12.08.2010 19:13:15 [http_debug] [ID#35] Info: timeout on name lookup is not supported
                                                        12.08.2010 19:13:15 [http_debug] [ID#35] Info: About to connect() to proxy 195.127.234.115 port 8080 (#0)
                                                        12.08.2010 19:13:15 [http_debug] [ID#35] Info: Trying 195.127.234.115...
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Info: Connected to 195.127.234.115 (195.127.234.115) port 8080 (#0)
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: POST http://cpdn-upload1.comlab.ox.ac.uk/cgi-bin/file_upload_handler HTTP/1.0
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.10.58)
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Host: cpdn-upload1.comlab.ox.ac.uk
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Accept: */*
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Accept-Encoding: deflate, gzip
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Proxy-Connection: Keep-Alive
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Content-Type: application/x-www-form-urlencoded
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server: Content-Length: 2568847
                                                        12.08.2010 19:13:16 [http_debug] [ID#35] Sent header to server:
                                                        12.08.2010 19:13:43 [http_debug] [ID#35] Info: Expire cleared
                                                        12.08.2010 19:13:43 [http_debug] [ID#35] Info: Closing connection #0
                                                        12.08.2010 19:13:43 [http_debug] HTTP error: Failure when receiving data from the peer
                                                        12.08.2010 19:13:44 Project communication failed: attempting access to reference site
                                                        12.08.2010 19:13:44 [http_debug] HTTP_OP::init_get(): http://www.google.com/
                                                        12.08.2010 19:13:44 [http_debug] HTTP_OP::libcurl_exec(): ca-bundle set
                                                        12.08.2010 19:13:44 climateprediction.net [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval -184
                                                        12.08.2010 19:13:44 climateprediction.net [file_xfer_debug] file transfer status -184
                                                        12.08.2010 19:13:44 climateprediction.net Temporarily failed upload of famous_u4hm_1799_200_006638661_0_11.zip: HTTP error

                                                        So I think it has nothing to do with MY internet connection but with the upload servers.
                                                        Other uploads and downloads work fine with this proxy server and VPN, e.g. to Chess960@home.
                                                        ____________

                                                        Les Bayliss
                                                        Forum moderator
                                                        Joined: Sep 5 04
                                                        Posts: 5131
                                                        Credit: 8,469,789
                                                        RAC: 6,762
                                                        Message 40361 - Posted 12 Aug 2010 22:11:12 UTC - in response to Message 40359.

                                                          Only one other person is reporting upload problems, here.


                                                          ____________
                                                          Backups: Here

                                                          Profile salogel
                                                          Joined: Aug 31 04
                                                          Posts: 27
                                                          Credit: 1,506,224
                                                          RAC: 717
                                                          Message 40364 - Posted 13 Aug 2010 5:41:32 UTC - in response to Message 40361.

                                                            Sometimes it works: last night one upload file (2.5 MB, HADAM) was successfully uploaded. :-)
                                                            But 10 are still stuck.
                                                            ____________

                                                            Les Bayliss
                                                            Forum moderator
                                                            Joined: Sep 5 04
                                                            Posts: 5131
                                                            Credit: 8,469,789
                                                            RAC: 6,762
                                                            Message 40368 - Posted 14 Aug 2010 1:26:42 UTC

                                                              We are still thinking and talking about this, but it may take a while.


                                                              ____________
                                                              Backups: Here

                                                              Profile salogel
                                                              Joined: Aug 31 04
                                                              Posts: 27
                                                              Credit: 1,506,224
                                                              RAC: 717
                                                              Message 40395 - Posted 20 Aug 2010 9:42:08 UTC

                                                                On boincsimap I had the same problem: the upload (~25 MB) got stuck after a while.
                                                                But on that site the next try resumed where the previous one had stopped, so the file was uploaded completely after several tries.

                                                                On CPDN the next try starts at the beginning most of the time, so I'm not able to upload the files completely.
                                                                Perhaps the solution would be to do it the same way as boincsimap does, or is there an error in this mechanism?
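
The difference between resuming and restarting can be sketched in a few lines of Python. This is only an illustration under a strong assumption: every connection is cut after a fixed number of bytes. The function name `attempts_needed` and its parameters are hypothetical and are not part of BOINC or of either project's server code.

```python
def attempts_needed(file_size, bytes_per_attempt, resume, max_attempts=1000):
    """Count upload attempts until the whole file has arrived.

    Assumes each attempt is cut off after bytes_per_attempt bytes.
    Returns None if the upload cannot finish within max_attempts.
    """
    if bytes_per_attempt <= 0:
        return None
    offset = 0
    for attempt in range(1, max_attempts + 1):
        # Send as much as this connection allows, starting at the offset.
        offset += min(bytes_per_attempt, file_size - offset)
        if offset >= file_size:
            return attempt
        if not resume:
            # Without resume the next try starts again at byte 0.
            offset = 0
    return None


# Illustrative numbers only: a 25 MB file, 3 MB delivered per attempt.
print(attempts_needed(25_000_000, 3_000_000, resume=True))
print(attempts_needed(25_000_000, 3_000_000, resume=False))
```

Under that assumption a resumed upload always finishes after enough tries, while an upload that restarts from the beginning only finishes if the whole file fits into a single attempt.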


                                                                ____________

                                                                Profile salogel
                                                                Joined: Aug 31 04
                                                                Posts: 27
                                                                Credit: 1,506,224
                                                                RAC: 717
                                                                Message 40445 - Posted 28 Aug 2010 15:00:19 UTC

                                                                  I made a strange observation today: when I throttled the upload speed down to 15 Kbps, I was able to upload all 7 files after a while and several tries.
                                                                  Normally the upload speed of my home internet connection is about 95 Kbps.
                                                                  Perhaps this gives you an idea of what is going wrong with the upload.

                                                                  So all files have now been uploaded, and I think that is the solution for the moment.


                                                                  ____________

                                                                  ian.sm
                                                                  Joined: Oct 4 09
                                                                  Posts: 73
                                                                  Credit: 7,242,427
                                                                  RAC: 0
                                                                  Message 40638 - Posted 9 Sep 2010 7:08:17 UTC

                                                                    A server is out of disk space - when trying to upload final zip file #13 for task "v5q2". Error first appeared 7 hours ago.

                                                                    But note that an intermediate zip file (#3) uploaded ok for task "v5me" presumably to a different server.

                                                                    09/09/2010 07:29:04 climateprediction.net Sending scheduler request: To send trickle-up message.
                                                                    09/09/2010 07:29:04 climateprediction.net Not reporting or requesting tasks
                                                                    09/09/2010 07:29:05 climateprediction.net Started upload of hadam3p_pnw_v5me_1998_1_006723343_0_3.zip
                                                                    09/09/2010 07:29:05 climateprediction.net Scheduler request completed
                                                                    09/09/2010 07:30:04 climateprediction.net Finished upload of hadam3p_pnw_v5me_1998_1_006723343_0_3.zip
                                                                    09/09/2010 07:31:42 climateprediction.net Started upload of hadam3p_pnw_v5q2_1985_1_006681048_0_13.zip
                                                                    09/09/2010 07:31:43 climateprediction.net [error] Error reported by file upload server: Server is out of disk space
                                                                    09/09/2010 07:31:43 climateprediction.net Temporarily failed upload of hadam3p_pnw_v5q2_1985_1_006681048_0_13.zip: transient upload error
                                                                    09/09/2010 07:31:43 climateprediction.net Backing off 2 hr 25 min 55 sec on upload of hadam3p_pnw_v5q2_1985_1_006681048_0_13.zip

                                                                    Is the target upload server name recorded somewhere in the BOINC files?

                                                                    Pity there is no way to suspend transfers only without having to disable the network. Even better, suspending individual files so active servers can still receive other files (as in above example).
                                                                    Server status all green (otherwise would have known which server is down!).


                                                                    Profile Iain Inglis
                                                                    Forum moderator
                                                                    Joined: Jan 16 10
                                                                    Posts: 410
                                                                    Credit: 9,532
                                                                    RAC: 0
                                                                    Message 40639 - Posted 9 Sep 2010 7:55:59 UTC - in response to Message 40638.

                                                                      A server is out of disk space - when trying to upload final zip file #13 for task "v5q2". Error first appeared 7 hours ago.

                                                                      Thanks, Ian. Message passed on.

                                                                      Les Bayliss
                                                                      Forum moderator
                                                                      Joined: Sep 5 04
                                                                      Posts: 5131
                                                                      Credit: 8,469,789
                                                                      RAC: 6,762
                                                                      Message 40640 - Posted 9 Sep 2010 8:56:59 UTC - in response to Message 40638.

                                                                        "iansm" wrote:
                                                                        Is the target upload server name recorded somewhere in the BOINC files?
                                                                        Yes. In client_state.xml
                                                                        First, search for the model name.
                                                                        Then keep searching, but look at the lines above each hit until you see "upload" towards the right-hand end of a line.
                                                                        Keep going until you are just past the relevant zip file number, then look at the first part of the upload line. This will contain the server name,
                                                                        e.g. oerc
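
The same lookup can be sketched in Python. This is a minimal sketch, assuming that each zip file is described by a `file_info` block containing a `name` and one or more `url` elements; the `SAMPLE` text is a cut-down, illustrative stand-in for a real client_state.xml, and `upload_urls` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Illustrative excerpt only; a real client_state.xml holds far more elements.
SAMPLE = """<client_state>
  <file_info>
    <name>hadam3p_pnw_v5q2_1985_1_006681048_0_13.zip</name>
    <url>http://boinc1.coas.oregonstate.edu/cpdn_cgi_main/file_upload_handler</url>
  </file_info>
  <file_info>
    <name>famous_u4hm_1799_200_006638661_0_3.zip</name>
    <url>http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler</url>
  </file_info>
</client_state>"""


def upload_urls(xml_text, file_name):
    """Return the URLs listed in the file_info block for file_name."""
    root = ET.fromstring(xml_text)
    urls = []
    for info in root.iter("file_info"):
        name = (info.findtext("name") or "").strip()
        if name == file_name:
            # Collect every <url> inside the matching block.
            urls.extend((u.text or "").strip() for u in info.iter("url"))
    return urls


print(upload_urls(SAMPLE, "hadam3p_pnw_v5q2_1985_1_006681048_0_13.zip"))
```

To try it on a real file, read the file's contents and pass them in as `xml_text`. If the file does not parse as well-formed XML, a plain text search as described above remains the fallback.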

                                                                        ian.sm
                                                                        Joined: Oct 4 09
                                                                        Posts: 73
                                                                        Credit: 7,242,427
                                                                        RAC: 0
                                                                        Message 40641 - Posted 9 Sep 2010 11:00:10 UTC - in response to Message 40640.

                                                                          Thanks, Les.

                                                                          I have now located the upload server URL lines for every zip file (in the file_info blocks). Should have known this.

                                                                          The URL in the "file_upload_handler" line (for the stuck task) does not appear to be included in the server status page.

                                                                          viz. <url> http://boinc1.coas.oregonstate.edu/cpdn_cgi_main/file_upload_handler </url>

                                                                          Did further searches using the "file_upload_handler" string. Examples found:

                                                                          <url>http://cpdn-upload1.comlab.ox.ac.uk/cgi-bin/file_upload_handler</url>
                                                                          <url>http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler</url>
                                                                          <url>http://uploader1.atm.ox.ac.uk/cpdn_cgi/file_upload_handler</url>
                                                                          <url>http://climateapps1.oucs.ox.ac.uk/cgi-bin/file_upload_handler</url>

                                                                          That's 4 of the 7 upload servers listed. All Oxford servers.

                                                                          So zip files may go elsewhere then. Directly to the customer.

                                                                          Next, I entered the entire URL in the browser and got this:


                                                                          <data_server_reply>
                                                                          <status>1</status>
                                                                          <message>no command</message>
                                                                          </data_server_reply>

                                                                          which may suggest the server is "awake"?
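
The reply can also be read programmatically. The sketch below only parses the text shown above; `parse_reply` is a hypothetical helper name, and what the handler intends by the status value is not taken from BOINC documentation. The one thing the reply demonstrates is that the handler answered the request.

```python
import xml.etree.ElementTree as ET

# The reply text exactly as it appeared in the browser.
REPLY = """<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>"""


def parse_reply(text):
    """Return (status, message) from a data_server_reply document."""
    root = ET.fromstring(text)
    # A missing status is reported as -1 rather than raising.
    status = int(root.findtext("status", default="-1"))
    message = (root.findtext("message") or "").strip()
    return status, message


status, message = parse_reply(REPLY)
print(status, message)
```

A reply of this kind says nothing about disk space or about whether a real upload request would be accepted, so it is a reachability check rather than a health check.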

                                                                          Then forced a file transfer...and it's going again...now finished. 3 cheers. Another one bites the dust.

                                                                          Good, learned something new - remote servers (outside Oxford) may cause hold-ups (as servers do) but not reported on the status page.

                                                                          Thanks again for pointing me to the server url lines.

                                                                          Profile Milo Thurston
                                                                          Forum moderator
                                                                          Volunteer developer
                                                                          Joined: Mar 2 06
                                                                          Posts: 253
                                                                          Credit: 363,646
                                                                          RAC: 0
                                                                          Message 40642 - Posted 9 Sep 2010 12:54:42 UTC - in response to Message 40641.


                                                                            Good, learned something new - remote servers (outside Oxford) may cause hold-ups (as servers do) but not reported on the status page.


                                                                            I've added that one to the status page, although I can't guarantee that the result will always be accurate.

                                                                            Greg
                                                                            Joined: Mar 12 10
                                                                            Posts: 4
                                                                            Credit: 455,298
                                                                            RAC: 0
                                                                            Message 40652 - Posted 10 Sep 2010 12:01:09 UTC

                                                                              Hi there

                                                                              I am trying to upload result files:

                                                                              Fri 10 Sep 2010 23:49:42 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:49:44 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:49:44 NZST climateprediction.net Backing off 2 hr 26 min 5 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:55:46 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:55:48 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:55:48 NZST climateprediction.net Backing off 3 min 21 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:56:00 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:56:01 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:01 NZST climateprediction.net Backing off 1 hr 11 min 24 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_1.zip
                                                                              Fri 10 Sep 2010 23:56:02 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_2.zip
                                                                              Fri 10 Sep 2010 23:56:03 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:03 NZST climateprediction.net Backing off 2 hr 13 min 0 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_2.zip
                                                                              Fri 10 Sep 2010 23:56:04 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_3.zip
                                                                              Fri 10 Sep 2010 23:56:05 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:05 NZST climateprediction.net Backing off 9 min 58 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_3.zip
                                                                              Fri 10 Sep 2010 23:56:06 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_4.zip
                                                                              Fri 10 Sep 2010 23:56:07 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:07 NZST climateprediction.net Backing off 3 hr 22 min 31 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_4.zip
                                                                              Fri 10 Sep 2010 23:56:08 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_5.zip
                                                                              Fri 10 Sep 2010 23:56:09 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:09 NZST climateprediction.net Backing off 1 hr 19 min 0 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_5.zip
                                                                              Fri 10 Sep 2010 23:56:10 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_6.zip
                                                                              Fri 10 Sep 2010 23:56:12 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:12 NZST climateprediction.net Backing off 30 min 29 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_6.zip
                                                                              Fri 10 Sep 2010 23:56:12 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_7.zip
                                                                              Fri 10 Sep 2010 23:56:14 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:14 NZST climateprediction.net Backing off 2 hr 39 min 53 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_7.zip
                                                                              Fri 10 Sep 2010 23:56:16 NZST climateprediction.net Started upload of hadam3p_pnw_v45i_1965_1_006679012_0_8.zip
                                                                              Fri 10 Sep 2010 23:56:18 NZST climateprediction.net Project file upload handler is missing
                                                                              Fri 10 Sep 2010 23:56:18 NZST climateprediction.net Backing off 3 hr 38 min 11 sec on upload of hadam3p_pnw_v45i_1965_1_006679012_0_8.zip


                                                                              Looking in the client_state.xml file the server is http://boinc1.coas.oregonstate.edu/cpdn_cgi_main/file_upload_handler

                                                                              Tried entering the URL in the browser and got:
                                                                              <data_server_reply>
                                                                              <status>1</status>
                                                                              <message>no command</message>
                                                                              </data_server_reply>

                                                                              So server is up but forcing an upload does not work.


                                                                              Greg

                                                                              ian.sm
                                                                              Joined: Oct 4 09
                                                                              Posts: 73
                                                                              Credit: 7,242,427
                                                                              RAC: 0
                                                                              Message 40653 - Posted 10 Sep 2010 16:14:42 UTC - in response to Message 40642.


                                                                                Good, learned something new - remote servers (outside Oxford) may cause hold-ups (as servers do) but not reported on the status page.


                                                                                I've added that one to the status page, although I can't guarantee that the result will always be accurate.

                                                                                Thanks, Milo.

                                                                                BigMike
                                                                                Send message
                                                                                Joined: Apr 6 05
                                                                                Posts: 17
                                                                                Credit: 744,057
                                                                                RAC: 0
                                                                                Message 40657 - Posted 11 Sep 2010 20:23:12 UTC

                                                                                  I also need to report a problem with PNW uploads. I've had one retrying for two days:

                                                                                  Sat Sep 11 13:12:04 2010 climateprediction.net Started upload of hadam3p_pnw_v3pu_1977_1_006678448_1_13.zip
                                                                                  Sat Sep 11 13:12:11 2010 Project communication failed: attempting access to reference site
                                                                                  Sat Sep 11 13:12:11 2010 climateprediction.net Temporarily failed upload of hadam3p_pnw_v3pu_1977_1_006678448_1_13.zip: connect() failed
                                                                                  Sat Sep 11 13:12:11 2010 climateprediction.net Backing off 2 hr 36 min 44 sec on upload of hadam3p_pnw_v3pu_1977_1_006678448_1_13.zip
                                                                                  Sat Sep 11 13:12:13 2010 Internet access OK - project servers may be temporarily down.

                                                                                  What bugs me about it is that I live 15 minutes away from Oregon State. I could go over and knock on their door if I knew who they were.

                                                                                  =Mike


                                                                                  ____________

                                                                                  BigMike
                                                                                  Send message
                                                                                  Joined: Apr 6 05
                                                                                  Posts: 17
                                                                                  Credit: 744,057
                                                                                  RAC: 0
                                                                                  Message 40658 - Posted 12 Sep 2010 5:03:12 UTC - in response to Message 40657.

                                                                                    Last modified: 12 Sep 2010 5:04:35 UTC

                                                                                    | What bugs me about it is that I live 15 minutes away from Oregon State.


                                                                                    Well, now I have to apologize to the folks at OSU. It seems that it's not their upload server that's having the problem:

                                                                                    Sat Sep 11 20:50:13 2010 climateprediction.net [file_xfer_debug] URL: http://climateapps1.oucs.ox.ac.uk/cgi-bin/file_upload_handler
                                                                                    Sat Sep 11 20:50:14 2010 [http_debug] [ID#22] Info: timeout on name lookup is not supported
                                                                                    Sat Sep 11 20:50:14 2010 [http_debug] [ID#22] Info: About to connect() to climateapps1.oucs.ox.ac.uk port 80 (#0)
                                                                                    Sat Sep 11 20:50:14 2010 [http_debug] [ID#22] Info: Trying 163.1.13.16...
                                                                                    Sat Sep 11 20:50:21 2010 [http_debug] [ID#22] Info: Connection refused
                                                                                    Sat Sep 11 20:50:21 2010 [http_debug] [ID#22] Info: Failed connect to climateapps1.oucs.ox.ac.uk:80; No such file or directory
                                                                                    Sat Sep 11 20:50:21 2010 [http_debug] [ID#22] Info: Expire cleared
                                                                                    Sat Sep 11 20:50:21 2010 [http_debug] [ID#22] Info: Closing connection #0
                                                                                    Sat Sep 11 20:50:21 2010 [http_debug] HTTP error: Couldn't connect to server

                                                                                    =Mike

                                                                                    ____________
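
The debug trace in BigMike's post shows the order of events: the host name resolved to an address, the connection to port 80 was attempted, and it was refused. A rough Python sketch of pulling those facts out of such a trace follows. It assumes the `[http_debug]` message wording shown in that post; the patterns and the function name `summarize_trace` are illustrative only, and other client versions may word these lines differently.

```python
import re

# Lines copied from the trace above (timestamps dropped for brevity).
TRACE = """\
[http_debug] [ID#22] Info: About to connect() to climateapps1.oucs.ox.ac.uk port 80 (#0)
[http_debug] [ID#22] Info: Trying 163.1.13.16...
[http_debug] [ID#22] Info: Connection refused
[http_debug] HTTP error: Couldn't connect to server
"""


def summarize_trace(text):
    """Collect host, port, resolved IP and whether the connect was refused."""
    summary = {"host": None, "port": None, "ip": None, "refused": False}
    for line in text.splitlines():
        m = re.search(r"About to connect\(\) to (\S+) port (\d+)", line)
        if m:
            summary["host"] = m.group(1)
            summary["port"] = int(m.group(2))
        m = re.search(r"Trying (\d+\.\d+\.\d+\.\d+)", line)
        if m:
            summary["ip"] = m.group(1)
        if "Connection refused" in line:
            summary["refused"] = True
    return summary


print(summarize_trace(TRACE))
```

An IP address together with a refusal means the name lookup worked and the failure happened at the Oxford server's port 80, which is why the OSU upload server was not to blame.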

                                                                                    Les Bayliss
                                                                                    Forum moderator
                                                                                    Send message
                                                                                    Joined: Sep 5 04
                                                                                    Posts: 5131
                                                                                    Credit: 8,469,789
                                                                                    RAC: 6,762
                                                                                    Message 40659 - Posted 12 Sep 2010 6:07:28 UTC

                                                                                      The last zip file, (13), which is created about 10 minutes after the last zip that goes to OSU, goes to Oxford, as it contains the data to join it to the next in the series for that parameter set.

                                                                                      BigMike
                                                                                      Send message
                                                                                      Joined: Apr 6 05
                                                                                      Posts: 17
                                                                                      Credit: 744,057
                                                                                      RAC: 0
                                                                                      Message 40662 - Posted 12 Sep 2010 15:30:34 UTC - in response to Message 40659.

                                                                                        Last modified: 12 Sep 2010 15:42:56 UTC

                                                                                        Thanks ... good to know. Also, I saw the post about the server.

                                                                                        BTW: Who authorized giving Milo weekends off? :D

                                                                                        =Mike


                                                                                          ____________

                                                                                          Profile Milo Thurston
                                                                                          Forum moderator
                                                                                          Volunteer developer
                                                                                          Send message
                                                                                          Joined: Mar 2 06
                                                                                          Posts: 253
                                                                                          Credit: 363,646
                                                                                          RAC: 0
                                                                                          Message 40663 - Posted 13 Sep 2010 11:08:20 UTC

                                                                                            I've now put a small NAS unit in the server room where climateapps1 is stored and I'm slowly copying data to it. This is not an ideal solution but it is small and cheap so I was actually able to get hold of it in a matter of days rather than months.

                                                                                            Hopefully the server can be re-started later today.

                                                                                            Darmok
                                                                                            Avatar
                                                                                            Send message
                                                                                            Joined: Dec 29 09
                                                                                            Posts: 27
                                                                                            Credit: 2,411,171
                                                                                            RAC: 1,643
                                                                                            Message 40670 - Posted 15 Sep 2010 9:24:19 UTC

                                                                                              Last modified: 15 Sep 2010 9:25:05 UTC

                                                                                              Uploads are working well but Boinc Manager has not updated credits. I'm not very concerned about it but I don't recall seeing this with all CPDN servers running. Is this part of the current issue?

                                                                                              michel
                                                                                              Send message
                                                                                              Joined: Nov 12 09
                                                                                              Posts: 5
                                                                                              Credit: 6,176
                                                                                              RAC: 0
                                                                                              Message 40741 - Posted 21 Sep 2010 21:01:29 UTC

                                                                                                For several weeks I have been unable to upload my results.
                                                                                                Here are the messages I regularly receive:

                                                                                                  21/09/2010 21:32:53 climateprediction.net Started upload of famous_u0y8_599_200_006634083_0_14.zip
                                                                                                  21/09/2010 21:32:53 climateprediction.net Started upload of famous_u0y8_599_200_006634083_0_17.zip
                                                                                                  21/09/2010 21:34:23 Project communication failed: attempting access to reference site
                                                                                                  21/09/2010 21:34:23 climateprediction.net Temporarily failed upload of famous_u0y8_599_200_006634083_0_14.zip: HTTP error
                                                                                                  21/09/2010 21:34:23 climateprediction.net Backing off 1 hr 34 min 3 sec on upload of famous_u0y8_599_200_006634083_0_14.zip
                                                                                                  21/09/2010 21:34:26 Internet access OK - project servers may be temporarily down.
                                                                                                  21/09/2010 21:34:43 Project communication failed: attempting access to reference site
                                                                                                  21/09/2010 21:34:43 climateprediction.net Temporarily failed upload of famous_u0y8_599_200_006634083_0_17.zip: HTTP error
                                                                                                  21/09/2010 21:34:43 climateprediction.net Backing off 3 hr 16 min 53 sec on upload of famous_u0y8_599_200_006634083_0_17.zip
                                                                                                  21/09/2010 21:34:45 Internet access OK - project servers may be temporarily down.



                                                                                                Profile mo.v
                                                                                                Forum moderator
                                                                                                Avatar
                                                                                                Send message
                                                                                                Joined: Sep 29 04
                                                                                                Posts: 2354
                                                                                                Credit: 6,493,435
                                                                                                RAC: 2,039
                                                                                                Message 40742 - Posted 21 Sep 2010 22:53:56 UTC

                                                                                                  Credits are not related to uploads (except that if you can't upload your trickles you won't receive credit for them until you do!).

                                                                                                  There's a script that runs once a day and puts our credit into our accounts. Another script, which also runs once a day though I think at a different time, exports a record of our credit to the external stats sites like BoincStats. Occasionally these scripts have to be disabled because other jobs are being done on the server, or someone turns a script off then forgets to reenable it. The data will all be there, though, not lost.

                                                                                                  I don't think Milo has access to all this at the moment so we may just have to be patient.
                                                                                                  ____________
                                                                                                  Cpdn news
                                                                                                  5 CPDN READMEs

                                                                                                  Christian Siebner
                                                                                                  Send message
                                                                                                  Joined: Apr 23 05
                                                                                                  Posts: 1
                                                                                                  Credit: 398,113
                                                                                                  RAC: 0
                                                                                                  Message 40781 - Posted 27 Sep 2010 5:47:49 UTC

                                                                                                    Last modified: 27 Sep 2010 5:49:06 UTC

                                                                                                    Hi, I do have upload-problems here (for some days now):

                                                                                                    27.09.2010 07:11:01 climateprediction.net Started upload of hadam3p_pnw_v3bl_1993_1_006722916_0_13.zip
                                                                                                    27.09.2010 07:33:07 Project communication failed: attempting access to reference site
                                                                                                    27.09.2010 07:33:07 climateprediction.net Temporarily failed upload of hadam3p_pnw_v3bl_1993_1_006722916_0_13.zip: HTTP error
                                                                                                    27.09.2010 07:33:07 climateprediction.net Backing off 3 hr 54 min 30 sec on upload of hadam3p_pnw_v3bl_1993_1_006722916_0_13.zip
                                                                                                    27.09.2010 07:33:09 Internet access OK - project servers may be temporarily down.

                                                                                                    Trickles from other WUs running on my machine were uploaded without any problem.
                                                                                                    Does somebody have an idea how to solve it?
                                                                                                    ____________

                                                                                                    Profile Veebee
                                                                                                    Send message
                                                                                                    Joined: Nov 4 06
                                                                                                    Posts: 10
                                                                                                    Credit: 1,717,441
                                                                                                    RAC: 0
                                                                                                    Message 40824 - Posted 9 Oct 2010 1:57:17 UTC

                                                                                                      I thought it better to post here rather than start another "upload trouble" thread...

                                                                                                      I am having the same problems as legolas; I have 10 zip files sitting here that have been trying to upload for around a fortnight now.

                                                                                                      I stopped the client, created a cc_config file, restarted and read the config file for the HTTP option, but it had no effect.
                                                                                                      I have also rebooted this machine.

                                                                                                      Here are the models, versions and filenumbers ( 3 different WUs ):

                                                                                                      All models are Famous 6.11 files _5, _12 (x2), _13 (x2), _14 (x2), _15 (x2), _16, _20.

                                                                                                      2 of these Famous models are completed (1 says it is ready to report, 1 says it is uploading) and one "computation errored" out at 160-odd hours.
                                                                                                      Another model is almost complete.


                                                                                                      Any idea how I can get the work/files up to the servers? CPDN is one of my fave projects and I don't want to stop crunching it.

                                                                                                      Thanks
                                                                                                      Veebee

                                                                                                      Profile JIM
                                                                                                      Send message
                                                                                                      Joined: Dec 31 07
                                                                                                      Posts: 609
                                                                                                      Credit: 3,346,966
                                                                                                      RAC: 4,732
                                                                                                      Message 40825 - Posted 9 Oct 2010 5:04:00 UTC - in response to Message 40824.

                                                                                                        Dear Veebee:

                                                                                                        Check the server status page. The server has been down for the past 3 days. The Scheduler, transitioner and the feeder are all not running so you cannot report completed tasks or get new ones.

                                                                                                        Read more in the "News and Announcements" thread at the top in Number Crunching.

                                                                                                        ____________

                                                                                                        Profile mo.v
                                                                                                        Forum moderator
                                                                                                        Avatar
                                                                                                        Send message
                                                                                                        Joined: Sep 29 04
                                                                                                        Posts: 2354
                                                                                                        Credit: 6,493,435
                                                                                                        RAC: 2,039
                                                                                                        Message 40826 - Posted 9 Oct 2010 5:57:32 UTC

                                                                                                          The completed model can't report until climateapps2 has all (or more) of its programs up and running, but both these FAMOUS v.11 models should be able to upload all their files in spite of the outage. All these files should upload to kraken which has had no recent outages.

                                                                                                          Veebee, I think this must be a problem at your end, not with the server.

                                                                                                          The timeout on file uploads was lengthened to 90 days so there's no rush from that point of view but I think each file is only allowed 100 upload attempts. Don't keep repeating manual retries (the Retry Now button) until someone can suggest more ideas.
                                                                                                          ____________
                                                                                                          Cpdn news
                                                                                                          5 CPDN READMEs

                                                                                                          Profile Veebee
                                                                                                          Send message
                                                                                                          Joined: Nov 4 06
                                                                                                          Posts: 10
                                                                                                          Credit: 1,717,441
                                                                                                          RAC: 0
                                                                                                          Message 40827 - Posted 9 Oct 2010 9:45:05 UTC - in response to Message 40826.

                                                                                                            Last modified: 9 Oct 2010 9:51:08 UTC

                                                                                                            Quote from mo.v:
                                                                                                            Veebee, I think this must be a problem at your end, not with the server.
                                                                                                            end quote.

                                                                                                            I don't know ... I have two "identical" machines (i7-920's) and they are both crunching and up/ downloading other projects without issue.
                                                                                                            THIS machine is one I downloaded a few extra Climate models on to cover work shortages on a chosen project, the other only has the one model running but hasn't (as yet) had a zip file sit there unable to upload.
                                                                                                            (mind you, that one is a HADSM3 slab model - just had a "close look" nearly 1200 hours so far !!! :O )

                                                                                                            I shall avoid manually retrying uploads on them, but I am having that sinking feeling that all that CPU time is gonna be wasted .. :`(

                                                                                                            BTW: two of the zip files do say they are at 100% uploaded, and a few of the others get to a certain point and stop...

                                                                                                            Profile Thyme Lawn
                                                                                                            Forum moderator
                                                                                                            Send message
                                                                                                            Joined: Aug 5 04
                                                                                                            Posts: 1212
                                                                                                            Credit: 10,214,791
                                                                                                            RAC: 721
                                                                                                            Message 40828 - Posted 9 Oct 2010 11:13:33 UTC - in response to Message 40824.

                                                                                                              Last modified: 9 Oct 2010 14:37:30 UTC

                                                                                                              Veebee wrote:

                                                                                                              Any idea how I can get the work/files up to the servers? CPDN is one of my fave projects and I don't want to stop crunching it.

                                                                                                              If BOINC is making simultaneous attempts to upload the files it's possible that you're hitting a 5 minute inactivity timeout on the files. That's most likely if you have a relatively slow connection, are restricting BOINC's upload bandwidth or have a busy connection (e.g. more than one computer attempting a large upload at the same time or a large non-BOINC file transfer on the same computer).

                                                                                                              I've found that when BOINC is doing multiple uploads it has a tendency to favour the most recently started upload. That can result in nothing being sent for uploads which are already in progress until the more recent upload has completed. If an upload is locked out in this way for longer than 5 minutes it is timed out and has to be restarted. The restart offset is negotiated with the server but very frequently the server seems to have lost track of how much has already been received (possibly something is causing it to delete the data it has already received?) and restarts from 0.
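
The lock-out described above comes down to simple arithmetic: if a newer upload takes the whole link for longer than the 5 minute inactivity timeout, the older upload is timed out. Here is a back-of-the-envelope Python sketch. It assumes, as a simplification, that the newer upload gets all of the upstream bandwidth; the file size and link speed are made-up figures, not measurements from this thread.

```python
INACTIVITY_TIMEOUT_S = 5 * 60  # the 5 minute inactivity timeout mentioned above


def stall_seconds(newer_upload_bytes, upstream_bits_per_sec):
    """Seconds an older upload sits idle while a newer one uses the whole link."""
    return newer_upload_bytes * 8 / upstream_bits_per_sec


def older_upload_times_out(newer_upload_bytes, upstream_bits_per_sec,
                           timeout_s=INACTIVITY_TIMEOUT_S):
    """True if the idle period exceeds the inactivity timeout."""
    return stall_seconds(newer_upload_bytes, upstream_bits_per_sec) > timeout_s


# Hypothetical case: a 20 MB zip on a 256 kbit/s upstream link.
print(stall_seconds(20_000_000, 256_000))           # 625.0
print(older_upload_times_out(20_000_000, 256_000))  # True

# Hypothetical case: a 5 MB zip on a 1 Mbit/s upstream link.
print(older_upload_times_out(5_000_000, 1_000_000))  # False
```

On a slow or throttled connection the first case is easy to hit, which fits the pattern of uploads that reach a certain point and then start again from 0.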

                                                                                                              The only way I've found of getting round this is to restrict the number of simultaneous uploads (the default is 8 in total and no more than 2 per project) by including the following in cc_config.xml:

                                                                                                              <cc_config>
                                                                                                              <options>
                                                                                                              <max_file_xfers>2</max_file_xfers>
                                                                                                              <max_file_xfers_per_project>1</max_file_xfers_per_project>
                                                                                                              </options>
                                                                                                              </cc_config>

                                                                                                              Depending on your mix of projects you might need to increase <max_file_xfers> (setting it to 1 is possible, but that would prevent other projects from uploading results until all of your CPDN files have been sent).
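As a rough illustration of how the two limits interact, here is a toy model. It assumes transfers are simply picked in queue order, which may not match what BOINC really does, and the project names and file names are hypothetical.

```python
# Toy model of the two limits: a total cap on simultaneous transfers and a
# per-project cap. Transfers are picked in queue order; BOINC's real ordering
# may differ, so treat this as an illustration only.
from collections import Counter


def active_transfers(queue, max_file_xfers, max_file_xfers_per_project):
    """Return the queued (project, filename) pairs allowed to run at once."""
    active = []
    per_project = Counter()
    for project, filename in queue:
        if len(active) >= max_file_xfers:
            break  # total cap reached
        if per_project[project] >= max_file_xfers_per_project:
            continue  # this project already has its share
        active.append((project, filename))
        per_project[project] += 1
    return active


if __name__ == "__main__":
    # Hypothetical queue: three CPDN uploads ahead of one from another project.
    queue = [
        ("cpdn", "a.zip"),
        ("cpdn", "b.zip"),
        ("cpdn", "c.zip"),
        ("other", "x.zip"),
    ]
    print(active_transfers(queue, 8, 2))  # the defaults quoted above
    print(active_transfers(queue, 2, 1))  # the suggested settings
    print(active_transfers(queue, 1, 1))  # only one transfer in total
```

In this model the suggested 2/1 settings let one CPDN file and one file from another project go at the same time, while a total of 1 leaves the other project waiting behind the CPDN queue.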
                                                                                                              ____________
                                                                                                              "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer

                                                                                                              Ingleside
                                                                                                              Joined: Aug 5 04
                                                                                                              Posts: 85
                                                                                                              Credit: 6,853,559
                                                                                                              RAC: 11,933
                                                                                                              Message 40831 - Posted 10 Oct 2010 11:16:45 UTC - in response to Message 40826.

                                                                                                                Last modified: 10 Oct 2010 11:22:28 UTC

                                                                                                                The timeout on file uploads was lengthened to 90 days so there's no rush from that point of view but I think each file is only allowed 100 upload attempts. Don't keep repeating manual retries (the Retry Now button) until someone can suggest more ideas.

                                                                                                                I'm not aware of any limit on the number of retries, and a quick test showed that manually increasing the count to 11000 connection attempts had no effect: the upload just kept retrying as before.

                                                                                                                Worth remembering, since many CPDN users still seem to use old BOINC clients, is that the increase to 90 days applies only to v6.10.xx and later clients.


                                                                                                                As for checking for connection problems, the first step is always to reboot any modems, routers and so on, and then the affected computer.

                                                                                                                If this doesn't work, try creating or editing a cc_config.xml (placed in the BOINC data directory) that contains at least the following lines:
                                                                                                                <cc_config>
                                                                                                                <log_flags>
                                                                                                                <file_xfer_debug>1</file_xfer_debug>
                                                                                                                <http_xfer_debug>1</http_xfer_debug>
                                                                                                                </log_flags>
                                                                                                                </cc_config>

                                                                                                                Then just select "Read config file" in BOINC Manager.

                                                                                                                Keeping <file_xfer_debug> always enabled is an advantage, since you'll always get info about which upload server the client is trying to connect to, making it easy to check on the server status page whether that server is down, without having to search through client_state.xml manually. The transfer speed will also be logged if the transfer was successful.

                                                                                                                The second option, <http_xfer_debug>, on the other hand produces a lot of extra output, so disabling it again after fixing the problem is recommended. To disable it, just change the 1 to 0 and re-read the config file.

                                                                                                                A couple of other <log_flags> that may also be useful are:
                                                                                                                <http_debug>1</http_debug>
                                                                                                                <proxy_debug>1</proxy_debug>
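If you would rather generate the file than hand-edit it, here is a minimal Python sketch. It combines the log flags above with the transfer limits suggested earlier in the thread. The element names are the ones quoted in the posts; the helper names `build_cc_config` and `render` are just for this example, and where you save the result is up to you.

```python
# Sketch: build a cc_config.xml that combines the log flags above with the
# transfer limits suggested earlier in the thread. Element names are the ones
# quoted in the posts; the helper names are made up for this example.
import xml.etree.ElementTree as ET


def build_cc_config(log_flags, options):
    """Return a <cc_config> element with <log_flags> and <options> children."""
    root = ET.Element("cc_config")
    flags_el = ET.SubElement(root, "log_flags")
    for name, enabled in log_flags.items():
        ET.SubElement(flags_el, name).text = "1" if enabled else "0"
    options_el = ET.SubElement(root, "options")
    for name, value in options.items():
        ET.SubElement(options_el, name).text = str(value)
    return root


def render(root):
    """Serialise the tree to a string with one element per line."""
    ET.indent(root)  # requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    config = build_cc_config(
        log_flags={"file_xfer_debug": True, "http_xfer_debug": True},
        options={"max_file_xfers": 2, "max_file_xfers_per_project": 1},
    )
    print(render(config))
```

Setting `http_xfer_debug` to `False` here writes a 0 for that flag, which matches the advice to switch it off again once the problem is fixed.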


                                                                                                                Edit: I see Gundolf Jahn also mentioned some of the log flags earlier in the thread.

                                                                                                                Profile Iain Inglis
                                                                                                                Forum moderator
                                                                                                                Joined: Jan 16 10
                                                                                                                Posts: 410
                                                                                                                Credit: 9,532
                                                                                                                RAC: 0
                                                                                                                Message 40835 - Posted 10 Oct 2010 20:18:02 UTC - in response to Message 40827.

                                                                                                                  [Veebee wrote:] ...(mind you, that one is a HADSM3 slab model - just had a "close look" nearly 1200 hours so far !!! :O )
                                                                                                                  That model, hadsm3dhet2_k8ob_006620893_7, has become a slow-processing 'iceworld'. Painful though it might be at this stage, the best thing to do with that model is to abort it: it will finish eventually, but the data from the freeze point onwards is invalid. Some efforts have been made to find the cause, which is so far proving elusive.

                                                                                                                  Darmok
                                                                                                                  Joined: Dec 29 09
                                                                                                                  Posts: 27
                                                                                                                  Credit: 2,411,171
                                                                                                                  RAC: 1,643
                                                                                                                  Message 40839 - Posted 11 Oct 2010 14:50:30 UTC

                                                                                                                    Last modified: 11 Oct 2010 14:50:59 UTC

                                                                                                                    I read Milo's announcement, but I have still been getting failed downloads for the past several days. Everything else is OK.

                                                                                                                    Started download of atmos_v3xe_1199_200_006736082_0.gz
                                                                                                                    Project communication failed: attempting access to reference site
                                                                                                                    Temporarily failed download of atmos_v3xe_1199_200_006736082_0.gz: HTTP error

                                                                                                                    Thanks

                                                                                                                    Profile astroWX
                                                                                                                    Forum moderator
                                                                                                                    Joined: Aug 5 04
                                                                                                                    Posts: 1250
                                                                                                                    Credit: 35,021,473
                                                                                                                    RAC: 23,333
                                                                                                                    Message 40840 - Posted 11 Oct 2010 23:23:38 UTC - in response to Message 40839.

                                                                                                                      I don't know what's wrong, but I had two files remaining, partially downloaded. After a day or two, I realized that each BOINC attempt downloaded a small bite of bytes. Because the remaining files were relatively small, already partially downloaded, and didn't restart each time, I decided to click 'Retry Now'... and click... and click.... Eventually, the downloads finished. (Pathetic way to get the job done, actually.)

                                                                                                                      Why the server permitted that bit of foolishness, when it refused to complete the transaction on its own, one can only guess.
                                                                                                                      ____________
                                                                                                                      "We have met the enemy and he is us." -- Pogo
                                                                                                                      Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                                                                      Darmok
                                                                                                                      Joined: Dec 29 09
                                                                                                                      Posts: 27
                                                                                                                      Credit: 2,411,171
                                                                                                                      RAC: 1,643
                                                                                                                      Message 40841 - Posted 12 Oct 2010 10:13:57 UTC - in response to Message 40840.

                                                                                                                        [astroWX wrote:] I decided to click 'Retry Now'... and click... and click.... Eventually, the downloads finished. (Pathetic way to get the job done, actually.)

                                                                                                                        Thanks AstroWX. This would confirm it is a problem with the CPDN download server.

                                                                                                                        I also clicked several times, but I had little confidence that the downloaded files were not corrupted, and I did not want to risk wasting computing time. So I regretfully aborted them and am holding off on downloads until this issue is resolved or there is confirmation that the models will run properly to the end.

                                                                                                                        Profile Milo Thurston
                                                                                                                        Forum moderator
                                                                                                                        Volunteer developer
                                                                                                                        Joined: Mar 2 06
                                                                                                                        Posts: 253
                                                                                                                        Credit: 363,646
                                                                                                                        RAC: 0
                                                                                                                        Message 40842 - Posted 12 Oct 2010 10:25:21 UTC

                                                                                                                          I know that there's a problem with downloads from climateapps2 and I am working as best I can to fix it.
                                                                                                                          I cannot say how many days it will take to fix as it depends upon many factors.

                                                                                                                          Profile Milo Thurston
                                                                                                                          Forum moderator
                                                                                                                          Volunteer developer
                                                                                                                          Joined: Mar 2 06
                                                                                                                          Posts: 253
                                                                                                                          Credit: 363,646
                                                                                                                          RAC: 0
                                                                                                                          Message 40858 - Posted 14 Oct 2010 9:04:06 UTC

                                                                                                                            Hiro was able to extract a disk from his cluster, which I have used to add more space to climateapps2. So, there should now be no problem with downloads.

                                                                                                                            Lockleys
                                                                                                                            Joined: Jan 13 07
                                                                                                                            Posts: 118
                                                                                                                            Credit: 3,775,535
                                                                                                                            RAC: 2,202
                                                                                                                            Message 40859 - Posted 14 Oct 2010 12:47:08 UTC

                                                                                                                              I too have experienced the problem described by astroWX and by Darmok, i.e. the download delivers about 1K before stopping. This happens on a couple of the FAMOUS files, but not all. One of them is the ocean download. I have this problem on 2 different PCs in 2 different locations with 2 different versions of Windows (one is XP, the other is 7) and 2 different routers. Several minutes of astroWX's click & retry worked, but gee it's tiresome.

                                                                                                                              Darmok
                                                                                                                              Joined: Dec 29 09
                                                                                                                              Posts: 27
                                                                                                                              Credit: 2,411,171
                                                                                                                              RAC: 1,643
                                                                                                                              Message 40863 - Posted 14 Oct 2010 23:40:07 UTC

                                                                                                                                Perfecto. Thanks Milo, Hiro and Astro.

                                                                                                                                BarryAZ
                                                                                                                                Joined: Jul 13 05
                                                                                                                                Posts: 112
                                                                                                                                Credit: 11,248,017
                                                                                                                                RAC: 1,580
                                                                                                                                Message 40892 - Posted 21 Oct 2010 16:34:57 UTC

                                                                                                                                  I see that Kraken remains offline and that it went offline due to a lack of disk storage several days ago. I read the update that indicated that Milo does not have regular access to Kraken to be able to resolve this as quickly as we all would like to see.

                                                                                                                                  That being said, is the resolution of this currently a matter of hours, days, weeks? I have a handful of the Famous work units either at 100% (and not uploading) or nearing 100% (and thus also about to not upload).

                                                                                                                                  I realize the trickle credits are still applied, but as these workunits fail to upload it means I likely won't get new work on some workstations. If the outage for Kraken is going to be extended (weeks rather than days), would it be best for those workstations which reach 100% to simply abort the uploads and reset which would result in non-Famous work being downloaded?

                                                                                                                                  The question I guess, is how useful to the project are the completed work units? I can simply leave the completed work on these workstations waiting on the return of Kraken to the living -- it would mean that other projects get an increased processing share by default -- not that big a deal.
                                                                                                                                  ____________

                                                                                                                                  Profile astroWX
                                                                                                                                  Forum moderator
                                                                                                                                  Joined: Aug 5 04
                                                                                                                                  Posts: 1250
                                                                                                                                  Credit: 35,021,473
                                                                                                                                  RAC: 23,333
                                                                                                                                  Message 40893 - Posted 21 Oct 2010 18:33:39 UTC - in response to Message 40892.

                                                                                                                                    Barry,

                                                                                                                                    FAMOUS results are important for Hiro's support of the Millennium Project. (I'm holding mine behind suspended network activity.)

                                                                                                                                    There is a RAID array in the pipeline, discussed elsewhere, that should eliminate the upload problem -- for a while, at least. Early next week, we hope ...
                                                                                                                                    ____________
                                                                                                                                    "We have met the enemy and he is us." -- Pogo
                                                                                                                                    Greetings from coastal Washington state, the scenic US Pacific Northwest.

                                                                                                                                    Les Bayliss
                                                                                                                                    Forum moderator
                                                                                                                                    Joined: Sep 5 04
                                                                                                                                    Posts: 5131
                                                                                                                                    Credit: 8,469,789
                                                                                                                                    RAC: 6,762
                                                                                                                                    Message 40894 - Posted 21 Oct 2010 18:58:59 UTC - in response to Message 40892.

                                                                                                                                      As said many times, ALL results are important to whoever's research project it is.
                                                                                                                                      That's why these data transfers take so long; there are terabytes of data stored here and there now. Unlike other projects, where a "no good" result means that the data can be dumped, here it ALL gets kept.

                                                                                                                                      The many servers here are scattered all through Oxford, in whatever department has space in a server cabinet. When a server needs a hard reset, Milo has to walk to whichever building houses it, and go through an "I need to reset a server. Can I get into your server room please?" routine.

                                                                                                                                      In the case of Kraken, Milo's only allowed to use the network in that area for data transfer during working hours. So, about a third of a day. And, as the full transfer takes a day or two, this is going to take a few more days.
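For anyone wanting the arithmetic spelled out, here is a rough sketch. It assumes "a day or two" means 24 to 48 hours of continuous transfer, and that working hours come to about 8 hours a day; both figures are a reading of the paragraph above, not official numbers.

```python
# Rough arithmetic, assuming "a day or two" means 24-48 hours of continuous
# transfer and "working hours" means about 8 hours a day (a third of a day).
import math


def calendar_days(transfer_hours: float, window_hours_per_day: float = 8.0) -> int:
    """Whole calendar days needed when only part of each day is usable."""
    return math.ceil(transfer_hours / window_hours_per_day)


if __name__ == "__main__":
    for hours in (24, 48):
        print(f"{hours} h of transfer -> about {calendar_days(hours)} days")
```

On those assumptions the transfer stretches to somewhere between three and six calendar days.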

                                                                                                                                      Apologies to all for the problems.

                                                                                                                                      BarryAZ
                                                                                                                                      Joined: Jul 13 05
                                                                                                                                      Posts: 112
                                                                                                                                      Credit: 11,248,017
                                                                                                                                      RAC: 1,580
                                                                                                                                      Message 40895 - Posted 21 Oct 2010 21:17:53 UTC - in response to Message 40894.

                                                                                                                                        OK -- thanks for the answers -- I can certainly hang on to the completed work. As I run multiple projects and suspending network activity is not a project-specific option, the alternative is that my other projects will for now pick up the slack if (because of an 'in process' Climate work unit upload) no new Climate work can be downloaded.

                                                                                                                                        And I do appreciate the timeline as to when to expect (or hope for) Kraken to be back online.





                                                                                                                                        ____________





                                                                                                                                        Copyright © 2002-2014 climateprediction.net