climateprediction.net home page
Unrecoverable error for result sulphur_iy78_000884132_0

old_user129496
Joined: 5 Dec 05
Posts: 1
Credit: 806,307
RAC: 0
Message 19146 - Posted: 10 Jan 2006, 17:32:52 UTC

This problem started a couple of days before the servers went down.

Here are the steps I take that get me to the problem:

1. I have the machine update the project.
2. The system requests 8640 seconds of new work.
3. The system downloads sulphur_iy78_000884132.zip
4. The system then starts processing the result.

At this point I get the error message -

Unrecoverable error for result sulphur_iy78_000884132_0
(- exit code -1073741819 (0xc0000005))

I then cannot receive any new work for the day because I have reached the daily quota of 1 result.

System information -

Computer name - computer1
OS - XP Home SP2
BOINC - 5.2.13
Processor - P4 2.53GHZ 500 FSB
Memory - 785200KB (roughly 767MB)
ID: 19146
old_user109091

Joined: 17 Nov 05
Posts: 4
Credit: 476,967
RAC: 0
Message 19149 - Posted: 10 Jan 2006, 18:23:41 UTC

I have also been having similar problems. I recently came back to my office computer after being away for a couple of weeks over the break. I was startled to see that it had tried to download and process no fewer than 26 new WUs during that time, and crashed almost every time. Furthermore, my home computer has tried 3 different WUs, and the last 2 have quit with unrecoverable errors after Phase 1. I have no idea what is going on. Below is a snippet of the error log from my office computer:

2005-12-23 19:32:12 [climateprediction.net] Message from server: No work sent
2005-12-23 19:32:12 [climateprediction.net] Message from server: (reached daily quota of 1 results)
2005-12-23 19:32:12 [climateprediction.net] No work from project
2005-12-24 19:09:29 [climateprediction.net] Unrecoverable error for result sulphur_hm69_000821889_0 (<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_1.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_2.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
)

This same basic block of errors is repeated several times in the log, each time for a different WU attempt. It's really annoying to see my results page cluttered with all these failed attempts! Also, why would my home computer quit after phase 1 each time?

Dan
ID: 19149
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 19151 - Posted: 10 Jan 2006, 19:00:04 UTC
Last modified: 10 Jan 2006, 19:23:51 UTC

Michael
The "1073741819" error is well known, but unfortunately there is no known cure.
It appears to have something to do with Microsoft programs.
Lately, it has been suggested that it may be related to Direct-X, and that programs using "3D" graphics may be involved.
I think I read this in a post on another project, but I can't remember which.
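
For anyone trying to look this number up: the client log shows the exit code as a signed 32-bit integer. Reinterpreted as unsigned, it is 0xC0000005, the Windows status code for an access violation. That is consistent with a crash on a bad memory access, but it does not say which program or driver is responsible. A minimal Python sketch of the conversion (the function name is only for illustration):

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# The exit code reported at the top of this thread.
print(to_unsigned_hex(-1073741819))  # 0xC0000005
```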

ID: 19151
old_user138545

Joined: 15 Dec 05
Posts: 5
Credit: 134,136
RAC: 0
Message 19152 - Posted: 10 Jan 2006, 19:02:25 UTC

I have the same kind of problem. When phase 1 is complete (ca. 20%), it takes a long time to finish the WU for transport. Then the ZIP file error occurs. I run Climateprediction on two machines, one with Windows Pro and one with Windows Home. Both have the same problem. The one with Windows Pro did give an error (and did leave the GZ files on the drive), but my result view on the Net said that the WU was done. The same fault on my Windows Home machine resulted in an error in my result view too.

That means that, although the error occurs, it is possible to send a correct result (Result ID: 1472216, Work Unit ID: 975038). So the ZIP files must be created correctly, and something goes wrong when they are transferred to the server. I have just two results so far, and it takes a long time to calculate another, so I can't test it soon. Are there more people with the same problem, or does someone from ClimatePrediction know the answer?

Eddy.

PS.
Below is my error (the one that also shows as a problem in my result view on the Net)
(Outcome: Client Error, Client state: Computing (but not running on my computer), Result ID: 1474449, Work Unit ID: 977253)

10-1-2006 4:37:40|climateprediction.net|Unrecoverable error for result sulphur_gh6i_000768762_0 (<file_xfer_error> <file_name>sulphur_gh6i_000768762_0_2.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_gh6i_000768762_0_3.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_gh6i_000768762_0_4.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_gh6i_000768762_0_5.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error>)
ID: 19152
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 19153 - Posted: 10 Jan 2006, 19:04:15 UTC

Dan
Two things:

1) To deal with the display problem when posting quotes containing angle brackets, you need to use e.g. WinWord to do a replace on the less than/greater than symbols. Use square brackets instead.

2) The 161 error is a red herring. It just means something like "the error files BOINC is trying to upload don't exist, or are empty".
The REAL error message is missing. This seems to be a recent "problem", and I don't know why it is happening, just that it is. A LOT!
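
If the log is long, the transfer errors can be pulled out mechanically to see which upload files were affected. The sketch below is illustrative only: it assumes the log text uses the file_xfer_error layout quoted earlier in this thread, and the function name xfer_errors is made up for the example.

```python
import re

# Matches one <file_xfer_error> entry and captures the file name and error code.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>"
)


def xfer_errors(log_text: str) -> list[tuple[str, int]]:
    """Return (file name, error code) pairs found in a chunk of client log."""
    return [(m["name"], int(m["code"])) for m in XFER_ERROR.finditer(log_text)]


sample = """<file_xfer_error>
<file_name>sulphur_hm69_000821889_0_1.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>"""

print(xfer_errors(sample))
```

A list full of -161 entries only confirms that the upload files were missing or empty; the actual cause still has to come from the model's own output.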

If you look in the file yabsd.out, which is in the dataout folder of your model, there should be an error message, or description, at the bottom of the file. THIS will tell you (or us) what REALLY happened.
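
Since the file can be large, one way to grab just the end of it is a short script. This is a sketch under assumptions: the path below is a placeholder (the real dataout folder sits inside your model's own directory, which differs per machine), and the helper name last_lines is hypothetical.

```python
from collections import deque
from pathlib import Path


def last_lines(path, count=20):
    """Return the last `count` lines of a text file without loading all of it."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


# Placeholder path: point this at yabsd.out in your model's dataout folder.
yabsd = Path("dataout") / "yabsd.out"
if yabsd.exists():
    print("\n".join(last_lines(yabsd)))
else:
    print(f"{yabsd} not found")
```

The printed lines can then be pasted into a post here.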

The usual causes of failures are overheating (caused by lack of air flow in the case, and/or dust on the heatsink), overclocking (the processor just can't handle the intense, continuous calcs at that speed), an aggressive AV program which locks files that are being written, just so it can check them (Avast, Antivir), and, I feel, a bare minimum power supply which lets the voltages sag under load.

ID: 19153
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,478,808
RAC: 4,045
Message 19155 - Posted: 10 Jan 2006, 19:20:19 UTC

To those of you having errors after phase 1: it has to do with the work units that were created between December 9th and 22nd. These work units will error just after the end of the first phase because there was a mistake in the work unit xml file. Even though they errored out, the first phase uploads from all these sulphur models are valuable.

See this post on the phpBB forum that describes the problem (there are also a few posts in that thread that describe other problems).
ID: 19155
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 19156 - Posted: 10 Jan 2006, 19:23:14 UTC
Last modified: 10 Jan 2006, 19:26:07 UTC

Edman
The models on BOTH of your computers have crashed.
The one that says "Success" just has a problem with the labelling on the server.
If you look at the results for each, they got to 24 trickles, which is the end of phase 1. So they both have 4 more phases to go; a complete model is 5 phases, which is a total of 120 trickles.
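
As a rough check of where a model stopped, the trickle count can be turned into a phase and a percentage. This sketch assumes every phase has the same 24 trickles, which fits the figures above (24 at the end of phase 1, 120 for the whole run); the names are illustrative.

```python
TRICKLES_PER_PHASE = 24  # assumed equal for every phase
PHASES = 5               # phase 1 plus the 4 that remain


def progress(trickles):
    """Return (completed phases, percent of the full run) for a trickle count."""
    total = TRICKLES_PER_PHASE * PHASES
    return trickles // TRICKLES_PER_PHASE, 100 * trickles / total


print(progress(24))  # (1, 20.0)
```

The 20% figure matches the point where the failing models in this thread stopped.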

Apparently the data sent to the server at the end of phase 1 is quite useful, although not nearly as good as if the models had completed.
But you got a lot further than some people.

As for what happened, it's hard to say. Perhaps if you check the yabsd.out file and paste the last few lines here, it may help to work it out.
There is a LOT of disk activity at the end of each phase, and perhaps some other program got in the way at a critical moment.

edit
OK, Geophi's post explains a few things.

ID: 19156
old_user138545

Joined: 15 Dec 05
Posts: 5
Credit: 134,136
RAC: 0
Message 19157 - Posted: 10 Jan 2006, 19:28:20 UTC - in response to Message 19156.  

Les, thank you for the fast response. I've looked it up. For both WUs there was the same message at the end of yabsd.out:

Mismatch in no of prognostic fields.
No of prog fields in Atmos dump 388
No of prog fields expected 389

Run RECONFIGURATION to get correct no of prognostic fields in atmos dump
or
Check/Reset experiment in User Interface

*********************************************************************************
Model aborted with error code - 102 Routine and message:-
INITDUMP: Wrong no of atmos prognostic fields
*********************************************************************************

Do I have to do something to remove the files that are left on my hard drive (approx. 215 MB each)?

Edman

ID: 19157
ninesouls

Joined: 18 Sep 04
Posts: 2
Credit: 2,451,088
RAC: 0
Message 19405 - Posted: 18 Jan 2006, 7:39:15 UTC - in response to Message 19157.  

Well, I'm seeing this problem. Actually, between that and the 404 issue, I haven't been able to do any work whatsoever since the servers shut down (and I never had this problem before). It downloads two work units (I have hyperthreading enabled), fails when it tries to run them (1073741819 error) and then can't download anything else. This also happens immediately after a reboot, when the system should theoretically be clean of any software that might interfere with BOINC (well, there are the Windows services, of course, anti-virus, Skype etc., but no open application). I tried to reset the project: nothing. I tried looking for the yabsd.out file but couldn't find one (the dataout directory is empty). Any suggestions?
ID: 19405
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 19406 - Posted: 18 Jan 2006, 8:13:53 UTC

Edman
Sorry I didn't get back earlier, I seem to have missed your post.

It's a bit old, but this may help with the deleting: http://www.climateprediction.net/board/viewtopic.php?t=2951

**********

Edman and ninesouls

There are a few mentions of "Mismatch in no of prognostic fields" in this thread: http://www.climateprediction.net/board/viewtopic.php?t=3412&postdays=0&postorder=asc&start=45
Note the last 3 posts at the bottom of page 4, and the first few at the top of page 5. It's possible that the two of you had one of these faulty models.

ID: 19406
old_user138545

Joined: 15 Dec 05
Posts: 5
Credit: 134,136
RAC: 0
Message 19419 - Posted: 18 Jan 2006, 19:52:21 UTC - in response to Message 19406.  

Les,

Thanks for the reply. I'm always amazed that you people have the time to reply at all. As far as I can see, I'm not the only one with a problem. I have saved my crashed results to my backup drive (that problem is solved now). Only the reason remains. I have to wait a little (only at 13.6% now) before a new phase is done. We will see.

Eddy.
ID: 19419
old_user5994
Joined: 31 Aug 04
Posts: 239
Credit: 2,933,299
RAC: 0
Message 19811 - Posted: 31 Jan 2006, 6:11:14 UTC - in response to Message 19151.  

Michael
The "1073741819" error is well known, but unfortunately there is no known cure.
It appears to have something to do with Microsoft programs.
Lately, it has been suggested that it may be related to Direct-X, and that programs using "3D" graphics may be involved.
I think I read this in a post on another project, but I can't remember which.

SETI@Home, Einstein@Home, Rosetta@Home ...

The real kicker is that on some systems the problem is easily found in that it happens on screen saver use. Cure, no screen saver.

However, in MOST cases, the program causing the error seems to be external to BOINC. Which programs, or whether it is simply bad video drivers, is still unknown. Again, some have solved the issue by updating the video drivers. Others have not.

If you can locate the sensitive area it would help ... like does it happen when you launch Quake with an ATI card? Or Doom IV with nVidia?
ID: 19811
old_user33284

Joined: 17 Dec 04
Posts: 1
Credit: 767,764
RAC: 0
Message 20614 - Posted: 22 Feb 2006, 12:21:02 UTC - in response to Message 19149.  

I've had the same problem on machines where BOINC is installed for multiple users.
I've recently installed BOINC as single user and climateprediction has been OK.
ID: 20614


©2024 climateprediction.net