model crash

Questions and Answers : Windows : model crash

Steinar1965

Joined: 4 Sep 06
Posts: 79
Credit: 5,583,517
RAC: 0
Message 30429 - Posted: 8 Sep 2007, 8:03:05 UTC

One of my models crashed when I started the PC today. I took a backup before I turned off the PC yesterday. The model finished and a new one was downloaded. There are no special circumstances that I can see.
Should I restore from backup, or is there something wrong with the model?
If I restore, what about the fifth model (the one that downloaded this morning)?

Another thing: I attached from www.climateprediction.net. In the message box I get a message saying that I should detach "when convenient" and attach from the right URL. Can I do that before the models are finished?
ID: 30429
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30430 - Posted: 8 Sep 2007, 8:36:51 UTC

The relevant section from your model's error text is quoted at the end of this post.

I'd guess a temporary glitch on your PC, so restoring from backup will be a good idea (congratulations on having such an up-to-date one :-) )

The new model will disappear from your PC, and will eventually time out on the server.

If you detach / reattach you'll lose all your models. I'd suggest setting 'no more work' against the project, and changing the URL once they have all finished.

Alternatively, somewhere around on this forum is a little programme which tries to fix the problem. If you give it a go, remember to take another backup first, just in case.
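
For anyone who wants to script the "take another backup first" step, here is a minimal sketch in Python. It only copies a folder into a new timestamped folder. The function name `backup_directory` and the demo paths are made up for illustration; the location of your BOINC data folder is something you would need to fill in yourself, and the sketch assumes BOINC is not writing to that folder while the copy runs.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_directory(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    # copytree creates `target` (and missing parents) and raises if it exists.
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Demonstration on throwaway folders; substitute your own paths.
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "BOINC"
        (data / "projects").mkdir(parents=True)
        (data / "projects" / "state.txt").write_text("example")
        copy = backup_directory(data, Path(tmp) / "backups")
        print("Backup written to", copy)
```

Keeping each backup in its own timestamped folder means an older good copy is never overwritten by a newer, possibly broken one.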

CPDN Monitor - Quit request from BOINC...
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
cpdnmonitor: cannot open input file dataout/atmos_restart.day
cpdnmonitor: cannot open input file dataout/ocean_restart.day

Model crashed: umshell1.f: READ_FLH: I/O error
@
Sorry, too many model crashes! :-(
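
As an aside, the repeated crash lines in an excerpt like the one above can be tallied with a short script. This is only a sketch: the pattern is inferred from the lines quoted in this thread ("Model crashed: file: ROUTINE: message", sometimes with a trailing "@"), not from any documented log format, and `summarize_crashes` is a made-up name.

```python
import re
from collections import Counter

# Pattern inferred from the excerpts in this thread, e.g.
#   "Model crashed: umshell1.f: READ_FLH: I/O error"
# A trailing "@" on the same line is tolerated and dropped.
CRASH_RE = re.compile(r"Model crashed:\s*(\S+):\s*(\w+):\s*(.+?)@?\s*$")


def summarize_crashes(log_text: str) -> Counter:
    """Count crash lines, keyed by (source file, routine, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CRASH_RE.search(line)
        if match:
            source, routine, message = match.groups()
            counts[(source, routine, message.strip())] += 1
    return counts


if __name__ == "__main__":
    sample = """\
cpdnmonitor: cannot open input file dataout/atmos_restart.day
Model crashed: umshell1.f: READ_FLH: I/O error
@
Model crashed: umshell1.f: READ_FLH: I/O error
@
Sorry, too many model crashes! :-(
"""
    for (source, routine, message), n in summarize_crashes(sample).items():
        print(f"{n} x {source} / {routine}: {message}")
```

Grouping by routine makes it easy to see whether a model kept failing in the same place (as here) or in several different ones.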

I'm a volunteer and my views are my own.
ID: 30430
Steinar1965

Joined: 4 Sep 06
Posts: 79
Credit: 5,583,517
RAC: 0
Message 30436 - Posted: 8 Sep 2007, 16:00:57 UTC - in response to Message 30430.  

I have had serious problems with a lot of programmes; I don't know what happened, since I have not changed anything. More models crashed and uploaded, and then another model was downloaded. I reinstalled the whole PC to avoid more crashed models. Sorry for the problems with the crashed models.
ID: 30436
astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 30447 - Posted: 8 Sep 2007, 21:34:36 UTC

The 'fix' links Mike mentioned:

Thyme Lawn's fix for incorrect Project names: not attached to 'bbc.cpdn.org' or 'climateprediction.net':
http://www.climateprediction.net/board/viewtopic.php?t=4916
http://www.climateprediction.net/board/viewtopic.php?p=44068
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 30447
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30454 - Posted: 9 Sep 2007, 9:03:17 UTC


I've added that link to the READMEs since it was missing...

ID: 30454
Steinar1965

Joined: 4 Sep 06
Posts: 79
Credit: 5,583,517
RAC: 0
Message 30455 - Posted: 9 Sep 2007, 9:39:07 UTC

Since there were a lot of problems, I reinstalled the whole machine. I reinstalled BOINC and started all over. Again one model crashed. I got the message:

09.09.2007 08:34:29|climateprediction.net|Reason: Unrecoverable error for result hadcm3iozn_cpyi_2000_80_45899411_1 (The device does not recognize the command. (0x16) - exit code 22 (0x16))

The PC was reinstalled yesterday; I left the house once that was done, and no one has touched the PC since. Yet one model crashed. The other three are still running, but I'm afraid I'm going to be a constant model-demolisher.

Is it something with the PC? I attached to the right URL this time.
When I reinstalled, I deleted the partition and reformatted the disk, so everything should be in order.
I have BOINC version 5.10.20.
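
In case it helps anyone reading along, the message line quoted above can be pulled apart with a few lines of Python. This is a rough sketch that assumes the "timestamp|project|text" layout of that one line; `parse_boinc_message` is a made-up name, and other BOINC messages may well look different.

```python
import re

# Assumed layout, based only on the single message quoted in this post:
#   "<timestamp>|<project>|<free text>"
# The free text is searched for a result name and an exit code given in
# both decimal and hex, e.g. "exit code 22 (0x16)".
EXIT_RE = re.compile(r"result (\S+) .*exit code (\d+) \(0x([0-9A-Fa-f]+)\)")


def parse_boinc_message(line: str) -> dict:
    """Split one message line and extract the result name and exit code."""
    timestamp, project, text = line.split("|", 2)
    info = {
        "timestamp": timestamp,  # kept as text; the date format is locale-dependent
        "project": project,
        "result": None,
        "exit_code": None,
    }
    match = EXIT_RE.search(text)
    if match:
        info["result"] = match.group(1)
        info["exit_code"] = int(match.group(2))
        # Sanity check: the decimal and hex forms should be the same number.
        info["hex_matches"] = int(match.group(3), 16) == info["exit_code"]
    return info


if __name__ == "__main__":
    line = (
        "09.09.2007 08:34:29|climateprediction.net|Reason: Unrecoverable error "
        "for result hadcm3iozn_cpyi_2000_80_45899411_1 (The device does not "
        "recognize the command. (0x16) - exit code 22 (0x16))"
    )
    print(parse_boinc_message(line))
```

The exit code and result name are the two pieces a moderator usually needs in order to find the matching result page on the server.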
ID: 30455
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30456 - Posted: 9 Sep 2007, 9:47:19 UTC
Last modified: 9 Sep 2007, 9:54:42 UTC


This error is associated with floating point errors. It'd be well worth running Prime95's torture test (the 'gold standard' is to run it for 24 hours, once for each core). That'll tell you if the PC's hardware is behaving.

If Prime95 falls over before the 24 hours is complete, there are several things which can cause it to fail: overheating, a bad memory stick, a power supply which is starting to fail, or overclocking too high.

There is also something called Orthos which is supposed to do the same tests, but automatically runs on each core (rather than having to fiddle with the Prime95 settings). I've never run it, but if anyone has experience with it we'd be interested to know how it compares in stress testing to P95.

When I was O/C-ing my Q6600, I had to drop the clock a huge amount to make it Prime95 stable (unlike AMD's X2 where I only had to drop it a little once it was booting).

Assuming I'm looking at the right machine (model started on the 8th, crashed on the 9th, by UK time):
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6804372

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields@
Sorry, too many model crashes! :-(


ID: 30456


©2024 climateprediction.net