Model crashed: REPLANCA: Current time precedes start time of data

Message boards : Number crunching : Model crashed: REPLANCA: Current time precedes start time of data

Jean-David Beyer

Joined: 5 Aug 04
Posts: 1055
Credit: 16,519,286
RAC: 1,107
Message 48417 - Posted: 17 Mar 2014, 13:02:56 UTC

Is this a new problem? My computer's time is correct, if that matters.

UK Met Office HADAM3P European Region v6.09
Stderr:

<core_client_version>6.10.45</core_client_version>
<![CDATA[
<stderr_txt>

Model crashed: REPLANCA: Current time precedes start time of data tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>hadam3p_eu_c1fl_1997_1_008565930_0_1.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>hadam3p_eu_c1fl_1997_1_008565930_0_2.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>


[etc.]

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 48418 - Posted: 17 Mar 2014, 13:28:42 UTC - in response to Message 48417.  
Last modified: 17 Mar 2014, 13:29:44 UTC

The REPLANCA error has certainly been discussed in the past. It is almost certainly a model problem rather than anything to do with your computer. I can't remember whether it was OS-dependent. I see that I have just downloaded some of these tasks, and given that I am also a Windows-free zone, I do not feel optimistic.

I can't remember whether it was here or on the other now defunct board that I saw it discussed.

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1081
Credit: 6,980,320
RAC: 3,893
Message 48419 - Posted: 17 Mar 2014, 15:49:29 UTC

Thanks, both. Passed on to project staff. Since these models are marked '1997' rather than '2013', I don't know whether they're part of the flood analysis or something else entirely.

Will report back when more information becomes available.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 48420 - Posted: 17 Mar 2014, 15:54:59 UTC

Thanks, Iain. I see that some other tasks in some of the work units have failed with the REPLANCA error since I last looked, including some running on Windows boxes, so at least on this occasion it is not OS-dependent.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 48430 - Posted: 18 Mar 2014, 4:06:52 UTC

I had four 1997 EUR models downloaded on 17 March. All crashed at exactly the same moment (1m21s) with REPLANCA.

As soon as they'd started I tried to open the graphics globe of one of them to see whether there was anything of interest. The attempt to open the graphics window failed (the window was just an outline filled with black) but it had a terrible effect on BOINC Manager. All visible progress by other normal tasks in the Tasks pane froze and then the whole Tasks pane (or maybe the whole of BM, I can't remember) greyed out. After the four tasks had crashed, BM returned to normal and the graphics window showed a blue globe with zero crunching recorded.

REPLANCA gives BOINC a bit of a fright.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 48432 - Posted: 18 Mar 2014, 8:08:35 UTC
Last modified: 18 Mar 2014, 8:30:52 UTC

Which leads to the question, should I just delete the two 1997 models I have downloaded now before they start? This would give my computer a chance of picking up anything else going. Alternatively I could briefly suspend the 2014 models I have running to confirm they crash out.

I did the latter and they duly crashed.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 48433 - Posted: 18 Mar 2014, 15:21:04 UTC

For anyone reading this with 1997 EUR models created on 17 March 2014, it would be worth doing the same as Dave: force them to run immediately by suspending other tasks. As they will almost certainly crash after a few seconds, you will then be more likely to receive work when the next good batch of models appears.

MartinNZ
Joined: 22 Mar 06
Posts: 144
Credit: 24,695,428
RAC: 0
Message 48437 - Posted: 18 Mar 2014, 19:55:58 UTC - in response to Message 48433.  

I've just had 8 of these crash on 1290283. I see they have been passed on to other PCs, presumably to crash again.
Win7 64-bit on my machine, so it looks like a model error. Shame, as things were going quite well.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 48439 - Posted: 18 Mar 2014, 20:27:20 UTC

At least these models aren't consuming much time or electricity.

When we see crash reasons for climate models spelled out in capital letters - REPLANCA, not Replanca or replanca - I now assume this follows some Met Office convention (they wrote the code) and means the cause is probably intrinsic to the model.

Not that all model defects produce an explanation in upper case. They don't.

One of my models waiting to be crunched from a completely different batch crashed on another computer with the stderr line:

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO

I've no idea what this means but I'm assuming that the capital letters indicate a model with a different intrinsic defect.
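
For anyone who wants to sort failed tasks by the routine named in the crash line, here is a minimal Python sketch. It assumes the stderr line always has the shape "Model crashed: ROUTINE: message" with the routine name in capitals, as in the two examples quoted in this thread; that shape is a guess from those examples, not a documented format, and `parse_crash` and `count_routines` are hypothetical helper names.

```python
import re

# Assumed shape of the crash line, based only on the examples quoted in this thread:
#   Model crashed: <ROUTINE IN CAPITALS>: <free-text message>
CRASH_RE = re.compile(r"Model crashed: (?P<routine>[A-Z0-9_]+): (?P<message>.+)")


def parse_crash(line):
    """Return (routine, message) for a crash line, or None if the line does not match."""
    match = CRASH_RE.search(line)
    if match is None:
        return None
    return match.group("routine"), match.group("message").strip()


def count_routines(stderr_text):
    """Count how often each routine name appears in crash lines of a stderr dump."""
    counts = {}
    for line in stderr_text.splitlines():
        parsed = parse_crash(line)
        if parsed is not None:
            routine = parsed[0]
            counts[routine] = counts.get(routine, 0) + 1
    return counts


if __name__ == "__main__":
    sample = (
        "Model crashed: REPLANCA: Current time precedes start time of data\n"
        "Leaving CPDN_Main::Monitor...\n"
        "Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO\n"
    )
    print(count_routines(sample))
```

A crash line without an upper-case routine name would simply not match, which fits the point above that not every model defect produces an upper-case explanation.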

[AF>Le_Pommier] Jerome_C2005
Joined: 21 Oct 10
Posts: 53
Credit: 2,101,753
RAC: 3,985
Message 48441 - Posted: 18 Mar 2014, 21:19:36 UTC

I got the same issue on a Win7 machine with 2 WUs (this and that). Too bad, since the computer where it happened only runs 2 CPDN WUs and has no Internet access, so I am forced to use a USB key and an old portable version of BOINC to keep it running. And considering how difficult it is to get WUs lately, pfff...

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1081
Credit: 6,980,320
RAC: 3,893
Message 48444 - Posted: 18 Mar 2014, 21:52:28 UTC

REPLANCA is the name of a Fortran function (or should I say FORTRAN) that deals with times. The kind of thing that sends it into a spin is running a 360 days per year model as if it's 365 days per year. I don't suppose that's the case here but there has been some model configuration error somewhere. Unless the project team find a way of neatly pulling that batch then we just have to do it the messy way.
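
To illustrate the 360-versus-365 point, here is a rough Python sketch. It is not the REPLANCA code: the function names and the 1970 reference year are made up for the example, and the only thing it shows is that the same date gives a smaller day count in a 360-day calendar than in the real calendar, so a model clock counted one way can appear to sit before a data start time counted the other way.

```python
from datetime import date

REF_YEAR = 1970  # hypothetical reference year, chosen only for this illustration


def days_360(year, month, day, ref_year=REF_YEAR):
    """Days since 1 January ref_year in a 360-day calendar (12 months of 30 days)."""
    return (year - ref_year) * 360 + (month - 1) * 30 + (day - 1)


def days_gregorian(year, month, day, ref_year=REF_YEAR):
    """Days since 1 January ref_year in the ordinary (Gregorian) calendar."""
    return (date(year, month, day) - date(ref_year, 1, 1)).days


def precedes_start(current_days, data_start_days):
    """True if the model's current time falls before the start of the data."""
    return current_days < data_start_days


if __name__ == "__main__":
    model_now = days_360(1997, 1, 1)         # 9720: model clock on a 360-day calendar
    data_start = days_gregorian(1997, 1, 1)  # 9862: same date counted on the real calendar
    if precedes_start(model_now, data_start):
        print("Current time precedes start time of data")
```

Both day counts refer to 1 January 1997, yet the 360-day count is 142 days smaller, which is the kind of mismatch that would trip a start-time check.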

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 24,513,467
RAC: 1,604
Message 58914 - Posted: 25 Oct 2018, 18:56:12 UTC

Hi folks,

I'm using this old REPLANCA thread to report a pnw25 model from batch 757 that gave the following error on 2 machines; the 3rd attempt will be on one of my Win7 machines.

Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xadae.pipe_dummy
Leaving CPDN_Main::Monitor...
04:33:59 (37580): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>wah2_pnw25_puy7_206809_28_757_011648678_1_r1093481325_28.zip</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wah2_pnw25_puy7_206809_28_757_011648678_1_r1093481325_restart.zip</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>

Is this a problem with the batch or normal model crash?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58915 - Posted: 25 Oct 2018, 22:44:34 UTC

That's a known problem with that batch.

Some list ran out before the others.
Ooops.

But not for all of the batch.
I emailed them on the 14th when I had it.

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 24,513,467
RAC: 1,604
Message 58917 - Posted: 26 Oct 2018, 14:40:19 UTC - in response to Message 58915.  

Should I abort then or let it try?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 58918 - Posted: 26 Oct 2018, 20:30:05 UTC

You may as well Abort it.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 58919 - Posted: 27 Oct 2018, 6:32:09 UTC - in response to Message 58918.  

I have one from that batch that is on its first try, so I will let it run and see what happens, especially as there doesn't seem to be any new work at the moment.

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1081
Credit: 6,980,320
RAC: 3,893
Message 58920 - Posted: 27 Oct 2018, 14:09:32 UTC

For batches 754 and 757 I've had more successes than failures (all failures being REPLANCA errors): batch 754 = 7/9 and batch 757 = 3/5. So I am continuing to run them.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 58921 - Posted: 28 Oct 2018, 8:09:53 UTC - in response to Message 58920.  
Last modified: 28 Oct 2018, 8:23:37 UTC

One of my 757s has failed once, as has my 754, so I will check whether they failed with the REPLANCA error.

Edit: All the second/third-run ones are seg faults, so I will let them run.

©2024 climateprediction.net