Task 15335471

Name hadam3p_pnw_364s_1967_1_008216740_0
Workunit 8371864
Created 5 Oct 2012, 13:36:22 UTC
Sent 5 Oct 2012, 13:42:16 UTC
Report deadline 17 Sep 2013, 19:02:16 UTC
Received 8 Nov 2012, 19:51:55 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1168410
Run time 6 days 1 hours 3 min 28 sec
CPU time 4 days 7 hours 59 min 10 sec
Validate state Invalid
Credit 2,505.24
Device peak FLOPS 2.28 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 (windows_intelx86)
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7596, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5844, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6996, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7740, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6280, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8072, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7992, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7880, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6676, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5964, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6884, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3480, selfPID
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8132, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6768, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5184, selfPID=3888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5868, selfPID=8088, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7268, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7228, selfPID=6464, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3232, selfPID=8016, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8064, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7032, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6252, selfPID=6228, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7136, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7016, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1304, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5336, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6612, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2796, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
18:40:16 (5656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1872, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4104, selfPID=7564, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7772, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4436, iMonCtr=2
Model crash detected, will try to restart...
15:36:59 (7064): No heartbeat from core client for 30 sec - exiting
15:37:01 (7064): No heartbeat from core client for 30 sec - exiting
15:37:02 (7064): No heartbeat from core client for 30 sec - exiting
15:37:03 (7064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7040, selfPID=7040, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3504, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6820, selfPID=6820, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4368, selfPID=5168, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7416, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3260, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3488, selfPID=7460, iMonCtr=1
Model crash detected, will try to restart...
16:35:17 (7912): No heartbeat from core client for 30 sec - exiting
16:35:18 (7912): No heartbeat from core client for 30 sec - exiting
16:35:19 (7912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7340, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6896, selfPID=8052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7168, selfPID=8028, iMonCtr=1
Model crash detected, will try to restart...
17:54:58 (7724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7832, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2072, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7512, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_pnw_364s_1967_1_008216740\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_pnw_um_6.  006BA39A  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00662CD0  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00661E9A  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00642819  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00542287  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  005DE7B2  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  005DF2DA  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00359BD2  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  0069E638  Unknown               Unknown  Unknown
kernel32.dll       769ED309  Unknown               Unknown  Unknown
ntdll.dll          773C1603  Unknown               Unknown  Unknown
ntdll.dll          773C15D6  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7640, selfPID=7364, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_364s_1967_1_008216740_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_364s_1967_1_008216740_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
04 Nov 2012 18:43:38  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  115,296   352,331         3.0559
02 Nov 2012 19:46:48  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  103,776   315,166         3.0370
01 Nov 2012 10:41:50  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  92,256    278,702         3.0210
27 Oct 2012 16:07:07  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  80,736    241,675         2.9934
22 Oct 2012 17:55:55  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  69,216    202,448         2.9249
19 Oct 2012 20:49:18  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  57,696    163,660         2.8366
16 Oct 2012 14:00:35  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  46,176    126,662         2.7430
13 Oct 2012 17:53:02  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  34,656    88,361          2.5497
10 Oct 2012 14:52:19  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  23,136    50,001          2.1612
08 Oct 2012 11:56:39  1168410  15335471   hadam3p_pnw_364s_1967_1_008216740_0  11,616    12,259          1.0554
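
The Average (sec/TS) column in the trickle rows above appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is inferred from the numbers, not stated by the page. A minimal Python sketch that checks it against the reported values follows; the names `TRICKLES`, `average_sec_per_timestep`, and `max_abs_deviation` are hypothetical and exist only for this illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (115296, 352331, 3.0559),
    (103776, 315166, 3.0370),
    (92256, 278702, 3.0210),
    (80736, 241675, 2.9934),
    (69216, 202448, 2.9249),
    (57696, 163660, 2.8366),
    (46176, 126662, 2.7430),
    (34656, 88361, 2.5497),
    (23136, 50001, 2.1612),
    (11616, 12259, 1.0554),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


def max_abs_deviation(rows):
    """Largest gap between the recomputed and the reported average."""
    return max(
        abs(average_sec_per_timestep(cpu, ts) - reported)
        for ts, cpu, reported in rows
    )


if __name__ == "__main__":
    for ts, cpu, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu, ts)
        print(f"{ts:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption every recomputed value agrees with the reported one to within rounding at four decimal places. Because the figure is cumulative, its rise from about 1.06 to about 3.06 sec/TS means the later timesteps consumed more CPU time per step than the earliest ones.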

