Task 14734863

Name: hadam3p_eu_cnh1_2004_1_007988566_1
Workunit: 8143680
Created: 23 May 2012, 11:41:51 UTC
Sent: 23 May 2012, 11:44:02 UTC
Report deadline: 5 May 2013, 17:04:02 UTC
Received: 8 Jul 2012, 11:41:43 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID: 1202128
Run time: 6 days 15 hours 25 min 54 sec
CPU time: 5 days 3 hours 43 min 36 sec
Validate state: Invalid
Credit: 1,988.94
Device peak FLOPS: 2.56 GFLOPS
Application version: UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
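
The exit status above appears both as a signed decimal (-226) and as a 32-bit hexadecimal value (0xFFFFFF1E). The following is a minimal sketch of how the two forms relate under two's-complement encoding; the helper names `to_unsigned32` and `to_signed32` are illustrative and not part of BOINC.

```python
def to_unsigned32(status: int) -> int:
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return status & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed exit status."""
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


exit_status = -226  # value reported for this task
hex_form = f"0x{to_unsigned32(exit_status):08X}"
print(hex_form)
print(to_signed32(0xFFFFFF1E))
```

Running the sketch should reproduce the pairing shown in the Exit status field.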
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10192, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5364, selfPID=4288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4752, selfPID=3672, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4032, selfPID=3592, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6484, selfPID=8064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7680, selfPID=6528, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6936, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7012, selfPID=6408, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6860, selfPID=1404, iMonCtr=1
Model crash detected, will try to restart...
14:35:34 (4260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:35:36 (4260): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7444, selfPID=1624, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6996, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3312, selfPID=7252, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6004, selfPID=5640, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7696, selfPID=5336, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7780, selfPID=5348, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:09:27 (5740): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7916, selfPID=6684, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8112, selfPID=6304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7820, selfPID=6400, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7752, selfPID=5644, iMonCtr=1
Model crash detected, will try to restart...
14:23:10 (5996): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4936, selfPID=1880, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=724, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
06:01:40 (7088): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6688, selfPID=6688, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6736, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
04:53:25 (5516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7768, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7992, selfPID=7020, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6220, selfPID=1400, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4436, selfPID=5744, iMonCtr=1
Model crash detected, will try to restart...
18:58:02 (6200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8140, iMonCtr=2
Model crash detected, will try to restart...
05:53:23 (6640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1420, selfPID=6376, iMonCtr=1
Model crash detected, will try to restart...
18:08:34 (4724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1580, selfPID=1580, iMonCtr=2
06:18:44 (6104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6160, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7656, iMonCtr=2
Model crash detected, will try to restart...

</stderr_txt>
]]>
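
The stderr output above repeats a small set of event lines. As a rough sketch, such a log could be tallied as shown below; the sample is a short excerpt copied from the log (not the full output), and the function name `tally_events` and the counter keys are hypothetical.

```python
from collections import Counter

# Short excerpt copied from the stderr output above (not the full log).
sample = """\
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5364, selfPID=4288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
14:35:34 (4260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2
Model crash detected, will try to restart...
"""


def tally_events(stderr_text: str) -> Counter:
    """Count restart attempts and heartbeat timeouts in a stderr dump."""
    counts = Counter()
    for line in stderr_text.splitlines():
        if line.startswith("Model crash detected"):
            counts["restart_attempts"] += 1
        elif "No heartbeat from core client" in line:
            counts["heartbeat_timeouts"] += 1
    return counts


counts = tally_events(sample)
print(counts["restart_attempts"], counts["heartbeat_timeouts"])
```

Applied to the full log instead of the excerpt, the same tally would give a quick view of how often the model restarted relative to how often the client lost its heartbeat.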
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
22 Jun 2012 04:20:23  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1   115,296         427,702  3.7096
22 Jun 2012 04:20:22  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1   103,776         383,791  3.6983
22 Jun 2012 04:20:22  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    92,256         334,515  3.6259
22 Jun 2012 04:20:22  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    80,736         289,884  3.5905
12 Jun 2012 03:05:50  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    69,216         244,758  3.5361
12 Jun 2012 03:05:50  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    57,696         201,335  3.4896
07 Jun 2012 04:09:09  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    46,176         157,444  3.4097
07 Jun 2012 04:09:09  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    34,656         117,967  3.4039
04 Jun 2012 04:18:51  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    23,136          73,601  3.1812
04 Jun 2012 04:18:51  1202128  14734863   hadam3p_eu_cnh1_2004_1_007988566_1    11,616          36,223  3.1184
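
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. That reading is inferred from the column header and is not confirmed by the page itself. A minimal sketch checking it against three rows of the table (the name `sec_per_timestep` is illustrative):

```python
# (timestep, cpu_time_sec, reported_average) taken from three trickle rows above.
rows = [
    (115_296, 427_702, 3.7096),
    (23_136, 73_601, 3.1812),
    (11_616, 36_223, 3.1184),
]


def sec_per_timestep(cpu_time_sec: float, timestep: int) -> float:
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


# Allow a small tolerance, since the displayed CPU times look rounded to whole seconds.
matches = [
    abs(sec_per_timestep(cpu, ts) - reported) < 1e-4
    for ts, cpu, reported in rows
]
print(matches)
```

Under that assumption, the rising average across the rows means each later batch of timesteps cost more CPU time per step than the earlier ones.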