Task 14770577

Name hadam3p_eu_d08c_2003_1_007966510_1
Workunit 8121624
Created 4 Jun 2012, 5:58:33 UTC
Sent 4 Jun 2012, 6:07:19 UTC
Report deadline 17 May 2013, 11:27:19 UTC
Received 8 Jul 2012, 11:41:43 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1202128
Run time 5 days 1 hours 33 min 31 sec
CPU time 4 days 5 hours 40 min 36 sec
Validate state Invalid
Credit 1,591.48
Device peak FLOPS 2.69 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
14:35:34 (1004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:35:35 (1004): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7600, selfPID=4824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6820, selfPID=4684, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7380, selfPID=5412, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7820, selfPID=6520, iMonCtr=1
Model crash detected, will try to restart...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3752, selfPID=7300, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2152, selfPID=5648, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7784, selfPID=5108, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6156, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:09:27 (5748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7588, selfPID=7172, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7924, selfPID=6696, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5048, selfPID=6368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7896, selfPID=6408, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7656, selfPID=5944, iMonCtr=1
Model crash detected, will try to restart...
14:23:10 (6004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6860, selfPID=6860, iMonCtr=2
14:23:11 (6004): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2860, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1532, selfPID=5532, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
06:01:39 (7096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7016, selfPID=4628, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6660, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
04:53:25 (5700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4788, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3752, selfPID=7708, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7948, selfPID=7028, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6288, selfPID=5704, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5808, selfPID=5756, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4816, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6172, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7148, selfPID=6240, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
05:55:18 (5708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6892, iMonCtr=2
05:53:22 (6656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7500, selfPID=5864, iMonCtr=1
Model crash detected, will try to restart...
18:08:35 (4744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6860, iMonCtr=2
06:18:50 (5868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

</stderr_txt>
]]>
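The exit status listed for this task appears in two forms, -226 and 0xFFFFFF1E. These are the same 32-bit value read as signed and as unsigned. The sketch below shows the conversion in both directions; the helper names `to_unsigned32` and `to_signed32` are made up for illustration and are not part of BOINC or CPDN.

```python
# Illustrative helpers (hypothetical names, not BOINC code): convert between
# the signed and unsigned views of a 32-bit exit status.

def to_unsigned32(code: int) -> int:
    """Return the unsigned 32-bit view of a signed exit code."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Return the signed (two's complement) view of a 32-bit value."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(f"0x{to_unsigned32(-226):08X}")  # 0xFFFFFF1E
    print(to_signed32(0xFFFFFF1E))         # -226
```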
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
22 Jun 2012 04:20:23  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  92,256    356,403         3.8632
22 Jun 2012 04:20:23  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  80,736    314,054         3.8899
22 Jun 2012 04:20:23  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  69,216    262,875         3.7979
22 Jun 2012 04:20:23  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  57,696    217,908         3.7768
12 Jun 2012 03:05:50  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  46,176    173,248         3.7519
12 Jun 2012 03:05:50  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  34,656    129,570         3.7387
07 Jun 2012 04:09:09  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  23,136    86,267          3.7287
07 Jun 2012 04:09:09  1202128  14770577   hadam3p_eu_d08c_2003_1_007966510_1  11,616    45,269          3.8971
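
The Average (sec/TS) column in the trickle table above looks like the cumulative CPU time divided by the timestep count. That is an inference from the numbers, not something the page states. The sketch below recomputes the column from the other two so the relationship can be checked; the function name `average_sec_per_ts` and the `TRICKLES` list are illustrative only.

```python
# Recompute "Average (sec/TS)" from the trickle rows shown above.
# Assumption (not stated on the page): average = CPU time / timestep.

# (timestep, cpu_time_sec, reported_average) copied from the table.
TRICKLES = [
    (92256, 356403, 3.8632),
    (80736, 314054, 3.8899),
    (69216, 262875, 3.7979),
    (57696, 217908, 3.7768),
    (46176, 173248, 3.7519),
    (34656, 129570, 3.7387),
    (23136, 86267, 3.7287),
    (11616, 45269, 3.8971),
]


def average_sec_per_ts(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time, reported in TRICKLES:
        computed = average_sec_per_ts(timestep, cpu_time)
        print(f"{timestep:>6}  {cpu_time:>7}  {computed:.4f}  (reported {reported})")
```

If the assumption holds, each computed value agrees with the reported one to four decimal places.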

