climateprediction.net home page
Task 16467235

Task 16467235

Name hadam3p_anz_p31g_2012_1_008634611_0
Workunit 8781123
Created 3 Apr 2014, 9:58:59 UTC
Sent 10 Apr 2014, 9:53:28 UTC
Report deadline 23 Mar 2015, 15:13:28 UTC
Received 6 Aug 2014, 10:37:15 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1114818
Run time 6 days 9 hours 31 min 31 sec
CPU time 5 days 20 hours 33 min 33 sec
Validate state Invalid
Credit 2,993.82
Device peak FLOPS 2.78 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10
windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
11:00:46 (4856): No heartbeat from core client for 30 sec - exiting
11:00:48 (4856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5808, selfPID=5044, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5668, selfPID=5572, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6056, selfPID=5776, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5636, selfPID=4988, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5408, selfPID=5324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6116, selfPID=5216, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5468, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4960, selfPID=5788, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5880, selfPID=5968, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5772, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3756, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4480, selfPID=5220, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4236, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4620, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5452, selfPID=4336, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4748, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5548, selfPID=5976, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5236, selfPID=5748, iMonCtr=1
Model crash detected, will try to restart...
GSuspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5584, iMonCtr=2
10:33:55 (3876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:33:56 (3876): No heartbeat from core client for 30 sec - exiting
10:33:57 (3876): No heartbeat from core client for 30 sec - exiting
10:33:58 (3876): No heartbeat from core client for 30 sec - exiting
10:33:59 (3876): No heartbeat from core client for 30 sec - exiting
10:34:00 (3876): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3948, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5844, selfPID=3360, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5836, selfPID=1128, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5992, selfPID=1864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3004, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
12:33:18 (5888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:33:20 (5888): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1600, selfPID=5620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6068, selfPID=6028, iMonCtr=1
Model crash detected, will try to restart...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1136, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5964, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1724, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1340, selfPID=5816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2440, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6532, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5208, selfPID=5748, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3568, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5976, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5820, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3232, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4312, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4792, selfPID=6004, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4996, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3592, selfPID=800, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2620, iMonCtr=2
Model crash detected, will try to restart...

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
31 Jul 2014 11:12:44 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 69,419 479,020 6.9004
05 Jul 2014 12:27:01 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 57,899 399,772 6.9046
17 Jun 2014 13:06:33 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 46,379 320,543 6.9114
03 Jun 2014 12:47:27 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 34,859 240,220 6.8912
19 May 2014 12:14:19 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 23,339 162,369 6.9570
28 Apr 2014 11:41:42 1114818 16467235 hadam3p_anz_p31g_2012_1_008634611_0 11,819 82,253 6.9594


©2024 climateprediction.net