Task 16314961

Name hadam3p_eu_j146_2013_1_008532955_0
Workunit 8680467
Created 3 Mar 2014, 15:31:01 UTC
Sent 4 Mar 2014, 11:10:06 UTC
Report deadline 14 Feb 2015, 16:30:06 UTC
Received 13 May 2014, 11:31:54 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1316560
Run time 8 days 0 hours 49 min 5 sec
CPU time 6 days 17 hours 48 min 36 sec
Validate state Invalid
Credit 2,186.01
Device peak FLOPS 1.58 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
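
The exit status above appears both as a signed integer (-226) and as a hexadecimal value (0xFFFFFF1E). A minimal sketch, assuming the hex form is simply the 32-bit two's-complement rendering of the signed code; the helper names `to_hex32` and `from_hex32` are hypothetical and not part of BOINC:

```python
def to_hex32(code: int) -> str:
    """Render a signed exit code as a 32-bit two's-complement hex string."""
    # Masking with 0xFFFFFFFF wraps negative values into the unsigned 32-bit range.
    return f"0x{code & 0xFFFFFFFF:08X}"


def from_hex32(text: str) -> int:
    """Convert a 32-bit hex string back to a signed integer."""
    value = int(text, 16)
    # If the sign bit is set, subtract 2**32 to recover the negative value.
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    print(to_hex32(-226))            # hex form of the signed code
    print(from_hex32("0xFFFFFF1E"))  # signed form of the hex value
```

Under that assumption the two forms on the Exit status line describe the same value, ERR_TOO_MANY_EXITS.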
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4412, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4728, selfPID=4536, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4424, selfPID=2916, iMonCtr=1
Model crash detected, will try to restart...
14:28:19 (4200): No heartbeat from core client for 30 sec - exiting
14:28:20 (4200): No heartbeat from core client for 30 sec - exiting
14:28:21 (4200): No heartbeat from core client for 30 sec - exiting
14:28:22 (4200): No heartbeat from core client for 30 sec - exiting
14:28:23 (4200): No heartbeat from core client for 30 sec - exiting
14:28:24 (4200): No heartbeat from core client for 30 sec - exiting
14:28:26 (4200): No heartbeat from core client for 30 sec - exiting
14:28:27 (4200): No heartbeat from core client for 30 sec - exiting
14:28:28 (4200): No heartbeat from core client for 30 sec - exiting
14:28:29 (4200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3152, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1228, selfPID=1228, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2380, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2276, selfPID=2276, iMonCtr=2
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4744, selfPID=3524, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=804, iMonCtr=2
Model crash detected, will try to restart...
Atmos Restart file copy failed on atmos_restart.day
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4876, selfPID=3996, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4800, selfPID=4800, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
11:28:34 (3088): No heartbeat from core client for 30 sec - exiting
11:28:35 (3088): No heartbeat from core client for 30 sec - exiting
11:28:36 (3088): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:27:58 (4312): No heartbeat from core client for 30 sec - exiting
12:27:59 (4312): No heartbeat from core client for 30 sec - exiting
12:28:00 (4312): No heartbeat from core client for 30 sec - exiting
12:28:01 (4312): No heartbeat from core client for 30 sec - exiting
12:28:02 (4312): No heartbeat from core client for 30 sec - exiting
12:28:03 (4312): No heartbeat from core client for 30 sec - exiting
12:28:04 (4312): No heartbeat from core client for 30 sec - exiting
12:28:05 (4312): No heartbeat from core client for 30 sec - exiting
12:28:06 (4312): No heartbeat from core client for 30 sec - exiting
12:28:07 (4312): No heartbeat from core client for 30 sec - exiting
12:28:08 (4312): No heartbeat from core client for 30 sec - exiting
12:28:09 (4312): No heartbeat from core client for 30 sec - exiting
12:28:10 (4312): No heartbeat from core client for 30 sec - exiting
12:28:11 (4312): No heartbeat from core client for 30 sec - exiting
12:28:12 (4312): No heartbeat from core client for 30 sec - exiting
12:28:13 (4312): No heartbeat from core client for 30 sec - exiting
12:28:14 (4312): No heartbeat from core client for 30 sec - exiting
12:28:15 (4312): No heartbeat from core client for 30 sec - exiting
12:28:16 (4312): No heartbeat from core client for 30 sec - exiting
12:28:17 (4312): No heartbeat from core client for 30 sec - exiting
12:28:18 (4312): No heartbeat from core client for 30 sec - exiting
12:28:20 (4312): No heartbeat from core client for 30 sec - exiting
12:28:21 (4312): No heartbeat from core client for 30 sec - exiting
12:28:22 (4312): No heartbeat from core client for 30 sec - exiting
12:28:23 (4312): No heartbeat from core client for 30 sec - exiting
12:28:24 (4312): No heartbeat from core client for 30 sec - exiting
12:28:25 (4312): No heartbeat from core client for 30 sec - exiting
12:28:26 (4312): No heartbeat from core client for 30 sec - exiting
12:28:27 (4312): No heartbeat from core client for 30 sec - exiting
12:28:28 (4312): No heartbeat from core client for 30 sec - exiting
12:28:29 (4312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7824, selfPID=7900, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5752, selfPID=964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4412, selfPID=3660, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2120, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3884, selfPID=4200, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5108, selfPID=5108, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=576, selfPID=3744, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4080, selfPID=3344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2952, selfPID=3832, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3176, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2524, selfPID=3404, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5008, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4960, selfPID=3096, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4180, selfPID=4000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3044, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4796, selfPID=3860, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3744, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5080, selfPID=3612, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5032, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4988, selfPID=3780, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2936, selfPID=3420, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2028, selfPID=3020, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5068, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5932, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4028, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4916, selfPID=3904, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2336, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4196, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5880, selfPID=3216, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3040, selfPID=3788, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4340, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1744, iMonCtr=2
Model crash detected, will try to restart...
GCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4596, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4408, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5096, selfPID=4504, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:20:33 (4132): No heartbeat from core client for 30 sec - exiting
14:20:34 (4132): No heartbeat from core client for 30 sec - exiting
14:20:35 (4132): No heartbeat from core client for 30 sec - exiting
14:20:36 (4132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:58:21 (3932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=172, selfPID=560, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5556, selfPID=5556, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4900, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7412, selfPID=5144, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4996, selfPID=4076, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4716, selfPID=4180, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2096, selfPID=3452, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4932, selfPID=3756, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3984, selfPID=4120, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3324, selfPID=4424, iMonCtr=1
Model crash detected, will try to restart...
15:20:22 (4384): No heartbeat from core client for 30 sec - exiting
15:20:23 (4384): No heartbeat from core client for 30 sec - exiting
15:20:24 (4384): No heartbeat from core client for 30 sec - exiting
15:20:25 (4384): No heartbeat from core client for 30 sec - exiting
15:20:26 (4384): No heartbeat from core client for 30 sec - exiting
15:20:27 (4384): No heartbeat from core client for 30 sec - exiting
15:20:28 (4384): No heartbeat from core client for 30 sec - exiting
15:20:30 (4384): No heartbeat from core client for 30 sec - exiting
15:20:31 (4384): No heartbeat from core client for 30 sec - exiting
15:20:32 (4384): No heartbeat from core client for 30 sec - exiting
15:20:33 (4384): No heartbeat from core client for 30 sec - exiting
15:20:34 (4384): No heartbeat from core client for 30 sec - exiting
15:20:35 (4384): No heartbeat from core client for 30 sec - exiting
15:20:36 (4384): No heartbeat from core client for 30 sec - exiting
15:20:37 (4384): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
08 May 2014 22:28:27  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  126,720   572,377         4.5169
04 May 2014 13:17:15  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  115,200   514,577         4.4668
29 Apr 2014 10:41:34  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  103,680   457,024         4.4080
23 Apr 2014 17:13:49  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  92,160    403,549         4.3788
20 Apr 2014 10:33:25  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  80,640    349,430         4.3332
16 Apr 2014 19:46:44  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  69,120    290,900         4.2086
01 Apr 2014 09:46:09  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  57,600    237,182         4.1177
26 Mar 2014 17:56:36  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  46,080    190,634         4.1370
19 Mar 2014 13:07:18  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  34,560    143,576         4.1544
13 Mar 2014 11:08:29  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  23,142    97,923          4.2314
12 Mar 2014 16:16:30  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  23,136    97,498          4.2141
07 Mar 2014 16:20:52  1316560  16314961   hadam3p_eu_j146_2013_1_008532955_0  11,616    47,053          4.0507
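
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. A minimal sketch that checks this against three rows copied from the table; the function name `seconds_per_timestep` is hypothetical, and rounding to four decimals is an assumption based on how the values are displayed:

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to four decimals."""
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) taken from the table above.
ROWS = [
    (126_720, 572_377, 4.5169),
    (23_142, 97_923, 4.2314),
    (11_616, 47_053, 4.0507),
]

if __name__ == "__main__":
    for timesteps, cpu_seconds, reported in ROWS:
        computed = seconds_per_timestep(cpu_seconds, timesteps)
        print(f"{timesteps:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

If the assumption holds, the rise from about 4.05 to about 4.52 sec/TS over the run means later timesteps cost more CPU time on average than earlier ones.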