climateprediction.net home page
Task 14399605

Name hadam3p_eu_97bc_1970_1_007870309_0
Workunit 8025421
Created 13 Apr 2012, 15:49:50 UTC
Sent 13 Apr 2012, 15:49:57 UTC
Report deadline 26 Mar 2013, 21:09:57 UTC
Received 24 Jun 2012, 12:29:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1177115
Run time 19 days 6 hours 6 min 50 sec
CPU time 16 hours 39 min 13 sec
Validate state Invalid
Credit 1,392.75
Device peak FLOPS 2.65 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
18:33:04 (3524): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:30:30 (3380): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4644, selfPID=4644, iMonCtr=2
16:19:52 (1740): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:15:32 (2272): No heartbeat from core client for 30 sec - exiting
06:15:33 (2272): No heartbeat from core client for 30 sec - exiting
06:15:34 (2272): No heartbeat from core client for 30 sec - exiting
06:15:35 (2272): No heartbeat from core client for 30 sec - exiting
06:15:36 (2272): No heartbeat from core client for 30 sec - exiting
06:15:37 (2272): No heartbeat from core client for 30 sec - exiting
06:15:38 (2272): No heartbeat from core client for 30 sec - exiting
06:15:39 (2272): No heartbeat from core client for 30 sec - exiting
06:15:40 (2272): No heartbeat from core client for 30 sec - exiting
06:15:41 (2272): No heartbeat from core client for 30 sec - exiting
06:15:42 (2272): No heartbeat from core client for 30 sec - exiting
06:15:43 (2272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:20:15 (4848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:34:34 (3524): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:15:56 (424): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4200, selfPID=4200, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4200, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3644, selfPID=3252, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4744, selfPID=4744, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4744, selfPID=3244, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1152, selfPID=1152, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1152, selfPID=3384, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3644, selfPID=3644, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3644, selfPID=3368, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5628, selfPID=5628, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5628, selfPID=4160, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4920, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4548, selfPID=4548, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4548, selfPID=3232, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2424, selfPID=2424, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2424, selfPID=2976, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3292, selfPID=3292, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3292, selfPID=4896, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3132, selfPID=4508, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5968, selfPID=1660, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4360, selfPID=4212, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3900, selfPID=3768, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4844, selfPID=3124, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4844, selfPID=4844, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3608, selfPID=3228, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=676, selfPID=676, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=676, selfPID=4656, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2064, selfPID=3864, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2064, selfPID=2064, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1064, selfPID=4448, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4416, selfPID=4416, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4416, selfPID=3304, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4848, selfPID=3584, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4848, selfPID=4848, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=320, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5016, selfPID=5016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5016, selfPID=3300, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3048, iMonCtr=2
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2988, selfPID=2988, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2988, selfPID=2816, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3308, selfPID=3308, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3308, selfPID=4780, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3188, selfPID=2452, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Reontrolleg::ional Woockess is not runnin is not running, exiting, bRetVal = 510, selfPID=3064, iMonCtr=1
Model crash detected, will try to restart...
PID=4500, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3116, selfPID=3116, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3116, selfPID=2860, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2512, selfPID=2512, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2512, selfPID=2160, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4756, selfPID=4756, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4756, selfPID=2056, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3476, selfPID=3032, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2164, selfPID=2840, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2864, selfPID=2864, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2864, selfPID=4296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1348, selfPID=1348, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1348, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4452, selfPID=4452, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4452, selfPID=4264, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4276, selfPID=4276, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4276, selfPID=4864, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4984, selfPID=3556, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4984, selfPID=4984, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2940, selfPID=2996, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2940, selfPID=2940, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=700, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3984, selfPID=3224, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2620, selfPID=2324, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1628, selfPID=1628, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1628, selfPID=2316, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=924, selfPID=4988, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3612, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4568, iMonCtr=2
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2908, selfPID=2908, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2908, selfPID=2460, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=676, selfPID=5056, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1632, selfPID=1632, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1632, selfPID=3172, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4072, selfPID=3108, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4072, selfPID=4072, iMonCtr=2
Leaving CPDN_Main::Monitor...
Called boinc_finish
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3040, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3752, selfPID=1548, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
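The upload failure message above lists one file_xfer_error entry per output zip (files _8 through _12), each with error code -161. That code appears to correspond to ERR_NOT_FOUND in BOINC's error code list, though this is an assumption worth checking against the client source. As a minimal sketch, the entries could be pulled out of such a message with a regular expression. The function name parse_upload_failures and the shortened sample text are illustrative only and are not part of BOINC or CPDN.

```python
import re

# Shortened sample copied from the <message> block of this task (first two entries only).
message = """upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_97bc_1970_1_007870309_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
"""

# The message is not a single well-formed XML document (it starts with free
# text and has several top-level elements), so a regex is used instead of an
# XML parser.
ENTRY = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def parse_upload_failures(text):
    """Return a list of (file_name, error_code) tuples found in text.

    Hypothetical helper for illustration; not a BOINC API.
    """
    return [(name, int(code)) for name, code in ENTRY.findall(text)]


failures = parse_upload_failures(message)
for name, code in failures:
    print(f"{name}: error {code}")
```

Run against the full message, the same pattern would yield five entries, one for each zip file listed.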
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
18 Apr 2012 20:28:10  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  80,736    163,294         2.0226
18 Apr 2012 14:30:26  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  69,216    140,193         2.0254
16 Apr 2012 18:39:34  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  57,696    116,787         2.0242
16 Apr 2012 12:28:28  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  46,176    93,251          2.0195
09 Jun 2012 11:40:24  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  34,657    70,192          2.0253
15 Apr 2012 15:00:53  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  34,656    70,101          2.0228
06 May 2012 19:46:06  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  23,137    47,043          2.0332
14 Apr 2012 17:09:21  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  23,136    46,819          2.0236
14 Apr 2012 10:02:41  1177115  14399605   hadam3p_eu_97bc_1970_1_007870309_0  11,616    23,362          2.0112
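
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The short check below recomputes it from the table values. The helper name sec_per_timestep is illustrative, and the assumption that the site rounds to four decimal places is inferred from the figures shown rather than from any documentation.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
trickles = [
    (80736, 163294, 2.0226),
    (69216, 140193, 2.0254),
    (57696, 116787, 2.0242),
    (46176, 93251, 2.0195),
    (34657, 70192, 2.0253),
    (34656, 70101, 2.0228),
    (23137, 47043, 2.0332),
    (23136, 46819, 2.0236),
    (11616, 23362, 2.0112),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds spent per model timestep (hypothetical helper)."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare against the reported value with a tolerance that allows for
    # rounding to four decimal places.
    matches = abs(computed - reported) < 1e-4
    print(f"{timestep:>6} TS  {cpu_time_sec:>7} s  computed {computed:.4f}  "
          f"reported {reported:.4f}  match={matches}")
```

Every row agrees with the reported average to within rounding, at roughly 2.02 seconds of CPU time per timestep on this host.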

