Task 15316804

Name: hadam3p_eu_65vw_2006_1_007507320_2
Workunit: 7704795
Created: 29 Sep 2012, 8:18:59 UTC
Sent: 29 Sep 2012, 8:19:26 UTC
Report deadline: 11 Sep 2013, 13:39:26 UTC
Received: 13 Oct 2012, 9:05:55 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 0 (0x00000000)
Computer ID: 1230963
Run time: 4 days 0 hours 19 min 58 sec
CPU time: 3 days 4 hours 9 min 2 sec
Validate state: Invalid
Credit: 1,194.02
Device peak FLOPS: 2.16 GFLOPS
Application version: UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5772, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4532, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5472, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2472, selfPID=652, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6900, selfPID=5100, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2508, selfPID=2508, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7108, selfPID=4992, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6568, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5732, iMonCtr=2
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2136, selfPID=4316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
15:39:44 (5656): No heartbeat from core client for 30 sec - exiting
15:39:45 (5656): No heartbeat from core client for 30 sec - exiting
15:39:46 (5656): No heartbeat from core client for 30 sec - exiting
15:39:47 (5656): No heartbeat from core client for 30 sec - exiting
15:39:49 (5656): No heartbeat from core client for 30 sec - exiting
15:39:50 (5656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8120, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8132, selfPID=2060, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7028, selfPID=4984, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8888, selfPID=6808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3284, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7068, selfPID=1672, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7084, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6436, selfPID=2812, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6884, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6600, selfPID=6168, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4436, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4764, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7992, selfPID=7584, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7824, selfPID=6432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3044, selfPID=6932, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6912, selfPID=6912, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3928, selfPID=4040, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5928, selfPID=7012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3748, selfPID=5484, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt><message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_65vw_2006_1_007507320_2_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
10 Oct 2012 17:59:42  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  69,216    236,530         3.4173
07 Oct 2012 20:17:17  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  57,696    197,329         3.4202
06 Oct 2012 16:40:00  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  46,176    157,916         3.4199
04 Oct 2012 15:39:40  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  34,656    118,924         3.4316
02 Oct 2012 14:18:13  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  23,136    80,225          3.4675
30 Sep 2012 20:17:07  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  11,619    41,173          3.5436
30 Sep 2012 19:15:57  1230963  15316804   hadam3p_eu_65vw_2006_1_007507320_2  11,616    40,681          3.5022
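
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The Python sketch below checks that reading against the rows above. The names TRICKLES and average_sec_per_timestep are illustrative only and are not part of the CPDN site or of BOINC; the four-decimal rounding is an assumption taken from how the values are displayed.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (69216, 236530, 3.4173),
    (57696, 197329, 3.4202),
    (46176, 157916, 3.4199),
    (34656, 118924, 3.4316),
    (23136, 80225, 3.4675),
    (11619, 41173, 3.5436),
    (11616, 40681, 3.5022),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Return CPU seconds per model timestep, rounded to four decimals.

    Assumption: the page shows cumulative CPU time / timestep, rounded
    to four decimal places.
    """
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        # Print the computed value next to the reported one for comparison.
        print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption the computed values agree with every reported row, which suggests the per-timestep cost stayed close to 3.4 to 3.5 CPU seconds for as long as the task was sending trickles.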

