Task 14877440

Name hadam3p_pnw_bhlt_1977_1_008032318_0
Workunit 8187432
Created 8 Jul 2012, 15:55:58 UTC
Sent 8 Jul 2012, 15:56:11 UTC
Report deadline 20 Jun 2013, 21:16:11 UTC
Received 21 Jul 2012, 11:30:16 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1183651
Run time 5 days 8 hours 0 min 52 sec
CPU time 13 hours 21 min 15 sec
Validate state Invalid
Credit 2,505.24
Device peak FLOPS 3.32 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
18:13:31 (3452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5180, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5136, selfPID=6792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10768, selfPID=8988, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6752, selfPID=376, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6716, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6724, selfPID=4992, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2736, selfPID=2660, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4060, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1100, selfPID=2304, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6260, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=204, selfPID=416, iMonCtr=1
Model crash detected, will try to restart...
23:25:01 (8128): No heartbeat from core client for 30 sec - exiting
23:25:02 (8128): No heartbeat from core client for 30 sec - exiting
23:25:03 (8128): No heartbeat from core client for 30 sec - exiting
23:25:04 (8128): No heartbeat from core client for 30 sec - exiting
23:25:05 (8128): No heartbeat from core client for 30 sec - exiting
23:25:06 (8128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_bhlt_1977_1_008032318_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_bhlt_1977_1_008032318_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
15 Jul 2012 20:02:46 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 115,296 | 251,834 | 2.1842
15 Jul 2012 11:22:47 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 103,776 | 226,343 | 2.1811
14 Jul 2012 18:08:21 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 92,256 | 200,880 | 2.1774
14 Jul 2012 09:50:13 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 80,736 | 175,494 | 2.1737
13 Jul 2012 15:50:52 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 69,216 | 150,241 | 2.1706
12 Jul 2012 22:02:00 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 57,696 | 124,867 | 2.1642
10 Jul 2012 15:36:52 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 46,176 | 98,831 | 2.1403
09 Jul 2012 22:53:20 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 34,656 | 73,627 | 2.1245
09 Jul 2012 15:31:06 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 23,136 | 49,096 | 2.1221
09 Jul 2012 07:39:36 | 1183651 | 14877440 | hadam3p_pnw_bhlt_1977_1_008032318_0 | 11,616 | 24,486 | 2.1080
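
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The following is a minimal sketch that checks this against three rows copied from the trickle table; the function name `sec_per_timestep` and the `TRICKLES` list are illustrative names, not part of BOINC or the CPDN application.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (115_296, 251_834, 2.1842),
    (103_776, 226_343, 2.1811),
    (11_616, 24_486, 2.1080),
]


def sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals.

    Assumes the reported average is simply cpu_time / timestep.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(timestep, cpu_time_sec)
    # A tolerance is used rather than exact equality on floats.
    matches = abs(computed - reported) < 1e-4
    print(f"timestep={timestep} computed={computed:.4f} "
          f"reported={reported:.4f} match={matches}")
```

If the assumption holds, every row reports a match, which makes the slow upward drift in sec/TS (from about 2.11 to about 2.18) easy to recompute from the raw columns.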