climateprediction.net home page
Task 18618160

Task 18618160

Name hadam3p_anz_f4le_2013_1_009943463_0
Workunit 9970825
Created 23 Jun 2015, 19:58:25 UTC
Sent 23 Jun 2015, 20:00:55 UTC
Report deadline 5 Jun 2016, 1:20:55 UTC
Received 9 Oct 2015, 13:31:52 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1075479
Run time 6 days 19 hours 31 min 1 sec
CPU time 4 days 13 hours 21 min 3 sec
Validate state Invalid
Credit 3,987.46
Device peak FLOPS 3.59 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10
windows_intelx86
Stderr
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6172, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4932, selfPID=8088, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1744, selfPID=308, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7528, selfPID=8052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7172, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
08:50:43 (980): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8040, selfPID=7232, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN proceController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7556, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7940, selfPID=7668, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8116, selfPID=5012, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8160, selfPID=7044, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7928, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6824, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7788, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8044, selfPID=7052, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7776, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7232, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9824, iMonCtr=2
Model crash detected, will try to restart...
12:58:39 (7652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:58:40 (7652): No heartbeat from core client for 30 sec - exiting
12:58:41 (7652): No heartbeat from core client for 30 sec - exiting
12:58:42 (7652): No heartbeat from core client for 30 sec - exiting
12:58:43 (7652): No heartbeat from core client for 30 sec - exiting
13:14:06 (3444): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:14:14 (6252): Can't acquire lockfile (32) - waiting 35s
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7620, selfPID=6928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6588, selfPID=7516, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7076, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8028, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7408, iMonCtr=2
Model crash detected, will try to restart...
CSuspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9756, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7740, selfPID=6828, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7424, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8128, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6712, selfPID=4264, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6860, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10068, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7056, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7740, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7144, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7892, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7992, selfPID=7400, iMonCtr=1
Model crash detected, will try to restart...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8168, selfPID=7268, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CGntroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7664, iMonCtr=2
Model crash detected, will try to restart...
lobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6644, iMonCtr=2
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7472, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7528, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8064, selfPID=7552, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7788, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6908, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6680, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7252, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7356, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8128, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7472, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7676, selfPID=6840, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8000, selfPID=7236, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8056, selfPID=7044, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7640, selfPID=6916, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7944, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8120, selfPID=7240, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7356, selfPID=7932, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7288, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=3936, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5868, selfPID=5868, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5868, selfPID=7084, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_f4le_2013_1_009943463_0_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_f4le_2013_1_009943463_0_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_f4le_2013_1_009943463_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_f4le_2013_1_009943463_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
08 Oct 2015 16:13:49 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 92,459 393,268 4.2534
25 Sep 2015 15:12:57 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 80,939 342,612 4.2330
17 Sep 2015 20:48:36 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 69,419 292,273 4.2103
10 Sep 2015 19:15:31 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 57,899 245,110 4.2334
01 Sep 2015 19:15:05 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 46,379 194,461 4.1929
11 Aug 2015 14:12:44 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 34,859 145,314 4.1686
27 Jul 2015 19:05:20 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 23,339 96,737 4.1449
09 Jul 2015 13:35:37 1075479 18618160 hadam3p_anz_f4le_2013_1_009943463_0 11,819 48,693 4.1199


©2024 climateprediction.net