climateprediction.net home page
Task 16412147

Name hadam3p_anz_n5mu_2012_1_008594918_0
Workunit 8741430
Created 26 Mar 2014, 18:26:34 UTC
Sent 29 Mar 2014, 1:37:50 UTC
Report deadline 11 Mar 2015, 6:57:50 UTC
Received 20 May 2014, 5:55:44 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1322450
Run time 4 days 16 hours 42 min 30 sec
CPU time 4 days 7 hours 31 min 51 sec
Validate state Invalid
Credit 1,503.36
Device peak FLOPS 1.65 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 (windows_intelx86)
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8588, selfPID=8240, iMonCtr=1
Model crash detected, will try to restart...
19:25:09 (4128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5348, selfPID=5488, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1084, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5420, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5964, selfPID=4392, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6976, selfPID=6516, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3896, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7064, iMonCtr=2
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5676, selfPID=5676, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3756, selfPID=3532, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5368, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2520, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:19:17 (4808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:19:19 (4808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
14:36:23 (2612): No heartbeat from core client for 30 sec - exiting
14:36:24 (2612): No heartbeat from core client for 30 sec - exiting
14:36:25 (2612): No heartbeat from core client for 30 sec - exiting
14:36:26 (2612): No heartbeat from core client for 30 sec - exiting
14:36:27 (2612): No heartbeat from core client for 30 sec - exiting
14:36:28 (2612): No heartbeat from core client for 30 sec - exiting
14:36:29 (2612): No heartbeat from core client for 30 sec - exiting
14:36:30 (2612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:39:00 (8460): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:43:46 (1032): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:44:37 (8252): No heartbeat from core client for 30 sec - exiting
14:44:38 (8252): No heartbeat from core client for 30 sec - exiting
14:44:39 (8252): No heartbeat from core client for 30 sec - exiting
14:44:41 (8252): No heartbeat from core client for 30 sec - exiting
14:44:42 (8252): No heartbeat from core client for 30 sec - exiting
14:44:43 (8252): No heartbeat from core client for 30 sec - exiting
14:44:44 (8252): No heartbeat from core client for 30 sec - exiting
14:44:45 (8252): No heartbeat from core client for 30 sec - exiting
14:44:46 (8252): No heartbeat from core client for 30 sec - exiting
14:44:47 (8252): No heartbeat from core client for 30 sec - exiting
14:44:49 (8252): No heartbeat from core client for 30 sec - exiting
14:44:50 (8252): No heartbeat from core client for 30 sec - exiting
14:44:51 (8252): No heartbeat from core client for 30 sec - exiting
14:44:52 (8252): No heartbeat from core client for 30 sec - exiting
14:44:53 (8252): No heartbeat from core client for 30 sec - exiting
14:44:54 (8252): No heartbeat from core client for 30 sec - exiting
14:44:55 (8252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8716, selfPID=8716, iMonCtr=2
14:47:46 (5592): No heartbeat from core client for 30 sec - exiting
14:47:47 (5592): No heartbeat from core client for 30 sec - exiting
14:47:48 (5592): No heartbeat from core client for 30 sec - exiting
14:47:49 (5592): No heartbeat from core client for 30 sec - exiting
14:47:50 (5592): No heartbeat from core client for 30 sec - exiting
14:47:51 (5592): No heartbeat from core client for 30 sec - exiting
14:47:52 (5592): No heartbeat from core client for 30 sec - exiting
14:47:53 (5592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:47:55 (5592): No heartbeat from core client for 30 sec - exiting
14:51:34 (9120): No heartbeat from core client for 30 sec - exiting
14:51:35 (9120): No heartbeat from core client for 30 sec - exiting
14:51:36 (9120): No heartbeat from core client for 30 sec - exiting
14:51:37 (9120): No heartbeat from core client for 30 sec - exiting
14:51:38 (9120): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:53:44 (4428): No heartbeat from core client for 30 sec - exiting
14:53:45 (4428): No heartbeat from core client for 30 sec - exiting
14:53:46 (4428): No heartbeat from core client for 30 sec - exiting
14:53:47 (4428): No heartbeat from core client for 30 sec - exiting
14:53:48 (4428): No heartbeat from core client for 30 sec - exiting
14:53:49 (4428): No heartbeat from core client for 30 sec - exiting
14:53:50 (4428): No heartbeat from core client for 30 sec - exiting
14:53:51 (4428): No heartbeat from core client for 30 sec - exiting
14:53:52 (4428): No heartbeat from core client for 30 sec - exiting
14:53:53 (4428): No heartbeat from core client for 30 sec - exiting
14:53:54 (4428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:59:47 (2320): No heartbeat from core client for 30 sec - exiting
14:59:48 (2320): No heartbeat from core client for 30 sec - exiting
14:59:49 (2320): No heartbeat from core client for 30 sec - exiting
14:59:50 (2320): No heartbeat from core client for 30 sec - exiting
14:59:51 (2320): No heartbeat from core client for 30 sec - exiting
14:59:52 (2320): No heartbeat from core client for 30 sec - exiting
14:59:53 (2320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:59:54 (2320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3580, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
16:31:33 (3700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4424, selfPID=5700, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
14:33:35 (3936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6876, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6276, selfPID=8792, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6084, selfPID=4792, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3268, selfPID=3268, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Colobatroller ::rkerD: prCPDss is not running, ening, exiting, bRetVal1, c1, checkPID=0, selfPID=383iMonCtr=2
=2
del crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=14020, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13224, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
20:58:22 (5520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6736, selfPID=5876, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_4.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_5.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_6.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_7.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n5mu_2012_1_008594918_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
18 May 2014 12:27:32 | 1322450 | 16412147 | hadam3p_anz_n5mu_2012_1_008594918_0 | 34,859 | 301,704 | 8.6550
08 Apr 2014 11:41:40 | 1322450 | 16412147 | hadam3p_anz_n5mu_2012_1_008594918_0 | 23,339 | 206,200 | 8.8350
07 Apr 2014 05:50:13 | 1322450 | 16412147 | hadam3p_anz_n5mu_2012_1_008594918_0 | 11,819 | 106,887 | 9.0437
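
The Average (sec/TS) column in the trickle rows above appears to be the cumulative CPU time divided by the cumulative timestep count. This reading is an assumption inferred from the three rows, not documented project behaviour. A minimal Python sketch that checks it against the listed values follows; the names `TRICKLES`, `seconds_per_timestep`, and `check_rows` are hypothetical and exist only for this illustration.

```python
# Trickle rows copied from the table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (34_859, 301_704, 8.6550),
    (23_339, 206_200, 8.8350),
    (11_819, 106_887, 9.0437),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds / cumulative timesteps."""
    return cpu_time_sec / timestep


def check_rows(rows=TRICKLES, tolerance=5e-4):
    """Return (timestep, computed average, matches reported) for each row."""
    results = []
    for timestep, cpu_sec, reported in rows:
        computed = seconds_per_timestep(cpu_sec, timestep)
        matches = abs(computed - reported) < tolerance
        results.append((timestep, round(computed, 4), matches))
    return results


if __name__ == "__main__":
    for timestep, computed, matches in check_rows():
        print(timestep, computed, matches)
```

Under this assumption all three rows agree with the reported averages to within rounding, and the per-timestep cost falls slightly between the first and last trickle.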

