Name | hadam3p_anz_f45b_2012_1_009777428_0 |
Workunit | 9833392 |
Created | 24 Apr 2015, 14:41:21 UTC |
Sent | 27 Apr 2015, 17:16:13 UTC |
Report deadline | 8 Apr 2016, 22:36:13 UTC |
Received | 20 May 2015, 19:24:26 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1241225 |
Run time | 16 days 13 hours 49 min 52 sec |
CPU time | 15 days 19 hours 30 min 55 sec |
Validate state | Invalid |
Credit | 2,993.82 |
Device peak FLOPS | 1.37 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>7.2.28</core_client_version> <![CDATA[ <stderr_txt> 08:57:58 (6868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:01:01 (9016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:04:05 (7144): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:07:08 (1760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3584, selfPID=3584, iMonCtr=2 09:13:13 (6792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:16:16 (7400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:19:19 (7192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:19:20 (7192): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 08:46:27 (3148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:52:39 (9060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:55:47 (9212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:58:54 (8412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:05:05 (6464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:08:12 (8272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:14:23 (6984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:20:34 (8752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
10:02:53 (9160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:02:55 (9176): Can't acquire lockfile (32) - waiting 35s 10:06:01 (9176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8160, selfPID=8160, iMonCtr=2 10:09:08 (4584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7148, selfPID=7148, iMonCtr=2 10:12:17 (8604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:18:29 (9020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:24:41 (7784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8460, selfPID=8460, iMonCtr=2 10:30:51 (5452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8892, selfPID=8892, iMonCtr=2 10:33:58 (5504): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:37:07 (9004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7184, selfPID=7184, iMonCtr=2 10:40:15 (7704): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:01:53 (8492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4604, selfPID=4604, iMonCtr=2 10:01:59 (5856): Can't acquire lockfile (32) - waiting 35s 10:14:16 (5856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:20:18 (6224): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:23:27 (2256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:26:35 (8544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:26:36 (8544): No heartbeat from core client for 30 sec - exiting 10:32:47 (4320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:35:56 (7712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:39:03 (7440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:14:47 (8740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:24:32 (6864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:38:12 (904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:38:19 (904): No heartbeat from core client for 30 sec - exiting 13:38:20 (904): No heartbeat from core client for 30 sec - exiting 13:38:21 (904): No heartbeat from core client for 30 sec - exiting 13:38:22 (904): No heartbeat from core client for 30 sec - exiting 13:38:23 (904): No heartbeat from core client for 30 sec - exiting 13:44:57 (3612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:55:22 (5608): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8712, selfPID=8712, iMonCtr=2 13:55:23 (5608): No heartbeat from core client for 30 sec - exiting 13:58:29 (5760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4312, selfPID=4312, iMonCtr=2 14:09:05 (7396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:09:06 (7396): No heartbeat from core client for 30 sec - exiting 14:12:43 (9108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:15:50 (3228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 11 received, exiting... Called boinc_finish Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1488, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=428, selfPID=8692, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f45b_2012_1_009777428_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 May 2015 10:45:01 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 69,419 | 1,197,419 | 17.2492 |
10 May 2015 00:50:00 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 57,899 | 998,697 | 17.2490 |
08 May 2015 19:26:07 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 46,379 | 802,047 | 17.2933 |
08 May 2015 19:14:01 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 34,859 | 604,225 | 17.3334 |
08 May 2015 19:07:18 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 23,339 | 403,958 | 17.3083 |
08 May 2015 18:59:41 | 1241225 | 18342857 | hadam3p_anz_f45b_2012_1_009777428_0 | 11,819 | 207,082 | 17.5211 |
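
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relationship is an assumption inferred from the figures above, not something the page states. A minimal Python sketch that recomputes it from the table rows (the names `TRICKLES` and `sec_per_timestep` are illustrative only):

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
TRICKLES = [
    (69_419, 1_197_419, 17.2492),
    (57_899, 998_697, 17.2490),
    (46_379, 802_047, 17.2933),
    (34_859, 604_225, 17.3334),
    (23_339, 403_958, 17.3083),
    (11_819, 207_082, 17.5211),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the value reported in the table.
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption, each recomputed value agrees with the reported column to within rounding at four decimal places.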