Name | hadam3p_anz_n86s_2007_1_009867200_1 |
Workunit | 9905697 |
Created | 30 May 2015, 0:44:56 UTC |
Sent | 4 Jun 2015, 9:15:27 UTC |
Report deadline | 16 May 2016, 14:35:27 UTC |
Received | 22 Jun 2015, 6:56:34 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1241225 |
Run time | 14 days 12 hours 41 min 7 sec |
CPU time | 11 days 11 hours 38 min 4 sec |
Validate state | Invalid |
Credit | 2,993.82 |
Device peak FLOPS | 1.33 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>7.2.28</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7040, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8992, iMonCtr=2 00:06:19 (6712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5972, selfPID=5972, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3252, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 09:29:38 (2572): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:29:39 (2572): No heartbeat from core client for 30 sec - exiting 09:29:41 (2572): No heartbeat from core client for 30 sec - exiting 09:29:42 (2572): No heartbeat from core client for 30 sec - exiting 15:27:47 (12080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:27:48 (12080): No heartbeat from core client for 30 sec - exiting 16:42:17 (13308): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:19:50 (19016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:43:16 (22480): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:43:17 (22480): No heartbeat from core client for 30 sec - exiting 19:09:11 (9632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:09:12 (9632): No heartbeat from core client for 30 sec - exiting 19:09:13 (9632): No heartbeat from core client for 30 sec - exiting 19:09:14 (9632): No heartbeat from core client for 30 sec - exiting 19:09:15 (9632): No heartbeat from core client for 30 sec - exiting 19:09:16 (9632): No heartbeat from core client for 30 sec - exiting 19:09:17 (9632): No heartbeat from core client for 30 sec - exiting 20:57:38 (9388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:57:39 (9388): No heartbeat from core client for 30 sec - exiting 20:57:46 (9388): No heartbeat from core client for 30 sec - exiting 00:26:25 (34844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:26:26 (34844): No heartbeat from core client for 30 sec - exiting 00:26:27 (34844): No heartbeat from core client for 30 sec - exiting 00:33:41 (9388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:33:42 (9388): No heartbeat from core client for 30 sec - exiting 00:33:43 (9388): No heartbeat from core client for 30 sec - exiting 00:33:44 (9388): No heartbeat from core client for 30 sec - exiting 00:33:45 (9388): No heartbeat from core client for 30 sec - exiting 00:33:46 (9388): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=39904, selfPID=39904, iMonCtr=2 00:47:48 (38636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=38540, selfPID=38540, iMonCtr=2 00:47:49 (38636): No heartbeat from core client for 30 sec - exiting 00:47:50 (38636): No heartbeat from core client for 30 sec - exiting 00:47:51 (38636): No heartbeat from core client for 30 sec - exiting 00:47:52 (38636): No heartbeat from core client for 30 sec - exiting 00:47:53 (38636): No heartbeat from core client for 30 sec - exiting 00:47:54 (38636): No heartbeat from core client for 30 sec - exiting 23:49:24 (39948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:49:25 (39948): No heartbeat from core client for 30 sec - exiting 23:49:26 (39948): No heartbeat from core client for 30 sec - exiting 23:49:27 (39948): No heartbeat from core client for 30 sec - exiting 23:49:28 (39948): No heartbeat from core client for 30 sec - exiting 23:49:29 (39948): No heartbeat from core client for 30 sec - exiting 23:49:30 (39948): No heartbeat from core client for 30 sec - exiting 23:49:32 (39948): No heartbeat from core client for 30 sec - exiting 23:49:33 (39948): No heartbeat from core client for 30 sec - exiting 23:49:34 (39948): No heartbeat from core client for 30 sec - exiting 23:49:35 (39948): No heartbeat from core client for 30 sec - exiting 01:02:28 (46496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:02:29 (46496): No heartbeat from core client for 30 sec - exiting 01:02:31 (46496): No heartbeat from core client for 30 sec - exiting 01:02:32 (46496): No heartbeat from core client for 30 sec - exiting 01:38:59 (39840): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:39:01 (39840): No heartbeat from core client for 30 sec - exiting 01:39:02 (39840): No heartbeat from core client for 30 sec - exiting 01:39:03 (39840): No heartbeat from core client for 30 sec - exiting 01:39:04 (39840): No heartbeat from core client for 30 sec - exiting 01:39:05 (39840): No heartbeat from core client for 30 sec - exiting 01:39:06 (39840): No heartbeat from core client for 30 sec - exiting 03:20:27 (46440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:20:28 (46440): No heartbeat from core client for 30 sec - exiting 03:20:29 (46440): No heartbeat from core client for 30 sec - exiting 03:20:30 (46440): No heartbeat from core client for 30 sec - exiting 03:20:31 (46440): No heartbeat from core client for 30 sec - exiting 03:20:32 (46440): No heartbeat from core client for 30 sec - exiting 16:16:41 (10672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:16:42 (10672): No heartbeat from core client for 30 sec - exiting 17:14:58 (40564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:14:59 (40564): No heartbeat from core client for 30 sec - exiting 17:15:00 (40564): No heartbeat from core client for 30 sec - exiting 17:15:01 (40564): No heartbeat from core client for 30 sec - exiting 17:15:02 (40564): No heartbeat from core client for 30 sec - exiting 17:15:03 (40564): No heartbeat from core client for 30 sec - exiting 17:15:05 (40564): No heartbeat from core client for 30 sec - exiting 17:52:05 (43996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:06:09 (43700): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:06:10 (43700): No heartbeat from core client for 30 sec - exiting 19:06:11 (43700): No heartbeat from core client for 30 sec - exiting 19:06:12 (43700): No heartbeat from core client for 30 sec - exiting 19:06:14 (43700): No heartbeat from core client for 30 sec - exiting 20:09:56 (45960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:09:57 (45960): No heartbeat from core client for 30 sec - exiting 20:09:58 (45960): No heartbeat from core client for 30 sec - exiting 20:09:59 (45960): No heartbeat from core client for 30 sec - exiting 20:10:00 (45960): No heartbeat from core client for 30 sec - exiting 20:10:01 (45960): No heartbeat from core client for 30 sec - exiting 20:10:02 (45960): No heartbeat from core client for 30 sec - exiting forrtl: FormatMessage failed for sysmem message number 1450 04:32:03 (45192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 04:32:04 (45192): No heartbeat from core client for 30 sec - exiting 04:32:05 (45192): No heartbeat from core client for 30 sec - exiting 04:32:06 (45192): No heartbeat from core client for 30 sec - exiting 05:23:28 (42524): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:23:30 (42524): No heartbeat from core client for 30 sec - exiting 05:23:31 (42524): No heartbeat from core client for 30 sec - exiting 05:55:12 (20852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:35:52 (15112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:59:05 (44984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:22:46 (4956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:47:47 (28428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Signal 11 received, exiting... Called boinc_finish Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=32352, selfPID=32352, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=32352, selfPID=21512, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_n86s_2007_1_009867200_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
18 Jun 2015 18:57:46 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 69,419 | 936,874 | 13.4959 |
15 Jun 2015 17:03:28 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 57,899 | 773,530 | 13.3600 |
13 Jun 2015 03:27:29 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 46,379 | 613,952 | 13.2377 |
10 Jun 2015 08:42:42 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 34,859 | 454,536 | 13.0393 |
08 Jun 2015 09:27:26 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 23,339 | 302,283 | 12.9518 |
06 Jun 2015 13:56:33 | 1241225 | 18513876 | hadam3p_anz_n86s_2007_1_009867200_1 | 11,819 | 151,031 | 12.7787 |
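
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state that definition, so it is an assumption here, but it reproduces all six listed values to four decimal places. A minimal Python sketch of the check follows; the names `TRICKLES` and `seconds_per_timestep` are illustrative and not part of any BOINC or CPDN tooling.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, listed_average_sec_per_ts)
TRICKLES = [
    (69_419, 936_874, 13.4959),
    (57_899, 773_530, 13.3600),
    (46_379, 613_952, 13.2377),
    (34_859, 454_536, 13.0393),
    (23_339, 302_283, 12.9518),
    (11_819, 151_031, 12.7787),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds / cumulative timesteps."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, listed in TRICKLES:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the value shown on the page.
    print(f"{timestep:>6} {cpu_time_sec:>7} {computed:.4f} (listed {listed:.4f})")
```

Under that assumption, the listed averages show the per-timestep cost rising from about 12.78 s at the first trickle to about 13.50 s at the last one received.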