Name | hadam3p_saf_189z_1982_1_006919359_0 |
Workunit | 7122675 |
Created | 22 Nov 2010, 9:45:45 UTC |
Sent | 18 Mar 2011, 23:18:42 UTC |
Report deadline | 29 Feb 2012, 4:38:42 UTC |
Received | 12 Apr 2011, 22:24:33 UTC |
Server state | Over |
Outcome | No reply |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1139664 |
Run time | 4 days 3 hours 15 min 13 sec |
CPU time | 2 days 20 hours 8 min 18 sec |
Validate state | Invalid |
Credit | 1,309.70 |
Device peak FLOPS | 2.10 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.08 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:33:46 (8780): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:33:47 (8780): No heartbeat from core client for 30 sec - exiting 16:33:48 (8780): No heartbeat from core client for 30 sec - exiting 16:33:49 (8780): No heartbeat from core client for 30 sec - exiting 16:33:50 (8780): No heartbeat from core client for 30 sec - exiting 16:33:51 (8780): No heartbeat from core client for 30 sec - exiting 16:33:52 (8780): No heartbeat from core client for 30 sec - exiting 16:33:53 (8780): No heartbeat from core client for 30 sec - exiting 16:33:54 (8780): No heartbeat from core client for 30 sec - exiting 16:33:55 (8780): No heartbeat from core client for 30 sec - exiting 16:33:56 (8780): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 02:33:53 (11784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23636, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10876, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3028, selfPID=4416, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10328, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4792, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
00:09:04 (9712): No heartbeat from core client for 30 sec - exiting 00:09:06 (9712): No heartbeat from core client for 30 sec - exiting 00:09:07 (9712): No heartbeat from core client for 30 sec - exiting 00:09:08 (9712): No heartbeat from core client for 30 sec - exiting 00:09:09 (9712): No heartbeat from core client for 30 sec - exiting 00:09:10 (9712): No heartbeat from core client for 30 sec - exiting 00:09:11 (9712): No heartbeat from core client for 30 sec - exiting 00:09:12 (9712): No heartbeat from core client for 30 sec - exiting 00:09:13 (9712): No heartbeat from core client for 30 sec - exiting 00:09:14 (9712): No heartbeat from core client for 30 sec - exiting 00:09:15 (9712): No heartbeat from core client for 30 sec - exiting 00:09:16 (9712): No heartbeat from core client for 30 sec - exiting 00:09:17 (9712): No heartbeat from core client for 30 sec - exiting 00:09:18 (9712): No heartbeat from core client for 30 sec - exiting 00:09:19 (9712): No heartbeat from core client for 30 sec - exiting 00:09:20 (9712): No heartbeat from core client for 30 sec - exiting 00:09:21 (9712): No heartbeat from core client for 30 sec - exiting 00:09:22 (9712): No heartbeat from core client for 30 sec - exiting 00:09:23 (9712): No heartbeat from core client for 30 sec - exiting 00:09:24 (9712): No heartbeat from core client for 30 sec - exiting 00:09:25 (9712): No heartbeat from core client for 30 sec - exiting 00:09:26 (9712): No heartbeat from core client for 30 sec - exiting 00:09:27 (9712): No heartbeat from core client for 30 sec - exiting 00:09:28 (9712): No heartbeat from core client for 30 sec - exiting 00:09:29 (9712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5908, selfPID=5296, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:22:08 (6464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7260, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6216, selfPID=2972, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 12:35:10 (8168): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 
12:35:12 (8168): No heartbeat from core client for 30 sec - exiting 12:35:13 (8168): No heartbeat from core client for 30 sec - exiting 12:35:14 (8168): No heartbeat from core client for 30 sec - exiting 12:35:15 (8168): No heartbeat from core client for 30 sec - exiting 12:35:16 (8168): No heartbeat from core client for 30 sec - exiting 12:35:17 (8168): No heartbeat from core client for 30 sec - exiting 12:35:18 (8168): No heartbeat from core client for 30 sec - exiting 12:35:19 (8168): No heartbeat from core client for 30 sec - exiting 12:35:20 (8168): No heartbeat from core client for 30 sec - exiting 12:35:21 (8168): No heartbeat from core client for 30 sec - exiting 12:35:22 (8168): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 22:55:32 (864): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 22:55:45 (864): No heartbeat from core client for 30 sec - exiting 22:55:48 (864): No heartbeat from core client for 30 sec - exiting 22:55:49 (864): No heartbeat from core client for 30 sec - exiting 22:55:50 (864): No heartbeat from core client for 30 sec - exiting 22:55:51 (864): No heartbeat from core client for 30 sec - exiting 22:55:52 (864): No heartbeat from core client for 30 sec - exiting 22:55:53 (864): No heartbeat from core client for 30 sec - exiting 22:55:54 (864): No heartbeat from core client for 30 sec - exiting 22:55:55 (864): No heartbeat from core client for 30 sec - exiting 22:55:57 (864): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16064, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 19:22:59 (16064): called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_saf_189z_1982_1_006919359_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_189z_1982_1_006919359_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_189z_1982_1_006919359_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_189z_1982_1_006919359_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_189z_1982_1_006919359_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Apr 2011 06:38:06 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 80,736 | 218,796 | 2.7100 |
03 Apr 2011 12:35:31 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 69,216 | 187,501 | 2.7089 |
25 Mar 2011 23:58:16 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 57,696 | 157,578 | 2.7312 |
23 Mar 2011 12:27:09 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 46,176 | 124,435 | 2.6948 |
22 Mar 2011 19:47:58 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 34,656 | 93,344 | 2.6934 |
21 Mar 2011 21:55:22 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 23,136 | 62,321 | 2.6937 |
20 Mar 2011 20:50:30 | 1139664 | 12199186 | hadam3p_saf_189z_1982_1_006919359_0 | 11,616 | 31,424 | 2.7052 |
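
The Average (sec/TS) column is consistent with cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The page does not document this formula, so treat it as an inference from the figures; the names `TRICKLES` and `seconds_per_timestep` are hypothetical. A minimal sketch that recomputes the column from the rows above:

```python
# Trickle rows copied from the table above, newest first:
# (cumulative timestep, cumulative CPU seconds).
TRICKLES = [
    (80_736, 218_796),
    (69_216, 187_501),
    (57_696, 157_578),
    (46_176, 124_435),
    (34_656, 93_344),
    (23_136, 62_321),
    (11_616, 31_424),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed formula: cumulative CPU seconds / cumulative timesteps."""
    return round(cpu_seconds / timestep, 4)


# One average per trickle, in the same order as the table rows.
averages = [seconds_per_timestep(ts, cpu) for ts, cpu in TRICKLES]

if __name__ == "__main__":
    for (ts, cpu), avg in zip(TRICKLES, averages):
        print(f"{ts:>7,} timesteps  {cpu:>8,} s CPU  {avg:.4f} s/TS")
```

Each recomputed value can be compared against the Average (sec/TS) column; successive rows also show a constant step of 11,520 timesteps between trickles.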
©2025 cpdn.org