Name | hadam3p_saf_1fjx_2004_1_006947989_0 |
Workunit | 7151305 |
Created | 22 Nov 2010, 16:33:50 UTC |
Sent | 8 Mar 2011, 23:52:46 UTC |
Report deadline | 19 Feb 2012, 5:12:46 UTC |
Received | 19 Mar 2011, 14:35:56 UTC |
Server state | Over |
Outcome | No reply |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 986646 |
Run time | |
CPU time | 4 days 6 hours 37 min 14 sec |
Validate state | Initial |
Credit | 1,309.70 |
Device peak FLOPS | 2.02 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.08 windows_intelx86 |
Stderr | <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8176, iMonCtr=2 16:36:41 (4300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:36:57 (4300): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2172, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=172, iMonCtr=2 Model crash detected, will try to restart... 11:44:33 (6620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:44:35 (6620): No heartbeat from core client for 30 sec - exiting 13:41:23 (1976): No heartbeat from core client for 30 sec - exiting 13:41:24 (1976): No heartbeat from core client for 30 sec - exiting 13:41:25 (1976): No heartbeat from core client for 30 sec - exiting 13:41:26 (1976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 12:41:34 (2188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:42:19 (2772): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:42:25 (2772): No heartbeat from core client for 30 sec - exiting 12:46:18 (4556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:46:20 (4556): No heartbeat from core client for 30 sec - exiting 12:46:21 (4556): No heartbeat from core client for 30 sec - exiting 12:57:47 (7380): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:57:48 (7380): No heartbeat from core client for 30 sec - exiting 13:02:50 (1204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:02:55 (1204): No heartbeat from core client for 30 sec - exiting 13:02:56 (1204): No heartbeat from core client for 30 sec - exiting 13:02:57 (1204): No heartbeat from core client for 30 sec - exiting 13:02:58 (1204): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7068, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6612, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 09:09:29 (2960): No heartbeat from core client for 30 sec - exiting 09:09:33 (2960): No heartbeat from core client for 30 sec - exiting 09:09:34 (2960): No heartbeat from core client for 30 sec - exiting 09:09:35 (2960): No heartbeat from core client for 30 sec - exiting 09:09:36 (2960): No heartbeat from core client for 30 sec - exiting 09:09:37 (2960): No heartbeat from core client for 30 sec - exiting 09:09:38 (2960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1768, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1576, iMonCtr=2 Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1216, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5912, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3016, iMonCtr=2 Model crash detected, will try to restart... 10:23:57 (6084): No heartbeat from core client for 30 sec - exiting 10:23:58 (6084): No heartbeat from core client for 30 sec - exiting 10:23:59 (6084): No heartbeat from core client for 30 sec - exiting 10:24:00 (6084): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:48:00 (2900): No heartbeat from core client for 30 sec - exiting 10:48:02 (2900): No heartbeat from core client for 30 sec - exiting 10:48:04 (2900): No heartbeat from core client for 30 sec - exiting 10:48:05 (2900): No heartbeat from core client for 30 sec - exiting 10:48:06 (2900): No heartbeat from core client for 30 sec - exiting 10:48:07 (2900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:48:08 (2900): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4068, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
15:19:41 (3484): No heartbeat from core client for 30 sec - exiting 15:19:58 (3484): No heartbeat from core client for 30 sec - exiting 15:19:59 (3484): No heartbeat from core client for 30 sec - exiting 15:20:00 (3484): No heartbeat from core client for 30 sec - exiting 15:20:01 (3484): No heartbeat from core client for 30 sec - exiting 15:20:03 (3484): No heartbeat from core client for 30 sec - exiting 15:20:04 (3484): No heartbeat from core client for 30 sec - exiting 15:20:05 (3484): No heartbeat from core client for 30 sec - exiting 15:20:06 (3484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Model crashed: Leaving CPDN_Main::Monitor... 15:35:41 (2128): called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_saf_1fjx_2004_1_006947989_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_1fjx_2004_1_006947989_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_1fjx_2004_1_006947989_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_1fjx_2004_1_006947989_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_saf_1fjx_2004_1_006947989_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
17 Mar 2011 22:19:58 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 80,736 | 364,362 | 4.5130 |
15 Mar 2011 21:15:47 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 69,216 | 313,849 | 4.5343 |
15 Mar 2011 02:04:18 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 57,696 | 263,256 | 4.5628 |
14 Mar 2011 08:07:40 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 46,176 | 213,151 | 4.6161 |
13 Mar 2011 15:22:50 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 34,656 | 161,870 | 4.6708 |
12 Mar 2011 10:46:28 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 23,137 | 110,831 | 4.7902 |
11 Mar 2011 23:30:00 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 23,136 | 110,044 | 4.7564 |
10 Mar 2011 15:42:26 | 986646 | 12228552 | hadam3p_saf_1fjx_2004_1_006947989_0 | 11,616 | 54,951 | 4.7306 |
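
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count reached at that trickle. The sketch below re-derives it from the rows above as a consistency check. The name `sec_per_timestep` and the rounding to four decimals are assumptions made for illustration, not something the CPDN server is known to do in this form.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average sec/TS)
rows = [
    (80736, 364362, 4.5130),
    (69216, 313849, 4.5343),
    (57696, 263256, 4.5628),
    (46176, 213151, 4.6161),
    (34656, 161870, 4.6708),
    (23137, 110831, 4.7902),
    (23136, 110044, 4.7564),
    (11616, 54951, 4.7306),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds spent per model timestep (hypothetical helper)."""
    return cpu_time_sec / timestep


# Recompute the average for each trickle, rounded to four decimals
# to match the precision shown in the table.
computed = [round(sec_per_timestep(cpu, ts), 4) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, computed):
    print(f"timestep {ts:>6}: computed {value:.4f}, reported {reported:.4f}")
```

Under this assumption, each recomputed value agrees with the reported column to four decimal places, which supports reading the column as a running average since the start of the task rather than a per-interval rate.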
©2024 cpdn.org