Name | hadam3p_eu_qfvv_2001_1_008346376_0 |
Workunit | 8497237 |
Created | 5 Apr 2013, 14:16:51 UTC |
Sent | 5 Apr 2013, 19:02:31 UTC |
Report deadline | 19 Mar 2014, 0:22:31 UTC |
Received | 20 Aug 2013, 13:38:05 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1253472 |
Run time | 4 days 3 hours 46 min |
CPU time | 2 days 15 hours 20 min 21 sec |
Validate state | Invalid |
Credit | 995.30 |
Device peak FLOPS | 1.85 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6904, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1772, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3100, selfPID=4808, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6356, selfPID=1028, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4264, selfPID=2620, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4020, selfPID=2928, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5776, selfPID=5776, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 09:03:42 (2392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3316, selfPID=5200, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5888, selfPID=1932, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2828, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6700, selfPID=6712, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3716, selfPID=5344, iMonCtr=1 Model crash detected, will try to restart... 
13:58:18 (1932): No heartbeat from core client for 30 sec - exiting 13:58:19 (1932): No heartbeat from core client for 30 sec - exiting 13:58:20 (1932): No heartbeat from core client for 30 sec - exiting 13:58:21 (1932): No heartbeat from core client for 30 sec - exiting 13:58:22 (1932): No heartbeat from core client for 30 sec - exiting 13:58:23 (1932): No heartbeat from core client for 30 sec - exiting 13:58:24 (1932): No heartbeat from core client for 30 sec - exiting 13:58:25 (1932): No heartbeat from core client for 30 sec - exiting 13:58:26 (1932): No heartbeat from core client for 30 sec - exiting 13:58:27 (1932): No heartbeat from core client for 30 sec - exiting 13:58:28 (1932): No heartbeat from core client for 30 sec - exiting 13:58:29 (1932): No heartbeat from core client for 30 sec - exiting 13:58:30 (1932): No heartbeat from core client for 30 sec - exiting 13:58:32 (1932): No heartbeat from core client for 30 sec - exiting 13:58:33 (1932): No heartbeat from core client for 30 sec - exiting 13:58:34 (1932): No heartbeat from core client for 30 sec - exiting 13:58:35 (1932): No heartbeat from core client for 30 sec - exiting 13:58:36 (1932): No heartbeat from core client for 30 sec - exiting 13:58:37 (1932): No heartbeat from core client for 30 sec - exiting 13:58:38 (1932): No heartbeat from core client for 30 sec - exiting 13:58:39 (1932): No heartbeat from core client for 30 sec - exiting 13:58:40 (1932): No heartbeat from core client for 30 sec - exiting 13:58:41 (1932): No heartbeat from core client for 30 sec - exiting 13:58:42 (1932): No heartbeat from core client for 30 sec - exiting 13:58:43 (1932): No heartbeat from core client for 30 sec - exiting 13:58:44 (1932): No heartbeat from core client for 30 sec - exiting 13:58:45 (1932): No heartbeat from core client for 30 sec - exiting 13:58:46 (1932): No heartbeat from core client for 30 sec - exiting 13:58:47 (1932): No heartbeat from core client for 30 sec - exiting 13:58:48 (1932): No 
heartbeat from core client for 30 sec - exiting 13:58:49 (1932): No heartbeat from core client for 30 sec - exiting 13:58:50 (1932): No heartbeat from core client for 30 sec - exiting 13:58:51 (1932): No heartbeat from core client for 30 sec - exiting 13:58:52 (1932): No heartbeat from core client for 30 sec - exiting 13:58:53 (1932): No heartbeat from core client for 30 sec - exiting 13:58:54 (1932): No heartbeat from core client for 30 sec - exiting 13:58:55 (1932): No heartbeat from core client for 30 sec - exiting 13:58:56 (1932): No heartbeat from core client for 30 sec - exiting 13:58:57 (1932): No heartbeat from core client for 30 sec - exiting 13:58:58 (1932): No heartbeat from core client for 30 sec - exiting 13:58:59 (1932): No heartbeat from core client for 30 sec - exiting 13:59:00 (1932): No heartbeat from core client for 30 sec - exiting 13:59:01 (1932): No heartbeat from core client for 30 sec - exiting 13:59:02 (1932): No heartbeat from core client for 30 sec - exiting 13:59:03 (1932): No heartbeat from core client for 30 sec - exiting 13:59:04 (1932): No heartbeat from core client for 30 sec - exiting 13:59:05 (1932): No heartbeat from core client for 30 sec - exiting 13:59:06 (1932): No heartbeat from core client for 30 sec - exiting 13:59:07 (1932): No heartbeat from core client for 30 sec - exiting 13:59:08 (1932): No heartbeat from core client for 30 sec - exiting 13:59:09 (1932): No heartbeat from core client for 30 sec - exiting 13:59:10 (1932): No heartbeat from core client for 30 sec - exiting 13:59:11 (1932): No heartbeat from core client for 30 sec - exiting 13:59:12 (1932): No heartbeat from core client for 30 sec - exiting 13:59:13 (1932): No heartbeat from core client for 30 sec - exiting 13:59:14 (1932): No heartbeat from core client for 30 sec - exiting 13:59:15 (1932): No heartbeat from core client for 30 sec - exiting 13:59:16 (1932): No heartbeat from core client for 30 sec - exiting 13:59:17 (1932): No heartbeat from core client 
for 30 sec - exiting 13:59:18 (1932): No heartbeat from core client for 30 sec - exiting 13:59:19 (1932): No heartbeat from core client for 30 sec - exiting 13:59:20 (1932): No heartbeat from core client for 30 sec - exiting 13:59:21 (1932): No heartbeat from core client for 30 sec - exiting 13:59:22 (1932): No heartbeat from core client for 30 sec - exiting 13:59:23 (1932): No heartbeat from core client for 30 sec - exiting 13:59:24 (1932): No heartbeat from core client for 30 sec - exiting 13:59:25 (1932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:03:28 (2404): No heartbeat from core client for 30 sec - exiting 16:03:29 (2404): No heartbeat from core client for 30 sec - exiting 16:03:30 (2404): No heartbeat from core client for 30 sec - exiting 16:03:32 (2404): No heartbeat from core client for 30 sec - exiting 16:03:33 (2404): No heartbeat from core client for 30 sec - exiting 16:03:34 (2404): No heartbeat from core client for 30 sec - exiting 16:03:35 (2404): No heartbeat from core client for 30 sec - exiting 16:03:36 (2404): No heartbeat from core client for 30 sec - exiting 16:03:37 (2404): No heartbeat from core client for 30 sec - exiting 16:03:38 (2404): No heartbeat from core client for 30 sec - exiting 16:03:40 (2404): No heartbeat from core client for 30 sec - exiting 16:03:41 (2404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6308, selfPID=728, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
10:37:54 (4280): No heartbeat from core client for 30 sec - exiting 10:37:55 (4280): No heartbeat from core client for 30 sec - exiting 10:37:57 (4280): No heartbeat from core client for 30 sec - exiting 10:37:58 (4280): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1020, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4880, selfPID=1752, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt><message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
17 Aug 2013 11:25:06 | 1253472 | 15707821 | hadam3p_eu_qfvv_2001_1_008346376_0 | 57,696 | 207,036 | 3.5884 |
26 Jul 2013 15:56:56 | 1253472 | 15707821 | hadam3p_eu_qfvv_2001_1_008346376_0 | 46,176 | 165,580 | 3.5858 |
23 Jul 2013 19:47:20 | 1253472 | 15707821 | hadam3p_eu_qfvv_2001_1_008346376_0 | 34,656 | 124,292 | 3.5864 |
02 Jul 2013 11:48:52 | 1253472 | 15707821 | hadam3p_eu_qfvv_2001_1_008346376_0 | 23,136 | 83,825 | 3.6231 |
10 Apr 2013 15:15:34 | 1253472 | 15707821 | hadam3p_eu_qfvv_2001_1_008346376_0 | 11,616 | 42,514 | 3.6600 |
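
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That definition is an assumption inferred from the figures in the table, not something the page states. A minimal Python sketch that recomputes the column under that assumption (the names `TRICKLES` and `average_sec_per_timestep` are hypothetical, chosen for illustration):

```python
# Rows copied from the "Latest Trickles Received" table:
# (timestep, cumulative CPU time in seconds).
TRICKLES = [
    (57_696, 207_036),
    (46_176, 165_580),
    (34_656, 124_292),
    (23_136, 83_825),
    (11_616, 42_514),
]


def average_sec_per_timestep(cpu_time_sec: float, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds / cumulative timesteps."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time_sec in TRICKLES:
        average = average_sec_per_timestep(cpu_time_sec, timestep)
        # Four decimal places, matching the precision shown in the table.
        print(f"{timestep:>7,} | {cpu_time_sec:>8,} | {average:.4f}")
```

Rounded to four decimal places, the recomputed values agree with all five rows of the table. Consecutive trickles are 11,520 timesteps apart.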
©2024 cpdn.org