Name | hadam3p_anz_f900_2013_1_009733528_0 |
Workunit | 9805373 |
Created | 8 Apr 2015, 21:12:34 UTC |
Sent | 13 Apr 2015, 5:36:15 UTC |
Report deadline | 25 Mar 2016, 10:56:15 UTC |
Received | 21 May 2015, 8:46:22 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1355028 |
Run time | 3 days 22 hours 42 min 21 sec |
CPU time | 3 days 16 hours 6 min 2 sec |
Validate state | Invalid |
Credit | 2,497.00 |
Device peak FLOPS | 3.32 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>6.8.44</core_client_version> <![CDATA[ <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:49:40 (5408): No heartbeat from core client for 30 sec - exiting 14:49:41 (5408): No heartbeat from core client for 30 sec - exiting 14:49:42 (5408): No heartbeat from core client for 30 sec - exiting 14:49:43 (5408): No heartbeat from core client for 30 sec - exiting 14:49:44 (5408): No heartbeat from core client for 30 sec - exiting 14:49:45 (5408): No heartbeat from core client for 30 sec - exiting 14:49:46 (5408): No heartbeat from core client for 30 sec - exiting 14:49:47 (5408): No heartbeat from core client for 30 sec - exiting 14:49:48 (5408): No heartbeat from core client for 30 sec - exiting 14:49:49 (5408): No heartbeat from core client for 30 sec - exiting 14:49:50 (5408): No heartbeat from core client for 30 sec - exiting 14:49:51 (5408): No heartbeat from core client for 30 sec - exiting 14:49:52 (5408): No heartbeat from core client for 30 sec - exiting 14:49:53 (5408): No heartbeat from core client for 30 sec - exiting 14:49:54 (5408): No heartbeat from core client for 30 sec - exiting 14:49:55 (5408): No heartbeat from core client for 30 sec - exiting 14:49:56 (5408): No heartbeat from core client for 30 sec - exiting 14:49:57 (5408): No heartbeat from core client for 30 sec - exiting 14:49:58 (5408): No heartbeat from core client for 30 sec - exiting 14:49:59 (5408): No heartbeat from core client for 30 sec - exiting 14:50:00 (5408): No heartbeat from core client for 30 sec - exiting 14:50:01 (5408): No heartbeat from core client for 30 sec - exiting 14:50:02 (5408): No heartbeat from core client for 30 sec - exiting 14:50:03 (5408): No heartbeat from core client for 30 sec - exiting 14:50:04 (5408): No heartbeat 
from core client for 30 sec - exiting 14:50:05 (5408): No heartbeat from core client for 30 sec - exiting 14:50:06 (5408): No heartbeat from core client for 30 sec - exiting 14:50:07 (5408): No heartbeat from core client for 30 sec - exiting 14:50:08 (5408): No heartbeat from core client for 30 sec - exiting 14:50:09 (5408): No heartbeat from core client for 30 sec - exiting 14:50:10 (5408): No heartbeat from core client for 30 sec - exiting 14:50:11 (5408): No heartbeat from core client for 30 sec - exiting 14:50:12 (5408): No heartbeat from core client for 30 sec - exiting 14:50:13 (5408): No heartbeat from core client for 30 sec - exiting 14:50:14 (5408): No heartbeat from core client for 30 sec - exiting 14:50:15 (5408): No heartbeat from core client for 30 sec - exiting 14:50:16 (5408): No heartbeat from core client for 30 sec - exiting 14:50:17 (5408): No heartbeat from core client for 30 sec - exiting 14:50:18 (5408): No heartbeat from core client for 30 sec - exiting 14:50:19 (5408): No heartbeat from core client for 30 sec - exiting 14:50:20 (5408): No heartbeat from core client for 30 sec - exiting 14:50:21 (5408): No heartbeat from core client for 30 sec - exiting 14:50:22 (5408): No heartbeat from core client for 30 sec - exiting 14:50:23 (5408): No heartbeat from core client for 30 sec - exiting 14:50:24 (5408): No heartbeat from core client for 30 sec - exiting 14:50:25 (5408): No heartbeat from core client for 30 sec - exiting 14:50:26 (5408): No heartbeat from core client for 30 sec - exiting 14:50:27 (5408): No heartbeat from core client for 30 sec - exiting 14:50:28 (5408): No heartbeat from core client for 30 sec - exiting 14:50:29 (5408): No heartbeat from core client for 30 sec - exiting 14:50:30 (5408): No heartbeat from core client for 30 sec - exiting 14:50:31 (5408): No heartbeat from core client for 30 sec - exiting 14:50:32 (5408): No heartbeat from core client for 30 sec - exiting 14:50:33 (5408): No heartbeat from core client for 30 sec 
- exiting 14:50:34 (5408): No heartbeat from core client for 30 sec - exiting 14:50:35 (5408): No heartbeat from core client for 30 sec - exiting 14:50:36 (5408): No heartbeat from core client for 30 sec - exiting 14:50:37 (5408): No heartbeat from core client for 30 sec - exiting 14:50:38 (5408): No heartbeat from core client for 30 sec - exiting 14:50:39 (5408): No heartbeat from core client for 30 sec - exiting 14:50:40 (5408): No heartbeat from core client for 30 sec - exiting 14:50:41 (5408): No heartbeat from core client for 30 sec - exiting 14:50:42 (5408): No heartbeat from core client for 30 sec - exiting 14:50:43 (5408): No heartbeat from core client for 30 sec - exiting 14:50:44 (5408): No heartbeat from core client for 30 sec - exiting 14:50:45 (5408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2136, selfPID=2136, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6492, selfPID=6492, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7784, selfPID=7784, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9944, selfPID=9944, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=252, selfPID=252, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4564, selfPID=4564, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1492, selfPID=1492, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3220, selfPID=3220, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5576, selfPID=5576, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5336, selfPID=5336, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2040, selfPID=2040, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6776, selfPID=6776, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
06:55:14 (5196): No heartbeat from core client for 30 sec - exiting 06:55:15 (5196): No heartbeat from core client for 30 sec - exiting 06:55:16 (5196): No heartbeat from core client for 30 sec - exiting 06:55:17 (5196): No heartbeat from core client for 30 sec - exiting 06:55:18 (5196): No heartbeat from core client for 30 sec - exiting 06:55:19 (5196): No heartbeat from core client for 30 sec - exiting 06:55:20 (5196): No heartbeat from core client for 30 sec - exiting 06:55:21 (5196): No heartbeat from core client for 30 sec - exiting 06:55:22 (5196): No heartbeat from core client for 30 sec - exiting 06:55:23 (5196): No heartbeat from core client for 30 sec - exiting 06:55:24 (5196): No heartbeat from core client for 30 sec - exiting 06:55:25 (5196): No heartbeat from core client for 30 sec - exiting 06:55:26 (5196): No heartbeat from core client for 30 sec - exiting 06:55:27 (5196): No heartbeat from core client for 30 sec - exiting 06:55:28 (5196): No heartbeat from core client for 30 sec - exiting 06:55:29 (5196): No heartbeat from core client for 30 sec - exiting 06:55:30 (5196): No heartbeat from core client for 30 sec - exiting 06:55:31 (5196): No heartbeat from core client for 30 sec - exiting 06:55:32 (5196): No heartbeat from core client for 30 sec - exiting 06:55:33 (5196): No heartbeat from core client for 30 sec - exiting 06:55:34 (5196): No heartbeat from core client for 30 sec - exiting 06:55:35 (5196): No heartbeat from core client for 30 sec - exiting 06:55:36 (5196): No heartbeat from core client for 30 sec - exiting 06:55:37 (5196): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=1688, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2 CPDN Monitor - No 'heartbeat' from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=1900, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3772, selfPID=2772, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_f900_2013_1_009733528_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 May 2015 07:52:19 | 1355028 | 18287198 | hadam3p_anz_f900_2013_1_009733528_0 | 57,899 | 291,773 | 5.0393 |
08 May 2015 19:02:19 | 1355028 | 18287198 | hadam3p_anz_f900_2013_1_009733528_0 | 46,379 | 233,503 | 5.0347 |
29 Apr 2015 04:25:09 | 1355028 | 18287198 | hadam3p_anz_f900_2013_1_009733528_0 | 34,859 | 174,437 | 5.0041 |
21 Apr 2015 11:26:43 | 1355028 | 18287198 | hadam3p_anz_f900_2013_1_009733528_0 | 23,339 | 118,742 | 5.0877 |
14 Apr 2015 20:38:27 | 1355028 | 18287198 | hadam3p_anz_f900_2013_1_009733528_0 | 11,819 | 60,910 | 5.1536 |
©2024 cpdn.org