Name | hadam3p_eu_a3v0_2013_1_008521345_0 |
Workunit | 8668857 |
Created | 3 Mar 2014, 12:15:08 UTC |
Sent | 3 Mar 2014, 12:16:17 UTC |
Report deadline | 13 Feb 2015, 17:36:17 UTC |
Received | 27 Mar 2014, 3:55:44 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1308857 |
Run time | 2 days 6 hours 19 min 18 sec |
CPU time | 2 days 4 hours 55 min 42 sec |
Validate state | Invalid |
Credit | 1,591.48 |
Device peak FLOPS | 2.02 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.2.33</core_client_version> <![CDATA[ <stderr_txt> CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1176, selfPID=1176, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8736, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6648, selfPID=5964, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:18:49 (1132): No heartbeat from core client for 30 sec - exiting 07:18:50 (1132): No heartbeat from core client for 30 sec - exiting 07:18:51 (1132): No heartbeat from core client for 30 sec - exiting 07:18:52 (1132): No heartbeat from core client for 30 sec - exiting 07:18:53 (1132): No heartbeat from core client for 30 sec - exiting 07:18:55 (1132): No heartbeat from core client for 30 sec - exiting 07:18:56 (1132): No heartbeat from core client for 30 sec - exiting 07:18:57 (1132): No heartbeat from core client for 30 sec - exiting 07:18:58 (1132): No heartbeat from core client for 30 sec - exiting 07:18:59 (1132): No heartbeat from core client for 30 sec - exiting 07:19:00 (1132): No heartbeat from core client for 30 sec - exiting 07:19:01 (1132): No heartbeat from core client for 30 sec - exiting 07:19:02 (1132): No heartbeat from core client for 30 sec - exiting 07:19:03 (1132): No heartbeat from core client for 30 sec - exiting 07:19:04 (1132): No heartbeat from core client for 30 sec - exiting 07:19:05 (1132): No heartbeat from core client for 30 sec - exiting 07:19:07 (1132): No heartbeat from core client for 30 sec - exiting 07:19:08 (1132): No heartbeat from core client for 30 sec - 
exiting 07:19:09 (1132): No heartbeat from core client for 30 sec - exiting 07:19:10 (1132): No heartbeat from core client for 30 sec - exiting 07:19:11 (1132): No heartbeat from core client for 30 sec - exiting 07:19:12 (1132): No heartbeat from core client for 30 sec - exiting 07:19:13 (1132): No heartbeat from core client for 30 sec - exiting 07:19:14 (1132): No heartbeat from core client for 30 sec - exiting 07:19:15 (1132): No heartbeat from core client for 30 sec - exiting 07:19:16 (1132): No heartbeat from core client for 30 sec - exiting 07:19:18 (1132): No heartbeat from core client for 30 sec - exiting 07:19:19 (1132): No heartbeat from core client for 30 sec - exiting 07:19:20 (1132): No heartbeat from core client for 30 sec - exiting 07:19:21 (1132): No heartbeat from core client for 30 sec - exiting 07:19:22 (1132): No heartbeat from core client for 30 sec - exiting 07:19:23 (1132): No heartbeat from core client for 30 sec - exiting 07:19:24 (1132): No heartbeat from core client for 30 sec - exiting 07:19:25 (1132): No heartbeat from core client for 30 sec - exiting 07:19:26 (1132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:24:03 (5544): No heartbeat from core client for 30 sec - exiting 07:24:04 (5544): No heartbeat from core client for 30 sec - exiting 07:24:05 (5544): No heartbeat from core client for 30 sec - exiting 07:24:06 (5544): No heartbeat from core client for 30 sec - exiting 07:24:07 (5544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4288, selfPID=4288, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3136, selfPID=3136, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3996, selfPID=3996, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9816, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4860, selfPID=6088, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6508, selfPID=792, iMonCtr=1 Model crash detected, will try to restart... 
15:58:45 (4568): No heartbeat from core client for 30 sec - exiting 15:58:47 (4568): No heartbeat from core client for 30 sec - exiting 15:58:48 (4568): No heartbeat from core client for 30 sec - exiting 15:58:49 (4568): No heartbeat from core client for 30 sec - exiting 15:58:50 (4568): No heartbeat from core client for 30 sec - exiting 15:58:51 (4568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5924, selfPID=5592, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakg.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_a3v0_2013_1_008521345_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_a3v0_2013_1_008521345_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_a3v0_2013_1_008521345_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_a3v0_2013_1_008521345_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Mar 2014 07:16:00 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 92,256 | 190,061 | 2.0601 |
22 Mar 2014 19:16:46 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 80,736 | 166,579 | 2.0633 |
20 Mar 2014 09:36:02 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 69,216 | 143,963 | 2.0799 |
18 Mar 2014 07:38:10 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 57,696 | 119,633 | 2.0735 |
13 Mar 2014 07:32:47 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 46,176 | 94,897 | 2.0551 |
07 Mar 2014 09:43:24 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 34,656 | 71,627 | 2.0668 |
06 Mar 2014 09:12:35 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 23,136 | 48,276 | 2.0866 |
05 Mar 2014 16:55:12 | 1308857 | 16302968 | hadam3p_eu_a3v0_2013_1_008521345_0 | 11,616 | 24,722 | 2.1283 |
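
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the figures in the table, not taken from CPDN documentation. A minimal Python sketch that checks it against the rows above follows; the function names `seconds_per_timestep` and `check_rows` are hypothetical helpers introduced for illustration only.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
trickles = [
    (92256, 190061, 2.0601),
    (80736, 166579, 2.0633),
    (69216, 143963, 2.0799),
    (57696, 119633, 2.0735),
    (46176, 94897, 2.0551),
    (34656, 71627, 2.0668),
    (23136, 48276, 2.0866),
    (11616, 24722, 2.1283),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per cumulative model timestep."""
    return cpu_time_sec / timestep


def check_rows(rows, tolerance=1e-4):
    """Return one boolean per row: does the recomputed average match the
    reported one to within the table's four-decimal precision?"""
    return [
        abs(seconds_per_timestep(cpu, ts) - reported) <= tolerance
        for ts, cpu, reported in rows
    ]


if __name__ == "__main__":
    for (ts, cpu, reported), ok in zip(trickles, check_rows(trickles)):
        print(f"{ts:>6} steps: {seconds_per_timestep(cpu, ts):.4f} "
              f"(reported {reported:.4f}) match={ok}")
```

Under this assumption every row agrees with the reported value, which suggests the host ran at a steady rate of roughly 2.06 to 2.13 CPU seconds per timestep until the last trickle on 25 Mar 2014.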