Name | hadam3p_anz_a2zm_201412_12_313_010279075_1 |
Workunit | 10279075 |
Created | 15 Feb 2016, 18:20:09 UTC |
Sent | 15 Feb 2016, 18:22:55 UTC |
Report deadline | 27 Jan 2017, 23:42:55 UTC |
Received | 25 Feb 2016, 21:19:32 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1254613 |
Run time | 1 days 11 hours 28 min 32 sec |
CPU time | 1 days 9 hours 53 min 52 sec |
Validate state | Invalid |
Credit | 1,006.54 |
Device peak FLOPS | 3.03 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>7.6.9</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2716, selfPID=4128, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 00:08:54 (3864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:09:35 (5032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:10:34 (4108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:11:21 (4004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:12:26 (5672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:13:12 (3372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:14:12 (5980): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5732, selfPID=5732, iMonCtr=2 00:15:06 (5644): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2372, selfPID=2372, iMonCtr=2 00:15:59 (5228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:16:51 (5796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:18:36 (1484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:19:28 (6080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:20:21 (4088): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2108, selfPID=2108, iMonCtr=2 00:21:26 (2420): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:22:12 (4880): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2084, selfPID=2084, iMonCtr=2 00:23:00 (3940): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3092, selfPID=3092, iMonCtr=2 00:23:53 (6104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5256, selfPID=5256, iMonCtr=2 00:25:38 (2264): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5524, selfPID=5524, iMonCtr=2 00:26:43 (5316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:27:24 (5180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4116, selfPID=4116, iMonCtr=2 00:28:35 (4020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:29:22 (1312): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5576, selfPID=5576, iMonCtr=2 00:30:15 (5656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:31:09 (5236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5184, selfPID=5184, iMonCtr=2 00:32:01 (1152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:32:54 (5832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=5768, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3316, selfPID=3316, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3316, selfPID=5908, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_3.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_4.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_5.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_6.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_7.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_8.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_9.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_10.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_11.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_a2zm_201412_12_313_010279075_1_12.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received | ||||||
---|---|---|---|---|---|---|
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
17 Feb 2016 00:10:37 | 1254613 | 19288299 | hadam3p_anz_a2zm_201412_12_313_010279075_1 | 23,339 | 95,052 | 4.0727 |
16 Feb 2016 21:24:56 | 1254613 | 19288299 | hadam3p_anz_a2zm_201412_12_313_010279075_1 | 11,819 | 47,997 | 4.0610 |
©2024 cpdn.org