Name | hadam3p_anz_m6ii_2012_1_009307472_0 |
Workunit | 9391660 |
Created | 17 Dec 2014, 19:55:23 UTC |
Sent | 21 Dec 2014, 16:12:24 UTC |
Report deadline | 3 Dec 2015, 21:32:24 UTC |
Received | 19 Jul 2015, 8:35:00 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1290561 |
Run time | 5 days 19 hours 5 min 31 sec |
CPU time | 4 days 19 hours 8 min 10 sec |
Validate state | Invalid |
Credit | 1,503.36 |
Device peak FLOPS | 2.25 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>7.4.36</core_client_version> <![CDATA[ <stderr_txt> 18:30:56 (5008): No heartbeat from core client for 30 sec - exiting 18:30:57 (5008): No heartbeat from core client for 30 sec - exiting 18:30:58 (5008): No heartbeat from core client for 30 sec - exiting 18:30:59 (5008): No heartbeat from core client for 30 sec - exiting 18:31:00 (5008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... RGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1844, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3140, selfPID=3168, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4212, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5380, selfPID=4064, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3984, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5012, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3412, selfPID=4044, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4240, selfPID=4024, iMonCtr=1 Model crash detected, will try to restart... 
11:49:06 (3556): No heartbeat from core client for 30 sec - exiting 11:49:08 (3556): No heartbeat from core client for 30 sec - exiting 11:49:09 (3556): No heartbeat from core client for 30 sec - exiting 11:49:10 (3556): No heartbeat from core client for 30 sec - exiting 11:49:11 (3556): No heartbeat from core client for 30 sec - exiting 11:49:12 (3556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5100, selfPID=4836, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4920, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3460, iMonCtr=2 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3436, selfPID=1476, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=4764, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4192, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=212, selfPID=4792, iMonCtr=1 Model crash detected, will try to restart... 17:22:37 (3696): No heartbeat from core client for 30 sec - exiting 17:22:38 (3696): No heartbeat from core client for 30 sec - exiting 17:22:39 (3696): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4940, selfPID=4940, iMonCtr=2 13:11:36 (4176): No heartbeat from core client for 30 sec - exiting 13:11:37 (4176): No heartbeat from core client for 30 sec - exiting 13:11:38 (4176): No heartbeat from core client for 30 sec - exiting 13:11:39 (4176): No heartbeat from core client for 30 sec - exiting 13:11:41 (4176): No heartbeat from core client for 30 sec - exiting 13:11:42 (4176): No heartbeat from core client for 30 sec - exiting 13:11:43 (4176): No heartbeat from core client for 30 sec - exiting 13:11:44 (4176): No heartbeat from core client for 30 sec - exiting 13:11:45 (4176): No heartbeat from core client for 30 sec - exiting 13:11:46 (4176): No heartbeat from core client for 30 sec - exiting 13:11:47 (4176): No heartbeat from core client for 30 sec - exiting 13:11:48 (4176): No heartbeat from core client for 30 sec - exiting 13:11:49 (4176): No heartbeat from core client for 30 sec - exiting 13:11:50 (4176): No heartbeat from core client for 30 sec - exiting 13:11:51 (4176): No heartbeat from core client for 30 sec - exiting 13:11:52 (4176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... GGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4588, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1380, selfPID=4668, iMonCtr=1 Model crash detected, will try to restart... 20:01:20 (4612): No heartbeat from core client for 30 sec - exiting 20:01:21 (4612): No heartbeat from core client for 30 sec - exiting 20:01:22 (4612): No heartbeat from core client for 30 sec - exiting 20:01:23 (4612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:16:28 (4280): No heartbeat from core client for 30 sec - exiting 18:16:29 (4280): No heartbeat from core client for 30 sec - exiting 18:16:30 (4280): No heartbeat from core client for 30 sec - exiting 18:16:31 (4280): No heartbeat from core client for 30 sec - exiting 18:16:32 (4280): No heartbeat from core client for 30 sec - exiting 18:16:33 (4280): No heartbeat from core client for 30 sec - exiting 18:16:34 (4280): No heartbeat from core client for 30 sec - exiting 18:16:35 (4280): No heartbeat from core client for 30 sec - exiting 18:16:36 (4280): No heartbeat from core client for 30 sec - exiting 18:16:37 (4280): No heartbeat from core client for 30 sec - exiting 18:16:38 (4280): No heartbeat from core client for 30 sec - exiting 18:16:39 (4280): No heartbeat from core client for 30 sec - exiting 18:16:40 (4280): No heartbeat from core client for 30 sec - exiting 18:16:41 (4280): No heartbeat from core client for 30 sec - exiting 18:16:42 (4280): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4352, selfPID=4352, iMonCtr=2 12:06:24 (3708): No heartbeat from core client for 30 sec - exiting 12:06:25 (3708): No heartbeat from core client for 30 sec - exiting 12:06:26 (3708): No heartbeat from core client for 30 sec - exiting 12:06:27 (3708): No heartbeat from core client for 30 sec - exiting 12:06:28 (3708): No heartbeat from core client for 30 sec - exiting 12:06:29 (3708): No heartbeat from core client for 30 sec - exiting 12:06:30 (3708): No heartbeat from core client for 30 sec - exiting 12:06:31 (3708): No heartbeat from core client for 30 sec - exiting 12:06:32 (3708): No heartbeat from core client for 30 sec - exiting 12:06:33 (3708): No heartbeat from core client for 30 sec - exiting 12:06:34 (3708): No heartbeat from core client for 30 sec - exiting 12:06:35 (3708): No heartbeat from core client for 30 sec - exiting 12:06:36 (3708): No heartbeat from core client for 30 sec - exiting 12:06:37 (3708): No heartbeat from core client for 30 sec - exiting 12:06:38 (3708): No heartbeat from core client for 30 sec - exiting 12:06:39 (3708): No heartbeat from core client for 30 sec - exiting 12:06:40 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2812, selfPID=2812, iMonCtr=2 06:57:33 (3724): No heartbeat from core client for 30 sec - exiting 06:57:34 (3724): No heartbeat from core client for 30 sec - exiting 06:57:35 (3724): No heartbeat from core client for 30 sec - exiting 06:57:36 (3724): No heartbeat from core client for 30 sec - exiting 06:57:37 (3724): No heartbeat from core client for 30 sec - exiting 06:57:38 (3724): No heartbeat from core client for 30 sec - exiting 06:58:12 (3724): No heartbeat from core client for 30 sec - exiting 06:58:13 (3724): No heartbeat from core client for 30 sec - exiting 06:58:14 (3724): No heartbeat from core client for 30 sec - exiting 06:58:15 (3724): No heartbeat from core client for 30 sec - exiting 06:58:16 (3724): No heartbeat from core client for 30 sec - exiting 06:58:17 (3724): No heartbeat from core client for 30 sec - exiting 06:58:18 (3724): No heartbeat from core client for 30 sec - exiting 06:58:19 (3724): No heartbeat from core client for 30 sec - exiting 06:58:20 (3724): No heartbeat from core client for 30 sec - exiting 06:58:21 (3724): No heartbeat from core client for 30 sec - exiting 06:58:22 (3724): No heartbeat from core client for 30 sec - exiting 06:58:23 (3724): No heartbeat from core client for 30 sec - exiting 06:58:24 (3724): No heartbeat from core client for 30 sec - exiting 06:58:25 (3724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
22:06:17 (4440): No heartbeat from core client for 30 sec - exiting 22:06:18 (4440): No heartbeat from core client for 30 sec - exiting 22:06:19 (4440): No heartbeat from core client for 30 sec - exiting 22:06:20 (4440): No heartbeat from core client for 30 sec - exiting 22:06:21 (4440): No heartbeat from core client for 30 sec - exiting 22:06:22 (4440): No heartbeat from core client for 30 sec - exiting 22:06:23 (4440): No heartbeat from core client for 30 sec - exiting 22:06:24 (4440): No heartbeat from core client for 30 sec - exiting 22:06:56 (4440): No heartbeat from core client for 30 sec - exiting 22:06:57 (4440): No heartbeat from core client for 30 sec - exiting 22:06:58 (4440): No heartbeat from core client for 30 sec - exiting 22:06:59 (4440): No heartbeat from core client for 30 sec - exiting 22:07:00 (4440): No heartbeat from core client for 30 sec - exiting 22:07:01 (4440): No heartbeat from core client for 30 sec - exiting 22:07:02 (4440): No heartbeat from core client for 30 sec - exiting 22:07:03 (4440): No heartbeat from core client for 30 sec - exiting 22:07:04 (4440): No heartbeat from core client for 30 sec - exiting 22:07:06 (4440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2840, selfPID=2840, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3656, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5184, selfPID=608, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
23:47:50 (3840): No heartbeat from core client for 30 sec - exiting 23:47:51 (3840): No heartbeat from core client for 30 sec - exiting 23:47:52 (3840): No heartbeat from core client for 30 sec - exiting 23:47:53 (3840): No heartbeat from core client for 30 sec - exiting 23:47:54 (3840): No heartbeat from core client for 30 sec - exiting 23:47:55 (3840): No heartbeat from core client for 30 sec - exiting 23:47:56 (3840): No heartbeat from core client for 30 sec - exiting 23:47:57 (3840): No heartbeat from core client for 30 sec - exiting 23:47:58 (3840): No heartbeat from core client for 30 sec - exiting 23:47:59 (3840): No heartbeat from core client for 30 sec - exiting 23:48:01 (3840): No heartbeat from core client for 30 sec - exiting 23:48:02 (3840): No heartbeat from core client for 30 sec - exiting 23:48:03 (3840): No heartbeat from core client for 30 sec - exiting 23:48:04 (3840): No heartbeat from core client for 30 sec - exiting 23:48:05 (3840): No heartbeat from core client for 30 sec - exiting 23:48:06 (3840): No heartbeat from core client for 30 sec - exiting 23:48:07 (3840): No heartbeat from core client for 30 sec - exiting 23:48:08 (3840): No heartbeat from core client for 30 sec - exiting 23:48:09 (3840): No heartbeat from core client for 30 sec - exiting 23:48:10 (3840): No heartbeat from core client for 30 sec - exiting 23:48:11 (3840): No heartbeat from core client for 30 sec - exiting 23:48:12 (3840): No heartbeat from core client for 30 sec - exiting 23:48:13 (3840): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2764, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4748, selfPID=1140, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3512, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6120, selfPID=4448, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4784, selfPID=3192, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4952, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3596, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2408, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4428, selfPID=3540, iMonCtr=1 Model crash detected, will try to restart... 
20:27:06 (5348): No heartbeat from core client for 30 sec - exiting 20:27:07 (5348): No heartbeat from core client for 30 sec - exiting 20:27:08 (5348): No heartbeat from core client for 30 sec - exiting 20:27:09 (5348): No heartbeat from core client for 30 sec - exiting 20:27:11 (5348): No heartbeat from core client for 30 sec - exiting 20:27:12 (5348): No heartbeat from core client for 30 sec - exiting 20:27:13 (5348): No heartbeat from core client for 30 sec - exiting 20:27:14 (5348): No heartbeat from core client for 30 sec - exiting 20:27:15 (5348): No heartbeat from core client for 30 sec - exiting 20:27:16 (5348): No heartbeat from core client for 30 sec - exiting 20:27:17 (5348): No heartbeat from core client for 30 sec - exiting 20:27:18 (5348): No heartbeat from core client for 30 sec - exiting 20:27:19 (5348): No heartbeat from core client for 30 sec - exiting 20:27:20 (5348): No heartbeat from core client for 30 sec - exiting 20:27:21 (5348): No heartbeat from core client for 30 sec - exiting 20:27:22 (5348): No heartbeat from core client for 30 sec - exiting 20:27:23 (5348): No heartbeat from core client for 30 sec - exiting 20:27:25 (5348): No heartbeat from core client for 30 sec - exiting 20:27:26 (5348): No heartbeat from core client for 30 sec - exiting 20:27:27 (5348): No heartbeat from core client for 30 sec - exiting 20:27:29 (5348): No heartbeat from core client for 30 sec - exiting 20:27:30 (5348): No heartbeat from core client for 30 sec - exiting 20:27:31 (5348): No heartbeat from core client for 30 sec - exiting 20:27:32 (5348): No heartbeat from core client for 30 sec - exiting 20:27:33 (5348): No heartbeat from core client for 30 sec - exiting 20:27:34 (5348): No heartbeat from core client for 30 sec - exiting 20:27:35 (5348): No heartbeat from core client for 30 sec - exiting 20:27:36 (5348): No heartbeat from core client for 30 sec - exiting 20:27:37 (5348): No heartbeat from core client for 30 sec - exiting
20:27:39 (5348): No heartbeat from core client for 30 sec - exiting 20:27:40 (5348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3712, selfPID=5520, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3532, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5400, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5564, selfPID=4952, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3336, selfPID=5116, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3648, selfPID=3132, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3040, selfPID=4556, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5564, selfPID=4560, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4288, selfPID=4504, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2 Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3016, selfPID=4664, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4288, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6020, selfPID=4364, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4368, selfPID=4544, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=996, selfPID=2776, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5656, selfPID=4944, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_4.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_5.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_6.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_7.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_8.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_9.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_10.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_11.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_anz_m6ii_2012_1_009307472_0_12.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
22 May 2015 21:01:55 | 1290561 | 17593730 | hadam3p_anz_m6ii_2012_1_009307472_0 | 34,859 | 334,924 | 9.6080 |
26 Mar 2015 09:26:26 | 1290561 | 17593730 | hadam3p_anz_m6ii_2012_1_009307472_0 | 23,339 | 223,115 | 9.5597 |
24 Jan 2015 15:56:08 | 1290561 | 17593730 | hadam3p_anz_m6ii_2012_1_009307472_0 | 11,819 | 112,533 | 9.5214 |
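
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the three rows above, not taken from CPDN documentation, so treat the following as a minimal sketch under that assumption. The names `TRICKLES` and `seconds_per_timestep` are hypothetical and exist only for this illustration.

```python
# Rows copied from the "Latest Trickles Received" table:
# (time sent UTC, timestep, cumulative CPU time in seconds, reported average)
TRICKLES = [
    ("22 May 2015 21:01:55", 34_859, 334_924, 9.6080),
    ("26 Mar 2015 09:26:26", 23_339, 223_115, 9.5597),
    ("24 Jan 2015 15:56:08", 11_819, 112_533, 9.5214),
]


def seconds_per_timestep(cpu_seconds: float, timestep: int) -> float:
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for sent, timestep, cpu_seconds, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_seconds, timestep)
        # Print the recomputed value next to the one shown on the page.
        print(f"{sent}: computed {computed:.4f} sec/TS, reported {reported:.4f}")
```

Under this assumption the recomputed values agree with the reported column to within rounding, and the average rises slightly from the January trickle to the May trickle.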
©2024 cpdn.org