Name | hadam3p_eu_8cw6_2009_1_008134729_0 |
Workunit | 8289843 |
Created | 12 Aug 2012, 9:16:15 UTC |
Sent | 12 Aug 2012, 9:20:28 UTC |
Report deadline | 25 Jul 2013, 14:40:28 UTC |
Received | 26 Sep 2012, 11:39:13 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1231507 |
Run time | 6 days 8 hours 24 min 45 sec |
CPU time | 4 days 5 hours 18 min 37 sec |
Validate state | Invalid |
Credit | 1,790.21 |
Device peak FLOPS | 2.09 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3892, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4036, selfPID=1656, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3556, selfPID=3536, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3484, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1968, selfPID=1744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3064, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1240, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1656, selfPID=3492, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2556, selfPID=3496, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1792, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1836, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=668, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3444, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3480, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3128, selfPID=2588, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1992, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3828, selfPID=3896, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2928, selfPID=3428, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=220, selfPID=220, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1140, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1404, selfPID=3464, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2272, selfPID=3524, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3036, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3124, selfPID=3460, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2428, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2368, selfPID=3924, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=772, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3468, selfPID=3700, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3520, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3124, selfPID=3644, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3696, iMonCtr=2 Model crash detected, will try to restart... 13:59:38 (3536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1608, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1144, selfPID=4028, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3364, selfPID=3032, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2244, selfPID=1448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1800, selfPID=3196, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3856, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1252, selfPID=3100, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2332, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1788, selfPID=1152, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3240, selfPID=3276, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3912, selfPID=2660, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4076, selfPID=2124, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2572, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2132, selfPID=3688, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3520, selfPID=3920, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3400, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1612, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=660, selfPID=3980, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2484, selfPID=3692, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1044, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3856, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2836, selfPID=3204, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3560, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3936, selfPID=3492, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2160, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3440, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3004, selfPID=2608, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4000, iMonCtr=2 Mod el crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3300, selfPID=3956, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2624, selfPID=3620, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2492, selfPID=3592, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3088, selfPID=4032, iMonCtr=1 Model crash detected, will try to restart... CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1448, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3264, selfPID=3688, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2820, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3484, selfPID=3908, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2276, selfPID=2276, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3816, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1356, selfPID=3820, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt><message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_8cw6_2009_1_008134729_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_8cw6_2009_1_008134729_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_8cw6_2009_1_008134729_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received | ||||||
---|---|---|---|---|---|---|
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
23 Sep 2012 07:50:58 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 103,776 | 354,781 | 3.4187 |
16 Sep 2012 06:33:58 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 92,257 | 315,721 | 3.4222 |
15 Sep 2012 15:44:05 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 92,256 | 315,150 | 3.4160 |
10 Sep 2012 12:27:47 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 80,737 | 276,801 | 3.4284 |
09 Sep 2012 16:06:49 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 80,736 | 276,257 | 3.4217 |
08 Sep 2012 07:06:03 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 69,217 | 236,990 | 3.4239 |
07 Sep 2012 20:07:21 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 69,216 | 236,391 | 3.4153 |
04 Sep 2012 16:25:45 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 57,696 | 196,671 | 3.4087 |
31 Aug 2012 09:25:36 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 46,177 | 157,707 | 3.4153 |
31 Aug 2012 08:25:24 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 46,176 | 157,129 | 3.4028 |
28 Aug 2012 08:10:53 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 34,656 | 117,714 | 3.3966 |
23 Aug 2012 19:10:58 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 23,136 | 78,651 | 3.3995 |
20 Aug 2012 07:39:16 | 1231507 | 15101868 | hadam3p_eu_8cw6_2009_1_008134729_0 | 11,616 | 39,380 | 3.3902 |
©2024 climateprediction.net