Task 14987971

Name	hadam3p_eu_aa7u_1993_1_008060380_2
Workunit	8215494
Created	24 Jul 2012, 11:15:46 UTC
Sent	24 Jul 2012, 11:38:14 UTC
Report deadline	6 Jul 2013, 16:58:14 UTC
Received	26 Aug 2012, 17:23:21 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	0 (0x00000000)
Computer ID	1185548
Run time	2 days 20 hours 30 min 34 sec
CPU time	8 min 42 sec
Validate state	Invalid
Credit	796.57
Device peak FLOPS	1.93 GFLOPS
Application version	UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
Stderr	<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4948, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2620, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4140, selfPID=332, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6020, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
08:26:41 (5416): No heartbeat from core client for 30 sec - exiting
08:26:42 (5416): No heartbeat from core client for 30 sec - exiting
08:26:43 (5416): No heartbeat from core client for 30 sec - exiting
08:26:45 (5416): No heartbeat from core client for 30 sec - exiting
08:26:46 (5416): No heartbeat from core client for 30 sec - exiting
08:26:47 (5416): No heartbeat from core client for 30 sec - exiting
08:26:48 (5416): No heartbeat from core client for 30 sec - exiting
08:26:49 (5416): No heartbeat from core client for 30 sec - exiting
08:26:50 (5416): No heartbeat from core client for 30 sec - exiting
08:26:51 (5416): No heartbeat from core client for 30 sec - exiting
08:26:52 (5416): No heartbeat from core client for 30 sec - exiting
08:26:54 (5416): No heartbeat from core client for 30 sec - exiting
08:26:55 (5416): No heartbeat from core client for 30 sec - exiting
08:26:56 (5416): No heartbeat from core client for 30 sec - exiting
08:26:57 (5416): No heartbeat from core client for 30 sec - exiting
08:26:58 (5416): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3332, selfPID=6000, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5340, selfPID=5232, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
11:22:24 (5324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
GCM: BUFFIN : Read Failed: No error
GCM : BUFFIN: C I/O Error feof - Unit 21 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 21 - Return code = 16
Model crashed: Leaving CPDN_Main::Monitor...
Called boinc_finish
</stderr_txt>
<message>
upload failure:
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_aa7u_1993_1_008060380_2_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
</message>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Aug 2012 20:53:08	1185548	14987971	hadam3p_eu_aa7u_1993_1_008060380_2	46,176	171,443	3.7128
04 Aug 2012 20:13:57	1185548	14987971	hadam3p_eu_aa7u_1993_1_008060380_2	34,656	126,185	3.6411
31 Jul 2012 23:18:29	1185548	14987971	hadam3p_eu_aa7u_1993_1_008060380_2	23,136	83,792	3.6217
29 Jul 2012 02:43:20	1185548	14987971	hadam3p_eu_aa7u_1993_1_008060380_2	11,616	39,807	3.4269
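
The Average (sec/TS) column in the trickle table appears to be CPU Time divided by Timestep. The page does not state this; it is an inference from the four rows shown. The sketch below recomputes the ratio from values copied out of the table; the function name `sec_per_timestep` is hypothetical and used only for illustration.

```python
# Rows copied from the "Latest Trickles Received" table:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
trickles = [
    (46176, 171443, 3.7128),
    (34656, 126185, 3.6411),
    (23136, 83792, 3.6217),
    (11616, 39807, 3.4269),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: cumulative CPU seconds per timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value next to the one shown on the page.
    print(f"{timestep:>6} timesteps: computed {computed:.4f}, reported {reported}")
```

Under that assumption, the recomputed values agree with the reported column to four decimal places for all four trickles.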