Task 13672580

Name	hadcm3n_ycx2_1940_40_007547387_1
Workunit	7744619
Created	29 Nov 2011, 14:13:57 UTC
Sent	29 Nov 2011, 14:15:28 UTC
Report deadline	28 Feb 2012, 21:42:39 UTC
Received	3 Mar 2012, 21:40:57 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	870485
Run time	10 days 10 hours 19 min 12 sec
CPU time	10 days 10 hours 19 min 12 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.32 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5972, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3744, iMonCtr=1
Model crash detected, will try to restart...
13:19:11 (4296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:19:12 (4296): No heartbeat from core client for 30 sec - exiting
13:19:13 (4296): No heartbeat from core client for 30 sec - exiting
13:19:14 (4296): No heartbeat from core client for 30 sec - exiting
13:19:15 (4296): No heartbeat from core client for 30 sec - exiting
13:19:16 (4296): No heartbeat from core client for 30 sec - exiting
13:19:17 (4296): No heartbeat from core client for 30 sec - exiting
13:19:18 (4296): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6060, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5676, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
15:34:24 (6032): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4328, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5244, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5164, iMonCtr=1
Model crash detected, will try to restart...
20:30:50 (5028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=888, iMonCtr=1
Model crash detected, will try to restart...
17:51:02 (1004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:51:05 (1004): No heartbeat from core client for 30 sec - exiting
17:51:06 (1004): No heartbeat from core client for 30 sec - exiting
17:51:07 (1004): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5232, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:38:41 (5168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
ConCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:37:52 (7016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CSignal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
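
The stderr output above mixes several kinds of message (quit requests, model crash restarts, missed heartbeats, and the final signal). A minimal sketch of how one might tally them follows. The category names and the `tally` helper are hypothetical, chosen only for illustration, and the sample string is a short excerpt of messages copied from this log rather than the full output. Matching is by substring, so it tolerates the fused fragment `CSignal 11 received`.

```python
import re

# Hypothetical category labels mapped to substrings that appear in the stderr text.
PATTERNS = {
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "model_crash": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client",
    "signal": r"Signal \d+ received",
}

def tally(stderr_text):
    """Count how often each message category occurs in the given text."""
    return {name: len(re.findall(pat, stderr_text)) for name, pat in PATTERNS.items()}

# Short excerpt of messages copied from the log (not the complete stderr).
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5972, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "13:19:11 (4296): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "CSignal 11 received, exiting... Called boinc_finish"
)

print(tally(sample))
```

Passing the full contents of the `<stderr_txt>` element to `tally` instead of `sample` would give counts for the whole run; whitespace and line breaks do not affect the result.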

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
03 Mar 2012 20:42:43	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	518,400	901,140	1.7383
26 Feb 2012 12:37:35	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	492,480	855,794	1.7377
20 Feb 2012 19:34:57	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	466,560	810,664	1.7375
15 Feb 2012 20:47:30	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	440,640	765,403	1.7370
12 Feb 2012 15:46:03	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	414,720	718,752	1.7331
05 Feb 2012 16:48:03	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	388,800	673,218	1.7315
29 Jan 2012 14:18:36	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	362,880	628,477	1.7319
25 Jan 2012 22:21:26	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	336,960	583,753	1.7324
21 Jan 2012 22:43:47	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	311,040	538,689	1.7319
17 Jan 2012 21:56:33	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	285,120	494,817	1.7355
14 Jan 2012 11:02:39	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	259,200	450,666	1.7387
07 Jan 2012 20:26:52	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	233,280	405,557	1.7385
03 Jan 2012 21:02:37	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	207,360	360,520	1.7386
01 Jan 2012 15:33:53	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	181,440	315,826	1.7407
28 Dec 2011 21:40:26	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	155,520	270,641	1.7402
26 Dec 2011 18:53:48	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	129,600	225,328	1.7386
21 Dec 2011 20:34:12	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	103,680	179,870	1.7349
19 Dec 2011 16:59:37	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	77,760	135,187	1.7385
16 Dec 2011 21:35:40	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	51,840	90,109	1.7382
04 Dec 2011 12:45:52	870485	13672580	hadcm3n_ycx2_1940_40_007547387_1	25,920	44,399	1.7129
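
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is an assumption, not something the page states, but the rows are consistent with it. A small sketch that rechecks three rows copied from the table (the function name `sec_per_timestep` is made up for this example):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
rows = [
    (518_400, 901_140, 1.7383),
    (492_480, 855_794, 1.7377),
    (25_920, 44_399, 1.7129),
]

def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep

for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare at the four decimal places shown in the table.
    status = "ok" if round(computed, 4) == reported else "mismatch"
    print(f"{timestep:>7,}  computed={computed:.4f}  reported={reported:.4f}  {status}")
```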