Task 15501535

Name	hadcm3n_3ihq_1940_40_008262778_2
Workunit	8417902
Created	23 Dec 2012, 19:58:17 UTC
Sent	23 Dec 2012, 20:41:19 UTC
Report deadline	25 Mar 2013, 4:08:30 UTC
Received	25 Feb 2013, 8:01:21 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID	1166383
Run time	60 days 2 hours 6 min 13 sec
CPU time	57 days 5 hours 18 min 1 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	1.76 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4920, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5048, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5320, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1272, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6024, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5388, iMonCtr=1
Model crash detected, will try to restart...
09:17:01 (5852): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
09:54:52 (4204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
04:29:08 (4772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:27:59 (5288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
</stderr_txt>
]]>
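
The Exit status field above gives the same code twice: -226 as a signed decimal and 0xFFFFFF1E as hex. The two agree if the status is read as a 32-bit two's-complement integer. The sketch below only illustrates that conversion; the helper names to_hex_32 and to_signed_32 are made up for this example and are not part of BOINC.

```python
# Illustrative helpers (hypothetical names, not BOINC code) showing how a
# signed exit status maps to its 32-bit two's-complement hex form and back.

def to_hex_32(value: int) -> str:
    """Format a signed integer as an 8-digit, 32-bit two's-complement hex string."""
    return f"0x{value & 0xFFFFFFFF:08X}"


def to_signed_32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(to_hex_32(-226))           # 0xFFFFFF1E
    print(to_signed_32(0xFFFFFF1E))  # -226
```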

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Feb 2013 03:37:56	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	933,120	4,934,341	5.2880
21 Feb 2013 18:11:36	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	907,200	4,774,179	5.2625
20 Feb 2013 16:05:42	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	881,280	4,683,134	5.3140
19 Feb 2013 14:55:45	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	855,360	4,593,017	5.3697
18 Feb 2013 14:25:24	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	829,440	4,502,378	5.4282
17 Feb 2013 03:20:55	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	803,520	4,383,055	5.4548
14 Feb 2013 19:37:01	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	777,600	4,196,762	5.3971
12 Feb 2013 06:19:05	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	751,680	4,012,140	5.3376
09 Feb 2013 23:19:37	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	725,760	3,827,066	5.2732
07 Feb 2013 18:03:04	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	699,840	3,646,436	5.2104
05 Feb 2013 12:55:13	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	673,920	3,468,182	5.1463
03 Feb 2013 07:02:15	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	648,000	3,292,910	5.0817
01 Feb 2013 04:05:52	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	622,080	3,122,469	5.0194
31 Jan 2013 01:29:24	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	596,160	3,025,461	5.0749
30 Jan 2013 02:32:20	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	570,240	2,945,876	5.1660
29 Jan 2013 05:08:58	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	544,320	2,868,927	5.2707
28 Jan 2013 07:09:14	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	518,400	2,789,728	5.3814
27 Jan 2013 08:10:54	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	492,480	2,706,785	5.4962
25 Jan 2013 12:05:54	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	466,560	2,553,688	5.4734
23 Jan 2013 11:53:45	1166383	15501535	hadcm3n_3ihq_1940_40_008262778_2	440,640	2,389,320	5.4224
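
In the rows above, the Average (sec/TS) column appears to be the cumulative CPU Time (sec) divided by the cumulative Timestep, rounded to four decimal places. This is inferred from the numbers in the table, not from any project documentation. The sketch below recomputes it for a few rows; the function name seconds_per_timestep is an assumption for illustration.

```python
# Recompute "Average (sec/TS)" from the trickle table, assuming it is
# cumulative CPU seconds divided by cumulative timesteps (inferred from
# the table values, not from project documentation).

def seconds_per_timestep(cpu_time: str, timestep: str) -> float:
    """Parse comma-grouped integers as shown in the table and return CPU sec per timestep."""
    cpu_seconds = int(cpu_time.replace(",", ""))
    timesteps = int(timestep.replace(",", ""))
    return round(cpu_seconds / timesteps, 4)


# (Timestep, CPU Time (sec), Average (sec/TS)) copied from three table rows.
rows = [
    ("933,120", "4,934,341", 5.2880),
    ("907,200", "4,774,179", 5.2625),
    ("440,640", "2,389,320", 5.4224),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in rows:
        computed = seconds_per_timestep(cpu_time, timestep)
        print(f"{timestep:>8}  computed={computed:.4f}  reported={reported:.4f}")
```

The same rows also show that consecutive trickles are 25,920 timesteps apart (for example 933,120 - 907,200).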