Task 15399985

Name	hadcm3n_o3hj_1980_40_008239407_1
Workunit	8394531
Created	26 Oct 2012, 17:16:46 UTC
Sent	26 Oct 2012, 17:16:53 UTC
Report deadline	26 Jan 2013, 0:44:04 UTC
Received	13 Dec 2012, 23:15:59 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1184169
Run time	6 days 12 hours 53 min 35 sec
CPU time	6 days 8 hours 11 min 37 sec
Validate state	Invalid
Credit	6,531.84
Device peak FLOPS	3.70 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:06:45 (7936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:04:30 (1644): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
11:25:22 (1600): No heartbeat from core client for 30 sec - exiting
11:25:23 (1600): No heartbeat from core client for 30 sec - exiting
11:25:24 (1600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:34:33 (6948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:15:24 (8896): No heartbeat from core client for 30 sec - exiting
15:15:25 (8896): No heartbeat from core client for 30 sec - exiting
15:15:26 (8896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
23:01:18 (10008): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:01:54 (5028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:10:05 (2512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
21:08:07 (7584): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:35:29 (2492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:53:48 (9508): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:05:16 (8456): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
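
The stderr output above repeats a small set of messages (quit requests, missed heartbeats, model crashes). As a rough aid for reading logs like this, the sketch below tallies how often each kind of message occurs. The function name tally_events, the category labels, and the short sample string are illustrative assumptions, not part of BOINC or the CPDN application; the search strings are copied from the messages shown in the log.

```python
import re
from collections import Counter

# Search strings copied from the stderr text; the category names are hypothetical labels.
PATTERNS = {
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "no_heartbeat": r"No heartbeat from core client",
    "model_crash": r"Model crash detected",
}


def tally_events(stderr_text: str) -> Counter:
    """Count occurrences of each known message type in a stderr dump."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(re.findall(pattern, stderr_text))
    return counts


if __name__ == "__main__":
    # A short made-up excerpt in the same style, not the full log.
    sample = (
        "CPDN Monitor - Quit request from BOINC... "
        "20:06:45 (7936): No heartbeat from core client for 30 sec - exiting "
        "CPDN Monitor - No 'heartbeat' from BOINC... "
        "CPDN Monitor - Quit request from BOINC... "
        "Model crash detected, will try to restart..."
    )
    for name, count in tally_events(sample).items():
        print(f"{name}: {count}")
```

Because the patterns are matched against the whole text rather than line by line, the sketch works whether the log is flattened onto one line or split per message.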

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Dec 2012 21:34:48	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	544,320	540,344	0.9927
02 Dec 2012 18:18:23	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	518,400	514,735	0.9929
02 Dec 2012 00:16:54	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	492,480	489,114	0.9932
01 Dec 2012 16:48:01	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	466,560	463,226	0.9929
29 Nov 2012 22:34:27	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	440,640	437,324	0.9925
26 Nov 2012 22:31:26	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	414,720	411,680	0.9927
25 Nov 2012 16:42:02	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	388,800	385,932	0.9926
23 Nov 2012 23:21:15	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	362,880	360,385	0.9931
21 Nov 2012 22:05:15	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	336,960	334,649	0.9931
19 Nov 2012 19:02:48	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	311,040	308,968	0.9933
18 Nov 2012 15:50:46	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	285,120	283,177	0.9932
17 Nov 2012 23:48:30	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	259,200	257,822	0.9947
17 Nov 2012 16:46:27	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	233,280	232,815	0.9980
16 Nov 2012 18:17:15	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	207,360	207,045	0.9985
03 Nov 2012 20:22:45	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	181,440	181,258	0.9990
03 Nov 2012 12:40:19	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	155,520	155,459	0.9996
01 Nov 2012 20:41:39	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	129,600	129,579	0.9998
01 Nov 2012 12:02:03	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	103,680	103,787	1.0010
28 Oct 2012 22:00:26	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	77,760	78,125	1.0047
28 Oct 2012 14:24:32	1184169	15399985	hadcm3n_o3hj_1980_40_008239407_1	51,840	52,176	1.0065
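
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against a few rows copied from the table; the function name average_sec_per_timestep is a hypothetical helper, and the formula is inferred from the figures rather than taken from project documentation.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    # (timestep, CPU time in seconds, reported average) taken from the trickle table.
    rows = [
        (544320, 540344, 0.9927),
        (518400, 514735, 0.9929),
        (77760, 78125, 1.0047),
        (51840, 52176, 1.0065),
    ]
    for timestep, cpu_time, reported in rows:
        computed = average_sec_per_timestep(cpu_time, timestep)
        print(f"{timestep:>7} {cpu_time:>7} computed={computed:.4f} reported={reported:.4f}")
```

For the rows checked, the computed value matches the reported one, which is consistent with the average being a running figure over the whole task rather than a per-interval rate.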