Task 14912024

Name	hadcm3n_o5ic_2060_40_008049284_2
Workunit	8204398
Created	13 Jul 2012, 12:55:55 UTC
Sent	13 Jul 2012, 12:56:18 UTC
Report deadline	12 Oct 2012, 20:23:29 UTC
Received	18 Oct 2012, 16:31:09 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1227424
Run time	9 days 7 hours 20 min 21 sec
CPU time	8 days 19 hours 49 min 41 sec
Validate state	Invalid
Credit	5,909.76
Device peak FLOPS	2.84 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:16:49 (4688): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended
CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4120, iMonCtr=1
Model crash detected, will try to restart...
21:36:47 (1436): No heartbeat from core client for 30 sec - exiting
21:36:48 (1436): No heartbeat from core client for 30 sec - exiting
21:36:50 (1436): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:25:38 (5176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
13:28:59 (4168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
07:07:45 (8632): No heartbeat from core client for 30 sec - exiting
07:07:46 (8632): No heartbeat from core client for 30 sec - exiting
07:07:47 (8632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:28:52 (5096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
16:09:53 (9580): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
17:32:59 (6224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:30:21 (5504): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
16:08:10 (6164): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
06:10:56 (5448): No heartbeat from core client for 30 sec - exiting
06:10:57 (5448): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:31:21 (948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:29:48 (7560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
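
The stderr text above repeats a small set of messages, so a tally can make the sequence easier to read. The following is a minimal sketch, not part of BOINC or CPDN: the function names `tally_stderr` and `heartbeat_pids` and the labels in `PATTERNS` are illustrative assumptions, and the patterns are simply copied from the message wording seen in this log.

```python
import re
from collections import Counter

# Patterns copied from message wording in the stderr text above.
# The dictionary keys are illustrative labels, not BOINC terminology.
PATTERNS = {
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "no_heartbeat": r"\(\d+\): No heartbeat from core client for 30 sec - exiting",
    "model_crash": r"Model crash detected, will try to restart",
    "signal_22": r"Signal 22 received, exiting",
}


def tally_stderr(text: str) -> Counter:
    """Count how often each known message occurs in a stderr dump."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(re.findall(pattern, text))
    return counts


def heartbeat_pids(text: str) -> list:
    """Return the distinct process IDs that logged a heartbeat loss, sorted."""
    return sorted({int(pid) for pid in re.findall(r"\((\d+)\): No heartbeat", text)})


# Short excerpt assembled from lines that appear in the log above.
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "14:16:49 (4688): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Model crash detected, will try to restart... "
    "Signal 22 received, exiting..."
)

if __name__ == "__main__":
    print(dict(tally_stderr(sample)))
    print(heartbeat_pids(sample))
```

Because the patterns use `re.findall` on the whole string, the sketch works whether the log is stored one message per line or as a single flattened run of text.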

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
16 Oct 2012 16:51:31	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	492,480	735,297	1.4930
15 Oct 2012 11:31:03	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	466,560	696,869	1.4936
11 Oct 2012 19:05:36	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	440,640	658,469	1.4943
10 Oct 2012 14:27:15	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	414,720	618,189	1.4906
08 Oct 2012 14:17:11	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	388,800	578,321	1.4875
03 Oct 2012 00:44:52	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	362,880	543,019	1.4964
01 Oct 2012 11:30:57	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	336,960	503,472	1.4942
26 Sep 2012 20:09:34	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	311,040	462,129	1.4858
25 Sep 2012 12:58:54	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	285,120	421,338	1.4778
14 Sep 2012 18:39:20	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	259,200	383,017	1.4777
13 Sep 2012 17:19:08	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	233,280	344,995	1.4789
12 Sep 2012 18:30:15	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	207,360	307,874	1.4847
11 Sep 2012 13:31:44	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	181,440	271,852	1.4983
03 Sep 2012 11:01:41	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	155,520	233,925	1.5041
02 Sep 2012 01:39:09	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	129,600	196,860	1.5190
01 Sep 2012 13:36:22	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	103,680	156,901	1.5133
01 Sep 2012 02:04:01	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	77,760	117,900	1.5162
31 Aug 2012 06:19:48	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	51,840	78,058	1.5057
04 Aug 2012 13:11:06	1227424	14912024	hadcm3n_o5ic_2060_40_008049284_2	25,920	38,487	1.4848
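
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep reached, and the timestep advances by 25,920 between consecutive trickles. The sketch below checks that reading against three rows from the table; the helper names and the constant `TIMESTEPS_PER_TRICKLE` are assumptions introduced for illustration, inferred from the figures rather than taken from CPDN documentation.

```python
# Interval between consecutive trickles, inferred from the Timestep column
# (an assumption based on the table, not on CPDN documentation).
TIMESTEPS_PER_TRICKLE = 25_920


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by the timestep reached."""
    return cpu_time_sec / timestep


def trickle_number(timestep: int) -> int:
    """Position of a trickle in the sequence, counting from 1."""
    return timestep // TIMESTEPS_PER_TRICKLE


# (timestep, cpu_time_sec, reported_average) taken from rows of the table.
rows = [
    (492_480, 735_297, "1.4930"),
    (466_560, 696_869, "1.4936"),
    (25_920, 38_487, "1.4848"),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in rows:
        computed = f"{seconds_per_timestep(cpu_time, timestep):.4f}"
        print(trickle_number(timestep), computed, reported)
```

For these three rows the computed value matches the reported average to four decimal places, and the latest trickle (timestep 492,480) is the 19th, which agrees with the 19 rows listed.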