Task 15934459

Name	hadcm3n_n0l6_1880_40_008410114_0
Workunit	8560970
Created	22 Aug 2013, 4:11:33 UTC
Sent	22 Aug 2013, 15:31:38 UTC
Report deadline	21 Nov 2013, 22:58:49 UTC
Received	9 Dec 2013, 10:24:33 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1275048
Run time	7 days 1 hours 29 min 11 sec
CPU time	6 days 13 hours 46 min 52 sec
Validate state	Invalid
Credit	5,287.68
Device peak FLOPS	3.10 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:46:47 (7856): No heartbeat from core client for 30 sec - exiting
20:46:48 (7856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:46:49 (7856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:53:50 (4156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:53:51 (4156): No heartbeat from core client for 30 sec - exiting
14:53:52 (4156): No heartbeat from core client for 30 sec - exiting
14:53:53 (4156): No heartbeat from core client for 30 sec - exiting
14:53:54 (4156): No heartbeat from core client for 30 sec - exiting
14:53:55 (4156): No heartbeat from core client for 30 sec - exiting
14:53:56 (4156): No heartbeat from core client for 30 sec - exiting
14:53:57 (4156): No heartbeat from core client for 30 sec - exiting
14:53:58 (4156): No heartbeat from core client for 30 sec - exiting
14:53:59 (4156): No heartbeat from core client for 30 sec - exiting
14:54:00 (4156): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1304, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
16 Nov 2013 22:01:34	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	440,640	567,985	1.2890
04 Nov 2013 22:03:52	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	414,720	533,029	1.2853
31 Oct 2013 23:33:58	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	388,800	499,780	1.2854
31 Oct 2013 14:21:20	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	362,880	467,637	1.2887
23 Oct 2013 22:43:34	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	336,960	433,900	1.2877
16 Oct 2013 11:10:34	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	311,040	399,966	1.2859
15 Oct 2013 06:29:21	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	285,120	365,271	1.2811
13 Oct 2013 14:01:31	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	259,200	331,323	1.2783
09 Oct 2013 02:24:59	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	233,280	298,621	1.2801
08 Oct 2013 16:44:12	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	207,360	265,886	1.2822
11 Sep 2013 17:23:23	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	181,440	232,130	1.2794
26 Aug 2013 20:31:54	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	155,520	198,440	1.2760
26 Aug 2013 02:12:18	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	129,600	164,315	1.2679
25 Aug 2013 10:28:24	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	103,680	131,752	1.2708
24 Aug 2013 13:34:19	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	77,760	99,711	1.2823
23 Aug 2013 21:19:35	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	51,840	66,861	1.2898
23 Aug 2013 10:38:20	1275048	15934459	hadcm3n_n0l6_1880_40_008410114_0	25,920	33,637	1.2977
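
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That definition is an assumption inferred from the figures, not something the page states. A minimal Python sketch that checks it against three rows copied from the table above (the names `TRICKLES`, `sec_per_timestep`, and `matches_reported` are hypothetical):

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (440_640, 567_985, 1.2890),
    (259_200, 331_323, 1.2783),
    (25_920, 33_637, 1.2977),
]


def sec_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Average CPU seconds per model timestep (assumed definition of the column)."""
    return cpu_seconds / timestep


def matches_reported(timestep: int, cpu_seconds: int, reported: float,
                     tol: float = 1e-4) -> bool:
    """True if the recomputed average agrees with the reported one within tol."""
    return abs(sec_per_timestep(cpu_seconds, timestep) - reported) <= tol


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = sec_per_timestep(cpu_seconds, timestep)
        print(f"{timestep:>8} TS  computed {computed:.4f}  reported {reported:.4f}")
```

The tolerance of 1e-4 allows for the table's values being rounded to four decimal places.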