Task 17385948

Name	hadcm3n_sayj_1940_40_009109766_1
Workunit	9240102
Created	10 Nov 2014, 23:42:20 UTC
Sent	10 Nov 2014, 23:43:31 UTC
Report deadline	10 Feb 2015, 7:10:42 UTC
Received	23 Nov 2014, 9:03:44 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1187505
Run time	11 days 0 hours 26 min 38 sec
CPU time	10 days 21 hours 1 min 2 sec
Validate state	Invalid
Credit	9,642.24
Device peak FLOPS	2.62 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.4.27</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
09:32:50 (9164): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:32:51 (9164): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:42:14 (2880): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Nov 2014 00:09:45	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	803,520	913,138	1.1364
22 Nov 2014 15:48:36	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	777,600	883,382	1.1360
21 Nov 2014 21:18:55	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	751,680	849,466	1.1301
21 Nov 2014 13:43:12	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	725,760	820,036	1.1299
20 Nov 2014 19:47:51	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	699,840	789,535	1.1282
20 Nov 2014 11:31:06	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	673,920	760,239	1.1281
20 Nov 2014 03:23:20	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	648,000	731,204	1.1284
19 Nov 2014 19:43:54	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	622,080	704,028	1.1317
19 Nov 2014 13:09:43	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	596,160	677,726	1.1368
19 Nov 2014 04:31:54	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	570,240	649,838	1.1396
18 Nov 2014 19:38:16	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	544,320	618,450	1.1362
18 Nov 2014 10:50:05	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	518,400	587,359	1.1330
18 Nov 2014 02:38:02	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	492,480	558,047	1.1331
17 Nov 2014 18:15:32	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	466,560	528,329	1.1324
17 Nov 2014 09:58:26	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	440,640	498,637	1.1316
17 Nov 2014 01:50:46	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	414,720	469,527	1.1322
16 Nov 2014 18:19:52	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	388,800	440,441	1.1328
16 Nov 2014 09:51:32	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	362,880	412,421	1.1365
15 Nov 2014 12:00:34	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	336,960	380,188	1.1283
15 Nov 2014 04:14:29	1187505	17385948	hadcm3n_sayj_1940_40_009109766_1	311,040	352,466	1.1332
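
The Average (sec/TS) column above appears to be the cumulative CPU Time divided by the Timestep count, rounded to four decimal places. This is inferred from the listed values, not from any project documentation. A minimal sketch that recomputes it for three of the rows; the function name `seconds_per_timestep` is hypothetical and chosen only for illustration:

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the Average (sec/TS) column is derived.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table
trickles = [
    (803_520, 913_138, 1.1364),
    (777_600, 883_382, 1.1360),
    (311_040, 352_466, 1.1332),
]

for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    status = "match" if abs(computed - reported) < 1e-9 else "differs"
    print(f"TS {timestep:>7,}: computed {computed:.4f}, reported {reported:.4f} ({status})")
```

If the assumption holds, each computed value agrees with the reported one, which makes this a quick consistency check on a trickle listing.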