Task 15064769

Name	hadcm3n_1066_1940_40_007955406_2
Workunit	8110518
Created	2 Aug 2012, 1:21:18 UTC
Sent	2 Aug 2012, 1:21:23 UTC
Report deadline	1 Nov 2012, 8:48:34 UTC
Received	23 Aug 2012, 13:07:29 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	869571
Run time	16 days 21 hours 7 min 17 sec
CPU time	16 days 21 hours 7 min 17 sec
Validate state	Invalid
Credit	9,020.16
Device peak FLOPS	2.53 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7524, iMonCtr=1
Model crash detected, will try to restart...
07:12:05 (4476): No heartbeat from core client for 30 sec - exiting
07:12:06 (4476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4052, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3112, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3112, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3112, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3112, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
18 Aug 2012 12:29:45	869571	15064769	hadcm3n_1066_1940_40_007955406_2	751,680	1,437,124	1.9119
17 Aug 2012 23:22:19	869571	15064769	hadcm3n_1066_1940_40_007955406_2	725,760	1,387,949	1.9124
17 Aug 2012 10:10:09	869571	15064769	hadcm3n_1066_1940_40_007955406_2	699,840	1,338,838	1.9131
16 Aug 2012 22:02:27	869571	15064769	hadcm3n_1066_1940_40_007955406_2	673,920	1,291,661	1.9166
16 Aug 2012 08:45:58	869571	15064769	hadcm3n_1066_1940_40_007955406_2	648,000	1,245,177	1.9216
15 Aug 2012 19:16:14	869571	15064769	hadcm3n_1066_1940_40_007955406_2	622,080	1,197,736	1.9254
15 Aug 2012 02:38:48	869571	15064769	hadcm3n_1066_1940_40_007955406_2	596,160	1,150,851	1.9304
14 Aug 2012 12:57:22	869571	15064769	hadcm3n_1066_1940_40_007955406_2	570,240	1,101,141	1.9310
13 Aug 2012 23:29:24	869571	15064769	hadcm3n_1066_1940_40_007955406_2	544,320	1,051,048	1.9309
13 Aug 2012 10:04:38	869571	15064769	hadcm3n_1066_1940_40_007955406_2	518,400	1,001,385	1.9317
12 Aug 2012 20:36:08	869571	15064769	hadcm3n_1066_1940_40_007955406_2	492,480	951,239	1.9315
12 Aug 2012 07:10:09	869571	15064769	hadcm3n_1066_1940_40_007955406_2	466,560	901,315	1.9318
11 Aug 2012 17:52:06	869571	15064769	hadcm3n_1066_1940_40_007955406_2	440,640	851,430	1.9323
11 Aug 2012 04:26:09	869571	15064769	hadcm3n_1066_1940_40_007955406_2	414,720	801,244	1.9320
10 Aug 2012 15:02:52	869571	15064769	hadcm3n_1066_1940_40_007955406_2	388,800	751,766	1.9336
10 Aug 2012 01:45:35	869571	15064769	hadcm3n_1066_1940_40_007955406_2	362,880	702,316	1.9354
09 Aug 2012 12:27:03	869571	15064769	hadcm3n_1066_1940_40_007955406_2	336,960	652,267	1.9357
08 Aug 2012 23:14:56	869571	15064769	hadcm3n_1066_1940_40_007955406_2	311,040	604,008	1.9419
08 Aug 2012 12:16:57	869571	15064769	hadcm3n_1066_1940_40_007955406_2	285,120	556,856	1.9531
07 Aug 2012 21:09:37	869571	15064769	hadcm3n_1066_1940_40_007955406_2	259,200	506,686	1.9548
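
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against the first and last rows of the trickle table; the function name `seconds_per_timestep` and the four-decimal rounding are illustrative choices, not part of the project's own tooling.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals
    to match the precision shown in the trickle table (assumed)."""
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds) copied from the newest and oldest rows above.
rows = [
    (751_680, 1_437_124),  # 18 Aug 2012 12:29:45
    (259_200, 506_686),    # 07 Aug 2012 21:09:37
]

for timesteps, cpu_seconds in rows:
    print(timesteps, seconds_per_timestep(cpu_seconds, timesteps))
```

For these two rows the computed values are 1.9119 and 1.9548, which agree with the table.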