Task 15488858

Name	hadcm3n_3n3u_1940_40_008260802_0
Workunit	8415926
Created	20 Dec 2012, 19:38:32 UTC
Sent	20 Dec 2012, 19:40:41 UTC
Report deadline	22 Mar 2013, 3:07:52 UTC
Received	9 Jan 2013, 17:06:03 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1183081
Run time	9 days 0 hours 48 min 59 sec
CPU time	8 days 2 hours 45 min 26 sec
Validate state	Invalid
Credit	5,909.76
Device peak FLOPS	3.05 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
12:17:34 (440): No heartbeat from core client for 30 sec - exiting
12:17:35 (440): No heartbeat from core client for 30 sec - exiting
12:17:36 (440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:12:24 (3128): No heartbeat from core client for 30 sec - exiting
09:12:25 (3128): No heartbeat from core client for 30 sec - exiting
09:12:26 (3128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3996, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3016, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2484, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2856, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=932, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=932, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=932, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1468, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1468, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1468, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1468, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
07 Jan 2013 21:59:20	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	492,480	676,664	1.3740
06 Jan 2013 19:07:56	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	466,560	642,012	1.3761
05 Jan 2013 21:02:07	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	440,640	608,082	1.3800
04 Jan 2013 22:15:59	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	414,720	573,207	1.3822
03 Jan 2013 15:30:54	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	388,800	535,278	1.3767
02 Jan 2013 18:42:51	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	362,880	498,745	1.3744
01 Jan 2013 15:30:56	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	336,960	464,375	1.3781
31 Dec 2012 17:46:41	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	311,040	429,559	1.3810
30 Dec 2012 21:25:26	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	285,120	395,085	1.3857
30 Dec 2012 10:19:51	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	259,200	360,290	1.3900
29 Dec 2012 14:52:02	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	233,280	325,588	1.3957
28 Dec 2012 16:56:20	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	207,360	287,155	1.3848
27 Dec 2012 19:38:43	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	181,440	249,046	1.3726
26 Dec 2012 22:47:11	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	155,520	214,380	1.3785
26 Dec 2012 12:44:46	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	129,600	180,047	1.3893
25 Dec 2012 16:31:10	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	103,680	145,494	1.4033
24 Dec 2012 17:01:20	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	77,760	111,162	1.4296
22 Dec 2012 21:40:15	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	51,840	74,984	1.4465
21 Dec 2012 23:22:10	1183081	15488858	hadcm3n_3n3u_1940_40_008260802_0	25,920	34,520	1.3318
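
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. That relationship is inferred from the rows above, not stated by the project, so treat the following as a minimal sketch under that assumption. The function name `sec_per_timestep` and the selection of sample rows are illustrative only.

```python
# Sample rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
rows = [
    (492480, 676664, 1.3740),
    (259200, 360290, 1.3900),
    (25920, 34520, 1.3318),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare against the reported value at four-decimal precision.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>7}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

Each trickle in the table is 25,920 timesteps after the previous one, so the same check can be repeated for any other row by copying its timestep and CPU time values.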