Task 13338223

Name	hadcm3n_o703_1900_40_007440690_1
Workunit	7638193
Created	5 Sep 2011, 18:21:06 UTC
Sent	5 Sep 2011, 18:45:15 UTC
Report deadline	6 Dec 2011, 2:12:26 UTC
Received	20 Nov 2011, 18:49:05 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1158390
Run time	11 days 17 hours 9 min 48 sec
CPU time	11 days 5 hours 25 min 16 sec
Validate state	Invalid
Credit	7,776.00
Device peak FLOPS	2.48 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 08:40:47 (3604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 20:12:34 (5016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 22:16:59 (4764): No heartbeat from core client for 30 sec - exiting 22:17:00 (4764): No heartbeat from core client for 30 sec - exiting 22:17:01 (4764): No heartbeat from core client for 30 sec - exiting 22:17:02 (4764): No heartbeat from core client for 30 sec - exiting 22:17:03 (4764): No heartbeat from core client for 30 sec - exiting 22:17:04 (4764): No heartbeat from core client for 30 sec - exiting 22:17:05 (4764): No heartbeat from core client for 30 sec - exiting 22:17:06 (4764): No heartbeat from core client for 30 sec - exiting 22:17:07 (4764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
07:22:13 (3536): No heartbeat from core client for 30 sec - exiting 07:22:14 (3536): No heartbeat from core client for 30 sec - exiting 07:22:15 (3536): No heartbeat from core client for 30 sec - exiting 07:22:16 (3536): No heartbeat from core client for 30 sec - exiting 07:22:17 (3536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4636, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1816, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1816, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>
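
The stderr text above arrives as one whitespace-flattened string, so the mix of message types is hard to read at a glance. A minimal sketch of tallying it by message type follows. The excerpt is a short sample copied from the log, not the whole dump, and the category names and regular expressions are assumptions made for illustration. They are not part of BOINC or the CPDN monitor.

```python
import re

# Short excerpt copied from the stderr text above (whitespace-flattened).
# It is a sample for illustration, not the full log.
EXCERPT = (
    "CPDN Monitor - Quit request from BOINC... "
    "08:40:47 (3604): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Signal 22 received, exiting... "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( "
)

# Hypothetical category names mapped to patterns that match the message
# wording seen in this log.
PATTERNS = {
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "suspend_request": r"CPDN Monitor - Suspend request from BOINC",
    "no_heartbeat": r"\(\d+\): No heartbeat from core client",
    "signal_22": r"Signal 22 received",
    "model_crash": r"Model crash detected",
    "too_many_crashes": r"too many model crashes",
}


def tally(text):
    """Count non-overlapping matches of each pattern in the text."""
    return {name: len(re.findall(pattern, text)) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    for name, count in tally(EXCERPT).items():
        print(f"{name}: {count}")
```

Passing the full stderr string to `tally` instead of `EXCERPT` would give per-category counts for the whole run.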

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
18 Nov 2011 05:03:11	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	648,000	953,652	1.4717
17 Nov 2011 05:12:24	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	622,080	915,751	1.4721
17 Nov 2011 05:12:24	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	596,160	877,715	1.4723
09 Nov 2011 23:27:43	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	570,240	840,057	1.4732
09 Nov 2011 06:41:39	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	544,320	802,181	1.4737
05 Nov 2011 20:07:20	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	518,400	764,251	1.4742
03 Nov 2011 02:58:37	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	492,480	726,213	1.4746
03 Nov 2011 02:58:37	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	466,560	688,640	1.4760
16 Oct 2011 01:26:00	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	440,640	650,793	1.4769
14 Oct 2011 18:01:05	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	414,720	611,961	1.4756
13 Oct 2011 20:53:20	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	388,800	573,167	1.4742
12 Oct 2011 02:19:21	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	362,880	534,457	1.4728
11 Oct 2011 12:39:02	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	336,960	495,933	1.4718
09 Oct 2011 12:19:47	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	311,040	456,952	1.4691
09 Oct 2011 02:51:09	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	285,120	418,649	1.4683
06 Oct 2011 15:46:51	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	259,200	380,169	1.4667
06 Oct 2011 04:49:14	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	233,280	341,737	1.4649
03 Oct 2011 05:34:10	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	207,360	304,119	1.4666
02 Oct 2011 19:14:00	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	181,440	266,482	1.4687
21 Sep 2011 14:10:48	1158390	13338223	hadcm3n_o703_1900_40_007440690_1	155,520	228,417	1.4687
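
The Average (sec/TS) column appears to be CPU Time (sec) divided by Timestep, rounded to four decimal places. That relationship is inferred from the column names and the figures, not stated on the page. A minimal sketch that checks it against a few rows copied from the table:

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_as_shown)
ROWS = [
    (648_000, 953_652, "1.4717"),
    (622_080, 915_751, "1.4721"),
    (207_360, 304_119, "1.4666"),
    (155_520, 228_417, "1.4687"),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


def matches_reported(cpu_time_sec, timestep, reported):
    """Compare the computed average, formatted to 4 decimals, with the table value."""
    return f"{sec_per_timestep(cpu_time_sec, timestep):.4f}" == reported


if __name__ == "__main__":
    for timestep, cpu, reported in ROWS:
        computed = f"{sec_per_timestep(cpu, timestep):.4f}"
        print(timestep, computed, reported, matches_reported(cpu, timestep, reported))
```

Consecutive rows in the table differ by 25,920 timesteps, so the same check can be run over every row if the full table is loaded.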