Task 10970269

Name	hadsm3dhet2_jlhw_006590854_2
Workunit	6794227
Created	15 Mar 2010, 11:55:17 UTC
Sent	19 Oct 2010, 21:05:04 UTC
Report deadline	2 Oct 2011, 2:25:04 UTC
Received	19 Nov 2010, 18:15:43 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1020107
Run time
CPU time	3 days 9 hours 16 min 22 sec
Validate state	Invalid
Credit	2,183.35
Device peak FLOPS	2.43 GFLOPS
Application version	UK Met Office HadSM3 Slab Model v6.07 windows_intelx86
Stderr	<core_client_version>6.2.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=10200, selfPID=10200, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=1664, selfPID=1664, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=776, selfPID=776, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=8024, selfPID=8024, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=10048, selfPID=10048, iMonCtr=1 CPDN process is not running, exiting, bRetVal = 1, checkPID=564, selfPID=564, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=1860, selfPID=1860, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=9632, selfPID=9632, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=648, selfPID=648, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CCPDN process is not running, exiting, bRetVal = 1, checkPID=8112, selfPID=8112, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! 
:-( cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) cpdnmonitor: cannot open input file C:\Documents and Settings\All Users\Application Data\BOINC/projects/climateprediction.net/hadsm3dhet2_jlhw_006590854/dataout/restart.day Model crashed: (null) Sorry, too many model crashes! :-( called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 Nov 2010 15:02:37	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	237,644	287,693	1.2106
10 Nov 2010 20:46:11	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	226,842	274,384	1.2096
09 Nov 2010 23:47:12	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	216,040	261,036	1.2083
08 Nov 2010 11:20:35	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	205,238	247,928	1.2080
07 Nov 2010 16:17:25	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	194,436	234,637	1.2068
07 Nov 2010 04:22:27	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	183,634	221,615	1.2068
06 Nov 2010 14:40:36	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	172,832	208,460	1.2061
03 Nov 2010 14:59:14	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	162,030	195,707	1.2078
03 Nov 2010 05:18:40	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	151,228	183,104	1.2108
02 Nov 2010 09:31:15	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	140,426	169,861	1.2096
01 Nov 2010 23:44:55	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	129,624	156,828	1.2099
01 Nov 2010 15:30:00	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	118,822	143,722	1.2096
31 Oct 2010 19:30:30	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	108,020	130,960	1.2124
28 Oct 2010 08:38:57	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	97,218	117,811	1.2118
27 Oct 2010 12:11:56	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	86,416	104,879	1.2137
27 Oct 2010 03:37:20	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	75,614	91,844	1.2146
26 Oct 2010 17:18:38	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	64,812	78,705	1.2144
26 Oct 2010 04:27:03	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	54,010	65,257	1.2082
23 Oct 2010 04:12:53	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	43,208	52,257	1.2094
21 Oct 2010 21:32:40	1020107	10970269	hadsm3dhet2_jlhw_006590854_2	32,406	38,970	1.2026
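
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page does not state this; it is inferred from the figures above. A minimal sketch that recomputes the column for a few rows copied from the table (the function name `seconds_per_timestep` is hypothetical, not part of BOINC or CPDN):

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, average_reported_by_the_page)
trickles = [
    (237_644, 287_693, 1.2106),
    (226_842, 274_384, 1.2096),
    (43_208, 52_257, 1.2094),
    (32_406, 38_970, 1.2026),
]


def seconds_per_timestep(timestep, cpu_time_sec):
    """Assumed formula: cumulative CPU seconds / timesteps, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(timestep, cpu_time_sec)
    # Flag any row where the recomputed value differs from the reported one.
    status = "ok" if abs(computed - reported) < 1e-9 else "MISMATCH"
    print(f"{timestep:>8} {cpu_time_sec:>8} {computed:.4f} {reported:.4f} {status}")
```

If the assumption holds, every row prints "ok". Consecutive rows in the table are also 10,802 timesteps apart, so the same check can be repeated for any row.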