Task 13567178

Name	hadcm3n_yjnk_1900_40_007525519_4
Workunit	7722994
Created	30 Oct 2011, 6:19:26 UTC
Sent	30 Oct 2011, 6:32:32 UTC
Report deadline	29 Jan 2012, 13:59:43 UTC
Received	22 Nov 2011, 20:19:42 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1051474
Run time	16 days 8 hours 26 min 21 sec
CPU time	14 days 14 hours 51 min 5 sec
Validate state	Invalid
Credit	8,398.08
Device peak FLOPS	2.68 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12464, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:37:45 (5236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
22 Nov 2011 00:56:55	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	699,840	1,240,628	1.7727
21 Nov 2011 11:14:09	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	673,920	1,197,111	1.7763
20 Nov 2011 20:40:07	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	648,000	1,152,102	1.7779
20 Nov 2011 04:57:01	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	622,080	1,107,585	1.7805
19 Nov 2011 14:50:22	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	596,160	1,064,299	1.7853
19 Nov 2011 00:01:18	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	570,240	1,019,721	1.7882
18 Nov 2011 09:54:28	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	544,320	973,942	1.7893
17 Nov 2011 19:39:27	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	518,400	927,019	1.7882
17 Nov 2011 05:42:32	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	492,480	880,591	1.7881
16 Nov 2011 15:27:11	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	466,560	834,462	1.7885
16 Nov 2011 01:21:37	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	440,640	788,348	1.7891
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	414,720	743,330	1.7924
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	388,800	695,199	1.7881
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	362,880	643,417	1.7731
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	336,960	598,859	1.7772
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	311,040	555,121	1.7847
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	285,120	511,092	1.7926
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	259,200	466,703	1.8006
15 Nov 2011 17:58:38	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	233,280	422,870	1.8127
09 Nov 2011 21:51:24	1051474	13567178	hadcm3n_yjnk_1900_40_007525519_4	207,360	378,992	1.8277
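
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above. The function name sec_per_timestep is hypothetical and used only for illustration, and the rounding rule is an assumption inferred from the listed values, not something the page states.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, listed_average).
ROWS = [
    (699_840, 1_240_628, 1.7727),  # 22 Nov 2011 00:56:55
    (673_920, 1_197_111, 1.7763),  # 21 Nov 2011 11:14:09
    (207_360, 378_992, 1.8277),    # 09 Nov 2011 21:51:24
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the Average column is derived.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, listed in ROWS:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    status = "matches" if computed == listed else "differs"
    print(f"timestep {timestep}: computed {computed}, listed {listed} ({status})")
```

Because the average is cumulative, its decline from 1.8277 to 1.7727 over the listed trickles suggests that later timesteps ran slightly faster than earlier ones.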