Task 15285864

Name	hadcm3n_zesw_1880_40_008200487_3
Workunit	8355611
Created	14 Sep 2012, 21:42:51 UTC
Sent	14 Sep 2012, 21:43:15 UTC
Report deadline	15 Dec 2012, 5:10:26 UTC
Received	14 Oct 2012, 22:26:19 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1213506
Run time	15 days 14 hours 57 min 6 sec
CPU time	11 days 10 hours 28 min 8 sec
Validate state	Invalid
Credit	8,398.08
Device peak FLOPS	2.81 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3008, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:25:33 (3728): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5056, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>
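The stderr text above is a single flattened run of messages, so the restart history is easier to read once the recurring messages are tallied. The following is a minimal sketch, not a CPDN or BOINC tool: the function name `summarize_stderr` and the dictionary keys are hypothetical, and the sample string is only the tail of the log above, copied verbatim.

```python
def summarize_stderr(stderr_text: str) -> dict:
    """Tally recurring messages in a flattened CPDN stderr dump.

    The substrings searched for are taken from the log text itself;
    the key names are illustrative assumptions.
    """
    return {
        "quit_requests": stderr_text.count("Quit request from BOINC"),
        "suspend_requests": stderr_text.count("Suspend request from BOINC"),
        "model_crashes": stderr_text.count("Model crash detected"),
        "signal_22": stderr_text.count("Signal 22 received"),
        "gave_up": "too many model crashes" in stderr_text,
    }


# Tail of the stderr text, copied from the log above.
sample = (
    "Signal 22 received, exiting... Called boinc_finish "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=820, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

summary = summarize_stderr(sample)
print(summary)

# The exit status is shown in both decimal and hex; they are the same value.
assert int("0x16", 16) == 22
```

Running the same function on the full stderr text would give the total number of restart attempts before the monitor gave up.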

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
14 Oct 2012 13:42:04	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	699,840	980,414	1.4009
13 Oct 2012 18:36:11	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	673,920	944,521	1.4015
12 Oct 2012 23:45:57	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	648,000	907,949	1.4012
11 Oct 2012 17:43:30	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	622,080	871,286	1.4006
10 Oct 2012 23:53:56	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	596,160	835,170	1.4009
10 Oct 2012 07:20:59	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	570,240	798,487	1.4003
09 Oct 2012 17:18:14	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	544,320	761,588	1.3992
09 Oct 2012 00:23:45	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	518,400	726,204	1.4009
06 Oct 2012 13:53:43	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	492,480	689,942	1.4010
05 Oct 2012 18:41:06	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	466,560	653,591	1.4009
04 Oct 2012 15:54:43	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	440,640	616,978	1.4002
03 Oct 2012 18:42:05	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	414,720	583,591	1.4072
02 Oct 2012 21:03:58	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	388,800	545,458	1.4029
02 Oct 2012 06:31:32	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	362,880	508,904	1.4024
01 Oct 2012 07:55:13	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	336,960	472,889	1.4034
29 Sep 2012 03:29:44	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	311,040	436,682	1.4039
28 Sep 2012 15:02:37	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	285,120	400,589	1.4050
27 Sep 2012 21:18:24	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	259,200	364,548	1.4064
26 Sep 2012 17:44:38	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	233,280	328,132	1.4066
25 Sep 2012 02:31:38	1213506	15285864	hadcm3n_zesw_1880_40_008200487_3	207,360	291,399	1.4053
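
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against three rows copied from the table; the function name `sec_per_timestep` is hypothetical and is not part of any CPDN tooling.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg)
rows = [
    (699_840, 980_414, 1.4009),
    (673_920, 944_521, 1.4015),
    (207_360, 291_399, 1.4053),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by model timesteps completed."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # The table shows four decimal places, so compare at that precision.
    matches = round(computed, 4) == reported
    print(f"{timestep:>8} TS  computed={computed:.4f}  reported={reported:.4f}  match={matches}")

# Consecutive trickles in these rows are 25,920 timesteps apart.
step_between_trickles = rows[0][0] - rows[1][0]
```

The average is steady at about 1.40 sec/TS across the run, so the failure is not preceded by any slowdown visible in the trickle data.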