Task 15728354

Name	hadcm3n_3dib_1980_40_008322748_1
Workunit	8473883
Created	17 Apr 2013, 17:06:08 UTC
Sent	17 Apr 2013, 17:06:54 UTC
Report deadline	18 Jul 2013, 0:34:05 UTC
Received	27 Apr 2013, 10:29:20 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1124571
Run time	8 days 12 hours 26 min 45 sec
CPU time	8 days 7 hours 44 min 51 sec
Validate state	Invalid
Credit	2,177.28
Device peak FLOPS	1.94 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:51:02 (18896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:51:03 (18896): No heartbeat from core client for 30 sec - exiting
12:51:04 (18896): No heartbeat from core client for 30 sec - exiting
12:51:05 (18896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
13:00:23 (2200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
15:08:54 (4932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
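
The stderr output above repeats a small set of messages, so a quick tally can make the failure pattern easier to read. The following is a minimal sketch, not part of BOINC or CPDN: the function name `summarize_stderr` and the dictionary keys are hypothetical, and the sample text is a short excerpt of the message types seen in this log rather than the full output.

```python
import re

def summarize_stderr(text: str) -> dict:
    """Tally the recurring message types in a CPDN stderr dump (hypothetical helper)."""
    # Timestamped heartbeat lines look like "13:00:23 (2200): No heartbeat from core client ..."
    heartbeat = re.compile(r"\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client")
    return {
        "signal_11": text.count("Signal 11 received"),
        "restarts": text.count("Model crash detected, will try to restart"),
        "missed_heartbeats": len(heartbeat.findall(text)),
        "gave_up": "too many model crashes" in text,
    }

# Short excerpt using message formats copied from the log above.
sample = (
    "13:00:23 (2200): No heartbeat from core client for 30 sec - exiting\n"
    "Signal 11 received, exiting...\n"
    "Model crash detected, will try to restart...\n"
    "Sorry, too many model crashes! :-(\n"
)
print(summarize_stderr(sample))
```

Running the same function over the full stderr text would show how many restart attempts preceded the final "too many model crashes" message.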

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
26 Apr 2013 01:05:28	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	181,440	674,482	3.7174
24 Apr 2013 21:17:36	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	155,520	577,516	3.7135
23 Apr 2013 17:46:45	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	129,600	481,071	3.7120
22 Apr 2013 14:15:33	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	103,680	384,339	3.7070
21 Apr 2013 10:42:39	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	77,760	287,751	3.7005
20 Apr 2013 07:43:37	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	51,840	192,315	3.7098
18 Apr 2013 20:23:37	1124571	15728354	hadcm3n_3dib_1980_40_008322748_1	25,920	96,257	3.7136
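
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That relationship is inferred from the numbers in the table, not stated by the page, so treat this as a sketch under that assumption; `seconds_per_timestep` is a hypothetical helper name.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals (assumed formula)."""
    return round(cpu_time_sec / timestep, 4)

# (timestep, CPU time in seconds) pairs copied from the first and last trickle rows.
trickles = [
    (181_440, 674_482),
    (25_920, 96_257),
]

for timestep, cpu_time in trickles:
    print(timestep, seconds_per_timestep(cpu_time, timestep))
```

Both rows come out within rounding of the values listed in the table (3.7174 and 3.7136).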