Task 13823968

Name	hadcm3n_yhzz_1940_40_007607811_4
Workunit	7785941
Created	27 Dec 2011, 14:10:04 UTC
Sent	27 Dec 2011, 14:10:29 UTC
Report deadline	27 Mar 2012, 21:37:40 UTC
Received	3 Jan 2012, 11:51:22 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1305473
Run time	6 days 2 hours 26 min 31 sec
CPU time	5 days 5 hours 27 min 7 sec
Validate state	Invalid
Credit	4,354.56
Device peak FLOPS	3.55 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=1
Model crash detected, will try to restart...
14:52:43 (6804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:38:58 (5512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7348, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8088, iMonCtr=1
Model crash detected, will try to restart...
15:00:11 (6604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:23:08 (8480): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:23:09 (8480): No heartbeat from core client for 30 sec - exiting
Model crashed: TEMPHIST: Failed in OPEN of history file tmp/pipe_dummy 2048
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
02 Jan 2012 23:13:47	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	362,880	443,595	1.2224
02 Jan 2012 13:20:58	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	336,960	434,270	1.2888
02 Jan 2012 06:19:17	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	311,040	408,832	1.3144
01 Jan 2012 22:12:15	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	285,120	381,739	1.3389
01 Jan 2012 16:09:01	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	259,200	356,741	1.3763
01 Jan 2012 06:01:49	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	233,280	323,393	1.3863
31 Dec 2011 19:06:48	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	207,360	285,857	1.3786
31 Dec 2011 08:01:15	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	181,440	248,291	1.3684
30 Dec 2011 21:53:12	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	155,520	210,730	1.3550
30 Dec 2011 11:06:44	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	129,600	173,072	1.3354
29 Dec 2011 19:13:39	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	103,680	135,863	1.3104
29 Dec 2011 02:12:00	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	77,760	99,427	1.2786
28 Dec 2011 14:26:09	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	51,840	66,167	1.2764
28 Dec 2011 05:09:00	1076549	13823968	hadcm3n_yhzz_1940_40_007607811_4	25,920	34,192	1.3191
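
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above. The function name `avg_sec_per_ts` and the `rows` list are illustrative assumptions, not part of the project's own tooling.

```python
# Assumption: Average (sec/TS) = cumulative CPU seconds / cumulative timesteps,
# rounded to 4 decimal places. Rows are copied from the trickle table above.
rows = [
    # (timestep, cpu_time_sec, reported_average)
    (25_920, 34_192, 1.3191),
    (51_840, 66_167, 1.2764),
    (362_880, 443_595, 1.2224),
]


def avg_sec_per_ts(timestep: int, cpu_time_sec: int) -> float:
    """Return CPU seconds per model timestep, rounded as in the table."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_sec, reported in rows:
    computed = avg_sec_per_ts(timestep, cpu_sec)
    print(f"{timestep:>7} TS: computed {computed:.4f}, reported {reported:.4f}")
```

All three computed values agree with the reported column, which is consistent with the average being a running figure over the whole task rather than a per-trickle rate.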