Task 15911844

Name	hadcm3n_3dwq_1940_40_008262388_3
Workunit	8417512
Created	14 Aug 2013, 11:30:30 UTC
Sent	15 Aug 2013, 7:10:17 UTC
Report deadline	14 Nov 2013, 14:37:28 UTC
Received	3 Oct 2013, 7:14:24 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1161168
Run time	8 days 3 hours 33 min 28 sec
CPU time	7 days 18 hours 29 min 8 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	3.23 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4988, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4988, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4616, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:03:39 (5204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
13:25:00 (4328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:43:28 (5844): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:43:29 (5844): No heartbeat from core client for 30 sec - exiting
08:43:30 (5844): No heartbeat from core client for 30 sec - exiting
08:43:31 (5844): No heartbeat from core client for 30 sec - exiting
08:43:33 (5844): No heartbeat from core client for 30 sec - exiting
08:43:34 (5844): No heartbeat from core client for 30 sec - exiting
08:43:35 (5844): No heartbeat from core client for 30 sec - exiting
08:43:36 (5844): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
02 Oct 2013 07:18:37	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	777,600	671,346	0.8634
01 Oct 2013 09:56:21	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	751,680	649,752	0.8644
29 Sep 2013 15:30:36	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	725,760	628,403	0.8659
28 Sep 2013 11:05:46	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	699,840	606,767	0.8670
26 Sep 2013 12:23:47	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	673,920	585,056	0.8681
25 Sep 2013 14:27:59	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	648,000	563,429	0.8695
25 Sep 2013 09:29:47	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	622,080	541,815	0.8710
24 Sep 2013 07:58:48	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	596,160	520,525	0.8731
23 Sep 2013 10:55:57	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	570,240	499,786	0.8764
19 Sep 2013 07:47:06	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	544,320	479,026	0.8800
17 Sep 2013 15:13:05	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	518,400	457,736	0.8830
16 Sep 2013 12:13:00	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	492,480	436,304	0.8859
15 Sep 2013 14:35:52	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	466,560	414,820	0.8891
12 Sep 2013 10:34:58	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	440,640	393,167	0.8923
11 Sep 2013 12:37:03	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	414,720	371,102	0.8948
09 Sep 2013 10:38:52	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	388,800	350,226	0.9008
07 Sep 2013 09:14:11	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	362,880	328,641	0.9056
06 Sep 2013 11:27:55	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	336,960	307,085	0.9113
05 Sep 2013 11:43:52	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	311,040	285,533	0.9180
04 Sep 2013 13:48:30	1161168	15911844	hadcm3n_3dwq_1940_40_008262388_3	285,120	263,916	0.9256
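
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU Time (sec) divided by the Timestep count, rounded to four decimal places. The page does not state this, so the formula is an assumption inferred from the rows. A minimal Python sketch that recomputes the value for three rows copied from the table (the function names are hypothetical, chosen for illustration):

```python
def parse_count(text: str) -> int:
    """Convert a comma-grouped figure such as '777,600' to an int."""
    return int(text.replace(",", ""))


def average_sec_per_timestep(cpu_time: str, timestep: str) -> float:
    """Assumed formula: cumulative CPU seconds / timesteps, to 4 decimals."""
    return round(parse_count(cpu_time) / parse_count(timestep), 4)


# (Timestep, CPU Time (sec)) pairs taken from the first, second, and last rows.
rows = [
    ("777,600", "671,346"),
    ("751,680", "649,752"),
    ("285,120", "263,916"),
]

for timestep, cpu_time in rows:
    average = average_sec_per_timestep(cpu_time, timestep)
    print(f"{timestep:>9}  {cpu_time:>9}  {average:.4f}")
```

For these three rows the recomputed averages agree with the values listed in the table. The average falls with each later trickle, which is consistent with it being a running mean over the whole task rather than a per-interval rate.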