Task 13337425

Name	hadcm3n_o58y_1900_40_007440292_0
Workunit	7637795
Created	5 Sep 2011, 18:19:41 UTC
Sent	5 Sep 2011, 21:27:00 UTC
Report deadline	6 Dec 2011, 4:54:11 UTC
Received	23 Nov 2011, 21:51:44 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1080897
Run time	19 days 14 hours 28 min 50 sec
CPU time	19 days 14 hours 28 min 50 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.34 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2120, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
12:07:53 (2120): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2644, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2332, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=384, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2800, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3212, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4396, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2840, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2572, iMonCtr=1
Model crash detected, will try to restart...
19:49:48 (3244): No heartbeat from core client for 30 sec - exiting
19:49:51 (3244): No heartbeat from core client for 30 sec - exiting
19:49:53 (3244): No heartbeat from core client for 30 sec - exiting
19:49:54 (3244): No heartbeat from core client for 30 sec - exiting
19:49:55 (3244): No heartbeat from core client for 30 sec - exiting
19:49:56 (3244): No heartbeat from core client for 30 sec - exiting
19:49:57 (3244): No heartbeat from core client for 30 sec - exiting
19:49:58 (3244): No heartbeat from core client for 30 sec - exiting
19:49:59 (3244): No heartbeat from core client for 30 sec - exiting
19:50:00 (3244): No heartbeat from core client for 30 sec - exiting
19:50:01 (3244): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:09:28 (3056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:10:20 (3056): No heartbeat from core client for 30 sec - exiting
15:35:30 (3000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:20:10 (2928): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
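
As a rough consistency check on the Run time and CPU time fields above, the duration string can be converted to seconds and compared with the CPU Time (sec) column of the most recent trickle. This is a minimal sketch in Python; the function name `duration_to_seconds` and the parsing pattern are illustrative assumptions and are not part of BOINC or the CPDN software.

```python
import re

# Seconds per unit, keyed by the unit stems that appear in the Run time field.
_UNIT_SECONDS = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}


def duration_to_seconds(text: str) -> int:
    """Convert a string such as '19 days 14 hours 28 min 50 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * _UNIT_SECONDS[unit]
    return total


run_time = duration_to_seconds("19 days 14 hours 28 min 50 sec")
print(run_time)  # 1693730
```

The result, 1,693,730 seconds, is within a few seconds of the 1,693,724 seconds reported by the latest trickle, which suggests the task failed shortly after that trickle was sent.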

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Nov 2011 21:53:19	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	777,600	1,693,724	2.1781
20 Nov 2011 23:36:12	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	751,680	1,637,575	2.1786
20 Nov 2011 07:47:35	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	725,760	1,580,950	2.1783
19 Nov 2011 02:22:02	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	699,840	1,524,248	2.1780
18 Nov 2011 03:07:42	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	673,920	1,467,306	2.1773
15 Nov 2011 23:20:33	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	648,000	1,409,633	2.1754
15 Nov 2011 20:38:32	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	622,080	1,351,547	2.1726
09 Nov 2011 00:21:42	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	596,160	1,294,734	2.1718
06 Nov 2011 20:48:57	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	570,240	1,238,859	2.1725
05 Nov 2011 18:32:03	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	544,320	1,182,213	2.1719
03 Nov 2011 22:11:58	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	518,400	1,124,494	2.1692
31 Oct 2011 21:00:40	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	492,480	1,067,290	2.1672
31 Oct 2011 18:52:39	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	466,560	1,011,947	2.1690
31 Oct 2011 18:14:17	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	440,640	955,809	2.1691
31 Oct 2011 15:29:56	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	414,720	900,392	2.1711
31 Oct 2011 15:26:09	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	388,800	844,029	2.1709
31 Oct 2011 15:26:08	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	362,880	787,304	2.1696
13 Oct 2011 21:34:11	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	336,960	731,263	2.1702
10 Oct 2011 15:58:24	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	311,040	675,652	2.1722
05 Oct 2011 00:52:07	1080897	13337425	hadcm3n_o58y_1900_40_007440292_0	285,120	618,691	2.1699
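
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the helper name `seconds_per_timestep` is hypothetical, and the formula is inferred from the figures rather than taken from CPDN documentation.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) taken from the trickle table.
rows = [
    (777_600, 1_693_724, 2.1781),
    (751_680, 1_637_575, 2.1786),
    (285_120, 618_691, 2.1699),
]

for timestep, cpu_time, reported in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(timestep, computed, reported)
```

Consecutive trickles in the table are 25,920 timesteps apart, so the difference in CPU time between two adjacent rows divided by 25,920 gives the per-interval rate rather than the cumulative one.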