Task 14840968

Name	hadcm3n_o31x_2100_40_008026029_1
Workunit	8181143
Created	25 Jun 2012, 2:53:13 UTC
Sent	25 Jun 2012, 2:54:11 UTC
Report deadline	24 Sep 2012, 10:21:22 UTC
Received	6 Aug 2012, 19:58:45 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1145580
Run time	13 days 0 hours 45 min 59 sec
CPU time	11 days 5 hours 23 min 25 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.54 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is notSuspended CPDN Monitor - Suspend request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2384, selfPID=2384, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1596, iMonCtr=1
Model crash detected, will try to restart...
16:40:01 (2268): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
18:21:36 (2856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3736, selfPID=3736, iMonCtr=1
Suspended CPDN Monitor - Suspend request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1000, selfPID=1000, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3696, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2884, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2740, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2036, iMonCtr=1
Model crash detected, will try to restart...
15:11:21 (2588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
01:18:21 (1160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1308, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
09:22:26 (4028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2892, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3956, iMonCtr=1
Model crash detected, will try to restart...
C03:30:13 (3344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
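
The stderr text above repeats a handful of message types (model restarts, lost heartbeats, suspend and quit requests, the final signal). A minimal Python sketch for tallying them is given below. The label names in `PATTERNS` and the helper `count_events` are hypothetical, chosen only for illustration, and `EXCERPT` is a short set of fragments copied from the log rather than the full output.

```python
import re
from collections import Counter

# A few message fragments copied from the stderr text above (not the full log).
EXCERPT = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=3632, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "16:40:01 (2268): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "Signal 11 received, exiting... Called boinc_finish"
)

# Hypothetical labels mapped to literal substrings that appear in the log.
PATTERNS = {
    "model_crash": r"Model crash detected",
    "heartbeat_lost": r"No heartbeat from core client",
    "suspend_request": r"Suspend request from BOINC",
    "quit_request": r"Quit request from BOINC",
    "fatal_signal": r"Signal \d+ received",
}


def count_events(text):
    """Count non-overlapping occurrences of each pattern in the text."""
    return Counter(
        {label: len(re.findall(pattern, text)) for label, pattern in PATTERNS.items()}
    )


if __name__ == "__main__":
    for label, n in count_events(EXCERPT).items():
        print(f"{label}: {n}")
```

Passing the full contents of `<stderr_txt>` instead of `EXCERPT` would give per-task totals. The patterns match substrings, so they still count messages whose start was garbled by interleaved output (for example the `CSuspended` line).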

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Aug 2012 19:00:33	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	518,400	969,799	1.8708
04 Aug 2012 05:30:06	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	492,480	921,543	1.8712
29 Jul 2012 02:18:16	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	466,560	873,776	1.8728
28 Jul 2012 00:12:26	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	440,640	825,300	1.8730
26 Jul 2012 20:01:57	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	414,720	777,239	1.8741
25 Jul 2012 18:40:19	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	388,800	729,738	1.8769
24 Jul 2012 13:55:18	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	362,880	680,404	1.8750
23 Jul 2012 04:36:49	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	336,960	631,085	1.8729
20 Jul 2012 00:15:22	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	311,040	582,162	1.8717
18 Jul 2012 21:33:45	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	285,120	533,312	1.8705
17 Jul 2012 00:33:37	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	259,200	484,319	1.8685
15 Jul 2012 02:16:07	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	233,280	435,315	1.8661
12 Jul 2012 01:50:19	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	207,360	386,171	1.8623
06 Jul 2012 19:57:31	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	181,440	337,593	1.8606
05 Jul 2012 01:32:08	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	155,520	288,871	1.8575
03 Jul 2012 20:08:44	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	129,600	240,658	1.8569
02 Jul 2012 17:26:59	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	103,680	191,806	1.8500
30 Jun 2012 03:32:39	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	77,760	143,847	1.8499
27 Jun 2012 18:03:10	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	51,840	95,674	1.8456
26 Jun 2012 04:40:50	1145580	14840968	hadcm3n_o31x_2100_40_008026029_1	25,920	47,805	1.8443
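
The "Average (sec/TS)" column appears to be cumulative CPU time divided by the timestep count, and the header's CPU time can be cross-checked against the last trickle. The Python sketch below works this through for two rows taken from the table. The helper names (`parse_int`, `sec_per_timestep`, `duration_to_seconds`) are hypothetical, and rounding to four decimals is an assumption based on how the column is displayed.

```python
def parse_int(field):
    """Parse a thousands-separated integer such as '518,400'."""
    return int(field.replace(",", ""))


def sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep, rounded as in the table."""
    return round(parse_int(cpu_seconds) / parse_int(timestep), 4)


def duration_to_seconds(days, hours, minutes, seconds):
    """Convert a 'D days H hours M min S sec' duration to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (Timestep, CPU Time (sec), Average (sec/TS)) for the newest and oldest trickles.
ROWS = [
    ("518,400", "969,799", "1.8708"),
    ("25,920", "47,805", "1.8443"),
]

if __name__ == "__main__":
    for timestep, cpu, listed in ROWS:
        print(timestep, sec_per_timestep(timestep, cpu), "listed:", listed)

    # Header field: CPU time 11 days 5 hours 23 min 25 sec.
    header_cpu = duration_to_seconds(11, 5, 23, 25)
    print("header CPU seconds:", header_cpu)
    print("difference from last trickle:", header_cpu - parse_int("969,799"))
```

Under these assumptions both rows reproduce the listed averages, and the header CPU time comes to 969,805 seconds, six seconds more than the 969,799 seconds reported in the final trickle.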