Task 15880711

Name	hadcm3n_3lys_1940_40_008260381_3
Workunit	8415505
Created	4 Jul 2013, 12:20:12 UTC
Sent	4 Jul 2013, 16:12:08 UTC
Report deadline	3 Oct 2013, 23:39:19 UTC
Received	8 Sep 2013, 11:31:29 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1125445
Run time	13 days 12 hours 26 min 54 sec
CPU time	12 days 0 hours 12 min 9 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.69 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
08:52:50 (4800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5496, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:37:20 (7376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:37:21 (7376): No heartbeat from core client for 30 sec - exiting
08:37:22 (7376): No heartbeat from core client for 30 sec - exiting
08:37:24 (7376): No heartbeat from core client for 30 sec - exiting
08:37:25 (7376): No heartbeat from core client for 30 sec - exiting
08:37:26 (7376): No heartbeat from core client for 30 sec - exiting
08:37:27 (7376): No heartbeat from core client for 30 sec - exiting
08:37:28 (7376): No heartbeat from core client for 30 sec - exiting
08:37:29 (7376): No heartbeat from core client for 30 sec - exiting
08:37:30 (7376): No heartbeat from core client for 30 sec - exiting
08:37:31 (7376): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...

Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5080, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=436, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
17:48:37 (5168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:48:38 (5168): No heartbeat from core client for 30 sec - exiting
17:48:39 (5168): No heartbeat from core client for 30 sec - exiting
17:48:40 (5168): No heartbeat from core client for 30 sec - exiting
17:48:41 (5168): No heartbeat from core client for 30 sec - exiting
17:48:42 (5168): No heartbeat from core client for 30 sec - exiting
17:48:43 (5168): No heartbeat from core client for 30 sec - exiting
17:48:44 (5168): No heartbeat from core client for 30 sec - exiting
17:48:45 (5168): No heartbeat from core client for 30 sec - exiting
17:48:46 (5168): No heartbeat from core client for 30 sec - exiting
17:48:47 (5168): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1560, iMonCtr=1
Model crash detected, will try to restart...
Ocean Restart file copy failed on 3lysko.daf6co0
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...

Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6092, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
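
The Run time, CPU time and Exit status fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the BOINC or CPDN software: `parse_duration` is a hypothetical helper that assumes durations are always written in the "N days N hours N min N sec" form shown on this page.

```python
import re

# Seconds per unit, keyed by the singular unit name used in the regex below.
SECONDS_PER_UNIT = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '13 days 12 hours 26 min 54 sec' to seconds.

    Hypothetical helper: assumes the unit spellings used on this page.
    """
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(day|hour|min|sec)s?", text):
        total += int(value) * SECONDS_PER_UNIT[unit]
    return total


run_time = parse_duration("13 days 12 hours 26 min 54 sec")  # Run time field
cpu_time = parse_duration("12 days 0 hours 12 min 9 sec")    # CPU time field

# Fraction of wall-clock run time that was spent on the CPU.
cpu_fraction = cpu_time / run_time

# The exit status is shown in decimal and in hexadecimal; both are the same value.
exit_status = int("0x000000C1", 16)

print(run_time, cpu_time, round(cpu_fraction, 3), exit_status)
```

The CPU time computed this way (1,037,529 sec) is within a few seconds of the CPU time reported in the most recent trickle, which suggests the task failed shortly after that trickle was sent.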

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
08 Sep 2013 10:32:43	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	518,400	1,037,520	2.0014
01 Sep 2013 08:13:53	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	492,480	986,333	2.0028
27 Aug 2013 06:55:18	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	466,560	934,249	2.0024
25 Aug 2013 17:44:56	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	440,640	882,362	2.0025
24 Aug 2013 07:02:46	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	414,720	830,223	2.0019
18 Aug 2013 04:58:55	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	388,800	778,102	2.0013
16 Aug 2013 05:58:52	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	362,880	726,189	2.0012
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	336,960	674,739	2.0024
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	311,040	623,513	2.0046
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	285,120	571,741	2.0053
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	259,200	519,673	2.0049
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	233,280	468,183	2.0070
15 Aug 2013 05:49:31	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	207,360	416,733	2.0097
30 Jul 2013 09:21:04	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	181,440	364,167	2.0071
30 Jul 2013 09:21:03	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	155,520	309,655	1.9911
23 Jul 2013 20:49:38	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	129,600	257,969	1.9905
23 Jul 2013 19:24:14	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	103,680	205,950	1.9864
23 Jul 2013 19:19:34	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	77,760	154,135	1.9822
23 Jul 2013 19:19:34	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	51,840	102,549	1.9782
07 Jul 2013 09:01:03	1125445	15880711	hadcm3n_3lys_1940_40_008260381_3	25,920	51,492	1.9866
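
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against a few rows copied from the table; the function name `average_sec_per_timestep` is an illustrative assumption, not something taken from the project's code.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in sec, reported average) copied from the trickle table.
rows = [
    (518_400, 1_037_520, 2.0014),
    (492_480, 986_333, 2.0028),
    (155_520, 309_655, 1.9911),
    (25_920, 51_492, 1.9866),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Allow a small tolerance for floating-point rounding.
    matches = abs(computed - reported) < 1e-4
    print(timestep, computed, reported, "ok" if matches else "mismatch")
```

Each trickle in the table advances the timestep count by 25,920, so the average is a running figure over the whole task rather than a per-trickle rate.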