Task 13120076

Name	hadcm3n_yjh8_1900_40_007358102_1
Workunit	7555532
Created	6 Jul 2011, 14:57:22 UTC
Sent	8 Jul 2011, 17:50:14 UTC
Report deadline	8 Oct 2011, 1:17:25 UTC
Received	2 Nov 2011, 17:39:37 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1149100
Run time	20 days 0 hours 59 min 19 sec
CPU time	20 days 0 hours 59 min 19 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.42 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4528, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2312, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:13:16 (5260): No heartbeat from core client for 30 sec - exiting
16:13:17 (5260): No heartbeat from core client for 30 sec - exiting
16:13:18 (5260): No heartbeat from core client for 30 sec - exiting
16:13:19 (5260): No heartbeat from core client for 30 sec - exiting
16:13:20 (5260): No heartbeat from core client for 30 sec - exiting
16:13:21 (5260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4844, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3504, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
17:42:11 (4648): No heartbeat from core client for 30 sec - exiting
17:42:12 (4648): No heartbeat from core client for 30 sec - exiting
17:42:13 (4648): No heartbeat from core client for 30 sec - exiting
17:42:14 (4648): No heartbeat from core client for 30 sec - exiting
17:42:15 (4648): No heartbeat from core client for 30 sec - exiting
17:42:16 (4648): No heartbeat from core client for 30 sec - exiting
17:42:17 (4648): No heartbeat from core client for 30 sec - exiting
17:42:18 (4648): No heartbeat from core client for 30 sec - exiting
17:42:19 (4648): No heartbeat from core client for 30 sec - exiting
17:42:20 (4648): No heartbeat from core client for 30 sec - exiting
17:42:21 (4648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3752, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=1
ModeCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
09:50:44 (5400): No heartbeat from core client for 30 sec - exiting
09:50:45 (5400): No heartbeat from core client for 30 sec - exiting
09:50:46 (5400): No heartbeat from core client for 30 sec - exiting
09:50:47 (5400): No heartbeat from core client for 30 sec - exiting
09:50:48 (5400): No heartbeat from core client for 30 sec - exiting
09:50:49 (5400): No heartbeat from core client for 30 sec - exiting
09:50:50 (5400): No heartbeat from core client for 30 sec - exiting
09:50:51 (5400): No heartbeat from core client for 30 sec - exiting
09:50:52 (5400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:50:08 (5256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:07:40 (4356): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
14:00:11 (4884): No heartbeat from core client for 30 sec - exiting
14:00:12 (4884): No heartbeat from core client for 30 sec - exiting
14:00:13 (4884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4124, iMonCtr=1
Model crash detected, will try to restart...
22:48:03 (4956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:25:52 (3360): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4008, iMonCtr=1
Model crash detected, will try to restart...
22:46:44 (5052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:20:38 (2668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3920, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4896, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5332, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5552, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4848, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
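
The stderr text above repeats a handful of message types many times. As a rough way to summarise such a log, the sketch below counts how often each type occurs. The function name `tally_stderr`, the labels in `MARKERS`, and the short `sample` string are illustrative assumptions; only the marker substrings themselves are taken from the log.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that appear in the stderr text.
MARKERS = {
    "quit_request": "CPDN Monitor - Quit request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "model_crash": "Model crash detected",
    "buffin_io_error": "BUFFIN: C I/O Error",
    "signal": "Signal 11 received",
}


def tally_stderr(text: str) -> Counter:
    """Count non-overlapping occurrences of each marker substring."""
    return Counter({label: text.count(marker) for label, marker in MARKERS.items()})


# A short excerpt assembled from messages in the log, used only for illustration.
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4848, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "CPDN Monitor - Quit request from BOINC... "
    "Signal 11 received, exiting... Called boinc_finish"
)

for label, count in tally_stderr(sample).items():
    print(f"{label}: {count}")
```

Passing the full stderr text instead of `sample` would give the totals for this task. Substring counting is used rather than line splitting so that the result does not depend on how the log happens to be wrapped.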

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
02 Nov 2011 16:43:34	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	777,600	1,731,556	2.2268
01 Nov 2011 15:23:45	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	751,680	1,672,201	2.2246
31 Oct 2011 19:47:12	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	725,760	1,615,815	2.2264
31 Oct 2011 17:16:44	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	699,840	1,559,136	2.2278
31 Oct 2011 15:04:39	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	673,920	1,502,524	2.2295
31 Oct 2011 15:04:39	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	648,000	1,447,866	2.2344
31 Oct 2011 15:04:39	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	622,080	1,404,020	2.2570
19 Oct 2011 00:25:46	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	596,160	1,346,829	2.2592
17 Oct 2011 11:24:00	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	570,240	1,289,741	2.2618
13 Oct 2011 15:32:33	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	544,320	1,233,122	2.2654
10 Oct 2011 18:45:04	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	518,400	1,176,266	2.2690
06 Oct 2011 22:39:21	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	492,480	1,119,531	2.2733
03 Oct 2011 09:55:31	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	466,560	1,062,490	2.2773
28 Sep 2011 14:40:10	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	440,640	1,005,587	2.2821
23 Sep 2011 20:57:55	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	414,720	951,101	2.2934
17 Sep 2011 19:21:08	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	388,800	894,383	2.3004
14 Sep 2011 15:13:21	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	362,880	838,491	2.3107
11 Sep 2011 17:43:00	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	336,960	781,912	2.3205
05 Sep 2011 00:00:40	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	311,040	724,687	2.3299
30 Aug 2011 13:49:34	1149100	13120076	hadcm3n_yjh8_1900_40_007358102_1	285,120	666,998	2.3394
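
The Average (sec/TS) column appears to be CPU Time (sec) divided by Timestep. The sketch below checks that reading against three rows copied from the table, and converts the reported Run time of 20 days 0 hours 59 min 19 sec to seconds for comparison with the last trickle's CPU time. The helper name `sec_per_timestep` and the tolerance are assumptions made for illustration.

```python
from datetime import timedelta

# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
rows = [
    (777_600, 1_731_556, 2.2268),
    (751_680, 1_672_201, 2.2246),
    (285_120, 666_998, 2.3394),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # The table shows four decimal places, so allow half a unit in the last place.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")

# Reported Run time from the task summary, expressed in seconds.
run_time_sec = int(timedelta(days=20, hours=0, minutes=59, seconds=19).total_seconds())
last_trickle_cpu_sec = rows[0][1]
print(f"run time: {run_time_sec:,} sec, last trickle CPU time: {last_trickle_cpu_sec:,} sec")
```

Consecutive rows in the table are 25,920 timesteps apart, so the same helper can be applied to the difference between two rows to estimate the pace over a single trickle interval.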