Task 12477749

Name	famous_voa1_799_200_006734719_1
Workunit	6938060
Created	13 Jan 2011, 1:22:45 UTC
Sent	13 Jan 2011, 12:47:05 UTC
Report deadline	14 Apr 2011, 20:14:16 UTC
Received	3 Feb 2011, 16:47:21 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID	1066945
Run time	7 days 7 hours 58 min 30 sec
CPU time	6 days 9 hours 38 min 39 sec
Validate state	Invalid
Credit	3,644.12
Device peak FLOPS	2.28 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3448, iMonCtr=1
Model crash detected, will try to restart...
09:03:54 (4672): No heartbeat from core client for 30 sec - exiting
09:03:55 (4672): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:47:01 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3624, iMonCtr=1
Model crash detected, will try to restart...
11:23:17 (1652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:19:36 (4040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:20:22 (3624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:21:41 (4668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4088, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1
Model crash detected, will try to restart...
09:25:06 (3648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:25:48 (4052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:26:26 (3716): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
09:05:03 (3464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
09:06:23 (2024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
10:08:43 (3560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:15:21 (2304): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:16:12 (2168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:28:01 (3552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:28:52 (1840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4644, iMonCtr=1
Model crash detected, will try to restart...
16:01:37 (3064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:02:29 (3624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1700, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
</stderr_txt>
]]>
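
The Exit status field shows the same value twice: -226 as a signed integer and 0xFFFFFF1E as its unsigned 32-bit (two's complement) form. A minimal sketch of that conversion follows; the helper names `to_unsigned32` and `to_signed32` are illustrative only and are not part of BOINC.

```python
def to_unsigned32(code: int) -> str:
    """Format a signed exit code as an 8-digit unsigned 32-bit hex string."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a signed two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# The two representations from the Exit status field.
print(to_unsigned32(-226))
print(to_signed32(0xFFFFFF1E))
```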

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
03 Feb 2011 17:09:15	1066945	12477749	famous_voa1_799_200_006734719_1	1,104,506	551,939	0.4997
01 Feb 2011 14:24:10	1066945	12477749	famous_voa1_799_200_006734719_1	1,095,146	547,809	0.5002
01 Feb 2011 13:00:56	1066945	12477749	famous_voa1_799_200_006734719_1	1,085,786	543,240	0.5003
01 Feb 2011 12:54:26	1066945	12477749	famous_voa1_799_200_006734719_1	1,076,426	538,688	0.5004
31 Jan 2011 23:02:03	1066945	12477749	famous_voa1_799_200_006734719_1	1,067,066	534,015	0.5005
31 Jan 2011 18:20:48	1066945	12477749	famous_voa1_799_200_006734719_1	1,057,706	529,396	0.5005
31 Jan 2011 16:37:21	1066945	12477749	famous_voa1_799_200_006734719_1	1,048,346	524,828	0.5006
30 Jan 2011 21:41:00	1066945	12477749	famous_voa1_799_200_006734719_1	1,038,986	520,150	0.5006
30 Jan 2011 18:27:45	1066945	12477749	famous_voa1_799_200_006734719_1	1,029,626	515,328	0.5005
30 Jan 2011 15:54:47	1066945	12477749	famous_voa1_799_200_006734719_1	1,020,266	510,573	0.5004
30 Jan 2011 14:14:47	1066945	12477749	famous_voa1_799_200_006734719_1	1,010,906	505,912	0.5005
30 Jan 2011 12:22:39	1066945	12477749	famous_voa1_799_200_006734719_1	1,001,546	501,169	0.5004
30 Jan 2011 10:48:28	1066945	12477749	famous_voa1_799_200_006734719_1	992,186	496,500	0.5004
29 Jan 2011 16:23:52	1066945	12477749	famous_voa1_799_200_006734719_1	982,826	491,717	0.5003
29 Jan 2011 14:53:07	1066945	12477749	famous_voa1_799_200_006734719_1	973,466	487,098	0.5004
29 Jan 2011 13:22:55	1066945	12477749	famous_voa1_799_200_006734719_1	964,106	482,504	0.5005
29 Jan 2011 12:26:39	1066945	12477749	famous_voa1_799_200_006734719_1	954,746	477,901	0.5006
28 Jan 2011 21:32:07	1066945	12477749	famous_voa1_799_200_006734719_1	945,386	473,259	0.5006
28 Jan 2011 19:48:54	1066945	12477749	famous_voa1_799_200_006734719_1	936,026	468,518	0.5005
28 Jan 2011 18:52:25	1066945	12477749	famous_voa1_799_200_006734719_1	926,666	463,844	0.5006
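
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. This relationship is inferred from the figures in the table rather than stated by the page. A short sketch checking it against three of the rows above (the function name `sec_per_timestep` is an assumption for illustration):

```python
def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, average reported in the table)
rows = [
    (1_104_506, 551_939, 0.4997),  # 03 Feb 2011 17:09:15
    (1_095_146, 547_809, 0.5002),  # 01 Feb 2011 14:24:10
    (926_666, 463_844, 0.5006),    # 28 Jan 2011 18:52:25
]

for timestep, cpu_time, reported in rows:
    computed = sec_per_timestep(cpu_time, timestep)
    print(timestep, computed, reported)
```

Consecutive rows in the table also differ by 9,360 timesteps (for example 1,104,506 - 1,095,146).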