Task 11623168

Name	famous_s3ae_599_200_006669241_0
Workunit	6872495
Created	29 Jul 2010, 10:23:28 UTC
Sent	30 Jul 2010, 0:08:19 UTC
Report deadline	29 Oct 2010, 7:35:30 UTC
Received	10 Sep 2010, 23:54:39 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1042806
Run time	13 days 12 hours 11 min 51 sec
CPU time	10 days 6 hours 55 min 20 sec
Validate state	Invalid
Credit	3,026.49
Device peak FLOPS	0.78 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6140, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5780, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not runn
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2896, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5460, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5588, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1
Model crash detected, will try to restar
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2080, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5420, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
12:06:28 (608): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1588, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5604, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Sorry, too many model crashes! :-(
17:40:21 (1548): called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
07 Sep 2010 01:49:59	1042806	11623168	famous_s3ae_599_200_006669241_0	917,306	880,222	0.9596
06 Sep 2010 02:21:21	1042806	11623168	famous_s3ae_599_200_006669241_0	907,946	868,642	0.9567
05 Sep 2010 22:41:42	1042806	11623168	famous_s3ae_599_200_006669241_0	898,586	857,096	0.9538
05 Sep 2010 18:37:46	1042806	11623168	famous_s3ae_599_200_006669241_0	889,226	845,732	0.9511
05 Sep 2010 14:55:32	1042806	11623168	famous_s3ae_599_200_006669241_0	879,866	834,214	0.9481
05 Sep 2010 11:21:49	1042806	11623168	famous_s3ae_599_200_006669241_0	870,506	822,671	0.9450
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	861,146	811,114	0.9419
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	851,786	804,336	0.9443
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	842,426	800,052	0.9497
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	833,066	795,762	0.9552
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	823,706	791,488	0.9609
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	814,346	787,167	0.9666
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	804,986	782,834	0.9725
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	795,626	778,558	0.9785
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	786,266	774,284	0.9848
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	776,906	770,010	0.9911
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	767,546	765,732	0.9976
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	758,186	761,466	1.0043
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	748,826	757,147	1.0111
04 Sep 2010 18:42:48	1042806	11623168	famous_s3ae_599_200_006669241_0	739,466	752,873	1.0181
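
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. That formula is an assumption inferred from the figures in the table, not something the page states. A minimal Python sketch that checks it against three rows copied from the table (the function name `sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
rows = [
    (917_306, 880_222, 0.9596),
    (907_946, 868_642, 0.9567),
    (739_466, 752_873, 1.0181),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds / cumulative timesteps, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

Because the figure is cumulative rather than per-trickle, it drifts slowly between rows and does not show the speed of the most recent 9,360 timesteps on its own.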