Task 12498187

Name	famous_wsd1_1099_200_007123516_0
Workunit	7321876
Created	16 Jan 2011, 16:57:42 UTC
Sent	16 Jan 2011, 20:50:02 UTC
Report deadline	18 Apr 2011, 4:17:13 UTC
Received	5 Apr 2011, 13:07:38 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	986848
Run time	32 days 20 hours 2 min 46 sec
CPU time	27 days 23 hours 24 min 16 sec
Validate state	Invalid
Credit	4,786.74
Device peak FLOPS	0.89 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
BUFFOUT: Write Failed: No space left on device
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy
BUFFOUT: Write Failed: No space left on device
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy
04:28:02 (4772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5832, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5832, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6068, iMonCtr=1
Model crash detected, will try to restart...
14:41:06 (5920): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
01:33:07 (624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5380, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4284, selfPID=4284, iMonCtr=1
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5348, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
17:04:30 (6052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2156, selfPID=2156, iMonCtr=1
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Model crashed: READHIST: End of file in READ from history file for namelist NLCFILES tmp/pipe_dummy
Sorry, too many model crashes! :-(
04:21:11 (5328): called boinc_finish
</stderr_txt>
]]>
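
The stderr text above mixes several kinds of messages (disk-full write failures, missing-file read failures, lost heartbeats, model crashes, quit requests). As a rough sketch of how such a log could be tallied by message type, the Python below counts substring matches in a short excerpt copied from the log. The category names, the `PATTERNS` table, and the `tally` helper are illustrative assumptions, not part of BOINC or the CPDN application.

```python
import re
from collections import Counter

# Hypothetical category names mapped to substrings that appear in the stderr text.
PATTERNS = {
    "disk_full": r"No space left on device",
    "missing_file": r"No such file or directory",
    "no_heartbeat": r"No heartbeat from core client",
    "model_crash": r"Model crashed:",
    "quit_request": r"Quit request from BOINC",
}


def tally(stderr_text):
    """Count how often each pattern occurs; works on flattened or multi-line text."""
    return Counter(
        {name: len(re.findall(pat, stderr_text)) for name, pat in PATTERNS.items()}
    )


# Short excerpt copied from the start of the log above.
excerpt = (
    "BUFFOUT: Write Failed: No space left on device "
    "BUFFOUT: C I/O Error - Return code = 32 "
    "Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy "
    "04:28:02 (4772): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - Quit request from BOINC..."
)

if __name__ == "__main__":
    for name, count in tally(excerpt).items():
        print(f"{name}: {count}")
```

Passing the full stderr text instead of the excerpt would give per-category totals for the whole run; the counts say nothing about causes on their own.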

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
05 Apr 2011 08:58:30	986848	12498187	famous_wsd1_1099_200_007123516_0	1,450,826	2,409,502	1.6608
05 Apr 2011 01:11:58	986848	12498187	famous_wsd1_1099_200_007123516_0	1,441,466	2,394,235	1.6610
04 Apr 2011 17:05:55	986848	12498187	famous_wsd1_1099_200_007123516_0	1,432,106	2,378,568	1.6609
04 Apr 2011 11:28:27	986848	12498187	famous_wsd1_1099_200_007123516_0	1,422,746	2,362,731	1.6607
04 Apr 2011 06:11:34	986848	12498187	famous_wsd1_1099_200_007123516_0	1,413,386	2,346,979	1.6605
03 Apr 2011 23:38:52	986848	12498187	famous_wsd1_1099_200_007123516_0	1,404,026	2,331,783	1.6608
03 Apr 2011 18:54:56	986848	12498187	famous_wsd1_1099_200_007123516_0	1,394,666	2,316,613	1.6611
03 Apr 2011 14:19:40	986848	12498187	famous_wsd1_1099_200_007123516_0	1,385,306	2,301,550	1.6614
03 Apr 2011 09:44:49	986848	12498187	famous_wsd1_1099_200_007123516_0	1,375,946	2,286,497	1.6618
03 Apr 2011 05:08:27	986848	12498187	famous_wsd1_1099_200_007123516_0	1,366,586	2,271,550	1.6622
02 Apr 2011 02:02:49	986848	12498187	famous_wsd1_1099_200_007123516_0	1,357,226	2,256,133	1.6623
01 Apr 2011 20:54:40	986848	12498187	famous_wsd1_1099_200_007123516_0	1,347,866	2,240,851	1.6625
01 Apr 2011 16:14:35	986848	12498187	famous_wsd1_1099_200_007123516_0	1,338,506	2,225,511	1.6627
01 Apr 2011 02:35:58	986848	12498187	famous_wsd1_1099_200_007123516_0	1,329,146	2,209,483	1.6623
31 Mar 2011 21:55:14	986848	12498187	famous_wsd1_1099_200_007123516_0	1,319,786	2,194,001	1.6624
30 Mar 2011 00:43:28	986848	12498187	famous_wsd1_1099_200_007123516_0	1,310,426	2,178,273	1.6623
28 Mar 2011 18:31:20	986848	12498187	famous_wsd1_1099_200_007123516_0	1,301,066	2,161,773	1.6615
28 Mar 2011 11:26:26	986848	12498187	famous_wsd1_1099_200_007123516_0	1,291,706	2,147,100	1.6622
28 Mar 2011 01:37:11	986848	12498187	famous_wsd1_1099_200_007123516_0	1,282,346	2,132,086	1.6626
26 Mar 2011 21:14:43	986848	12498187	famous_wsd1_1099_200_007123516_0	1,272,986	2,116,733	1.6628
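
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The sketch below checks that reading against the three most recent rows and also computes the timestep gap between consecutive trickles. The function and variable names are illustrative, and the formula is inferred from the table rather than taken from project documentation.

```python
# (timestep, cpu_time_sec, reported_avg) copied from the three most recent trickle rows.
trickles = [
    (1_450_826, 2_409_502, 1.6608),
    (1_441_466, 2_394_235, 1.6610),
    (1_432_106, 2_378_568, 1.6609),
]


def sec_per_timestep(timestep, cpu_sec):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_sec / timestep, 4)


def timestep_gaps(rows):
    """Timestep difference between each row and the next (rows are newest first)."""
    steps = [row[0] for row in rows]
    return [newer - older for newer, older in zip(steps, steps[1:])]


if __name__ == "__main__":
    for timestep, cpu_sec, reported in trickles:
        computed = sec_per_timestep(timestep, cpu_sec)
        print(timestep, computed, "matches" if abs(computed - reported) < 1e-9 else "differs")
    print("timesteps between trickles:", timestep_gaps(trickles))
```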