Task 12759378

Name	hadcm3n_o4ko_1900_40_007201259_2
Workunit	7399539
Created	31 Mar 2011, 18:13:26 UTC
Sent	31 Mar 2011, 18:16:48 UTC
Report deadline	1 Jul 2011, 1:43:59 UTC
Received	22 Apr 2011, 4:39:26 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	775427
Run time	12 days 17 hours 12 min 20 sec
CPU time	11 days 22 hours 0 min 12 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.33 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.60</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:50:48 (3588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
15:27:21 (5036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:02:19 (7564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:03:00 (9328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:03:01 (9328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
17:57:06 (5644): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:57:07 (5644): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7736, iMonCtr=1
Model crash detected, will try to restart...
15:03:09 (4848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:03:10 (4848): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8156, iMonCtr=1
Model crash detected, will try to restart...
09:00:52 (2732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
17:31:44 (6976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:37:27 (4516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5552, iMonCtr=1
Model crash detected, will try to restart...
14:09:41 (1904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7920, iMonCtr=1
Model crash detected, will try to restart...
C15:28:29 (5060): No heartbeat from core client for 30 sec - exiting
15:28:31 (5060): No heartbeat from core client for 30 sec - exiting
15:28:32 (5060): No heartbeat from core client for 30 sec - exiting
15:28:33 (5060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
zip error: Could not create output file (was replacing the original zip file)
15:02:07 (4480): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:03:06 (7432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:03:07 (7432): No heartbeat from core client for 30 sec - exiting
16:03:08 (7432): No heartbeat from core client for 30 sec - exiting
16:03:09 (7432): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
21:21:35 (4188): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C08:23:53 (4764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5196, iMonCtr=1
Model crash detected, will try to restart...
07:29:37 (5060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:39:11 (5572): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7948, iMonCtr=1
Model crash detected, will try to restart...
15:06:23 (4132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:14:32 (6992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:07:42 (6460): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:07:43 (6460): No heartbeat from core client for 30 sec - exiting
16:07:44 (6460): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
17:09:37 (7708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2408, iMonCtr=1
Model crash detected, will try to restart...
07:55:29 (4988): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4172, iMonCtr=1
Model crash detected, will try to restart...
21:15:05 (5964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:36:31 (9500): No heartbeat from core client for 30 sec - exiting
22:36:32 (9500): No heartbeat from core client for 30 sec - exiting
22:36:33 (9500): No heartbeat from core client for 30 sec - exiting
Ocean Restart file copy failed on o4koko.dac0c20
CPDN Monitor - No 'heartbeat' from BOINC...
zip error: Could not create output file (was replacing the original zip file)
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\BOINC/projects/climateprediction.net/hadcm3n_o4ko_1900_40_007201259/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
22 Apr 2011 04:53:43	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	518,400	1,029,714	1.9863
21 Apr 2011 04:21:54	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	492,480	975,334	1.9805
20 Apr 2011 18:09:24	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	466,560	924,126	1.9807
20 Apr 2011 18:09:24	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	440,640	874,099	1.9837
20 Apr 2011 18:09:23	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	414,720	822,870	1.9842
20 Apr 2011 18:09:22	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	388,800	771,147	1.9834
20 Apr 2011 18:09:21	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	362,880	718,498	1.9800
20 Apr 2011 18:09:20	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	336,960	665,854	1.9761
20 Apr 2011 18:09:19	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	311,040	614,491	1.9756
20 Apr 2011 18:09:18	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	285,120	562,005	1.9711
12 Apr 2011 21:33:02	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	259,200	510,383	1.9691
11 Apr 2011 21:38:13	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	233,280	459,009	1.9676
10 Apr 2011 21:45:50	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	207,360	407,512	1.9652
09 Apr 2011 22:05:40	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	181,440	355,948	1.9618
08 Apr 2011 15:55:27	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	155,520	305,242	1.9627
07 Apr 2011 14:45:39	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	129,600	254,850	1.9664
06 Apr 2011 05:43:42	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	103,680	204,854	1.9758
05 Apr 2011 06:39:20	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	77,760	153,974	1.9801
04 Apr 2011 15:30:33	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	51,840	101,989	1.9674
03 Apr 2011 16:01:25	775427	12759378	hadcm3n_o4ko_1900_40_007201259_2	25,920	51,498	1.9868
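
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The sketch below checks that reading against two rows copied from the table; the function name `seconds_per_timestep` is hypothetical and the formula is an inference from the numbers, not something the page states.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds / cumulative timesteps."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table
rows = [
    (518_400, 1_029_714, 1.9863),  # latest trickle, 22 Apr 2011
    (25_920, 51_498, 1.9868),      # first trickle, 03 Apr 2011
]

for timestep, cpu_time, reported in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

Under this reading, both rows reproduce the reported averages, which suggests the host ran at a steady rate of roughly 1.98 CPU seconds per timestep until the model crashed.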