Task 16165547

Name	hadcm3n_3aio_1980_40_008340149_3
Workunit	8491010
Created	29 Dec 2013, 5:30:08 UTC
Sent	29 Dec 2013, 5:30:23 UTC
Report deadline	30 Mar 2014, 12:57:34 UTC
Received	2 Feb 2014, 14:48:53 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1211583
Run time	9 days 20 hours 9 min 59 sec
CPU time	9 days 3 hours 24 min 55 sec
Validate state	Invalid
Credit	6,531.84
Device peak FLOPS	3.04 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/3aioko.pji7c10
Error converting file to netcdf: dataout/3aioko.pii7c10
Error converting file to netcdf: dataout/3aioko.pfi7c10
Error converting file to netcdf: dataout/3aioka.phi7c10
Error converting file to netcdf: dataout/3aioka.pgi7c10
Error converting file to netcdf: dataout/3aioka.pei7c10
Error converting file to netcdf: dataout/3aioka.pdi7c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9052, iMonCtr=1
Model crash detected, will try to restart...
18:37:08 (8060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:50:23 (5336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:50:24 (5336): No heartbeat from core client for 30 sec - exiting
19:50:25 (5336): No heartbeat from core client for 30 sec - exiting
19:50:26 (5336): No heartbeat from core client for 30 sec - exiting
19:50:27 (5336): No heartbeat from core client for 30 sec - exiting
19:50:28 (5336): No heartbeat from core client for 30 sec - exiting
19:50:29 (5336): No heartbeat from core client for 30 sec - exiting
19:50:30 (5336): No heartbeat from core client for 30 sec - exiting
19:50:31 (5336): No heartbeat from core client for 30 sec - exiting
19:50:32 (5336): No heartbeat from core client for 30 sec - exiting
19:50:33 (5336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
18:46:24 (6740): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:46:25 (6740): No heartbeat from core client for 30 sec - exiting
18:46:26 (6740): No heartbeat from core client for 30 sec - exiting
18:46:27 (6740): No heartbeat from core client for 30 sec - exiting
18:46:28 (6740): No heartbeat from core client for 30 sec - exiting
18:46:29 (6740): No heartbeat from core client for 30 sec - exiting
18:46:30 (6740): No heartbeat from core client for 30 sec - exiting
18:46:31 (6740): No heartbeat from core client for 30 sec - exiting
18:46:32 (6740): No heartbeat from core client for 30 sec - exiting
18:46:33 (6740): No heartbeat from core client for 30 sec - exiting
18:46:34 (6740): No heartbeat from core client for 30 sec - exiting
18:46:35 (6740): No heartbeat from core client for 30 sec - exiting
18:46:36 (6740): No heartbeat from core client for 30 sec - exiting
Atmos Hold Restart file rename failed on atmos_restart.hold
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:51:30 (6912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8912, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:45:23 (10096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
01 Feb 2014 19:25:59	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	544,320	791,830	1.4547
01 Feb 2014 08:26:42	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	518,400	754,105	1.4547
31 Jan 2014 13:10:22	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	492,480	715,904	1.4537
26 Jan 2014 16:28:11	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	466,560	677,770	1.4527
26 Jan 2014 02:56:04	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	440,640	640,410	1.4534
25 Jan 2014 09:22:16	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	414,720	602,887	1.4537
24 Jan 2014 14:27:11	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	388,800	565,130	1.4535
19 Jan 2014 17:51:34	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	362,880	527,535	1.4537
18 Jan 2014 17:00:59	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	336,960	489,836	1.4537
17 Jan 2014 20:54:15	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	311,040	452,789	1.4557
17 Jan 2014 09:32:22	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	285,120	414,548	1.4539
12 Jan 2014 12:44:25	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	259,200	376,884	1.4540
12 Jan 2014 01:51:49	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	233,280	339,041	1.4534
11 Jan 2014 11:48:24	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	207,360	301,463	1.4538
05 Jan 2014 16:37:48	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	181,440	263,940	1.4547
04 Jan 2014 10:45:34	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	155,520	226,234	1.4547
03 Jan 2014 14:47:30	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	129,600	188,721	1.4562
02 Jan 2014 10:21:00	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	103,680	151,293	1.4592
31 Dec 2013 22:26:48	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	77,760	114,022	1.4663
30 Dec 2013 17:25:32	1211583	16165547	hadcm3n_3aio_1980_40_008340149_3	51,840	76,235	1.4706