Task 15850447

Name	hadcm3n_yafe_2020_40_008366187_1
Workunit	8517046
Created	19 Jun 2013, 20:08:26 UTC
Sent	19 Jun 2013, 20:31:38 UTC
Report deadline	19 Sep 2013, 3:58:49 UTC
Received	14 Aug 2013, 15:57:49 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1096206
Run time	8 days 13 hours 3 min 34 sec
CPU time	8 days 8 hours 1 min 15 sec
Validate state	Invalid
Credit	5,598.72
Device peak FLOPS	2.66 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10148, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10148, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/yafeko.pjm1c10
Error converting file to netcdf: dataout/yafeko.pim1c10
Error converting file to netcdf: dataout/yafeko.pfm1c10
Error converting file to netcdf: dataout/yafeka.phm1c10
Error converting file to netcdf: dataout/yafeka.pgm1c10
Error converting file to netcdf: dataout/yafeka.pem1c10
Error converting file to netcdf: dataout/yafeka.pdm1c10
14:07:31 (2292): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11816, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:37:08 (6576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:25:40 (14344)CPDN Monitor - Quit request from BOINC...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=12612, selfPID=12612, iMonCtr=1
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:07:05 (13632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3868, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yafe_2020_40_008366187/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
14 Aug 2013 16:08:53	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	466,560	689,983	1.4789
14 Aug 2013 16:08:53	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	440,640	653,204	1.4824
23 Jul 2013 21:47:32	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	414,720	613,351	1.4790
23 Jul 2013 20:58:23	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	388,800	573,705	1.4756
23 Jul 2013 20:29:03	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	362,880	533,591	1.4704
23 Jul 2013 19:09:50	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	336,960	494,967	1.4689
23 Jul 2013 18:55:15	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	311,040	456,192	1.4667
23 Jul 2013 18:55:14	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	285,120	416,643	1.4613
23 Jul 2013 18:55:14	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	259,200	377,193	1.4552
09 Jul 2013 18:14:44	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	233,280	339,915	1.4571
07 Jul 2013 01:31:41	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	207,360	297,231	1.4334
06 Jul 2013 05:29:09	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	181,440	254,022	1.4000
02 Jul 2013 11:16:11	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	155,520	218,658	1.4060
02 Jul 2013 10:21:17	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	129,600	182,223	1.4060
02 Jul 2013 09:44:52	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	103,680	145,757	1.4058
26 Jun 2013 20:37:06	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	77,760	107,478	1.3822
23 Jun 2013 17:51:24	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	51,840	71,878	1.3865
22 Jun 2013 17:00:42	1096206	15850447	hadcm3n_yafe_2020_40_008366187_1	25,920	36,128	1.3938
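
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count reached at that trickle. The sketch below checks that reading against three rows copied from the table. The function names `parse_number` and `seconds_per_timestep` are hypothetical helpers made up for this illustration and are not part of BOINC or the CPDN software.

```python
def parse_number(text):
    """Convert a table value such as '466,560' to an integer."""
    return int(text.replace(",", ""))


def seconds_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds divided by timesteps, rounded to 4 places."""
    return round(cpu_seconds / timestep, 4)


# (Timestep, CPU Time (sec), reported Average) copied from the trickle table.
rows = [
    ("466,560", "689,983", 1.4789),
    ("233,280", "339,915", 1.4571),
    ("25,920", "36,128", 1.3938),
]

for timestep_text, cpu_text, reported in rows:
    computed = seconds_per_timestep(
        parse_number(timestep_text), parse_number(cpu_text)
    )
    print(timestep_text, computed, reported)
```

If this reading is right, the slow rise in the average from the first trickle to the last means later timesteps cost slightly more CPU time than earlier ones on this host.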