Task 15700453

Name	hadcm3n_4c2o_1980_40_008340440_1
Workunit	8491301
Created	3 Apr 2013, 4:33:50 UTC
Sent	3 Apr 2013, 4:34:08 UTC
Report deadline	3 Jul 2013, 12:01:19 UTC
Received	13 Apr 2013, 21:18:08 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1230346
Run time	7 days 14 hours 1 min 19 sec
CPU time	6 days 14 hours 26 min 27 sec
Validate state	Invalid
Credit	4,976.64
Device peak FLOPS	3.11 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2744, iMonCtr=1
Model crash detected, will try to restart...
16:47:17 (7176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
06:55:03 (9112): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:09:11 (7144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:22:44 (5288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:50:21 (9768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:07:28 (11160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:07:29 (11160): No heartbeat from core client for 30 sec - exiting
17:07:30 (11160): No heartbeat from core client for 30 sec - exiting
17:07:31 (11160): No heartbeat from core client for 30 sec - exiting
17:07:32 (11160): No heartbeat from core client for 30 sec - exiting
17:07:33 (11160): No heartbeat from core client for 30 sec - exiting
17:07:34 (11160): No heartbeat from core client for 30 sec - exiting
17:07:35 (11160): No heartbeat from core client for 30 sec - exiting
17:07:36 (11160): No heartbeat from core client for 30 sec - exiting
17:07:37 (11160): No heartbeat from core client for 30 sec - exiting
17:07:38 (11160): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
17:30:09 (1016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3536, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3536, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9108, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4c2o_1980_40_008340440/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9108, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
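
The stderr output above repeats a handful of messages many times. A minimal sketch for tallying them is given here, assuming the stderr text has been saved as a plain string. The function name `tally_markers` and the `MARKERS` tuple are hypothetical helpers for illustration and are not part of BOINC or the CPDN tooling. Substring counting is used so that the tally does not depend on where the line breaks fall.

```python
# Message fragments that recur in the stderr dump above.
MARKERS = (
    "No heartbeat from core client",
    "Model crash detected",
    "cannot open input file",
    "Signal 22 received",
    "too many model crashes",
)


def tally_markers(stderr_text, markers=MARKERS):
    """Return a dict mapping each marker to its number of occurrences."""
    return {marker: stderr_text.count(marker) for marker in markers}


if __name__ == "__main__":
    # Short excerpt assembled from messages in the log, for demonstration only.
    sample = (
        "16:47:17 (7176): No heartbeat from core client for 30 sec - exiting "
        "Signal 22 received, exiting... Called boinc_finish "
        "Model crash detected, will try to restart... "
        "Model crash detected, will try to restart... "
        "Sorry, too many model crashes! :-("
    )
    for marker, count in tally_markers(sample).items():
        print(f"{count:4d}  {marker}")
```

Running the same function over the full stderr text gives a quick view of whether heartbeat losses, restart-file failures, or signal exits dominate the run.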

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
13 Apr 2013 13:35:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	414,720	547,200	1.3194
13 Apr 2013 13:35:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	388,800	513,862	1.3217
13 Apr 2013 13:35:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	362,880	480,864	1.3251
13 Apr 2013 13:35:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	336,960	447,448	1.3279
13 Apr 2013 13:35:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	311,040	410,339	1.3192
10 Apr 2013 06:52:32	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	285,120	373,794	1.3110
09 Apr 2013 21:28:37	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	259,200	338,100	1.3044
09 Apr 2013 21:28:37	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	233,280	303,894	1.3027
09 Apr 2013 21:28:37	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	207,360	269,305	1.2987
08 Apr 2013 06:16:10	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	181,440	234,363	1.2917
05 Apr 2013 21:17:06	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	155,520	199,544	1.2831
05 Apr 2013 21:17:05	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	129,600	166,251	1.2828
05 Apr 2013 21:17:05	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	103,680	133,034	1.2831
05 Apr 2013 21:17:05	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	77,760	99,614	1.2810
05 Apr 2013 21:17:05	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	51,840	66,651	1.2857
05 Apr 2013 21:17:05	1230346	15700453	hadcm3n_4c2o_1980_40_008340440_1	25,920	33,318	1.2854
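
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep count. A minimal sketch that checks this against a few rows copied from the table follows. The names `sec_per_timestep` and `TRICKLES` are hypothetical, and the tolerance of 1e-4 is an assumption chosen to allow for the four-decimal rounding shown in the table.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
TRICKLES = [
    (414_720, 547_200, 1.3194),
    (388_800, 513_862, 1.3217),
    (285_120, 373_794, 1.3110),
    (155_520, 199_544, 1.2831),
    (25_920, 33_318, 1.2854),
]


def sec_per_timestep(timestep, cpu_time_sec):
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


def max_deviation(rows):
    """Largest absolute gap between the recomputed and reported averages."""
    return max(abs(sec_per_timestep(ts, cpu) - avg) for ts, cpu, avg in rows)


if __name__ == "__main__":
    for ts, cpu, avg in TRICKLES:
        print(f"{ts:>8,}  computed={sec_per_timestep(ts, cpu):.4f}  reported={avg:.4f}")
    print(f"max deviation: {max_deviation(TRICKLES):.6f}")
```

For the rows listed, the recomputed values agree with the reported column to within rounding, which supports reading the column as CPU seconds per timestep.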