Task 16291773

Name	hadcm3n_o85m_1900_40_008465693_3
Workunit	8616532
Created	18 Feb 2014, 12:01:24 UTC
Sent	18 Feb 2014, 12:07:19 UTC
Report deadline	20 May 2014, 19:34:30 UTC
Received	6 Apr 2014, 3:04:22 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1315088
Run time	5 days 22 hours 33 min 10 sec
CPU time	5 days 21 hours 39 min 3 sec
Validate state	Invalid
Credit	4,976.64
Device peak FLOPS	3.03 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
forrtl: Insufficient system resources exist to complete the requested service.
11:02:27 (5312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
11:31:49 (3944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
05:00:40 (1388): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2220, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Apr 2014 02:43:04	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	414,720	508,711	1.2266
05 Apr 2014 19:06:25	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	388,800	482,128	1.2400
04 Apr 2014 14:46:12	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	362,880	455,447	1.2551
03 Apr 2014 23:17:59	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	336,960	428,206	1.2708
03 Apr 2014 15:55:21	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	311,040	401,665	1.2914
24 Mar 2014 12:03:23	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	285,120	374,903	1.3149
24 Mar 2014 04:27:12	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	259,200	347,746	1.3416
23 Mar 2014 20:55:55	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	233,280	320,582	1.3742
23 Mar 2014 13:24:34	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	207,360	293,498	1.4154
23 Mar 2014 05:48:26	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	181,440	266,415	1.4683
22 Mar 2014 21:32:10	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	155,520	236,560	1.5211
22 Mar 2014 11:10:27	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	129,600	199,355	1.5382
21 Mar 2014 23:58:21	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	103,680	159,173	1.5352
21 Mar 2014 13:06:19	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	77,760	120,384	1.5481
15 Mar 2014 11:01:10	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	51,840	77,341	1.4919
11 Mar 2014 16:29:42	1315088	16291773	hadcm3n_o85m_1900_40_008465693_3	25,920	39,846	1.5373
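
The Average (sec/TS) column above appears to be the cumulative CPU time divided by the timestep reached. That reading is an assumption, not something the page states; the short sketch below only checks it against three rows copied from the table. The function name `average_sec_per_ts` is made up for illustration.

```python
# Sanity check (assumption): Average (sec/TS) = CPU Time (sec) / Timestep,
# rounded to four decimal places as shown in the trickle table.

# (timestep, cpu_seconds, reported_average) copied from three table rows
trickles = [
    (414_720, 508_711, 1.2266),  # 06 Apr 2014 02:43:04
    (207_360, 293_498, 1.4154),  # 23 Mar 2014 13:24:34
    (25_920, 39_846, 1.5373),    # 11 Mar 2014 16:29:42
]


def average_sec_per_ts(timestep: int, cpu_seconds: int) -> float:
    """Return cumulative CPU seconds per timestep, rounded to 4 decimals."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = average_sec_per_ts(timestep, cpu_seconds)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

If the assumption holds, the computed and reported values agree for each row. The falling average over time then reflects a lower cumulative cost per timestep in later trickles, not a per-interval rate.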