Task 13022214

Name	hadcm3n_t2o3_1940_40_007314752_1
Workunit	7512182
Created	28 Jun 2011, 17:45:13 UTC
Sent	28 Jun 2011, 17:52:34 UTC
Report deadline	28 Sep 2011, 1:19:45 UTC
Received	6 Apr 2013, 23:51:05 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	275529
Run time	33 days 17 hours 36 min 7 sec
CPU time	33 days 17 hours 36 min 7 sec
Validate state	Invalid
Credit	6,531.84
Device peak FLOPS	1.24 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>5.4.11</core_client_version> <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 4 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3924, iMonCtr=1 Model crash detected, will try to restart... Signal 4 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3924, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 17:08:29 (3352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 17:28:17 (1512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 11 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1592, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1592, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:34:58 (4024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:33:17 (3536): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 16:32:40 (3364): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:11:56 (8484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3920, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3920, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt>
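
The stderr dump above is a single flattened run of repeated messages, so a quick tally by message type can make it easier to read. The following is a minimal sketch, not part of BOINC or CPDN tooling: the function name `tally_stderr`, the counter keys, and the `sample` string are hypothetical. The sample is a shortened excerpt assembled from message texts that occur in the log, not the full log, so its counts say nothing about the real totals.

```python
import re
from collections import Counter


def tally_stderr(text: str) -> Counter:
    """Count occurrences of the main message types in a flattened stderr dump."""
    counts = Counter()
    # Plain substring-style patterns; the message texts are copied from the log.
    counts["quit_request"] = len(
        re.findall(r"CPDN Monitor - Quit request from BOINC", text)
    )
    # Matches the timestamped client message only, not the quoted
    # "No 'heartbeat' from BOINC" line that follows it.
    counts["no_heartbeat"] = len(
        re.findall(r"No heartbeat from core client", text)
    )
    counts["restart_attempt"] = len(
        re.findall(r"Model crash detected, will try to restart", text)
    )
    # One counter key per signal number, e.g. "signal_4".
    for sig in re.findall(r"Signal (\d+) received", text):
        counts[f"signal_{sig}"] += 1
    return counts


# Hypothetical shortened sample, NOT the full stderr text.
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "Signal 4 received, exiting... Called boinc_finish "
    "Model crash detected, will try to restart... "
    "17:08:29 (3352): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Signal 22 received, exiting... Called boinc_finish "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

if __name__ == "__main__":
    for key, value in sorted(tally_stderr(sample).items()):
        print(key, value)
```

To tally the real log, pass the full contents of the stderr_txt element to `tally_stderr` in place of `sample`.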

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	544,320	2,890,483	5.3103
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	518,400	2,751,591	5.3079
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	492,480	2,612,384	5.3045
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	466,560	2,473,543	5.3017
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	440,640	2,334,690	5.2984
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	414,720	2,196,005	5.2952
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	388,800	2,056,795	5.2901
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	362,880	1,917,804	5.2850
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	336,960	1,778,319	5.2775
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	311,040	1,639,434	5.2708
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	285,120	1,500,135	5.2614
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	259,200	1,361,069	5.2510
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	233,280	1,222,016	5.2384
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	207,360	1,083,288	5.2242
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	181,440	945,992	5.2138
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	155,520	810,178	5.2095
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	129,600	673,899	5.1998
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	103,680	537,838	5.1875
06 Apr 2013 02:16:00	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	77,760	401,643	5.1652
06 Apr 2013 02:15:59	275529	13022214	hadcm3n_t2o3_1940_40_007314752_1	51,840	265,291	5.1175
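
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached, rounded to four decimals. The sketch below checks that reading against three rows copied from the table. The function name `average_sec_per_timestep` is hypothetical, and the formula is inferred from the numbers, not taken from the project's documentation.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimals."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
rows = [
    (544_320, 2_890_483, 5.3103),
    (518_400, 2_751_591, 5.3079),
    (51_840, 265_291, 5.1175),
]

if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in rows:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        print(f"{timestep:>8} computed={computed:.4f} reported={reported:.4f}")
```

For the three rows shown, the computed values agree with the reported column. This supports the inferred formula without confirming it.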