Task 15653777

Name	hadcm3n_3jyj_1940_40_008268223_4
Workunit	8423347
Created	8 Mar 2013, 3:08:52 UTC
Sent	8 Mar 2013, 3:09:02 UTC
Report deadline	7 Jun 2013, 10:36:13 UTC
Received	31 Mar 2013, 10:33:24 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1255354
Run time	15 days 5 hours 54 min 17 sec
CPU time	12 days 17 hours 44 min 33 sec
Validate state	Invalid
Credit	7,153.92
Device peak FLOPS	2.51 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4232, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4232, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4200, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4200, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1788, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1788, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4476, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4136, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4136, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Mar 2013 07:23:46	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	596,160	1,134,040	1.9022
29 Mar 2013 20:59:09	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	570,240	1,084,193	1.9013
29 Mar 2013 06:09:45	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	544,320	1,034,714	1.9009
27 Mar 2013 19:02:27	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	518,400	985,417	1.9009
25 Mar 2013 15:31:03	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	492,480	936,937	1.9025
24 Mar 2013 07:26:44	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	466,560	890,205	1.9080
23 Mar 2013 16:15:51	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	440,640	843,045	1.9132
21 Mar 2013 20:33:42	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	414,720	794,862	1.9166
21 Mar 2013 04:18:01	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	388,800	745,442	1.9173
20 Mar 2013 13:45:09	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	362,880	696,073	1.9182
19 Mar 2013 23:14:42	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	336,960	646,735	1.9193
18 Mar 2013 14:20:16	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	311,040	596,772	1.9186
17 Mar 2013 11:49:39	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	285,120	546,880	1.9181
16 Mar 2013 21:10:26	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	259,200	497,325	1.9187
16 Mar 2013 05:07:00	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	233,280	446,894	1.9157
15 Mar 2013 12:24:28	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	207,360	396,723	1.9132
14 Mar 2013 20:23:11	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	181,440	345,875	1.9063
13 Mar 2013 09:50:08	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	155,520	296,554	1.9069
12 Mar 2013 11:07:11	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	129,600	247,592	1.9104
11 Mar 2013 09:58:34	1255354	15653777	hadcm3n_3jyj_1940_40_008268223_4	103,680	197,820	1.9080
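
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep count. That relationship is an assumption inferred from the listed values, not something the page states. A minimal Python sketch that checks it against three rows copied from the table (the name `seconds_per_timestep` is hypothetical, introduced only for illustration):

```python
# Rows copied from the trickle table:
# (timestep, CPU time in seconds, reported average in sec/TS).
trickles = [
    (596_160, 1_134_040, 1.9022),
    (570_240, 1_084_193, 1.9013),
    (103_680, 197_820, 1.9080),
]


def seconds_per_timestep(cpu_seconds, timestep):
    """Assumed definition of the Average column: cumulative CPU time / timesteps."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in trickles:
    # Round to four decimals, the precision shown in the table.
    computed = round(seconds_per_timestep(cpu_seconds, timestep), 4)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For these rows the computed value agrees with the reported one to four decimal places. Consecutive rows in the table are also spaced 25,920 timesteps apart.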