Task 15281504

Name	hadcm3n_zhku_1880_40_008201192_0
Workunit	8356316
Created	13 Sep 2012, 11:42:32 UTC
Sent	13 Sep 2012, 17:50:46 UTC
Report deadline	14 Dec 2012, 1:17:57 UTC
Received	28 Oct 2012, 11:23:21 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1286177
Run time	18 days 6 hours 57 min 11 sec
CPU time	17 days 22 hours 55 min 53 sec
Validate state	Invalid
Credit	9,953.28
Device peak FLOPS	2.62 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:21:41 (3904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
06:10:36 (4736): No heartbeat from core client for 30 sec - exiting
06:10:37 (4736): No heartbeat from core client for 30 sec - exiting
06:10:38 (4736): No heartbeat from core client for 30 sec - exiting
06:10:40 (4736): No heartbeat from core client for 30 sec - exiting
06:10:41 (4736): No heartbeat from core client for 30 sec - exiting
06:10:42 (4736): No heartbeat from core client for 30 sec - exiting
06:10:43 (4736): No heartbeat from core client for 30 sec - exiting
06:10:44 (4736): No heartbeat from core client for 30 sec - exiting
06:10:45 (4736): No heartbeat from core client for 30 sec - exiting
06:10:46 (4736): No heartbeat from core client for 30 sec - exiting
06:10:47 (4736): No heartbeat from core client for 30 sec - exiting
06:10:48 (4736): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
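
The Run time, CPU time and Exit status fields above can be cross-checked with a few lines of Python. This is only a minimal sketch: the helper name `duration_to_seconds` is hypothetical, and it assumes the duration strings always use the `days / hours / min / sec` wording shown on this page.

```python
import re

# Seconds per unit, keyed by the unit stems used on this page (assumed fixed wording).
UNIT_SECONDS = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}

def duration_to_seconds(text):
    """Convert a string like '18 days 6 hours 57 min 11 sec' to whole seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Values copied from the Run time and CPU time fields of this task.
run_seconds = duration_to_seconds("18 days 6 hours 57 min 11 sec")
cpu_seconds = duration_to_seconds("17 days 22 hours 55 min 53 sec")

# Fraction of wall-clock time that was spent on the CPU.
cpu_fraction = cpu_seconds / run_seconds

# The exit status is reported both in decimal (22) and in hex (0x00000016).
exit_status = int("0x00000016", 16)

print(run_seconds, cpu_seconds, round(cpu_fraction, 3), exit_status)
```

On these figures the CPU time is roughly 98% of the run time, and the hexadecimal exit status decodes to the same 22 shown in decimal.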

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
28 Oct 2012 00:28:05	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	829,440	1,540,815	1.8577
27 Oct 2012 02:48:13	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	803,520	1,486,987	1.8506
26 Oct 2012 15:03:52	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	777,600	1,444,234	1.8573
25 Oct 2012 20:26:44	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	751,680	1,400,068	1.8626
24 Oct 2012 02:25:39	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	725,760	1,353,542	1.8650
22 Oct 2012 18:38:09	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	699,840	1,303,413	1.8624
21 Oct 2012 23:16:09	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	673,920	1,253,662	1.8603
18 Oct 2012 02:26:02	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	648,000	1,203,934	1.8579
16 Oct 2012 05:28:54	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	622,080	1,153,855	1.8548
15 Oct 2012 15:21:35	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	596,160	1,103,761	1.8515
14 Oct 2012 01:30:20	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	570,240	1,048,444	1.8386
13 Oct 2012 01:46:20	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	544,320	997,802	1.8331
11 Oct 2012 14:01:39	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	518,400	948,812	1.8303
09 Oct 2012 20:38:10	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	492,480	900,697	1.8289
09 Oct 2012 00:43:50	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	466,560	850,590	1.8231
05 Oct 2012 09:52:45	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	440,640	800,234	1.8161
04 Oct 2012 13:09:08	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	414,720	749,243	1.8066
03 Oct 2012 05:30:54	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	388,800	698,172	1.7957
01 Oct 2012 21:38:43	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	362,880	654,032	1.8023
01 Oct 2012 00:23:36	1105888	15281504	hadcm3n_zhku_1880_40_008201192_0	336,960	613,339	1.8202
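
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the function name `avg_sec_per_timestep` is hypothetical and the formula is inferred from the numbers, not taken from project documentation.

```python
# (timestep, CPU time in seconds, reported average in sec/TS), copied from the table.
rows = [
    (829_440, 1_540_815, 1.8577),
    (803_520, 1_486_987, 1.8506),
    (336_960, 613_339, 1.8202),
]

def avg_sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep (assumed definition of the column)."""
    return cpu_seconds / timestep

# Largest gap between the recomputed and the reported average across the sample rows.
max_diff = max(
    abs(avg_sec_per_timestep(ts, cpu) - reported) for ts, cpu, reported in rows
)

# Number of timesteps between the two most recent trickles.
trickle_interval = rows[0][0] - rows[1][0]

print(max_diff, trickle_interval)
```

For these rows the recomputed values agree with the reported column to within rounding at four decimal places, and the two most recent trickles are 25,920 timesteps apart.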