Task 16011392

Name	hadcm3n_o73i_1980_40_008385336_3
Workunit	8536195
Created	11 Sep 2013, 1:41:55 UTC
Sent	11 Sep 2013, 1:52:53 UTC
Report deadline	11 Dec 2013, 9:20:04 UTC
Received	19 Nov 2013, 13:58:50 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1203586
Run time	38 days 7 hours 25 min 13 sec
CPU time	34 days 8 hours 26 min 21 sec
Validate state	Invalid
Credit	12,130.56
Device peak FLOPS	2.19 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5544, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4040, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4392, iMonCtr=1
Model crash detected, will try to restart...
21:17:19 (2960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:17:20 (2960): No heartbeat from core client for 30 sec - exiting
21:17:21 (2960): No heartbeat from core client for 30 sec - exiting
21:17:22 (2960): No heartbeat from core client for 30 sec - exiting
21:17:23 (2960): No heartbeat from core client for 30 sec - exiting
21:17:24 (2960): No heartbeat from core client for 30 sec - exiting
21:17:26 (2960): No heartbeat from core client for 30 sec - exiting
21:17:27 (2960): No heartbeat from core client for 30 sec - exiting
21:17:28 (2960): No heartbeat from core client for 30 sec - exiting
21:17:29 (2960): No heartbeat from core client for 30 sec - exiting
21:17:30 (2960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1992, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6220, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5552, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
17:13:54 (5620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Called boinc_finish
</stderr_txt>
]]>
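
The Run time and CPU time fields above are given as day/hour/minute/second strings, which makes them awkward to compare directly. The following is a minimal sketch, not part of the task page itself, that converts both to seconds and estimates the fraction of wall-clock time spent on the CPU. The helper name `duration_to_seconds` is hypothetical, and the parsing assumes the exact unit words used on this page (`days`, `hours`, `min`, `sec`).

```python
import re

# Seconds per unit, for the unit words that appear in the fields above.
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text):
    """Convert a string such as '38 days 7 hours 25 min 13 sec' to seconds.

    Hypothetical helper: only handles the four unit words listed above.
    """
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Values copied from the "Run time" and "CPU time" fields of this task.
run_time = duration_to_seconds("38 days 7 hours 25 min 13 sec")
cpu_time = duration_to_seconds("34 days 8 hours 26 min 21 sec")

# Share of elapsed run time during which the model was actually computing.
cpu_fraction = cpu_time / run_time

print(run_time, cpu_time, round(cpu_fraction, 3))
```

Under these assumptions the task spent roughly 90% of its run time on the CPU; the remainder is consistent with the suspend and restart messages in the stderr log, although the page does not state the cause.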

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
17 Nov 2013 14:14:21	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	1,010,880	2,889,491	2.8584
15 Nov 2013 18:16:22	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	984,960	2,791,846	2.8345
14 Nov 2013 17:11:40	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	959,040	2,694,487	2.8096
14 Nov 2013 17:11:40	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	933,120	2,587,887	2.7734
14 Nov 2013 17:11:40	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	907,200	2,487,700	2.7422
14 Nov 2013 17:11:40	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	881,280	2,378,137	2.6985
09 Nov 2013 02:44:53	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	855,360	2,273,198	2.6576
05 Nov 2013 16:00:41	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	829,440	2,179,062	2.6271
04 Nov 2013 14:00:47	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	803,520	2,090,255	2.6014
03 Nov 2013 10:33:22	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	777,600	1,998,875	2.5706
02 Nov 2013 03:21:24	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	751,680	1,896,928	2.5236
20 Oct 2013 18:18:05	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	725,760	1,816,699	2.5032
17 Oct 2013 19:29:45	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	699,840	1,747,175	2.4965
17 Oct 2013 04:41:29	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	673,920	1,671,807	2.4807
16 Oct 2013 00:25:37	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	648,000	1,598,255	2.4664
15 Oct 2013 03:58:17	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	622,080	1,528,608	2.4573
13 Oct 2013 05:13:03	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	596,160	1,503,312	2.5217
12 Oct 2013 10:40:45	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	570,240	1,434,079	2.5149
11 Oct 2013 16:45:38	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	544,320	1,374,301	2.5248
10 Oct 2013 10:06:56	1203586	16011392	hadcm3n_o73i_1980_40_008385336_3	518,400	1,307,870	2.5229