Task 13124341

Name	hadcm3n_yl4h_1900_40_007360235_0
Workunit	7557665
Created	6 Jul 2011, 15:11:23 UTC
Sent	7 Jul 2011, 19:34:09 UTC
Report deadline	7 Oct 2011, 3:01:20 UTC
Received	16 Aug 2011, 20:00:24 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1376550
Run time	34 days 16 hours 59 min 33 sec
CPU time	33 days 20 hours 13 min 45 sec
Validate state	Invalid
Credit	11,197.44
Device peak FLOPS	1.67 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2676, iMonCtr=1
Model crash detected, will try to restart...
06:44:39 (4792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:36:49 (3672): Can't acquire lockfile (32) - waiting 35s
19:37:01 (6316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1
Model crash detected, will try to restart...
12:36:19 (6936): No heartbeat from core client for 30 sec - exiting
12:36:20 (6936): No heartbeat from core client for 30 sec - exiting
12:36:21 (6936): No heartbeat from core client for 30 sec - exiting
12:36:22 (6936): No heartbeat from core client for 30 sec - exiting
12:36:23 (6936): No heartbeat from core client for 30 sec - exiting
12:36:24 (6936): No heartbeat from core client for 30 sec - exiting
12:36:25 (6936): No heartbeat from core client for 30 sec - exiting
12:36:26 (6936): No heartbeat from core client for 30 sec - exiting
12:36:27 (6936): No heartbeat from core client for 30 sec - exiting
12:36:29 (6936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6928, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
06:39:16 (6348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:39:18 (6348): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6588, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1
Model crash detected, will try to restart...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
16 Aug 2011 10:32:21	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	933,120	2,914,071	3.1229
15 Aug 2011 09:58:55	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	907,200	2,827,226	3.1164
10 Aug 2011 11:46:44	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	881,280	2,744,129	3.1138
09 Aug 2011 13:43:10	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	855,360	2,665,268	3.1160
08 Aug 2011 15:27:56	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	829,440	2,586,039	3.1178
07 Aug 2011 17:37:39	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	803,520	2,506,772	3.1197
06 Aug 2011 19:07:54	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	777,600	2,427,926	3.1223
05 Aug 2011 19:39:44	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	751,680	2,361,203	3.1412
04 Aug 2011 19:35:17	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	725,760	2,275,044	3.1347
03 Aug 2011 18:55:52	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	699,840	2,187,385	3.1256
02 Aug 2011 18:17:48	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	673,920	2,100,177	3.1164
01 Aug 2011 17:59:12	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	648,000	2,015,354	3.1101
31 Jul 2011 15:04:09	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	622,080	1,927,176	3.0980
30 Jul 2011 17:43:01	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	596,160	1,848,387	3.1005
29 Jul 2011 18:45:41	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	570,240	1,769,032	3.1023
28 Jul 2011 21:10:46	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	544,320	1,690,254	3.1053
27 Jul 2011 22:21:24	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	518,400	1,611,731	3.1090
27 Jul 2011 00:23:25	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	492,480	1,533,394	3.1136
26 Jul 2011 04:14:05	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	466,560	1,460,441	3.1302
25 Jul 2011 22:49:15	1119324	13124341	hadcm3n_yl4h_1900_40_007360235_0	440,640	1,383,035	3.1387