Task 16202668

Name	hadcm3n_n3an_1880_40_008403560_3
Workunit	8554416
Created	6 Jan 2014, 21:31:55 UTC
Sent	6 Jan 2014, 21:32:08 UTC
Report deadline	8 Apr 2014, 4:59:19 UTC
Received	21 Feb 2014, 23:53:35 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1122348
Run time	12 days 9 hours 45 min 43 sec
CPU time	12 days 9 hours 17 min 37 sec
Validate state	Invalid
Credit	6,842.88
Device peak FLOPS	2.31 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
10:54:44 (8128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:57:06 (7560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
01:32:31 (5492): No heartbeat from core client for 30 sec - exiting
01:32:32 (5492): No heartbeat from core client for 30 sec - exiting
01:32:33 (5492): No heartbeat from core client for 30 sec - exiting
01:32:34 (5492): No heartbeat from core client for 30 sec - exiting
01:32:35 (5492): No heartbeat from core client for 30 sec - exiting
01:32:36 (5492): No heartbeat from core client for 30 sec - exiting
01:32:37 (5492): No heartbeat from core client for 30 sec - exiting
01:32:38 (5492): No heartbeat from core client for 30 sec - exiting
01:32:39 (5492): No heartbeat from core client for 30 sec - exiting
01:32:40 (5492): No heartbeat from core client for 30 sec - exiting
01:32:41 (5492): No heartbeat from core client for 30 sec - exiting
01:32:42 (5492): No heartbeat from core client for 30 sec - exiting
01:32:43 (5492): No heartbeat from core client for 30 sec - exiting
01:32:44 (5492): No heartbeat from core client for 30 sec - exiting
01:32:45 (5492): No heartbeat from core client for 30 sec - exiting
01:32:46 (5492): No heartbeat from core client for 30 sec - exiting
01:32:47 (5492): No heartbeat from core client for 30 sec - exiting
01:32:48 (5492): No heartbeat from core client for 30 sec - exiting
01:32:49 (5492): No heartbeat from core client for 30 sec - exiting
01:32:50 (5492): No heartbeat from core client for 30 sec - exiting
01:32:51 (5492): No heartbeat from core client for 30 sec - exiting
01:32:52 (5492): No heartbeat from core client for 30 sec - exiting
01:32:53 (5492): No heartbeat from core client for 30 sec - exiting
01:32:54 (5492): No heartbeat from core client for 30 sec - exiting
01:32:55 (5492): No heartbeat from core client for 30 sec - exiting
01:32:56 (5492): No heartbeat from core client for 30 sec - exiting
01:32:57 (5492): No heartbeat from core client for 30 sec - exiting
01:32:58 (5492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:08:02 (4528): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:42:47 (7000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:49:13 (8036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:54:02 (2680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:57:16 (8128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:09:11 (2680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:17:41 (7904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3872, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6720, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5820, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5820, iMonCtr=1
Model crash detected, will try to restart...
03:33:03 (6960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6312, iMonCtr=1
Model crash detected, will try to restart...
17:41:53 (6852): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
21 Feb 2014 19:20:36	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	570,240	1,061,525	1.8615
21 Feb 2014 08:32:16	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	544,320	1,020,959	1.8757
20 Feb 2014 19:15:27	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	518,400	976,704	1.8841
20 Feb 2014 07:16:28	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	492,480	933,652	1.8958
14 Feb 2014 16:22:46	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	466,560	890,717	1.9091
14 Feb 2014 04:20:13	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	440,640	849,446	1.9278
11 Feb 2014 01:44:40	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	414,720	802,142	1.9342
07 Feb 2014 01:24:01	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	388,800	752,779	1.9362
05 Feb 2014 09:31:04	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	362,880	701,793	1.9340
04 Feb 2014 10:09:18	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	336,960	654,247	1.9416
31 Jan 2014 10:29:28	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	311,040	602,666	1.9376
31 Jan 2014 07:28:28	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	285,120	549,039	1.9256
28 Jan 2014 08:16:18	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	259,200	494,056	1.9061
24 Jan 2014 11:26:33	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	233,280	438,232	1.8786
24 Jan 2014 11:26:33	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	207,360	382,824	1.8462
17 Jan 2014 03:50:58	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	181,440	327,990	1.8077
15 Jan 2014 06:48:24	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	155,520	272,758	1.7538
10 Jan 2014 11:38:11	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	129,600	222,249	1.7149
10 Jan 2014 02:30:35	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	103,680	182,308	1.7584
08 Jan 2014 12:42:05	1122348	16202668	hadcm3n_n3an_1880_40_008403560_3	77,760	138,904	1.7863
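
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` and the tuple layout are illustrative assumptions, not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# The tuple layout is an assumption made for this illustration.
rows = [
    (570_240, 1_061_525, 1.8615),  # 21 Feb 2014 19:20:36
    (544_320, 1_020_959, 1.8757),  # 21 Feb 2014 08:32:16
    (77_760, 138_904, 1.7863),     # 08 Jan 2014 12:42:05
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = round(seconds_per_timestep(cpu_time_sec, timestep), 4)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For these three rows the computed value matches the reported average to four decimal places, which is consistent with the column being a running average rather than a per-trickle rate.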