Name | hadcm3n_yl4h_1900_40_007360235_0 |
Workunit | 7557665 |
Created | 6 Jul 2011, 15:11:23 UTC |
Sent | 7 Jul 2011, 19:34:09 UTC |
Report deadline | 7 Oct 2011, 3:01:20 UTC |
Received | 16 Aug 2011, 20:00:24 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 25 (0x00000019) Unknown error code |
Computer ID | 1376550 |
Run time | 34 days 16 hours 59 min 33 sec |
CPU time | 33 days 20 hours 13 min 45 sec |
Validate state | Invalid |
Credit | 11,197.44 |
Device peak FLOPS | 1.67 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2676, iMonCtr=1
Model crash detected, will try to restart...
06:44:39 (4792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:36:49 (3672): Can't acquire lockfile (32) - waiting 35s
19:37:01 (6316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1
Model crash detected, will try to restart...
12:36:19 (6936): No heartbeat from core client for 30 sec - exiting
12:36:20 (6936): No heartbeat from core client for 30 sec - exiting
12:36:21 (6936): No heartbeat from core client for 30 sec - exiting
12:36:22 (6936): No heartbeat from core client for 30 sec - exiting
12:36:23 (6936): No heartbeat from core client for 30 sec - exiting
12:36:24 (6936): No heartbeat from core client for 30 sec - exiting
12:36:25 (6936): No heartbeat from core client for 30 sec - exiting
12:36:26 (6936): No heartbeat from core client for 30 sec - exiting
12:36:27 (6936): No heartbeat from core client for 30 sec - exiting
12:36:29 (6936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6928, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
06:39:16 (6348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:39:18 (6348): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6588, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1
Model crash detected, will try to restart...
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
16 Aug 2011 10:32:21 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 933,120 | 2,914,071 | 3.1229 |
15 Aug 2011 09:58:55 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 907,200 | 2,827,226 | 3.1164 |
10 Aug 2011 11:46:44 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 881,280 | 2,744,129 | 3.1138 |
09 Aug 2011 13:43:10 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 855,360 | 2,665,268 | 3.1160 |
08 Aug 2011 15:27:56 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 829,440 | 2,586,039 | 3.1178 |
07 Aug 2011 17:37:39 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 803,520 | 2,506,772 | 3.1197 |
06 Aug 2011 19:07:54 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 777,600 | 2,427,926 | 3.1223 |
05 Aug 2011 19:39:44 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 751,680 | 2,361,203 | 3.1412 |
04 Aug 2011 19:35:17 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 725,760 | 2,275,044 | 3.1347 |
03 Aug 2011 18:55:52 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 699,840 | 2,187,385 | 3.1256 |
02 Aug 2011 18:17:48 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 673,920 | 2,100,177 | 3.1164 |
01 Aug 2011 17:59:12 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 648,000 | 2,015,354 | 3.1101 |
31 Jul 2011 15:04:09 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 622,080 | 1,927,176 | 3.0980 |
30 Jul 2011 17:43:01 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 596,160 | 1,848,387 | 3.1005 |
29 Jul 2011 18:45:41 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 570,240 | 1,769,032 | 3.1023 |
28 Jul 2011 21:10:46 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 544,320 | 1,690,254 | 3.1053 |
27 Jul 2011 22:21:24 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 518,400 | 1,611,731 | 3.1090 |
27 Jul 2011 00:23:25 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 492,480 | 1,533,394 | 3.1136 |
26 Jul 2011 04:14:05 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 466,560 | 1,460,441 | 3.1302 |
25 Jul 2011 22:49:15 | 1119324 | 13124341 | hadcm3n_yl4h_1900_40_007360235_0 | 440,640 | 1,383,035 | 3.1387 |
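
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. This is an inference from the figures in the table, not something the page states. A minimal sketch that checks the inference against three rows copied from the table; the function name `average_sec_per_timestep` is hypothetical:

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
ROWS = [
    (933_120, 2_914_071, 3.1229),  # 16 Aug 2011
    (907_200, 2_827_226, 3.1164),  # 15 Aug 2011
    (440_640, 1_383_035, 3.1387),  # 25 Jul 2011
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in ROWS:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        # Compare with a small tolerance instead of exact float equality.
        status = "matches" if abs(computed - reported) < 1e-9 else "differs"
        print(f"timestep {timestep}: computed {computed}, reported {reported} ({status})")
```

If the assumption holds for the remaining rows as well, the column carries no information beyond the Timestep and CPU Time (sec) columns.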
©2024 cpdn.org