Name | hadcm3n_zl34_1920_40_008244562_1 |
Workunit | 8399686 |
Created | 14 Nov 2012, 19:22:36 UTC |
Sent | 14 Nov 2012, 19:22:52 UTC |
Report deadline | 14 Feb 2013, 2:50:03 UTC |
Received | 1 Jan 2013, 14:49:58 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 25 (0x00000019) Unknown error code |
Computer ID | 1206904 |
Run time | 17 days 5 hours 15 min 13 sec |
CPU time | 13 days 6 hours 16 min 30 sec |
Validate state | Invalid |
Credit | 4,976.64 |
Device peak FLOPS | 2.39 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1 Model crash detected, will try to restart... 15:08:06 (4792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:40:34 (3480): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1 Model crash detected, will try to restart... 11:12:21 (6768): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:13:39 (2620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:02:59 (2136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:03:00 (2136): No heartbeat from core client for 30 sec - exiting 15:03:02 (2136): No heartbeat from core client for 30 sec - exiting 15:03:03 (2136): No heartbeat from core client for 30 sec - exiting 15:03:04 (2136): No heartbeat from core client for 30 sec - exiting 15:04:35 (1756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is15:00:43 (6392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:04:27 (5312): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:10:32 (4900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4180, iMonCtr=1 Model crash detected, will try to restart... 16:08:23 (6000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... C09:19:47 (4904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5724, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1404, iMonCtr=1 Model crash detected, will try to restart... 15:35:43 (6064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5104, iMonCtr=1 Model crash detected, will try to restart... 09:02:45 (6256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1028, iMonCtr=1 Model crash detected, will try to restart... 10:30:02 (5992): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1628, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4264, iMonCtr=1 Model crash detected, will try to restart... 08:56:48 (4408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C08:44:21 (6320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:19:55 (4336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6360, iMonCtr=1 Model crash detected, will try to restart... 09:23:42 (4708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6476, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=1 Model crash detected, will try to restart... 15:54:51 (4816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:51:55 (6300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:43:57 (4876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6804, iMonCtr=1 Model crash detected, will try to restart... 19:35:57 (4260): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3832, iMonCtr=1 Model crash detected, will try to restart... 14:51:01 (4184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:15:43 (6604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4616, iMonCtr=1 Model crash detected, will try to restart... Called boinc_finish </stderr_txt> ]]> |
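The stderr text above repeats three kinds of message: model restarts, missed heartbeats, and suspend requests. A minimal sketch of how such a log could be tallied follows. The function name `tally_events`, the `PATTERNS` table, and the `SAMPLE` excerpt are hypothetical and written for illustration; the regular expressions are assumptions based only on the message wording seen in this log, not on any BOINC or CPDN interface.

```python
import re

# Short excerpt in the same shape as the stderr text above (not the full log).
SAMPLE = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4072, iMonCtr=1 Model crash detected, will try to restart... "
    "15:08:06 (4792): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
)

# Hypothetical patterns, one per message type, derived from the wording in this log.
PATTERNS = {
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "missed_heartbeat": re.compile(
        r"\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client"
    ),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
}


def tally_events(stderr_text: str) -> dict[str, int]:
    """Count how many times each message type occurs in the given text."""
    return {name: len(p.findall(stderr_text)) for name, p in PATTERNS.items()}


print(tally_events(SAMPLE))
```

Some entries in the real log are interleaved or truncated (for example `Controller:: CPDN process is15:00:43`), so counts from a pattern-based tally like this should be treated as approximate.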
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
31 Dec 2012 10:48:52 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 414,720 | 1,102,208 | 2.6577 |
22 Dec 2012 16:10:53 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 388,800 | 1,033,310 | 2.6577 |
17 Dec 2012 18:10:08 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 362,880 | 964,687 | 2.6584 |
15 Dec 2012 12:30:13 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 336,960 | 897,416 | 2.6633 |
14 Dec 2012 16:16:06 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 311,040 | 828,732 | 2.6644 |
14 Dec 2012 16:16:06 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 285,120 | 759,711 | 2.6645 |
14 Dec 2012 16:16:06 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 259,200 | 691,649 | 2.6684 |
06 Dec 2012 11:48:55 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 233,280 | 620,835 | 2.6613 |
03 Dec 2012 13:05:22 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 207,360 | 550,655 | 2.6556 |
01 Dec 2012 14:57:39 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 181,440 | 481,722 | 2.6550 |
29 Nov 2012 16:53:48 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 155,520 | 411,867 | 2.6483 |
28 Nov 2012 08:11:42 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 129,600 | 343,112 | 2.6475 |
26 Nov 2012 14:37:05 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 103,680 | 275,293 | 2.6552 |
24 Nov 2012 12:59:21 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 77,760 | 207,934 | 2.6740 |
21 Nov 2012 15:49:29 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 51,840 | 138,871 | 2.6788 |
18 Nov 2012 16:25:49 | 1206904 | 15435195 | hadcm3n_zl34_1920_40_008244562_1 | 25,920 | 69,711 | 2.6895 |
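
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below recomputes it for three rows copied from the table. The function name `sec_per_timestep` is hypothetical, and rounding to four decimal places is an assumption based on how the values are displayed.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
ROWS = [
    (414_720, 1_102_208, 2.6577),
    (129_600, 343_112, 2.6475),
    (25_920, 69_711, 2.6895),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals (assumed)."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in ROWS:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For these three rows the recomputed values agree with the reported column, which is consistent with the column being a running average rather than a per-trickle rate.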
©2024 climateprediction.net