Name | hadsm3dhet2_ju4l_006602039_1 |
Workunit | 6805412 |
Created | 15 Mar 2010, 12:09:54 UTC |
Sent | 10 Jun 2010, 22:23:34 UTC |
Report deadline | 24 May 2011, 3:43:34 UTC |
Received | 14 Jul 2010, 9:41:47 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 967683 |
Run time | 4 days 8 hours 13 min 46 sec |
CPU time | 3 days 23 hours 29 min 26 sec |
Validate state | Invalid |
Credit | 2,084.11 |
Device peak FLOPS | 0.82 GFLOPS |
Application version | UK Met Office HadSM3 Slab Model v6.07 windows_intelx86 |
Stderr | <core_client_version>6.6.20</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No 
heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6032, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. 
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( called boinc_finish </stderr_txt> ]]> |
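The Stderr field above is one long run of repeated messages, so the failure pattern is easier to read as a tally. The following is a minimal sketch, assuming the stderr text has been copied into a Python string; the function name `tally_messages` and the `PATTERNS` list are illustrative and not part of BOINC or CPDN. The sample string is a short excerpt assembled from messages in the log, not the full log.

```python
# Message fragments copied from the Stderr field above.
PATTERNS = [
    "Quit request from BOINC",
    "No heartbeat from core client for 30 sec - exiting",
    "Model crash detected, will try to restart",
    "forrtl: Access is denied.",
]


def tally_messages(stderr_text, patterns=PATTERNS):
    """Count non-overlapping occurrences of each known message fragment."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    normalized = " ".join(stderr_text.split())
    return {pattern: normalized.count(pattern) for pattern in patterns}


# Short excerpt for illustration only (hypothetical ordering of real messages).
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "No heartbeat from core client for 30 sec - exiting "
    "Model crash detected, will try to restart... "
    "forrtl: Access is denied. "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-("
)

for message, count in tally_messages(sample).items():
    print(f"{count:3d}  {message}")
```

Running the same function on the full stderr text would show how many heartbeat losses preceded the repeated model crashes and the final `boinc_finish` call.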
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Jul 2010 09:01:36 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 226,842 | 334,377 | 1.4741 |
08 Jul 2010 02:11:06 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 216,040 | 302,922 | 1.4022 |
28 Jun 2010 07:54:36 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 205,238 | 289,819 | 1.4121 |
27 Jun 2010 19:15:26 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 194,436 | 270,260 | 1.3900 |
27 Jun 2010 07:25:15 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 183,634 | 254,644 | 1.3867 |
26 Jun 2010 16:20:27 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 172,832 | 241,380 | 1.3966 |
26 Jun 2010 00:23:37 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 162,030 | 228,603 | 1.4109 |
24 Jun 2010 19:28:22 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 151,228 | 215,018 | 1.4218 |
23 Jun 2010 01:59:53 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 140,426 | 202,924 | 1.4451 |
22 Jun 2010 03:09:32 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 129,624 | 187,490 | 1.4464 |
21 Jun 2010 14:08:46 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 118,822 | 172,386 | 1.4508 |
20 Jun 2010 10:26:48 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 108,020 | 142,281 | 1.3172 |
19 Jun 2010 16:50:00 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 97,218 | 113,450 | 1.1670 |
19 Jun 2010 09:03:51 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 86,416 | 100,175 | 1.1592 |
19 Jun 2010 02:11:05 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 75,614 | 86,572 | 1.1449 |
18 Jun 2010 08:39:58 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 64,812 | 74,061 | 1.1427 |
16 Jun 2010 02:47:27 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 54,010 | 62,340 | 1.1542 |
15 Jun 2010 01:45:43 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 43,208 | 51,517 | 1.1923 |
13 Jun 2010 21:58:38 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 32,406 | 35,749 | 1.1032 |
12 Jun 2010 08:41:33 | 967683 | 11082120 | hadsm3dhet2_ju4l_006602039_1 | 21,604 | 23,025 | 1.0658 |
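
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. That relationship is inferred from the numbers in the table and is not stated by the page itself. A minimal sketch that checks it against three rows copied from the table (the function name `average_sec_per_timestep` is illustrative):

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, average reported by the page)
trickles = [
    (226_842, 334_377, 1.4741),
    (216_040, 302_922, 1.4022),
    (21_604, 23_025, 1.0658),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per timestep, rounded to 4 decimals.

    Assumption: this is how the page derives its Average (sec/TS) column.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

Because the figure is cumulative, the rise from about 1.07 to about 1.47 sec/TS over the run reflects the later timesteps taking noticeably longer than the early ones.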
©2024 cpdn.org