Name | hadcm3n_88s8_1980_40_008720803_0 |
Workunit | 8866781 |
Created | 23 Apr 2014, 12:27:37 UTC |
Sent | 5 May 2014, 15:58:26 UTC |
Report deadline | 4 Aug 2014, 23:25:37 UTC |
Received | 29 May 2014, 21:10:24 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1325699 |
Run time | 11 days 10 hours 51 min 19 sec |
CPU time | 10 days 8 hours 41 min 33 sec |
Validate state | Invalid |
Credit | 7,153.92 |
Device peak FLOPS | 3.30 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.42</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... 12:17:34 (4052): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 00:20:08 (6572): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:52:01 (1296): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 04:47:30 (6368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:17:27 (6872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 12:06:16 (5212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:15:40 (1812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:56:41 (3368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:59:42 (7068): No heartbeat from core client for 30 sec - exiting 12:59:43 (7068): No heartbeat from core client for 30 sec - exiting 12:59:44 (7068): No heartbeat from core client for 30 sec - exiting 12:59:45 (7068): No heartbeat from core client for 30 sec - exiting 12:59:46 (7068): No heartbeat from core client for 30 sec - exiting 12:59:47 (7068): No heartbeat from core client for 30 sec - exiting 12:59:48 (7068): No heartbeat from core client for 30 sec - exiting 12:59:49 (7068): No heartbeat from core client for 30 sec - exiting 12:59:50 (7068): No heartbeat from core client for 30 sec - exiting 12:59:51 (7068): No heartbeat from core client for 30 sec - exiting 12:59:52 (7068): No heartbeat from core client for 30 sec - exiting 12:59:53 (7068): No heartbeat from core client for 30 sec - exiting 12:59:54 (7068): No heartbeat from core client for 30 sec - exiting 12:59:55 (7068): No heartbeat from core client for 30 sec - exiting 12:59:56 (7068): No heartbeat from core client for 30 sec - exiting 12:59:57 (7068): No heartbeat from core client for 30 sec - exiting 12:59:58 (7068): No heartbeat from core client for 30 sec - exiting 12:59:59 (7068): No heartbeat from core client for 30 sec - exiting 13:00:00 (7068): No heartbeat from core client for 30 sec - exiting 13:00:01 (7068): No heartbeat from core client for 30 sec - exiting 13:00:02 (7068): No heartbeat from core client for 30 sec - exiting 13:00:03 (7068): No heartbeat from core client for 30 sec - exiting 13:00:04 (7068): No heartbeat from core client for 30 sec - exiting 13:00:05 (7068): No heartbeat from core client for 30 sec - exiting 13:00:06 (7068): No heartbeat from core client for 30 sec - exiting 13:00:07 (7068): No heartbeat from core client for 30 sec - exiting 13:00:08 (7068): No heartbeat from core client for 30 sec - exiting 13:00:09 (7068): No heartbeat from core client for 30 sec - exiting 13:00:10 (7068): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 
'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:02:16 (7052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 01:09:09 (5956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:32:41 (5264): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:11:13 (1284): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:15:09 (6592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... forrtl: Access is denied. 19:25:10 (5112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:16:45 (5484): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... forrtl: Access is denied. Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
18:32:43 (4500): No heartbeat from core client for 30 sec - exiting 18:32:45 (4500): No heartbeat from core client for 30 sec - exiting 18:32:46 (4500): No heartbeat from core client for 30 sec - exiting 18:32:47 (4500): No heartbeat from core client for 30 sec - exiting 18:32:48 (4500): No heartbeat from core client for 30 sec - exiting 18:32:49 (4500): No heartbeat from core client for 30 sec - exiting 18:32:50 (4500): No heartbeat from core client for 30 sec - exiting 18:32:51 (4500): No heartbeat from core client for 30 sec - exiting 18:32:52 (4500): No heartbeat from core client for 30 sec - exiting 18:32:53 (4500): No heartbeat from core client for 30 sec - exiting 18:32:54 (4500): No heartbeat from core client for 30 sec - exiting 18:32:55 (4500): No heartbeat from core client for 30 sec - exiting 18:32:57 (4500): No heartbeat from core client for 30 sec - exiting 18:32:58 (4500): No heartbeat from core client for 30 sec - exiting 18:32:59 (4500): No heartbeat from core client for 30 sec - exiting 18:33:00 (4500): No heartbeat from core client for 30 sec - exiting 18:33:01 (4500): No heartbeat from core client for 30 sec - exiting 18:33:02 (4500): No heartbeat from core client for 30 sec - exiting 18:33:03 (4500): No heartbeat from core client for 30 sec - exiting 18:33:04 (4500): No heartbeat from core client for 30 sec - exiting 18:33:05 (4500): No heartbeat from core client for 30 sec - exiting 18:33:06 (4500): No heartbeat from core client for 30 sec - exiting 18:33:07 (4500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:33:09 (4500): No heartbeat from core client for 30 sec - exiting 03:52:08 (6516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
12:14:23 (7016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:11:57 (6504): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:27:24 (6888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:28:25 (4432): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:15:38 (5820): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:16:07 (4624): Can't acquire lockfile (32) - waiting 35s 17:49:23 (4624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:39:21 (3084): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:06:50 (6312): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:11:01 (2040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
20:29:28 (5176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 03:26:57 (4284): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:21:45 (2864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:22:11 (6556): Can't acquire lockfile (32) - waiting 35s 07:11:41 (6556): No heartbeat from core client for 30 sec - exiting 07:11:42 (6556): No heartbeat from core client for 30 sec - exiting 07:11:43 (6556): No heartbeat from core client for 30 sec - exiting 07:11:44 (6556): No heartbeat from core client for 30 sec - exiting 07:11:45 (6556): No heartbeat from core client for 30 sec - exiting 07:11:46 (6556): No heartbeat from core client for 30 sec - exiting 07:11:47 (6556): No heartbeat from core client for 30 sec - exiting 07:11:48 (6556): No heartbeat from core client for 30 sec - exiting 07:11:49 (6556): No heartbeat from core client for 30 sec - exiting 07:11:50 (6556): No heartbeat from core client for 30 sec - exiting 07:11:51 (6556): No heartbeat from core client for 30 sec - exiting 07:11:53 (6556): No heartbeat from core client for 30 sec - exiting 07:11:54 (6556): No heartbeat from core client for 30 sec - exiting 07:11:55 (6556): No heartbeat from core client for 30 sec - exiting 07:11:56 (6556): No heartbeat from core client for 30 sec - exiting 07:11:57 (6556): No heartbeat from core client for 30 sec - exiting 07:11:58 (6556): No heartbeat from core client for 30 sec - exiting 07:11:59 (6556): No heartbeat from core client for 30 sec - exiting 07:12:00 (6556): No heartbeat from core client for 30 sec - exiting 07:12:01 (6556): No heartbeat from core client for 30 sec - exiting 07:12:02 (6556): No heartbeat from core client for 30 sec - exiting 07:12:03 (6556): No heartbeat from core client for 30 sec - exiting 07:12:05 (6556): No heartbeat from core client for 30 sec - 
exiting 07:12:06 (6556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6768, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
28 May 2014 10:27:22 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 596,160 | 874,911 | 1.4676 |
27 May 2014 21:19:10 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 570,240 | 834,865 | 1.4641 |
26 May 2014 04:07:11 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 544,320 | 800,281 | 1.4702 |
24 May 2014 22:44:30 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 518,400 | 764,098 | 1.4740 |
24 May 2014 10:06:47 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 492,480 | 718,830 | 1.4596 |
23 May 2014 21:03:57 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 466,560 | 672,115 | 1.4406 |
23 May 2014 01:39:07 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 440,640 | 639,905 | 1.4522 |
22 May 2014 10:25:27 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 414,720 | 592,456 | 1.4286 |
21 May 2014 20:24:49 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 388,800 | 549,815 | 1.4141 |
21 May 2014 03:44:50 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 362,880 | 515,413 | 1.4203 |
20 May 2014 14:58:29 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 336,960 | 484,248 | 1.4371 |
18 May 2014 07:56:20 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 311,040 | 451,253 | 1.4508 |
17 May 2014 18:40:41 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 285,120 | 419,508 | 1.4713 |
17 May 2014 07:11:26 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 259,200 | 384,813 | 1.4846 |
16 May 2014 21:24:46 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 233,280 | 350,127 | 1.5009 |
16 May 2014 06:04:38 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 207,360 | 314,168 | 1.5151 |
14 May 2014 07:31:54 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 181,440 | 283,183 | 1.5608 |
13 May 2014 07:38:18 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 155,520 | 234,167 | 1.5057 |
12 May 2014 17:47:01 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 129,600 | 192,423 | 1.4847 |
12 May 2014 07:37:28 | 1325699 | 16585594 | hadcm3n_88s8_1980_40_008720803_0 | 103,680 | 160,641 | 1.5494 |
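
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` and the `ROWS` list are illustrative only and are not part of BOINC or the CPDN software.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals."""
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) taken from the table above.
ROWS = [
    (596_160, 874_911, 1.4676),
    (570_240, 834_865, 1.4641),
    (103_680, 160_641, 1.5494),
]

for timesteps, cpu_seconds, reported in ROWS:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>7} TS  {cpu_seconds:>7} s  "
          f"computed={computed:.4f}  reported={reported:.4f}")
```

In these rows consecutive trickles are 25,920 timesteps apart, so the same function applied to the differences between adjacent rows would give the per-interval rate rather than the running average.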