Name | hadcm3n_4lvf_1940_40_008305935_1 |
Workunit | 8457070 |
Created | 9 May 2013, 15:31:44 UTC |
Sent | 9 May 2013, 15:32:18 UTC |
Report deadline | 8 Aug 2013, 22:59:29 UTC |
Received | 11 Jun 2013, 16:21:06 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1204099 |
Run time | 17 days 12 hours 15 min 19 sec |
CPU time | 17 days 6 hours 21 min 57 sec |
Validate state | Invalid |
Credit | 9,642.24 |
Device peak FLOPS | 2.29 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:20:11 (8492): No heartbeat from core client for 30 sec - exiting 17:20:12 (8492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:31:55 (15664): No heartbeat from core client for 30 sec - exiting 17:31:56 (15664): No heartbeat from core client for 30 sec - exiting 17:31:57 (15664): No heartbeat from core client for 30 sec - exiting 17:31:58 (15664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:30:12 (16752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 17:34:27 (3852): No heartbeat from core client for 30 sec - exiting 17:34:28 (3852): No heartbeat from core client for 30 sec - exiting 17:34:29 (3852): No heartbeat from core client for 30 sec - exiting 17:34:30 (3852): No heartbeat from core client for 30 sec - exiting 17:34:31 (3852): No heartbeat from core client for 30 sec - exiting 17:34:32 (3852): No heartbeat from core client for 30 sec - exiting 17:34:33 (3852): No heartbeat from core client for 30 sec - exiting 17:34:34 (3852): No heartbeat from core client for 30 sec - exiting 17:34:35 (3852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 18:31:25 (13796): No heartbeat from core client for 30 sec - exiting 18:31:26 (13796): No heartbeat from core client for 30 sec - exiting 18:31:27 (13796): No heartbeat from core client for 30 sec - exiting 18:31:28 (13796): No heartbeat from core client for 30 sec - exiting 18:31:29 (13796): No heartbeat from core client for 30 sec - exiting 18:31:30 (13796): No heartbeat from core client for 30 sec - exiting 18:31:31 (13796): No heartbeat from core client for 30 sec - exiting 18:31:32 (13796): No heartbeat from core client for 30 sec - exiting 18:31:33 (13796): No heartbeat from core client for 30 sec - exiting 18:31:34 (13796): No heartbeat from core client for 30 sec - exiting 18:31:35 (13796): No heartbeat from core client for 30 sec - exiting 18:31:36 (13796): No heartbeat from core client for 30 sec - exiting 18:31:37 (13796): No heartbeat from core client for 30 sec - exiting 18:31:38 (13796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
17:28:37 (18816): No heartbeat from core client for 30 sec - exiting 17:28:38 (18816): No heartbeat from core client for 30 sec - exiting 17:28:39 (18816): No heartbeat from core client for 30 sec - exiting 17:28:40 (18816): No heartbeat from core client for 30 sec - exiting 17:28:41 (18816): No heartbeat from core client for 30 sec - exiting 17:28:42 (18816): No heartbeat from core client for 30 sec - exiting 17:28:43 (18816): No heartbeat from core client for 30 sec - exiting 17:28:44 (18816): No heartbeat from core client for 30 sec - exiting 17:28:45 (18816): No heartbeat from core client for 30 sec - exiting 17:28:46 (18816): No heartbeat from core client for 30 sec - exiting 17:28:47 (18816): No heartbeat from core client for 30 sec - exiting 17:28:48 (18816): No heartbeat from core client for 30 sec - exiting 17:28:49 (18816): No heartbeat from core client for 30 sec - exiting 17:28:50 (18816): No heartbeat from core client for 30 sec - exiting 17:28:51 (18816): No heartbeat from core client for 30 sec - exiting 17:28:52 (18816): No heartbeat from core client for 30 sec - exiting 17:28:53 (18816): No heartbeat from core client for 30 sec - exiting 17:28:54 (18816): No heartbeat from core client for 30 sec - exiting 17:28:55 (18816): No heartbeat from core client for 30 sec - exiting 17:28:56 (18816): No heartbeat from core client for 30 sec - exiting 17:28:57 (18816): No heartbeat from core client for 30 sec - exiting 17:28:58 (18816): No heartbeat from core client for 30 sec - exiting 17:28:59 (18816): No heartbeat from core client for 30 sec - exiting 17:29:00 (18816): No heartbeat from core client for 30 sec - exiting 17:29:01 (18816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
17:42:15 (5452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:26:29 (11204): No heartbeat from core client for 30 sec - exiting 17:26:30 (11204): No heartbeat from core client for 30 sec - exiting 17:26:31 (11204): No heartbeat from core client for 30 sec - exiting 17:26:32 (11204): No heartbeat from core client for 30 sec - exiting 17:26:33 (11204): No heartbeat from core client for 30 sec - exiting 17:26:34 (11204): No heartbeat from core client for 30 sec - exiting 17:26:35 (11204): No heartbeat from core client for 30 sec - exiting 17:26:36 (11204): No heartbeat from core client for 30 sec - exiting 17:26:38 (11204): No heartbeat from core client for 30 sec - exiting 17:26:39 (11204): No heartbeat from core client for 30 sec - exiting 17:26:40 (11204): No heartbeat from core client for 30 sec - exiting 17:26:41 (11204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5084, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9184, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8600, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8600, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6692, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3828, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Jun 2013 22:09:39 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 803,520 | 1,458,445 | 1.8151 |
01 Jun 2013 07:48:16 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 777,600 | 1,409,344 | 1.8124 |
31 May 2013 17:52:17 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 751,680 | 1,360,861 | 1.8104 |
29 May 2013 19:27:25 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 725,760 | 1,312,247 | 1.8081 |
28 May 2013 20:37:03 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 699,840 | 1,263,689 | 1.8057 |
27 May 2013 21:53:57 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 673,920 | 1,215,315 | 1.8034 |
26 May 2013 21:44:36 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 648,000 | 1,167,454 | 1.8016 |
26 May 2013 08:24:57 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 622,080 | 1,120,336 | 1.8010 |
25 May 2013 19:30:09 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 596,160 | 1,074,413 | 1.8022 |
25 May 2013 06:31:21 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 570,240 | 1,027,887 | 1.8026 |
24 May 2013 17:30:08 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 544,320 | 981,313 | 1.8028 |
24 May 2013 04:46:31 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 518,400 | 936,133 | 1.8058 |
23 May 2013 16:26:39 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 492,480 | 892,228 | 1.8117 |
22 May 2013 17:55:03 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 466,560 | 844,081 | 1.8092 |
21 May 2013 20:07:03 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 440,640 | 797,924 | 1.8108 |
19 May 2013 22:26:51 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 414,720 | 752,651 | 1.8148 |
19 May 2013 09:52:31 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 388,800 | 707,785 | 1.8204 |
18 May 2013 21:22:41 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 362,880 | 663,111 | 1.8274 |
18 May 2013 00:10:12 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 336,960 | 616,862 | 1.8307 |
17 May 2013 02:37:56 | 1204099 | 15768756 | hadcm3n_4lvf_1940_40_008305935_1 | 311,040 | 572,308 | 1.8400 |
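
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. This is an inference from the numbers in the table, not something the page states. A minimal sketch that checks this reading against three rows copied from the table (the function name `sec_per_timestep` is hypothetical, chosen for illustration):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# Assumption: Average (sec/TS) = CPU Time (sec) / Timestep, shown to 4 decimals.
trickles = [
    (803_520, 1_458_445, "1.8151"),
    (777_600, 1_409_344, "1.8124"),
    (311_040, 572_308, "1.8400"),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> str:
    """Return cumulative CPU seconds per timestep, formatted to 4 decimals."""
    return f"{cpu_time_sec / timestep:.4f}"


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the reported one for comparison.
    print(timestep, computed, reported, computed == reported)
```

Under that assumption the computed values agree with the reported column for these rows, which suggests the column is a running average over the whole task rather than a per-trickle rate.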
©2024 cpdn.org