Name | hadcm3n_3cik_2020_40_008365819_0 |
Workunit | 8516678 |
Created | 11 May 2013, 2:44:59 UTC |
Sent | 11 May 2013, 2:52:32 UTC |
Report deadline | 10 Aug 2013, 10:19:43 UTC |
Received | 2 Jun 2013, 7:29:56 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1158176 |
Run time | 21 days 10 hours 43 min 46 sec |
CPU time | 19 days 15 hours 55 min 43 sec |
Validate state | Invalid |
Credit | 9,642.24 |
Device peak FLOPS | 2.83 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10104, iMonCtr=1 Model crash detected, will try to restart... 06:06:09 (13972): No heartbeat from core client for 30 sec - exiting 06:06:10 (13972): No heartbeat from core client for 30 sec - exiting 06:06:11 (13972): No heartbeat from core client for 30 sec - exiting 06:06:12 (13972): No heartbeat from core client for 30 sec - exiting 06:06:13 (13972): No heartbeat from core client for 30 sec - exiting 06:06:14 (13972): No heartbeat from core client for 30 sec - exiting 06:06:15 (13972): No heartbeat from core client for 30 sec - exiting 06:06:16 (13972): No heartbeat from core client for 30 sec - exiting 06:06:17 (13972): No heartbeat from core client for 30 sec - exiting 06:06:18 (13972): No heartbeat from core client for 30 sec - exiting 06:06:19 (13972): No heartbeat from core client for 30 sec - exiting 06:06:20 (13972): No heartbeat from core client for 30 sec - exiting 06:06:21 (13972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:07:49 (1188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:00:56 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:01:57 (2456): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:08:22 (5540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=468, iMonCtr=1 Model crash detected, will try to restart... 
06:24:11 (5368): No heartbeat from core client for 30 sec - exiting 06:24:12 (5368): No heartbeat from core client for 30 sec - exiting 06:24:13 (5368): No heartbeat from core client for 30 sec - exiting 06:24:14 (5368): No heartbeat from core client for 30 sec - exiting 06:24:15 (5368): No heartbeat from core client for 30 sec - exiting 06:24:16 (5368): No heartbeat from core client for 30 sec - exiting 06:24:17 (5368): No heartbeat from core client for 30 sec - exiting 06:24:18 (5368): No heartbeat from core client for 30 sec - exiting 06:24:19 (5368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:25:24 (4420): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:26:36 (6580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:28:19 (5940): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:35:30 (1816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:47:42 (11972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:09:00 (23868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:16:43 (5256): No heartbeat from core client for 30 sec - exiting 22:16:44 (5256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:23:21 (6072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:25:44 (2904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:52:55 (6708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
03:21:51 (7780): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:24:40 (6324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:28:35 (7212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:28:36 (7212): No heartbeat from core client for 30 sec - exiting 05:28:37 (7212): No heartbeat from core client for 30 sec - exiting 05:28:38 (7212): No heartbeat from core client for 30 sec - exiting 05:28:39 (7212): No heartbeat from core client for 30 sec - exiting 05:28:40 (7212): No heartbeat from core client for 30 sec - exiting 05:28:41 (7212): No heartbeat from core client for 30 sec - exiting 05:28:43 (7212): No heartbeat from core client for 30 sec - exiting 05:28:44 (7212): No heartbeat from core client for 30 sec - exiting 05:28:45 (7212): No heartbeat from core client for 30 sec - exiting 05:28:46 (7212): No heartbeat from core client for 30 sec - exiting 06:08:59 (6192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:14:39 (1040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:01:42 (6616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:48:51 (6412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:38:37 (4164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:55:25 (5196): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:17:59 (16924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:20:38 (7728): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:17:26 (22408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:18:28 (43788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:20:45 (43872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:20:46 (43872): No heartbeat from core client for 30 sec - exiting 19:20:47 (43872): No heartbeat from core client for 30 sec - exiting 19:20:48 (43872): No heartbeat from core client for 30 sec - exiting 19:20:49 (43872): No heartbeat from core client for 30 sec - exiting 19:20:50 (43872): No heartbeat from core client for 30 sec - exiting 19:20:51 (43872): No heartbeat from core client for 30 sec - exiting 19:20:52 (43872): No heartbeat from core client for 30 sec - exiting 19:20:53 (43872): No heartbeat from core client for 30 sec - exiting 19:20:54 (43872): No heartbeat from core client for 30 sec - exiting 19:20:55 (43872): No heartbeat from core client for 30 sec - exiting 23:58:41 (43064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1 Model crash detected, will try to restart... 20:49:47 (4700): No heartbeat from core client for 30 sec - exiting 20:49:48 (4700): No heartbeat from core client for 30 sec - exiting 20:49:49 (4700): No heartbeat from core client for 30 sec - exiting 20:49:50 (4700): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=788, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=788, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=788, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=788, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
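The run time and CPU time above are given as day/hour/minute/second strings, which makes them awkward to compare. A minimal sketch of converting them to seconds follows; the helper name `duration_to_seconds` and the unit table are illustrative assumptions, not part of BOINC or the CPDN software.

```python
import re

# Seconds per unit; keys match the unit words used on this page (assumed format).
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text):
    """Convert a string such as '21 days 10 hours 43 min 46 sec' to seconds."""
    total = 0
    # Each match is (number, unit stem); a trailing plural 's' is simply skipped.
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

run_time = duration_to_seconds("21 days 10 hours 43 min 46 sec")
cpu_time = duration_to_seconds("19 days 15 hours 55 min 43 sec")

# Fraction of wall-clock run time that was spent on the CPU.
cpu_fraction = cpu_time / run_time
print(run_time, cpu_time, round(cpu_fraction, 4))
```

With the values reported for this task, the CPU was busy for roughly 92% of the recorded run time.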
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Jun 2013 08:03:27 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 803,520 | 1,651,328 | 2.0551 |
31 May 2013 11:02:09 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 777,600 | 1,584,469 | 2.0376 |
30 May 2013 14:27:42 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 751,680 | 1,522,082 | 2.0249 |
29 May 2013 20:44:44 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 725,760 | 1,471,269 | 2.0272 |
29 May 2013 05:06:38 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 699,840 | 1,421,102 | 2.0306 |
28 May 2013 14:25:50 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 673,920 | 1,372,082 | 2.0360 |
27 May 2013 23:09:53 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 648,000 | 1,320,004 | 2.0370 |
27 May 2013 09:03:58 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 622,080 | 1,271,478 | 2.0439 |
26 May 2013 20:07:31 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 596,160 | 1,223,898 | 2.0530 |
26 May 2013 05:54:17 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 570,240 | 1,177,399 | 2.0647 |
25 May 2013 16:29:52 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 544,320 | 1,131,042 | 2.0779 |
25 May 2013 02:29:52 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 518,400 | 1,083,060 | 2.0892 |
24 May 2013 12:34:05 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 492,480 | 1,036,434 | 2.1045 |
23 May 2013 23:24:38 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 466,560 | 991,418 | 2.1250 |
23 May 2013 10:03:31 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 440,640 | 945,405 | 2.1455 |
22 May 2013 18:59:56 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 414,720 | 896,189 | 2.1609 |
22 May 2013 00:15:01 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 388,800 | 850,211 | 2.1868 |
21 May 2013 08:17:47 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 362,880 | 797,670 | 2.1982 |
20 May 2013 15:38:31 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 336,960 | 743,853 | 2.2075 |
20 May 2013 00:07:52 | 1158176 | 15775893 | hadcm3n_3cik_2020_40_008365819_0 | 311,040 | 691,735 | 2.2239 |
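The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the numbers in the table, not stated by the page. A short sketch checking it against three rows copied from the table (the function name is hypothetical):

```python
# (timestep, cpu_time_sec, reported_average) copied from rows of the trickle table.
ROWS = [
    (803_520, 1_651_328, 2.0551),  # 01 Jun 2013
    (777_600, 1_584_469, 2.0376),  # 31 May 2013
    (311_040, 691_735, 2.2239),    # 20 May 2013
]

def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_time_sec / timestep, 4)

computed = [average_sec_per_timestep(cpu, ts) for ts, cpu, _ in ROWS]
reported = [avg for _, _, avg in ROWS]
print(computed == reported)

# Timesteps between the two most recent trickles listed.
step_between_trickles = ROWS[0][0] - ROWS[1][0]
print(step_between_trickles)
```

On these rows the computed averages match the reported ones, and the two most recent trickles are 25,920 timesteps apart.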
©2024 climateprediction.net