Name | hadcm3n_3loe_2020_40_008393288_4 |
Workunit | 8544147 |
Created | 27 Jun 2013, 6:55:37 UTC |
Sent | 27 Jun 2013, 8:08:47 UTC |
Report deadline | 26 Sep 2013, 15:35:58 UTC |
Received | 27 Sep 2013, 13:57:52 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 963814 |
Run time | 12 days 9 hours 36 min 43 sec |
CPU time | 11 days 22 hours 1 min 6 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 3.03 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> (unknown error) - exit code 193 (0xc1) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... 10:04:42 (4648): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7828, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5872, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5872, iMonCtr=1 Model crash detected, will try to restart... 12:08:31 (5924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6520, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2324, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5708, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5708, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 11:08:20 (2820): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2432, iMonCtr=1 Model crash detected, will try to restart... 09:14:46 (5960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:26:39 (5168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:49:52 (4888): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 13:14:58 (4868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:37:08 (5652): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 09:39:44 (2844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:47:52 (3820): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:48:06 (5528): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:39:45 (4516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:30:50 (3856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:04:56 (5892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:05:47 (4744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6424, iMonCtr=1 Model crash detected, will try to restart... 09:14:40 (3476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:13:01 (3736): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:28:23 (3236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:32:38 (3644): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:06:44 (5104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:47:39 (3512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4008, iMonCtr=1 Model crash detected, will try to restart... 09:10:50 (3544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:00:21 (3648): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:09:26 (2532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6336, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 07:57:42 (3316): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:40:43 (4752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6732, iMonCtr=1 Model crash detected, will try to restart... 09:14:18 (4684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2208, iMonCtr=1 Model crash detected, will try to restart... 08:54:25 (4732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:56:51 (5828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:32:00 (4900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:34:08 (4464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:45:34 (3884): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1 Model crash detected, will try to restart... 08:51:21 (3856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:59:31 (4004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:02:27 (3792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:00:52 (5900): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:45:46 (6416): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5956, iMonCtr=1 Model crash detected, will try to restart... 08:54:37 (3540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:47:19 (7584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:16:38 (3916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:52:43 (3664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:20:41 (4600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:13:06 (2696): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... 09:59:01 (3864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:55:18 (3544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:21:54 (4828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:58:02 (5660): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:00:38 (3800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:10:27 (4328): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:46:55 (7080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:58:15 (5156): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received | ||||||
---|---|---|---|---|---|---|
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
27 Sep 2013 13:59:41 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 777,600 | 1,029,657 | 1.3241 |
26 Sep 2013 13:09:26 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 751,680 | 998,105 | 1.3278 |
20 Sep 2013 12:30:42 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 725,760 | 965,732 | 1.3306 |
18 Sep 2013 08:07:51 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 699,840 | 932,515 | 1.3325 |
18 Sep 2013 08:07:51 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 699,840 | 932,515 | 1.3325 |
16 Sep 2013 11:07:42 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 673,920 | 899,544 | 1.3348 |
12 Sep 2013 14:51:35 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 648,000 | 866,249 | 1.3368 |
10 Sep 2013 13:40:51 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 622,080 | 833,348 | 1.3396 |
07 Sep 2013 13:05:10 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 596,160 | 800,272 | 1.3424 |
06 Sep 2013 12:58:25 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 570,240 | 767,403 | 1.3458 |
05 Sep 2013 11:48:53 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 544,320 | 734,429 | 1.3493 |
04 Sep 2013 10:22:21 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 518,400 | 700,837 | 1.3519 |
03 Sep 2013 08:45:29 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 492,480 | 667,345 | 1.3551 |
19 Aug 2013 14:30:50 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 466,560 | 634,016 | 1.3589 |
16 Aug 2013 13:00:28 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 440,640 | 600,499 | 1.3628 |
15 Aug 2013 11:21:25 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 414,720 | 567,078 | 1.3674 |
14 Aug 2013 16:08:22 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 388,800 | 533,843 | 1.3731 |
14 Aug 2013 16:08:22 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 362,880 | 500,687 | 1.3798 |
14 Aug 2013 16:08:22 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 336,960 | 467,063 | 1.3861 |
14 Aug 2013 16:08:22 | 963814 | 15869128 | hadcm3n_3loe_2020_40_008393288_4 | 311,040 | 433,631 | 1.3941 |
©2025 cpdn.org