Name | hadcm3n_zmss_1960_40_008364987_0 |
Workunit | 8515846 |
Created | 10 May 2013, 22:59:11 UTC |
Sent | 10 May 2013, 23:01:20 UTC |
Report deadline | 10 Aug 2013, 6:28:31 UTC |
Received | 1 Jun 2013, 18:40:14 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1203942 |
Run time | 4 days 16 hours 43 min 19 sec |
CPU time | 4 days 9 hours 34 min 1 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.62 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> (unknown error) - exit code 193 (0xc1) </message> <stderr_txt> 04:58:34 (4932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 04:58:55 (4932): No heartbeat from core client for 30 sec - exiting 04:58:57 (4932): No heartbeat from core client for 30 sec - exiting 04:58:58 (4932): No heartbeat from core client for 30 sec - exiting 04:59:00 (4932): No heartbeat from core client for 30 sec - exiting 04:59:01 (4932): No heartbeat from core client for 30 sec - exiting 05:11:44 (5732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:25:03 (4140): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:25:11 (4140): No heartbeat from core client for 30 sec - exiting 05:25:13 (4140): No heartbeat from core client for 30 sec - exiting 05:25:14 (4140): No heartbeat from core client for 30 sec - exiting 05:25:15 (4140): No heartbeat from core client for 30 sec - exiting 05:25:16 (4140): No heartbeat from core client for 30 sec - exiting 05:25:17 (4140): No heartbeat from core client for 30 sec - exiting 05:25:18 (4140): No heartbeat from core client for 30 sec - exiting 05:25:19 (4140): No heartbeat from core client for 30 sec - exiting 17:04:59 (5072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:05:06 (5072): No heartbeat from core client for 30 sec - exiting 17:05:07 (5072): No heartbeat from core client for 30 sec - exiting 17:05:08 (5072): No heartbeat from core client for 30 sec - exiting 17:05:09 (5072): No heartbeat from core client for 30 sec - exiting 17:05:10 (5072): No heartbeat from core client for 30 sec - exiting 17:05:11 (5072): No heartbeat from core client for 30 sec - exiting 17:05:12 (5072): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2056, iMonCtr=1 Model crash detected, will try to restart... 05:50:45 (5564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:50:55 (5564): No heartbeat from core client for 30 sec - exiting 05:50:56 (5564): No heartbeat from core client for 30 sec - exiting 05:50:57 (5564): No heartbeat from core client for 30 sec - exiting 05:50:58 (5564): No heartbeat from core client for 30 sec - exiting 05:50:59 (5564): No heartbeat from core client for 30 sec - exiting 05:51:01 (5564): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 06:02:55 (6216): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:12:56 (6668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:20:16 (9004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=816, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1456, iMonCtr=1 Model crash detected, will try to restart... 17:06:12 (516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:06:14 (516): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2284, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 18:32:49 (5112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:32:51 (5112): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5236, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5380, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4764, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4544, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4352, iMonCtr=1 Model crash detected, will try to restart... 15:56:05 (3424): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:56:07 (3424): No heartbeat from core client for 30 sec - exiting 15:56:08 (3424): No heartbeat from core client for 30 sec - exiting 16:02:16 (2516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4388, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3480, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10816, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 04:46:48 (2628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1140, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2344, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 20:37:43 (444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1 Model crash detected, will try to restart... 13:05:24 (9916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:05:26 (9916): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2992, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CSuspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Jun 2013 18:41:42 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 259,200 | 380,039 | 1.4662 |
29 May 2013 00:30:02 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 233,280 | 342,262 | 1.4672 |
27 May 2013 02:16:41 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 207,360 | 305,371 | 1.4727 |
26 May 2013 15:42:11 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 181,440 | 268,886 | 1.4820 |
25 May 2013 09:02:14 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 155,520 | 230,412 | 1.4816 |
20 May 2013 09:31:10 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 129,600 | 193,033 | 1.4895 |
19 May 2013 03:00:08 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 103,680 | 155,585 | 1.5006 |
16 May 2013 23:56:53 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 77,760 | 117,561 | 1.5118 |
16 May 2013 00:34:17 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 51,840 | 77,702 | 1.4989 |
12 May 2013 19:41:19 | 1203942 | 15774690 | hadcm3n_zmss_1960_40_008364987_0 | 25,920 | 38,378 | 1.4806 |
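
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. This relation is inferred from the values in the table and is not stated by the page itself. A minimal sketch that reproduces two rows, where the function name `seconds_per_timestep` is hypothetical:

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page computes Average (sec/TS) as CPU time / timestep.
    """
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds) pairs copied from the trickle table
trickles = [
    (259_200, 380_039),  # 01 Jun 2013 row
    (25_920, 38_378),    # 12 May 2013 row
]

for timesteps, cpu_seconds in trickles:
    print(timesteps, seconds_per_timestep(cpu_seconds, timesteps))
```

Under that assumption the results match the table's 1.4662 and 1.4806 for those rows.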
©2024 cpdn.org