Name | hadcm3n_z8rr_1880_40_008247540_4 |
Workunit | 8402664 |
Created | 27 Mar 2013, 4:31:20 UTC |
Sent | 27 Mar 2013, 4:31:27 UTC |
Report deadline | 26 Jun 2013, 11:58:38 UTC |
Received | 4 Apr 2013, 6:07:38 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1270025 |
Run time | 7 days 23 hours 46 min 5 sec |
CPU time | 6 days 20 hours 32 min 42 sec |
Validate state | Invalid |
Credit | 4,354.56 |
Device peak FLOPS | 3.25 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
23:50:27 (12068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:52:03 (11436): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:58:44 (13140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:05:50 (12140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:06:37 (11056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:07:09 (12540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:08:13 (14208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:40:08 (14984): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:40:09 (14984): No heartbeat from core client for 30 sec - exiting
14:40:10 (14984): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:32:10 (14904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:32:11 (14904): No heartbeat from core client for 30 sec - exiting
17:32:12 (14904): No heartbeat from core client for 30 sec - exiting
17:32:56 (10936): No heartbeat from core client for 30 sec - exiting
17:32:57 (10936): No heartbeat from core client for 30 sec - exiting
17:32:58 (10936): No heartbeat from core client for 30 sec - exiting
17:32:59 (10936): No heartbeat from core client for 30 sec - exiting
17:33:00 (10936): No heartbeat from core client for 30 sec - exiting
17:33:01 (10936): No heartbeat from core client for 30 sec - exiting
17:33:02 (10936): No heartbeat from core client for 30 sec - exiting
17:33:03 (10936): No heartbeat from core client for 30 sec - exiting
17:33:04 (10936): No heartbeat from core client for 30 sec - exiting
17:33:05 (10936): No heartbeat from core client for 30 sec - exiting
17:33:06 (10936): No heartbeat from core client for 30 sec - exiting
17:33:07 (10936): No heartbeat from core client for 30 sec - exiting
17:33:08 (10936): No heartbeat from core client for 30 sec - exiting
17:33:09 (10936): No heartbeat from core client for 30 sec - exiting
17:33:10 (10936): No heartbeat from core client for 30 sec - exiting
17:33:11 (10936): No heartbeat from core client for 30 sec - exiting
17:33:12 (10936): No heartbeat from core client for 30 sec - exiting
17:33:13 (10936): No heartbeat from core client for 30 sec - exiting
17:33:14 (10936): No heartbeat from core client for 30 sec - exiting
17:33:15 (10936): No heartbeat from core client for 30 sec - exiting
17:33:16 (10936): No heartbeat from core client for 30 sec - exiting
17:33:17 (10936): No heartbeat from core client for 30 sec - exiting
17:33:18 (10936): No heartbeat from core client for 30 sec - exiting
17:33:19 (10936): No heartbeat from core client for 30 sec - exiting
17:33:20 (10936): No heartbeat from core client for 30 sec - exiting
17:33:21 (10936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:34:49 (23064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
23:03:52 (7336): No heartbeat from core client for 30 sec - exiting
23:03:54 (7336): No heartbeat from core client for 30 sec - exiting
23:03:55 (7336): No heartbeat from core client for 30 sec - exiting
23:03:56 (7336): No heartbeat from core client for 30 sec - exiting
23:03:57 (7336): No heartbeat from core client for 30 sec - exiting
23:03:58 (7336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=252, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Apr 2013 22:01:12 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 362,880 | 577,708 | 1.5920 |
03 Apr 2013 08:10:15 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 336,960 | 535,358 | 1.5888 |
02 Apr 2013 17:13:21 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 311,040 | 493,087 | 1.5853 |
02 Apr 2013 04:06:51 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 285,120 | 451,493 | 1.5835 |
01 Apr 2013 14:25:36 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 259,200 | 409,658 | 1.5805 |
31 Mar 2013 23:51:08 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 233,280 | 364,738 | 1.5635 |
31 Mar 2013 10:14:35 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 207,360 | 321,959 | 1.5527 |
30 Mar 2013 21:35:36 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 181,440 | 279,335 | 1.5395 |
30 Mar 2013 08:09:10 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 155,520 | 236,892 | 1.5232 |
29 Mar 2013 17:55:22 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 129,600 | 194,950 | 1.5042 |
29 Mar 2013 05:29:26 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 103,680 | 153,910 | 1.4845 |
28 Mar 2013 17:22:18 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 77,760 | 114,688 | 1.4749 |
28 Mar 2013 05:27:31 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 51,840 | 76,457 | 1.4749 |
27 Mar 2013 17:11:18 | 1270025 | 15686372 | hadcm3n_z8rr_1880_40_008247540_4 | 25,920 | 37,989 | 1.4656 |
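
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table; the function name `avg_sec_per_timestep` is a hypothetical helper, and the formula is an assumption inferred from the numbers, not something the page states.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
rows = [
    (362_880, 577_708, 1.5920),
    (336_960, 535_358, 1.5888),
    (25_920, 37_989, 1.4656),
]


def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds per cumulative model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = avg_sec_per_timestep(cpu_time_sec, timestep)
    # Compare with the table value at the four decimal places it is reported to.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>7} TS: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

Under this assumption the computed values agree with the reported ones to four decimal places, and the rising average from the earliest to the latest trickle reflects a slower per-timestep rate later in the run.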
©2024 cpdn.org