Name | hadcm3n_zbmn_1880_40_008248148_2 |
Workunit | 8403272 |
Created | 21 Nov 2012, 13:08:46 UTC |
Sent | 21 Nov 2012, 13:08:55 UTC |
Report deadline | 20 Feb 2013, 20:36:06 UTC |
Received | 21 Dec 2012, 2:59:22 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1175567 |
Run time | 29 days 3 hours 49 min 32 sec |
CPU time | 26 days 22 hours 26 min 28 sec |
Validate state | Invalid |
Credit | 11,197.44 |
Device peak FLOPS | 1.86 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.42</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... 22:29:18 (2712): No heartbeat from core client for 30 sec - exiting 22:29:19 (2712): No heartbeat from core client for 30 sec - exiting 22:29:20 (2712): No heartbeat from core client for 30 sec - exiting 22:29:21 (2712): No heartbeat from core client for 30 sec - exiting 22:29:22 (2712): No heartbeat from core client for 30 sec - exiting 22:29:23 (2712): No heartbeat from core client for 30 sec - exiting 22:29:25 (2712): No heartbeat from core client for 30 sec - exiting 22:29:26 (2712): No heartbeat from core client for 30 sec - exiting 22:29:27 (2712): No heartbeat from core client for 30 sec - exiting 22:29:28 (2712): No heartbeat from core client for 30 sec - exiting 22:29:29 (2712): No heartbeat from core client for 30 sec - exiting 22:29:30 (2712): No heartbeat from core client for 30 sec - exiting 22:29:31 (2712): No heartbeat from core client for 30 sec - exiting 22:29:32 (2712): No heartbeat from core client for 30 sec - exiting 22:29:33 (2712): No heartbeat from core client for 30 sec - exiting 22:29:34 (2712): No heartbeat from core client for 30 sec - exiting 22:29:36 (2712): No heartbeat from core client for 30 sec - exiting 22:29:37 (2712): No heartbeat from core client for 30 sec - exiting 22:29:38 (2712): No heartbeat from core client for 30 sec - exiting 22:29:39 (2712): No heartbeat from core client for 30 sec - exiting 22:29:40 (2712): No heartbeat from core client for 30 sec - exiting 22:29:41 (2712): No heartbeat from core client for 30 sec - exiting 22:29:42 (2712): No heartbeat from core client for 30 sec - exiting 22:29:43 (2712): No heartbeat from core client for 30 sec - exiting 22:29:44 (2712): No heartbeat from core client for 30 sec - exiting 22:29:45 (2712): No heartbeat from core client for 30 sec - exiting 
22:29:46 (2712): No heartbeat from core client for 30 sec - exiting 22:29:48 (2712): No heartbeat from core client for 30 sec - exiting 22:29:49 (2712): No heartbeat from core client for 30 sec - exiting 22:29:50 (2712): No heartbeat from core client for 30 sec - exiting 22:29:51 (2712): No heartbeat from core client for 30 sec - exiting 22:29:52 (2712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:10:42 (5612): No heartbeat from core client for 30 sec - exiting 01:10:43 (5612): No heartbeat from core client for 30 sec - exiting 01:10:44 (5612): No heartbeat from core client for 30 sec - exiting 01:10:45 (5612): No heartbeat from core client for 30 sec - exiting 01:10:46 (5612): No heartbeat from core client for 30 sec - exiting 01:10:48 (5612): No heartbeat from core client for 30 sec - exiting 01:10:49 (5612): No heartbeat from core client for 30 sec - exiting 01:10:50 (5612): No heartbeat from core client for 30 sec - exiting 01:10:51 (5612): No heartbeat from core client for 30 sec - exiting 01:10:52 (5612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:00:01 (6860): No heartbeat from core client for 30 sec - exiting 16:00:03 (6860): No heartbeat from core client for 30 sec - exiting 16:00:04 (6860): No heartbeat from core client for 30 sec - exiting 16:00:05 (6860): No heartbeat from core client for 30 sec - exiting 16:00:06 (6860): No heartbeat from core client for 30 sec - exiting 16:00:07 (6860): No heartbeat from core client for 30 sec - exiting 16:00:08 (6860): No heartbeat from core client for 30 sec - exiting 16:00:09 (6860): No heartbeat from core client for 30 sec - exiting 16:00:10 (6860): No heartbeat from core client for 30 sec - exiting 16:00:11 (6860): No heartbeat from core client for 30 sec - exiting 16:00:12 (6860): No heartbeat from core client for 30 sec - exiting 16:00:13 (6860): No heartbeat from core client for 30 sec - exiting 16:00:15 (6860): No heartbeat from core client for 30 sec - exiting 16:00:16 (6860): No heartbeat from core client for 30 sec - exiting 16:00:17 (6860): No heartbeat from core client for 30 sec - exiting 16:00:18 (6860): No heartbeat from core client for 30 sec - exiting 16:00:19 (6860): No heartbeat from core client for 30 sec - exiting 16:00:20 (6860): No heartbeat from core client for 30 sec - exiting 16:00:21 (6860): No heartbeat from core client for 30 sec - exiting 16:00:22 (6860): No heartbeat from core client for 30 sec - exiting 16:00:23 (6860): No heartbeat from core client for 30 sec - exiting 16:00:24 (6860): No heartbeat from core client for 30 sec - exiting 16:00:25 (6860): No heartbeat from core client for 30 sec - exiting 16:00:27 (6860): No heartbeat from core client for 30 sec - exiting 16:00:28 (6860): No heartbeat from core client for 30 sec - exiting 16:00:29 (6860): No heartbeat from core client for 30 sec - exiting 16:00:30 (6860): No heartbeat from core client for 30 sec - exiting 16:00:31 (6860): No heartbeat from core client for 30 sec - exiting 16:00:32 (6860): No heartbeat from core client for 30 sec - exiting 16:00:33 (6860): No 
heartbeat from core client for 30 sec - exiting 16:00:34 (6860): No heartbeat from core client for 30 sec - exiting 16:00:35 (6860): No heartbeat from core client for 30 sec - exiting 16:00:36 (6860): No heartbeat from core client for 30 sec - exiting 16:00:37 (6860): No heartbeat from core client for 30 sec - exiting 16:00:39 (6860): No heartbeat from core client for 30 sec - exiting 16:00:40 (6860): No heartbeat from core client for 30 sec - exiting 16:00:41 (6860): No heartbeat from core client for 30 sec - exiting 16:00:42 (6860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 03:25:08 (7868): No heartbeat from core client for 30 sec - exiting 03:25:09 (7868): No heartbeat from core client for 30 sec - exiting 03:25:10 (7868): No heartbeat from core client for 30 sec - exiting 03:25:11 (7868): No heartbeat from core client for 30 sec - exiting 03:25:12 (7868): No heartbeat from core client for 30 sec - exiting 03:25:13 (7868): No heartbeat from core client for 30 sec - exiting 03:25:15 (7868): No heartbeat from core client for 30 sec - exiting 03:25:16 (7868): No heartbeat from core client for 30 sec - exiting 03:25:17 (7868): No heartbeat from core client for 30 sec - exiting 03:25:18 (7868): No heartbeat from core client for 30 sec - exiting 03:25:19 (7868): No heartbeat from core client for 30 sec - exiting 03:25:20 (7868): No heartbeat from core client for 30 sec - exiting 03:25:21 (7868): No heartbeat from core client for 30 sec - exiting 03:25:22 (7868): No heartbeat from core client for 30 sec - exiting 03:25:23 (7868): No heartbeat from core client for 30 sec - exiting 03:25:24 (7868): No heartbeat from core client for 30 sec - exiting 03:25:25 (7868): No heartbeat from core client for 30 sec - exiting 03:25:27 (7868): No heartbeat from core client for 30 sec - exiting 03:25:28 (7868): No heartbeat from core 
client for 30 sec - exiting 03:25:29 (7868): No heartbeat from core client for 30 sec - exiting 03:25:30 (7868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:25:31 (7868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7576, iMonCtr=1 Model crash detected, will try to restart... 16:20:54 (2500): No heartbeat from core client for 30 sec - exiting 16:20:55 (2500): No heartbeat from core client for 30 sec - exiting 16:20:56 (2500): No heartbeat from core client for 30 sec - exiting 16:20:57 (2500): No heartbeat from core client for 30 sec - exiting 16:20:58 (2500): No heartbeat from core client for 30 sec - exiting 16:20:59 (2500): No heartbeat from core client for 30 sec - exiting 16:21:00 (2500): No heartbeat from core client for 30 sec - exiting 16:21:02 (2500): No heartbeat from core client for 30 sec - exiting 16:21:03 (2500): No heartbeat from core client for 30 sec - exiting 16:21:04 (2500): No heartbeat from core client for 30 sec - exiting 16:21:05 (2500): No heartbeat from core client for 30 sec - exiting 16:21:06 (2500): No heartbeat from core client for 30 sec - exiting 16:21:07 (2500): No heartbeat from core client for 30 sec - exiting 16:21:08 (2500): No heartbeat from core client for 30 sec - exiting 16:21:09 (2500): No heartbeat from core client for 30 sec - exiting 16:21:10 (2500): No heartbeat from core client for 30 sec - exiting 16:21:11 (2500): No heartbeat from core client for 30 sec - exiting 16:21:13 (2500): No heartbeat from core client for 30 sec - exiting 16:21:14 (2500): No heartbeat from core client for 30 sec - exiting 16:21:15 (2500): No heartbeat from core client for 30 sec - exiting 16:21:16 (2500): No heartbeat from core client for 30 sec - exiting 16:21:17 (2500): No heartbeat from core client for 30 sec - exiting 16:21:18 (2500): No heartbeat from core 
client for 30 sec - exiting 16:21:19 (2500): No heartbeat from core client for 30 sec - exiting 16:21:20 (2500): No heartbeat from core client for 30 sec - exiting 16:21:21 (2500): No heartbeat from core client for 30 sec - exiting 16:21:22 (2500): No heartbeat from core client for 30 sec - exiting 16:21:23 (2500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:21:25 (2500): No heartbeat from core client for 30 sec - exiting 16:21:26 (2500): No heartbeat from core client for 30 sec - exiting 16:21:27 (2500): No heartbeat from core client for 30 sec - exiting 16:21:28 (2500): No heartbeat from core client for 30 sec - exiting 16:21:29 (2500): No heartbeat from core client for 30 sec - exiting 16:21:30 (2500): No heartbeat from core client for 30 sec - exiting 16:21:31 (2500): No heartbeat from core client for 30 sec - exiting 16:21:32 (2500): No heartbeat from core client for 30 sec - exiting 21:31:39 (6104): No heartbeat from core client for 30 sec - exiting 21:31:41 (6104): No heartbeat from core client for 30 sec - exiting 21:31:42 (6104): No heartbeat from core client for 30 sec - exiting 21:31:43 (6104): No heartbeat from core client for 30 sec - exiting 21:31:44 (6104): No heartbeat from core client for 30 sec - exiting 21:31:45 (6104): No heartbeat from core client for 30 sec - exiting 21:31:46 (6104): No heartbeat from core client for 30 sec - exiting 21:31:47 (6104): No heartbeat from core client for 30 sec - exiting 21:31:48 (6104): No heartbeat from core client for 30 sec - exiting 21:31:49 (6104): No heartbeat from core client for 30 sec - exiting 21:31:50 (6104): No heartbeat from core client for 30 sec - exiting 21:31:52 (6104): No heartbeat from core client for 30 sec - exiting 21:31:53 (6104): No heartbeat from core client for 30 sec - exiting 21:31:54 (6104): No heartbeat from core client for 30 sec - exiting 21:31:55 (6104): No heartbeat from core client for 30 sec - exiting 21:31:56 (6104): No 
heartbeat from core client for 30 sec - exiting 21:31:57 (6104): No heartbeat from core client for 30 sec - exiting 21:31:58 (6104): No heartbeat from core client for 30 sec - exiting 21:31:59 (6104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:32:00 (6104): No heartbeat from core client for 30 sec - exiting 21:32:01 (6104): No heartbeat from core client for 30 sec - exiting 21:32:02 (6104): No heartbeat from core client for 30 sec - exiting 21:32:04 (6104): No heartbeat from core client for 30 sec - exiting 21:32:05 (6104): No heartbeat from core client for 30 sec - exiting 21:32:06 (6104): No heartbeat from core client for 30 sec - exiting 21:32:07 (6104): No heartbeat from core client for 30 sec - exiting 21:32:08 (6104): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 23:07:57 (4612): No heartbeat from core client for 30 sec - exiting 23:07:58 (4612): No heartbeat from core client for 30 sec - exiting 23:07:59 (4612): No heartbeat from core client for 30 sec - exiting 23:08:00 (4612): No heartbeat from core client for 30 sec - exiting 23:08:01 (4612): No heartbeat from core client for 30 sec - exiting 23:08:02 (4612): No heartbeat from core client for 30 sec - exiting 23:08:03 (4612): No heartbeat from core client for 30 sec - exiting 23:08:05 (4612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1 Model crash detected, will try to restart... 
Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1744, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
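
The stderr dump above repeats the same heartbeat-loss message many times under a handful of process IDs. A minimal sketch for summarising it, assuming the log has been saved as plain text: the function name `count_heartbeat_losses` and the sample string are illustrative and are not part of BOINC or the CPDN application.

```python
import re
from collections import Counter

# Matches entries such as "22:29:18 (2712): No heartbeat from core client".
# \s+ tolerates the line wraps that split some messages in the dump.
HEARTBEAT = re.compile(r"\((\d+)\):\s+No\s+heartbeat\s+from\s+core\s+client")


def count_heartbeat_losses(stderr_text):
    """Return a Counter mapping process ID (as a string) to message count."""
    return Counter(HEARTBEAT.findall(stderr_text))


# Short excerpt in the same shape as the log above (hypothetical sample input).
sample = (
    "22:29:52 (2712): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "01:10:42 (5612): No heartbeat from core client for 30 sec - exiting "
    "01:10:43 (5612): No\nheartbeat from core client for 30 sec - exiting"
)

for pid, count in count_heartbeat_losses(sample).items():
    print(f"PID {pid}: {count} heartbeat-loss messages")
```

Grouping by process ID makes it easier to see how many separate restarts produced the messages, since each burst in the log shares one ID.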
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
20 Dec 2012 15:37:25 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 933,120 | 2,301,073 | 2.4660 |
19 Dec 2012 18:50:10 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 907,200 | 2,236,826 | 2.4656 |
18 Dec 2012 21:14:03 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 881,280 | 2,172,214 | 2.4648 |
18 Dec 2012 02:30:36 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 855,360 | 2,108,437 | 2.4650 |
17 Dec 2012 07:44:28 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 829,440 | 2,043,888 | 2.4642 |
16 Dec 2012 13:00:26 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 803,520 | 1,979,623 | 2.4637 |
15 Dec 2012 18:31:37 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 777,600 | 1,915,799 | 2.4637 |
14 Dec 2012 23:56:39 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 751,680 | 1,851,912 | 2.4637 |
14 Dec 2012 05:06:25 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 725,760 | 1,787,896 | 2.4635 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 699,840 | 1,720,763 | 2.4588 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 673,920 | 1,658,720 | 2.4613 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 648,000 | 1,596,963 | 2.4644 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 622,080 | 1,533,918 | 2.4658 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 596,160 | 1,470,981 | 2.4674 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 570,240 | 1,406,552 | 2.4666 |
14 Dec 2012 02:44:34 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 544,320 | 1,339,543 | 2.4609 |
07 Dec 2012 15:23:07 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 518,400 | 1,272,179 | 2.4540 |
06 Dec 2012 20:13:43 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 492,480 | 1,206,545 | 2.4499 |
06 Dec 2012 01:37:02 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 466,560 | 1,142,373 | 2.4485 |
05 Dec 2012 06:23:59 | 1175567 | 15446604 | hadcm3n_zbmn_1880_40_008248148_2 | 440,640 | 1,077,404 | 2.4451 |
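
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. That relationship is inferred from the rows above rather than documented here. A short sketch that recomputes it for three of the rows (the function name `average_sec_per_timestep` is hypothetical):

```python
def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds) copied from three rows of the trickle table.
rows = [
    (933_120, 2_301_073),
    (907_200, 2_236_826),
    (440_640, 1_077_404),
]

for timestep, cpu_time_sec in rows:
    print(timestep, average_sec_per_timestep(cpu_time_sec, timestep))
```

Consecutive rows also differ by 25,920 timesteps, so the same division applied to the differences gives the per-trickle rate rather than the running average.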