Name | hadcm3n_zgtp_1880_40_008242760_2 |
Workunit | 8397884 |
Created | 17 Nov 2012, 4:08:34 UTC |
Sent | 17 Nov 2012, 4:08:48 UTC |
Report deadline | 16 Feb 2013, 11:35:59 UTC |
Received | 18 Jan 2013, 22:29:32 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 460931 |
Run time | 35 days 22 hours 16 min 38 sec |
CPU time | 29 days 0 hours 16 min 19 sec |
Validate state | Invalid |
Credit | 8,087.04 |
Device peak FLOPS | 2.01 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 07:50:15 (5132): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 07:50:16 (5132): No heartbeat from core client for 30 sec - exiting 07:50:17 (5132): No heartbeat from core client for 30 sec - exiting 04:14:15 (4872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 04:14:53 (448): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:39:46 (2320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:40:37 (6108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:01:59 (6024): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 
00:02:01 (6024): No heartbeat from core client for 30 sec - exiting 00:02:02 (6024): No heartbeat from core client for 30 sec - exiting 00:02:03 (6024): No heartbeat from core client for 30 sec - exiting 00:02:04 (6024): No heartbeat from core client for 30 sec - exiting 01:52:00 (2688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:53:24 (1836): No heartbeat from core client for 30 sec - exiting 01:53:25 (1836): No heartbeat from core client for 30 sec - exiting 01:53:26 (1836): No heartbeat from core client for 30 sec - exiting 01:53:27 (1836): No heartbeat from core client for 30 sec - exiting 01:53:28 (1836): No heartbeat from core client for 30 sec - exiting 01:53:29 (1836): No heartbeat from core client for 30 sec - exiting 01:53:30 (1836): No heartbeat from core client for 30 sec - exiting 01:53:31 (1836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:35:38 (628): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 23:35:40 (628): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 03:47:57 (4456): No heartbeat from core client for 30 sec - exiting 03:48:02 (4456):Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting 23:20:36 (1276): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:20:38 (1276): No heartbeat from core client for 30 sec - exiting 23:21:26 (4804): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
07:40:07 (1204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:40:08 (1204): No heartbeat from core client for 30 sec - exiting 10:21:31 (2868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:02:35 (5964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:02:37 (5964): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 15:42:09 (5096): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 15:42:10 (5096): No heartbeat from core client for 30 sec - exiting 15:42:11 (5096): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 02:16:18 (5968): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 02:33:00 (4336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:48:51 (1228): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 10:48:53 (1228): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 23:48:05 (2944): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 23:48:06 (2944): No heartbeat from core client for 30 sec - exiting 23:48:07 (2944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
14 Jan 2013 09:13:51 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 673,920 | 2,452,714 | 3.6395 |
07 Jan 2013 12:45:23 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 648,000 | 2,356,074 | 3.6359 |
05 Jan 2013 18:49:56 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 622,080 | 2,263,134 | 3.6380 |
04 Jan 2013 03:29:13 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 596,160 | 2,170,776 | 3.6413 |
02 Jan 2013 19:10:42 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 570,240 | 2,078,317 | 3.6446 |
24 Dec 2012 23:55:25 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 544,320 | 1,986,604 | 3.6497 |
23 Dec 2012 06:27:33 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 518,400 | 1,896,654 | 3.6587 |
21 Dec 2012 09:36:54 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 492,480 | 1,804,939 | 3.6650 |
20 Dec 2012 03:50:09 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 466,560 | 1,715,063 | 3.6760 |
18 Dec 2012 23:15:07 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 440,640 | 1,624,655 | 3.6870 |
17 Dec 2012 13:05:16 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 414,720 | 1,533,924 | 3.6987 |
16 Dec 2012 10:14:51 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 388,800 | 1,443,953 | 3.7139 |
14 Dec 2012 12:30:26 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 362,880 | 1,351,724 | 3.7250 |
14 Dec 2012 09:15:33 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 336,960 | 1,259,950 | 3.7392 |
14 Dec 2012 09:15:33 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 311,040 | 1,166,172 | 3.7493 |
08 Dec 2012 02:59:59 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 285,120 | 1,070,786 | 3.7556 |
06 Dec 2012 06:07:52 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 259,200 | 971,638 | 3.7486 |
04 Dec 2012 13:28:12 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 233,280 | 876,173 | 3.7559 |
01 Dec 2012 23:46:48 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 207,360 | 777,501 | 3.7495 |
30 Nov 2012 07:01:27 | 460931 | 15437881 | hadcm3n_zgtp_1880_40_008242760_2 | 181,440 | 677,405 | 3.7335 |
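
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the function name `seconds_per_timestep` is hypothetical, and the formula is an inference from the figures, not something the page states.

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
trickles = [
    (673_920, 2_452_714, 3.6395),
    (648_000, 2_356_074, 3.6359),
    (181_440, 677_405, 3.7335),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = seconds_per_timestep(timestep, cpu_seconds)
    # Print the recomputed value next to the value shown in the table.
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

For these three rows the recomputed value matches the reported one to four decimal places, which is consistent with the assumed formula.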