Name | hadcm3n_zmlp_1880_40_008200204_4 |
Workunit | 8355328 |
Created | 19 Dec 2012, 22:36:37 UTC |
Sent | 19 Dec 2012, 22:36:50 UTC |
Report deadline | 21 Mar 2013, 6:04:01 UTC |
Received | 5 Mar 2013, 7:03:17 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1131474 |
Run time | 32 days 8 hours 52 min 4 sec |
CPU time | 29 days 23 hours 7 min 1 sec |
Validate state | Invalid |
Credit | 8,398.08 |
Device peak FLOPS | 1.64 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4028, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1 Model crash detected, will try to restart... 01:46:04 (2096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=1 Model crash detected, will try to restart... 
19:28:18 (4568): No heartbeat from core client for 30 sec - exiting 19:28:19 (4568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:28:20 (4568): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 20:52:40 (1940): No heartbeat from core client for 30 sec - exiting 20:52:41 (1940): No heartbeat from core client for 30 sec - exiting 20:52:42 (1940): No heartbeat from core client for 30 sec - exiting 20:52:43 (1940): No heartbeat from core client for 30 sec - exiting 20:52:44 (1940): No heartbeat from core client for 30 sec - exiting 20:52:45 (1940): No heartbeat from core client for 30 sec - exiting 20:52:46 (1940): No heartbeat from core client for 30 sec - exiting 20:52:47 (1940): No heartbeat from core client for 30 sec - exiting 20:52:48 (1940): No heartbeat from core client for 30 sec - exiting 20:52:49 (1940): No heartbeat from core client for 30 sec - exiting 20:52:50 (1940): No heartbeat from core client for 30 sec - exiting 20:52:51 (1940): No heartbeat from core client for 30 sec - exiting 20:52:52 (1940): No heartbeat from core client for 30 sec - exiting 20:52:53 (1940): No heartbeat from core client for 30 sec - exiting 20:52:54 (1940): No heartbeat from core client for 30 sec - exiting 20:52:55 (1940): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 02:01:56 (5108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:08:09 (7720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7708, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 06:59:58 (4792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4052, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4848, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3036, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 06:10:46 (4332): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:10:47 (4332): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5336, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 09:22:39 (5864): No heartbeat from core client for 30 sec - exiting 09:22:40 (5864): No heartbeat from core client for 30 sec - exiting 09:22:41 (5864): No heartbeat from core client for 30 sec - exiting 09:22:42 (5864): No heartbeat from core client for 30 sec - exiting 09:22:43 (5864): No heartbeat from core client for 30 sec - exiting 09:22:44 (5864): No heartbeat from core client for 30 sec - exiting 09:22:45 (5864): No heartbeat from core client for 30 sec - exiting 09:22:46 (5864): No heartbeat from core client for 30 sec - exiting 09:22:47 (5864): No heartbeat from core client for 30 sec - exiting 09:22:48 (5864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:22:49 (5864): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4076, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... 19:00:34 (2196): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 23:40:22 (4592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 06:07:39 (4500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
08:09:58 (5752): No heartbeat from core client for 30 sec - exiting 08:09:59 (5752): No heartbeat from core client for 30 sec - exiting 08:10:00 (5752): No heartbeat from core client for 30 sec - exiting 08:10:01 (5752): No heartbeat from core client for 30 sec - exiting 08:10:02 (5752): No heartbeat from core client for 30 sec - exiting 08:10:03 (5752): No heartbeat from core client for 30 sec - exiting 08:10:04 (5752): No heartbeat from core client for 30 sec - exiting 08:10:05 (5752): No heartbeat from core client for 30 sec - exiting 08:10:06 (5752): No heartbeat from core client for 30 sec - exiting 08:10:07 (5752): No heartbeat from core client for 30 sec - exiting 08:10:08 (5752): No heartbeat from core client for 30 sec - exiting 08:10:09 (5752): No heartbeat from core client for 30 sec - exiting 08:10:10 (5752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:35:57 (1884): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:57:37 (4100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8848, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5592, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
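
The stderr text above repeats a small set of messages (suspend requests, quit requests, missed heartbeats, model crashes, and "Signal 22 received") before ending with "Sorry, too many model crashes!". As a rough aid for reading such a log, the sketch below tallies those message types in a pasted stderr string. The category names and the `tally_events` helper are illustrative assumptions, not part of BOINC or the CPDN application, and the sample string is only a short excerpt built from messages that appear in the log, not the full text.

```python
import re
from collections import Counter

# Hypothetical category labels mapped to substrings that appear in the stderr text.
PATTERNS = {
    "suspend": r"Suspend request from BOINC",
    "quit": r"Quit request from BOINC",
    "no_heartbeat": r"No heartbeat from core client",
    "model_crash": r"Model crash detected",
    "signal_22": r"Signal 22 received",
}


def tally_events(stderr_text):
    """Count how often each known message type occurs in the given text."""
    return Counter(
        {name: len(re.findall(pattern, stderr_text)) for name, pattern in PATTERNS.items()}
    )


# Short excerpt assembled from messages in the log above (not the whole log).
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "00:57:37 (4100): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Signal 22 received, exiting... Called boinc_finish "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5592, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

counts = tally_events(sample)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Pasting the complete stderr text in place of `sample` would give the totals for this task. Note that the reported exit status 22 is the same value as the hexadecimal 0x16 shown in the message field.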
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
01 Mar 2013 21:37:59 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 699,840 | 2,505,902 | 3.5807 |
28 Feb 2013 06:32:11 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 673,920 | 2,413,231 | 3.5809 |
26 Feb 2013 17:31:31 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 648,000 | 2,322,107 | 3.5835 |
23 Feb 2013 14:25:28 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 622,080 | 2,228,332 | 3.5821 |
21 Feb 2013 07:34:23 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 596,160 | 2,140,031 | 3.5897 |
19 Feb 2013 17:06:18 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 570,240 | 2,044,689 | 3.5857 |
17 Feb 2013 16:15:45 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 544,320 | 1,950,996 | 3.5843 |
16 Feb 2013 00:42:13 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 518,400 | 1,860,952 | 3.5898 |
14 Feb 2013 07:40:29 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 492,480 | 1,765,225 | 3.5844 |
12 Feb 2013 07:39:22 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 466,560 | 1,668,136 | 3.5754 |
09 Feb 2013 19:25:54 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 440,640 | 1,570,077 | 3.5632 |
07 Feb 2013 08:02:37 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 414,720 | 1,474,416 | 3.5552 |
05 Feb 2013 02:15:32 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 388,800 | 1,379,755 | 3.5488 |
03 Feb 2013 01:25:58 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 362,880 | 1,283,576 | 3.5372 |
01 Feb 2013 07:21:32 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 336,960 | 1,186,142 | 3.5201 |
29 Jan 2013 07:04:16 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 311,040 | 1,096,373 | 3.5249 |
27 Jan 2013 08:56:02 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 285,120 | 994,817 | 3.4891 |
24 Jan 2013 19:10:08 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 259,200 | 893,979 | 3.4490 |
22 Jan 2013 01:09:04 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 233,280 | 801,207 | 3.4345 |
13 Jan 2013 14:00:14 | 1131474 | 15484425 | hadcm3n_zmlp_1880_40_008200204_4 | 207,360 | 708,949 | 3.4189 |
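
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep reached, rounded to four decimals. The sketch below checks that reading against three rows copied from the table; the helper names `parse_int` and `seconds_per_timestep` are made up for illustration, and the formula is inferred from the numbers rather than taken from CPDN documentation.

```python
def parse_int(text):
    """Convert a comma-grouped number such as '699,840' to an int."""
    return int(text.replace(",", ""))


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the timestep reached so far."""
    return cpu_time_sec / timestep


# (time sent, timestep, CPU time in seconds) copied from the table above.
rows = [
    ("01 Mar 2013 21:37:59", "699,840", "2,505,902"),
    ("28 Feb 2013 06:32:11", "673,920", "2,413,231"),
    ("13 Jan 2013 14:00:14", "207,360", "708,949"),
]

averages = [
    round(seconds_per_timestep(parse_int(cpu), parse_int(ts)), 4)
    for _, ts, cpu in rows
]

for (sent, ts, cpu), avg in zip(rows, averages):
    print(f"{sent}: {cpu} s / {ts} TS = {avg} sec/TS")
```

Under this reading the three computed values match the table entries 3.5807, 3.5809, and 3.4189. Consecutive rows in the table also differ by 25,920 timesteps, so each trickle marks one fixed-size block of model progress.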