Name | hadcm3n_yl06_1980_40_007535689_4 |
Workunit | 7732921 |
Created | 26 Nov 2011, 20:58:16 UTC |
Sent | 26 Nov 2011, 20:59:10 UTC |
Report deadline | 26 Feb 2012, 4:26:21 UTC |
Received | 3 Dec 2011, 23:13:39 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 890338 |
Run time | 2 days 2 hours 45 min 35 sec |
CPU time | 2 days 0 hours 55 min 25 sec |
Validate state | Invalid |
Credit | 1,244.16 |
Device peak FLOPS | 2.65 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7040, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
17:58:06 (3464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:27:34 (5124): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3132, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:51:14 (3172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:51:15 (3172): No heartbeat from core client for 30 sec - exiting 16:51:16 (3172): No heartbeat from core client for 30 sec - exiting 16:51:17 (3172): No heartbeat from core client for 30 sec - exiting 16:51:18 (3172): No heartbeat from core client for 30 sec - exiting 16:51:19 (3172): No heartbeat from core client for 30 sec - exiting 16:51:20 (3172): No heartbeat from core client for 30 sec - exiting 16:51:21 (3172): No heartbeat from core client for 30 sec - exiting 16:51:22 (3172): No heartbeat from core client for 30 sec - exiting 16:51:23 (3172): No heartbeat from core client for 30 sec - exiting 16:51:24 (3172): No heartbeat from core client for 30 sec - exiting 16:51:25 (3172): No heartbeat from core client for 30 sec - exiting 16:51:26 (3172): No heartbeat from core client for 30 sec - exiting 16:51:27 (3172): No heartbeat from core client for 30 sec - exiting 16:51:28 (3172): No heartbeat from core client for 30 sec - exiting 16:51:29 (3172): No heartbeat from core client for 30 sec - exiting 16:51:30 (3172): No heartbeat from core client for 30 sec - exiting 16:51:31 (3172): No heartbeat from core client for 30 sec - exiting 16:51:32 (3172): 
No heartbeat from core client for 30 sec - exiting 16:51:33 (3172): No heartbeat from core client for 30 sec - exiting 16:51:34 (3172): No heartbeat from core client for 30 sec - exiting 16:51:35 (3172): No heartbeat from core client for 30 sec - exiting 16:51:36 (3172): No heartbeat from core client for 30 sec - exiting 16:51:37 (3172): No heartbeat from core client for 30 sec - exiting 16:51:38 (3172): No heartbeat from core client for 30 sec - exiting 16:51:39 (3172): No heartbeat from core client for 30 sec - exiting 16:51:40 (3172): No heartbeat from core client for 30 sec - exiting 16:51:41 (3172): No heartbeat from core client for 30 sec - exiting 16:51:42 (3172): No heartbeat from core client for 30 sec - exiting 16:51:43 (3172): No heartbeat from core client for 30 sec - exiting 16:51:44 (3172): No heartbeat from core client for 30 sec - exiting 16:51:45 (3172): No heartbeat from core client for 30 sec - exiting 16:51:46 (3172): No heartbeat from core client for 30 sec - exiting 16:51:47 (3172): No heartbeat from core client for 30 sec - exiting 16:51:48 (3172): No heartbeat from core client for 30 sec - exiting 16:51:49 (3172): No heartbeat from core client for 30 sec - exiting 16:51:50 (3172): No heartbeat from core client for 30 sec - exiting 16:51:51 (3172): No heartbeat from core client for 30 sec - exiting 16:51:52 (3172): No heartbeat from core client for 30 sec - exiting 16:51:53 (3172): No heartbeat from core client for 30 sec - exiting 16:51:54 (3172): No heartbeat from core client for 30 sec - exiting 16:51:55 (3172): No heartbeat from core client for 30 sec - exiting 16:51:56 (3172): No heartbeat from core client for 30 sec - exiting 16:51:57 (3172): No heartbeat from core client for 30 sec - exiting 16:51:58 (3172): No heartbeat from core client for 30 sec - exiting 16:51:59 (3172): No heartbeat from core client for 30 sec - exiting 16:52:00 (3172): No heartbeat from core client for 30 sec - exiting 16:52:01 (3172): No heartbeat from core 
client for 30 sec - exiting 16:52:02 (3172): No heartbeat from core client for 30 sec - exiting 16:52:03 (3172): No heartbeat from core client for 30 sec - exiting 16:52:04 (3172): No heartbeat from core client for 30 sec - exiting 16:52:05 (3172): No heartbeat from core client for 30 sec - exiting 16:52:06 (3172): No heartbeat from core client for 30 sec - exiting 16:52:07 (3172): No heartbeat from core client for 30 sec - exiting 16:52:08 (3172): No heartbeat from core client for 30 sec - exiting 16:52:09 (3172): No heartbeat from core client for 30 sec - exiting 16:52:10 (3172): No heartbeat from core client for 30 sec - exiting 16:52:11 (3172): No heartbeat from core client for 30 sec - exiting 16:52:12 (3172): No heartbeat from core client for 30 sec - exiting 16:52:13 (3172): No heartbeat from core client for 30 sec - exiting 16:52:14 (3172): No heartbeat from core client for 30 sec - exiting 16:52:15 (3172): No heartbeat from core client for 30 sec - exiting 16:52:16 (3172): No heartbeat from core client for 30 sec - exiting 16:52:17 (3172): No heartbeat from core client for 30 sec - exiting 16:52:18 (3172): No heartbeat from core client for 30 sec - exiting 16:52:19 (3172): No heartbeat from core client for 30 sec - exiting 16:52:20 (3172): No heartbeat from core client for 30 sec - exiting 16:52:21 (3172): No heartbeat from core client for 30 sec - exiting 16:52:22 (3172): No heartbeat from core client for 30 sec - exiting 16:52:23 (3172): No heartbeat from core client for 30 sec - exiting 16:52:24 (3172): No heartbeat from core client for 30 sec - exiting 16:52:25 (3172): No heartbeat from core client for 30 sec - exiting 16:52:26 (3172): No heartbeat from core client for 30 sec - exiting 16:52:27 (3172): No heartbeat from core client for 30 sec - exiting 16:52:28 (3172): No heartbeat from core client for 30 sec - exiting 16:52:29 (3172): No heartbeat from core client for 30 sec - exiting 16:52:30 (3172): No heartbeat from core client for 30 sec - exiting 
16:52:31 (3172): No heartbeat from core client for 30 sec - exiting 16:52:32 (3172): No heartbeat from core client for 30 sec - exiting 16:52:33 (3172): No heartbeat from core client for 30 sec - exiting 16:52:34 (3172): No heartbeat from core client for 30 sec - exiting 16:52:35 (3172): No heartbeat from core client for 30 sec - exiting 16:52:36 (3172): No heartbeat from core client for 30 sec - exiting 16:52:37 (3172): No heartbeat from core client for 30 sec - exiting 16:52:38 (3172): No heartbeat from core client for 30 sec - exiting 16:52:39 (3172): No heartbeat from core client for 30 sec - exiting 16:52:40 (3172): No heartbeat from core client for 30 sec - exiting 16:52:41 (3172): No heartbeat from core client for 30 sec - exiting 16:52:42 (3172): No heartbeat from core client for 30 sec - exiting 16:52:43 (3172): No heartbeat from core client for 30 sec - exiting 16:52:44 (3172): No heartbeat from core client for 30 sec - exiting 16:52:45 (3172): No heartbeat from core client for 30 sec - exiting 16:52:46 (3172): No heartbeat from core client for 30 sec - exiting 16:52:47 (3172): No heartbeat from core client for 30 sec - exiting 16:52:48 (3172): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
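The stderr text above is long and repetitive, so a tally per message type is easier to read than the raw dump. The following is a minimal sketch, assuming the log has been saved as plain text. The category names (`suspend`, `quit`, `no_heartbeat`, `model_crash`) are labels invented here for illustration; only the message fragments come from the log itself. Whitespace is collapsed first because some messages in the dump are split across wrapped lines.

```python
from collections import Counter

# Message fragments that appear in the stderr text above.
# The dictionary keys are hypothetical labels, not BOINC or CPDN terminology.
PATTERNS = {
    "suspend": "Suspend request from BOINC",
    "quit": "Quit request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "model_crash": "Model crash detected",
}


def tally_messages(stderr_text: str) -> Counter:
    """Count how often each known message fragment occurs in the log text."""
    # Collapse all whitespace so messages broken across wrapped lines still match.
    normalized = " ".join(stderr_text.split())
    return Counter(
        {name: normalized.count(fragment) for name, fragment in PATTERNS.items()}
    )


# Short excerpt copied from the log above (not the full text).
excerpt = """Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7040, iMonCtr=1 Model crash detected, will try to restart...
17:58:06 (3464): No heartbeat from core
client for 30 sec - exiting"""

counts = tally_messages(excerpt)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Running `tally_messages` over the full stderr text instead of the excerpt would show how many restarts preceded the final "too many model crashes" message.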
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Dec 2011 18:39:21 | 890338 | 13664042 | hadcm3n_yl06_1980_40_007535689_4 | 103,680 | 160,699 | 1.5500 |
02 Dec 2011 20:07:03 | 890338 | 13664042 | hadcm3n_yl06_1980_40_007535689_4 | 77,760 | 119,544 | 1.5373 |
29 Nov 2011 21:53:02 | 890338 | 13664042 | hadcm3n_yl06_1980_40_007535689_4 | 51,840 | 78,640 | 1.5170 |
27 Nov 2011 19:55:47 | 890338 | 13664042 | hadcm3n_yl06_1980_40_007535689_4 | 25,920 | 38,603 | 1.4893 |
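The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the four rows, with the values copied from the table. The function name `seconds_per_timestep` is introduced here for illustration, and the formula is an inference from the numbers, not documented project behavior.

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
trickles = [
    (103_680, 160_699, 1.5500),
    (77_760, 119_544, 1.5373),
    (51_840, 78_640, 1.5170),
    (25_920, 38_603, 1.4893),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return cpu_seconds / timestep


# Compare the computed value with the reported one at four decimal places.
matches = []
for timestep, cpu_seconds, reported in trickles:
    computed = seconds_per_timestep(timestep, cpu_seconds)
    matches.append(f"{computed:.4f}" == f"{reported:.4f}")
    print(f"{timestep:>7,} timesteps: {computed:.4f} sec/TS (reported {reported:.4f})")
```

All four rows agree to four decimal places. The average rises slightly with each trickle, from 1.4893 to 1.5500 sec/TS.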