Name | hadcm3n_ylo0_1900_40_007360938_1 |
Workunit | 7558368 |
Created | 6 Jul 2011, 15:15:57 UTC |
Sent | 7 Jul 2011, 16:32:28 UTC |
Report deadline | 6 Oct 2011, 23:59:39 UTC |
Received | 19 Sep 2011, 5:49:16 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1070354 |
Run time | 18 days 6 hours 33 min 36 sec |
CPU time | 14 days 3 hours 39 min 19 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 1.62 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.13.1</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8724, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:55:51 (8008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:55:52 (8008): No heartbeat from core client for 30 sec - exiting 07:55:53 (8008): No heartbeat from core client for 30 sec - exiting 07:55:54 (8008): No heartbeat from core client for 30 sec - exiting 07:55:55 (8008): No heartbeat from core client for 30 sec - exiting 07:55:56 (8008): No heartbeat from core client for 30 sec - exiting 07:55:57 (8008): No heartbeat from core client for 30 sec - exiting 08:25:11 (8140): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:25:17 (8140): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:51:15 (1304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:51:16 (1304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4212, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1568, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8712, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9876, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3420, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5336, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5892, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5396, iMonCtr=1 Model crash detected, will try to restart... 
Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4396, iMonCtr=1 Model crash detected, will try to restart... 14:34:01 (5192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:47:36 (3324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:47:37 (3324): No heartbeat from core client for 30 sec - exiting 14:47:38 (3324): No heartbeat from core client for 30 sec - exiting 14:47:39 (3324): No heartbeat from core client for 30 sec - exiting 14:47:40 (3324): No heartbeat from core client for 30 sec - exiting 14:47:41 (3324): No heartbeat from core client for 30 sec - exiting 14:47:42 (3324): No heartbeat from core client for 30 sec - exiting 14:47:43 (3324): No heartbeat from core client for 30 sec - exiting 14:47:44 (3324): No heartbeat from core client for 30 sec - exiting 14:47:45 (3324): No heartbeat from core client for 30 sec - exiting 14:47:46 (3324): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5180, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
06:54:19 (5420): No heartbeat from core client for 30 sec - exiting 06:54:20 (5420): No heartbeat from core client for 30 sec - exiting 06:54:21 (5420): No heartbeat from core client for 30 sec - exiting 06:54:22 (5420): No heartbeat from core client for 30 sec - exiting 06:54:23 (5420): No heartbeat from core client for 30 sec - exiting 06:54:24 (5420): No heartbeat from core client for 30 sec - exiting 06:54:25 (5420): No heartbeat from core client for 30 sec - exiting 06:54:26 (5420): No heartbeat from core client for 30 sec - exiting 06:54:27 (5420): No heartbeat from core client for 30 sec - exiting 06:54:28 (5420): No heartbeat from core client for 30 sec - exiting 06:54:29 (5420): No heartbeat from core client for 30 sec - exiting 06:54:30 (5420): No heartbeat from core client for 30 sec - exiting 06:54:31 (5420): No heartbeat from core client for 30 sec - exiting 06:54:32 (5420): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5892, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1752, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5292, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5916, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
17:50:01 (6984): No heartbeat from core client for 30 sec - exiting 17:50:02 (6984): No heartbeat from core client for 30 sec - exiting 17:50:03 (6984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8052, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2188, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1 Model crash detected, will try to restart... 22:14:01 (4424): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
23:17:01 (2740): No heartbeat from core client for 30 sec - exiting 23:17:02 (2740): No heartbeat from core client for 30 sec - exiting 23:17:03 (2740): No heartbeat from core client for 30 sec - exiting 23:17:04 (2740): No heartbeat from core client for 30 sec - exiting 23:17:05 (2740): No heartbeat from core client for 30 sec - exiting 23:17:06 (2740): No heartbeat from core client for 30 sec - exiting 23:17:07 (2740): No heartbeat from core client for 30 sec - exiting 23:17:08 (2740): No heartbeat from core client for 30 sec - exiting 23:17:09 (2740): No heartbeat from core client for 30 sec - exiting 23:17:10 (2740): No heartbeat from core client for 30 sec - exiting 23:17:11 (2740): No heartbeat from core client for 30 sec - exiting 23:17:12 (2740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:06:11 (5560): No heartbeat from core client for 30 sec - exiting 00:06:12 (5560): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5976, iMonCtr=1 Model crash detected, will try to restart... 
16:49:22 (6116): No heartbeat from core client for 30 sec - exiting 16:49:23 (6116): No heartbeat from core client for 30 sec - exiting 16:49:24 (6116): No heartbeat from core client for 30 sec - exiting 16:49:25 (6116): No heartbeat from core client for 30 sec - exiting 16:49:26 (6116): No heartbeat from core client for 30 sec - exiting 16:49:27 (6116): No heartbeat from core client for 30 sec - exiting 16:49:28 (6116): No heartbeat from core client for 30 sec - exiting 16:49:29 (6116): No heartbeat from core client for 30 sec - exiting 16:49:30 (6116): No heartbeat from core client for 30 sec - exiting 16:49:31 (6116): No heartbeat from core client for 30 sec - exiting 16:49:32 (6116): No heartbeat from core client for 30 sec - exiting 16:49:33 (6116): No heartbeat from core client for 30 sec - exiting 16:49:34 (6116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:49:35 (6116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
18:41:27 (6668): No heartbeat from core client for 30 sec - exiting 18:41:28 (6668): No heartbeat from core client for 30 sec - exiting 18:41:29 (6668): No heartbeat from core client for 30 sec - exiting 18:41:30 (6668): No heartbeat from core client for 30 sec - exiting 18:41:31 (6668): No heartbeat from core client for 30 sec - exiting 18:41:32 (6668): No heartbeat from core client for 30 sec - exiting 18:41:33 (6668): No heartbeat from core client for 30 sec - exiting 18:41:34 (6668): No heartbeat from core client for 30 sec - exiting 18:41:35 (6668): No heartbeat from core client for 30 sec - exiting 18:41:36 (6668): No heartbeat from core client for 30 sec - exiting 18:41:37 (6668): No heartbeat from core client for 30 sec - exiting 18:41:38 (6668): No heartbeat from core client for 30 sec - exiting 18:41:39 (6668): No heartbeat from core client for 30 sec - exiting 18:41:40 (6668): No heartbeat from core client for 30 sec - exiting 18:41:41 (6668): No heartbeat from core client for 30 sec - exiting 18:41:42 (6668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:41:43 (6668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... 20:39:25 (5604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:39:26 (5604): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4088, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3460, iMonCtr=1 Model crash detected, will try to restart... 
17:30:27 (4564): No heartbeat from core client for 30 sec - exiting 17:30:28 (4564): No heartbeat from core client for 30 sec - exiting 17:30:29 (4564): No heartbeat from core client for 30 sec - exiting 17:30:30 (4564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1460, iMonCtr=1 Model crash detected, will try to restart... 16:06:56 (5208): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:06:57 (5208): No heartbeat from core client for 30 sec - exiting 16:10:26 (1144): Can't acquire lockfile (32) - waiting 35s 16:12:49 (1144): No heartbeat from core client for 30 sec - exiting 16:12:50 (1144): No heartbeat from core client for 30 sec - exiting 16:12:51 (1144): No heartbeat from core client for 30 sec - exiting 16:12:52 (1144): No heartbeat from core client for 30 sec - exiting 16:12:53 (1144): No heartbeat from core client for 30 sec - exiting 16:12:54 (1144): No heartbeat from core client for 30 sec - exiting 16:12:55 (1144): No heartbeat from core client for 30 sec - exiting 16:12:56 (1144): No heartbeat from core client for 30 sec - exiting 16:12:57 (1144): No heartbeat from core client for 30 sec - exiting 16:12:58 (1144): No heartbeat from core client for 30 sec - exiting 16:12:59 (1144): No heartbeat from core client for 30 sec - exiting 16:13:00 (1144): No heartbeat from core client for 30 sec - exiting 16:13:01 (1144): No heartbeat from core client for 30 sec - exiting 16:13:02 (1144): No heartbeat from core client for 30 sec - exiting 16:13:03 (1144): No heartbeat from core client for 30 sec - exiting 16:13:04 (1144): No heartbeat from core client for 30 sec - exiting 16:13:05 (1144): No heartbeat from core client for 30 sec - exiting 16:13:06 (1144): No heartbeat from core client for 30 sec - exiting 16:13:07 (1144): No heartbeat 
from core client for 30 sec - exiting 16:13:08 (1144): No heartbeat from core client for 30 sec - exiting 16:13:09 (1144): No heartbeat from core client for 30 sec - exiting 16:13:10 (1144): No heartbeat from core client for 30 sec - exiting 16:13:11 (1144): No heartbeat from core client for 30 sec - exiting 16:13:12 (1144): No heartbeat from core client for 30 sec - exiting 16:13:13 (1144): No heartbeat from core client for 30 sec - exiting 16:13:15 (1144): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4428, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8648, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
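The stderr text above interleaves a handful of recurring messages (lost heartbeats, quit and suspend requests, model restarts, and the final signal 11). A minimal sketch for tallying them follows, assuming the stderr text is available as a plain string. The function name `tally_events` and the labels in `MARKERS` are hypothetical, and `sample` is only a short excerpt of the log, not the whole thing.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that appear verbatim in the log.
# "No heartbeat from core client" is the per-second BOINC API message; the
# separate "CPDN Monitor - No 'heartbeat' from BOINC..." line is not counted here.
MARKERS = {
    "heartbeat_lost": "No heartbeat from core client",
    "quit_request": "Quit request from BOINC",
    "suspend_request": "Suspend request from BOINC",
    "model_crash": "Model crash detected",
    "signal_11": "Signal 11 received",
}

def tally_events(stderr_text: str) -> Counter:
    """Count non-overlapping occurrences of each marker substring."""
    return Counter(
        {label: stderr_text.count(needle) for label, needle in MARKERS.items()}
    )

# Short excerpt copied from the end of the stderr output.
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=8648, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Signal 11 received, exiting... Called boinc_finish"
)

counts = tally_events(sample)
for label, n in counts.items():
    print(f"{label}: {n}")
```

Run over the full stderr text, a tally like this gives a quick picture of how often the model restarted before the task ended with exit status 193.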
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Sep 2011 05:51:04 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 518,400 | 1,222,754 | 2.3587 |
31 Aug 2011 14:43:14 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 492,480 | 1,160,901 | 2.3573 |
21 Aug 2011 09:24:09 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 466,560 | 1,097,736 | 2.3528 |
15 Aug 2011 16:27:00 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 440,640 | 1,035,228 | 2.3494 |
10 Aug 2011 15:27:09 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 414,720 | 989,197 | 2.3852 |
07 Aug 2011 15:06:34 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 388,800 | 925,721 | 2.3810 |
04 Aug 2011 18:31:43 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 362,880 | 862,346 | 2.3764 |
03 Aug 2011 08:58:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 336,960 | 799,366 | 2.3723 |
30 Jul 2011 18:16:58 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 311,040 | 735,994 | 2.3662 |
29 Jul 2011 08:37:46 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 285,120 | 674,592 | 2.3660 |
28 Jul 2011 04:31:29 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 259,200 | 613,341 | 2.3663 |
25 Jul 2011 20:53:49 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 233,280 | 550,967 | 2.3618 |
25 Jul 2011 19:10:35 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 207,360 | 490,769 | 2.3667 |
25 Jul 2011 18:05:06 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 181,440 | 429,901 | 2.3694 |
25 Jul 2011 16:10:56 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 155,520 | 367,273 | 2.3616 |
25 Jul 2011 13:21:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 129,600 | 306,432 | 2.3644 |
25 Jul 2011 13:21:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 103,680 | 245,948 | 2.3722 |
25 Jul 2011 13:21:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 77,760 | 185,184 | 2.3815 |
25 Jul 2011 13:21:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 51,840 | 122,321 | 2.3596 |
09 Jul 2011 17:44:08 | 1070354 | 13125748 | hadcm3n_ylo0_1900_40_007360938_1 | 25,920 | 61,532 | 2.3739 |
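
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, and the last trickle's CPU time is within a few seconds of the CPU time field reported for the task. A minimal sketch of both calculations follows; the helper names `seconds_per_timestep` and `duration_to_seconds` are hypothetical, and the input values are copied from the tables above.

```python
import re

def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps, to 4 decimals."""
    return round(cpu_seconds / timestep, 4)

def duration_to_seconds(text: str) -> int:
    """Convert a string like '14 days 3 hours 39 min 19 sec' to seconds."""
    units = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * units[unit]
    return total

# (timestep, CPU time in seconds) for the newest and oldest trickles.
newest = (518_400, 1_222_754)
oldest = (25_920, 61_532)

print(seconds_per_timestep(newest[1], newest[0]))  # 2.3587
print(seconds_per_timestep(oldest[1], oldest[0]))  # 2.3739

# CPU time field from the task summary, converted for comparison.
reported_cpu = duration_to_seconds("14 days 3 hours 39 min 19 sec")
print(reported_cpu, reported_cpu - newest[1])  # 1222759 5
```

Consecutive rows in the table differ by 25,920 timesteps, so the same division can be applied to any row to check its listed average.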
©2024 cpdn.org