Name | hadcm3n_p0hn_1940_40_007450449_0 |
Workunit | 7647952 |
Created | 10 Sep 2011, 6:21:21 UTC |
Sent | 13 Sep 2011, 14:34:13 UTC |
Report deadline | 13 Dec 2011, 22:01:24 UTC |
Received | 28 Nov 2011, 16:14:20 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1041747 |
Run time | 20 days 20 hours 13 min 55 sec |
CPU time | 18 days 1 hours 1 min 18 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.34 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:20:28 (7016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5968, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5968, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1 Model crash detected, will try to restart... 10:51:24 (4704): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1 Model crash detected, will try to restart... 17:39:23 (5228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7692, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1976, iMonCtr=1 Model crash detected, will try to restart... C15:48:46 (3188): No heartbeat from core client for 30 sec - exiting 15:48:47 (3188): No heartbeat from core client for 30 sec - exiting 15:48:48 (3188): No heartbeat from core client for 30 sec - exiting 15:48:49 (3188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:51:28 (2716): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7692, iMonCtr=1 Model crash detected, will try to restart... 06:59:42 (4944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6512, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5416, iMonCtr=1 Model crash detected, will try to restart... 11:53:35 (5740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5040, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 15:47:42 (4756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:49:40 (7488): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4684, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5756, iMonCtr=1 Model crash detected, will try to restart... 15:20:41 (5892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:17:25 (5336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:19:06 (5164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6548, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3040, iMonCtr=1 Model crash detected, will try to restart... 
14:28:15 (5716): No heartbeat from core client for 30 sec - exiting 14:28:16 (5716): No heartbeat from core client for 30 sec - exiting 14:28:17 (5716): No heartbeat from core client for 30 sec - exiting 14:28:18 (5716): No heartbeat from core client for 30 sec - exiting 14:28:19 (5716): No heartbeat from core client for 30 sec - exiting 14:28:20 (5716): No heartbeat from core client for 30 sec - exiting 14:28:21 (5716): No heartbeat from core client for 30 sec - exiting 14:28:22 (5716): No heartbeat from core client for 30 sec - exiting 14:28:23 (5716): No heartbeat from core client for 30 sec - exiting 14:28:24 (5716): No heartbeat from core client for 30 sec - exiting 14:28:25 (5716): No heartbeat from core client for 30 sec - exiting 14:28:26 (5716): No heartbeat from core client for 30 sec - exiting 14:28:27 (5716): No heartbeat from core client for 30 sec - exiting 14:28:28 (5716): No heartbeat from core client for 30 sec - exiting 14:28:29 (5716): No heartbeat from core client for 30 sec - exiting 14:28:30 (5716): No heartbeat from core client for 30 sec - exiting 14:28:31 (5716): No heartbeat from core client for 30 sec - exiting 14:28:32 (5716): No heartbeat from core client for 30 sec - exiting 14:28:33 (5716): No heartbeat from core client for 30 sec - exiting 14:28:34 (5716): No heartbeat from core client for 30 sec - exiting 14:28:35 (5716): No heartbeat from core client for 30 sec - exiting 14:28:36 (5716): No heartbeat from core client for 30 sec - exiting 14:28:37 (5716): No heartbeat from core client for 30 sec - exiting 14:28:38 (5716): No heartbeat from core client for 30 sec - exiting 14:28:39 (5716): No heartbeat from core client for 30 sec - exiting 14:28:40 (5716): No heartbeat from core client for 30 sec - exiting 14:28:41 (5716): No heartbeat from core client for 30 sec - exiting 14:28:42 (5716): No heartbeat from core client for 30 sec - exiting 14:28:43 (5716): No heartbeat from core client for 30 sec - exiting 14:28:44 (5716): No 
heartbeat from core client for 30 sec - exiting 14:28:45 (5716): No heartbeat from core client for 30 sec - exiting 14:28:46 (5716): No heartbeat from core client for 30 sec - exiting 14:28:47 (5716): No heartbeat from core client for 30 sec - exiting 14:28:48 (5716): No heartbeat from core client for 30 sec - exiting 14:28:49 (5716): No heartbeat from core client for 30 sec - exiting 14:28:50 (5716): No heartbeat from core client for 30 sec - exiting 14:28:51 (5716): No heartbeat from core client for 30 sec - exiting 14:28:52 (5716): No heartbeat from core client for 30 sec - exiting 14:28:53 (5716): No heartbeat from core client for 30 sec - exiting 14:28:54 (5716): No heartbeat from core client for 30 sec - exiting 14:28:55 (5716): No heartbeat from core client for 30 sec - exiting 14:28:56 (5716): No heartbeat from core client for 30 sec - exiting 14:28:57 (5716): No heartbeat from core client for 30 sec - exiting 14:28:58 (5716): No heartbeat from core client for 30 sec - exiting 14:28:59 (5716): No heartbeat from core client for 30 sec - exiting 14:29:00 (5716): No heartbeat from core client for 30 sec - exiting 14:29:01 (5716): No heartbeat from core client for 30 sec - exiting 14:29:02 (5716): No heartbeat from core client for 30 sec - exiting 14:29:03 (5716): No heartbeat from core client for 30 sec - exiting 14:29:04 (5716): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 20:09:49 (5084): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:18:34 (4276): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:18:36 (4276): No heartbeat from core client for 30 sec - exiting C11:26:27 (5496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7592, iMonCtr=1 Model crash detected, will try to restart... 14:21:04 (6120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:33:42 (884): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:50:38 (8388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:50:39 (8388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... 10:46:24 (2764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 15:47:03 (5444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:49:08 (6540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:12:37 (5816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:09:06 (4504): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2500, iMonCtr=1 Model crash detected, will try to restart... 11:18:48 (3720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8576, iMonCtr=1 Model crash detected, will try to restart... 21:06:30 (6100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:59:35 (5784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:50:25 (872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7156, iMonCtr=1 Model crash detected, will try to restart... 09:31:53 (6128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:32:28 (5804): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:32:30 (5804): No heartbeat from core client for 30 sec - exiting 09:34:47 (1820): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 15:47:37 (4760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6300, iMonCtr=1 Model crash detected, will try to restart... 17:50:38 (5800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:22:32 (1144): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:21:07 (5964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:17:50 (6120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 10:11:35 (6028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:20:58 (6836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:54:34 (3844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5448, iMonCtr=1 Model crash detected, will try to restart... 
13:19:04 (4496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:32:36 (5612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:19:36 (6132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77B52957 read attempt to address 0xFFFFFFF9 Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p0hn_1940_40_007450449/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
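
The stderr text above repeats a small set of messages many times, so a per-message tally is easier to read than the raw log. The following is a minimal sketch, assuming the stderr has been saved as a plain string; the category names (`no_heartbeat`, `quit_request`, `model_crash`, `access_violation`) and the `tally` function are hypothetical labels chosen for illustration, not part of BOINC or CPDN. The heartbeat pattern uses `\s+` because the captured log wraps in the middle of one such message.

```python
import re

# Hypothetical category names; the patterns quote message text seen in the stderr above.
PATTERNS = {
    "no_heartbeat": re.compile(r"No\s+heartbeat\s+from\s+core\s+client"),
    "quit_request": re.compile(r"Quit\s+request\s+from\s+BOINC"),
    "model_crash": re.compile(r"Model\s+crash\s+detected"),
    "access_violation": re.compile(r"Access\s+Violation\s+\(0x[0-9a-fA-F]+\)"),
}


def tally(stderr_text):
    """Count how often each known message appears in a stderr dump."""
    return {name: len(pat.findall(stderr_text)) for name, pat in PATTERNS.items()}


if __name__ == "__main__":
    # Short excerpt copied from the end of the stderr above.
    excerpt = (
        "11:19:36 (6132): No heartbeat from core client for 30 sec - exiting "
        "CPDN Monitor - No 'heartbeat' from BOINC... "
        "Unhandled Exception Detected... - Unhandled Exception Record - "
        "Reason: Access Violation (0xc0000005) at address 0x77B52957 "
        "read attempt to address 0xFFFFFFF9"
    )
    for name, count in tally(excerpt).items():
        print(f"{name}: {count}")
```

The CPDN Monitor line (`No 'heartbeat' from BOINC`) is deliberately not counted under `no_heartbeat`, since it is a separate message written by a different component.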

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
28 Nov 2011 13:26:52 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 777,600 | 1,558,862 | 2.0047 |
26 Nov 2011 14:46:02 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 751,680 | 1,503,149 | 1.9997 |
20 Nov 2011 19:34:32 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 725,760 | 1,449,673 | 1.9975 |
19 Nov 2011 15:35:33 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 699,840 | 1,397,241 | 1.9965 |
17 Nov 2011 11:29:04 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 673,920 | 1,345,039 | 1.9958 |
15 Nov 2011 18:12:42 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 648,000 | 1,292,997 | 1.9954 |
07 Nov 2011 21:02:27 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 622,080 | 1,240,702 | 1.9944 |
06 Nov 2011 17:35:40 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 596,160 | 1,188,944 | 1.9943 |
03 Nov 2011 20:46:19 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 570,240 | 1,135,827 | 1.9918 |
31 Oct 2011 17:34:06 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 544,320 | 1,086,435 | 1.9959 |
31 Oct 2011 16:42:49 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 518,400 | 1,036,125 | 1.9987 |
31 Oct 2011 14:09:16 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 492,480 | 984,091 | 1.9982 |
31 Oct 2011 14:09:16 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 466,560 | 932,092 | 1.9978 |
18 Oct 2011 21:21:56 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 440,640 | 879,473 | 1.9959 |
17 Oct 2011 12:19:55 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 414,720 | 826,761 | 1.9935 |
15 Oct 2011 18:23:15 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 388,800 | 774,415 | 1.9918 |
13 Oct 2011 15:37:38 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 362,880 | 722,765 | 1.9917 |
12 Oct 2011 12:08:26 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 336,960 | 672,096 | 1.9946 |
10 Oct 2011 15:17:28 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 311,040 | 621,894 | 1.9994 |
08 Oct 2011 14:15:26 | 1041747 | 13366912 | hadcm3n_p0hn_1940_40_007450449_0 | 285,120 | 570,236 | 2.0000 |
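
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That reading is an inference from the figures, not something the page states. A minimal sketch that checks it against three rows copied from the table (the function name `sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
ROWS = [
    (777_600, 1_558_862, 2.0047),
    (751_680, 1_503_149, 1.9997),
    (285_120, 570_236, 2.0000),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep, rounded as in the table."""
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in ROWS:
        computed = sec_per_timestep(cpu_time_sec, timestep)
        print(f"{timestep}: computed {computed:.4f}, reported {reported:.4f}")
```

Because the figure is cumulative, it changes only slowly from one trickle to the next, staying close to 2 seconds per timestep across the rows shown.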