Name | hadcm3n_p1lm_1940_40_007422379_0 |
Workunit | 7620014 |
Created | 25 Aug 2011, 5:39:38 UTC |
Sent | 26 Aug 2011, 0:30:34 UTC |
Report deadline | 25 Nov 2011, 7:57:45 UTC |
Received | 26 Sep 2011, 14:17:40 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 775427 |
Run time | 19 days 2 hours 23 min 33 sec |
CPU time | 17 days 15 hours 0 min 15 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.31 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... 21:53:51 (6808): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 21:27:30 (6492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:21:22 (2864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:37:26 (2868): No heartbeat from core client for 30 sec - exiting 09:37:27 (2868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6160, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1784, iMonCtr=1 Model crash detected, will try to restart... 10:52:07 (4044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:52:08 (4044): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5676, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1 Model crash detected, will try to restart... 21:56:44 (4664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7620, iMonCtr=1 Model crash detected, will try to restart... 22:23:39 (4964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C08:46:12 (5164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... 17:02:43 (6000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:29:03 (7060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:29:04 (7060): No heartbeat from core client for 30 sec - exiting BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/p1lmko.pjf3c10 Error converting file to netcdf: dataout/p1lmko.pif3c10 Error converting file to netcdf: dataout/p1lmko.pff3c10 Error converting file to netcdf: dataout/p1lmka.phf3c10 Error converting file to netcdf: dataout/p1lmka.pgf3c10 Error converting file to netcdf: dataout/p1lmka.pef3c10 Error converting file to netcdf: dataout/p1lmka.pdf3c10 08:03:26 (6048): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
08:04:06 (5944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5876, iMonCtr=1 Model crash detected, will try to restart... 09:05:23 (5172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:02:00 (5408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:36:20 (3628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:37:53 (2092): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:40:32 (3560): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:40:34 (3560): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 12:04:41 (3068): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:05:33 (5280): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:06:11 (5564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:45:03 (5704): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:46:42 (3620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:50:12 (5152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:00:02 (5868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
13:04:50 (4956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:20:41 (6032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2052, iMonCtr=1 Model crash detected, will try to restart... 10:23:09 (3472): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:05:19 (3784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:13:18 (4068): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:13:20 (4068): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 23:15:41 (6920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6880, iMonCtr=1 Model crash detected, will try to restart... 09:41:38 (5248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:58:04 (920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:58:06 (920): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 10:43:29 (6916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:38:32 (3224): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:35:31 (4148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=1 Model crash detected, will try to restart... 13:21:04 (4736): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:21:46 (7304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 16:25:26 (708): No heartbeat from core client for 30 sec - exiting 16:25:27 (708): No heartbeat from core client for 30 sec - exiting 16:25:28 (708): No heartbeat from core client for 30 sec - exiting 16:25:29 (708): No heartbeat from core client for 30 sec - exiting 16:25:30 (708): No heartbeat from core client for 30 sec - exiting 16:25:31 (708): No heartbeat from core client for 30 sec - exiting 16:25:32 (708): No heartbeat from core client for 30 sec - exiting 16:25:33 (708): No heartbeat from core client for 30 sec - exiting 16:25:34 (708): No heartbeat from core client for 30 sec - exiting 16:25:35 (708): No heartbeat from core client for 30 sec - exiting 16:25:36 (708): No heartbeat from core client for 30 sec - exiting 16:25:37 (708): No heartbeat from core client for 30 sec - exiting 16:25:38 (708): No heartbeat from core client for 30 sec - exiting 16:25:39 (708): No heartbeat from core client for 30 sec - exiting 16:25:40 (708): No heartbeat from core client for 30 sec - exiting 16:25:41 (708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/p1lmko.pjf6c10 Error converting file to netcdf: dataout/p1lmko.pif6c10 Error converting file to netcdf: dataout/p1lmko.pff6c10 Error converting file to netcdf: dataout/p1lmka.phf6c10 Error converting file to netcdf: dataout/p1lmka.pgf6c10 Error converting file to netcdf: dataout/p1lmka.pef6c10 Error converting file to netcdf: dataout/p1lmka.pdf6c10 22:07:15 (1984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:07:16 (1984): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6040, iMonCtr=1 Model crash detected, will try to restart... 23:08:43 (5380): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CSuspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4680, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5680, iMonCtr=1 Model crash detected, will try to restart... 11:49:14 (6672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:54:48 (7904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:07:36 (8132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
22:34:15 (2496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:28:02 (5184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:28:29 (5184): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:57:04 (6852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:57:36 (6852): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 12:06:36 (8052): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 18:20:33 (6956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2512, iMonCtr=1 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... 22:30:52 (4348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/p1lmko.pjg3c10 Error converting file to netcdf: dataout/p1lmko.pig3c10 Error converting file to netcdf: dataout/p1lmko.pfg3c10 Error converting file to netcdf: dataout/p1lmka.phg3c10 Error converting file to netcdf: dataout/p1lmka.pgg3c10 Error converting file to netcdf: dataout/p1lmka.peg3c10 Error converting file to netcdf: dataout/p1lmka.pdg3c10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6056, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4768, iMonCtr=1 Model crash detected, will try to restart... 12:19:43 (6100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:07:25 (2928): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:00:16 (3292): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2572, iMonCtr=1 Model crash detected, will try to restart... 11:53:56 (7032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:09:05 (7104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:27:49 (7680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7616, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=568, iMonCtr=1 Model crash detected, will try to restart... 07:57:28 (3776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:02:22 (6740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:13:31 (7204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1 Model crash detected, will try to restart... 01:21:17 (3508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77576E0F read attempt to address 0x4053F9A4 Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\BOINC/projects/climateprediction.net/hadcm3n_p1lm_1940_40_007422379/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
26 Sep 2011 13:17:48 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 777,600 | 1,522,812 | 1.9583 |
25 Sep 2011 20:25:45 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 751,680 | 1,468,086 | 1.9531 |
24 Sep 2011 19:34:30 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 725,760 | 1,417,318 | 1.9529 |
23 Sep 2011 18:15:04 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 699,840 | 1,366,897 | 1.9532 |
22 Sep 2011 17:58:53 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 673,920 | 1,313,565 | 1.9491 |
21 Sep 2011 14:51:25 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 648,000 | 1,260,880 | 1.9458 |
18 Sep 2011 15:48:21 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 622,080 | 1,208,139 | 1.9421 |
17 Sep 2011 04:33:20 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 596,160 | 1,157,495 | 1.9416 |
16 Sep 2011 05:42:49 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 570,240 | 1,106,060 | 1.9396 |
15 Sep 2011 13:58:37 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 544,320 | 1,055,681 | 1.9394 |
14 Sep 2011 04:09:36 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 518,400 | 1,005,484 | 1.9396 |
13 Sep 2011 01:38:45 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 492,480 | 955,527 | 1.9402 |
12 Sep 2011 04:40:32 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 466,560 | 905,491 | 1.9408 |
11 Sep 2011 14:03:02 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 440,640 | 854,070 | 1.9382 |
10 Sep 2011 23:31:26 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 414,720 | 802,932 | 1.9361 |
09 Sep 2011 19:44:24 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 388,800 | 752,262 | 1.9348 |
08 Sep 2011 19:21:10 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 362,880 | 700,866 | 1.9314 |
07 Sep 2011 14:09:05 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 336,960 | 651,697 | 1.9340 |
06 Sep 2011 16:47:31 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 311,040 | 602,985 | 1.9386 |
06 Sep 2011 02:03:17 | 775427 | 13293424 | hadcm3n_p1lm_1940_40_007422379_0 | 285,120 | 553,713 | 1.9420 |
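
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` is hypothetical and chosen for illustration; it is not part of BOINC or the CPDN software, and the formula is inferred from the figures rather than from project documentation.

```python
def seconds_per_timestep(cpu_seconds: float, timesteps: int) -> float:
    """Return cumulative CPU seconds divided by cumulative model timesteps.

    Assumption: this is how the 'Average (sec/TS)' column is derived.
    """
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    return cpu_seconds / timesteps


# (timestep, CPU time in seconds, reported average) copied from the trickle table
rows = [
    (777_600, 1_522_812, 1.9583),
    (751_680, 1_468_086, 1.9531),
    (285_120, 553_713, 1.9420),
]

for timesteps, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>9,d}  computed={computed:.4f}  reported={reported:.4f}")
```

For these rows the computed value matches the reported one to four decimal places, which is consistent with the assumed formula but does not prove it.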
©2024 cpdn.org