climateprediction.net home page
Task 11689521

Task 11689521

Name famous_v04w_1599_200_006685950_2
Workunit 6889203
Created 26 Aug 2010, 15:39:04 UTC
Sent 27 Aug 2010, 9:24:19 UTC
Report deadline 26 Nov 2010, 16:51:30 UTC
Received 20 Oct 2010, 17:01:56 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1055314
Run time 8 days 10 hours 0 min 24 sec
CPU time 6 days 0 hours 26 min 15 sec
Validate state Invalid
Credit 555.96
Device peak FLOPS 0.67 GFLOPS
Application version UK Met Office FAMOUS v6.11
windows_intelx86
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4036, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPIController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4444, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4028, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4912, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4040, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3948, iMonCtr=1
Model crash detected, will try to restart...
19:15:18 (4044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:14:01 (5448): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5552, iMonCtr=1
Model crash detected, will try to restart...
21:08:31 (4760): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:08:29 (3960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:16:10 (4820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:07:15 (2352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
19:18:48 (4080): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:52:42 (3668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4068, iMonCtr=1
Model crash detected, will try to restart...
11:34:29 (5012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:46:27 (4132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:45:38 (5124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:44:14 (5956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:42:59 (4200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4156, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3936, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
14:00:34 (5112): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=556, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
10:32:03 (4428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:08:31 (4512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:07:25 (4036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
19:41:05 (2248): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:40:09 (4452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:37:30 (4740): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:36:28 (3092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:35:37 (4284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
20 Oct 2010 11:49:41 1055314 11689521 famous_v04w_1599_200_006685950_2 168,506 510,684 3.0307
17 Oct 2010 11:08:18 1055314 11689521 famous_v04w_1599_200_006685950_2 159,146 482,577 3.0323
16 Oct 2010 07:10:49 1055314 11689521 famous_v04w_1599_200_006685950_2 149,786 453,854 3.0300
23 Sep 2010 06:12:17 1055314 11689521 famous_v04w_1599_200_006685950_2 140,426 425,541 3.0304
21 Sep 2010 20:44:12 1055314 11689521 famous_v04w_1599_200_006685950_2 131,066 397,043 3.0293
20 Sep 2010 11:14:32 1055314 11689521 famous_v04w_1599_200_006685950_2 121,706 368,429 3.0272
15 Sep 2010 14:17:03 1055314 11689521 famous_v04w_1599_200_006685950_2 112,346 341,279 3.0377
14 Sep 2010 15:15:58 1055314 11689521 famous_v04w_1599_200_006685950_2 102,986 312,857 3.0379
13 Sep 2010 10:28:10 1055314 11689521 famous_v04w_1599_200_006685950_2 93,626 284,569 3.0394
12 Sep 2010 10:01:50 1055314 11689521 famous_v04w_1599_200_006685950_2 84,266 256,276 3.0413
11 Sep 2010 13:24:57 1055314 11689521 famous_v04w_1599_200_006685950_2 74,906 227,951 3.0432
09 Sep 2010 17:10:18 1055314 11689521 famous_v04w_1599_200_006685950_2 65,546 199,565 3.0447
08 Sep 2010 12:56:46 1055314 11689521 famous_v04w_1599_200_006685950_2 56,186 171,038 3.0441
07 Sep 2010 09:34:37 1055314 11689521 famous_v04w_1599_200_006685950_2 46,826 141,517 3.0222
04 Sep 2010 08:33:34 1055314 11689521 famous_v04w_1599_200_006685950_2 37,466 112,549 3.0040
04 Sep 2010 06:24:41 1055314 11689521 famous_v04w_1599_200_006685950_2 28,106 84,496 3.0063
30 Aug 2010 05:50:43 1055314 11689521 famous_v04w_1599_200_006685950_2 18,746 56,221 2.9991
28 Aug 2010 16:59:01 1055314 11689521 famous_v04w_1599_200_006685950_2 9,386 28,594 3.0465


©2024 cpdn.org