climateprediction.net home page
Task 12496664

Name famous_wr7k_599_200_007122023_0
Workunit 7320383
Created 16 Jan 2011, 16:41:44 UTC
Sent 17 Jan 2011, 15:16:26 UTC
Report deadline 18 Apr 2011, 22:43:37 UTC
Received 13 Jun 2011, 22:42:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1103724
Run time 31 days 6 hours 24 min 19 sec
CPU time 19 days 15 hours 34 min 39 sec
Validate state Invalid
Credit 3,891.17
Device peak FLOPS 0.82 GFLOPS
Application version UK Met Office FAMOUS v6.11 windows_intelx86
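
The exit status above is shown both as a signed decimal (-226) and as hexadecimal (0xFFFFFF1E). The following is a minimal sketch of how the two forms relate, assuming the status is held as a 32-bit two's-complement integer (the page does not state the width, so that is an assumption). The helper names `to_hex32` and `from_hex32` are hypothetical and are not part of BOINC.

```python
def to_hex32(status: int) -> str:
    """Render a signed exit status as unsigned 32-bit hex (assumed width)."""
    return f"0x{status & 0xFFFFFFFF:08X}"


def from_hex32(text: str) -> int:
    """Interpret a 32-bit hex string as a signed two's-complement value."""
    value = int(text, 16)
    # Values with the top bit set represent negative numbers.
    return value - (1 << 32) if value >= (1 << 31) else value


print(to_hex32(-226))
print(from_hex32("0xFFFFFF1E"))
```

Under that assumption, both directions agree with the pair of values reported in the Exit status line.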
Stderr
<core_client_version>6.6.28</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
09:51:42 (1408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5528, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2060, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5788, iMonCtr=1
Model crash detected, will try to restart...
09:52:52 (5928): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CCCPDN Monitor - Quit request from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
07:43:31 (4452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5144, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5052, selfPID=5052, iMonCtr=1

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4720, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
16:13:54 (4324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat
BUFFIN: Read Failed: No such file or directory
BU BOIIN:...
 
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3768, iMonCtr=1
Model crash detected, will try to restart...
09:34:29 (5292): No heartbeat from core client for 30 sec - exiting
09:35:47 (5292): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3060, iMonCtr=1
Model crash detected, will try to restart...
18:45:26 (5184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5392, iMonCtr=1
Model crash detected, will try to restart...
15:35:02 (3824): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7820, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=280, iMonCtr=1
Model crash detected, will try to restart...
23:17:28 (544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:22:34 (6024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5880, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
09:30:11 (4728): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:20:43 (5808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
08:54:25 (5460): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7068, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
06:24:24 (5636): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=920, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6644, iMonCtr=1
Model crash detected, will try to restart...
09:20:10 (5960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:05:13 (7440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:11:43 (5972): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:11:44 (5972): No heartbeat from core client for 30 sec - exiting
14:11:45 (5972): No heartbeat from core client for 30 sec - exiting
14:11:46 (5972): No heartbeat from core client for 30 sec - exiting
14:11:47 (5972): No heartbeat from core client for 30 sec - exiting
14:11:48 (5972): No heartbeat from core client for 30 sec - exiting
14:11:49 (5972): No heartbeat from core client for 30 sec - exiting
14:11:50 (5972): No heartbeat from core client for 30 sec - exiting
14:11:51 (5972): No heartbeat from core client for 30 sec - exiting
14:11:52 (5972): No heartbeat from core client for 30 sec - exiting
14:11:53 (5972): No heartbeat from core client for 30 sec - exiting
14:11:54 (5972): No heartbeat from core client for 30 sec - exiting
14:11:55 (5972): No heartbeat from core client for 30 sec - exiting
14:11:56 (5972): No heartbeat from core client for 30 sec - exiting
14:11:57 (5972): No heartbeat from core client for 30 sec - exiting
14:11:58 (5972): No heartbeat from core client for 30 sec - exiting
14:11:59 (5972): No heartbeat from core client for 30 sec - exiting
14:12:00 (5972): No heartbeat from core client for 30 sec - exiting
14:12:01 (5972): No heartbeat from core client for 30 sec - exiting
14:12:02 (5972): No heartbeat from core client for 30 sec - exiting
14:12:03 (5972): No heartbeat from core client for 30 sec - exiting
14:13:06 (5972): No heartbeat from core client for 30 sec - exiting
14:13:07 (5972): No heartbeat from core client for 30 sec - exiting
14:13:08 (5972): No heartbeat from core client for 30 sec - exiting
14:13:09 (5972): No heartbeat from core client for 30 sec - exiting
14:13:10 (5972): No heartbeat from core client for 30 sec - exiting
14:13:11 (5972): No heartbeat from core client for 30 sec - exiting
14:13:12 (5972): No heartbeat from core client for 30 sec - exiting
C
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
19:45:53 (3348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:38:55 (4772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:56:55 (3160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

</stderr_txt>
]]>
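
The stderr text above repeats a small number of message types many times. The sketch below shows one possible way to tally them; the category names and the `summarize` function are hypothetical, and the sample input is a handful of lines copied from the log above rather than the full output.

```python
import re
from collections import Counter

# Hypothetical category names mapped to substrings seen in the log above.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "model_restart": re.compile(r"Model crash detected"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "buffin_read_failed": re.compile(r"BUFFIN: Read Failed"),
}


def summarize(stderr_text: str) -> Counter:
    """Count how many lines match each known message type."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = """\
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
09:51:42 (1408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
"""

for name, count in sorted(summarize(sample).items()):
    print(name, count)
```

Run against the complete stderr text, a tally like this would show how often the model restarted, which may help relate the log to the ERR_TOO_MANY_EXITS outcome.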
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                      Timestep   CPU Time (sec)  Average (sec/TS)
13 Jun 2011 21:02:06  1103724  12496664   famous_wr7k_599_200_007122023_0  1,179,386  1,694,506       1.4368
10 Jun 2011 14:26:38  1103724  12496664   famous_wr7k_599_200_007122023_0  1,170,026  1,671,872       1.4289
08 Jun 2011 06:17:44  1103724  12496664   famous_wr7k_599_200_007122023_0  1,160,666  1,656,782       1.4274
07 Jun 2011 14:22:49  1103724  12496664   famous_wr7k_599_200_007122023_0  1,151,306  1,642,017       1.4262
07 Jun 2011 01:39:29  1103724  12496664   famous_wr7k_599_200_007122023_0  1,141,946  1,628,360       1.4260
06 Jun 2011 20:35:59  1103724  12496664   famous_wr7k_599_200_007122023_0  1,132,586  1,614,411       1.4254
06 Jun 2011 14:49:47  1103724  12496664   famous_wr7k_599_200_007122023_0  1,123,226  1,597,639       1.4224
05 Jun 2011 04:09:44  1103724  12496664   famous_wr7k_599_200_007122023_0  1,113,866  1,582,596       1.4208
03 Jun 2011 21:50:11  1103724  12496664   famous_wr7k_599_200_007122023_0  1,104,506  1,560,709       1.4130
03 Jun 2011 13:55:46  1103724  12496664   famous_wr7k_599_200_007122023_0  1,095,146  1,542,167       1.4082
01 Jun 2011 19:41:04  1103724  12496664   famous_wr7k_599_200_007122023_0  1,085,786  1,526,140       1.4056
01 Jun 2011 01:56:53  1103724  12496664   famous_wr7k_599_200_007122023_0  1,076,426  1,507,782       1.4007
31 May 2011 14:50:46  1103724  12496664   famous_wr7k_599_200_007122023_0  1,067,066  1,487,695       1.3942
27 May 2011 14:12:10  1103724  12496664   famous_wr7k_599_200_007122023_0  1,057,706  1,468,544       1.3884
25 May 2011 18:07:54  1103724  12496664   famous_wr7k_599_200_007122023_0  1,048,346  1,454,119       1.3871
23 May 2011 13:30:36  1103724  12496664   famous_wr7k_599_200_007122023_0  1,038,986  1,426,134       1.3726
19 May 2011 20:12:14  1103724  12496664   famous_wr7k_599_200_007122023_0  1,029,626  1,401,770       1.3614
16 May 2011 21:54:29  1103724  12496664   famous_wr7k_599_200_007122023_0  1,020,266  1,376,816       1.3495
11 May 2011 14:39:54  1103724  12496664   famous_wr7k_599_200_007122023_0  1,010,906  1,347,337       1.3328
08 May 2011 15:39:29  1103724  12496664   famous_wr7k_599_200_007122023_0  1,001,546  1,325,831       1.3238
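
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That reading is inferred from the numbers themselves and is not stated on the page. A minimal sketch, with a hypothetical function name, checks it against the newest and oldest rows shown:

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by timesteps, to four decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table above.
rows = [
    (1_179_386, 1_694_506, 1.4368),  # 13 Jun 2011 21:02:06
    (1_001_546, 1_325_831, 1.3238),  # 08 May 2011 15:39:29
]

for timestep, cpu_time, reported in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(timestep, computed, reported)
```

Consecutive rows in the table also differ by 9,360 timesteps, so the same calculation can be repeated row by row to see the average drift upward over time.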

