Task 11569721

Name famous_unv5_799_200_006663772_3
Workunit 6867144
Created 10 Jun 2010, 15:31:03 UTC
Sent 3 Jul 2010, 18:50:41 UTC
Report deadline 3 Oct 2010, 2:17:52 UTC
Received 16 Jul 2010, 17:33:49 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1085414
Run time 8 days 7 hours 8 min 37 sec
CPU time 7 days 18 hours 9 min 11 sec
Validate state Invalid
Credit 1,667.69
Device peak FLOPS 0.74 GFLOPS
Application version UK Met Office FAMOUS v6.11 (i686-pc-linux-gnu)
Stderr
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1404, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 3 received, exiting...
 (716): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20174, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20174, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
 (1274): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1291): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1311): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1329): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1349): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1369): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1385): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (1385): No heartbeat from core client for 30 sec - exiting
Signal 3 received, exiting...
 (1411): called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 3 received, exiting...
 (1451): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7956, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7956, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7996, selfPID=7996, iMonCtr=1
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9204, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9204, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 3 received, exiting...
 (15472): called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7643, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7643, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7643, iMonCtr=1
Model crash detected, will try to restart...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=17276, selfPID=17276, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 1

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 1

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 1

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 60 - Return code = 1

BUFFIN: Read Failed: Invalid argument
BUFFIN: C I/O Error feof - Unit 61 - Return code = 1

BUFFIN: Read Failed: Invalid argument
BUFFIN: C I/O Error feof - Unit 68 - Return code = 1

BUFFIN: Read Failed: Invalid argument
BUFFIN: C I/O Error feof - Unit 69 - Return code = 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=18043, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
 (18043): called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
16 Jul 2010 16:44:13 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 505,466 | 669,906 | 1.3253
16 Jul 2010 11:58:21 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 496,106 | 656,747 | 1.3238
16 Jul 2010 05:35:54 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 486,746 | 643,924 | 1.3229
16 Jul 2010 01:58:37 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 477,386 | 631,738 | 1.3233
15 Jul 2010 23:37:08 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 468,026 | 620,554 | 1.3259
15 Jul 2010 02:13:58 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 458,666 | 609,004 | 1.3278
14 Jul 2010 18:48:19 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 449,306 | 595,684 | 1.3258
14 Jul 2010 15:23:20 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 439,946 | 584,054 | 1.3276
14 Jul 2010 11:30:26 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 430,586 | 572,497 | 1.3296
14 Jul 2010 08:15:24 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 421,226 | 561,173 | 1.3322
14 Jul 2010 05:07:26 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 411,866 | 550,301 | 1.3361
14 Jul 2010 02:01:18 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 402,506 | 539,445 | 1.3402
13 Jul 2010 23:51:55 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 393,146 | 528,600 | 1.3445
13 Jul 2010 19:36:14 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 383,786 | 517,423 | 1.3482
13 Jul 2010 16:23:31 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 374,426 | 506,071 | 1.3516
13 Jul 2010 12:59:01 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 365,066 | 494,637 | 1.3549
13 Jul 2010 09:30:59 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 355,706 | 482,850 | 1.3574
13 Jul 2010 06:05:34 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 346,346 | 471,497 | 1.3613
13 Jul 2010 02:53:15 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 336,986 | 460,496 | 1.3665
12 Jul 2010 23:41:57 | 1085414 | 11569721 | famous_unv5_799_200_006663772_3 | 327,626 | 449,558 | 1.3722
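
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached. This is an assumption inferred from the rows above, not something the page states. A minimal Python sketch that recomputes the value for the two most recent trickles (the function name is hypothetical, chosen for illustration):

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by timesteps completed (assumed formula)."""
    return cpu_time_sec / timestep


# (timestep, cpu_time_sec) copied from the two most recent trickle rows.
rows = [
    (505_466, 669_906),
    (496_106, 656_747),
]

for timestep, cpu_time_sec in rows:
    avg = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:,} {cpu_time_sec:,} {avg:.4f}")
```

Under that assumption the recomputed averages agree with the listed values to four decimal places. Consecutive trickles in the table are also 9,360 timesteps apart.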

