Name | famous_voa1_799_200_006734719_1 |
Workunit | 6938060 |
Created | 13 Jan 2011, 1:22:45 UTC |
Sent | 13 Jan 2011, 12:47:05 UTC |
Report deadline | 14 Apr 2011, 20:14:16 UTC |
Received | 3 Feb 2011, 16:47:21 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1066945 |
Run time | 7 days 7 hours 58 min 30 sec |
CPU time | 6 days 9 hours 38 min 39 sec |
Validate state | Invalid |
Credit | 3,644.12 |
Device peak FLOPS | 2.28 GFLOPS |
Application version | UK Met Office FAMOUS v6.11 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3448, iMonCtr=1
Model crash detected, will try to restart...
09:03:54 (4672): No heartbeat from core client for 30 sec - exiting
09:03:55 (4672): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:47:01 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3624, iMonCtr=1
Model crash detected, will try to restart...
11:23:17 (1652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:19:36 (4040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:20:22 (3624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:21:41 (4668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4088, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1
Model crash detected, will try to restart...
09:25:06 (3648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:25:48 (4052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:26:26 (3716): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
09:05:03 (3464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
09:06:23 (2024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
10:08:43 (3560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:15:21 (2304): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:16:12 (2168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:28:01 (3552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:28:52 (1840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4644, iMonCtr=1
Model crash detected, will try to restart...
16:01:37 (3064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:02:29 (3624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1700, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
</stderr_txt>
]]> |
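
The exit status listed above appears in two forms: the signed value -226 and the hexadecimal 0xFFFFFF1E. These are the same 32-bit pattern read as signed and unsigned respectively. The following is a minimal sketch of that conversion; the function names are hypothetical and are not part of BOINC or the CPDN software.

```python
def to_unsigned32_hex(code: int) -> str:
    """Format a signed 32-bit exit code as its two's-complement hex pattern."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def from_unsigned32_hex(text: str) -> int:
    """Interpret a 32-bit hex pattern as a signed integer."""
    value = int(text, 16)
    # Values with the top bit set represent negative numbers.
    return value - (1 << 32) if value & 0x80000000 else value


hex_form = to_unsigned32_hex(-226)
signed_form = from_unsigned32_hex("0xFFFFFF1E")
print(hex_form, signed_form)
```

Masking with 0xFFFFFFFF is used here because Python integers are unbounded, so the 32-bit width has to be imposed explicitly.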
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Feb 2011 17:09:15 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,104,506 | 551,939 | 0.4997 |
01 Feb 2011 14:24:10 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,095,146 | 547,809 | 0.5002 |
01 Feb 2011 13:00:56 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,085,786 | 543,240 | 0.5003 |
01 Feb 2011 12:54:26 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,076,426 | 538,688 | 0.5004 |
31 Jan 2011 23:02:03 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,067,066 | 534,015 | 0.5005 |
31 Jan 2011 18:20:48 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,057,706 | 529,396 | 0.5005 |
31 Jan 2011 16:37:21 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,048,346 | 524,828 | 0.5006 |
30 Jan 2011 21:41:00 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,038,986 | 520,150 | 0.5006 |
30 Jan 2011 18:27:45 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,029,626 | 515,328 | 0.5005 |
30 Jan 2011 15:54:47 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,020,266 | 510,573 | 0.5004 |
30 Jan 2011 14:14:47 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,010,906 | 505,912 | 0.5005 |
30 Jan 2011 12:22:39 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 1,001,546 | 501,169 | 0.5004 |
30 Jan 2011 10:48:28 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 992,186 | 496,500 | 0.5004 |
29 Jan 2011 16:23:52 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 982,826 | 491,717 | 0.5003 |
29 Jan 2011 14:53:07 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 973,466 | 487,098 | 0.5004 |
29 Jan 2011 13:22:55 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 964,106 | 482,504 | 0.5005 |
29 Jan 2011 12:26:39 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 954,746 | 477,901 | 0.5006 |
28 Jan 2011 21:32:07 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 945,386 | 473,259 | 0.5006 |
28 Jan 2011 19:48:54 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 936,026 | 468,518 | 0.5005 |
28 Jan 2011 18:52:25 | 1066945 | 12477749 | famous_voa1_799_200_006734719_1 | 926,666 | 463,844 | 0.5006 |
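
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. This is inferred from the figures themselves, not from CPDN documentation. A minimal sketch that recomputes it for the three most recent trickles, using values copied from the table (the function name is hypothetical):

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


# (timestep, CPU time in seconds) taken from the first three table rows.
rows = [
    (1_104_506, 551_939),
    (1_095_146, 547_809),
    (1_085_786, 543_240),
]

averages = [round(seconds_per_timestep(cpu, ts), 4) for ts, cpu in rows]
print(averages)
```

Rounded to four decimal places, the results agree with the values shown for those rows (0.4997, 0.5002, 0.5003).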