Name | hadcm3n_n5b2_1920_40_008321509_0 |
Workunit | 8472644 |
Created | 24 Feb 2013, 19:43:05 UTC |
Sent | 24 Feb 2013, 19:46:58 UTC |
Report deadline | 27 May 2013, 3:14:09 UTC |
Received | 21 Apr 2013, 5:38:54 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1228218 |
Run time | 24 days 13 hours 42 min 16 sec |
CPU time | 19 days 1 hours 46 min 19 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 1.82 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4504, iMonCtr=1 Model crash detected, will try to restart... 14:08:29 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3712, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3148, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3912, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4192, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 12:22:16 (3932): No heartbeat from core client for 30 sec - exiting 12:22:17 (3932): No heartbeat from core client for 30 sec - exiting 12:22:18 (3932): No heartbeat from core client for 30 sec - exiting 12:22:19 (3932): No heartbeat from core client for 30 sec - exiting 12:22:20 (3932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:22:21 (3932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 09:42:07 (3216): No heartbeat from core client for 30 sec - exiting 09:42:08 (3216): No heartbeat from core client for 30 sec - exiting 09:42:09 (3216): No heartbeat from core client for 30 sec - exiting 09:42:10 (3216): No heartbeat from core client for 30 sec - exiting 09:42:11 (3216): No heartbeat from core client for 30 sec - exiting 09:42:12 (3216): No heartbeat from core client for 30 sec - exiting 09:42:13 (3216): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4352, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4048, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
09:07:06 (4476): No heartbeat from core client for 30 sec - exiting 09:07:07 (4476): No heartbeat from core client for 30 sec - exiting 09:07:08 (4476): No heartbeat from core client for 30 sec - exiting 09:07:09 (4476): No heartbeat from core client for 30 sec - exiting 09:07:10 (4476): No heartbeat from core client for 30 sec - exiting 09:07:11 (4476): No heartbeat from core client for 30 sec - exiting 09:07:12 (4476): No heartbeat from core client for 30 sec - exiting 09:07:13 (4476): No heartbeat from core client for 30 sec - exiting 09:07:15 (4476): No heartbeat from core client for 30 sec - exiting 09:07:16 (4476): No heartbeat from core client for 30 sec - exiting 09:07:17 (4476): No heartbeat from core client for 30 sec - exiting 09:07:18 (4476): No heartbeat from core client for 30 sec - exiting 09:07:19 (4476): No heartbeat from core client for 30 sec - exiting 09:07:20 (4476): No heartbeat from core client for 30 sec - exiting 09:07:21 (4476): No heartbeat from core client for 30 sec - exiting 09:07:22 (4476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 08:25:31 (4240): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 08:55:26 (3760): No heartbeat from core client for 30 sec - exiting 08:55:27 (3760): No heartbeat from core client for 30 sec - exiting 08:55:28 (3760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
23:21:44 (4580): No heartbeat from core client for 30 sec - exiting 23:21:45 (4580): No heartbeat from core client for 30 sec - exiting 23:21:46 (4580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=772, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=1 Model crash detected, will try to restart... 15:02:45 (3648): No heartbeat from core client for 30 sec - exiting 15:02:46 (3648): No heartbeat from core client for 30 sec - exiting 15:02:47 (3648): No heartbeat from core client for 30 sec - exiting 15:02:48 (3648): No heartbeat from core client for 30 sec - exiting 15:02:49 (3648): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3896, iMonCtr=1 Model crash detected, will try to restart... 15:34:26 (4104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3992, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
20 Apr 2013 23:49:23 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 518,400 | 1,647,972 | 3.1790 |
19 Apr 2013 15:25:40 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 492,480 | 1,570,383 | 3.1887 |
16 Apr 2013 12:20:54 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 466,560 | 1,491,893 | 3.1976 |
13 Apr 2013 12:34:43 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 440,640 | 1,411,850 | 3.2041 |
10 Apr 2013 19:27:30 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 414,720 | 1,331,317 | 3.2102 |
07 Apr 2013 17:06:21 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 388,800 | 1,246,965 | 3.2072 |
04 Apr 2013 15:05:36 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 362,880 | 1,165,424 | 3.2116 |
02 Apr 2013 09:13:45 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 336,960 | 1,082,907 | 3.2138 |
30 Mar 2013 12:55:44 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 311,040 | 999,983 | 3.2150 |
28 Mar 2013 11:09:38 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 285,120 | 916,604 | 3.2148 |
25 Mar 2013 21:05:29 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 259,200 | 831,611 | 3.2084 |
23 Mar 2013 11:29:20 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 233,280 | 746,483 | 3.1999 |
19 Mar 2013 15:57:33 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 207,360 | 664,002 | 3.2022 |
17 Mar 2013 12:59:53 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 181,440 | 580,349 | 3.1986 |
15 Mar 2013 08:13:21 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 155,520 | 498,531 | 3.2056 |
11 Mar 2013 16:05:48 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 129,600 | 415,482 | 3.2059 |
09 Mar 2013 17:20:55 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 103,680 | 334,339 | 3.2247 |
05 Mar 2013 18:17:02 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 77,760 | 251,064 | 3.2287 |
02 Mar 2013 21:26:38 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 51,840 | 167,214 | 3.2256 |
28 Feb 2013 13:40:06 | 1228218 | 15636743 | hadcm3n_n5b2_1920_40_008321509_0 | 25,920 | 82,415 | 3.1796 |
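
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The page does not state this formula; it is an assumption inferred from the column header and the figures. The sketch below checks it against three rows copied from the table. The function name `average_sec_per_timestep` is hypothetical.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
rows = [
    (518_400, 1_647_972, 3.1790),
    (259_200, 831_611, 3.2084),
    (25_920, 82_415, 3.1796),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Show the recomputed value next to the value reported on the page.
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

If the assumption holds, every recomputed value matches the reported one, which indicates that the per-timestep cost stayed close to 3.2 seconds for the whole run.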