Name | hadam3p_eu_2qss_1973_1_008220391_0 |
Workunit | 8375515 |
Created | 6 Oct 2012, 13:46:35 UTC |
Sent | 6 Oct 2012, 13:48:00 UTC |
Report deadline | 18 Sep 2013, 19:08:00 UTC |
Received | 8 Jan 2013, 23:33:20 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 989394 |
Run time | 9 days 4 hours 23 min 23 sec |
CPU time | 6 days 6 hours 39 min 37 sec |
Validate state | Invalid |
Credit | 2,187.67 |
Device peak FLOPS | 1.46 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1280, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2636, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4508, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4792, selfPID=4556, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5416, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5764, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 12:23:31 (4600): No heartbeat from core client for 30 sec - exiting 12:23:32 (4600): No heartbeat from core client for 30 sec - exiting 12:23:33 (4600): No heartbeat from core client for 30 sec - exiting 12:23:34 (4600): No heartbeat from core client for 30 sec - exiting 12:23:35 (4600): No heartbeat from core client for 30 sec - exiting 12:23:36 (4600): No heartbeat from core client for 30 sec - exiting 12:23:37 (4600): No heartbeat from core client for 30 sec - exiting 12:23:38 (4600): No heartbeat from core client for 30 sec - exiting 12:23:39 (4600): No heartbeat from core client for 30 sec - exiting 12:23:40 (4600): No heartbeat from core client for 30 sec - exiting 12:23:41 (4600): No heartbeat from core client for 30 sec - exiting 12:23:42 (4600): No heartbeat from core client for 30 sec - exiting 12:23:43 (4600): No heartbeat from core client for 30 sec - exiting 12:23:44 (4600): No heartbeat from core client for 30 sec - exiting 12:23:45 (4600): No heartbeat from core client for 30 sec - exiting 12:23:46 (4600): No heartbeat from core client for 30 sec - exiting 12:23:47 (4600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6088, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4492, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4044, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2304, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4240, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5508, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4840, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2556, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1040, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6072, selfPID=5276, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5500, selfPID=4444, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5080, selfPID=3804, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5376, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5924, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3792, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4724, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5604, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1016, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4728, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6048, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2816, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4788, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5184, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3448, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4896, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5300, selfPID=4100, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4196, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4556, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5584, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5600, selfPID=4612, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4632, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4916, selfPID=4964, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4296, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4648, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4988, iMonCtr=2 Model crash detected, will try to restart... 
18:34:51 (5000): No heartbeat from core client for 30 sec - exiting 18:34:52 (5000): No heartbeat from core client for 30 sec - exiting 18:34:53 (5000): No heartbeat from core client for 30 sec - exiting 18:34:54 (5000): No heartbeat from core client for 30 sec - exiting 18:34:55 (5000): No heartbeat from core client for 30 sec - exiting 18:34:56 (5000): No heartbeat from core client for 30 sec - exiting 18:34:57 (5000): No heartbeat from core client for 30 sec - exiting 18:34:58 (5000): No heartbeat from core client for 30 sec - exiting 18:34:59 (5000): No heartbeat from core client for 30 sec - exiting 18:35:00 (5000): No heartbeat from core client for 30 sec - exiting 18:35:01 (5000): No heartbeat from core client for 30 sec - exiting 18:35:02 (5000): No heartbeat from core client for 30 sec - exiting 18:35:03 (5000): No heartbeat from core client for 30 sec - exiting 18:35:04 (5000): No heartbeat from core client for 30 sec - exiting 18:35:05 (5000): No heartbeat from core client for 30 sec - exiting 18:35:06 (5000): No heartbeat from core client for 30 sec - exiting 18:35:07 (5000): No heartbeat from core client for 30 sec - exiting 18:35:08 (5000): No heartbeat from core client for 30 sec - exiting 18:35:09 (5000): No heartbeat from core client for 30 sec - exiting 18:35:10 (5000): No heartbeat from core client for 30 sec - exiting 18:35:11 (5000): No heartbeat from core client for 30 sec - exiting 18:35:12 (5000): No heartbeat from core client for 30 sec - exiting 18:35:13 (5000): No heartbeat from core client for 30 sec - exiting 18:35:14 (5000): No heartbeat from core client for 30 sec - exiting 18:35:15 (5000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5400, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4988, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1844, selfPID=4872, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2924, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=2 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4840, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN proceController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5204, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5864, selfPID=4844, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4704, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3620, selfPID=5404, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3988, selfPID=5644, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5900, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4900, iMonCtr=2 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
08 Jan 2013 02:25:41 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 126,816 | 534,928 | 4.2181 |
28 Nov 2012 04:10:44 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 115,296 | 486,036 | 4.2155 |
22 Nov 2012 17:54:19 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 103,776 | 437,258 | 4.2135 |
18 Nov 2012 03:29:08 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 92,256 | 388,428 | 4.2103 |
10 Nov 2012 02:20:00 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 80,736 | 340,255 | 4.2144 |
07 Nov 2012 06:02:20 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 69,216 | 291,514 | 4.2117 |
01 Nov 2012 02:30:29 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 57,696 | 242,768 | 4.2077 |
26 Oct 2012 01:47:22 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 46,176 | 193,958 | 4.2004 |
19 Oct 2012 23:14:58 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 34,656 | 145,423 | 4.1962 |
16 Oct 2012 00:53:12 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 23,136 | 96,877 | 4.1873 |
10 Oct 2012 02:55:07 | 989394 | 15343822 | hadam3p_eu_2qss_1973_1_008220391_0 | 11,616 | 48,790 | 4.2002 |
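
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against two rows of the table; the function name `seconds_per_timestep` is made up for illustration and is not part of BOINC or the CPDN software.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places.

    Assumption: this mirrors how the Average (sec/TS) column is derived;
    the page itself does not state the formula.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the table rows
# dated 08 Jan 2013 and 10 Oct 2012.
rows = [
    (126_816, 534_928, 4.2181),
    (11_616, 48_790, 4.2002),
]

for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Compare with a small tolerance rather than exact float equality.
    matches = abs(computed - reported) < 5e-5
    print(f"timestep={timestep}: computed={computed} reported={reported} match={matches}")
```

Both rows agree with the reported values under this assumption, which suggests the model kept a steady pace of roughly 4.2 CPU seconds per timestep up to the last trickle before the failure.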