Name | hadam3p_eu_cnh1_2004_1_007988566_1 |
Workunit | 8143680 |
Created | 23 May 2012, 11:41:51 UTC |
Sent | 23 May 2012, 11:44:02 UTC |
Report deadline | 5 May 2013, 17:04:02 UTC |
Received | 8 Jul 2012, 11:41:43 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1202128 |
Run time | 6 days 15 hours 25 min 54 sec |
CPU time | 5 days 3 hours 43 min 36 sec |
Validate state | Invalid |
Credit | 1,988.94 |
Device peak FLOPS | 2.56 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10192, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5364, selfPID=4288, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4752, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4032, selfPID=3592, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6484, selfPID=8064, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7680, selfPID=6528, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6936, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7012, selfPID=6408, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6860, selfPID=1404, iMonCtr=1 Model crash detected, will try to restart... 14:35:34 (4260): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:35:36 (4260): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7444, selfPID=1624, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6996, selfPID=4700, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3312, selfPID=7252, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6004, selfPID=5640, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7696, selfPID=5336, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7780, selfPID=5348, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 19:09:27 (5740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7916, selfPID=6684, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8112, selfPID=6304, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7820, selfPID=6400, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7752, selfPID=5644, iMonCtr=1 Model crash detected, will try to restart... 14:23:10 (5996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4936, selfPID=1880, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=724, selfPID=4624, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 06:01:40 (7088): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6688, selfPID=6688, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6736, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4244, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 04:53:25 (5516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7768, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7992, selfPID=7020, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6220, selfPID=1400, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4436, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... 18:58:02 (6200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8140, iMonCtr=2 Model crash detected, will try to restart... 05:53:23 (6640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1420, selfPID=6376, iMonCtr=1 Model crash detected, will try to restart... 18:08:34 (4724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1580, selfPID=1580, iMonCtr=2 06:18:44 (6104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6160, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7656, iMonCtr=2 Model crash detected, will try to restart... </stderr_txt> ]]> |
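
The exit status row above gives the same value in two forms: the signed integer -226 and the unsigned 32-bit hexadecimal 0xFFFFFF1E. The two are related by two's-complement wrapping, which the minimal sketch below illustrates. The function names are hypothetical and are not part of BOINC or the CPDN application.

```python
def to_unsigned32_hex(code: int) -> str:
    """Format a signed 32-bit exit code as unsigned hex (illustrative helper)."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return f"0x{code & 0xFFFFFFFF:08X}"


def from_unsigned32(value: int) -> int:
    """Recover the signed exit code from its unsigned 32-bit form."""
    # If the top bit is set, the value represents a negative number.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


if __name__ == "__main__":
    print(to_unsigned32_hex(-226))
    print(from_unsigned32(0xFFFFFF1E))
```

Going in either direction reproduces the pair shown in the exit status row.
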

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
22 Jun 2012 04:20:23 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 115,296 | 427,702 | 3.7096 |
22 Jun 2012 04:20:22 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 103,776 | 383,791 | 3.6983 |
22 Jun 2012 04:20:22 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 92,256 | 334,515 | 3.6259 |
22 Jun 2012 04:20:22 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 80,736 | 289,884 | 3.5905 |
12 Jun 2012 03:05:50 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 69,216 | 244,758 | 3.5361 |
12 Jun 2012 03:05:50 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 57,696 | 201,335 | 3.4896 |
07 Jun 2012 04:09:09 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 46,176 | 157,444 | 3.4097 |
07 Jun 2012 04:09:09 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 34,656 | 117,967 | 3.4039 |
04 Jun 2012 04:18:51 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 23,136 | 73,601 | 3.1812 |
04 Jun 2012 04:18:51 | 1202128 | 14734863 | hadam3p_eu_cnh1_2004_1_007988566_1 | 11,616 | 36,223 | 3.1184 |
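
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The rows above are consistent with that reading. The sketch below checks this on three rows copied from the table. The function name is hypothetical, and the formula is inferred from the figures rather than taken from CPDN documentation.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table
rows = [
    (115_296, 427_702, 3.7096),
    (69_216, 244_758, 3.5361),
    (11_616, 36_223, 3.1184),
]

if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in rows:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        print(timestep, computed, reported)
```

Because the CPU time is cumulative, the rising average suggests that later timesteps took longer per step than earlier ones on this host.
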