Name | hadam3p_anz_d6cq_2013_1_009724532_0 |
Workunit | 9797829 |
Created | 8 Apr 2015, 18:04:25 UTC |
Sent | 9 Apr 2015, 16:26:24 UTC |
Report deadline | 21 Mar 2016, 21:46:24 UTC |
Received | 5 Aug 2015, 19:39:59 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT |
Computer ID | 980226 |
Run time | 12 days 0 hours 56 min 7 sec |
CPU time | 6 days 6 hours 48 min 2 sec |
Validate state | Invalid |
Credit | 2,000.18 |
Device peak FLOPS | 0.68 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> Got ack for job that's till active </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2260, iMonCtr=2 Model crash detected, will try to restart... 16:24:29 (4652): No heartbeat from core client for 30 sec - exiting 16:24:30 (4652): No heartbeat from core client for 30 sec - exiting 16:24:31 (4652): No heartbeat from core client for 30 sec - exiting 16:24:32 (4652): No heartbeat from core client for 30 sec - exiting 16:24:33 (4652): No heartbeat from core client for 30 sec - exiting 16:24:34 (4652): No heartbeat from core client for 30 sec - exiting 16:24:35 (4652): No heartbeat from core client for 30 sec - exiting 16:24:36 (4652): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:29:48 (5712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:29:49 (5712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1028, selfPID=1028, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8040, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1524, selfPID=1524, iMonCtr=2 GCoontroller:: CPDN process is not running, exiting, bRetVbal Workcr:: CPDN proc selfPID=1144, iMonCtr=2 ing,el crash detected, will try to restart... bRetVal = 1, checkPID=0, selfPID=156, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4008, selfPID=5220, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5640, selfPID=5640, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5264, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5536, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6112, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4124, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3824, selfPID=1148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5624, selfPID=4644, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5360, selfPID=6020, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3236, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5016, selfPID=5816, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Glontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3304, iMonCtr=2 Model crash detected, will try to restart... obal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3948, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=172, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7472, selfPID=7472, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3212, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5612, selfPID=5612, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3864, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Aug 2015 16:17:33 | 980226 | 18279363 | hadam3p_anz_d6cq_2013_1_009724532_0 | 46,379 | 526,211 | 11.3459 |
03 Jul 2015 08:35:12 | 980226 | 18279363 | hadam3p_anz_d6cq_2013_1_009724532_0 | 34,859 | 390,357 | 11.1982 |
20 Jun 2015 20:45:49 | 980226 | 18279363 | hadam3p_anz_d6cq_2013_1_009724532_0 | 23,339 | 255,567 | 10.9502 |
30 May 2015 22:02:15 | 980226 | 18279363 | hadam3p_anz_d6cq_2013_1_009724532_0 | 11,819 | 127,254 | 10.7669 |
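
The Average (sec/TS) column looks like cumulative CPU time divided by the timestep count at each trickle. The sketch below checks that reading against the four rows above. It assumes this interpretation of the column; the function name `average_sec_per_timestep` is hypothetical and not part of any CPDN or BOINC tool.

```python
# Trickle rows copied from the table above, newest first:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
trickles = [
    (46_379, 526_211, 11.3459),
    (34_859, 390_357, 11.1982),
    (23_339, 255_567, 10.9502),
    (11_819, 127_254, 10.7669),
]


def average_sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition: cumulative CPU seconds per completed timestep."""
    return cpu_seconds / timestep


# Recompute the average for each row, rounded to the table's four decimals.
averages = [
    round(average_sec_per_timestep(ts, cpu), 4) for ts, cpu, _ in trickles
]

# Timesteps completed between consecutive trickles (newer minus older).
steps_between = [
    newer[0] - older[0] for newer, older in zip(trickles, trickles[1:])
]

for (ts, cpu, reported), computed in zip(trickles, averages):
    print(f"timestep {ts}: computed {computed:.4f}, reported {reported:.4f}")
print("timesteps between trickles:", steps_between)
```

Under that assumption the recomputed values agree with the reported column to four decimal places, and consecutive trickles are 11,520 timesteps apart.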
©2024 cpdn.org