Task 13180153

Name hadam3p_eu_2kgx_1991_1_007380896_0
Workunit 7578326
Created 31 Jul 2011, 15:48:01 UTC
Sent 3 Aug 2011, 21:09:42 UTC
Report deadline 16 Jul 2012, 2:29:42 UTC
Received 17 Sep 2011, 7:48:38 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 725427
Run time 6 days 19 hours 8 min 18 sec
CPU time 4 days 11 hours 2 min 46 sec
Validate state Invalid
Credit 1,392.75
Device peak FLOPS 2.06 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6332, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:12:07 (5396): No heartbeat from core client for 30 sec - exiting
10:12:08 (5396): No heartbeat from core client for 30 sec - exiting
10:12:09 (5396): No heartbeat from core client for 30 sec - exiting
10:12:10 (5396): No heartbeat from core client for 30 sec - exiting
10:12:11 (5396): No heartbeat from core client for 30 sec - exiting
10:12:12 (5396): No heartbeat from core client for 30 sec - exiting
10:12:13 (5396): No heartbeat from core client for 30 sec - exiting
10:12:14 (5396): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1368, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9108, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7052, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7432, selfPID=8232, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4648, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=2
Model crash detected, will try to restart...
CGntroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7004, iMonCtr=2
Model crash detected, will try to restart...
lobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10152, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5472, selfPID=5940, iMonCtr=1
Model crash detected, will try to restart...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9600, iMonCtr=2
Model crash detected, will try to restart...
lobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7816, iMonCtr=2
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3224, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2448, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5424, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8836, selfPID=10128, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7232, selfPID=8424, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7660, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7572, iMonCtr=2
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8876, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6232, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6692, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9268, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=156, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
08:58:46 (4688): No heartbeat from core client for 30 sec - exiting
08:58:47 (4688): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6696, iMonCtr=2
Model crash detected, will try to restart...
08:01:15 (1468): No heartbeat from core client for 30 sec - exiting
08:01:16 (1468): No heartbeat from core client for 30 sec - exiting
08:01:17 (1468): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3652, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3996, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=816, selfPID=3988, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7080, selfPID=7704, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7004, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10088, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9688, selfPID=9628, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6020, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6528, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2804, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5760, iMonCtr=2
Model crash detected, will try to restart...
23:21:11 (4612): No heartbeat from core client for 30 sec - exiting
23:21:12 (4612): No heartbeat from core client for 30 sec - exiting
23:21:13 (4612): No heartbeat from core client for 30 sec - exiting
23:21:14 (4612): No heartbeat from core client for 30 sec - exiting
23:21:15 (4612): No heartbeat from core client for 30 sec - exiting
23:21:16 (4612): No heartbeat from core client for 30 sec - exiting
23:21:17 (4612): No heartbeat from core client for 30 sec - exiting
23:21:18 (4612): No heartbeat from core client for 30 sec - exiting
23:21:19 (4612): No heartbeat from core client for 30 sec - exiting
23:21:20 (4612): No heartbeat from core client for 30 sec - exiting
23:21:21 (4612): No heartbeat from core client for 30 sec - exiting
23:21:22 (4612): No heartbeat from core client for 30 sec - exiting
23:21:23 (4612): No heartbeat from core client for 30 sec - exiting
23:21:24 (4612): No heartbeat from core client for 30 sec - exiting
23:21:25 (4612): No heartbeat from core client for 30 sec - exiting
23:21:27 (4612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7756, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6636, iMonCtr=2
08:22:35 (5996): No heartbeat from core client for 30 sec - exiting
08:22:38 (5996): No heartbeat from core client for 30 sec - exiting
08:22:40 (5996): No heartbeat from core client for 30 sec - exiting
08:22:41 (5996): No heartbeat from core client for 30 sec - exiting
08:22:42 (5996): No heartbeat from core client for 30 sec - exiting
08:22:43 (5996): No heartbeat from core client for 30 sec - exiting
08:22:44 (5996): No heartbeat from core client for 30 sec - exiting
08:22:45 (5996): No heartbeat from core client for 30 sec - exiting
08:22:46 (5996): No heartbeat from core client for 30 sec - exiting
08:22:48 (5996): No heartbeat from core client for 30 sec - exiting
08:22:50 (5996): No heartbeat from core client for 30 sec - exiting
08:22:51 (5996): No heartbeat from core client for 30 sec - exiting
08:22:54 (5996): No heartbeat from core client for 30 sec - exiting
08:22:55 (5996): No heartbeat from core client for 30 sec - exiting
08:22:56 (5996): No heartbeat from core client for 30 sec - exiting
08:22:57 (5996): No heartbeat from core client for 30 sec - exiting
08:22:58 (5996): No heartbeat from core client for 30 sec - exiting
08:22:59 (5996): No heartbeat from core client for 30 sec - exiting
08:23:00 (5996): No heartbeat from core client for 30 sec - exiting
08:23:01 (5996): No heartbeat from core client for 30 sec - exiting
08:23:02 (5996): No heartbeat from core client for 30 sec - exiting
08:23:03 (5996): No heartbeat from core client for 30 sec - exiting
08:23:04 (5996): No heartbeat from core client for 30 sec - exiting
08:23:05 (5996): No heartbeat from core client for 30 sec - exiting
08:23:07 (5996): No heartbeat from core client for 30 sec - exiting
08:23:08 (5996): No heartbeat from core client for 30 sec - exiting
08:23:09 (5996): No heartbeat from core client for 30 sec - exiting
08:23:10 (5996): No heartbeat from core client for 30 sec - exiting
08:23:11 (5996): No heartbeat from core client for 30 sec - exiting
08:23:12 (5996): No heartbeat from core client for 30 sec - exiting
08:23:13 (5996): No heartbeat from core client for 30 sec - exiting
08:23:14 (5996): No heartbeat from core client for 30 sec - exiting
08:23:15 (5996): No heartbeat from core client for 30 sec - exiting
08:23:16 (5996): No heartbeat from core client for 30 sec - exiting
08:23:17 (5996): No heartbeat from core client for 30 sec - exiting
08:23:18 (5996): No heartbeat from core client for 30 sec - exiting
08:23:19 (5996): No heartbeat from core client for 30 sec - exiting
08:23:20 (5996): No heartbeat from core client for 30 sec - exiting
08:23:21 (5996): No heartbeat from core client for 30 sec - exiting
08:23:22 (5996): No heartbeat from core client for 30 sec - exiting
08:23:23 (5996): No heartbeat from core client for 30 sec - exiting
08:23:24 (5996): No heartbeat from core client for 30 sec - exiting
08:23:25 (5996): No heartbeat from core client for 30 sec - exiting
08:23:26 (5996): No heartbeat from core client for 30 sec - exiting
08:23:27 (5996): No heartbeat from core client for 30 sec - exiting
08:23:28 (5996): No heartbeat from core client for 30 sec - exiting
08:23:29 (5996): No heartbeat from core client for 30 sec - exiting
08:23:30 (5996): No heartbeat from core client for 30 sec - exiting
08:23:31 (5996): No heartbeat from core client for 30 sec - exiting
08:23:32 (5996): No heartbeat from core client for 30 sec - exiting
08:23:33 (5996): No heartbeat from core client for 30 sec - exiting
08:23:34 (5996): No heartbeat from core client for 30 sec - exiting
08:23:35 (5996): No heartbeat from core client for 30 sec - exiting
08:23:36 (5996): No heartbeat from core client for 30 sec - exiting
08:23:37 (5996): No heartbeat from core client for 30 sec - exiting
08:23:38 (5996): No heartbeat from core client for 30 sec - exiting
08:23:39 (5996): No heartbeat from core client for 30 sec - exiting
08:23:41 (5996): No heartbeat from core client for 30 sec - exiting
08:23:42 (5996): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7512, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2960, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9012, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7272, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6780, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6288, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5112, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7128, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5712, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_2kgx_1991_1_007380896\tmp\xaakm.namelists
Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  00DEA39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D92CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D91E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D72819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00C72287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D0E7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D0F2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00A89BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00DCE638  Unknown               Unknown  Unknown
kernel32.dll       76634B29  Unknown               Unknown  Unknown
ntdll.dll          77DAE1C6  Unknown               Unknown  Unknown
ntdll.dll          77DAE199  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_2kgx_1991_1_007380896\tmp\xaakg.namelists
Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  0064C52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  005F4460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  005F362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  005D2469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  004D66EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00572AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  005735AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00319860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00630893  Unknown               Unknown  Unknown
kernel32.dll       76634B29  Unknown               Unknown  Unknown
ntdll.dll          77DAE1C6  Unknown               Unknown  Unknown
ntdll.dll          77DAE199  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6496, selfPID=7296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
14 Sep 2011 17:14:14  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  80,736    368,204         4.5606
11 Sep 2011 12:01:28  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  69,216    316,581         4.5738
29 Aug 2011 19:54:56  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  57,696    264,731         4.5884
28 Aug 2011 17:22:05  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  46,176    212,307         4.5978
27 Aug 2011 15:01:11  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  34,656    161,159         4.6502
25 Aug 2011 15:03:39  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  23,136    108,722         4.6993
07 Aug 2011 09:48:50  725427   13180153   hadam3p_eu_2kgx_1991_1_007380896_0  11,616    56,202          4.8383
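
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The page does not state this, so treat it as an assumption. A minimal sketch that rechecks the column from the rows above (the helper name sec_per_timestep is hypothetical, not part of BOINC or CPDN):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (80736, 368204, 4.5606),
    (69216, 316581, 4.5738),
    (57696, 264731, 4.5884),
    (46176, 212307, 4.5978),
    (34656, 161159, 4.6502),
    (23136, 108722, 4.6993),
    (11616, 56202, 4.8383),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep, 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Show the recomputed value next to the value reported on the page.
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption the recomputed values can be compared line by line with the reported column. In these rows, consecutive trickles are 11,520 timesteps apart.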