Task 14117623

Name hadam3p_eu_9i2p_1971_1_007759393_0
Workunit 7914502
Created 20 Feb 2012, 16:53:48 UTC
Sent 15 Mar 2012, 17:08:21 UTC
Report deadline 25 Feb 2013, 22:28:21 UTC
Received 20 May 2012, 16:51:20 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1098998
Run time 2 days 23 hours 5 min 2 sec
CPU time 2 days 21 hours 58 min 32 sec
Validate state Invalid
Credit 1,591.48
Device peak FLOPS 2.65 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5980, selfPID=6068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5064, selfPID=3392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4448, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
02:47:08 (3256): No heartbeat from core client for 30 sec - exiting
02:47:10 (3256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:47:11 (3256): No heartbeat from core client for 30 sec - exiting
02:47:12 (3256): No heartbeat from core client for 30 sec - exiting
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2456, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1508, selfPID=4448, iMonCtr=1
Model crash detected, will try to restart...
15:19:13 (4824): No heartbeat from core client for 30 sec - exiting
15:19:15 (4824): No heartbeat from core client for 30 sec - exiting
15:19:16 (4824): No heartbeat from core client for 30 sec - exiting
15:19:17 (4824): No heartbeat from core client for 30 sec - exiting
15:19:18 (4824): No heartbeat from core client for 30 sec - exiting
15:19:19 (4824): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
23:25:17 (3832): No heartbeat from core client for 30 sec - exiting
23:25:18 (3832): No heartbeat from core client for 30 sec - exiting
23:25:19 (3832): No heartbeat from core client for 30 sec - exiting
23:25:20 (3832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4768, selfPID=3360, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5716, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3952, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4764, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1604, selfPID=4196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4408, selfPID=2176, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4296, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5016, iMonCtr=2
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4832, selfPID=4328, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5016, selfPID=3704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5100, selfPID=3848, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6092, selfPID=7068, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5012, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5020, selfPID=3880, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
01:21:14 (3880): No heartbeat from core client for 30 sec - exiting
01:21:15 (3880): No heartbeat from core client for 30 sec - exiting
01:21:16 (3880): No heartbeat from core client for 30 sec - exiting
01:21:17 (3880): No heartbeat from core client for 30 sec - exiting
01:21:18 (3880): No heartbeat from core client for 30 sec - exiting
01:21:19 (3880): No heartbeat from core client for 30 sec - exiting
01:21:21 (3880): No heartbeat from core client for 30 sec - exiting
01:21:22 (3880): No heartbeat from core client for 30 sec - exiting
01:21:23 (3880): No heartbeat from core client for 30 sec - exiting
01:21:24 (3880): No heartbeat from core client for 30 sec - exiting
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_9i2p_1971_1_007759393\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  012AC52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01254460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  0125362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01232469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  011366EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  011D2AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  011D35AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00F79860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01290893  Unknown               Unknown  Unknown
kernel32.dll       7590339A  Unknown               Unknown  Unknown
ntdll.dll          779A9EF2  Unknown               Unknown  Unknown
ntdll.dll          779A9EC5  Unknown               Unknown  Unknown
rrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_9i2p_1971_1_007759393\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  00E4A39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00DF2CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00DF1E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00DD2819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00CD2287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D6E7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00D6F2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00AE9BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00E2E638  Unknown               Unknown  Unknown
kernel32.dll       7590339A  Unknown               Unknown  Unknown
ntdll.dll          779A9EF2  Unknown               Unknown  Unknown
ntdll.dll          779A9EC5  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4692, selfPID=3772, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_9i2p_1971_1_007759393_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_9i2p_1971_1_007759393_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_9i2p_1971_1_007759393_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_9i2p_1971_1_007759393_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
18 May 2012 17:37:10  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  92,256    234,647         2.5434
15 May 2012 16:07:00  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  80,736    206,668         2.5598
13 May 2012 14:55:25  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  69,216    178,258         2.5754
12 May 2012 14:28:24  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  57,696    149,133         2.5848
09 May 2012 17:22:22  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  46,176    119,934         2.5973
08 May 2012 07:52:15  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  34,656    90,150          2.6013
05 May 2012 15:06:56  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  23,136    59,699          2.5804
03 May 2012 12:34:36  1098998  14117623   hadam3p_eu_9i2p_1971_1_007759393_0  11,616    29,539          2.5430
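
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached. The page itself does not state this, so treat it as an assumption. The minimal Python sketch below checks that reading against the rows above; the names `trickles` and `seconds_per_timestep` are illustrative and not part of any BOINC or CPDN tool.

```python
# (timestep, cumulative CPU seconds) pairs copied from the trickle rows above.
trickles = [
    (92256, 234647),
    (80736, 206668),
    (69216, 178258),
    (57696, 149133),
    (46176, 119934),
    (34656, 90150),
    (23136, 59699),
    (11616, 29539),
]


def seconds_per_timestep(timestep, cpu_seconds):
    """Assumed definition: cumulative CPU seconds per model timestep, 4 decimals."""
    return round(cpu_seconds / timestep, 4)


averages = [seconds_per_timestep(ts, cpu) for ts, cpu in trickles]

# Print the recomputed column next to its inputs for comparison with the page.
for (ts, cpu), avg in zip(trickles, averages):
    print(f"{ts:>7,}  {cpu:>8,}  {avg:.4f}")
```

Under that assumption, the recomputed values can be compared row by row with the Average column reported above.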


©2024 climateprediction.net