Task 12987962

Name hadam3p_saf_0yu4_1987_1_006887924_1
Workunit 7091240
Created 20 Jun 2011, 15:35:01 UTC
Sent 20 Jun 2011, 15:35:06 UTC
Report deadline 1 Jun 2012, 20:55:06 UTC
Received 23 Jul 2011, 6:28:18 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1107482
Run time 1 day 19 hours 14 min 49 sec
CPU time 1 day 18 hours 20 min 18 sec
Validate state Invalid
Credit 749.07
Device peak FLOPS 2.92 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6668, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6592, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3992, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3640, selfPID=3228, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4552, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4700, selfPID=4860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4140, selfPID=3468, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5532, selfPID=3220, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4464, selfPID=1544, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1168, selfPID=5732, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5044, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4528, selfPID=3512, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4448, selfPID=4448, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7164, selfPID=5872, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=6772, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4148, selfPID=4564, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4748, selfPID=3392, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6540, selfPID=9168, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2956, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1260, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5972, selfPID=1444, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1220, selfPID=5680, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6592, selfPID=4944, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5832, selfPID=5484, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5740, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5432, selfPID=6896, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
GClntroller:: CPDal prrkees: i C noP DN process is ngnot runnl g, exitingk PReDVa0 = selfPIeD=8ID=0, seltPID=4
752, l conCtr=2
tected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5736, selfPID=3576, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4992, selfPID=4076, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4140, selfPID=6564, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6536, selfPID=5092, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5432, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:47:28 (1236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6532, selfPID=7664, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3508, selfPID=2192, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4368, selfPID=3820, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4728, selfPID=4120, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3844, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7728, selfPID=7024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6576, selfPID=3560, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7500, selfPID=4008, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5620, selfPID=4128, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4476, selfPID=6344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7240, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4224, selfPID=5144, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3744, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3340, selfPID=5164, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6920, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5560, selfPID=4716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3960, selfPID=4864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6212, selfPID=4800, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5048, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_saf_0yu4_1987_1_006887924\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_saf_um_6.  00D5C52A  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00D04460  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00D0362A  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00CE2469  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00BE66EB  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00C82AE2  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00C835AF  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00A29860  Unknown               Unknown  Unknown
hadrm3p_saf_um_6.  00D40893  Unknown               Unknown  Unknown
kernel32.dll       7618339A  Unknown               Unknown  Unknown
ntdll.dll          76EC9ED2  Unknown               Unknown  Unknown
ntdll.dll          76EC9EA5  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_saf_0yu4_1987_1_006887924\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_saf_um_6.  00E1A39A  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00DC2CD0  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00DC1E9A  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00DA2819  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00CA2287  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00D3E7B2  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00D3F2DA  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00AB9BD2  Unknown               Unknown  Unknown
hadam3p_saf_um_6.  00DFE638  Unknown               Unknown  Unknown
kernel32.dll       7618339A  Unknown               Unknown  Unknown
ntdll.dll          76EC9ED2  Unknown               Unknown  Unknown
ntdll.dll          76EC9EA5  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5968, selfPID=4348, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_0yu4_1987_1_006887924_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
25 Jul 2011 18:16:04  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  46,176    128,057         2.7732
25 Jul 2011 14:18:22  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  34,660    97,385          2.8097
10 Jul 2011 22:19:44  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  34,656    96,881          2.7955
01 Jul 2011 07:33:13  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  23,162    65,116          2.8113
01 Jul 2011 06:32:40  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  23,146    64,528          2.7879
30 Jun 2011 17:37:10  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  23,136    64,175          2.7738
24 Jun 2011 13:13:44  1107482  12987962   hadam3p_saf_0yu4_1987_1_006887924_1  11,616    32,119          2.7651
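
The Average (sec/TS) figures in the trickle rows appear to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The page does not state this, so the sketch below is a consistency check under that assumption rather than the project's own calculation. The function name `average_sec_per_timestep` and the `TRICKLES` list are illustrative and do not come from the page.

```python
# Trickle rows copied from the table: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (46176, 128057, 2.7732),
    (34660, 97385, 2.8097),
    (34656, 96881, 2.7955),
    (23162, 65116, 2.8113),
    (23146, 64528, 2.7879),
    (23136, 64175, 2.7738),
    (11616, 32119, 2.7651),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


def mismatches(rows, tolerance=5e-5):
    """Return the rows whose reported average differs from CPU time / timestep."""
    return [
        row for row in rows
        if abs(average_sec_per_timestep(row[1], row[0]) - row[2]) > tolerance
    ]


if __name__ == "__main__":
    bad = mismatches(TRICKLES)
    print("rows inconsistent with the assumed formula:", len(bad))
```

Under this assumption every listed row agrees with the reported average to within rounding, which suggests the per-timestep cost stayed near 2.8 seconds across the run.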