Task 13282084

Name hadam3p_eu_vcfd_1985_1_006722000_2
Workunit 6925250
Created 21 Aug 2011, 6:17:24 UTC
Sent 21 Aug 2011, 6:17:43 UTC
Report deadline 2 Aug 2012, 11:37:43 UTC
Received 12 Sep 2011, 20:34:51 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 937793
Run time 4 days 23 hours 24 min 14 sec
CPU time 1 days 22 hours 46 min 16 sec
Validate state Invalid
Credit 1,194.02
Device peak FLOPS 2.61 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>6.10.29</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:49:59 (4868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2140, selfPID=2140, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6912, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6584, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=812, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1884, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=2
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4176, selfPID=4428, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7660, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7768, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2500, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
12:52:18 (5640): No heartbeat from core client for 30 sec - exiting
12:52:19 (5640): No heartbeat from core client for 30 sec - exiting
12:52:20 (5640): No heartbeat from core client for 30 sec - exiting
12:52:21 (5640): No heartbeat from core client for 30 sec - exiting
12:52:22 (5640): No heartbeat from core client for 30 sec - exiting
12:52:23 (5640): No heartbeat from core client for 30 sec - exiting
12:52:24 (5640): No heartbeat from core client for 30 sec - exiting
12:52:25 (5640): No heartbeat from core client for 30 sec - exiting
12:52:26 (5640): No heartbeat from core client for 30 sec - exiting
12:52:28 (5640): No heartbeat from core client for 30 sec - exiting
12:52:29 (5640): No heartbeat from core client for 30 sec - exiting
12:52:30 (5640): No heartbeat from core client for 30 sec - exiting
12:52:31 (5640): No heartbeat from core client for 30 sec - exiting
12:52:32 (5640): No heartbeat from core client for 30 sec - exiting
12:52:33 (5640): No heartbeat from core client for 30 sec - exiting
12:52:34 (5640): No heartbeat from core client for 30 sec - exiting
12:52:35 (5640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2484, selfPID=4972, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2756, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
07:35:12 (4868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=708, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5840, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
09:20:44 (4664): No heartbeat from core client for 30 sec - exiting
09:20:45 (4664): No heartbeat from core client for 30 sec - exiting
09:20:46 (4664): No heartbeat from core client for 30 sec - exiting
09:20:47 (4664): No heartbeat from core client for 30 sec - exiting
09:20:48 (4664): No heartbeat from core client for 30 sec - exiting
09:20:49 (4664): No heartbeat from core client for 30 sec - exiting
09:20:50 (4664): No heartbeat from core client for 30 sec - exiting
09:20:51 (4664): No heartbeat from core client for 30 sec - exiting
09:20:52 (4664): No heartbeat from core client for 30 sec - exiting
09:20:53 (4664): No heartbeat from core client for 30 sec - exiting
09:20:54 (4664): No heartbeat from core client for 30 sec - exiting
09:20:55 (4664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=200, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4548, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4416, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:19:47 (4648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:19:48 (4648): No heartbeat from core client for 30 sec - exiting
10:19:49 (4648): No heartbeat from core client for 30 sec - exiting
10:19:50 (4648): No heartbeat from core client for 30 sec - exiting
10:19:51 (4648): No heartbeat from core client for 30 sec - exiting
10:19:52 (4648): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2536, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4972, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3568, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3816, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5264, selfPID=4884, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6076, selfPID=5320, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
10:26:09 (5076): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3588, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3008, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4144, selfPID=5300, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1780, selfPID=4556, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=716, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3936, selfPID=5360, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4568, iMonCtr=2
Model crash detected, will try to restart...
23:32:39 (4916): No heartbeat from core client for 30 sec - exiting
23:32:40 (4916): No heartbeat from core client for 30 sec - exiting
23:32:41 (4916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3328, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
08:16:50 (4668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:13:01 (5632): No heartbeat from core client for 30 sec - exiting
10:13:02 (5632): No heartbeat from core client for 30 sec - exiting
10:13:03 (5632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=2
14:59:46 (5832): No heartbeat from core client for 30 sec - exiting
14:59:48 (5832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4352, selfPID=5372, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4448, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=5180, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_vcfd_1985_1_006722000/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_vcfd_1985_1_006722000/dataout/region_restart.day after 11 attempts

Model crashed: READHI
TM Ednld of file in ERDHADS from history file for namm hst NLIHISTO                                                                                                                                                                                                                    tmp/xaakg.pipe_dummy                                                           
  2048    
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_vcfd_1985_1_006722000_2_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
11 Sep 2011 23:37:52  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  69,216    168,761         2.4382
30 Aug 2011 05:40:25  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  57,696    141,439         2.4515
28 Aug 2011 07:45:29  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  46,176    112,494         2.4362
07 Sep 2011 08:35:46  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  34,661    85,050          2.4538
27 Aug 2011 12:32:37  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  34,656    83,985          2.4234
24 Aug 2011 13:51:39  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  23,136    56,455          2.4401
23 Aug 2011 08:04:46  937793   13282084   hadam3p_eu_vcfd_1985_1_006722000_2  11,616    28,486          2.4523
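
The Average (sec/TS) column appears to be the CPU time divided by the timestep count, rounded to four decimal places. That reading is an assumption based on the column headers, not something the page states. A minimal Python sketch that checks it against the trickle rows above (the names `TRICKLES` and `average_sec_per_timestep` are illustrative only):

```python
# Trickle rows copied from the table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
TRICKLES = [
    (69216, 168761, 2.4382),
    (57696, 141439, 2.4515),
    (46176, 112494, 2.4362),
    (34661, 85050, 2.4538),
    (34656, 83985, 2.4234),
    (23136, 56455, 2.4401),
    (11616, 28486, 2.4523),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    status = "matches" if computed == reported else "differs"
    print(f"timestep {timestep}: computed {computed}, reported {reported} ({status})")
```

Under that assumption, every row reproduces the reported average, which suggests a steady rate of roughly 2.4 CPU seconds per timestep across the run.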

