Task 18694503

Name hadam3p_pnw_powg_2013_1_009979911_1
Workunit 9986269
Created 11 Jul 2015, 6:43:24 UTC
Sent 12 Jul 2015, 8:54:42 UTC
Report deadline 23 Jun 2016, 14:14:42 UTC
Received 30 Aug 2015, 2:15:20 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1268401
Run time 1 days 20 hours 58 min 54 sec
CPU time 1 days 20 hours 31 min 48 sec
Validate state Invalid
Credit 2,759.97
Device peak FLOPS 3.56 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v7.27 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8324, selfPID=6556, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8324, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8712, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9012, selfPID=7056, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6584, selfPID=6568, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9024, selfPID=8744, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5620, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7200, selfPID=10016, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8780, selfPID=3324, iMonCtr=1
Model crash detected, will try to restart...
11:34:16 (7316): No heartbeat from client for 30 sec - exiting
11:34:16 (7316): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:19:47 (3000): No heartbeat from client for 30 sec - exiting
11:19:47 (3000): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9020, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8336, selfPID=6736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9948, selfPID=4196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6896, selfPID=9140, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7384, selfPID=6928, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8088, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8460, selfPID=6304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7552, selfPID=5876, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6320, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7084, selfPID=5672, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8328, selfPID=6628, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8028, selfPID=7012, iMonCtr=1
Model crash detected, will try to restart...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7300, selfPID=7300, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7300, selfPID=6120, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
08:38:08 (6120): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_13.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_14.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_15.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_16.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_17.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_18.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
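
The stderr output above repeats two patterns: "Model crash detected, will try to restart..." lines and `<file_xfer_error>` entries carrying an `<error_code>`. A minimal sketch for tallying both from a saved copy of the log follows. The function name `summarize_stderr` and the short `sample` string are illustrative assumptions, not part of BOINC or the CPDN application; the sample lines are copied from the log above.

```python
import re
from collections import Counter


def summarize_stderr(text: str) -> dict:
    """Count restart attempts and upload error codes in a stderr dump.

    Hypothetical helper for illustration; it only matches the literal
    strings that appear in the log above.
    """
    # Each detected model crash is followed by this exact message.
    restarts = text.count("Model crash detected, will try to restart...")
    # Collect every numeric code reported inside <error_code> tags.
    codes = Counter(re.findall(r"<error_code>(-?\d+)</error_code>", text))
    return {"restarts": restarts, "upload_error_codes": dict(codes)}


# Small excerpt copied from the log above (assumption: real input would
# be the full stderr text saved to a file).
sample = """\
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7300, selfPID=6120, iMonCtr=1
Model crash detected, will try to restart...
<file_xfer_error>
  <file_name>hadam3p_pnw_powg_2013_1_009979911_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
"""

print(summarize_stderr(sample))
# {'restarts': 1, 'upload_error_codes': {'-161': 1}}
```

Run against the full log, the same function would report how many restarts preceded the final READHIST crash and whether all failed uploads share one error code.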
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
29 Aug 2015 12:06:12 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 127,019 152,989 1.2045
23 Aug 2015 03:48:29 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 115,499 139,675 1.2093
22 Aug 2015 11:45:56 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 103,979 126,293 1.2146
22 Aug 2015 02:15:02 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 92,459 112,418 1.2159
21 Aug 2015 04:11:29 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 80,939 98,773 1.2203
12 Aug 2015 08:28:22 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 69,419 84,852 1.2223
12 Aug 2015 04:31:26 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 57,899 70,609 1.2195
16 Jul 2015 11:39:10 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 46,379 56,341 1.2148
15 Jul 2015 00:39:29 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 34,859 42,072 1.2069
14 Jul 2015 00:52:24 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 23,339 28,488 1.2206
13 Jul 2015 03:43:35 1268401 18694503 hadam3p_pnw_powg_2013_1_009979911_1 11,819 14,406 1.2189
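
The Average (sec/TS) column appears to be CPU Time (sec) divided by Timestep. The sketch below checks that assumption against three rows of the table above. The function name `avg_sec_per_timestep` is hypothetical and not part of the CPDN site.

```python
def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimals
    to match the precision shown in the trickle table."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the table above.
rows = [
    (127019, 152989, 1.2045),
    (69419, 84852, 1.2223),
    (11819, 14406, 1.2189),
]

for timestep, cpu_time_sec, reported in rows:
    computed = avg_sec_per_timestep(cpu_time_sec, timestep)
    print(timestep, computed, reported)
```

For these rows the computed value agrees with the reported average, which supports reading the column as a cumulative CPU-time average rather than a per-interval rate.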