climateprediction.net home page
Task 15737089

Name hadam3p_pnw_q7nq_2031_1_008353327_0
Workunit 8504186
Created 19 Apr 2013, 16:24:57 UTC
Sent 19 Apr 2013, 16:28:51 UTC
Report deadline 1 Apr 2014, 21:48:51 UTC
Received 1 Jul 2013, 12:10:37 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1097161
Run time 6 days 6 hours 13 min 42 sec
CPU time 5 days 17 hours 29 min 8 sec
Validate state Invalid
Credit 2,505.24
Device peak FLOPS 1.65 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:19:31 (4708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5836, selfPID=2240, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5700, selfPID=4080, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5464, selfPID=3684, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5372, selfPID=4948, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5332, selfPID=5108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5440, selfPID=3556, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3156, selfPID=4628, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5320, selfPID=3624, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2036, selfPID=4200, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4320, selfPID=4960, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6184, selfPID=2276, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6008, selfPID=5128, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3220, selfPID=5896, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=168, selfPID=3240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3872, selfPID=4820, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4052, selfPID=5224, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6068, selfPID=4060, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6932, selfPID=5784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6064, selfPID=5512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5264, selfPID=6636, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3612, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1124, selfPID=5276, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5116, selfPID=5248, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5308, selfPID=3944, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1940, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=780, selfPID=3648, iMonCtr=1
Model crash detected, will try to restart...
CGCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4124, selfPID=5296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 8
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4540, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6244, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3236, selfPID=4000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4944, selfPID=5272, iMonCtr=1
Model crash detected, will try to restart...
10:25:22 (5172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2172, selfPID=4320, iMonCtr=1
Model crash detected, will try to restart...

Model crashed: 
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7nq_2031_1_008353327_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_q7nq_2031_1_008353327_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
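The stderr output above is repetitive, so a tally by message type can be easier to read than the raw log. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the function name `summarize_stderr`, the category labels, and the regular expressions are assumptions chosen for illustration, and the sample holds only a few lines copied from the log.

```python
import re
from collections import Counter

# Hypothetical category labels mapped to substrings seen in the stderr text.
PATTERNS = {
    "suspend_requests": re.compile(r"Suspend request from BOINC"),
    "lost_heartbeats": re.compile(r"No heartbeat from core client"),
    "process_not_running": re.compile(r"CPDN process is not running"),
    "restart_attempts": re.compile(r"Model crash detected, will try to restart"),
}
# Captures (files required, files found) from the yearly-means message.
YEARLY_MEANS = re.compile(
    r"Regional yearly means requires (\d+) input files got (\d+)"
)


def summarize_stderr(lines):
    """Count message types and collect (required, found) file counts."""
    counts = Counter()
    yearly_means = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
        match = YEARLY_MEANS.search(line)
        if match:
            yearly_means.append((int(match.group(1)), int(match.group(2))))
    return dict(counts), yearly_means


# A short excerpt of the log, not the full stderr.
SAMPLE = [
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "14:19:31 (4708): No heartbeat from core client for 30 sec - exiting",
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=5836, selfPID=2240, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "Regional yearly means requires 12 input files got 10",
]

counts, yearly_means = summarize_stderr(SAMPLE)
print(counts)
print(yearly_means)
```

Running the same function over the complete stderr text, split into lines, would give the totals for this task.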
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
02 Jul 2013 10:55:14  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  115,296   485,201         4.2083
27 Jun 2013 17:43:39  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  103,776   436,573         4.2069
22 Jun 2013 20:09:38  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  92,256    387,482         4.2001
18 Jun 2013 20:03:22  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  80,736    340,911         4.2225
05 Jun 2013 19:32:48  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  69,216    293,408         4.2390
25 May 2013 15:54:32  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  57,696    246,449         4.2715
18 May 2013 16:37:03  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  46,176    197,171         4.2700
05 May 2013 19:58:23  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  34,656    149,411         4.3113
03 May 2013 07:13:11  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  23,136    99,606          4.3052
28 Apr 2013 11:52:06  1097161  15737089   hadam3p_pnw_q7nq_2031_1_008353327_0  11,616    49,248          4.2397
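
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows of the table; the helper names `parse_count` and `average_sec_per_timestep` are hypothetical and are not taken from the CPDN site.

```python
def parse_count(text: str) -> int:
    """Convert a value such as '115,296' to an integer."""
    return int(text.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: int, timesteps: int) -> float:
    """Return CPU seconds per model timestep (assumed meaning of sec/TS)."""
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    return cpu_time_sec / timesteps


# (Timestep, CPU Time (sec), Average (sec/TS)) copied from the trickle table.
ROWS = [
    ("115,296", "485,201", 4.2083),
    ("103,776", "436,573", 4.2069),
    ("11,616", "49,248", 4.2397),
]

for timestep_text, cpu_text, reported in ROWS:
    computed = average_sec_per_timestep(
        parse_count(cpu_text), parse_count(timestep_text)
    )
    # Each computed value should agree with the reported one to 4 decimals.
    print(f"{timestep_text:>8} steps: computed {computed:.4f}, reported {reported}")
```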

