Task 13812763

Name hadam3p_pnw_6xkx_2000_1_007592105_1
Workunit 7770235
Created 23 Dec 2011, 11:54:20 UTC
Sent 23 Dec 2011, 12:03:43 UTC
Report deadline 4 Dec 2012, 17:23:43 UTC
Received 12 Feb 2012, 9:21:52 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 725427
Run time 10 days 14 hours 33 min 3 sec
CPU time 8 days 1 hours 52 min 12 sec
Validate state Invalid
Credit 2,755.56
Device peak FLOPS 2.16 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86
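The run time and CPU time above are given in a days/hours/min/sec form. As a minimal sketch, assuming that exact wording, the following Python converts both to seconds and computes what fraction of the wall-clock run time was CPU time. The helper name `duration_to_seconds` is hypothetical and not part of BOINC or CPDN.

```python
import re

# Seconds per unit, for the unit words as they appear on this page (assumed fixed wording).
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text: str) -> int:
    """Convert a string such as '10 days 14 hours 33 min 3 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Values copied from the "Run time" and "CPU time" fields above.
run_time = duration_to_seconds("10 days 14 hours 33 min 3 sec")
cpu_time = duration_to_seconds("8 days 1 hours 52 min 12 sec")
cpu_fraction = cpu_time / run_time

print(run_time, cpu_time, round(cpu_fraction, 3))
```

For this task the sketch gives 916,383 s of run time against 697,932 s of CPU time, so roughly three quarters of the elapsed time was spent computing.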
Stderr
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
13:05:01 (7096): No heartbeat from core client for 30 sec - exiting
13:05:02 (7096): No heartbeat from core client for 30 sec - exiting
13:05:03 (7096): No heartbeat from core client for 30 sec - exiting
13:05:04 (7096): No heartbeat from core client for 30 sec - exiting
13:05:05 (7096): No heartbeat from core client for 30 sec - exiting
13:05:06 (7096): No heartbeat from core client for 30 sec - exiting
13:05:07 (7096): No heartbeat from core client for 30 sec - exiting
13:05:08 (7096): No heartbeat from core client for 30 sec - exiting
13:05:09 (7096): No heartbeat from core client for 30 sec - exiting
13:05:10 (7096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10164, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9292, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10172, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9400, iMonCtr=2
Leaving CPDN_Main::Monitor...
21:48:33 (3408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1264, selfPID=1264, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6208, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=2
Model crash detected, will try to restart...
08:45:12 (4508): No heartbeat from core client for 30 sec - exiting
08:45:13 (4508): No heartbeat from core client for 30 sec - exiting
08:45:14 (4508): No heartbeat from core client for 30 sec - exiting
08:45:15 (4508): No heartbeat from core client for 30 sec - exiting
08:45:16 (4508): No heartbeat from core client for 30 sec - exiting
08:45:17 (4508): No heartbeat from core client for 30 sec - exiting
08:45:18 (4508): No heartbeat from core client for 30 sec - exiting
08:45:19 (4508): No heartbeat from core client for 30 sec - exiting
08:45:20 (4508): No heartbeat from core client for 30 sec - exiting
08:45:21 (4508): No heartbeat from core client for 30 sec - exiting
08:45:22 (4508): No heartbeat from core client for 30 sec - exiting
08:45:23 (4508): No heartbeat from core client for 30 sec - exiting
08:45:24 (4508): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2564, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3556, selfPID=6852, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8936, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10004, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6132, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9268, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6752, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7236, iMonCtr=2
Leaving CPDN_Main::Monitor...
00:27:31 (6656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5140, selfPID=5140, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6464, selfPID=3468, iMonCtr=1
Model crash detected, will try to restart...
17:22:55 (5668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:22:56 (5668): No heartbeat from core client for 30 sec - exiting
17:22:57 (5668): No heartbeat from core client for 30 sec - exiting
17:22:59 (5668): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3096, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6724, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6600, selfPID=6600, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=944, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6192, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9332, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6604, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4508, selfPID=9560, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5896, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6836, selfPID=5668, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7068, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6740, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7764, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1056, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7140, selfPID=6776, iMonCtr=1
Model crash detected, will try to restart...
Colobal Workerro:llerN prCceDNs is noss runnint , exiting ,ebRetVal   b,RceecVal = 01 s clfhID=6876=0 iMonCfPr=2
3452, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=912, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7920, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
22:58:11 (3784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7648, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8596, selfPID=7432, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8452, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8240, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 8
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8108, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2592, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5356, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3552, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 9
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3288, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1144, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 9
09:05:05 (5756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7064, iMonCtr=2
10:14:46 (2412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
23:17:39 (2988): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6148, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Coltrolll WorCPDN p:: CsP DN not rcessinis not runn,ing, exiting,, bRcePID=l = selfPheckPI4D =0, selfPI
Dodel cr iMondCtr=2ed
, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2412, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 11
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_6xkx_2000_1_007592105/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_6xkx_2000_1_007592105/dataout/region_restart.day after 11 attempts

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_6xkx_2000_1_007592105_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
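The stderr output above repeats a small number of message types: lost heartbeats, restart attempts, and the "Regional yearly means" file count. A rough Python sketch for tallying them is given here, assuming the log has been saved as plain text. The patterns are copied from the lines above; the function name `summarize_stderr` and the short `sample` string are illustrative only.

```python
import re
from collections import Counter

# Message patterns copied from the stderr text above.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
    "yearly_means": re.compile(r"Regional yearly means requires 12 input files got (\d+)"),
}

def summarize_stderr(text):
    """Count each message type and collect the reported input-file counts in order."""
    counts = Counter()
    files_seen = []
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "yearly_means":
                    files_seen.append(int(match.group(1)))
    return counts, files_seen

# A few lines in the same form as the log above, used only to exercise the function.
sample = """\
13:05:01 (7096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Model crash detected, will try to restart...
Regional yearly means requires 12 input files got 0
Model crash detected, will try to restart...
Regional yearly means requires 12 input files got 1
"""
counts, files_seen = summarize_stderr(sample)
print(dict(counts), files_seen)
```

Lines whose text was interleaved by concurrent writers will not match these patterns, so counts taken from the full log should be read as lower bounds.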
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
11 Feb 2012 19:18:21  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  126,816   682,608         5.3827
29 Jan 2012 12:31:51  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  115,296   620,773         5.3842
23 Jan 2012 13:06:17  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  103,776   559,699         5.3933
21 Jan 2012 00:03:25  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  92,266    498,783         5.4059
20 Jan 2012 23:03:06  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  92,256    497,955         5.3975
14 Jan 2012 20:22:23  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  80,736    436,394         5.4052
08 Jan 2012 17:37:04  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  69,216    374,252         5.4070
02 Jan 2012 16:31:36  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  57,696    311,128         5.3925
31 Dec 2011 19:11:58  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  46,176    250,536         5.4257
30 Dec 2011 09:36:20  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  34,661    190,319         5.4909
30 Dec 2011 00:32:30  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  34,656    189,496         5.4679
28 Dec 2011 14:41:12  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  23,136    126,939         5.4866
26 Dec 2011 09:05:12  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  11,618    64,255          5.5306
25 Dec 2011 23:53:23  725427   13812763   hadam3p_pnw_6xkx_2000_1_007592105_1  11,616    63,507          5.4672
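
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers in the table, not stated by the page. A short Python check on three of the rows above, with a hypothetical helper name, could look like this:

```python
# (timestep, cpu_time_sec, reported_average) copied from three rows of the trickle table.
rows = [
    (126816, 682608, 5.3827),
    (115296, 620773, 5.3842),
    (11616, 63507, 5.4672),
]

def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds divided by timesteps completed."""
    return cpu_time_sec / timestep

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value reported by the server.
    print(timestep, f"{computed:.4f}", reported)
```

For these rows the computed values agree with the reported averages to four decimal places.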