climateprediction.net home page
Task 16098686

Task 16098686

Name hadcm3n_o6p0_2020_40_008372906_4
Workunit 8523765
Created 27 Nov 2013, 15:36:19 UTC
Sent 27 Nov 2013, 15:36:25 UTC
Report deadline 26 Feb 2014, 23:03:36 UTC
Received 25 Mar 2014, 11:33:03 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1225270
Run time 24 days 7 hours 10 min 55 sec
CPU time 12 days 12 hours 42 min 49 sec
Validate state Invalid
Credit 6,220.80
Device peak FLOPS 2.30 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
09:05:03 (4376): No heartbeat from core client for 30 sec - exiting
09:05:04 (4376): No heartbeat from core client for 30 sec - exiting
09:05:05 (4376): No heartbeat from core client for 30 sec - exiting
09:05:06 (4376): No heartbeat from core client for 30 sec - exiting
09:05:07 (4376): No heartbeat from core client for 30 sec - exiting
09:05:08 (4376): No heartbeat from core client for 30 sec - exiting
09:05:09 (4376): No heartbeat from core client for 30 sec - exiting
09:05:10 (4376): No heartbeat from core client for 30 sec - exiting
09:05:11 (4376): No heartbeat from core client for 30 sec - exiting
09:05:12 (4376): No heartbeat from core client for 30 sec - exiting
09:05:13 (4376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5176, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3840, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5032, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4100, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4840, iMonCtr=1
Model crash detected, will try to restart...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4564, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4948, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4608, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3812, iMonCtr=1
Model crash detected, will try to restart...
09:09:45 (5040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:15:43 (5068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4888, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5600, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4100, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6108, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5600, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4632, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5992, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPIDCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4728, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4628, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4472, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4388, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal =CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4564, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6920, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2504, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5772, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
25 Mar 2014 10:26:25 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 518,400 1,082,562 2.0883
19 Mar 2014 12:42:15 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 492,480 1,030,972 2.0934
16 Mar 2014 16:09:19 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 466,560 978,288 2.0968
11 Mar 2014 12:25:57 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 440,640 924,335 2.0977
05 Mar 2014 14:59:38 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 414,720 871,935 2.1025
20 Feb 2014 11:22:00 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 388,800 819,775 2.1085
16 Feb 2014 10:08:49 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 362,880 765,004 2.1081
11 Feb 2014 13:36:51 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 336,960 707,969 2.1010
06 Feb 2014 15:34:44 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 311,040 653,488 2.1010
31 Jan 2014 12:30:06 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 285,120 603,271 2.1158
24 Jan 2014 14:03:37 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 259,200 551,476 2.1276
20 Jan 2014 17:36:06 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 233,280 498,340 2.1362
15 Jan 2014 11:59:51 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 207,360 444,666 2.1444
09 Jan 2014 12:28:05 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 181,440 390,582 2.1527
04 Jan 2014 14:09:01 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 155,520 334,530 2.1510
28 Dec 2013 11:06:07 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 129,600 281,498 2.1721
20 Dec 2013 14:20:49 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 103,680 225,796 2.1778
13 Dec 2013 12:49:22 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 77,760 168,069 2.1614
09 Dec 2013 11:50:18 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 51,840 111,935 2.1592
03 Dec 2013 12:34:06 1225270 16098686 hadcm3n_o6p0_2020_40_008372906_4 25,920 55,094 2.1255


©2024 cpdn.org