Task 17252377

Name hadcm3n_sba7_1940_40_009110606_0
Workunit 9240942
Created 22 Oct 2014, 14:34:37 UTC
Sent 25 Oct 2014, 22:22:04 UTC
Report deadline 25 Jan 2015, 5:49:15 UTC
Received 29 Nov 2014, 12:04:35 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1290798
Run time 21 days 21 hours 15 min 46 sec
CPU time 20 days 8 hours 59 min 4 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.38 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
03:24:15 (8468): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:43:44 (9872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:43:45 (9872): No heartbeat from core client for 30 sec - exiting
19:43:46 (9872): No heartbeat from core client for 30 sec - exiting
19:43:47 (9872): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6076, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6964, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6028, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
00:02:41 (5676): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:36:13 (5696): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9612, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5864, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4104, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4108, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
01:06:51 (4468): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:06:53 (4468): No heartbeat from core client for 30 sec - exiting
01:06:54 (4468): No heartbeat from core client for 30 sec - exiting
01:06:55 (4468): No heartbeat from core client for 30 sec - exiting
01:06:56 (4468): No heartbeat from core client for 30 sec - exiting
01:06:57 (4468): No heartbeat from core client for 30 sec - exiting
01:06:58 (4468): No heartbeat from core client for 30 sec - exiting
01:07:24 (12584): Can't acquire lockfile (32) - waiting 35s
01:21:58 (12584): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11588, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5680, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5680, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=1
Model crash detected, will try to restart...
21:43:15 (5776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:52:29 (11720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:13:26 (3476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3192, iMonCtr=1
Model crash detected, will try to restart...
21:20:23 (1600): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5388, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
29 Nov 2014 11:08:20 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 777,600 1,760,332 2.2638
28 Nov 2014 04:55:02 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 751,680 1,700,731 2.2626
26 Nov 2014 23:19:13 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 725,760 1,639,867 2.2595
25 Nov 2014 04:54:27 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 699,840 1,580,198 2.2579
23 Nov 2014 10:18:15 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 673,920 1,517,480 2.2517
22 Nov 2014 16:34:38 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 648,000 1,457,954 2.2499
21 Nov 2014 23:57:20 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 622,080 1,399,718 2.2501
20 Nov 2014 04:59:44 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 596,160 1,338,773 2.2457
19 Nov 2014 11:35:57 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 570,240 1,278,291 2.2417
18 Nov 2014 05:49:27 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 544,320 1,217,724 2.2371
17 Nov 2014 00:17:28 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 518,400 1,159,525 2.2367
16 Nov 2014 07:31:33 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 492,480 1,102,797 2.2393
15 Nov 2014 15:07:04 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 466,560 1,046,259 2.2425
14 Nov 2014 22:47:56 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 440,640 989,906 2.2465
13 Nov 2014 08:21:01 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 414,720 930,673 2.2441
12 Nov 2014 01:06:32 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 388,800 873,822 2.2475
10 Nov 2014 02:07:48 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 362,880 815,175 2.2464
09 Nov 2014 09:02:04 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 336,960 757,618 2.2484
08 Nov 2014 15:41:44 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 311,040 700,254 2.2513
07 Nov 2014 23:12:56 1290798 17252377 hadcm3n_sba7_1940_40_009110606_0 285,120 643,864 2.2582
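
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The short Python sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` is hypothetical and chosen for illustration only; the way the server actually computes the column is an assumption, not something the page states.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the "Average (sec/TS)" column is derived.
    """
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) copied from the trickle table
rows = [
    (777_600, 1_760_332, 2.2638),  # 29 Nov 2014, latest trickle
    (751_680, 1_700_731, 2.2626),  # 28 Nov 2014
    (285_120, 643_864, 2.2582),    # 07 Nov 2014, oldest trickle listed
]

for timesteps, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    # Flag any row where the recomputed value differs from the reported one
    status = "ok" if abs(computed - reported) < 1e-9 else "mismatch"
    print(f"{timesteps:>9,} TS  {cpu_seconds:>11,} s  {computed:.4f}  {status}")
```

In the rows listed, consecutive trickles are 25,920 timesteps apart, so the same function can be applied to any other row to spot a change in per-timestep cost over the run.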


©2024 cpdn.org