Task 16152046

Name hadcm3n_856k_1980_40_008464672_1
Workunit 8615511
Created 20 Dec 2013, 22:13:16 UTC
Sent 20 Dec 2013, 22:13:23 UTC
Report deadline 22 Mar 2014, 5:40:34 UTC
Received 11 Apr 2014, 10:52:14 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1295275
Run time 10 days 6 hours 44 min 2 sec
CPU time 9 days 20 hours 47 min 39 sec
Validate state Invalid
Credit 7,153.92
Device peak FLOPS 3.32 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.2.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
10:39:39 (5104): No heartbeat from core client for 30 sec - exiting
10:39:40 (5104): No heartbeat from core client for 30 sec - exiting
10:39:41 (5104): No heartbeat from core client for 30 sec - exiting
10:39:42 (5104): No heartbeat from core client for 30 sec - exiting
10:39:43 (5104): No heartbeat from core client for 30 sec - exiting
10:39:44 (5104): No heartbeat from core client for 30 sec - exiting
10:39:45 (5104): No heartbeat from core client for 30 sec - exiting
10:39:46 (5104): No heartbeat from core client for 30 sec - exiting
10:39:47 (5104): No heartbeat from core client for 30 sec - exiting
10:39:48 (5104): No heartbeat from core client for 30 sec - exiting
10:39:49 (5104): No heartbeat from core client for 30 sec - exiting
10:39:50 (5104): No heartbeat from core client for 30 sec - exiting
10:39:51 (5104): No heartbeat from core client for 30 sec - exiting
10:39:52 (5104): No heartbeat from core client for 30 sec - exiting
10:39:53 (5104): No heartbeat from core client for 30 sec - exiting
10:39:54 (5104): No heartbeat from core client for 30 sec - exiting
10:39:55 (5104): No heartbeat from core client for 30 sec - exiting
10:39:56 (5104): No heartbeat from core client for 30 sec - exiting
10:39:57 (5104): No heartbeat from core client for 30 sec - exiting
10:39:58 (5104): No heartbeat from core client for 30 sec - exiting
10:39:59 (5104): No heartbeat from core client for 30 sec - exiting
10:40:00 (5104): No heartbeat from core client for 30 sec - exiting
10:40:01 (5104): No heartbeat from core client for 30 sec - exiting
10:40:02 (5104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:40:03 (5104): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4856, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4732, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2640, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=924, iMonCtr=1
Model crash detected, will try to restart...
11:24:38 (4900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4148, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3656, iMonCtr=1
Model crash detected, will try to restart...
18:59:37 (2548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2508, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3380, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3912, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4260, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1404, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
10 Apr 2014 19:09:38 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 596,160 | 842,457 | 1.4131
06 Apr 2014 23:43:58 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 570,240 | 805,543 | 1.4126
06 Apr 2014 06:23:48 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 544,320 | 768,363 | 1.4116
31 Mar 2014 19:00:33 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 518,400 | 731,322 | 1.4107
23 Mar 2014 13:04:30 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 492,480 | 695,123 | 1.4115
22 Mar 2014 11:40:31 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 466,560 | 660,745 | 1.4162
15 Mar 2014 14:37:30 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 440,640 | 624,708 | 1.4177
04 Mar 2014 08:04:12 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 414,720 | 587,725 | 1.4172
24 Feb 2014 12:49:26 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 388,800 | 553,111 | 1.4226
21 Feb 2014 14:23:13 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 362,880 | 515,867 | 1.4216
14 Feb 2014 10:56:16 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 336,960 | 478,635 | 1.4205
09 Feb 2014 23:25:25 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 311,040 | 441,591 | 1.4197
09 Feb 2014 13:21:29 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 285,120 | 404,582 | 1.4190
06 Feb 2014 14:01:34 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 259,200 | 368,692 | 1.4224
02 Feb 2014 10:53:50 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 233,280 | 332,164 | 1.4239
26 Jan 2014 12:07:34 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 207,360 | 295,442 | 1.4248
20 Jan 2014 14:06:38 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 181,440 | 258,664 | 1.4256
17 Jan 2014 08:32:08 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 155,520 | 221,807 | 1.4262
10 Jan 2014 10:32:53 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 129,600 | 184,793 | 1.4259
09 Jan 2014 09:52:25 | 1295275 | 16152046 | hadcm3n_856k_1980_40_008464672_1 | 103,680 | 147,969 | 1.4272
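
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the numbers on this page, not taken from project documentation, so treat the sketch below as a consistency check under that assumption. The function name `seconds_per_timestep` is hypothetical, and the three rows are copied from the table.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page's "Average (sec/TS)" is CPU Time / Timestep.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, average reported on the page)
rows = [
    (596_160, 842_457, 1.4131),  # 10 Apr 2014 trickle
    (570_240, 805_543, 1.4126),  # 06 Apr 2014 23:43:58 trickle
    (103_680, 147_969, 1.4272),  # 09 Jan 2014 trickle
]

for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For these three rows the computed ratio matches the reported average to four decimal places, which is consistent with the assumed definition.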