climateprediction.net home page
Task 17262061

Name hadcm3n_sc31_1940_40_009112682_2
Workunit 9243018
Created 22 Oct 2014, 17:33:36 UTC
Sent 22 Oct 2014, 23:11:17 UTC
Report deadline 22 Jan 2015, 6:38:28 UTC
Received 31 Dec 2014, 15:09:44 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1261147
Run time 11 days 6 hours 42 min 55 sec
CPU time 9 days 21 hours 46 min 15 sec
Validate state Invalid
Credit 7,464.96
Device peak FLOPS 2.81 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5924, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4944, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: C I/O Error feof - Unit 62 - Return code = 16
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/sc31ko.pjf0c10
Error converting file to netcdf: dataout/sc31ko.pif0c10
Error converting file to netcdf: dataout/sc31ko.pff0c10
Error converting file to netcdf: dataout/sc31ko.pcf0c10
Error converting file to netcdf: dataout/sc31ko.pbf0c10
Error converting file to netcdf: dataout/sc31ko.paf0c10
Error converting file to netcdf: dataout/sc31ka.phf0c10
Error converting file to netcdf: dataout/sc31ka.pgf0c10
Error converting file to netcdf: dataout/sc31ka.pef0c10
Error converting file to netcdf: dataout/sc31ka.pdf0c10
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=1
Model crash detected, will try to restart...
21:10:25 (4352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:22:07 (20264): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3236, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3236, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3236, iMonCtr=1
Model crash detected, will try to restart...
22:10:11 (5756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:06:43 (4344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:40:02 (4140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:30:36 (4440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
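The stderr text above repeats a handful of message types many times, so a tally can make the failure pattern easier to read. The following is a minimal sketch, assuming the log has been saved as plain text; the `MARKERS` labels and the `tally` function are hypothetical names chosen for illustration and are not part of BOINC or the CPDN application.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that appear in the stderr text above.
MARKERS = {
    "restart_attempt": "Model crash detected",
    "heartbeat_lost": "No heartbeat from core client",
    "signal_11": "Signal 11 received",
    "netcdf_error": "Error converting file to netcdf",
    "io_error": "BUFFIN: C I/O Error",
}

def tally(stderr_text):
    """Count how many lines contain each marker substring."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in MARKERS.items():
            if needle in line:
                counts[label] += 1
    return counts

if __name__ == "__main__":
    sample = (
        "Signal 11 received, exiting...\n"
        "Called boinc_finish\n"
        "Model crash detected, will try to restart...\n"
    )
    for label, n in sorted(tally(sample).items()):
        print(label, n)
```

Substring matching is used instead of a strict grammar because the log lines are free-form; lines that match no marker are ignored.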
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
29 Dec 2014 19:30:46 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 622,080 829,089 1.3328
29 Dec 2014 01:38:02 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 596,160 794,154 1.3321
27 Dec 2014 20:40:42 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 570,240 759,067 1.3311
24 Dec 2014 16:07:59 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 544,320 724,006 1.3301
21 Dec 2014 18:58:05 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 518,400 689,384 1.3298
20 Dec 2014 01:28:58 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 492,480 655,378 1.3308
16 Dec 2014 23:28:19 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 466,560 621,144 1.3313
14 Dec 2014 19:06:24 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 440,640 586,852 1.3318
11 Dec 2014 23:48:52 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 414,720 552,300 1.3317
08 Dec 2014 01:16:52 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 388,800 517,401 1.3308
07 Dec 2014 03:46:49 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 362,880 482,785 1.3304
03 Dec 2014 04:26:52 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 336,960 448,620 1.3314
30 Nov 2014 02:07:26 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 311,040 413,714 1.3301
28 Nov 2014 01:55:41 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 285,120 378,885 1.3289
25 Nov 2014 01:09:47 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 259,200 344,102 1.3276
21 Nov 2014 01:57:33 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 233,280 309,246 1.3256
16 Nov 2014 19:01:21 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 207,360 274,524 1.3239
15 Nov 2014 19:05:25 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 181,440 240,242 1.3241
11 Nov 2014 02:14:19 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 155,520 205,704 1.3227
06 Nov 2014 00:25:21 1261147 17262061 hadcm3n_sc31_1940_40_009112682_2 129,600 171,169 1.3207
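In the rows above, the Average (sec/TS) column appears to be CPU Time divided by Timestep, for example 829,089 / 622,080 ≈ 1.3328. The sketch below parses one row and recomputes that ratio. The `ROW` pattern and the `parse_trickle` function are hypothetical helpers that assume the space-separated layout shown in this listing; they are not part of any CPDN tool.

```python
import re

# Hypothetical pattern for one trickle row in the layout shown above:
# date, time, host ID, result ID, result name, timestep, CPU seconds, average.
ROW = re.compile(
    r"^(?P<sent>\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\d+) (?P<result>\d+) (?P<name>\S+) "
    r"(?P<timestep>[\d,]+) (?P<cpu>[\d,]+) (?P<avg>[\d.]+)$"
)

def parse_trickle(line):
    """Parse one trickle row and recompute seconds per timestep."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError("unrecognised trickle row: %r" % line)
    timestep = int(m["timestep"].replace(",", ""))  # drop thousands separators
    cpu = int(m["cpu"].replace(",", ""))
    return {
        "sent": m["sent"],
        "name": m["name"],
        "timestep": timestep,
        "cpu_sec": cpu,
        "listed_avg": float(m["avg"]),
        "computed_avg": cpu / timestep,
    }

if __name__ == "__main__":
    row = ("29 Dec 2014 19:30:46 1261147 17262061 "
           "hadcm3n_sc31_1940_40_009112682_2 622,080 829,089 1.3328")
    rec = parse_trickle(row)
    print(rec["listed_avg"], round(rec["computed_avg"], 4))
```

Comparing `listed_avg` with `computed_avg` is a quick consistency check on a copied row; a mismatch larger than rounding error would suggest the row was garbled in extraction.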

