Task 15643754

Name hadcm3n_4h9w_1940_40_008309950_1
Workunit 8461085
Created 28 Feb 2013, 20:35:07 UTC
Sent 28 Feb 2013, 20:35:17 UTC
Report deadline 31 May 2013, 4:02:28 UTC
Received 25 Mar 2013, 9:08:45 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1242385
Run time 18 days 1 hours 1 min 15 sec
CPU time 15 days 6 hours 37 min 15 sec
Validate state Invalid
Credit 11,508.48
Device peak FLOPS 2.61 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:20:31 (7028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:05:42 (5876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:05:43 (5876): No heartbeat from core client for 30 sec - exiting
14:05:44 (5876): No heartbeat from core client for 30 sec - exiting
14:05:45 (5876): No heartbeat from core client for 30 sec - exiting
14:05:46 (5876): No heartbeat from core client for 30 sec - exiting
14:05:47 (5876): No heartbeat from core client for 30 sec - exiting
14:05:48 (5876): No heartbeat from core client for 30 sec - exiting
14:05:49 (5876): No heartbeat from core client for 30 sec - exiting
14:05:50 (5876): No heartbeat from core client for 30 sec - exiting
14:05:51 (5876): No heartbeat from core client for 30 sec - exiting
14:05:52 (5876): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:05:19 (7020): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:00:51 (7208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7608, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/4h9wko.pjg1c10
Error converting file to netcdf: dataout/4h9wko.pig1c10
Error converting file to netcdf: dataout/4h9wko.pfg1c10
Error converting file to netcdf: dataout/4h9wka.phg1c10
Error converting file to netcdf: dataout/4h9wka.pgg1c10
Error converting file to netcdf: dataout/4h9wka.peg1c10
Error converting file to netcdf: dataout/4h9wka.pdg1c10
09:49:31 (6804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:56:26 (1332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:56:27 (1332): No heartbeat from core client for 30 sec - exiting
12:56:28 (1332): No heartbeat from core client for 30 sec - exiting
12:56:29 (1332): No heartbeat from core client for 30 sec - exiting
12:56:30 (1332): No heartbeat from core client for 30 sec - exiting
12:56:31 (1332): No heartbeat from core client for 30 sec - exiting
12:56:32 (1332): No heartbeat from core client for 30 sec - exiting
12:56:33 (1332): No heartbeat from core client for 30 sec - exiting
12:56:34 (1332): No heartbeat from core client for 30 sec - exiting
12:56:35 (1332): No heartbeat from core client for 30 sec - exiting
12:56:36 (1332): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6480, iMonCtr=1
Model crash detected, will try to restart...
08:01:19 (6712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6432, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    

Model crashed: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.                                                                                                                                                                                                                     tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
25 Mar 2013 04:51:15 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 959,040 1,521,606 1.5866
24 Mar 2013 17:05:51 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 933,120 1,479,958 1.5860
24 Mar 2013 06:06:20 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 907,200 1,441,219 1.5886
23 Mar 2013 19:11:22 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 881,280 1,402,057 1.5909
23 Mar 2013 07:52:56 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 855,360 1,362,511 1.5929
22 Mar 2013 20:25:16 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 829,440 1,321,991 1.5938
22 Mar 2013 05:17:54 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 803,520 1,276,572 1.5887
21 Mar 2013 17:46:13 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 777,600 1,235,496 1.5889
21 Mar 2013 04:33:04 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 751,680 1,194,867 1.5896
20 Mar 2013 09:02:06 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 725,760 1,156,290 1.5932
19 Mar 2013 22:04:15 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 699,840 1,117,298 1.5965
19 Mar 2013 07:48:48 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 673,920 1,073,354 1.5927
18 Mar 2013 20:29:42 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 648,000 1,034,236 1.5960
18 Mar 2013 06:53:10 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 622,080 989,394 1.5905
17 Mar 2013 19:47:42 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 596,160 950,156 1.5938
17 Mar 2013 08:38:53 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 570,240 910,241 1.5962
16 Mar 2013 13:50:17 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 544,320 869,663 1.5977
16 Mar 2013 03:16:22 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 518,400 830,870 1.6028
13 Mar 2013 17:36:25 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 492,480 782,975 1.5899
13 Mar 2013 04:28:49 1242385 15643754 hadcm3n_4h9w_1940_40_008309950_1 466,560 737,214 1.5801
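
The Average (sec/TS) column in the trickle table appears to be CPU Time (sec) divided by Timestep. The page does not state this; it is inferred from the figures. A minimal sketch that checks the relationship against three rows copied from the table (the function name `avg_sec_per_timestep` is hypothetical, introduced only for illustration):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg_sec_per_ts)
trickles = [
    (959_040, 1_521_606, 1.5866),
    (933_120, 1_479_958, 1.5860),
    (466_560, 737_214, 1.5801),
]


def avg_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = avg_sec_per_timestep(cpu_time_sec, timestep)
    # Compare the recomputed value with the value shown on the page.
    print(f"timestep={timestep:>9,d}  computed={computed:.4f}  reported={reported:.4f}")
```

For these rows the recomputed values agree with the reported ones to four decimal places, which is consistent with the assumed definition.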

