Task 15886071

Name hadcm3n_4gzi_1980_40_008399134_0
Workunit 8549990
Created 8 Jul 2013, 19:48:16 UTC
Sent 11 Jul 2013, 5:01:05 UTC
Report deadline 10 Oct 2013, 12:28:16 UTC
Received 3 Jan 2014, 4:35:15 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 25 (0x00000019) Unknown error code
Computer ID 1218845
Run time 20 days 0 hours 17 min 20 sec
CPU time 19 days 14 hours 27 min 36 sec
Validate state Invalid
Credit 8,398.08
Device peak FLOPS 2.66 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk.
 (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
17:30:50 (4804): No heartbeat from core client for 30 sec - exiting
17:30:51 (4804): No heartbeat from core client for 30 sec - exiting
17:30:52 (4804): No heartbeat from core client for 30 sec - exiting
17:30:53 (4804): No heartbeat from core client for 30 sec - exiting
17:30:54 (4804): No heartbeat from core client for 30 sec - exiting
17:30:55 (4804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2172, iMonCtr=1
Model crash detected, will try to restart...
11:21:13 (5596): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2952, iMonCtr=1
Model crash detected, will try to restart...
15:12:50 (4428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5544, iMonCtr=1
Model crash detected, will try to restart...
11:09:33 (1528): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
19:46:11 (5504): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
09:30:56 (4172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:49:30 (5012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:56:49 (4624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5516, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5252, iMonCtr=1
Model crash detected, will try to restart...
22:39:28 (5316): No heartbeat from core client for 30 sec - exiting
22:39:29 (5316): No heartbeat from core client for 30 sec - exiting
22:39:30 (5316): No heartbeat from core client for 30 sec - exiting
22:39:31 (5316): No heartbeat from core client for 30 sec - exiting
22:39:32 (5316): No heartbeat from core client for 30 sec - exiting
22:39:33 (5316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:38:39 (3084): No heartbeat from core client for 30 sec - exiting
08:38:40 (3084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5300, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5960, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
09:09:07 (4824): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5892, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3764, iMonCtr=1
Model crash detected, will try to restart...
22:21:51 (868): No heartbeat from core client for 30 sec - exiting
22:21:52 (868): No heartbeat from core client for 30 sec - exiting
22:21:53 (868): No heartbeat from core client for 30 sec - exiting
22:21:54 (868): No heartbeat from core client for 30 sec - exiting
22:21:55 (868): No heartbeat from core client for 30 sec - exiting
22:21:56 (868): No heartbeat from core client for 30 sec - exiting
22:21:57 (868): No heartbeat from core client for 30 sec - exiting
22:21:58 (868): No heartbeat from core client for 30 sec - exiting
22:21:59 (868): No heartbeat from core client for 30 sec - exiting
22:22:01 (868): No heartbeat from core client for 30 sec - exiting
22:22:02 (868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/4gziko.pjk1c10
Error converting file to netcdf: dataout/4gziko.pik1c10
Error converting file to netcdf: dataout/4gziko.pfk1c10
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:16:30 (3352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:25:40 (6096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:25:41 (6096): No heartbeat from core client for 30 sec - exiting
14:25:42 (6096): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1620, iMonCtr=1
Model crash detected, will try to restart...
22:14:52 (5524): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:53:42 (4840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:34:03 (5892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4336, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4120, iMonCtr=1
Model crash detected, will try to restart...
13:00:12 (4136): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:55:39 (4240): No heartbeat from core client for 30 sec - exiting
08:55:41 (4240): No heartbeat from core client for 30 sec - exiting
08:55:42 (4240): No heartbeat from core client for 30 sec - exiting
08:55:43 (4240): No heartbeat from core client for 30 sec - exiting
08:55:44 (4240): No heartbeat from core client for 30 sec - exiting
08:55:45 (4240): No heartbeat from core client for 30 sec - exiting
08:55:46 (4240): No heartbeat from core client for 30 sec - exiting
08:55:47 (4240): No heartbeat from core client for 30 sec - exiting
08:55:48 (4240): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1
Model crash detected, will try to restart...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5304, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
08:05:40 (5960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
23 Dec 2013 06:30:40 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 699,840 1,680,244 2.4009
18 Dec 2013 02:45:07 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 673,920 1,588,329 2.3569
29 Nov 2013 06:13:39 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 648,000 1,503,495 2.3202
25 Nov 2013 06:05:10 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 622,080 1,410,627 2.2676
19 Nov 2013 08:53:46 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 596,160 1,320,898 2.2157
14 Nov 2013 13:29:36 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 570,240 1,283,210 2.2503
28 Sep 2013 01:46:41 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 544,320 1,202,641 2.2094
16 Aug 2013 12:00:17 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 518,400 1,113,699 2.1483
15 Aug 2013 07:25:10 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 492,480 1,072,797 2.1784
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 466,560 1,002,645 2.1490
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 440,640 913,699 2.0736
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 414,720 828,194 1.9970
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 388,800 789,753 2.0313
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 362,880 764,736 2.1074
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 336,960 738,380 2.1913
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 311,040 709,478 2.2810
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 285,120 683,523 2.3973
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 259,200 655,582 2.5293
14 Aug 2013 16:21:22 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 233,280 628,847 2.6957
29 Jul 2013 12:46:00 1218845 15886071 hadcm3n_4gzi_1980_40_008399134_0 207,360 601,408 2.9003
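
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count reported in each trickle. The following is a minimal Python sketch under that assumption; the function names are illustrative and are not part of the CPDN or BOINC software. It recomputes the average for the newest and oldest rows of the table above.

```python
def parse_count(text: str) -> int:
    """Convert a thousands-separated figure such as '1,680,244' to an int."""
    return int(text.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time) pairs copied from the trickle table.
rows = [
    ("699,840", "1,680,244"),  # trickle sent 23 Dec 2013
    ("207,360", "601,408"),    # trickle sent 29 Jul 2013
]

for timestep_text, cpu_text in rows:
    avg = average_sec_per_timestep(parse_count(cpu_text), parse_count(timestep_text))
    print(timestep_text, cpu_text, avg)
```

Under this assumption the computed values can be compared directly with the last column of each row. Successive trickles in the table are 25,920 timesteps apart.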

