Task 13110371

Name hadcm3n_yfqh_1900_40_007353251_0
Workunit 7550681
Created 6 Jul 2011, 14:25:44 UTC
Sent 15 Jul 2011, 17:33:24 UTC
Report deadline 15 Oct 2011, 1:00:35 UTC
Received 31 Oct 2011, 21:56:16 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 25 (0x00000019) Unknown error code
Computer ID 950229
Run time 105 days 10 hours 6 min 17 sec
CPU time 94 days 4 hours 18 min 53 sec
Validate state Invalid
Credit 11,819.52
Device peak FLOPS 1.14 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=148, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1900, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
21:15:53 (3712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4160, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=464, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4524, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4524, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
19:00:59 (4256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3720, iMonCtr=1
Model crash detected, will try to restart...
10:12:22 (144): Can't acquire lockfile (32) - waiting 35s
10:12:39 (2704): Can't acquire lockfile (32) - waiting 35s
10:12:48 (4316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:13:14 (2704): Can't acquire lockfile (32) - exiting
10:13:14 (2704): Error: The process cannot access the file because it is being used by another process. (0x20)
10:13:45 (144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:21:21 (1820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=736, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
21:26:16 (4900): Can't acquire lockfile (32) - waiting 35s
21:26:38 (3624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
31 Oct 2011 18:36:02 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 984,960 | 7,946,701 | 8.0680
31 Oct 2011 16:39:57 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 959,040 | 7,725,222 | 8.0552
31 Oct 2011 14:12:06 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 933,120 | 7,503,190 | 8.0410
31 Oct 2011 14:12:06 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 907,200 | 7,279,955 | 8.0246
18 Oct 2011 02:14:48 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 881,280 | 7,057,345 | 8.0081
15 Oct 2011 05:19:09 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 855,360 | 6,835,246 | 7.9911
12 Oct 2011 10:26:59 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 829,440 | 6,620,180 | 7.9815
09 Oct 2011 22:04:09 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 803,520 | 6,408,169 | 7.9751
07 Oct 2011 08:55:57 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 777,600 | 6,195,036 | 7.9669
04 Oct 2011 14:57:36 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 751,680 | 5,973,753 | 7.9472
29 Sep 2011 18:42:30 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 725,760 | 5,751,864 | 7.9253
26 Sep 2011 22:25:43 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 699,840 | 5,528,204 | 7.8992
24 Sep 2011 01:16:29 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 673,920 | 5,304,511 | 7.8711
20 Sep 2011 23:20:18 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 648,000 | 5,085,092 | 7.8474
18 Sep 2011 12:35:45 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 622,080 | 4,878,186 | 7.8417
15 Sep 2011 19:32:58 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 596,160 | 4,659,142 | 7.8153
11 Sep 2011 10:18:33 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 570,240 | 4,457,071 | 7.8161
08 Sep 2011 16:17:49 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 544,320 | 4,249,483 | 7.8070
06 Sep 2011 06:47:55 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 518,400 | 4,042,754 | 7.7985
03 Sep 2011 09:31:37 | 950229 | 13110371 | hadcm3n_yfqh_1900_40_007353251_0 | 492,480 | 3,822,413 | 7.7616
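
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page does not state this formula; it is an assumption that happens to match the rows checked below. The following minimal sketch recomputes the value for three rows copied from the trickle table. The function name `seconds_per_timestep` is hypothetical and is not part of BOINC or CPDN.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places.

    Assumption: the table's "Average (sec/TS)" is cumulative CPU time
    divided by the cumulative timestep count.
    """
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, average listed on the page),
# copied from the first, second, and last rows of the trickle table.
rows = [
    (984_960, 7_946_701, 8.0680),
    (959_040, 7_725_222, 8.0552),
    (492_480, 3_822_413, 7.7616),
]

for timesteps, cpu_seconds, listed in rows:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    # Compare with a small tolerance rather than exact float equality.
    matches = abs(computed - listed) < 1e-4
    print(f"{timesteps:>9,} TS: computed {computed:.4f}, listed {listed:.4f}, match={matches}")
```

Because the figure is a running average over the whole task, its slow rise from 7.7616 to 8.0680 means the later timesteps cost more CPU time each than the earlier ones did.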

