climateprediction.net home page
Task 13447130

Name hadcm3n_o09h_1940_40_007446754_3
Workunit 7644257
Created 28 Sep 2011, 17:25:41 UTC
Sent 28 Sep 2011, 17:26:46 UTC
Report deadline 29 Dec 2011, 0:53:57 UTC
Received 6 Nov 2011, 23:07:16 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1151256
Run time 15 days 17 hours 41 min 28 sec
CPU time 15 days 0 hours 51 min 27 sec
Validate state Invalid
Credit 5,909.76
Device peak FLOPS 1.73 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5636, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
08:36:17 (1476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5668, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5668, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5444, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5444, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
21:39:46 (3452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4724, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4312, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5384, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5288, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=1
Model crash detected, will try to restart...
08:45:00 (5508): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:39:48 (5872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5620, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:27:33 (5772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
11:47:17 (5612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5672, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5604, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5604, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5604, iMonCtr=1
Model crash detected, will try to restart...
08:04:51 (5788): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5856, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7104, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
04 Nov 2011 16:24:49 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 492,480 1,265,320 2.5693
03 Nov 2011 15:35:29 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 466,560 1,220,304 2.6155
01 Nov 2011 23:39:35 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 440,640 1,173,967 2.6642
31 Oct 2011 20:04:53 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 414,720 1,127,260 2.7181
31 Oct 2011 18:40:57 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 388,800 1,078,262 2.7733
31 Oct 2011 17:21:50 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 362,880 1,024,238 2.8225
31 Oct 2011 16:31:37 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 336,960 960,435 2.8503
31 Oct 2011 12:50:35 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 311,040 891,080 2.8648
31 Oct 2011 12:50:35 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 285,120 809,616 2.8396
31 Oct 2011 12:50:30 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 259,200 730,663 2.8189
18 Oct 2011 17:47:36 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 233,280 669,600 2.8704
14 Oct 2011 17:24:06 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 207,360 608,887 2.9364
13 Oct 2011 16:43:43 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 181,440 549,235 3.0271
12 Oct 2011 00:53:08 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 155,520 478,568 3.0772
09 Oct 2011 17:43:15 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 129,600 400,250 3.0883
07 Oct 2011 22:27:51 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 103,680 320,881 3.0949
05 Oct 2011 16:28:30 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 77,760 241,731 3.1087
02 Oct 2011 15:21:35 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 51,840 163,037 3.1450
30 Sep 2011 17:19:13 1151256 13447130 hadcm3n_o09h_1940_40_007446754_3 25,920 82,242 3.1729
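
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That relationship is inferred from the figures, not stated on the page. A minimal Python sketch that checks it against a few rows copied from the table (the function name `average_sec_per_timestep` is hypothetical, chosen for illustration):

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by timesteps completed, to 4 decimals.

    Assumption: this is how the page derives its Average (sec/TS) column.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, average reported on the page), taken from the table
rows = [
    (492_480, 1_265_320, 2.5693),  # 04 Nov 2011, latest trickle
    (259_200, 730_663, 2.8189),    # 31 Oct 2011
    (51_840, 163_037, 3.1450),     # 02 Oct 2011
    (25_920, 82_242, 3.1729),      # 30 Sep 2011, first trickle
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Allow a small tolerance in case the page rounds slightly differently.
    matches = abs(computed - reported) < 1e-4
    print(f"{timestep:>8,} TS  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

Because the figure is a cumulative average, it only shows the trend since the start of the run. Here it falls from about 3.17 to about 2.57 sec/TS between the first and last trickle.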

