Task 15811543

Name hadcm3n_o5oe_1940_40_008380448_0
Workunit 8531307
Created 31 May 2013, 21:51:09 UTC
Sent 16 Jun 2013, 19:34:27 UTC
Report deadline 16 Sep 2013, 3:01:38 UTC
Received 29 Aug 2013, 0:34:36 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1290798
Run time 27 days 20 hours 38 min 37 sec
CPU time 26 days 19 hours 29 min 30 sec
Validate state Invalid
Credit 10,886.40
Device peak FLOPS 2.40 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
20:57:18 (5584): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
20:31:13 (5968): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:31:14 (5968): No heartbeat from core client for 30 sec - exiting
20:31:15 (5968): No heartbeat from core client for 30 sec - exiting
20:31:16 (5968): No heartbeat from core client for 30 sec - exiting
20:31:17 (5968): No heartbeat from core client for 30 sec - exiting
20:31:18 (5968): No heartbeat from core client for 30 sec - exiting
20:31:20 (5968): No heartbeat from core client for 30 sec - exiting
20:31:21 (5968): No heartbeat from core client for 30 sec - exiting
20:31:22 (5968): No heartbeat from core client for 30 sec - exiting
20:31:23 (5968): No heartbeat from core client for 30 sec - exiting
20:31:24 (5968): No heartbeat from core client for 30 sec - exiting
20:31:25 (5968): No heartbeat from core client for 30 sec - exiting
20:31:26 (5968): No heartbeat from core client for 30 sec - exiting
20:31:27 (5968): No heartbeat from core client for 30 sec - exiting
20:31:28 (5968): No heartbeat from core client for 30 sec - exiting
20:31:29 (5968): No heartbeat from core client for 30 sec - exiting
20:31:30 (5968): No heartbeat from core client for 30 sec - exiting
20:31:32 (5968): No heartbeat from core client for 30 sec - exiting
20:31:33 (5968): No heartbeat from core client for 30 sec - exiting
20:31:34 (5968): No heartbeat from core client for 30 sec - exiting
20:31:35 (5968): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2256, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2256, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5360, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5368, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4644, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7036, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5968, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3544, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4872, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4436, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4436, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4904, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4904, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4904, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8012, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
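The stderr above is long and repetitive, so a tally by message type makes the failure pattern easier to see. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the category names and the `tally` helper are hypothetical, and the patterns are simply substrings copied from the messages in the log above. The sample list contains a handful of representative lines rather than the full log.

```python
import re
from collections import Counter

# Hypothetical category names; each pattern is a substring seen in the stderr above.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "suspend": re.compile(r"Suspend request from BOINC"),
    "quit": re.compile(r"Quit request from BOINC"),
    "model_crash": re.compile(r"Model crash detected"),
    "signal_exit": re.compile(r"Signal \d+ received"),
    "gave_up": re.compile(r"too many model crashes"),
}

def tally(lines):
    """Count log lines per category; lines matching no pattern are ignored."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # at most one category per line
    return counts

# A few representative lines copied from the log above (not the full stderr).
sample = [
    "20:57:18 (5584): No heartbeat from core client for 30 sec - exiting",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "Signal 22 received, exiting...",
    "Called boinc_finish",
    "Sorry, too many model crashes! :-(",
]

counts = tally(sample)
for name, n in counts.items():
    print(f"{name}: {n}")
```

Run against the full stderr text (for example, `tally(text.splitlines())`), the same helper would show how many restarts preceded the final "too many model crashes" message.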
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
18 Aug 2013 19:31:53 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 907,200 | 2,276,786 | 2.5097
18 Aug 2013 01:08:04 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 881,280 | 2,211,233 | 2.5091
17 Aug 2013 06:14:09 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 855,360 | 2,145,655 | 2.5085
16 Aug 2013 00:42:41 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 829,440 | 2,085,672 | 2.5146
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 803,520 | 2,027,205 | 2.5229
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 777,600 | 1,963,640 | 2.5253
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 751,680 | 1,898,461 | 2.5256
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 725,760 | 1,832,959 | 2.5256
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 699,840 | 1,772,358 | 2.5325
14 Aug 2013 19:54:34 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 673,920 | 1,713,460 | 2.5425
24 Jul 2013 20:58:58 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 648,000 | 1,649,564 | 2.5456
23 Jul 2013 21:58:38 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 622,080 | 1,579,014 | 2.5383
23 Jul 2013 20:56:09 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 596,160 | 1,511,659 | 2.5357
23 Jul 2013 20:11:44 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 570,240 | 1,439,726 | 2.5248
23 Jul 2013 18:53:18 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 544,320 | 1,363,941 | 2.5058
23 Jul 2013 18:53:16 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 518,400 | 1,294,220 | 2.4966
23 Jul 2013 18:53:16 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 492,480 | 1,223,534 | 2.4844
23 Jul 2013 18:53:16 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 466,560 | 1,148,636 | 2.4619
23 Jul 2013 18:53:15 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 440,640 | 1,073,940 | 2.4372
11 Jul 2013 04:33:48 | 1189727 | 15811543 | hadcm3n_o5oe_1940_40_008380448_0 | 414,720 | 1,002,855 | 2.4181
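
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against the three most recent rows. The `average_sec_per_ts` helper is a hypothetical name for illustration, and the rounding rule is an assumption inferred from the listed values, not something the page states.

```python
def average_sec_per_ts(cpu_seconds, timestep):
    """Cumulative CPU seconds per model timestep, rounded to 4 places (assumed rule)."""
    return round(cpu_seconds / timestep, 4)

# (timestep, CPU time in seconds, listed average) copied from the table above.
rows = [
    (907_200, 2_276_786, 2.5097),
    (881_280, 2_211_233, 2.5091),
    (855_360, 2_145_655, 2.5085),
]

for timestep, cpu_seconds, listed in rows:
    computed = average_sec_per_ts(cpu_seconds, timestep)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  listed={listed:.4f}")
```

For these rows the computed and listed averages agree, which supports the interpretation that the column is a running average over the whole task rather than a per-trickle rate.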