Task 13609471

Name hadcm3n_yd1z_1940_40_007539590_1
Workunit 7736822
Created 6 Nov 2011, 3:01:15 UTC
Sent 9 Nov 2011, 6:34:04 UTC
Report deadline 8 Feb 2012, 14:01:15 UTC
Received 19 Dec 2011, 21:40:22 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1082927
Run time 23 days 7 hours 15 min 12 sec
CPU time 22 days 23 hours 33 min 7 sec
Validate state Invalid
Credit 10,264.32
Device peak FLOPS 2.07 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
12:48:59 (2608): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6136, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5424, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2096, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11:15:46 (6744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6388, iMonCtr=1
Model crash detected, will try to restart...
19:32:02 (4624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CCPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3304, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5924, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6628, iMonCtr=1
Model crash detected, will try to restart...
10:05:51 (5784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:10:33 (3148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:28:40 (1184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7564, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3740, iMonCtr=1
Model crash detected, will try to restart...
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
11:20:24 (7488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7504, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8096, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7648, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7648, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3724, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
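The stderr output above repeats a handful of message types. As a rough aid for reading it, the sketch below tallies lines by message type using plain substring matching. The category names and the `tally_events` helper are hypothetical, introduced only for illustration; the substrings are copied from the log, and the sample text is a short excerpt of it rather than the full output.

```python
from collections import Counter

# Hypothetical category names mapped to substrings copied from the log above.
PATTERNS = {
    "no_heartbeat": "No heartbeat from core client",
    "quit_request": "Quit request from BOINC",
    "suspend_request": "Suspend request from BOINC",
    "model_crash": "Model crash detected",
    "signal_22": "Signal 22 received",
}


def tally_events(stderr_text: str) -> Counter:
    """Count how many log lines contain each known message substring."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
    return counts


# Short excerpt of the stderr text above, not the whole log.
sample = """\
11:20:24 (7488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7504, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
"""

for name, count in sorted(tally_events(sample).items()):
    print(f"{name}: {count}")
```

Substring matching is used instead of anchored patterns because some lines in the log are fused or carry a stray leading character, so a match anywhere in the line is more forgiving.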
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
19 Dec 2011 04:01:37  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  855,360   1,954,649       2.2852
17 Dec 2011 19:33:10  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  829,440   1,891,943       2.2810
16 Dec 2011 07:16:10  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  803,520   1,832,791       2.2810
14 Dec 2011 18:37:43  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  777,600   1,772,136       2.2790
13 Dec 2011 17:13:44  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  751,680   1,710,201       2.2752
12 Dec 2011 17:01:11  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  725,760   1,649,691       2.2731
11 Dec 2011 17:09:03  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  699,840   1,588,581       2.2699
10 Dec 2011 04:45:14  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  673,920   1,527,669       2.2668
09 Dec 2011 05:16:47  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  648,000   1,464,908       2.2607
08 Dec 2011 03:31:32  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  622,080   1,404,152       2.2572
07 Dec 2011 10:37:18  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  596,160   1,345,223       2.2565
06 Dec 2011 17:48:01  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  570,240   1,285,808       2.2549
05 Dec 2011 03:59:49  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  544,320   1,227,151       2.2545
04 Dec 2011 08:04:46  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  518,400   1,167,394       2.2519
03 Dec 2011 14:12:32  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  492,480   1,107,384       2.2486
02 Dec 2011 19:06:20  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  466,560   1,048,718       2.2478
01 Dec 2011 06:14:53  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  440,640   985,096         2.2356
30 Nov 2011 03:34:57  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  414,720   923,673         2.2272
27 Nov 2011 11:06:19  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  388,800   863,832         2.2218
26 Nov 2011 02:02:53  1082927  13609471   hadcm3n_yd1z_1940_40_007539590_1  362,880   803,564         2.2144
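
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. That reading is an assumption inferred from the figures, not something the page states. The sketch below checks it against three rows copied from the table; `average_sec_per_timestep` is a hypothetical helper name.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
ROWS = [
    (855_360, 1_954_649, 2.2852),
    (829_440, 1_891_943, 2.2810),
    (362_880, 803_564, 2.2144),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per completed timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in ROWS:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}")
```

In the rows listed, consecutive trickles are 25,920 timesteps apart (for example 855,360 - 829,440), so the slowly rising average suggests that later timesteps cost slightly more CPU time than earlier ones.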


©2024 climateprediction.net