Task 15722726

Name hadcm3n_3agi_1980_40_008283066_2
Workunit 8434201
Created 12 Apr 2013, 16:15:06 UTC
Sent 12 Apr 2013, 16:15:10 UTC
Report deadline 12 Jul 2013, 23:42:21 UTC
Received 24 May 2013, 16:00:03 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1179809
Run time 7 days 12 hours 6 min 30 sec
CPU time 6 days 23 hours 10 min 14 sec
Validate state Invalid
Credit 5,287.68
Device peak FLOPS 2.97 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3692, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3692, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9572, iMonCtr=1
Model crash detected, will try to restart...
CCPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
06:07:47 (5120): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:09:49 (6044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
06:17:14 (4736): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6100, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2720, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=928, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5032, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2600, iMonCtr=1
Model crash detected, will try to restart...
06:05:01 (4864): No heartbeat from core client for 30 sec - exiting
06:05:02 (4864): No heartbeat from core client for 30 sec - exiting
06:05:03 (4864): No heartbeat from core client for 30 sec - exiting
06:05:04 (4864): No heartbeat from core client for 30 sec - exiting
06:05:05 (4864): No heartbeat from core client for 30 sec - exiting
06:05:07 (4864): No heartbeat from core client for 30 sec - exiting
06:05:08 (4864): No heartbeat from core client for 30 sec - exiting
06:05:09 (4864): No heartbeat from core client for 30 sec - exiting
06:05:10 (4864): No heartbeat from core client for 30 sec - exiting
06:05:11 (4864): No heartbeat from core client for 30 sec - exiting
06:05:12 (4864): No heartbeat from core client for 30 sec - exiting
06:05:13 (4864): No heartbeat from core client for 30 sec - exiting
06:05:14 (4864): No heartbeat from core client for 30 sec - exiting
06:05:15 (4864): No heartbeat from core client for 30 sec - exiting
06:05:16 (4864): No heartbeat from core client for 30 sec - exiting
06:05:18 (4864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1156, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4548, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5712, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
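The stderr output above repeats a small set of message types. A minimal sketch for tallying them is shown here; the category names are hypothetical labels chosen for illustration, and the patterns are copied from the message text in the log. The sample lines are a short excerpt, not the full log.

```python
import re
from collections import Counter

# Hypothetical category labels; each pattern is taken from a message
# that appears in the stderr output above.
PATTERNS = {
    "model_crash": re.compile(r"Model crash detected"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "signal_exit": re.compile(r"Signal \d+ received"),
}


def tally(lines):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


# Short excerpt of lines copied from the log, for demonstration only.
sample = [
    "Model crash detected, will try to restart...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "CPDN Monitor - Quit request from BOINC...",
    "06:07:47 (5120): No heartbeat from core client for 30 sec - exiting",
    "Signal 22 received, exiting...",
    "Model crash detected, will try to restart...",
]

for name, count in sorted(tally(sample).items()):
    print(name, count)
```

Lines that match no pattern, such as the `Controller::` and `Called boinc_finish` lines, are simply skipped. The `CPDN Monitor - No 'heartbeat' from BOINC...` line is also skipped, so each heartbeat timeout is counted once rather than twice.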
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
24 May 2013 14:29:45  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  440,640   597,390         1.3557
22 May 2013 13:50:13  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  414,720   562,458         1.3562
18 May 2013 18:07:24  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  388,800   526,938         1.3553
16 May 2013 14:04:19  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  362,880   490,678         1.3522
14 May 2013 17:22:39  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  336,960   455,171         1.3508
11 May 2013 16:31:23  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  311,040   419,845         1.3498
09 May 2013 15:20:52  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  285,120   385,244         1.3512
06 May 2013 13:40:07  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  259,200   348,952         1.3463
01 May 2013 18:37:26  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  233,280   313,650         1.3445
28 Apr 2013 04:24:45  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  207,360   278,280         1.3420
27 Apr 2013 05:21:58  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  181,440   242,795         1.3382
26 Apr 2013 15:01:22  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  155,520   207,903         1.3368
24 Apr 2013 15:05:36  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  129,600   172,019         1.3273
22 Apr 2013 13:25:36  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  103,680   136,228         1.3139
19 Apr 2013 14:15:29  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  77,760    100,651         1.2944
17 Apr 2013 16:05:52  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  51,840    66,887          1.2903
15 Apr 2013 16:22:50  1179809  15722726   hadcm3n_3agi_1980_40_008283066_2  25,920    33,600          1.2963
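
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the function name `sec_per_timestep` is a hypothetical helper, not part of any CPDN or BOINC tool.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
rows = [
    (440_640, 597_390, 1.3557),
    (414_720, 562_458, 1.3562),
    (25_920, 33_600, 1.2963),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value reported in the table.
    print(timestep, computed, reported)
```

Each trickle in the table is 25,920 timesteps after the previous one, so the average is a running figure over the whole task rather than a per-interval rate.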