Task 16162620

Name hadcm3n_of7u_1900_40_008474845_1
Workunit 8625684
Created 28 Dec 2013, 5:01:22 UTC
Sent 28 Dec 2013, 5:01:26 UTC
Report deadline 29 Mar 2014, 12:28:37 UTC
Received 24 Jan 2014, 12:29:32 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1319477
Run time 22 days 5 hours 33 min 39 sec
CPU time 19 days 9 hours 12 min 29 sec
Validate state Invalid
Credit 11,197.44
Device peak FLOPS 2.22 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
02:31:48 (7144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:46:45 (7656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:30:57 (6840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:33:42 (3704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
02:54:32 (7832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:32:00 (6868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:31:15 (2664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:54:26 (8124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
02:16:50 (4524): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
20:22:07 (5992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:05:49 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
01:51:12 (5036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
01:58:08 (7096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:51:20 (5748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:59:54 (6184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
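The stderr text above repeats a handful of message types many times. As a rough aid for reading it, the following minimal Python sketch tallies them. The function name `summarize_stderr`, the category labels, and the short `sample` string are hypothetical and used only for illustration; the patterns are copied from the message wording in the log and are not taken from any BOINC or CPDN tool.

```python
import re
from collections import Counter

# Hypothetical category labels; the patterns mirror message wording seen in the log.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "model_crash": re.compile(r"Model crash detected"),
    "signal_exit": re.compile(r"Signal \d+ received"),
}


def summarize_stderr(text):
    """Count how many lines of a stderr dump match each message category."""
    counts = Counter()
    for line in text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# A few lines copied from the log, used here only as sample input.
sample = """\
02:31:48 (7144): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Model crash detected, will try to restart...
"""

for label, n in sorted(summarize_stderr(sample).items()):
    print(label, n)
```

The "CPDN Monitor - No 'heartbeat' from BOINC..." line is deliberately left unmatched, since in this log it always follows the timestamped heartbeat line and would otherwise count the same event twice.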
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
24 Jan 2014 12:32:23 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 933,120 1,638,808 1.7563
23 Jan 2014 00:03:34 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 907,200 1,593,091 1.7561
22 Jan 2014 10:18:43 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 881,280 1,547,374 1.7558
21 Jan 2014 19:00:20 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 855,360 1,501,579 1.7555
21 Jan 2014 04:14:03 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 829,440 1,455,827 1.7552
20 Jan 2014 13:51:29 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 803,520 1,410,187 1.7550
19 Jan 2014 23:28:21 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 777,600 1,364,630 1.7549
19 Jan 2014 08:35:08 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 751,680 1,319,152 1.7549
18 Jan 2014 18:11:35 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 725,760 1,273,856 1.7552
18 Jan 2014 02:00:31 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 699,840 1,228,768 1.7558
17 Jan 2014 11:52:50 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 673,920 1,183,510 1.7562
16 Jan 2014 21:49:16 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 648,000 1,138,526 1.7570
16 Jan 2014 07:18:10 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 622,080 1,093,034 1.7571
15 Jan 2014 17:03:28 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 596,160 1,048,038 1.7580
15 Jan 2014 02:52:27 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 570,240 1,002,879 1.7587
14 Jan 2014 12:49:15 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 544,320 957,741 1.7595
14 Jan 2014 00:40:58 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 518,400 912,247 1.7597
13 Jan 2014 07:52:38 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 492,480 866,937 1.7603
12 Jan 2014 16:32:56 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 466,560 821,273 1.7603
12 Jan 2014 01:46:48 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 440,640 775,876 1.7608
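
The Average (sec/TS) column appears to be the CPU time divided by the timestep count. The following minimal Python sketch checks that reading against two rows copied from the table above. The helper name `parse_trickle` is hypothetical, and the assumption that the last three whitespace-separated fields are timestep, CPU time, and average comes from the column header, not from any CPDN documentation.

```python
def parse_trickle(line):
    """Split one trickle row and return (timestep, cpu_seconds, reported_average).

    Assumes the last three whitespace-separated fields are the timestep,
    the CPU time in seconds, and the reported average in seconds per timestep.
    """
    fields = line.split()
    timestep = int(fields[-3].replace(",", ""))
    cpu_seconds = int(fields[-2].replace(",", ""))
    reported_average = float(fields[-1])
    return timestep, cpu_seconds, reported_average


# Two rows copied verbatim from the table (the newest and the oldest trickle shown).
rows = [
    "24 Jan 2014 12:32:23 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 933,120 1,638,808 1.7563",
    "12 Jan 2014 01:46:48 1306645 16162620 hadcm3n_of7u_1900_40_008474845_1 440,640 775,876 1.7608",
]

for row in rows:
    timestep, cpu_seconds, reported = parse_trickle(row)
    recomputed = cpu_seconds / timestep
    # Print the recomputed value next to the reported one for comparison.
    print(timestep, cpu_seconds, round(recomputed, 4), reported)
```

If the assumption holds, the recomputed and reported values agree to the four decimal places shown in the table.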

