Task 13143432

Name: hadcm3n_yd3j_1900_40_007349833_2
Workunit: 7547263
Created: 17 Jul 2011, 10:38:15 UTC
Sent: 17 Jul 2011, 10:56:25 UTC
Report deadline: 16 Oct 2011, 18:23:36 UTC
Received: 19 Aug 2011, 9:30:22 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 1099430
Run time: 13 days 16 hours 29 min 10 sec
CPU time: 13 days 2 hours 11 min 23 sec
Validate state: Invalid
Credit: 10,575.36
Device peak FLOPS: 2.97 GFLOPS
Application version: UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:59:29 (4928): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4464, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
02:21:52 (10708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:34:03 (420): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
17:21:26 (5060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
18 Aug 2011 22:43:34 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 881,280 | 1,103,320 | 1.2520
18 Aug 2011 13:29:44 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 855,360 | 1,071,097 | 1.2522
18 Aug 2011 02:38:39 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 829,440 | 1,038,718 | 1.2523
17 Aug 2011 17:21:30 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 803,520 | 1,006,372 | 1.2525
17 Aug 2011 07:57:16 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 777,600 | 974,181 | 1.2528
16 Aug 2011 20:42:21 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 751,680 | 941,821 | 1.2530
16 Aug 2011 11:24:35 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 725,760 | 909,611 | 1.2533
16 Aug 2011 01:56:11 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 699,840 | 877,238 | 1.2535
15 Aug 2011 16:52:43 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 673,920 | 844,952 | 1.2538
15 Aug 2011 07:24:06 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 648,000 | 812,691 | 1.2542
14 Aug 2011 21:57:51 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 622,080 | 780,463 | 1.2546
14 Aug 2011 13:23:44 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 596,160 | 748,401 | 1.2554
14 Aug 2011 01:36:56 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 570,240 | 716,035 | 1.2557
13 Aug 2011 16:06:12 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 544,320 | 683,857 | 1.2564
13 Aug 2011 06:36:46 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 518,400 | 651,473 | 1.2567
12 Aug 2011 21:20:18 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 492,480 | 619,227 | 1.2574
12 Aug 2011 12:03:50 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 466,560 | 587,049 | 1.2582
11 Aug 2011 13:23:51 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 440,640 | 554,699 | 1.2588
10 Aug 2011 18:57:32 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 414,720 | 522,284 | 1.2594
10 Aug 2011 04:12:12 | 1099430 | 13143432 | hadcm3n_yd3j_1900_40_007349833_2 | 388,800 | 489,674 | 1.2594
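
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count reached at that trickle. The following is a minimal Python sketch of that arithmetic, checked against the newest and oldest rows listed above. The function name and the row tuples are illustrative assumptions, not part of the climateprediction.net software.

```python
# Sketch: recompute the "Average (sec/TS)" column from the trickle table.
# Assumption: average = cumulative CPU time (sec) / timestep count.

def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return average CPU seconds spent per model timestep."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep

# (timestep, cpu_time_sec, reported_average) copied from the table:
# the newest row (18 Aug 2011 22:43:34) and the oldest (10 Aug 2011 04:12:12).
rows = [
    (881_280, 1_103_320, 1.2520),
    (388_800, 489_674, 1.2594),
]

for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>9,} TS  {cpu_time_sec:>11,} s  "
          f"computed {computed:.4f}  reported {reported:.4f}")
```

Under that assumption, both computed values match the reported column to four decimal places.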

