Task 16059441

Name hadcm3n_odd6_1900_40_008472445_1
Workunit 8623284
Created 7 Oct 2013, 0:32:57 UTC
Sent 7 Oct 2013, 0:49:33 UTC
Report deadline 6 Jan 2014, 8:16:44 UTC
Received 24 Oct 2013, 20:29:38 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1180524
Run time 16 days 17 hours 52 min 17 sec
CPU time 13 days 6 hours 39 min 18 sec
Validate state Invalid
Credit 11,819.52
Device peak FLOPS 3.31 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
07:45:20 (2864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
07:46:04 (4460): Can't acquire lockfile (32) - waiting 35s
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1
Model crash detected, will try to restart...
07:46:07 (4256): No heartbeat from core client for 30 sec - exiting
07:46:08 (4256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
21 Oct 2013 04:47:29  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  984,960   1,142,730       1.1602
20 Oct 2013 21:19:06  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  959,040   1,112,448       1.1600
20 Oct 2013 12:16:00  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  933,120   1,083,423       1.1611
20 Oct 2013 04:07:15  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  907,200   1,053,374       1.1611
19 Oct 2013 19:38:36  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  881,280   1,023,757       1.1617
19 Oct 2013 07:45:04  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  855,360   993,985         1.1621
18 Oct 2013 23:01:31  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  829,440   963,219         1.1613
18 Oct 2013 14:24:57  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  803,520   932,433         1.1604
18 Oct 2013 05:54:33  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  777,600   902,789         1.1610
17 Oct 2013 20:55:18  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  751,680   872,966         1.1614
17 Oct 2013 12:10:30  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  725,760   843,395         1.1621
17 Oct 2013 00:49:22  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  699,840   813,172         1.1619
16 Oct 2013 15:53:59  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  673,920   781,568         1.1597
16 Oct 2013 07:19:05  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  648,000   750,856         1.1587
15 Oct 2013 22:56:37  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  622,080   721,205         1.1593
15 Oct 2013 14:44:22  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  596,160   691,718         1.1603
15 Oct 2013 06:24:20  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  570,240   662,200         1.1613
14 Oct 2013 16:25:46  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  544,320   632,242         1.1615
14 Oct 2013 07:46:33  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  518,400   601,615         1.1605
13 Oct 2013 23:02:09  1180524  16059441   hadcm3n_odd6_1900_40_008472445_1  492,480   571,378         1.1602
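
The Average (sec/TS) column in the trickle table appears to be cumulative CPU time divided by the timestep count. The sketch below checks that reading against the three most recent rows and computes the timestep gap between consecutive trickles. The function names are hypothetical and are not part of BOINC or the CPDN software; the relationship itself is an assumption inferred from the listed values.

```python
# Three most recent trickle rows, copied from the table:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
rows = [
    (984_960, 1_142_730, 1.1602),
    (959_040, 1_112_448, 1.1600),
    (933_120, 1_083_423, 1.1611),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: cumulative CPU seconds per timestep."""
    return cpu_time_sec / timestep


def timestep_increments(rows):
    """Timestep difference between each trickle and the next older one."""
    steps = [timestep for timestep, _, _ in rows]
    return [newer - older for newer, older in zip(steps, steps[1:])]


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in rows:
        computed = average_sec_per_timestep(cpu_time_sec, timestep)
        print(f"timestep={timestep:,} computed={computed:.4f} reported={reported:.4f}")
    print("timesteps between trickles:", timestep_increments(rows))
```

For these rows the computed quotient matches the reported average to four decimal places, and consecutive trickles are 25,920 timesteps apart.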