Task 16606619

Name hadcm3n_8ee7_1980_40_008728074_1
Workunit 8874052
Created 1 May 2014, 21:17:34 UTC
Sent 1 May 2014, 22:23:50 UTC
Report deadline 1 Aug 2014, 5:51:01 UTC
Received 10 Jun 2014, 12:31:22 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1082920
Run time 20 days 5 hours 53 min 59 sec
CPU time 19 days 9 hours 6 min 36 sec
Validate state Invalid
Credit 10,575.36
Device peak FLOPS 2.71 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5428, iMonCtr=1
Model crash detected, will try to restart...
20:02:54 (5428): No heartbeat from core client for 30 sec - exiting
20:02:55 (5428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1740, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1740, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6276, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6276, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
19:24:36 (3404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:26:25 (3192): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:15:12 (1380): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
06:52:59 (5360): No heartbeat from core client for 30 sec - exiting
06:53:00 (5360): No heartbeat from core client for 30 sec - exiting
06:53:01 (5360): No heartbeat from core client for 30 sec - exiting
06:53:03 (5360): No heartbeat from core client for 30 sec - exiting
06:53:04 (5360): No heartbeat from core client for 30 sec - exiting
06:53:05 (5360): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:02:47 (1288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
10 Jun 2014 10:19:43 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 881,280 1,739,657 1.9740
10 Jun 2014 09:03:56 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 855,360 1,705,218 1.9936
10 Jun 2014 09:01:57 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 829,440 1,669,961 2.0134
10 Jun 2014 09:01:12 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 803,520 1,636,370 2.0365
10 Jun 2014 03:11:26 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 777,600 1,601,562 2.0596
08 Jun 2014 06:25:02 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 751,680 1,566,983 2.0846
06 Jun 2014 08:57:51 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 725,760 1,533,107 2.1124
05 Jun 2014 16:49:21 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 699,840 1,498,569 2.1413
05 Jun 2014 06:04:24 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 673,920 1,464,728 2.1734
04 Jun 2014 15:12:06 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 648,000 1,430,541 2.2076
03 Jun 2014 19:48:40 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 622,080 1,396,418 2.2448
03 Jun 2014 10:06:55 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 596,160 1,362,226 2.2850
02 Jun 2014 22:34:33 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 570,240 1,327,713 2.3283
01 Jun 2014 10:23:13 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 544,320 1,294,154 2.3776
01 Jun 2014 01:01:19 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 518,400 1,260,539 2.4316
31 May 2014 15:52:40 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 492,480 1,227,601 2.4927
31 May 2014 06:15:56 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 466,560 1,192,965 2.5569
30 May 2014 20:18:40 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 440,640 1,157,428 2.6267
12 May 2014 14:41:35 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 414,720 556,618 1.3422
12 May 2014 05:05:31 1082920 16606619 hadcm3n_8ee7_1980_40_008728074_1 388,800 522,592 1.3441