Task 11906600

Name hadsm3dhet2_u4jb_006726618_4
Workunit 6929961
Created 17 Sep 2010, 8:09:41 UTC
Sent 18 Sep 2010, 8:17:27 UTC
Report deadline 31 Aug 2011, 13:37:27 UTC
Received 10 Oct 2010, 13:56:38 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1101459
Run time
CPU time 6 days 9 hours 0 min
Validate state Invalid
Credit 3,672.01
Device peak FLOPS 2.35 GFLOPS
Application version UK Met Office HadSM3 Slab Model v6.07 windows_intelx86
Stderr
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=724, iMonCtr=1
Model crash detected, will try to restart...
No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
MainError:	02:25:28 AM	No files match the supplied pattern.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
forrtl: Access is denied.
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6640, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
CPDN Monitor - Quit request from BOINC...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4104, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
called boinc_finish

</stderr_txt>
]]>
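The stderr output above is long and repetitive, so a per-message tally can make the pattern easier to read. The following is a minimal sketch, not part of the original page: the function name `tally` and the normalisation rules (masking the `selfPID` value, dropping the `MainError` timestamp) are assumptions chosen for illustration, and the sample text is a short excerpt of the log rather than the full output.

```python
import re
from collections import Counter


def tally(stderr_text: str) -> Counter:
    """Count stderr lines by message, ignoring per-process details."""
    counts = Counter()
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Mask the monitor's own PID so restarts under different PIDs group together.
        key = re.sub(r"selfPID=\d+", "selfPID=N", line)
        # Drop the timestamp that follows the MainError prefix.
        key = re.sub(r"^MainError:\s+\S+ [AP]M\s+", "MainError: ", key)
        counts[key] += 1
    return counts


# Short excerpt of the log above (hypothetical selection, for illustration only).
sample = """\
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=180, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
"""

for message, count in tally(sample).most_common():
    print(f"{count:3d}  {message}")
```

Run against the full stderr text, the same function would show how often each message type occurs. It only counts lines and does not identify the cause of the crash.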
Latest Trickles Received
| Time Sent (UTC)      | Host ID | Result ID | Result Name                  | Timestep | CPU Time (sec) | Average (sec/TS) |
|----------------------|---------|-----------|------------------------------|----------|----------------|------------------|
| 05 Oct 2010 06:30:35 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 140,426  | 542,325        | 1.3569           |
| 04 Oct 2010 23:10:01 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 129,624  | 527,678        | 1.3569           |
| 04 Oct 2010 05:45:07 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 118,822  | 513,202        | 1.3574           |
| 03 Oct 2010 22:31:09 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 108,020  | 498,621        | 1.3576           |
| 03 Oct 2010 05:41:15 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 97,218   | 483,948        | 1.3576           |
| 02 Oct 2010 22:31:29 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 86,416   | 469,132        | 1.3572           |
| 02 Oct 2010 07:57:52 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 75,614   | 454,894        | 1.3585           |
| 02 Oct 2010 03:01:30 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 64,812   | 440,430        | 1.3591           |
| 01 Oct 2010 09:07:45 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 54,010   | 425,871        | 1.3595           |
| 01 Oct 2010 03:53:57 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 43,208   | 410,861        | 1.3584           |
| 30 Sep 2010 08:20:04 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 32,406   | 396,069        | 1.3580           |
| 30 Sep 2010 03:45:10 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 21,604   | 381,112        | 1.3570           |
| 29 Sep 2010 09:20:09 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 10,802   | 366,538        | 1.3573           |
| 29 Sep 2010 03:30:19 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 259,248  | 351,983        | 1.3577           |
| 28 Sep 2010 18:53:09 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 248,446  | 337,203        | 1.3572           |
| 28 Sep 2010 05:40:25 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 237,644  | 322,644        | 1.3577           |
| 27 Sep 2010 22:53:33 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 226,842  | 308,484        | 1.3599           |
| 27 Sep 2010 10:19:24 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 216,040  | 294,192        | 1.3617           |
| 27 Sep 2010 03:22:10 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 205,238  | 279,627        | 1.3625           |
| 26 Sep 2010 19:29:10 | 1101459 | 11906600  | hadsm3dhet2_u4jb_006726618_4 | 194,436  | 264,848        | 1.3621           |
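The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative number of timesteps. The Timestep value drops from 259,248 to 10,802 on 29 Sep 2010, which suggests the counter restarts once a phase of 259,248 timesteps is complete. That reading is an assumption inferred from the figures above, not something the page states. A minimal sketch under that assumption, using four rows from the table in chronological order (the function name is hypothetical):

```python
from typing import List, Tuple


def average_sec_per_timestep(trickles: List[Tuple[int, int]]) -> List[float]:
    """Return CPU seconds per cumulative timestep for each trickle.

    trickles: (timestep, cpu_seconds) pairs in chronological order.
    """
    averages = []
    offset = 0    # timesteps completed in earlier phases
    previous = 0  # timestep reported by the preceding trickle
    for timestep, cpu_seconds in trickles:
        if timestep < previous:
            # Assumption: the counter restarted, and the preceding trickle
            # was the last one of the earlier phase.
            offset += previous
        previous = timestep
        averages.append(round(cpu_seconds / (offset + timestep), 4))
    return averages


# Four rows from the table above, oldest first.
rows = [
    (194436, 264848),  # 26 Sep 2010 19:29:10
    (259248, 351983),  # 29 Sep 2010 03:30:19
    (10802, 366538),   # 29 Sep 2010 09:20:09
    (140426, 542325),  # 05 Oct 2010 06:30:35
]
print(average_sec_per_timestep(rows))
```

For these four rows the computed values match the reported averages (1.3621, 1.3577, 1.3573, 1.3569), which is consistent with the assumed counter reset. The restart detection only works if the trickle immediately before the reset is included in the input.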