Task 12900950

Name hadcm3n_p7bz_1900_40_007227087_2
Workunit 7425327
Created 23 May 2011, 12:56:40 UTC
Sent 23 May 2011, 12:56:44 UTC
Report deadline 22 Aug 2011, 20:23:55 UTC
Received 1 Aug 2011, 6:38:18 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1073842
Run time 9 days 19 hours 28 min 29 sec
CPU time 9 days 3 hours 11 min 47 sec
Validate state Invalid
Credit 4,665.60
Device peak FLOPS 2.71 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.43</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3604, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4404, iMonCtr=1
Model crash detected, will try to restart...
08:38:51 (4836): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4216, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:50:37 (4636): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5528, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1944, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5528, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3312, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4764, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3832, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4396, iMonCtr=1
Model crash detected, will try to restart...
07:59:34 (4336): No heartbeat from core client for 30 sec - exiting
07:59:35 (4336): No heartbeat from core client for 30 sec - exiting
07:59:36 (4336): No heartbeat from core client for 30 sec - exiting
07:59:37 (4336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5140, selfPID=5140, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
29 Jul 2011 13:20:54  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  388,800   788,417         2.0278
26 Jul 2011 07:51:40  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  362,880   735,416         2.0266
25 Jul 2011 19:06:58  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  336,960   682,253         2.0247
25 Jul 2011 19:06:58  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  311,040   629,833         2.0249
25 Jul 2011 17:15:21  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  285,120   577,182         2.0243
25 Jul 2011 17:15:21  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  259,200   524,484         2.0235
01 Jul 2011 07:03:03  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  233,280   472,207         2.0242
24 Jun 2011 11:43:09  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  207,360   420,160         2.0262
22 Jun 2011 06:09:55  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  181,440   368,357         2.0302
15 Jun 2011 10:37:25  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  155,520   316,207         2.0332
10 Jun 2011 08:03:35  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  129,600   263,672         2.0345
08 Jun 2011 07:04:32  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  103,680   211,189         2.0369
06 Jun 2011 06:19:05  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  77,760    158,760         2.0417
31 May 2011 10:49:17  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  51,840    106,137         2.0474
26 May 2011 13:16:25  1073842  12900950   hadcm3n_p7bz_1900_40_007227087_2  25,920    52,059          2.0084
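
Note on the Average (sec/TS) column: the reported values appear to be the cumulative CPU time divided by the timestep count at each trickle. The page does not state this, so treat it as an assumption inferred from the rows themselves. A minimal Python sketch that checks it against three rows copied from the table (the names `TRICKLES` and `seconds_per_timestep` are illustrative and not part of BOINC or CPDN):

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (388_800, 788_417, 2.0278),
    (207_360, 420_160, 2.0262),
    (25_920, 52_059, 2.0084),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed relation: average = cumulative CPU seconds / timesteps completed."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in TRICKLES:
    computed = seconds_per_timestep(timestep, cpu_seconds)
    # Compare to four decimal places, the precision shown in the table.
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

If the assumption holds, the computed and reported columns agree to the four decimal places shown in the table.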

