climateprediction.net home page
Task 15865084

Name hadcm3n_o7cq_1980_40_008396268_0
Workunit 8547127
Created 26 Jun 2013, 1:46:20 UTC
Sent 29 Jun 2013, 23:01:17 UTC
Report deadline 29 Sep 2013, 6:28:28 UTC
Received 21 Sep 2013, 0:39:32 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1261147
Run time 7 days 22 hours 49 min
CPU time 7 days 8 hours 21 min 19 sec
Validate state Invalid
Credit 5,598.72
Device peak FLOPS 2.79 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5884, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1220, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4896, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2880, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3032, iMonCtr=1
Model crash detected, will try to restart...
21:04:07 (3864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=1
Model crash detected, will try to restart...
10:25:20 (3188): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:41:48 (4820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:11:01 (2908): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:11:02 (2908): No heartbeat from core client for 30 sec - exiting
20:49:32 (4176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:00:15 (4444): No heartbeat from core client for 30 sec - exiting
17:00:16 (4444): No heartbeat from core client for 30 sec - exiting
17:00:17 (4444): No heartbeat from core client for 30 sec - exiting
17:00:18 (4444): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5560, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1056, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2540, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2968, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
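The stderr output above repeats a handful of message types (suspend requests, model crashes, missed heartbeats, signal 22) before the controller gives up. A minimal sketch of tallying them is below; it assumes the log has been saved as plain text. The pattern names and the `tally` helper are hypothetical and not part of BOINC or the CPDN client; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Hypothetical labels; the regexes match message texts seen in the stderr above.
PATTERNS = {
    "suspend": re.compile(r"Suspend request from BOINC"),
    "model_crash": re.compile(r"Model crash detected"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "signal_22": re.compile(r"Signal 22 received"),
    "gave_up": re.compile(r"too many model crashes"),
}


def tally(stderr_text):
    """Count how many lines of stderr_text match each pattern."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A short excerpt, one line of each kind, copied from the log above.
sample = """\
Suspended CPDN Monitor - Suspend request from BOINC...
Model crash detected, will try to restart...
21:04:07 (3864): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Sorry, too many model crashes! :-(
"""

if __name__ == "__main__":
    for name, count in sorted(tally(sample).items()):
        print(f"{name}: {count}")
```

Running `tally` over the full stderr text instead of `sample` would give the per-message totals for this task.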
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
20 Sep 2013 01:21:37  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   466,560         628,629            1.3474
16 Sep 2013 00:18:33  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   440,640         593,812            1.3476
14 Sep 2013 01:20:05  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   414,720         558,497            1.3467
09 Sep 2013 00:16:11  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   388,800         522,827            1.3447
05 Sep 2013 03:36:56  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   362,880         487,493            1.3434
02 Sep 2013 14:26:37  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   336,960         452,084            1.3417
31 Aug 2013 21:30:34  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   311,040         416,968            1.3406
30 Aug 2013 01:03:01  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   285,120         383,284            1.3443
16 Aug 2013 01:32:52  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   259,200         348,828            1.3458
15 Aug 2013 22:16:51  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   233,280         313,914            1.3457
15 Aug 2013 22:16:51  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   207,360         279,248            1.3467
15 Aug 2013 22:16:51  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   181,440         244,063            1.3451
15 Aug 2013 22:16:51  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   155,520         209,312            1.3459
26 Jul 2013 01:49:38  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   129,600         174,656            1.3477
23 Jul 2013 21:58:39  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0   103,680         139,974            1.3501
23 Jul 2013 20:09:16  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0    77,760         105,531            1.3571
23 Jul 2013 20:09:16  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0    51,840          70,340            1.3569
09 Jul 2013 00:08:10  1261147  15865084   hadcm3n_o7cq_1980_40_008396268_0    25,920          35,110            1.3546
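
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the first and last rows; the helper names `parse_int` and `average_sec_per_ts` are hypothetical, and the formula is inferred from the table values rather than taken from the project's documentation.

```python
def parse_int(field):
    """Convert a comma-grouped number such as '466,560' to an int."""
    return int(field.replace(",", ""))


def average_sec_per_ts(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by cumulative timesteps (assumed formula)."""
    return cpu_time_sec / timestep


# (Timestep, CPU Time (sec), reported Average) copied from the newest and oldest rows.
rows = [
    ("466,560", "628,629", "1.3474"),
    ("25,920", "35,110", "1.3546"),
]

# Recompute each average to four decimal places, the precision the table uses.
computed = [
    f"{average_sec_per_ts(parse_int(cpu), parse_int(ts)):.4f}"
    for ts, cpu, _ in rows
]

if __name__ == "__main__":
    for (ts, cpu, reported), value in zip(rows, computed):
        print(f"timestep {ts}: computed {value}, reported {reported}")
```

For both rows the recomputed value matches the reported one to four decimal places.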

