Task 14028468

Name hadcm3n_yb6v_1940_40_007742727_3
Workunit 7897835
Created 29 Jan 2012, 17:28:21 UTC
Sent 29 Jan 2012, 17:28:27 UTC
Report deadline 30 Apr 2012, 0:55:38 UTC
Received 17 May 2012, 22:15:23 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1140429
Run time 16 days 15 hours 17 min 33 sec
CPU time 16 days 3 hours 44 min 5 sec
Validate state Invalid
Credit 9,642.24
Device peak FLOPS 2.29 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
21:14:36 (6604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:32:14 (6404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8008, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4888, iMonCtr=1
Model crash detected, will try to restart...
11:30:09 (5768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
20:36:47 (9296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:15:46 (7376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:15:47 (7376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
C22:15:43 (8840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:25:48 (3228): No heartbeat from core client for 30 sec - exiting
23:25:49 (3228): No heartbeat from core client for 30 sec - exiting
23:25:50 (3228): No heartbeat from core client for 30 sec - exiting
23:25:51 (3228): No heartbeat from core client for 30 sec - exiting
23:25:52 (3228): No heartbeat from core client for 30 sec - exiting
23:25:53 (3228): No heartbeat from core client for 30 sec - exiting
23:25:54 (3228): No heartbeat from core client for 30 sec - exiting
23:25:55 (3228): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:25:56 (3228): No heartbeat from core client for 30 sec - exiting
19:53:36 (7256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/yb6vko.pjf6c10
Error converting file to netcdf: dataout/yb6vko.pif6c10
Error converting file to netcdf: dataout/yb6vko.pff6c10
Error converting file to netcdf: dataout/yb6vka.phf6c10
Error converting file to netcdf: dataout/yb6vka.pgf6c10
Error converting file to netcdf: dataout/yb6vka.pef6c10
Error converting file to netcdf: dataout/yb6vka.pdf6c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=1
Model crash detected, will try to restart...
21:02:10 (9424): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:01:56 (9832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6732, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3752, iMonCtr=1
Model crash detected, will try to restart...
08:24:24 (5996): No heartbeat from core client for 30 sec - exiting
08:24:25 (5996): No heartbeat from core client for 30 sec - exiting
08:24:26 (5996): No heartbeat from core client for 30 sec - exiting
08:24:27 (5996): No heartbeat from core client for 30 sec - exiting
08:24:28 (5996): No heartbeat from core client for 30 sec - exiting
08:24:29 (5996): No heartbeat from core client for 30 sec - exiting
08:24:30 (5996): No heartbeat from core client for 30 sec - exiting
08:24:31 (5996): No heartbeat from core client for 30 sec - exiting
08:24:32 (5996): No heartbeat from core client for 30 sec - exiting
08:24:33 (5996): No heartbeat from core client for 30 sec - exiting
08:24:34 (5996): No heartbeat from core client for 30 sec - exiting
08:24:35 (5996): No heartbeat from core client for 30 sec - exiting
08:24:36 (5996): No heartbeat from core client for 30 sec - exiting
08:24:37 (5996): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:43:11 (4272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4900, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8892, iMonCtr=1
Model crash detected, will try to restart...
20:38:30 (5392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:51:04 (5884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:52:48 (3024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5976, iMonCtr=1
Model crash detected, will try to restart...
23:23:48 (7780): No heartbeat from core client for 30 sec - exiting
23:23:59 (7780): No heartbeat from core client for 30 sec - exiting
23:24:01 (7780): No heartbeat from core client for 30 sec - exiting
23:24:02 (7780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CCPDN Monitor - Quit request from BOINC...
21:02:14 (996): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4776, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3888, iMonCtr=1
Model crash detected, will try to restart...
21:59:47 (7400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
21:57:51 (2420): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:57:52 (2420): No heartbeat from core client for 30 sec - exiting

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
17 May 2012 16:51:52  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  803,520   1,377,294       1.7141
13 May 2012 19:15:08  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  777,600   1,331,286       1.7120
12 May 2012 17:22:43  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  751,680   1,287,692       1.7131
08 May 2012 10:45:27  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  725,760   1,243,829       1.7138
05 May 2012 20:17:18  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  699,840   1,199,254       1.7136
29 Apr 2012 13:23:15  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  673,920   1,153,220       1.7112
23 Apr 2012 18:55:56  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  648,000   1,108,987       1.7114
21 Apr 2012 18:53:37  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  622,080   1,064,349       1.7110
16 Apr 2012 12:38:36  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  596,160   1,018,359       1.7082
14 Apr 2012 20:19:45  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  570,240   972,973         1.7063
13 Apr 2012 21:38:53  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  544,320   927,602         1.7041
12 Apr 2012 20:02:58  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  518,400   882,778         1.7029
11 Apr 2012 10:19:52  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  492,480   839,269         1.7042
10 Apr 2012 12:47:59  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  466,560   795,227         1.7044
08 Apr 2012 17:51:25  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  440,640   752,111         1.7069
03 Apr 2012 16:07:21  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  414,720   706,697         1.7040
25 Mar 2012 20:16:36  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  388,800   662,554         1.7041
23 Mar 2012 20:27:07  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  362,880   618,155         1.7035
18 Mar 2012 21:38:06  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  336,960   573,781         1.7028
05 Mar 2012 21:25:32  1140429  14028468   hadcm3n_yb6v_1940_40_007742727_3  311,040   529,078         1.7010
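
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is an assumption drawn from the column names, not something the page states. A minimal Python sketch that checks it against the three most recent trickle rows (the function name `seconds_per_timestep` is hypothetical):

```python
# Rows copied from the three most recent trickles:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
trickles = [
    (803_520, 1_377_294, 1.7141),
    (777_600, 1_331_286, 1.7120),
    (751_680, 1_287_692, 1.7131),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Flag whether the computed value agrees with the reported column.
    matches = abs(computed - reported) < 1e-9
    print(f"{timestep:>8,}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

If the assumption holds, every row prints `match=True`; a mismatch would suggest the column is computed differently, for example per trickle interval rather than cumulatively.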

