Name | hadcm3n_3dq8_1980_40_008367602_4 |
Workunit | 8518461 |
Created | 30 Nov 2013, 1:06:10 UTC |
Sent | 30 Nov 2013, 1:06:31 UTC |
Report deadline | 1 Mar 2014, 8:33:42 UTC |
Received | 7 Dec 2013, 23:19:43 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1290855 |
Run time | 4 days 15 hours 37 min 6 sec |
CPU time | 4 days 14 hours 40 min 10 sec |
Validate state | Invalid |
Credit | 3,732.48 |
Device peak FLOPS | 3.21 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>7.1.0</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> 20:10:19 (18790): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:14:58 (18817): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:20:56 (18855): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:28:41 (18892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:34:52 (18938): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 21:33:34 (19149): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:50:57 (19180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:00:41 (19234): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:02:21 (19276): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:06:45 (19473): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 12:07:55 (21640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:11:12 (21766): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:13:49 (21788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:28:54 (21813): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:29:11 (21813): No heartbeat from core client for 30 sec - exiting 12:29:12 (21813): No heartbeat from core client for 30 sec - exiting 12:29:13 (21813): No heartbeat from core client for 30 sec - exiting 12:29:14 (21813): No heartbeat from core client for 30 sec - exiting 12:29:15 (21813): No heartbeat from core client for 30 sec - exiting 12:29:16 (21813): No heartbeat from core client for 30 sec - exiting 12:29:17 (21813): No heartbeat from core client for 30 sec - exiting 12:29:18 (21813): No heartbeat from core client for 30 sec - exiting 12:29:19 (21813): No heartbeat from core client for 30 sec - exiting 12:55:19 (21870): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:57:49 (21949): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:57:50 (21949): No heartbeat from core client for 30 sec - exiting 12:57:51 (21949): No heartbeat from core client for 30 sec - exiting 12:57:52 (21949): No heartbeat from core client for 30 sec - exiting 13:06:06 (21964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:22:49 (22052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:24:56 (22115): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:27:28 (22137): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:51:07 (22156): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:13:44 (22329): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:16:28 (22508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 15:21:12 (22876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:53:24 (32758): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:14:19 (472): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:24:12 (607): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:48:35 (688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:57:34 (849): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:04:37 (913): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 17:34:06 (1098): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:36:24 (1249): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:36:25 (1249): No heartbeat from core client for 30 sec - exiting 17:36:26 (1249): No heartbeat from core client for 30 sec - exiting 17:36:27 (1249): No heartbeat from core client for 30 sec - exiting 17:36:28 (1249): No heartbeat from core client for 30 sec - exiting 17:36:29 (1249): No heartbeat from core client for 30 sec - exiting SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7783400] [0xf7783430] /usr/lib/libc.so.6(gsignal+0x46)[0xf758e936] /usr/lib/libc.so.6(abort+0x143)[0xf7590173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf7579963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf777e400] [0xf777e430] /usr/lib/libc.so.6(gsignal+0x46)[0xf7589936] /usr/lib/libc.so.6(abort+0x143)[0xf758b173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf7574963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf76e5400] [0xf76e5430] /usr/lib/libc.so.6(gsignal+0x46)[0xf74f0936] /usr/lib/libc.so.6(abort+0x143)[0xf74f2173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf74db963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7792400] [0xf7792430] /usr/lib/libc.so.6(gsignal+0x46)[0xf759d936] /usr/lib/libc.so.6(abort+0x143)[0xf759f173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf7588963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7727400] [0xf7727430] /usr/lib/libc.so.6(gsignal+0x46)[0xf7532936] /usr/lib/libc.so.6(abort+0x143)[0xf7534173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf751d963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (9 frames): /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf76e7400] [0xf76e7430] /usr/lib/libc.so.6(gsignal+0x46)[0xf74f2936] /usr/lib/libc.so.6(abort+0x143)[0xf74f4173] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /usr/local/ondrejch/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /usr/lib/libc.so.6(__libc_start_main+0xf3)[0xf74dd963] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1274, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
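
The stderr output above mixes a few recurring message types: heartbeat timeouts, quit and suspend requests from BOINC, and SIGABRT crashes followed by restart attempts. As a rough aid for reading such a log, the sketch below tallies each type. The function name `summarize_stderr`, the category labels, and the abbreviated `sample` string are illustrative assumptions rather than part of BOINC or CPDN; only the matched message texts come from the log itself.

```python
import re
from collections import Counter

# Message texts copied from the stderr output; the dictionary keys are
# hypothetical labels chosen for this sketch.
PATTERNS = {
    "heartbeat_timeout": re.compile(r"No heartbeat from core client for 30 sec - exiting"),
    "quit_request": re.compile(r"CPDN Monitor - Quit request from BOINC"),
    "suspend_request": re.compile(r"CPDN Monitor - Suspend request from BOINC"),
    "sigabrt": re.compile(r"SIGABRT: abort called"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
}


def summarize_stderr(text: str) -> Counter:
    """Count how often each known message type occurs in a stderr dump."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(text))
    return counts


# Abbreviated sample assembled from messages in the log above
# (the stack trace frames are deliberately left out).
sample = (
    "20:10:19 (18790): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "SIGABRT: abort called "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

for name, count in summarize_stderr(sample).items():
    print(f"{name}: {count}")
```

Running the same function on the full `<stderr_txt>` contents would give a quick picture of how many heartbeat losses preceded the final run of crashes, assuming the log is available as a single string.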
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
05 Dec 2013 19:41:55 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 311,040 | 384,633 | 1.2366 |
05 Dec 2013 10:07:54 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 285,120 | 353,185 | 1.2387 |
05 Dec 2013 00:55:20 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 259,200 | 321,979 | 1.2422 |
04 Dec 2013 16:18:35 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 233,280 | 290,169 | 1.2439 |
04 Dec 2013 06:39:25 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 207,360 | 257,841 | 1.2434 |
03 Dec 2013 21:21:36 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 181,440 | 226,509 | 1.2484 |
03 Dec 2013 11:54:00 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 155,520 | 194,597 | 1.2513 |
03 Dec 2013 02:42:25 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 129,600 | 162,347 | 1.2527 |
02 Dec 2013 09:14:02 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 103,680 | 129,370 | 1.2478 |
01 Dec 2013 20:25:53 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 77,760 | 97,217 | 1.2502 |
01 Dec 2013 10:28:34 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 51,840 | 65,370 | 1.2610 |
01 Dec 2013 00:57:05 | 1290855 | 16100606 | hadcm3n_3dq8_1980_40_008367602_4 | 25,920 | 33,477 | 1.2916 |
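
The page does not define the Average (sec/TS) column, but the figures are consistent with cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the helper name `seconds_per_timestep` is an assumption made for illustration.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds divided by the number of completed timesteps."""
    return cpu_seconds / timesteps


# (timestep, CPU time in seconds, reported average) copied from the trickle table.
rows = [
    (311_040, 384_633, 1.2366),
    (155_520, 194_597, 1.2513),
    (25_920, 33_477, 1.2916),
]

for timesteps, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    # The table shows four decimal places, so allow half a unit in the last digit.
    matches = abs(computed - reported) < 5e-5
    print(f"{timesteps:>7} TS: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```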