Name | hadcm3n_ycff_1980_40_008319496_4 |
Workunit | 8470631 |
Created | 29 Jul 2013, 15:19:56 UTC |
Sent | 30 Jul 2013, 9:21:34 UTC |
Report deadline | 29 Oct 2013, 16:48:45 UTC |
Received | 5 Sep 2013, 8:19:53 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1310137 |
Run time | 4 days 16 hours 14 min 29 sec |
CPU time | 4 days 13 hours 28 min 6 sec |
Validate state | Invalid |
Credit | 2,488.32 |
Device peak FLOPS | 2.94 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>7.0.65</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... 17:14:40 (2418): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:14:41 (2418): No heartbeat from core client for 30 sec - exiting 17:14:42 (2418): No heartbeat from core client for 30 sec - exiting 17:14:43 (2418): No heartbeat from core client for 30 sec - exiting 17:14:44 (2418): No heartbeat from core client for 30 sec - exiting 17:14:45 (2418): No heartbeat from core client for 30 sec - exiting 17:14:46 (2418): No heartbeat from core client for 30 sec - exiting 17:14:47 (2418): No heartbeat from core client for 30 sec - exiting 17:14:48 (2418): No heartbeat from core client for 30 sec - exiting 17:14:49 (2418): No heartbeat from core client for 30 sec - exiting 17:14:50 (2418): No heartbeat from core client for 30 sec - exiting 17:14:51 (2418): No heartbeat from core client for 30 sec - exiting 17:14:52 (2418): No heartbeat from core client for 30 sec - exiting 17:14:53 (2418): No heartbeat from core client for 30 sec - exiting 17:14:54 (2418): No heartbeat from core client for 30 sec - exiting 17:14:55 (2418): No heartbeat from core client for 30 sec - exiting 17:14:56 (2418): No heartbeat from core client for 30 sec - exiting 17:14:57 (2418): No heartbeat from core client for 30 sec - exiting 17:14:58 (2418): No heartbeat from core client for 30 sec - exiting 17:14:59 (2418): No heartbeat from core client for 30 sec - exiting 17:15:00 (2418): No heartbeat from core client for 30 sec - exiting 17:15:01 (2418): No heartbeat from core client for 30 sec - exiting 17:15:02 (2418): No heartbeat from core client for 30 sec - exiting 17:15:03 (2418): No heartbeat from core client for 30 sec - exiting 17:15:04 (2418): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on 
atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... 03:00:19 (1689): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:00:21 (1689): No heartbeat from core client for 30 sec - exiting 03:00:22 (1689): No heartbeat from core client for 30 sec - exiting 03:00:23 (1689): No heartbeat from core client for 30 sec - exiting 03:00:24 (1689): No heartbeat from core client for 30 sec - exiting 03:00:25 (1689): No heartbeat from core client for 30 sec - exiting 03:00:26 (1689): No heartbeat from core client for 30 sec - exiting 03:00:27 (1689): No heartbeat from core client for 30 sec - exiting 03:00:28 (1689): No heartbeat from core client for 30 sec - exiting 03:00:29 (1689): No heartbeat from core client for 30 sec - exiting 03:00:30 (1689): No heartbeat from core client for 30 sec - exiting 03:00:31 (1689): No heartbeat from core client for 30 sec - exiting 03:00:32 (1689): No heartbeat from core client for 30 sec - exiting 03:00:33 (1689): No heartbeat from core client for 30 sec - exiting 03:00:34 (1689): No heartbeat from core client for 30 sec - exiting 03:00:35 (1689): No heartbeat from core client for 30 sec - exiting 03:00:36 (1689): No heartbeat from core client for 30 sec - exiting 03:00:37 (1689): No heartbeat from core client for 30 sec - exiting 03:00:38 (1689): No heartbeat from core client for 30 sec - exiting 03:00:39 (1689): No heartbeat from core client for 30 sec - exiting 03:00:40 (1689): No heartbeat from core client for 30 sec - exiting 03:00:41 (1689): No heartbeat from core client for 30 sec - exiting SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7783400] [0xf7783425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf756e1df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf7571825] 
/home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75594d3] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf77f0400] [0xf77f0425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf75d61df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf75d9825] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75c14d3] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7770400] [0xf7770425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf75561df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf7559825] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75414d3] Exiting... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf775c400] [0xf775c425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf75461df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf7549825] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75314d3] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7787400] [0xf7787425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf75961df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf7599825] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75814d3] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (9 frames): /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7774400] [0xf7774425] /lib/i386-linux-gnu/libc.so.6(gsignal+0x4f)[0xf755e1df] /lib/i386-linux-gnu/libc.so.6(abort+0x175)[0xf7561825] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/cpdn/BOINC/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xf75494d3] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17139, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
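The stderr dump above repeats a handful of messages many times: heartbeat timeouts, SIGABRT stack traces, restart attempts, and a final give-up line. A minimal Python sketch for tallying them is shown here. The function name `summarize_stderr` and the shortened `sample` string are illustrative assumptions, not part of BOINC or the CPDN application; the matched phrases are copied from the log.

```python
def summarize_stderr(text: str) -> dict:
    """Count recurring events in a CPDN stderr dump (phrases taken from the log above)."""
    return {
        # One line is logged for every second without a heartbeat from the BOINC client.
        "heartbeat_timeouts": text.count("No heartbeat from core client"),
        # Each model abort is reported with this marker before its stack trace.
        "sigabrt": text.count("SIGABRT: abort called"),
        # The controller logs this each time it tries to restart the model.
        "restart_attempts": text.count("Model crash detected, will try to restart"),
        # Final message once the controller stops retrying.
        "gave_up": "too many model crashes" in text,
    }


# Shortened, hypothetical excerpt; paste the full <stderr_txt> contents for a real tally.
sample = (
    "03:00:41 (1689): No heartbeat from core client for 30 sec - exiting "
    "SIGABRT: abort called Stack trace (9 frames): Exiting... "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

print(summarize_stderr(sample))
```

Plain substring counts are enough here because the messages are fixed strings; a regular expression would only be needed to pull out the timestamps or process IDs.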
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
05 Sep 2013 04:07:02 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 207,360 | 380,340 | 1.8342 |
04 Sep 2013 21:40:32 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 181,440 | 333,512 | 1.8381 |
04 Sep 2013 03:14:18 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 155,520 | 286,491 | 1.8421 |
03 Sep 2013 09:25:47 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 129,600 | 239,661 | 1.8492 |
02 Sep 2013 20:05:02 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 103,680 | 192,675 | 1.8584 |
02 Sep 2013 06:38:16 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 77,760 | 145,631 | 1.8728 |
01 Sep 2013 17:22:45 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 51,840 | 98,701 | 1.9040 |
01 Sep 2013 00:06:23 | 1286052 | 15910981 | hadcm3n_ycff_1980_40_008319496_4 | 25,920 | 50,459 | 1.9467 |
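
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the numbers rather than stated on the page. A minimal Python sketch, assuming it holds, recomputes the value for three rows of the table; the function name `seconds_per_timestep` is illustrative.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 decimal places."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
rows = [
    (207_360, 380_340, 1.8342),
    (181_440, 333_512, 1.8381),
    (25_920, 50_459, 1.9467),
]

for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>7} steps: computed {computed:.4f}, reported {reported:.4f}")
```

Because both inputs are running totals, the figure is an average over the whole run so far, not the speed of the most recent block of 25,920 timesteps.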