Name | hadcm3n_yjxj_1900_40_007358689_0 |
Workunit | 7556119 |
Created | 6 Jul 2011, 15:01:26 UTC |
Sent | 8 Jul 2011, 6:14:51 UTC |
Report deadline | 7 Oct 2011, 13:42:02 UTC |
Received | 15 Jul 2011, 14:41:40 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1372513 |
Run time | 4 days 15 hours 35 min 1 sec |
CPU time | 4 days 14 hours 14 min 14 sec |
Validate state | Invalid |
Credit | 1,866.24 |
Device peak FLOPS | 1.56 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:55:10 (2495): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:55:11 (2495): No heartbeat from core client for 30 sec - exiting 00:55:12 (2495): No heartbeat from core client for 30 sec - exiting 00:55:13 (2495): No heartbeat from core client for 30 sec - exiting 00:55:14 (2495): No heartbeat from core client for 30 sec - exiting 00:55:15 (2495): No heartbeat from core client for 30 sec - exiting 00:55:16 (2495): No heartbeat from core client for 30 sec - exiting 00:55:17 (2495): No heartbeat from core client for 30 sec - exiting 00:55:18 (2495): No heartbeat from core client for 30 sec - exiting 00:55:19 (2495): No heartbeat from core client for 30 sec - exiting 00:55:20 (2495): No heartbeat from core client for 30 sec - exiting 00:55:21 (2495): No heartbeat from core client for 30 sec - exiting 00:55:22 (2495): No heartbeat from core client for 30 sec - exiting 00:55:23 (2495): No heartbeat from core client for 30 sec - exiting 00:55:24 (2495): No heartbeat from core client for 30 sec - exiting 00:55:25 (2495): No heartbeat from core client for 30 sec - exiting 00:55:26 (2495): No heartbeat from core client for 30 sec - exiting 00:55:27 (2495): No heartbeat from core client for 30 sec - exiting 00:55:28 (2495): No heartbeat from core client for 30 sec - exiting 00:55:29 (2495): No heartbeat from core client for 30 sec - exiting 00:55:30 (2495): No heartbeat from core client for 30 sec - exiting 00:55:31 (2495): No heartbeat from core client for 30 sec - exiting 00:55:32 (2495): No heartbeat from core 
client for 30 sec - exiting 00:55:33 (2495): No heartbeat from core client for 30 sec - exiting 00:55:34 (2495): No heartbeat from core client for 30 sec - exiting 00:55:35 (2495): No heartbeat from core client for 30 sec - exiting 00:55:36 (2495): No heartbeat from core client for 30 sec - exiting 00:55:37 (2495): No heartbeat from core client for 30 sec - exiting 00:55:38 (2495): No heartbeat from core client for 30 sec - exiting 00:55:39 (2495): No heartbeat from core client for 30 sec - exiting 00:55:40 (2495): No heartbeat from core client for 30 sec - exiting 00:55:41 (2495): No heartbeat from core client for 30 sec - exiting 00:55:42 (2495): No heartbeat from core client for 30 sec - exiting 00:55:43 (2495): No heartbeat from core client for 30 sec - exiting 00:55:44 (2495): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold 00:59:13 (19512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:00:20 (19606): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 01:40:45 (28602): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:40:46 (28602): No heartbeat from core client for 30 sec - exiting 01:40:47 (28602): No heartbeat from core client for 30 sec - exiting 01:40:48 (28602): No heartbeat from core client for 30 sec - exiting 01:40:49 (28602): No heartbeat from core client for 30 sec - exiting 01:40:50 (28602): No heartbeat from core client for 30 sec - exiting 01:40:51 (28602): No heartbeat from core client for 30 sec - exiting 01:40:52 (28602): No heartbeat from core client for 30 sec - exiting 01:40:53 (28602): No heartbeat from core client for 30 sec - exiting 01:40:54 (28602): No heartbeat from core client for 30 sec - exiting 01:40:55 (28602): No heartbeat from core client for 30 sec - exiting 01:40:56 (28602): No heartbeat from core client for 30 sec - exiting 01:40:57 (28602): No heartbeat from core client for 30 sec - exiting 01:40:58 (28602): No heartbeat from core client for 30 sec - exiting 01:40:59 (28602): No heartbeat from core client for 30 sec - exiting 01:41:00 (28602): No heartbeat from core client for 30 sec - exiting 01:41:01 (28602): No heartbeat from core client for 30 sec - exiting 01:41:02 (28602): No heartbeat from core client for 30 sec - exiting 01:41:03 (28602): No heartbeat from core client for 30 sec - exiting 01:41:04 (28602): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... 00:50:02 (16786): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:50:03 (16786): No heartbeat from core client for 30 sec - exiting 00:50:04 (16786): No heartbeat from core client for 30 sec - exiting 00:50:05 (16786): No heartbeat from core client for 30 sec - exiting 00:50:06 (16786): No heartbeat from core client for 30 sec - exiting 00:50:07 (16786): No heartbeat from core client for 30 sec - exiting 00:50:08 (16786): No heartbeat from core client for 30 sec - exiting 00:50:09 (16786): No heartbeat from core client for 30 sec - exiting 00:50:10 (16786): No heartbeat from core client for 30 sec - exiting 00:50:11 (16786): No heartbeat from core client for 30 sec - exiting 00:50:12 (16786): No heartbeat from core client for 30 sec - exiting 00:50:13 (16786): No heartbeat from core client for 30 sec - exiting 00:50:14 (16786): No heartbeat from core client for 30 sec - exiting 00:50:15 (16786): No heartbeat from core client for 30 sec - exiting 00:50:16 (16786): No heartbeat from core client for 30 sec - exiting 00:50:17 (16786): No heartbeat from core client for 30 sec - exiting 00:50:18 (16786): No heartbeat from core client for 30 sec - exiting 00:50:19 (16786): No heartbeat from core client for 30 sec - exiting 00:50:20 (16786): No heartbeat from core client for 30 sec - exiting 00:50:21 (16786): No heartbeat from core client for 30 sec - exiting 00:50:22 (16786): No heartbeat from core client for 30 sec - exiting 00:50:23 (16786): No heartbeat from core client for 30 sec - exiting 00:53:34 (9153): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:53:35 (9153): No heartbeat from core client for 30 sec - exiting 00:53:36 (9153): No heartbeat from core client for 30 sec - exiting 00:53:37 (9153): No heartbeat from core client for 30 sec - exiting 00:53:38 (9153): No heartbeat from core client for 30 sec - exiting 00:53:39 (9153): No heartbeat from core client for 30 sec - exiting 00:53:41 (9153): No heartbeat from core client for 30 sec - exiting 00:53:42 (9153): No heartbeat from core client for 30 sec - exiting 00:53:43 (9153): No heartbeat from core client for 30 sec - exiting 00:53:44 (9153): No heartbeat from core client for 30 sec - exiting 00:53:45 (9153): No heartbeat from core client for 30 sec - exiting 00:53:46 (9153): No heartbeat from core client for 30 sec - exiting 00:54:38 (9190): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:54:41 (9190): No heartbeat from core client for 30 sec - exiting 00:59:55 (9237): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 00:54:03 (16414): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:54:04 (16414): No heartbeat from core client for 30 sec - exiting 00:54:05 (16414): No heartbeat from core client for 30 sec - exiting 00:54:06 (16414): No heartbeat from core client for 30 sec - exiting 00:54:07 (16414): No heartbeat from core client for 30 sec - exiting 00:54:08 (16414): No heartbeat from core client for 30 sec - exiting 00:54:09 (16414): No heartbeat from core client for 30 sec - exiting 00:54:10 (16414): No heartbeat from core client for 30 sec - exiting 00:54:11 (16414): No heartbeat from core client for 30 sec - exiting 00:54:12 (16414): No heartbeat from core client for 30 sec - exiting 00:54:13 (16414): No heartbeat from core client for 30 sec - exiting 00:54:14 (16414): No heartbeat from core client for 30 sec - exiting 00:54:15 (16414): No heartbeat from core client for 30 sec - exiting 00:54:16 (16414): No heartbeat from core client for 30 sec - exiting 00:54:17 (16414): No heartbeat from core client for 30 sec - exiting 00:54:18 (16414): No heartbeat from core client for 30 sec - exiting 00:54:19 (16414): No heartbeat from core client for 30 sec - exiting 00:54:20 (16414): No heartbeat from core client for 30 sec - exiting 00:54:21 (16414): No heartbeat from core client for 30 sec - exiting 00:54:22 (16414): No heartbeat from core client for 30 sec - exiting 00:54:23 (16414): No heartbeat from core client for 30 sec - exiting 00:54:24 (16414): No heartbeat from core client for 30 sec - exiting 00:54:25 (16414): No heartbeat from core client for 30 sec - exiting 00:54:26 (16414): No heartbeat from core client for 30 sec - exiting 00:54:27 (16414): No heartbeat from core client for 30 sec - exiting 00:54:28 (16414): No heartbeat from core client for 30 sec - exiting 00:54:29 (16414): No heartbeat from core client for 30 sec - exiting 00:54:30 (16414): No heartbeat from core client for 30 sec - exiting 00:54:31 (16414): No heartbeat from core client for 30 sec - exiting 00:54:32 (16414): No heartbeat from core client for 30 sec - 
exiting 00:54:33 (16414): No heartbeat from core client for 30 sec - exiting 00:54:34 (16414): No heartbeat from core client for 30 sec - exiting 00:55:08 (31028): No heartbeat from core client for 30 sec - exiting 00:55:09 (31028): No heartbeat from core client for 30 sec - exiting 00:55:10 (31028): No heartbeat from core client for 30 sec - exiting 00:55:11 (31028): No heartbeat from core client for 30 sec - exiting 00:55:12 (31028): No heartbeat from core client for 30 sec - exiting 00:55:13 (31028): No heartbeat from core client for 30 sec - exiting 00:55:14 (31028): No heartbeat from core client for 30 sec - exiting 00:55:15 (31028): No heartbeat from core client for 30 sec - exiting 00:55:16 (31028): No heartbeat from core client for 30 sec - exiting 00:55:17 (31028): No heartbeat from core client for 30 sec - exiting 00:55:18 (31028): No heartbeat from core client for 30 sec - exiting 00:55:32 (31028): No heartbeat from core client for 30 sec - exiting 00:55:33 (31028): No heartbeat from core client for 30 sec - exiting 00:55:34 (31028): No heartbeat from core client for 30 sec - exiting 00:55:35 (31028): No heartbeat from core client for 30 sec - exiting 00:55:36 (31028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 00:57:58 (31050): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:58:11 (31050): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold 00:35:43 (32438): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:50:49 (28211): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:57:01 (28393): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:57:33 (28393): No heartbeat from core client for 30 sec - exiting 00:57:34 (28393): No heartbeat from core client for 30 sec - exiting 00:57:35 (28393): No heartbeat from core client for 30 sec - exiting 00:57:36 (28393): No heartbeat from core client for 30 sec - exiting 00:57:37 (28393): No heartbeat from core client for 30 sec - exiting 00:57:38 (28393): No heartbeat from core client for 30 sec - exiting 00:57:39 (28393): No heartbeat from core client for 30 sec - exiting 00:57:40 (28393): No heartbeat from core client for 30 sec - exiting 00:57:41 (28393): No heartbeat from core client for 30 sec - exiting 00:57:42 (28393): No heartbeat from core client for 30 sec - exiting 00:57:43 (28393): No heartbeat from core client for 30 sec - exiting 00:57:44 (28393): No heartbeat from core client for 30 sec - exiting 00:57:45 (28393): No heartbeat from core client for 30 sec - exiting 00:57:46 (28393): No heartbeat from core client for 30 sec - exiting 00:57:47 (28393): No heartbeat from core client for 30 sec - exiting 00:57:48 (28393): No heartbeat from core client for 30 sec - exiting 00:13:06 (28511): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:13:07 (28511): No heartbeat from core client for 30 sec - exiting 00:13:08 (28511): No heartbeat from core client for 30 sec - exiting 00:13:09 (28511): No heartbeat from core client for 30 sec - exiting 00:13:10 (28511): No heartbeat from core client for 30 sec - exiting 00:13:11 (28511): No heartbeat from core client for 30 sec - exiting 00:13:12 (28511): No heartbeat from core client for 30 sec - exiting 00:13:13 (28511): No heartbeat from core client for 30 sec - exiting 00:50:36 (19890): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
00:50:37 (19890): No heartbeat from core client for 30 sec - exiting 00:50:38 (19890): No heartbeat from core client for 30 sec - exiting 00:50:39 (19890): No heartbeat from core client for 30 sec - exiting 00:50:40 (19890): No heartbeat from core client for 30 sec - exiting 00:50:41 (19890): No heartbeat from core client for 30 sec - exiting 00:50:42 (19890): No heartbeat from core client for 30 sec - exiting 00:50:43 (19890): No heartbeat from core client for 30 sec - exiting SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7776400] [0xf7776430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf75fe751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf7601b82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf75eac76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7769400] [0xf7769430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf75f1751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf75f4b82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf75ddc76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf77b6400] [0xf77b6430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf763e751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf7641b82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf762ac76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf777a400] [0xf777a430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf7602751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf7605b82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf75eec76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7722400] [0xf7722430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf75aa751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf75adb82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf7596c76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... 
SIGABRT: abort called Stack trace (10 frames): /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x840da8f] [0xf7714400] [0xf7714430] /lib/i686/cmov/libc.so.6(gsignal+0x51)[0xf759c751] /lib/i686/cmov/libc.so.6(abort+0x182)[0xf759fb82] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x83400c3] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x838f395] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x839bdf8] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf7588c76] /home/mugurel/boinc.data/projects/climateprediction.net/hadcm3n_um_6.07_i686-pc-linux-gnu[0x804cb11] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=20233, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
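The Stderr field above repeats a handful of message types many times, so a tally is easier to read than the raw text. The following is a minimal sketch, assuming the log is available as one string; `sample_stderr` is a hypothetical, shortened excerpt (the stack trace is elided), and the pattern names are labels chosen for illustration, not part of BOINC or CPDN.

```python
import re
from collections import Counter

# Hypothetical, shortened excerpt in the same shape as the Stderr field.
# The stack trace frames are elided here; this is not the full log.
sample_stderr = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "00:55:10 (2495): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "00:55:11 (2495): No heartbeat from core client for 30 sec - exiting "
    "Atmos Hold Restart file rename failed on atmos_restart.hold "
    "SIGABRT: abort called Stack trace (10 frames): ... Exiting... "
    "Model crash detected, will try to restart... "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

# Illustrative labels mapped to regular expressions for each message type.
PATTERNS = {
    "no_heartbeat": re.compile(
        r"\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client"
    ),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "restart_rename_failed": re.compile(r"Restart file rename failed"),
    "sigabrt": re.compile(r"SIGABRT: abort called"),
    "model_crash": re.compile(r"Model crash detected"),
}


def tally_messages(stderr_text: str) -> Counter:
    """Count how often each known message type occurs in the log text."""
    return Counter(
        {name: len(pattern.findall(stderr_text)) for name, pattern in PATTERNS.items()}
    )


counts = tally_messages(sample_stderr)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Run against the full Stderr text rather than the excerpt, the same function would show how many restarts were attempted before the task gave up with "too many model crashes".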
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Jul 2011 13:04:55 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 155,520 | 359,046 | 2.3087 |
25 Jul 2011 13:04:55 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 129,600 | 301,158 | 2.3238 |
11 Jul 2011 03:26:18 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 103,680 | 240,125 | 2.3160 |
10 Jul 2011 10:37:18 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 77,760 | 181,137 | 2.3294 |
09 Jul 2011 18:19:50 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 51,840 | 119,924 | 2.3133 |
09 Jul 2011 00:46:28 | 486988 | 13121249 | hadcm3n_yjxj_1900_40_007358689_0 | 25,920 | 61,769 | 2.3831 |
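The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below recomputes it from the Timestep and CPU Time values in the table; the function and variable names are illustrative, and the rounding to four decimals is an assumption based on how the column is displayed.

```python
# (timestep, cpu_time_sec) pairs copied from the trickle table, newest first.
trickles = [
    (155_520, 359_046),
    (129_600, 301_158),
    (103_680, 240_125),
    (77_760, 181_137),
    (51_840, 119_924),
    (25_920, 61_769),
]


def average_sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


averages = [average_sec_per_timestep(ts, cpu) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"{ts:>7,} timesteps  {cpu:>7,} s  {avg:.4f} s/TS")
```

The recomputed values agree with the table to the displayed precision, which supports reading the column as a running average over the whole run rather than a per-interval rate.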