Name | hadcm3n_y9wd_1900_40_007523235_0 |
Workunit | 7720710 |
Created | 28 Oct 2011, 13:21:44 UTC |
Sent | 31 Oct 2011, 16:31:42 UTC |
Report deadline | 30 Jan 2012, 23:58:53 UTC |
Received | 29 Nov 2011, 17:34:38 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 784541 |
Run time | 9 days 14 hours 43 min 4 sec |
CPU time | 8 days 4 hours 37 min 10 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.33 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1584, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4008, iMonCtr=1 Model crash detected, will try to restart... 08:03:13 (3080): No heartbeat from core client for 30 sec - exiting 08:03:14 (3080): No heartbeat from core client for 30 sec - exiting 08:03:15 (3080): No heartbeat from core client for 30 sec - exiting 08:03:16 (3080): No heartbeat from core client for 30 sec - exiting 08:03:17 (3080): No heartbeat from core client for 30 sec - exiting 08:03:18 (3080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1 Model crash detected, will try to restart... 20:48:44 (3924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:56:08 (3876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:56:13 (3876): No heartbeat from core client for 30 sec - exiting 18:16:09 (1760): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:32:43 (2596): No heartbeat from core client for 30 sec - exiting 17:32:44 (2596): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:44:11 (3160): No heartbeat from core client for 30 sec - exiting 15:44:18 (3160): No heartbeat from core client for 30 sec - exiting 15:44:19 (3160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:34:24 (1888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
20:35:31 (4036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=1 Model crash detected, will try to restart... 20:49:47 (4076): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:30:50 (3732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:30:54 (3732): No heartbeat from core client for 30 sec - exiting 01:30:55 (3732): No heartbeat from core client for 30 sec - exiting 01:30:56 (3732): No heartbeat from core client for 30 sec - exiting 01:30:57 (3732): No heartbeat from core client for 30 sec - exiting 01:30:58 (3732): No heartbeat from core client for 30 sec - exiting 01:30:59 (3732): No heartbeat from core client for 30 sec - exiting 01:31:00 (3732): No heartbeat from core client for 30 sec - exiting 01:31:01 (3732): No heartbeat from core client for 30 sec - exiting 01:31:02 (3732): No heartbeat from core client for 30 sec - exiting 01:31:03 (3732): No heartbeat from core client for 30 sec - exiting 01:31:04 (3732): No heartbeat from core client for 30 sec - exiting 01:31:05 (3732): No heartbeat from core client for 30 sec - exiting 01:31:06 (3732): No heartbeat from core client for 30 sec - exiting 01:31:07 (3732): No heartbeat from core client for 30 sec - exiting 06:11:53 (3620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:49:11 (3656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:49:13 (3656): No heartbeat from core client for 30 sec - exiting 08:03:44 (3684): No heartbeat from core client for 30 sec - exiting 08:03:45 (3684): No heartbeat from core client for 30 sec - exiting 08:03:46 (3684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:42:09 (4000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:28:28 (1076): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 06:26:17 (3272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:44:04 (3724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:07:55 (4052): No heartbeat from core client for 30 sec - exiting 06:08:07 (4052): No heartbeat from core client for 30 sec - exiting 06:08:08 (4052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:11:06 (4024): No heartbeat from core client for 30 sec - exiting 06:11:13 (4024): No heartbeat from core client for 30 sec - exiting 06:11:15 (4024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:54:42 (1128): No heartbeat from core client for 30 sec - exiting 07:54:49 (1128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:21:46 (2832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:13:22 (1496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:17:40 (2304): No heartbeat from core client for 30 sec - exiting 08:17:41 (2304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:27:27 (2984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:27:29 (2984): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=944, iMonCtr=1 Model crash detected, will try to restart... 
08:20:05 (3368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:07:06 (1292): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:07:09 (1292): No heartbeat from core client for 30 sec - exiting 09:51:40 (3532): No heartbeat from core client for 30 sec - exiting 09:51:41 (3532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:24:08 (1064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:48:18 (3096): No heartbeat from core client for 30 sec - exiting 05:48:28 (3096): No heartbeat from core client for 30 sec - exiting 05:48:30 (3096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:13:06 (3964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=952, iMonCtr=1 Model crash detected, will try to restart... BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=360, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2624, iMonCtr=1 Model crash detected, will try to restart... 17:36:50 (2836): No heartbeat from core client for 30 sec - exiting 17:36:52 (2836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:53:55 (3752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3404, iMonCtr=1 Model crash detected, will try to restart... 05:50:24 (3880): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:22:06 (3616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:06:06 (3576): No heartbeat from core client for 30 sec - exiting 11:06:08 (3576): No heartbeat from core client for 30 sec - exiting 11:06:09 (3576): No heartbeat from core client for 30 sec - exiting 11:06:10 (3576): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1 Model crash detected, will try to restart... 16:28:30 (2756): No heartbeat from core client for 30 sec - exiting 16:28:31 (2756): No heartbeat from core client for 30 sec - exiting 16:28:32 (2756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:28:47 (2572): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:08:25 (1724): No heartbeat from core client for 30 sec - exiting 18:08:26 (1724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:31:45 (4016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:01:14 (3796): No heartbeat from core client for 30 sec - exiting 01:01:26 (3796): No heartbeat from core client for 30 sec - exiting 01:01:27 (3796): No heartbeat from core client for 30 sec - exiting 01:01:28 (3796): No heartbeat from core client for 30 sec - exiting 01:01:29 (3796): No heartbeat from core client for 30 sec - exiting 01:01:30 (3796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:30:16 (2336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:01:30 (2908): No heartbeat from core client for 30 sec - exiting 18:01:34 (2908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:28:32 (3964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
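The stderr text above repeats a handful of message types many times. As a rough aid for reading it, the sketch below tallies those types with regular expressions. The labels in `PATTERNS` and the helper `tally_stderr` are hypothetical names chosen for illustration, not part of BOINC or the CPDN application. The sample string stitches together a few messages copied from the log; it is not a contiguous excerpt.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments that appear in the stderr text.
PATTERNS = {
    "heartbeat_lost": r"No heartbeat from core client for 30 sec - exiting",
    "monitor_no_heartbeat": r"CPDN Monitor - No 'heartbeat' from BOINC",
    "model_restart": r"Model crash detected, will try to restart",
    "buffout_io_error": r"BUFFOUT: C I/O Error - Return code = \d+",
    "signal_received": r"Signal \d+ received, exiting",
}


def tally_stderr(text: str) -> Counter:
    """Count how often each known message type occurs in the stderr text."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(re.findall(pattern, text))
    return counts


# A few messages copied from the log above, joined for illustration only.
sample = (
    "08:03:13 (3080): No heartbeat from core client for 30 sec - exiting "
    "08:03:14 (3080): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "BUFFOUT: C I/O Error - Return code = 32 "
    "Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=3404, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Signal 11 received, exiting... Called boinc_finish"
)

counts = tally_stderr(sample)
for label, n in counts.items():
    print(f"{label}: {n}")
```

Passing the full contents of `<stderr_txt>` instead of `sample` would give per-type totals for the whole run. The patterns cover only the messages seen in this particular log.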
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
29 Nov 2011 16:38:12 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 259,200 | 707,811 | 2.7308 |
26 Nov 2011 18:07:02 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 233,280 | 631,698 | 2.7079 |
24 Nov 2011 12:47:46 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 207,360 | 594,033 | 2.8647 |
20 Nov 2011 09:53:02 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 181,440 | 529,031 | 2.9157 |
16 Nov 2011 17:02:50 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 155,520 | 452,548 | 2.9099 |
15 Nov 2011 22:14:45 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 129,600 | 377,621 | 2.9137 |
15 Nov 2011 22:14:45 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 103,680 | 302,435 | 2.9170 |
08 Nov 2011 13:14:30 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 77,760 | 225,812 | 2.9040 |
06 Nov 2011 11:21:49 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 51,840 | 150,387 | 2.9010 |
04 Nov 2011 23:49:31 | 784541 | 13552164 | hadcm3n_y9wd_1900_40_007523235_0 | 25,920 | 76,525 | 2.9524 |
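The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table. It also converts the Run time and CPU time fields from the task summary into seconds, so they can be compared with the last trickle. The helpers `seconds_per_timestep` and `parse_duration` are hypothetical names introduced here. The division rule is an inference from the numbers, not something the page states.

```python
import re

# (timestep, cpu_seconds, reported_average) copied from three rows of the table.
rows = [
    (259_200, 707_811, 2.7308),
    (233_280, 631_698, 2.7079),
    (25_920, 76_525, 2.9524),
]

# Seconds per unit, for durations written like "8 days 4 hours 37 min 10 sec".
UNIT_SECONDS = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Assumed definition of the Average column: CPU seconds per timestep."""
    return cpu_seconds / timestep


def parse_duration(text: str) -> int:
    """Convert a 'D days H hours M min S sec' string to whole seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(days?|hours?|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit.rstrip("s")]
    return total


for timestep, cpu, reported in rows:
    computed = round(seconds_per_timestep(cpu, timestep), 4)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")

run_seconds = parse_duration("9 days 14 hours 43 min 4 sec")  # Run time field
cpu_seconds = parse_duration("8 days 4 hours 37 min 10 sec")  # CPU time field
print(f"run={run_seconds} s, cpu={cpu_seconds} s, cpu/run={cpu_seconds / run_seconds:.3f}")
```

The CPU time field converts to a value within a few tens of seconds of the CPU time in the final trickle, which suggests the task failed shortly after that trickle was sent.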
©2024 cpdn.org