Name | hadcm3n_t0lb_1940_40_007310849_2 |
Workunit | 7508279 |
Created | 28 Jun 2011, 21:03:56 UTC |
Sent | 28 Jun 2011, 21:08:11 UTC |
Report deadline | 28 Sep 2011, 4:35:22 UTC |
Received | 25 Jul 2011, 16:27:51 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 996482 |
Run time | 22 days 14 hours 29 min 49 sec |
CPU time | 19 days 12 hours 22 min 22 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.34 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.13.1</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> 03:08:46 (1244): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:18:39 (3472): No heartbeat from core client for 30 sec - exiting 03:18:41 (3472): No heartbeat from core client for 30 sec - exiting 03:18:42 (3472): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:16:30 (2784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:16:31 (2784): No heartbeat from core client for 30 sec - exiting 02:17:35 (6412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:31:47 (2764): No heartbeat from core client for 30 sec - exiting 23:31:48 (2764): No heartbeat from core client for 30 sec - exiting 23:31:49 (2764): No heartbeat from core client for 30 sec - exiting 23:31:50 (2764): No heartbeat from core client for 30 sec - exiting 23:31:51 (2764): No heartbeat from core client for 30 sec - exiting 23:31:52 (2764): No heartbeat from core client for 30 sec - exiting 23:31:53 (2764): No heartbeat from core client for 30 sec - exiting 23:31:54 (2764): No heartbeat from core client for 30 sec - exiting 23:31:55 (2764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3396, iMonCtr=1 Model crash detected, will try to restart... 
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: C I/O Error feof - Unit 62 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/t0lbko.pjf0c10 Error converting file to netcdf: dataout/t0lbko.pif0c10 Error converting file to netcdf: dataout/t0lbko.pff0c10 Error converting file to netcdf: dataout/t0lbko.pcf0c10 Error converting file to netcdf: dataout/t0lbko.pbf0c10 Error converting file to netcdf: dataout/t0lbko.paf0c10 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold 03:20:14 (3108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 06:04:06 (2144): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:04:07 (2144): No heartbeat from core client for 30 sec - exiting Ocean Restart file copy failed on t0lbko.dag34s0 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3532, iMonCtr=1 Model crash detected, will try to restart... 08:21:20 (1164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:18:59 (4916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:47:20 (3184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... No Process Handle Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5020, selfPID=5020, iMonCtr=1 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77BC3A93 read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... Cannot serialize file D:\BOINC/projects/climateprediction.net/hadcm3n_t0lb_1940_40_007310849/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
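The Run time and CPU time fields above are given as day/hour/minute/second strings, while the trickle table reports CPU time in plain seconds. As a rough cross-check, the sketch below converts both fields to seconds and computes the share of wall-clock time spent on the CPU. The helper name `duration_to_seconds` is hypothetical and not part of BOINC or CPDN; the two input strings are copied from the fields above.

```python
import re

# Seconds per unit, keyed by the singular unit names used on this page.
_UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text):
    """Hypothetical helper: convert e.g. '22 days 14 hours 29 min 49 sec' to seconds."""
    total = 0
    # Match a number followed by a unit; a trailing plural 's' is simply ignored.
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * _UNIT_SECONDS[unit]
    return total

run_time = duration_to_seconds("22 days 14 hours 29 min 49 sec")  # Run time field
cpu_time = duration_to_seconds("19 days 12 hours 22 min 22 sec")  # CPU time field

# Fraction of elapsed run time that was spent as CPU time.
cpu_fraction = cpu_time / run_time

print(run_time, cpu_time, round(cpu_fraction, 4))
```

The CPU time converted this way (1,686,142 sec) is within a few seconds of the CPU time in the most recent trickle row (1,686,132 sec), which suggests the two figures track the same counter.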
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Jul 2011 23:00:03 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 777,600 | 1,686,132 | 2.1684 |
25 Jul 2011 22:10:43 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 751,680 | 1,635,868 | 2.1763 |
25 Jul 2011 21:44:16 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 725,760 | 1,578,257 | 2.1746 |
25 Jul 2011 19:32:37 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 699,840 | 1,521,917 | 2.1747 |
25 Jul 2011 19:32:36 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 673,920 | 1,465,443 | 2.1745 |
25 Jul 2011 19:07:22 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 648,000 | 1,408,239 | 2.1732 |
25 Jul 2011 18:19:04 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 622,080 | 1,354,067 | 2.1767 |
25 Jul 2011 17:33:52 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 596,160 | 1,296,520 | 2.1748 |
25 Jul 2011 16:05:57 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 570,240 | 1,239,227 | 2.1732 |
25 Jul 2011 15:37:55 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 544,320 | 1,182,301 | 2.1721 |
25 Jul 2011 14:42:30 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 518,400 | 1,125,264 | 2.1706 |
25 Jul 2011 13:16:59 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 492,480 | 1,067,923 | 2.1685 |
25 Jul 2011 13:16:59 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 466,560 | 1,011,336 | 2.1676 |
25 Jul 2011 13:16:59 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 440,640 | 954,017 | 2.1651 |
25 Jul 2011 13:16:58 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 414,720 | 897,123 | 2.1632 |
25 Jul 2011 13:16:58 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 388,800 | 840,087 | 2.1607 |
25 Jul 2011 13:16:57 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 362,880 | 783,846 | 2.1601 |
10 Jul 2011 14:03:55 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 336,960 | 728,052 | 2.1606 |
09 Jul 2011 11:06:07 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 311,040 | 671,839 | 2.1600 |
08 Jul 2011 11:10:41 | 996482 | 13023688 | hadcm3n_t0lb_1940_40_007310849_2 | 285,120 | 615,340 | 2.1582 |
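The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below recomputes it for the three most recent rows as a sanity check. The function name `avg_sec_per_timestep` is hypothetical, and the four-decimal rounding is an assumption made to match the precision shown in the table.

```python
# (timestep, cpu_time_sec, reported_avg) copied from the three most recent trickle rows.
rows = [
    (777_600, 1_686_132, 2.1684),
    (751_680, 1_635_868, 2.1763),
    (725_760, 1_578_257, 2.1746),
]

def avg_sec_per_timestep(cpu_time_sec, timestep):
    """Hypothetical helper: cumulative CPU seconds per model timestep, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)

recomputed = [avg_sec_per_timestep(cpu, ts) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, recomputed):
    # Report each recomputed average next to the value shown in the table.
    print(f"timestep {ts}: recomputed {value:.4f}, reported {reported:.4f}")

# Timestep gap between the two most recent trickles.
step_between_trickles = rows[0][0] - rows[1][0]
print("timesteps between trickles:", step_between_trickles)
```

For these rows the recomputed values agree with the reported column, and consecutive trickles are 25,920 timesteps apart.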
©2024 cpdn.org