Name | hadcm3n_ymdi_1900_40_007361856_1 |
Workunit | 7559286 |
Created | 6 Jul 2011, 15:21:55 UTC |
Sent | 7 Jul 2011, 9:30:34 UTC |
Report deadline | 6 Oct 2011, 16:57:45 UTC |
Received | 17 Sep 2011, 6:36:27 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 896256 |
Run time | 10 days 18 hours 10 min 59 sec |
CPU time | 9 days 21 hours 21 min 31 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 2.78 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-apple-darwin |
Stderr | <core_client_version>6.12.35</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x1821604: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x1821600: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c04: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c00: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c04: incorrect checksum for freed object - object was probably modified after being freed. 
*** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c00: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c04: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c00: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c04: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c00: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c04: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug hadcm3n_6.07_i686-apple-darwin(280,0xa0c6c540) malloc: *** error for object 0x822c00: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
09:19:53 (251): No heartbeat from core client for 30 sec - exiting 09:19:54 (251): No heartbeat from core client for 30 sec - exiting 09:19:55 (251): No heartbeat from core client for 30 sec - exiting 09:19:56 (251): No heartbeat from core client for 30 sec - exiting 09:19:57 (251): No heartbeat from core client for 30 sec - exiting 09:19:58 (251): No heartbeat from core client for 30 sec - exiting 09:19:59 (251): No heartbeat from core client for 30 sec - exiting 09:20:00 (251): No heartbeat from core client for 30 sec - exiting 09:20:01 (251): No heartbeat from core client for 30 sec - exiting 09:20:02 (251): No heartbeat from core client for 30 sec - exiting 09:20:03 (251): No heartbeat from core client for 30 sec - exiting 09:20:04 (251): No heartbeat from core client for 30 sec - exiting 09:20:05 (251): No heartbeat from core client for 30 sec - exiting 09:20:06 (251): No heartbeat from core client for 30 sec - exiting 09:20:07 (251): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 11:36:58 (258): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... execl(/Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_um_6.07_i686-apple-darwin, 139395) failed! Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
16 Aug 2011 06:00:05 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 518,400 | 821,251 | 1.5842 |
15 Aug 2011 12:45:24 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 492,480 | 779,232 | 1.5823 |
09 Aug 2011 16:33:49 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 466,560 | 737,092 | 1.5798 |
06 Aug 2011 10:05:25 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 440,640 | 696,055 | 1.5796 |
05 Aug 2011 22:09:04 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 414,720 | 654,698 | 1.5787 |
04 Aug 2011 13:52:23 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 388,800 | 613,209 | 1.5772 |
04 Aug 2011 01:31:32 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 362,880 | 572,438 | 1.5775 |
03 Aug 2011 08:37:44 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 336,960 | 531,541 | 1.5775 |
02 Aug 2011 16:00:09 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 311,040 | 491,311 | 1.5796 |
01 Aug 2011 13:50:22 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 285,120 | 451,005 | 1.5818 |
31 Jul 2011 03:02:19 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 259,200 | 410,069 | 1.5821 |
30 Jul 2011 10:23:12 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 233,280 | 369,940 | 1.5858 |
25 Jul 2011 22:11:16 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 207,360 | 329,578 | 1.5894 |
25 Jul 2011 21:46:30 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 181,440 | 289,558 | 1.5959 |
25 Jul 2011 20:43:08 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 155,520 | 249,035 | 1.6013 |
25 Jul 2011 16:35:05 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 129,600 | 207,874 | 1.6040 |
25 Jul 2011 15:55:04 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 103,680 | 165,777 | 1.5989 |
10 Jul 2011 22:19:42 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 77,760 | 124,424 | 1.6001 |
10 Jul 2011 04:50:24 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 51,840 | 82,854 | 1.5983 |
09 Jul 2011 10:35:58 | 896256 | 13127584 | hadcm3n_ymdi_1900_40_007361856_1 | 25,920 | 41,331 | 1.5946 |
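
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the trickle table; the function name `sec_per_timestep` and the `rows` list are illustrative and not part of any CPDN or BOINC tooling.

```python
# Three rows copied from the trickle table:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
rows = [
    (518_400, 821_251, 1.5842),
    (259_200, 410_069, 1.5821),
    (25_920, 41_331, 1.5946),
]


def sec_per_timestep(timestep, cpu_seconds):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in rows:
    computed = round(sec_per_timestep(timestep, cpu_seconds), 4)
    # Print the recomputed value next to the value shown in the table.
    print(f"{timestep:>7} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

Under this assumption the recomputed values agree with the reported ones to four decimal places for the sampled rows, which suggests the task ran at a steady rate of roughly 1.58 to 1.60 CPU seconds per timestep until the last trickle on 16 Aug 2011.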