Task 13102879

Name	hadcm3n_ycug_1900_40_007349506_0
Workunit	7546936
Created	6 Jul 2011, 13:58:53 UTC
Sent	17 Jul 2011, 15:08:53 UTC
Report deadline	16 Oct 2011, 22:36:04 UTC
Received	12 Apr 2013, 13:50:51 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION
Computer ID	1116471
Run time	27 days 8 hours 33 min 39 sec
CPU time	27 days 8 hours 33 min 39 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.05 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=724, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3904, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1 Model crash detected, will tController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1368, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4028, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4060, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=1 Model crash detected, will try to restaController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4040, iMonCtr=1 Model crash detected, will try to restart... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2588, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3756, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1476, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4272, iMonCtr=1 Model crash04:39:32 (4792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3528, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=276, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4968, iMonCtr=1 Model crash detected, will try to restart... 
17:36:12 (1888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3720, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3632, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3828, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3552, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 00:10:12 (2644): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:10:19 (2644): No heartbeat from core client for 30 sec - exiting 02:37:57 (5284): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:51:23 (1396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:50:21 (1548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:49:09 (1960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:58:25 (3716): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77ABC3EB write attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x7754E742 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]>
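
The exit status above appears in two forms: the signed decimal -1073741819 and the hex code 0xC0000005. Both are the same 32-bit value. As a minimal sketch (the helper name `to_unsigned32_hex` is hypothetical, not part of BOINC or CPDN), the conversion can be reproduced like this:

```python
def to_unsigned32_hex(code: int) -> str:
    """Render a signed 32-bit exit code as an 8-digit unsigned hex string.

    Hypothetical helper for illustration only.
    """
    # Masking with 0xFFFFFFFF reinterprets the two's-complement value
    # as an unsigned 32-bit integer.
    return f"0x{code & 0xFFFFFFFF:08X}"


# Exit status reported for this task.
print(to_unsigned32_hex(-1073741819))
```

The masked value matches the access-violation code that also appears in the unhandled exception records in the stderr output.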

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
11 Apr 2013 21:30:00	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	777,600	2,362,603	3.0383
03 Apr 2013 09:45:56	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	751,680	2,277,770	3.0302
06 Oct 2012 16:55:11	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	725,760	2,203,590	3.0363
15 Aug 2012 20:27:35	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	699,840	2,121,725	3.0317
14 Aug 2012 18:00:15	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	673,920	2,037,232	3.0230
28 Apr 2012 17:26:39	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	648,000	1,961,246	3.0266
27 Mar 2012 19:41:04	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	622,080	1,878,252	3.0193
28 Feb 2012 22:30:56	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	596,160	1,795,692	3.0121
24 Feb 2012 16:58:10	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	570,240	1,712,412	3.0030
12 Feb 2012 19:12:37	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	544,320	1,628,969	2.9927
05 Feb 2012 19:44:04	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	518,400	1,564,297	3.0175
19 Nov 2011 11:04:19	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	492,480	1,482,709	3.0107
08 Nov 2011 04:47:02	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	466,560	1,402,026	3.0050
31 Oct 2011 18:12:58	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	440,640	1,323,503	3.0036
31 Oct 2011 15:11:44	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	414,720	1,244,127	2.9999
07 Oct 2011 22:02:28	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	388,800	1,165,530	2.9978
14 Sep 2011 21:41:18	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	362,880	1,086,154	2.9931
11 Sep 2011 22:26:35	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	336,960	1,007,804	2.9909
10 Sep 2011 08:15:21	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	311,040	930,752	2.9924
08 Sep 2011 07:54:59	1116471	13102879	hadcm3n_ycug_1900_40_007349506_0	285,120	854,776	2.9980
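
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. This is inferred from the column header and the listed values, not from CPDN documentation. A minimal sketch that checks the inference against three rows copied from the table (the function name is hypothetical):

```python
def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this is how the Average (sec/TS) column is derived.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, average listed in the table)
rows = [
    (777_600, 2_362_603, 3.0383),  # 11 Apr 2013 trickle
    (751_680, 2_277_770, 3.0302),  # 03 Apr 2013 trickle
    (285_120, 854_776, 2.9980),    # 08 Sep 2011 trickle
]

for timestep, cpu_sec, listed in rows:
    computed = avg_sec_per_timestep(cpu_sec, timestep)
    print(f"{timestep:>7} {cpu_sec:>9} computed={computed:.4f} listed={listed:.4f}")
```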