Task 15400725

Name	hadcm3n_o0e5_2060_40_008239538_2
Workunit	8394662
Created	26 Oct 2012, 19:12:40 UTC
Sent	26 Oct 2012, 19:12:42 UTC
Report deadline	26 Jan 2013, 2:39:53 UTC
Received	19 Nov 2012, 20:09:18 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1232359
Run time	6 days 12 hours 33 min 52 sec
CPU time	4 days 21 hours 13 min 29 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	2.68 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4180, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4392, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
19:15:08 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4892, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4412, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4068, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5448, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4276, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4832, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
08:37:12 (3976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:37:14 (3976): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5116, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4628, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77657373 read attempt to address 0xFFFFFFF8
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o0e5_2060_40_008239538/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
19 Nov 2012 20:13:40	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	259,200	422,002	1.6281
17 Nov 2012 12:10:53	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	233,280	379,268	1.6258
14 Nov 2012 20:31:58	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	207,360	337,531	1.6278
10 Nov 2012 16:31:54	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	181,440	294,675	1.6241
06 Nov 2012 20:24:58	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	155,520	251,887	1.6196
04 Nov 2012 14:11:22	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	129,600	210,149	1.6215
03 Nov 2012 10:19:57	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	103,680	167,395	1.6145
01 Nov 2012 16:52:48	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	77,760	124,969	1.6071
31 Oct 2012 18:16:11	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	51,840	84,317	1.6265
28 Oct 2012 11:54:15	1232359	15400725	hadcm3n_o0e5_2060_40_008239538_2	25,920	42,509	1.6400
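
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The page itself does not state this, so treat it as an assumption; the sketch below only checks it against the ten rows shown. The names `TRICKLES`, `sec_per_timestep`, and `mismatches` are hypothetical and are not part of BOINC or CPDN.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
TRICKLES = [
    (259_200, 422_002, 1.6281),
    (233_280, 379_268, 1.6258),
    (207_360, 337_531, 1.6278),
    (181_440, 294_675, 1.6241),
    (155_520, 251_887, 1.6196),
    (129_600, 210_149, 1.6215),
    (103_680, 167_395, 1.6145),
    (77_760, 124_969, 1.6071),
    (51_840, 84_317, 1.6265),
    (25_920, 42_509, 1.6400),
]


def sec_per_timestep(cpu_seconds, timestep):
    """Assumed definition of the Average column: CPU seconds per timestep."""
    return cpu_seconds / timestep


def mismatches(rows, tolerance=6e-5):
    """Return rows whose reported average differs from the recomputed one.

    The tolerance allows for the table rounding to four decimal places
    (half a unit in the last place, plus a little slack).
    """
    return [
        (timestep, cpu, reported)
        for timestep, cpu, reported in rows
        if abs(sec_per_timestep(cpu, timestep) - reported) > tolerance
    ]


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = sec_per_timestep(cpu, timestep)
        print(f"{timestep:>8,}  computed={computed:.4f}  reported={reported:.4f}")
    print("rows that disagree:", mismatches(TRICKLES))
```

If the assumption holds, `mismatches(TRICKLES)` returns an empty list, which is a quick way to spot a mis-transcribed row when copying trickle data by hand.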