Task 13402769

Name	hadcm3n_t2p0_1940_40_007443148_3
Workunit	7640651
Created	20 Sep 2011, 15:21:23 UTC
Sent	20 Sep 2011, 15:22:21 UTC
Report deadline	20 Dec 2011, 22:49:32 UTC
Received	30 Oct 2011, 10:20:10 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	859264
Run time	15 days 2 hours 21 min 59 sec
CPU time	13 days 2 hours 29 min 17 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.77 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
- exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4172, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2768, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4168, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/t2p0ko.pjg2c10
Error converting file to netcdf: dataout/t2p0ko.pig2c10
Error converting file to netcdf: dataout/t2p0ko.pfg2c10
Error converting file to netcdf: dataout/t2p0ka.phg2c10
Error converting file to netcdf: dataout/t2p0ka.pgg2c10
Error converting file to netcdf: dataout/t2p0ka.peg2c10
Error converting file to netcdf: dataout/t2p0ka.pdg2c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4520, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77CD6E0F read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t2p0_1940_40_007443148/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
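
A stderr dump like the one above is easier to read once its recurring messages are tallied. The following is a minimal sketch, not part of BOINC or climateprediction.net tooling: the `PATTERNS` labels, the `tally_messages` helper, and the short `EXCERPT` are illustrative assumptions, and only the message wording is taken from the log.

```python
import re
from collections import Counter

# Labels are hypothetical; the matched wording is copied from the stderr above.
PATTERNS = {
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "suspend": re.compile(r"Suspend request from BOINC"),
    "buffin_io_error": re.compile(r"BUFFIN: C I/O Error feof - Unit \d+"),
    "netcdf_error": re.compile(r"Error converting file to netcdf"),
}


def tally_messages(stderr_text: str) -> Counter:
    """Count how often each known message type occurs in the text."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(pattern.findall(stderr_text))
    return counts


# A short excerpt for demonstration only, not the full log.
EXCERPT = (
    "Model crash detected, will try to restart... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 "
    "BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 "
    "Error converting file to netcdf: dataout/t2p0ko.pjg2c10 "
    "Model crash detected, will try to restart... "
)

if __name__ == "__main__":
    for label, count in tally_messages(EXCERPT).items():
        print(f"{label}: {count}")
```

Because the patterns use `findall` on the whole text, the tally works whether the log is stored as one long line or as one message per line.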

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Oct 2011 18:50:47	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	777,600	1,132,138	1.4559
31 Oct 2011 18:15:54	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	751,680	1,094,887	1.4566
31 Oct 2011 17:41:53	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	725,760	1,056,909	1.4563
31 Oct 2011 17:11:14	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	699,840	1,019,181	1.4563
31 Oct 2011 16:30:29	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	673,920	980,950	1.4556
31 Oct 2011 14:03:18	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	648,000	942,223	1.4540
31 Oct 2011 14:03:17	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	622,080	904,179	1.4535
31 Oct 2011 14:03:16	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	596,160	866,123	1.4528
31 Oct 2011 14:03:15	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	570,240	828,118	1.4522
18 Oct 2011 21:11:48	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	544,320	789,688	1.4508
17 Oct 2011 21:14:27	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	518,400	751,102	1.4489
17 Oct 2011 10:17:37	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	492,480	713,126	1.4480
15 Oct 2011 14:58:06	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	466,560	676,311	1.4496
13 Oct 2011 21:39:13	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	440,640	639,184	1.4506
12 Oct 2011 21:57:37	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	414,720	602,173	1.4520
12 Oct 2011 10:21:47	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	388,800	564,539	1.4520
09 Oct 2011 10:33:17	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	362,880	526,581	1.4511
07 Oct 2011 18:42:35	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	336,960	488,266	1.4490
06 Oct 2011 18:27:47	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	311,040	451,274	1.4509
05 Oct 2011 18:33:35	859264	13402769	hadcm3n_t2p0_1940_40_007443148_3	285,120	414,070	1.4523
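
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the three most recent trickles in the table; `TRICKLES` and `seconds_per_timestep` are names chosen for illustration, and the formula is an inference from the figures, not a documented definition.

```python
# (timestep, cpu_seconds, reported_average) copied from the three most recent trickles.
TRICKLES = [
    (777_600, 1_132_138, 1.4559),
    (751_680, 1_094_887, 1.4566),
    (725_760, 1_056_909, 1.4563),
]


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_seconds, timestep)
        print(f"{timestep:>9,}  computed {computed:.4f}  reported {reported:.4f}")
```

For these rows the computed values agree with the reported ones to four decimal places, and consecutive rows are 25,920 timesteps apart.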