Task 13647645

Name	hadcm3n_yd2f_1900_40_007518958_3
Workunit	7716433
Created	20 Nov 2011, 2:22:52 UTC
Sent	20 Nov 2011, 2:23:55 UTC
Report deadline	19 Feb 2012, 9:51:06 UTC
Received	8 Jan 2012, 20:10:35 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	986812
Run time	15 days 15 hours 10 min 27 sec
CPU time	13 days 18 hours 32 min 38 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	1.73 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
- exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3520, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3560, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3720, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3336, iMonCtr=1
Model crash detected, will try to restart...
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3864, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
12:16:53 (3876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3448, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2904, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3864, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x774E3A93 read attempt to address 0x40E16ACC
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yd2f_1900_40_007518958/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
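
The Run time and CPU time fields above are given as day/hour/minute/second strings, which makes them awkward to compare with the per-trickle CPU times reported in seconds. The following is a minimal sketch, not part of the BOINC or CPDN tooling, that converts both fields to seconds and estimates what fraction of the wall-clock run time was spent on the CPU. The helper name `to_seconds` and the unit table are assumptions made for illustration.

```python
import re

# Seconds per unit, keyed by the unit labels used on this page
# ("days", "hours", "min", "sec"). Hypothetical helper table.
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def to_seconds(text):
    """Convert a duration such as '15 days 15 hours 10 min 27 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Values copied from the "Run time" and "CPU time" fields of this task.
run_time = to_seconds("15 days 15 hours 10 min 27 sec")
cpu_time = to_seconds("13 days 18 hours 32 min 38 sec")

# Fraction of elapsed time during which the task was actually computing.
cpu_fraction = cpu_time / run_time

# The exit status is shown in both decimal and hexadecimal; they should agree.
exit_status_matches = (193 == 0xC1)

print(run_time, cpu_time, round(cpu_fraction, 3), exit_status_matches)
```

The converted CPU time can then be compared directly with the "CPU Time (sec)" column of the latest trickle.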

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
08 Jan 2012 19:13:50	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	259,200	1,189,949	4.5909
02 Jan 2012 04:38:56	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	233,280	1,064,281	4.5622
30 Dec 2011 02:48:13	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	207,360	941,220	4.5391
27 Dec 2011 06:52:10	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	181,440	813,975	4.4862
22 Dec 2011 22:58:15	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	155,520	685,275	4.4063
18 Dec 2011 21:49:58	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	129,600	579,299	4.4699
13 Dec 2011 16:08:24	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	103,680	457,176	4.4095
07 Dec 2011 00:43:02	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	77,760	345,943	4.4489
02 Dec 2011 04:00:39	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	51,840	231,589	4.4674
27 Nov 2011 21:06:22	986812	13647645	hadcm3n_yd2f_1900_40_007518958_3	25,920	115,277	4.4474
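
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table and also derives the cost of each interval between consecutive trickles, which the page does not show. The function names and the choice of rows are assumptions for illustration only.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table,
# oldest first. Only three of the ten rows are used here.
trickles = [
    (25920, 115277, 4.4474),
    (233280, 1064281, 4.5622),
    (259200, 1189949, 4.5909),
]

def cumulative_average(timestep, cpu_sec):
    """Cumulative CPU seconds per timestep up to this trickle."""
    return cpu_sec / timestep

def interval_average(earlier, later):
    """CPU seconds per timestep spent between two trickles."""
    d_steps = later[0] - earlier[0]
    d_cpu = later[1] - earlier[1]
    return d_cpu / d_steps

for timestep, cpu_sec, reported in trickles:
    computed = cumulative_average(timestep, cpu_sec)
    # Flag rows where the recomputed value drifts from the reported one.
    status = "ok" if abs(computed - reported) < 1e-4 else "mismatch"
    print(f"{timestep:>7} {computed:.4f} {reported:.4f} {status}")

# Cost of the final 25,920-timestep interval before the task failed.
last_interval = interval_average(trickles[1], trickles[2])
print(f"last interval: {last_interval:.4f} sec/TS")
```

Comparing the interval figure with the cumulative average gives a rough indication of whether the host slowed down toward the end of the run.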