Task 15442885

Name	hadcm3n_zm48_1880_40_008246655_1
Workunit	8401779
Created	21 Nov 2012, 2:42:39 UTC
Sent	21 Nov 2012, 2:42:41 UTC
Report deadline	20 Feb 2013, 10:09:52 UTC
Received	30 Jan 2013, 20:44:16 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1254204
Run time	17 days 18 hours 44 min 8 sec
CPU time	15 days 20 hours 0 min 41 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.49 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
- exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6940, iMonCtr=1
Model crash detected, will try to restart...
08:44:21 (5728): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:29:54 (5692): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:29:56 (5692): No heartbeat from core client for 30 sec - exiting
11:29:57 (5692): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5564, iMonCtr=1
Model crash detected, will try to restart...
11:50:25 (5260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8024, iMonCtr=1
Model crash detected, will try to restart...
14:56:27 (5064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Ocean Restart file copy failed on zm48ko.da897m0
CPDN Monitor - Quit request from BOINC...
Ocean Restart file copy failed on zm48ko.da903p0
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1740, iMonCtr=1
Model crash detected, will try to restart...
03:58:50 (5220): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Ocean Restart file copy failed on zm48ko.da933g0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5152, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
12:31:02 (7152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:31:48 (2188): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1796, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5700, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5812, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN proc21:00:25 (5800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6032, iMonCtr=1
Model crash detected, will try to restart...
17:16:49 (5224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Ocean Restart file copy failed on zm48ko.da98be0
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4440, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Ocean Restart file copy failed on zm48ko.daa24l0
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:23:37 (3892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Ocean Restart file copy failed on zm48ko.daa41o0
CPDN Monitor - Quit request from BOINC...
14:53:56 (5804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=1
Model crash detected, will try to restart...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5772, iMonCtr=1
Model crash detected, will try to restart...
14:57:25 (5432): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:11:45 (2368): No heartbeat from core client for 30 sec - exiting
16:11:46 (2368): No heartbeat from core client for 30 sec - exiting
16:11:47 (2368): No heartbeat from core client for 30 sec - exiting
16:11:48 (2368): No heartbeat from core client for 30 sec - exiting
16:11:49 (2368): No heartbeat from core client for 30 sec - exiting
16:11:50 (2368): No heartbeat from core client for 30 sec - exiting
16:11:51 (2368): No heartbeat from core client for 30 sec - exiting
16:11:52 (2368): No heartbeat from core client for 30 sec - exiting
16:11:53 (2368): No heartbeat from core client for 30 sec - exiting
16:11:54 (2368): No heartbeat from core client for 30 sec - exiting
16:11:55 (2368): No heartbeat from core client for 30 sec - exiting
16:11:56 (2368): No heartbeat from core client for 30 sec - exiting
16:11:57 (2368): No heartbeat from core client for 30 sec - exiting
16:11:58 (2368): No heartbeat from core client for 30 sec - exiting
16:11:59 (2368): No heartbeat from core client for 30 sec - exiting
16:12:00 (2368): No heartbeat from core client for 30 sec - exiting
16:12:01 (2368): No heartbeat from core client for 30 sec - exiting
16:12:33 (2368): No heartbeat from core client for 30 sec - exiting
16:12:34 (2368): No heartbeat from core client for 30 sec - exiting
16:12:35 (2368): No heartbeat from core client for 30 sec - exiting
16:12:36 (2368): No heartbeat from core client for 30 sec - exiting
16:12:37 (2368): No heartbeat from core client for 30 sec - exiting
16:12:38 (2368): No heartbeat from core client for 30 sec - exiting
16:12:39 (2368): No heartbeat from core client for 30 sec - exiting
16:12:40 (2368): No heartbeat from core client for 30 sec - exiting
16:12:41 (2368): No heartbeat from core client for 30 sec - exiting
16:12:42 (2368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6032, iMonCtr=1
Model crash detected, will try to restart...
Ocean Restart file copy failed on zm48ko.dab05s0
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x773D3FBB read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_zm48_1880_40_008246655/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
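
The stderr text above repeats a small set of message types, so a tally by type can make the sequence of restarts easier to read. The following is a minimal sketch, assuming the log is available as a plain string. The pattern strings are copied from the messages above; the category names, the `tally_messages` function, and the shortened `SAMPLE` excerpt are illustrative assumptions and not part of BOINC or CPDN.

```python
import re
from typing import Dict

# Message patterns copied from the stderr text; the keys are hypothetical labels.
PATTERNS: Dict[str, str] = {
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "suspend_request": r"CPDN Monitor - Suspend request from BOINC",
    "no_heartbeat": r"No heartbeat from core client for 30 sec - exiting",
    "model_crash": r"Model crash detected, will try to restart",
    "ocean_copy_failed": r"Ocean Restart file copy failed on \S+",
}


def tally_messages(stderr_text: str) -> Dict[str, int]:
    """Count how often each known message type occurs in the text."""
    return {
        name: len(re.findall(pattern, stderr_text))
        for name, pattern in PATTERNS.items()
    }


# A shortened excerpt assembled from messages in the log, for illustration only.
SAMPLE = (
    "CPDN Monitor - Quit request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=6940, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "08:44:21 (5728): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Ocean Restart file copy failed on zm48ko.da897m0 "
)

counts = tally_messages(SAMPLE)
print(counts)
```

Because the patterns match substrings instead of whole lines, the tally works whether or not the log still has its original line breaks.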

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
30 Jan 2013 20:47:43	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	777,600	1,368,035	1.7593
29 Jan 2013 17:25:53	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	751,680	1,325,810	1.7638
28 Jan 2013 13:00:11	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	725,760	1,281,003	1.7651
24 Jan 2013 13:50:21	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	699,840	1,236,419	1.7667
23 Jan 2013 09:38:22	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	673,920	1,193,001	1.7702
21 Jan 2013 11:56:55	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	648,000	1,149,970	1.7746
16 Jan 2013 11:03:22	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	622,080	1,105,950	1.7778
14 Jan 2013 12:20:06	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	596,160	1,062,921	1.7829
11 Jan 2013 12:50:09	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	570,240	1,018,737	1.7865
09 Jan 2013 17:41:15	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	544,320	972,465	1.7866
08 Jan 2013 12:15:06	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	518,400	925,298	1.7849
07 Jan 2013 12:35:16	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	492,480	875,671	1.7781
03 Jan 2013 15:05:49	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	466,560	826,176	1.7708
31 Dec 2012 13:45:32	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	440,640	778,081	1.7658
27 Dec 2012 11:24:54	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	414,720	729,724	1.7596
20 Dec 2012 13:21:58	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	388,800	683,571	1.7582
17 Dec 2012 14:15:30	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	362,880	637,059	1.7556
14 Dec 2012 14:22:08	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	336,960	588,676	1.7470
14 Dec 2012 14:22:08	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	311,040	542,458	1.7440
14 Dec 2012 14:22:08	1254204	15442885	hadcm3n_zm48_1880_40_008246655_1	285,120	497,907	1.7463
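
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` is hypothetical, and the rounding rule is an assumption inferred from the published values, not a documented formula.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the trickle table.
ROWS = [
    (777_600, 1_368_035, 1.7593),
    (751_680, 1_325_810, 1.7638),
    (285_120, 497_907, 1.7463),
]

for timestep, cpu_time, reported in ROWS:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(f"{timestep:>7} steps: computed {computed:.4f}, reported {reported:.4f}")
```

Consecutive rows in the table differ by 25,920 timesteps, so each trickle marks the same amount of model progress and the average can be compared directly from row to row.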