Task 13619345

Name	hadcm3n_ybr2_1940_40_007540313_2
Workunit	7737545
Created	9 Nov 2011, 1:19:50 UTC
Sent	9 Nov 2011, 1:30:10 UTC
Report deadline	8 Feb 2012, 8:57:21 UTC
Received	8 Dec 2011, 21:35:02 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1065689
Run time	12 days 3 hours 52 min 23 sec
CPU time	11 days 22 hours 27 min 47 sec
Validate state	Invalid
Credit	3,110.40
Device peak FLOPS	1.37 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5260, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5244, iMonCtr=1
Model crash detected, will try to restart...
20:03:41 (1864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=404, iMonCtr=1
Model crash detected, will try to restart...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4224, selfPID=4224, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
03:41:47 (1964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
WNo Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=824, selfPID=824, iMonCtr=1
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1068, iMonCtr=1
Model crash detected, will try to restart...
08:03:37 (1068): No heartbeat from core client for 30 sec - exiting
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1888, selfPID=1888, iMonCtr=1
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4296, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=388, iMonCtr=1
Model crash detected, will try to restart...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=688, selfPID=688, iMonCtr=1
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1588, selfPID=1588, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=716, iMonCtr=1
Model crash detected, will try to restart...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2452, selfPID=2452, iMonCtr=1
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4388, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4388, iMonCtr=1
Model crash detected, will try to restart...
13:25:55 (4388): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
07 Dec 2011 21:39:10	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	259,200	1,031,233	3.9785
24 Nov 2011 17:59:09	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	233,280	928,260	3.9792
23 Nov 2011 12:58:12	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	207,360	825,405	3.9805
21 Nov 2011 19:49:47	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	181,440	722,187	3.9803
20 Nov 2011 14:34:07	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	155,520	619,609	3.9841
19 Nov 2011 00:36:33	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	129,600	515,890	3.9806
16 Nov 2011 00:16:06	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	103,680	412,558	3.9791
16 Nov 2011 00:16:06	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	77,760	309,067	3.9746
16 Nov 2011 00:16:06	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	51,840	205,965	3.9731
16 Nov 2011 00:16:06	1065689	13619345	hadcm3n_ybr2_1940_40_007540313_2	25,920	103,267	3.9841
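
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count reported in the same trickle. The sketch below recomputes it from the rows above as a consistency check. It is not part of the project's tooling; the names TRICKLES, average_sec_per_timestep and matches_reported are hypothetical, and rounding to four decimal places is an assumption based on how the values are displayed.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (259_200, 1_031_233, 3.9785),
    (233_280, 928_260, 3.9792),
    (207_360, 825_405, 3.9805),
    (181_440, 722_187, 3.9803),
    (155_520, 619_609, 3.9841),
    (129_600, 515_890, 3.9806),
    (103_680, 412_558, 3.9791),
    (77_760, 309_067, 3.9746),
    (51_840, 205_965, 3.9731),
    (25_920, 103_267, 3.9841),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the number of timesteps completed."""
    return cpu_time_sec / timestep


def matches_reported(rows):
    """Compare each recomputed average with the reported one at 4 decimal places."""
    return [
        f"{average_sec_per_timestep(cpu, ts):.4f}" == f"{reported:.4f}"
        for ts, cpu, reported in rows
    ]


if __name__ == "__main__":
    for (ts, cpu, reported), ok in zip(TRICKLES, matches_reported(TRICKLES)):
        computed = average_sec_per_timestep(cpu, ts)
        print(f"{ts:>8,} TS  {cpu:>10,} s  computed {computed:.4f}  reported {reported:.4f}  match={ok}")
```

Under these assumptions every row agrees with the reported value, which suggests the column is a running average over the whole task rather than a per-interval rate.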