Task 14019869

Name	hadcm3n_u2q5_1980_40_007684085_2
Workunit	7839172
Created	27 Jan 2012, 19:01:13 UTC
Sent	27 Jan 2012, 19:01:36 UTC
Report deadline	28 Apr 2012, 2:28:47 UTC
Received	27 Apr 2012, 22:19:13 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1045219
Run time	23 days 17 hours 31 min 9 sec
CPU time	23 days 1 hours 15 min 30 sec
Validate state	Invalid
Credit	12,441.60
Device peak FLOPS	2.90 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4284, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/u2q5ko.pji7c10
Error converting file to netcdf: dataout/u2q5ko.pii7c10
Error converting file to netcdf: dataout/u2q5ko.pfi7c10
Error converting file to netcdf: dataout/u2q5ka.phi7c10
Error converting file to netcdf: dataout/u2q5ka.pgi7c10
Error converting file to netcdf: dataout/u2q5ka.pei7c10
Error converting file to netcdf: dataout/u2q5ka.pdi7c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3980, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4900, iMonCtr=1
Model crash detected, will try to restart...
20:11:51 (204): No heartbeat from core client for 30 sec - exiting
20:11:52 (204): No heartbeat from core client for 30 sec - exiting
20:11:53 (204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4516, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
27 Apr 2012 22:23:17	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	1,036,800	1,991,724	1.9210
22 Apr 2012 17:37:23	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	1,010,880	1,941,595	1.9207
21 Apr 2012 00:36:29	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	984,960	1,891,233	1.9201
15 Apr 2012 22:42:06	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	959,040	1,840,962	1.9196
15 Apr 2012 07:27:45	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	933,120	1,790,787	1.9191
14 Apr 2012 18:09:35	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	907,200	1,742,502	1.9207
10 Apr 2012 03:20:20	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	881,280	1,693,636	1.9218
09 Apr 2012 13:34:52	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	855,360	1,643,435	1.9213
08 Apr 2012 23:19:06	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	829,440	1,593,095	1.9207
06 Apr 2012 01:12:55	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	803,520	1,542,728	1.9200
04 Apr 2012 13:42:34	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	777,600	1,492,155	1.9189
03 Apr 2012 23:22:07	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	751,680	1,441,478	1.9177
03 Apr 2012 08:34:02	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	725,760	1,391,141	1.9168
02 Apr 2012 06:25:59	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	699,840	1,340,392	1.9153
01 Apr 2012 15:41:45	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	673,920	1,289,564	1.9135
01 Apr 2012 03:09:24	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	648,000	1,245,900	1.9227
31 Mar 2012 15:01:20	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	622,080	1,201,928	1.9321
31 Mar 2012 01:17:06	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	596,160	1,153,733	1.9353
25 Mar 2012 13:15:08	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	570,240	1,102,968	1.9342
24 Mar 2012 22:12:49	1045219	14019869	hadcm3n_u2q5_1980_40_007684085_2	544,320	1,051,783	1.9323
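
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep reached. That definition is an assumption inferred from the figures above, not something the page states. A minimal Python sketch that checks it against a few rows copied from the table (the function name `average_sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, listed average in sec/TS).
rows = [
    (1_036_800, 1_991_724, "1.9210"),
    (1_010_880, 1_941_595, "1.9207"),
    (570_240, 1_102_968, "1.9342"),
    (544_320, 1_051_783, "1.9323"),
]


def average_sec_per_timestep(cpu_seconds, timestep):
    """Assumed definition: cumulative CPU seconds / timestep reached."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, listed in rows:
    # Format to four decimals, matching the precision shown in the table.
    computed = f"{average_sec_per_timestep(cpu_seconds, timestep):.4f}"
    print(timestep, computed, listed, computed == listed)
```

For the rows checked here, the computed value matches the listed average to four decimal places, which is consistent with the assumed definition.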