Task 15068373

Name	hadcm3n_yhqf_1980_40_008116672_1
Workunit	8271786
Created	3 Aug 2012, 20:19:25 UTC
Sent	3 Aug 2012, 20:19:35 UTC
Report deadline	3 Nov 2012, 3:46:46 UTC
Received	14 Sep 2012, 19:23:23 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1135203
Run time	14 days 16 hours 3 min 21 sec
CPU time	14 days 6 hours 47 min 45 sec
Validate state	Invalid
Credit	9,642.24
Device peak FLOPS	2.92 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.31</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5548, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3544, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3496, iMonCtr=1
Model crash detected, will try to restart...
09:14:57 (4676): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
14 Sep 2012 14:58:53	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	803,520	1,221,465	1.5201
13 Sep 2012 19:37:35	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	777,600	1,182,380	1.5206
13 Sep 2012 08:32:04	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	751,680	1,143,358	1.5211
12 Sep 2012 13:55:54	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	725,760	1,104,414	1.5217
10 Sep 2012 03:45:42	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	699,840	1,065,236	1.5221
09 Sep 2012 16:43:40	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	673,920	1,026,206	1.5227
07 Sep 2012 19:10:56	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	648,000	986,863	1.5229
07 Sep 2012 07:39:38	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	622,080	947,173	1.5226
05 Sep 2012 07:25:08	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	596,160	907,124	1.5216
04 Sep 2012 10:56:32	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	570,240	867,552	1.5214
03 Sep 2012 15:02:36	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	544,320	828,643	1.5223
31 Aug 2012 20:01:44	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	518,400	788,422	1.5209
31 Aug 2012 08:35:26	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	492,480	749,674	1.5222
30 Aug 2012 13:57:28	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	466,560	710,976	1.5239
29 Aug 2012 19:15:56	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	440,640	672,354	1.5259
29 Aug 2012 08:05:10	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	414,720	633,364	1.5272
28 Aug 2012 12:11:46	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	388,800	594,203	1.5283
24 Aug 2012 16:38:32	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	362,880	554,880	1.5291
23 Aug 2012 22:03:43	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	336,960	515,466	1.5298
23 Aug 2012 10:41:48	1135203	15068373	hadcm3n_yhqf_1980_40_008116672_1	311,040	475,750	1.5295
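
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the function name and the rounding rule are assumptions for illustration, not taken from the project's server code.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the table's Average column is computed this way.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec) pairs copied from the first two rows
# and the last row of the trickle table above.
rows = [
    (803_520, 1_221_465),
    (777_600, 1_182_380),
    (311_040, 475_750),
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in rows]

# Each trickle in the table covers the same number of timesteps.
timesteps_per_trickle = rows[0][0] - rows[1][0]

for (ts, cpu), avg in zip(rows, averages):
    print(f"timestep={ts:>7,}  cpu={cpu:>9,} s  avg={avg:.4f} s/TS")
print(f"timesteps per trickle: {timesteps_per_trickle:,}")
```

For these rows the computed values agree with the table (1.5201, 1.5206 and 1.5295), and consecutive trickles are 25,920 timesteps apart.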