Name | hadam3p_saf_7n01_2009_1_007586746_2 |
Workunit | 7764876 |
Created | 3 Dec 2011, 3:37:05 UTC |
Sent | 3 Dec 2011, 4:02:15 UTC |
Report deadline | 14 Nov 2012, 9:22:15 UTC |
Received | 7 Apr 2012, 4:21:05 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 989394 |
Run time | 7 days 18 hours 44 min 7 sec |
CPU time | 6 days 9 hours 16 min 47 sec |
Validate state | Workunit error - check skipped |
Credit | 2,244.09 |
Device peak FLOPS | 1.53 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5220, selfPID=4252, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5768, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5260, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5844, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3904, selfPID=3904, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5700, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4384, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5328, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4492, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... GloCbatroller:: CPDN process is not running, exiting, bRetVGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6124, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5552, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4288, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2780, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5492, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3792, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4720, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5912, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3516, selfPID=388, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4464, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5980, selfPID=2292, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2640, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4272, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1440, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=224, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4328, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3300, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3096, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5468, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5560, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=2 Model crash detected, will try to restart... 17:50:55 (4888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:50:58 (4888): No heartbeat from core client for 30 sec - exiting 17:50:59 (4888): No heartbeat from core client for 30 sec - exiting 17:51:00 (4888): No heartbeat from core client for 30 sec - exiting 17:51:01 (4888): No heartbeat from core client for 30 sec - exiting 17:51:02 (4888): No heartbeat from core client for 30 sec - exiting 17:51:03 (4888): No heartbeat from core client for 30 sec - exiting 17:51:04 (4888): No heartbeat from core client for 30 sec - exiting 17:51:05 (4888): No heartbeat from core client for 30 sec - exiting 17:51:06 (4888): No heartbeat from core client for 30 sec - exiting 17:51:07 (4888): No heartbeat from core client for 30 sec - exiting 17:51:08 (4888): No heartbeat from core client for 30 sec - exiting 17:51:09 (4888): No heartbeat from core client for 30 sec - exiting 17:51:10 (4888): No heartbeat from core client for 30 sec - exiting 17:51:11 (4888): No heartbeat from core client for 30 sec - exiting 17:51:12 (4888): No heartbeat from core client for 30 sec - exiting 17:51:13 (4888): No heartbeat from core client for 30 sec - exiting 17:51:14 (4888): No heartbeat from core client for 30 sec - exiting 17:51:15 (4888): No heartbeat from core client for 30 sec - exiting 17:51:16 (4888): No heartbeat from core client for 30 sec - exiting 17:51:17 (4888): No heartbeat from core client for 30 sec - exiting 17:51:18 (4888): No heartbeat from core client for 30 sec - exiting 17:51:19 (4888): No heartbeat from core client for 30 sec - exiting 17:51:20 (4888): No heartbeat from core client for 30 sec - exiting 17:51:21 (4888): No heartbeat from core client for 30 sec - exiting 17:51:22 (4888): No heartbeat from core client for 30 sec - exiting 17:51:23 (4888): No heartbeat from core client for 30 sec - exiting 17:51:24 (4888): No heartbeat from core client for 30 sec - exiting 17:51:25 (4888): No heartbeat from core client for 30 sec - exiting 17:51:26 (4888): No heartbeat from core client for 30 sec - exiting 17:51:27 (4888): No 
heartbeat from core client for 30 sec - exiting 17:51:28 (4888): No heartbeat from core client for 30 sec - exiting 17:51:29 (4888): No heartbeat from core client for 30 sec - exiting 17:51:30 (4888): No heartbeat from core client for 30 sec - exiting 17:51:31 (4888): No heartbeat from core client for 30 sec - exiting 17:51:32 (4888): No heartbeat from core client for 30 sec - exiting 17:51:33 (4888): No heartbeat from core client for 30 sec - exiting 17:51:34 (4888): No heartbeat from core client for 30 sec - exiting 17:51:35 (4888): No heartbeat from core client for 30 sec - exiting 17:51:36 (4888): No heartbeat from core client for 30 sec - exiting 17:51:38 (4888): No heartbeat from core client for 30 sec - exiting 17:51:39 (4888): No heartbeat from core client for 30 sec - exiting 17:51:40 (4888): No heartbeat from core client for 30 sec - exiting 17:51:41 (4888): No heartbeat from core client for 30 sec - exiting 17:51:42 (4888): No heartbeat from core client for 30 sec - exiting 17:51:43 (4888): No heartbeat from core client for 30 sec - exiting 17:51:44 (4888): No heartbeat from core client for 30 sec - exiting 17:51:45 (4888): No heartbeat from core client for 30 sec - exiting 17:51:46 (4888): No heartbeat from core client for 30 sec - exiting 17:51:47 (4888): No heartbeat from core client for 30 sec - exiting 17:51:48 (4888): No heartbeat from core client for 30 sec - exiting 17:51:49 (4888): No heartbeat from core client for 30 sec - exiting 17:51:50 (4888): No heartbeat from core client for 30 sec - exiting 17:51:51 (4888): No heartbeat from core client for 30 sec - exiting 17:51:52 (4888): No heartbeat from core client for 30 sec - exiting 17:51:53 (4888): No heartbeat from core client for 30 sec - exiting 17:51:54 (4888): No heartbeat from core client for 30 sec - exiting 17:51:55 (4888): No heartbeat from core client for 30 sec - exiting 17:51:57 (4888): No heartbeat from core client for 30 sec - exiting 17:51:58 (4888): No heartbeat from core client 
for 30 sec - exiting 17:51:59 (4888): No heartbeat from core client for 30 sec - exiting 17:52:00 (4888): No heartbeat from core client for 30 sec - exiting 17:52:01 (4888): No heartbeat from core client for 30 sec - exiting 17:52:02 (4888): No heartbeat from core client for 30 sec - exiting 17:52:03 (4888): No heartbeat from core client for 30 sec - exiting 17:52:04 (4888): No heartbeat from core client for 30 sec - exiting 17:52:05 (4888): No heartbeat from core client for 30 sec - exiting 17:52:06 (4888): No heartbeat from core client for 30 sec - exiting 17:52:07 (4888): No heartbeat from core client for 30 sec - exiting 17:52:08 (4888): No heartbeat from core client for 30 sec - exiting 17:52:09 (4888): No heartbeat from core client for 30 sec - exiting 17:52:10 (4888): No heartbeat from core client for 30 sec - exiting 17:52:11 (4888): No heartbeat from core client for 30 sec - exiting 17:52:12 (4888): No heartbeat from core client for 30 sec - exiting 17:52:18 (4888): No heartbeat from core client for 30 sec - exiting 17:52:19 (4888): No heartbeat from core client for 30 sec - exiting 17:52:20 (4888): No heartbeat from core client for 30 sec - exiting 17:52:21 (4888): No heartbeat from core client for 30 sec - exiting 17:52:22 (4888): No heartbeat from core client for 30 sec - exiting 17:52:23 (4888): No heartbeat from core client for 30 sec - exiting 17:52:24 (4888): No heartbeat from core client for 30 sec - exiting 17:52:25 (4888): No heartbeat from core client for 30 sec - exiting 17:52:26 (4888): No heartbeat from core client for 30 sec - exiting 17:52:27 (4888): No heartbeat from core client for 30 sec - exiting 17:52:28 (4888): No heartbeat from core client for 30 sec - exiting 17:52:29 (4888): No heartbeat from core client for 30 sec - exiting 17:52:30 (4888): No heartbeat from core client for 30 sec - exiting 17:52:31 (4888): No heartbeat from core client for 30 sec - exiting 17:52:32 (4888): No heartbeat from core client for 30 sec - exiting 
17:52:33 (4888): No heartbeat from core client for 30 sec - exiting 17:52:34 (4888): No heartbeat from core client for 30 sec - exiting 17:52:35 (4888): No heartbeat from core client for 30 sec - exiting 17:52:36 (4888): No heartbeat from core client for 30 sec - exiting 17:52:37 (4888): No heartbeat from core client for 30 sec - exiting 17:52:39 (4888): No heartbeat from core client for 30 sec - exiting 17:52:40 (4888): No heartbeat from core client for 30 sec - exiting 17:52:41 (4888): No heartbeat from core client for 30 sec - exiting 17:52:42 (4888): No heartbeat from core client for 30 sec - exiting 17:52:43 (4888): No heartbeat from core client for 30 sec - exiting 17:52:44 (4888): No heartbeat from core client for 30 sec - exiting 17:52:45 (4888): No heartbeat from core client for 30 sec - exiting 17:52:46 (4888): No heartbeat from core client for 30 sec - exiting 17:52:47 (4888): No heartbeat from core client for 30 sec - exiting 17:52:48 (4888): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2904, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4636, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3416, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2836, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5324, iMonCtr=2 Model crash detected, will try to restart... 
Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6088, iMonCtr=2 Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5812, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4144, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4516, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5232, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2704, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5236, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6020, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5464, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2968, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3020, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5128, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5848, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4428, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5408, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
07 Apr 2012 03:21:16 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 138,336 | 550,954 | 3.9827 |
25 Mar 2012 20:42:17 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 126,816 | 504,548 | 3.9786 |
14 Mar 2012 05:05:26 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 115,296 | 458,754 | 3.9789 |
07 Mar 2012 02:44:37 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 103,776 | 412,627 | 3.9761 |
26 Feb 2012 21:18:17 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 92,256 | 367,656 | 3.9852 |
13 Feb 2012 01:31:35 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 80,736 | 322,555 | 3.9952 |
08 Feb 2012 00:28:42 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 69,216 | 277,674 | 4.0117 |
04 Feb 2012 20:21:10 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 57,696 | 232,499 | 4.0297 |
02 Feb 2012 02:05:21 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 46,176 | 185,536 | 4.0180 |
22 Jan 2012 05:15:31 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 34,656 | 138,988 | 4.0105 |
11 Jan 2012 22:35:10 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 23,136 | 92,831 | 4.0124 |
06 Jan 2012 23:05:06 | 989394 | 13704059 | hadam3p_saf_7n01_2009_1_007586746_2 | 11,616 | 45,613 | 3.9267 |
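
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal sketch that checks this against three rows copied from the table above. The function name `sec_per_timestep` and the rounding to four decimals are assumptions made for illustration, not part of the CPDN or BOINC software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
trickles = [
    (138_336, 550_954, 3.9827),
    (126_816, 504_548, 3.9786),
    (11_616, 45_613, 3.9267),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Hypothetical helper: average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    # Round to four decimals to match the precision shown in the table.
    computed = round(sec_per_timestep(cpu_time_sec, timestep), 4)
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For these three rows the computed ratio agrees with the reported average at the table's four-decimal precision, which is consistent with the column being a simple running ratio rather than a per-interval rate.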
©2024 cpdn.org