Name | hadam3p_pnw_6w36_2001_1_007590170_1 |
Workunit | 7768300 |
Created | 24 Dec 2011, 9:40:39 UTC |
Sent | 24 Dec 2011, 9:48:32 UTC |
Report deadline | 5 Dec 2012, 15:08:32 UTC |
Received | 7 Dec 2012, 10:37:10 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1161213 |
Run time | 3 days 1 hours 37 min 47 sec |
CPU time | 2 days 17 hours 56 min 50 sec |
Validate state | Invalid |
Credit | 1,754.30 |
Device peak FLOPS | 2.78 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5716, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5724, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4696, selfPID=2228, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 Called boinc_finish 10:48:58 (5372): No heartbeat from core client for 30 sec - exiting 10:48:59 (5372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:49:00 (5372): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5248, selfPID=4212, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1440, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6216, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6224, selfPID=5656, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
18:06:44 (5384): No heartbeat from core client for 30 sec - exiting 18:06:46 (5384): No heartbeat from core client for 30 sec - exiting 18:06:47 (5384): No heartbeat from core client for 30 sec - exiting 18:06:48 (5384): No heartbeat from core client for 30 sec - exiting 18:06:49 (5384): No heartbeat from core client for 30 sec - exiting 18:06:50 (5384): No heartbeat from core client for 30 sec - exiting 18:06:51 (5384): No heartbeat from core client for 30 sec - exiting 18:06:52 (5384): No heartbeat from core client for 30 sec - exiting 18:06:53 (5384): No heartbeat from core client for 30 sec - exiting 18:06:54 (5384): No heartbeat from core client for 30 sec - exiting 18:06:55 (5384): No heartbeat from core client for 30 sec - exiting 18:06:56 (5384): No heartbeat from core client for 30 sec - exiting 18:06:57 (5384): No heartbeat from core client for 30 sec - exiting 18:06:58 (5384): No heartbeat from core client for 30 sec - exiting 18:06:59 (5384): No heartbeat from core client for 30 sec - exiting 18:07:00 (5384): No heartbeat from core client for 30 sec - exiting 18:07:01 (5384): No heartbeat from core client for 30 sec - exiting 18:07:02 (5384): No heartbeat from core client for 30 sec - exiting 18:07:03 (5384): No heartbeat from core client for 30 sec - exiting 18:07:04 (5384): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=884, selfPID=2700, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6620, selfPID=4928, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 3 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5980, selfPID=4832, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1380, selfPID=1380, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6992, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2660, selfPID=2660, iMonCtr=2 15:20:04 (5032): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:12:32 (252): No heartbeat from core client for 30 sec - exiting 14:12:33 (252): No heartbeat from core client for 30 sec - exiting 14:12:34 (252): No heartbeat from core client for 30 sec - exiting 14:12:35 (252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:26:45 (4116): No heartbeat from core client for 30 sec - exiting 18:26:46 (4116): No heartbeat from core client for 30 sec - exiting 18:26:47 (4116): No heartbeat from core client for 30 sec - exiting 18:26:48 (4116): No heartbeat from core client for 30 sec - exiting 18:26:49 (4116): No heartbeat from core client for 30 sec - exiting 18:26:50 (4116): No heartbeat from core client for 30 sec - exiting 18:26:51 (4116): No heartbeat from core client for 30 sec - exiting 18:26:52 (4116): No heartbeat from core client for 30 sec - exiting 10:59:01 (4776): No heartbeat from core client for 30 sec - exiting 10:59:02 (4776): No heartbeat from core client for 30 sec - exiting 10:59:03 (4776): No heartbeat from core client for 30 sec - exiting 10:59:04 (4776): No heartbeat from core client for 30 sec - exiting 10:59:05 (4776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:05:29 (4952): No heartbeat from core client for 30 sec - exiting 14:05:31 (4952): No heartbeat from core client for 30 sec - exiting 14:05:32 (4952): No heartbeat from core client for 30 sec - exiting 14:05:33 (4952): No heartbeat from core client for 30 sec - exiting 14:05:34 (4952): No heartbeat from core client for 30 sec - exiting 14:05:35 (4952): No heartbeat from core client for 30 sec - exiting 14:05:36 (4952): No heartbeat from core client for 30 sec - exiting 14:05:37 (4952): No heartbeat from core client for 30 sec - exiting 14:05:38 (4952): No heartbeat from core client for 30 sec - exiting 14:05:39 (4952): No heartbeat from core client for 30 sec - exiting 14:05:40 (4952): No heartbeat from core client for 30 sec - exiting 14:05:41 (4952): No heartbeat from core client for 30 sec - exiting 14:05:42 (4952): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
10:42:55 (4372): No heartbeat from core client for 30 sec - exiting 10:42:56 (4372): No heartbeat from core client for 30 sec - exiting 10:42:57 (4372): No heartbeat from core client for 30 sec - exiting 10:42:58 (4372): No heartbeat from core client for 30 sec - exiting 10:42:59 (4372): No heartbeat from core client for 30 sec - exiting 10:43:00 (4372): No heartbeat from core client for 30 sec - exiting 10:43:01 (4372): No heartbeat from core client for 30 sec - exiting 10:43:02 (4372): No heartbeat from core client for 30 sec - exiting 10:43:03 (4372): No heartbeat from core client for 30 sec - exiting 10:43:04 (4372): No heartbeat from core client for 30 sec - exiting 10:43:05 (4372): No heartbeat from core client for 30 sec - exiting 10:43:06 (4372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:09:50 (4304): No heartbeat from core client for 30 sec - exiting 11:09:51 (4304): No heartbeat from core client for 30 sec - exiting 11:09:52 (4304): No heartbeat from core client for 30 sec - exiting 11:09:53 (4304): No heartbeat from core client for 30 sec - exiting 11:09:54 (4304): No heartbeat from core client for 30 sec - exiting 11:09:55 (4304): No heartbeat from core client for 30 sec - exiting 11:09:56 (4304): No heartbeat from core client for 30 sec - exiting 11:09:57 (4304): No heartbeat from core client for 30 sec - exiting 11:09:58 (4304): No heartbeat from core client for 30 sec - exiting 11:09:59 (4304): No heartbeat from core client for 30 sec - exiting 11:10:00 (4304): No heartbeat from core client for 30 sec - exiting 11:10:01 (4304): No heartbeat from core client for 30 sec - exiting 11:10:02 (4304): No heartbeat from core client for 30 sec - exiting 11:10:03 (4304): No heartbeat from core client for 30 sec - exiting 11:10:04 (4304): No heartbeat from core client for 30 sec - exiting 11:10:05 (4304): No heartbeat from core client for 30 sec - exiting 11:10:06 (4304): No heartbeat from core client for 
30 sec - exiting 11:10:07 (4304): No heartbeat from core client for 30 sec - exiting 11:10:08 (4304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:21:11 (4440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3152, selfPID=1364, iMonCtr=1 Model crash detected, will try to restart... 10:33:18 (4224): No heartbeat from core client for 30 sec - exiting 10:33:19 (4224): No heartbeat from core client for 30 sec - exiting 10:33:20 (4224): No heartbeat from core client for 30 sec - exiting 10:33:21 (4224): No heartbeat from core client for 30 sec - exiting 10:33:22 (4224): No heartbeat from core client for 30 sec - exiting 10:33:23 (4224): No heartbeat from core client for 30 sec - exiting 10:33:24 (4224): No heartbeat from core client for 30 sec - exiting 10:33:25 (4224): No heartbeat from core client for 30 sec - exiting 10:33:26 (4224): No heartbeat from core client for 30 sec - exiting 10:33:27 (4224): No heartbeat from core client for 30 sec - exiting 10:33:28 (4224): No heartbeat from core client for 30 sec - exiting 10:33:29 (4224): No heartbeat from core client for 30 sec - exiting 10:33:30 (4224): No heartbeat from core client for 30 sec - exiting 10:33:31 (4224): No heartbeat from core client for 30 sec - exiting 10:33:32 (4224): No heartbeat from core client for 30 sec - exiting 10:33:33 (4224): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Glontobal W:: CPDN pPDN procs is not running,eexsti isng, bRetVal = 1, checkPID=0, selfPIDcheckPID=0, selfP ID=5764, iMonCtr=2 d, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3444, selfPID=5480, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Regional yearly means requires 12 input files got 7 Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_pnw_6w36_2001_1_007590170_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_6w36_2001_1_007590170_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_6w36_2001_1_007590170_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_6w36_2001_1_007590170_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_6w36_2001_1_007590170_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
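The stderr text above is dominated by a handful of recurring messages (heartbeat timeouts, crash/restart notices, quit and suspend requests). The following is a minimal sketch of how such a log could be tallied. The category names, the `summarize_stderr` helper, and the regular expressions are illustrative assumptions, not part of BOINC or the CPDN application; the `sample` string is a short excerpt copied from the log above.

```python
import re

# Hypothetical category names mapped to patterns for messages seen in this log.
PATTERNS = {
    "heartbeat_timeouts": re.compile(
        r"\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client"
    ),
    "model_crashes": re.compile(r"Model crash detected"),
    "quit_requests": re.compile(r"Quit request from BOINC"),
    "suspend_requests": re.compile(r"Suspend request from BOINC"),
}


def summarize_stderr(text):
    """Count how often each message category occurs in a stderr dump."""
    return {name: len(pattern.findall(text)) for name, pattern in PATTERNS.items()}


# Short excerpt taken from the stderr text of this result.
sample = (
    "10:48:58 (5372): No heartbeat from core client for 30 sec - exiting "
    "10:48:59 (5372): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=5248, selfPID=4212, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Leaving CPDN_Main::Monitor... "
    "Regional yearly means requires 12 input files got 0 "
    "CPDN Monitor - Quit request from BOINC..."
)

for name, count in summarize_stderr(sample).items():
    print(f"{name}: {count}")
```

Run over the full stderr text instead of the excerpt, the same function would give a rough picture of how often the client lost contact with the core client versus how often the model itself was restarted.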
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
05 Dec 2012 10:09:42 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 80,736 | 210,454 | 2.6067 |
30 Nov 2012 11:02:23 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 69,216 | 173,124 | 2.5012 |
05 Nov 2012 16:29:21 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 57,696 | 142,356 | 2.4673 |
06 Oct 2012 11:53:19 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 46,176 | 114,154 | 2.4722 |
04 Oct 2012 10:28:29 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 34,656 | 86,476 | 2.4953 |
03 Oct 2012 11:47:03 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 23,136 | 59,540 | 2.5735 |
02 Oct 2012 11:22:32 | 1161213 | 13815470 | hadam3p_pnw_6w36_2001_1_007590170_1 | 11,616 | 32,222 | 2.7739 |
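The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is an inference from the numbers, not something the page states. The sketch below checks it against the seven rows of the table; the `seconds_per_timestep` helper is a hypothetical name introduced for illustration.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
trickles = [
    (80736, 210454, 2.6067),
    (69216, 173124, 2.5012),
    (57696, 142356, 2.4673),
    (46176, 114154, 2.4722),
    (34656, 86476, 2.4953),
    (23136, 59540, 2.5735),
    (11616, 32222, 2.7739),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Compare with a small tolerance, since the page shows four decimals.
    matches = abs(computed - reported) < 1e-4
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported}  match={matches}")
```

Every row agrees with the reported value to within the displayed precision, which supports the assumed definition.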
©2024 climateprediction.net