Name | hadam3p_eu_2jhx_1990_1_007228102_0 |
Workunit | 7426342 |
Created | 28 Apr 2011, 11:37:20 UTC |
Sent | 13 May 2011, 11:51:32 UTC |
Report deadline | 24 Apr 2012, 17:11:32 UTC |
Received | 26 May 2011, 2:17:01 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 890420 |
Run time | 3 days 1 hours 37 min 21 sec |
CPU time | 2 days 17 hours 7 min 59 sec |
Validate state | Invalid |
Credit | 1,194.02 |
Device peak FLOPS | 2.28 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1328, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6020, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5824, selfPID=4704, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 09:29:46 (4904): No heartbeat from core client for 30 sec - exiting 09:29:47 (4904): No heartbeat from core client for 30 sec - exiting 09:29:48 (4904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running,Cexiting, bRetVal = 1, checkPID=0, selfPID=464, iMonCtr=2 ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4704, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5264, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
09:43:31 (2960): No heartbeat from core client for 30 sec - exiting 09:43:33 (2960): No heartbeat from core client for 30 sec - exiting 09:43:34 (2960): No heartbeat from core client for 30 sec - exiting 09:43:35 (2960): No heartbeat from core client for 30 sec - exiting 09:43:36 (2960): No heartbeat from core client for 30 sec - exiting 09:43:37 (2960): No heartbeat from core client for 30 sec - exiting 09:43:38 (2960): No heartbeat from core client for 30 sec - exiting 09:43:39 (2960): No heartbeat from core client for 30 sec - exiting 09:43:40 (2960): No heartbeat from core client for 30 sec - exiting 09:43:41 (2960): No heartbeat from core client for 30 sec - exiting 09:43:42 (2960): No heartbeat from core client for 30 sec - exiting 09:43:43 (2960): No heartbeat from core client for 30 sec - exiting 09:43:44 (2960): No heartbeat from core client for 30 sec - exiting 09:43:45 (2960): No heartbeat from core client for 30 sec - exiting 09:43:46 (2960): No heartbeat from core client for 30 sec - exiting 09:43:47 (2960): No heartbeat from core client for 30 sec - exiting 09:43:48 (2960): No heartbeat from core client for 30 sec - exiting 09:43:49 (2960): No heartbeat from core client for 30 sec - exiting 09:43:50 (2960): No heartbeat from core client for 30 sec - exiting 09:43:51 (2960): No heartbeat from core client for 30 sec - exiting 09:43:52 (2960): No heartbeat from core client for 30 sec - exiting 09:43:53 (2960): No heartbeat from core client for 30 sec - exiting 09:43:54 (2960): No heartbeat from core client for 30 sec - exiting 09:43:55 (2960): No heartbeat from core client for 30 sec - exiting 09:43:56 (2960): No heartbeat from core client for 30 sec - exiting 09:43:57 (2960): No heartbeat from core client for 30 sec - exiting 09:43:58 (2960): No heartbeat from core client for 30 sec - exiting 09:43:59 (2960): No heartbeat from core client for 30 sec - exiting 09:44:00 (2960): No heartbeat from core client for 30 sec - exiting 09:44:01 (2960): No 
heartbeat from core client for 30 sec - exiting 09:44:02 (2960): No heartbeat from core client for 30 sec - exiting 09:44:03 (2960): No heartbeat from core client for 30 sec - exiting 09:44:04 (2960): No heartbeat from core client for 30 sec - exiting 09:44:05 (2960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6020, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4480, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exilobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3852, iMonCtr=2 ting, bRetVal = 1, checkPID=0, selfPID=4872, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2024, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5112, selfPID=4828, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... SETPOS: Seek Failed: Invalid argument SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1 Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
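The Stderr field above is one long run of monitor messages followed by `<file_xfer_error>` entries. As a rough sketch for summarising such a dump, the snippet below counts three message types and extracts the failed upload names with their error codes. The function name `summarize_stderr`, the pattern labels, and the shortened `SAMPLE` string (excerpted from the log above) are illustrative assumptions, not part of any BOINC or CPDN tool.

```python
import re
from collections import Counter

# Message fragments taken from the stderr text above; the labels are illustrative.
PATTERNS = {
    "heartbeat_timeout": re.compile(r"No heartbeat from core client"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
    "model_crashed": re.compile(r"Model crashed:"),
}

# Matches one <file_name>/<error_code> pair inside a <file_xfer_error> entry.
XFER_RE = re.compile(
    r"<file_name>(.*?)</file_name>\s*<error_code>(-?\d+)</error_code>"
)


def summarize_stderr(text):
    """Return (message counts, list of (file name, error code)) for a stderr dump."""
    counts = Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})
    xfer_errors = [(name, int(code)) for name, code in XFER_RE.findall(text)]
    return counts, xfer_errors


# Shortened excerpt of the log above, used only to exercise the function.
SAMPLE = (
    "Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... "
    "09:29:46 (4904): No heartbeat from core client for 30 sec - exiting "
    "SETPOS: Seek Failed: Invalid argument "
    "Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1 "
    "<file_xfer_error> <file_name>hadam3p_eu_2jhx_1990_1_007228102_0_7.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error>"
)

if __name__ == "__main__":
    counts, xfer_errors = summarize_stderr(SAMPLE)
    print(dict(counts))
    print(xfer_errors)
```

Plain substring patterns are used because some messages in the real log are interleaved mid-line by concurrent writers, so line-based parsing would be unreliable here.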

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 May 2011 06:18:09 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 69,216 | 216,760 | 3.1316 |
24 May 2011 03:56:49 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 57,696 | 180,646 | 3.1310 |
23 May 2011 02:44:49 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 46,176 | 144,783 | 3.1355 |
19 May 2011 09:54:41 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 34,656 | 108,624 | 3.1343 |
18 May 2011 08:39:30 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 23,136 | 72,601 | 3.1380 |
17 May 2011 06:18:21 | 890420 | 12837750 | hadam3p_eu_2jhx_1990_1_007228102_0 | 11,616 | 36,927 | 3.1790 |
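
The Average (sec/TS) column in the trickle table is consistent with cumulative CPU time divided by the cumulative timestep count, and successive trickles are a fixed number of timesteps apart. The short check below, a sketch using values copied from the table, reproduces both observations. The names `TRICKLES`, `sec_per_timestep`, and `timestep_gaps` are illustrative assumptions.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table,
# listed newest first as on the page.
TRICKLES = [
    (69216, 216760, 3.1316),
    (57696, 180646, 3.1310),
    (46176, 144783, 3.1355),
    (34656, 108624, 3.1343),
    (23136, 72601, 3.1380),
    (11616, 36927, 3.1790),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by cumulative timesteps."""
    return cpu_time_sec / timestep


def timestep_gaps(rows):
    """Timestep difference between each trickle and the next older one."""
    steps = [timestep for timestep, _, _ in rows]
    return [newer - older for newer, older in zip(steps, steps[1:])]


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        print(timestep, round(sec_per_timestep(cpu, timestep), 4), reported)
    print(timestep_gaps(TRICKLES))
```

The computed ratios agree with the reported column to four decimal places, and every gap between trickles is 11,520 timesteps.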
©2024 climateprediction.net