Name | hadcm3n_yf0h_1980_40_007959431_4 |
Workunit | 8114543 |
Created | 15 May 2012, 18:35:30 UTC |
Sent | 15 May 2012, 18:36:11 UTC |
Report deadline | 15 Aug 2012, 2:03:22 UTC |
Received | 17 Sep 2012, 18:36:19 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1163731 |
Run time | 16 days 6 hours 6 min 13 sec |
CPU time | 12 days 10 hours 23 min 1 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 2.58 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> 17:40:05 (4444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 19:17:11 (3556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2400, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3544, iMonCtr=1 Model crash detected, will try to restart... 19:54:15 (3544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1580, iMonCtr=1 Model crash detected, will try to restart... 18:41:43 (1388): No heartbeat from core client for 30 sec - exiting 18:41:44 (1388): No heartbeat from core client for 30 sec - exiting 18:41:45 (1388): No heartbeat from core client for 30 sec - exiting 18:41:46 (1388): No heartbeat from core client for 30 sec - exiting 18:41:47 (1388): No heartbeat from core client for 30 sec - exiting 18:41:48 (1388): No heartbeat from core client for 30 sec - exiting 18:41:49 (1388): No heartbeat from core client for 30 sec - exiting 18:41:50 (1388): No heartbeat from core client for 30 sec - exiting 18:41:51 (1388): No heartbeat from core client for 30 sec - exiting 18:41:52 (1388): No heartbeat from core client for 30 sec - exiting 18:41:53 (1388): No heartbeat from core client for 30 sec - exiting 18:41:54 (1388): No heartbeat from core client for 30 sec - exiting 18:41:56 (1388): No heartbeat from core client for 30 sec - exiting 18:41:57 (1388): No heartbeat from core client for 30 sec - exiting 18:41:58 (1388): No heartbeat from core client for 30 
sec - exiting 18:41:59 (1388): No heartbeat from core client for 30 sec - exiting 18:42:00 (1388): No heartbeat from core client for 30 sec - exiting 18:42:01 (1388): No heartbeat from core client for 30 sec - exiting 18:42:02 (1388): No heartbeat from core client for 30 sec - exiting 18:42:03 (1388): No heartbeat from core client for 30 sec - exiting 18:42:04 (1388): No heartbeat from core client for 30 sec - exiting 18:42:05 (1388): No heartbeat from core client for 30 sec - exiting 18:42:06 (1388): No heartbeat from core client for 30 sec - exiting 18:42:07 (1388): No heartbeat from core client for 30 sec - exiting 18:42:08 (1388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:37:22 (3992): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C05:57:47 (3836): No heartbeat from core client for 30 sec - exiting 05:57:48 (3836): No heartbeat from core client for 30 sec - exiting 05:57:49 (3836): No heartbeat from core client for 30 sec - exiting 05:57:50 (3836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:57:51 (3836): No heartbeat from core client for 30 sec - exiting 13:01:53 (3300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
C18:44:09 (1932): No heartbeat from core client for 30 sec - exiting 18:44:12 (1932): No heartbeat from core client for 30 sec - exiting 18:44:13 (1932): No heartbeat from core client for 30 sec - exiting 18:44:14 (1932): No heartbeat from core client for 30 sec - exiting 18:44:15 (1932): No heartbeat from core client for 30 sec - exiting 18:44:16 (1932): No heartbeat from core client for 30 sec - exiting 18:44:17 (1932): No heartbeat from core client for 30 sec - exiting 18:44:18 (1932): No heartbeat from core client for 30 sec - exiting 18:44:19 (1932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1 Model crash detected, will try to restart... 19:35:12 (3168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3640, iMonCtr=1 Model crash detected, will try to restart... 21:27:35 (3640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3624, iMonCtr=1 Model crash detected, will try to restart... C19:48:51 (3680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CoSuspended CPDN Monitor - Suspend request from BOINC... CCPDN Monitor - Quit request from BOINC... 11:49:20 (3184): No heartbeat from core client for 30 sec - exiting 11:49:21 (3184): No heartbeat from core client for 30 sec - exiting 11:49:22 (3184): No heartbeat from core client for 30 sec - exiting 11:49:23 (3184): No heartbeat from core client for 30 sec - exiting 11:49:24 (3184): No heartbeat from core client for 30 sec - exiting 11:49:25 (3184): No heartbeat from core client for 30 sec - exiting 11:49:26 (3184): No heartbeat from core client for 30 sec - exiting 11:49:27 (3184): No heartbeat from core client for 30 sec - exiting 11:49:28 (3184): No heartbeat from core client for 30 sec - exiting 11:49:29 (3184): No heartbeat from core client for 30 sec - exiting 11:49:31 (3184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:49:32 (3184): No heartbeat from core client for 30 sec - exiting 11:49:33 (3184): No heartbeat from core client for 30 sec - exiting 14:05:00 (4608): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3792, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 17:38:40 (2880): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... C09:59:30 (3964): No heartbeat from core client for 30 sec - exiting 09:59:31 (3964): No heartbeat from core client for 30 sec - exiting 09:59:32 (3964): No heartbeat from core client for 30 sec - exiting 09:59:33 (3964): No heartbeat from core client for 30 sec - exiting 09:59:35 (3964): No heartbeat from core client for 30 sec - exiting 09:59:36 (3964): No heartbeat from core client for 30 sec - exiting 09:59:37 (3964): No heartbeat from core client for 30 sec - exiting 09:59:38 (3964): No heartbeat from core client for 30 sec - exiting 09:59:39 (3964): No heartbeat from core client for 30 sec - exiting 09:59:40 (3964): No heartbeat from core client for 30 sec - exiting 09:59:41 (3964): No heartbeat from core client for 30 sec - exiting 09:59:42 (3964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:59:43 (3964): No heartbeat from core client for 30 sec - exiting C16:09:04 (3308): No heartbeat from core client for 30 sec - exiting 16:09:20 (3308): No heartbeat from core client for 30 sec - exiting 16:09:21 (3308): No heartbeat from core client for 30 sec - exiting 16:09:22 (3308): No heartbeat from core client for 30 sec - exiting 16:09:23 (3308): No heartbeat from core client for 30 sec - exiting 16:09:25 (3308): No heartbeat from core client for 30 sec - exiting 16:09:26 (3308): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77E13AB3 read attempt to address 0x40C35CEC Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yf0h_1980_40_007959431/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
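The stderr output above is long and repetitive, so a tally of message types can make it easier to read. The following is a minimal sketch, not part of BOINC or CPDN tooling: the `PATTERNS` table, its category names, and the `tally_stderr` helper are hypothetical and introduced only for illustration. The sample string is a short excerpt copied from the start of the log.

```python
import re
from collections import Counter

# Hypothetical message categories; the regexes match phrases that
# appear verbatim in the stderr text of this result.
PATTERNS = {
    "no_heartbeat": re.compile(r"\(\d+\): No heartbeat from core client"),
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "access_violation": re.compile(r"Access Violation \(0xc0000005\)"),
}


def tally_stderr(text: str) -> Counter:
    """Count how often each known message type occurs in the stderr text."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})


# Short excerpt from the beginning of the stderr output above.
sample = (
    "17:40:05 (4444): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "19:17:11 (3556): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=2400, iMonCtr=1 "
    "Model crash detected, will try to restart..."
)

counts = tally_stderr(sample)
for name, n in counts.items():
    print(f"{name}: {n}")
```

Running the same function over the full stderr text would show how many heartbeat losses and model restarts preceded the final access violation. The counts say nothing about the cause of any of them.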
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
17 Sep 2012 17:54:09 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 518,400 | 1,074,178 | 2.0721 |
15 Sep 2012 19:24:20 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 492,480 | 1,019,909 | 2.0710 |
13 Sep 2012 14:08:36 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 466,560 | 965,552 | 2.0695 |
11 Sep 2012 19:28:10 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 440,640 | 911,144 | 2.0678 |
09 Sep 2012 18:45:06 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 414,720 | 856,892 | 2.0662 |
02 Sep 2012 12:56:34 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 388,800 | 803,242 | 2.0660 |
30 Aug 2012 18:44:05 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 362,880 | 748,145 | 2.0617 |
25 Aug 2012 14:25:51 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 336,960 | 694,028 | 2.0597 |
13 Aug 2012 19:36:58 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 311,040 | 638,733 | 2.0535 |
09 Aug 2012 16:53:29 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 285,120 | 585,207 | 2.0525 |
04 Aug 2012 07:55:25 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 259,200 | 532,386 | 2.0540 |
29 Jul 2012 18:51:56 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 233,280 | 478,138 | 2.0496 |
22 Jul 2012 18:47:21 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 207,360 | 424,798 | 2.0486 |
18 Jul 2012 18:29:32 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 181,440 | 372,028 | 2.0504 |
14 Jul 2012 07:04:40 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 155,520 | 318,292 | 2.0466 |
17 Jun 2012 19:43:29 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 129,600 | 264,331 | 2.0396 |
06 Jun 2012 18:39:51 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 103,680 | 214,363 | 2.0675 |
21 May 2012 18:31:51 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 77,760 | 161,693 | 2.0794 |
19 May 2012 16:23:51 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 51,840 | 106,487 | 2.0541 |
18 May 2012 13:22:45 | 1163731 | 14671029 | hadcm3n_yf0h_1980_40_007959431_4 | 25,920 | 53,853 | 2.0777 |
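The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. This is an inference from the numbers in the table, not something the page states. The sketch below checks it against three rows; the helper names `parse_int` and `average_sec_per_timestep` are hypothetical.

```python
def parse_int(field: str) -> int:
    """Convert a table field such as '1,074,178' to an integer."""
    return int(field.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: int, timesteps: int) -> float:
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timesteps


# (Timestep, CPU Time (sec), Average (sec/TS)) copied from the table above.
rows = [
    ("518,400", "1,074,178", 2.0721),
    ("492,480", "1,019,909", 2.0710),
    ("25,920", "53,853", 2.0777),
]

results = []
for timestep, cpu_time, reported in rows:
    computed = average_sec_per_timestep(parse_int(cpu_time), parse_int(timestep))
    results.append((computed, reported))
    print(f"{timestep:>8} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

For these rows the computed value agrees with the reported one to four decimal places, which supports the assumed definition.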
©2024 cpdn.org