Name | hadcm3n_39jn_1980_40_008317779_0 |
Workunit | 8468914 |
Created | 24 Feb 2013, 2:39:23 UTC |
Sent | 24 Feb 2013, 2:39:26 UTC |
Report deadline | 26 May 2013, 10:06:37 UTC |
Received | 20 Mar 2013, 23:16:57 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1127376 |
Run time | 22 days 23 hours 42 min 18 sec |
CPU time | 15 days 8 hours 3 min 3 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 1.39 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3252, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3128, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2772, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 02:16:18 (3636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:16:19 (3636): No heartbeat from core client for 30 sec - exiting 02:16:20 (3636): No heartbeat from core client for 30 sec - exiting 02:16:21 (3636): No heartbeat from core client for 30 sec - exiting 02:16:22 (3636): No heartbeat from core client for 30 sec - exiting 02:16:23 (3636): No heartbeat from core client for 30 sec - exiting 02:16:25 (3636): No heartbeat from core client for 30 sec - exiting 02:16:26 (3636): No heartbeat from core client for 30 sec - exiting 02:16:27 (3636): No heartbeat from core client for 30 sec - exiting 02:16:28 (3636): No heartbeat from core client for 30 sec - exiting 02:16:29 (3636): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4788, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3260, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3260, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4172, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2920, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2112, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1 Model crash detected, will try to restart... 10:13:33 (4720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:13:34 (4720): No heartbeat from core client for 30 sec - exiting 10:13:35 (4720): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=100120, iMonCtr=1 Model crash detected, will try to restart... 10:46:50 (4832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:46:51 (4832): No heartbeat from core client for 30 sec - exiting 10:46:53 (4832): No heartbeat from core client for 30 sec - exiting 10:46:54 (4832): No heartbeat from core client for 30 sec - exiting 10:46:55 (4832): No heartbeat from core client for 30 sec - exiting 23:02:22 (58400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
23:02:23 (58400): No heartbeat from core client for 30 sec - exiting 23:02:24 (58400): No heartbeat from core client for 30 sec - exiting 23:02:25 (58400): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4900, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1648, iMonCtr=1 Model crash detected, will try to restart... 06:34:37 (4836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 06:34:38 (4836): No heartbeat from core client for 30 sec - exiting 06:34:40 (4836): No heartbeat from core client for 30 sec - exiting 06:34:41 (4836): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17580, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Mar 2013 23:14:42 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 518,400 | 1,324,862 | 2.5557 |
18 Mar 2013 17:48:39 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 492,480 | 1,258,288 | 2.5550 |
17 Mar 2013 21:34:54 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 466,560 | 1,192,290 | 2.5555 |
16 Mar 2013 20:50:06 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 440,640 | 1,126,564 | 2.5567 |
15 Mar 2013 16:40:47 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 414,720 | 1,060,629 | 2.5575 |
14 Mar 2013 12:25:50 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 388,800 | 994,376 | 2.5576 |
13 Mar 2013 06:44:24 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 362,880 | 928,670 | 2.5592 |
12 Mar 2013 03:09:55 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 336,960 | 862,931 | 2.5609 |
10 Mar 2013 21:59:11 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 311,040 | 796,354 | 2.5603 |
08 Mar 2013 22:04:18 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 285,120 | 730,549 | 2.5623 |
07 Mar 2013 03:48:47 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 259,200 | 664,307 | 2.5629 |
06 Mar 2013 01:49:12 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 233,280 | 599,014 | 2.5678 |
04 Mar 2013 19:38:38 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 207,360 | 530,711 | 2.5594 |
03 Mar 2013 16:15:46 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 181,440 | 464,517 | 2.5602 |
02 Mar 2013 14:35:30 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 155,520 | 396,366 | 2.5486 |
01 Mar 2013 10:07:47 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 129,600 | 330,879 | 2.5531 |
28 Feb 2013 18:35:49 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 103,680 | 265,758 | 2.5633 |
27 Feb 2013 05:36:05 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 77,760 | 200,123 | 2.5736 |
26 Feb 2013 04:59:07 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 51,840 | 135,239 | 2.6088 |
25 Feb 2013 03:53:01 | 1127376 | 15631622 | hadcm3n_39jn_1980_40_008317779_0 | 25,920 | 69,813 | 2.6934 |
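
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The following is a minimal sketch that checks this reading against three rows copied from the table above. The function names `seconds_per_timestep` and `marginal_seconds_per_timestep` are hypothetical helpers introduced for illustration and are not part of any CPDN or BOINC tooling.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (518_400, 1_324_862, 2.5557),  # 19 Mar 2013
    (259_200, 664_307, 2.5629),    # 07 Mar 2013
    (25_920, 69_813, 2.6934),      # 25 Feb 2013
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative average: total CPU seconds divided by total timesteps."""
    return round(cpu_time_sec / timestep, 4)


def marginal_seconds_per_timestep(newer, older):
    """Average cost of only the timesteps between two trickles.

    Each argument is a (timestep, cpu_time_sec) pair; `newer` must be
    the later trickle.
    """
    d_steps = newer[0] - older[0]
    d_cpu = newer[1] - older[1]
    return d_cpu / d_steps


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = seconds_per_timestep(cpu, timestep)
        print(f"{timestep:>8} TS  computed={computed:.4f}  reported={reported:.4f}")

    # Cost of the final 25,920 timesteps (18 Mar -> 19 Mar rows of the table).
    last = marginal_seconds_per_timestep((518_400, 1_324_862), (492_480, 1_258_288))
    print(f"last interval: {last:.4f} sec/TS")
```

Because the column is cumulative, it smooths out per-interval variation; the marginal helper gives the cost of a single 25,920-timestep interval when that is the quantity of interest.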
©2024 cpdn.org