Name | hadcm3n_zlrm_1880_40_008239421_2 |
Workunit | 8394545 |
Created | 18 Nov 2012, 10:51:40 UTC |
Sent | 18 Nov 2012, 10:51:47 UTC |
Report deadline | 17 Feb 2013, 18:18:58 UTC |
Received | 1 Jan 2013, 16:14:09 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1137520 |
Run time | 5 days 17 hours 39 min 22 sec |
CPU time | 5 days 1 hours 27 min 26 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 3.02 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3484, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2716, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1212, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3156, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5948, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4476, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3636, iMonCtr=1 Model crash detected, will try to restart... 
17:21:39 (2500): No heartbeat from core client for 30 sec - exiting 17:21:40 (2500): No heartbeat from core client for 30 sec - exiting 17:21:41 (2500): No heartbeat from core client for 30 sec - exiting 17:21:42 (2500): No heartbeat from core client for 30 sec - exiting 17:21:43 (2500): No heartbeat from core client for 30 sec - exiting 17:21:44 (2500): No heartbeat from core client for 30 sec - exiting 17:21:45 (2500): No heartbeat from core client for 30 sec - exiting 17:21:46 (2500): No heartbeat from core client for 30 sec - exiting 17:21:47 (2500): No heartbeat from core client for 30 sec - exiting 17:21:48 (2500): No heartbeat from core client for 30 sec - exiting 17:21:49 (2500): No heartbeat from core client for 30 sec - exiting 17:21:50 (2500): No heartbeat from core client for 30 sec - exiting 17:21:51 (2500): No heartbeat from core client for 30 sec - exiting 17:21:52 (2500): No heartbeat from core client for 30 sec - exiting 17:21:53 (2500): No heartbeat from core client for 30 sec - exiting 17:21:54 (2500): No heartbeat from core client for 30 sec - exiting 17:21:55 (2500): No heartbeat from core client for 30 sec - exiting 17:21:56 (2500): No heartbeat from core client for 30 sec - exiting 17:21:57 (2500): No heartbeat from core client for 30 sec - exiting 17:21:58 (2500): No heartbeat from core client for 30 sec - exiting 17:21:59 (2500): No heartbeat from core client for 30 sec - exiting 17:22:00 (2500): No heartbeat from core client for 30 sec - exiting 17:22:01 (2500): No heartbeat from core client for 30 sec - exiting 17:22:02 (2500): No heartbeat from core client for 30 sec - exiting 17:22:03 (2500): No heartbeat from core client for 30 sec - exiting 17:22:04 (2500): No heartbeat from core client for 30 sec - exiting 17:22:05 (2500): No heartbeat from core client for 30 sec - exiting 17:22:06 (2500): No heartbeat from core client for 30 sec - exiting 17:22:07 (2500): No heartbeat from core client for 30 sec - exiting 17:22:08 (2500): No 
heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:33:45 (3360): No heartbeat from core client for 30 sec - exiting 16:33:46 (3360): No heartbeat from core client for 30 sec - exiting 16:33:47 (3360): No heartbeat from core client for 30 sec - exiting 16:33:48 (3360): No heartbeat from core client for 30 sec - exiting 16:33:50 (3360): No heartbeat from core client for 30 sec - exiting 16:33:51 (3360): No heartbeat from core client for 30 sec - exiting 16:33:52 (3360): No heartbeat from core client for 30 sec - exiting 16:33:53 (3360): No heartbeat from core client for 30 sec - exiting 16:33:54 (3360): No heartbeat from core client for 30 sec - exiting 16:33:55 (3360): No heartbeat from core client for 30 sec - exiting 16:33:56 (3360): No heartbeat from core client for 30 sec - exiting 16:33:57 (3360): No heartbeat from core client for 30 sec - exiting 16:33:58 (3360): No heartbeat from core client for 30 sec - exiting 16:33:59 (3360): No heartbeat from core client for 30 sec - exiting 16:34:00 (3360): No heartbeat from core client for 30 sec - exiting 16:34:01 (3360): No heartbeat from core client for 30 sec - exiting 16:34:02 (3360): No heartbeat from core client for 30 sec - exiting 16:34:03 (3360): No heartbeat from core client for 30 sec - exiting 16:34:04 (3360): No heartbeat from core client for 30 sec - exiting 16:34:05 (3360): No heartbeat from core client for 30 sec - exiting 16:34:06 (3360): No heartbeat from core client for 30 sec - exiting 16:34:07 (3360): No heartbeat from core client for 30 sec - exiting 16:34:08 (3360): No heartbeat from core client for 30 sec - exiting 16:34:09 (3360): No heartbeat from core client for 30 sec - exiting 16:34:10 (3360): No heartbeat from core client for 30 sec - exiting 16:34:11 (3360): No heartbeat from core client for 30 sec - exiting 16:34:12 (3360): No heartbeat from core client for 30 sec - exiting 16:34:13 (3360): No heartbeat from core client for 30 sec - exiting 
16:34:14 (3360): No heartbeat from core client for 30 sec - exiting 16:34:15 (3360): No heartbeat from core client for 30 sec - exiting 16:34:16 (3360): No heartbeat from core client for 30 sec - exiting 16:34:17 (3360): No heartbeat from core client for 30 sec - exiting 16:34:18 (3360): No heartbeat from core client for 30 sec - exiting 16:34:19 (3360): No heartbeat from core client for 30 sec - exiting 16:34:20 (3360): No heartbeat from core client for 30 sec - exiting 16:34:21 (3360): No heartbeat from core client for 30 sec - exiting 16:34:22 (3360): No heartbeat from core client for 30 sec - exiting 16:34:23 (3360): No heartbeat from core client for 30 sec - exiting 16:34:24 (3360): No heartbeat from core client for 30 sec - exiting 16:34:25 (3360): No heartbeat from core client for 30 sec - exiting 16:34:26 (3360): No heartbeat from core client for 30 sec - exiting 16:34:27 (3360): No heartbeat from core client for 30 sec - exiting 16:34:28 (3360): No heartbeat from core client for 30 sec - exiting 16:34:29 (3360): No heartbeat from core client for 30 sec - exiting 16:34:30 (3360): No heartbeat from core client for 30 sec - exiting 16:34:31 (3360): No heartbeat from core client for 30 sec - exiting 16:34:32 (3360): No heartbeat from core client for 30 sec - exiting 16:34:33 (3360): No heartbeat from core client for 30 sec - exiting 16:34:34 (3360): No heartbeat from core client for 30 sec - exiting 16:34:35 (3360): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:17:43 (3400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3056, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3604, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5876, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6052, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1548, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3340, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 13:51:29 (3708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3880, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=784, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3192, iMonCtr=1 Model crash detected, will try to restart... 
16:42:16 (1620): No heartbeat from core client for 30 sec - exiting 16:42:17 (1620): No heartbeat from core client for 30 sec - exiting 16:42:18 (1620): No heartbeat from core client for 30 sec - exiting 16:42:19 (1620): No heartbeat from core client for 30 sec - exiting 16:42:20 (1620): No heartbeat from core client for 30 sec - exiting 16:42:21 (1620): No heartbeat from core client for 30 sec - exiting 16:42:22 (1620): No heartbeat from core client for 30 sec - exiting 16:42:23 (1620): No heartbeat from core client for 30 sec - exiting 16:42:24 (1620): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C15:30:44 (2116): No heartbeat from core client for 30 sec - exiting 15:30:45 (2116): No heartbeat from core client for 30 sec - exiting 15:30:46 (2116): No heartbeat from core client for 30 sec - exiting 15:30:47 (2116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:56:43 (3896): No heartbeat from core client for 30 sec - exiting 15:56:44 (3896): No heartbeat from core client for 30 sec - exiting 15:56:45 (3896): No heartbeat from core client for 30 sec - exiting 15:56:47 (3896): No heartbeat from core client for 30 sec - exiting 15:56:48 (3896): No heartbeat from core client for 30 sec - exiting 15:56:49 (3896): No heartbeat from core client for 30 sec - exiting 15:56:50 (3896): No heartbeat from core client for 30 sec - exiting 15:56:51 (3896): No heartbeat from core client for 30 sec - exiting 15:56:52 (3896): No heartbeat from core client for 30 sec - exiting 15:56:53 (3896): No heartbeat from core client for 30 sec - exiting 15:56:54 (3896): No heartbeat from core client for 30 sec - exiting 15:56:55 (3896): No heartbeat from core client for 30 sec - exiting 15:56:56 (3896): No heartbeat from core client for 30 sec - exiting 15:56:57 (3896): No heartbeat from core client for 30 sec - exiting 15:56:58 (3896): No heartbeat from core client for 30 sec - exiting 
15:56:59 (3896): No heartbeat from core client for 30 sec - exiting 15:57:00 (3896): No heartbeat from core client for 30 sec - exiting 15:57:01 (3896): No heartbeat from core client for 30 sec - exiting 15:57:02 (3896): No heartbeat from core client for 30 sec - exiting 15:57:03 (3896): No heartbeat from core client for 30 sec - exiting 15:57:04 (3896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=156, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1376, iMonCtr=1 Model crash detected, will try to restart... 13:47:27 (3332): No heartbeat from core client for 30 sec - exiting 13:47:28 (3332): No heartbeat from core client for 30 sec - exiting 13:47:29 (3332): No heartbeat from core client for 30 sec - exiting 13:47:30 (3332): No heartbeat from core client for 30 sec - exiting 13:47:31 (3332): No heartbeat from core client for 30 sec - exiting 13:47:32 (3332): No heartbeat from core client for 30 sec - exiting 13:47:33 (3332): No heartbeat from core client for 30 sec - exiting 13:47:34 (3332): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1740, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3764, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77BB5EAB read attempt to address 0x40317F1F Engaging BOINC Windows Runtime Debugger... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
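
A rough way to summarise a stderr dump like the one above is to count the recurring message types. The following is a minimal sketch, assuming the text has already been pulled out of the Stderr cell into a string. `tally_stderr` and the pattern labels are hypothetical names, not part of BOINC or CPDN tooling, and the sample is a handful of messages copied from the log above rather than the full dump.

```python
import re
from collections import Counter

# Message fragments copied verbatim from the stderr text; the labels are hypothetical.
PATTERNS = {
    "model_crash": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client for 30 sec - exiting",
    "quit_request": r"CPDN Monitor - Quit request from BOINC",
    "suspend_request": r"CPDN Monitor - Suspend request from BOINC",
}

def tally_stderr(text: str) -> Counter:
    """Count how often each known message fragment occurs in the text."""
    return Counter({label: len(re.findall(pattern, text))
                    for label, pattern in PATTERNS.items()})

# A short excerpt assembled from messages in the log above (not the full dump).
sample = (
    "Model crash detected, will try to restart... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "17:21:39 (2500): No heartbeat from core client for 30 sec - exiting "
    "17:21:40 (2500): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - Quit request from BOINC..."
)

for label, count in tally_stderr(sample).items():
    print(label, count)
```

Run against the full stderr text, the same tally would show how the restarts, suspend and quit requests, and heartbeat timeouts are distributed before the final access violation.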
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
30 Dec 2012 16:11:08 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 259,200 | 430,646 | 1.6614 |
20 Dec 2012 21:24:59 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 233,280 | 385,764 | 1.6537 |
16 Dec 2012 11:20:02 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 207,360 | 342,861 | 1.6535 |
15 Dec 2012 10:53:58 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 181,440 | 298,633 | 1.6459 |
15 Dec 2012 10:53:58 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 155,520 | 256,856 | 1.6516 |
15 Dec 2012 10:53:58 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 129,600 | 214,418 | 1.6545 |
06 Dec 2012 12:14:01 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 103,680 | 169,825 | 1.6380 |
30 Nov 2012 17:24:49 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 77,760 | 126,202 | 1.6230 |
27 Nov 2012 19:31:45 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 51,840 | 83,899 | 1.6184 |
21 Nov 2012 18:05:15 | 1137520 | 15438433 | hadcm3n_zlrm_1880_40_008239421_2 | 25,920 | 41,264 | 1.5920 |
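
The Average (sec/TS) column appears to be cumulative CPU time divided by the timestep count. The following is a minimal sketch that checks this reading against three rows of the table above; the function name `sec_per_timestep` is hypothetical and is not taken from any CPDN code.

```python
def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)

# (timestep, CPU time in seconds, reported average) copied from the trickle table.
rows = [
    (259_200, 430_646, 1.6614),
    (129_600, 214_418, 1.6545),
    (25_920, 41_264, 1.5920),
]

for timestep, cpu_time, reported in rows:
    computed = sec_per_timestep(cpu_time, timestep)
    print(timestep, computed, reported)
```

For these rows the computed value agrees with the reported average to four decimal places, which supports the division reading of the column.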
©2024 cpdn.org