Task 13828527

Name hadcm3n_ycjm_1940_40_007618960_4
Workunit 7797136
Created 29 Dec 2011, 5:45:15 UTC
Sent 29 Dec 2011, 5:45:23 UTC
Report deadline 29 Mar 2012, 13:12:34 UTC
Received 5 Mar 2012, 9:50:31 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1115875
Run time 23 days 18 hours 35 min 37 sec
CPU time 20 days 10 hours 11 min 2 sec
Validate state Invalid
Credit 12,441.60
Device peak FLOPS 2.40 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3516, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5784, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
05:56:08 (2504): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:06:24 (5320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:44:01 (4136): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1
Model crash detected, will try to restart...
06:16:32 (2916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:02:34 (2480): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4116, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=1
Model crash detected, will try to restart...
06:57:58 (2336): No heartbeat from core client for 30 sec - exiting
06:57:59 (2336): No heartbeat from core client for 30 sec - exiting
06:58:00 (2336): No heartbeat from core client for 30 sec - exiting
06:58:01 (2336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1
Model crash detected, will try to restart...
16:17:22 (5888): No heartbeat from core client for 30 sec - exiting
16:17:23 (5888): No heartbeat from core client for 30 sec - exiting
16:17:24 (5888): No heartbeat from core client for 30 sec - exiting
16:17:25 (5888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1236, iMonCtr=1
Model crash detected, will try to restart...
22:05:10 (4608): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5828, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
18:06:39 (4768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5656, iMonCtr=1
Model crash detected, will try to restart...
05:55:57 (5876): No heartbeat from core client for 30 sec - exiting
05:55:58 (5876): No heartbeat from core client for 30 sec - exiting
05:55:59 (5876): No heartbeat from core client for 30 sec - exiting
05:56:00 (5876): No heartbeat from core client for 30 sec - exiting
05:56:01 (5876): No heartbeat from core client for 30 sec - exiting
05:56:02 (5876): No heartbeat from core client for 30 sec - exiting
05:56:03 (5876): No heartbeat from core client for 30 sec - exiting
05:56:04 (5876): No heartbeat from core client for 30 sec - exiting
05:56:05 (5876): No heartbeat from core client for 30 sec - exiting
05:56:06 (5876): No heartbeat from core client for 30 sec - exiting
05:56:07 (5876): No heartbeat from core client for 30 sec - exiting
05:56:08 (5876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:56:09 (5876): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5256, iMonCtr=1
Model crash detected, will try to restart...
06:03:21 (4704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3284, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
07:28:05 (4424): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4972, iMonCtr=1
Model crash detected, will try to restart...
05:57:59 (6024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:24:31 (6012): No heartbeat from core client for 30 sec - exiting
07:24:33 (6012): No heartbeat from core client for 30 sec - exiting
07:24:34 (6012): No heartbeat from core client for 30 sec - exiting
07:24:35 (6012): No heartbeat from core client for 30 sec - exiting
07:24:36 (6012): No heartbeat from core client for 30 sec - exiting
07:24:37 (6012): No heartbeat from core client for 30 sec - exiting
07:24:38 (6012): No heartbeat from core client for 30 sec - exiting
07:24:39 (6012): No heartbeat from core client for 30 sec - exiting
07:24:40 (6012): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:24:41 (6012): No heartbeat from core client for 30 sec - exiting
Atmos Hold Restart file rename failed on atmos_restart.hold
17:59:51 (3920): No heartbeat from core client for 30 sec - exiting
17:59:52 (3920): No heartbeat from core client for 30 sec - exiting
17:59:53 (3920): No heartbeat from core client for 30 sec - exiting
17:59:54 (3920): No heartbeat from core client for 30 sec - exiting
17:59:55 (3920): No heartbeat from core client for 30 sec - exiting
17:59:56 (3920): No heartbeat from core client for 30 sec - exiting
17:59:57 (3920): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=1
Model crash detected, will try to restart...
05:59:50 (5920): No heartbeat from core client for 30 sec - exiting
05:59:51 (5920): No heartbeat from core client for 30 sec - exiting
05:59:52 (5920): No heartbeat from core client for 30 sec - exiting
05:59:53 (5920): No heartbeat from core client for 30 sec - exiting
05:59:54 (5920): No heartbeat from core client for 30 sec - exiting
05:59:55 (5920): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:59:56 (5920): No heartbeat from core client for 30 sec - exiting
06:14:01 (5276): No heartbeat from core client for 30 sec - exiting
06:14:02 (5276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:14:03 (5276): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5284, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3556, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4212, iMonCtr=1
Model crash detected, will try to restart...
16:55:43 (4412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4432, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5828, iMonCtr=1
Model crash detected, will try to restart...
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
12:06:39 (3404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x771D331F read attempt to address 0x00000004

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_ycjm_1940_40_007618960/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
05 Mar 2012 09:53:56 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 1,036,800 | 1,764,653 | 1.7020
04 Mar 2012 07:37:21 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 1,010,880 | 1,718,775 | 1.7003
03 Mar 2012 07:37:39 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 984,960 | 1,711,704 | 1.7378
26 Feb 2012 09:59:34 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 959,040 | 1,666,821 | 1.7380
24 Feb 2012 17:55:43 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 933,120 | 1,620,832 | 1.7370
23 Feb 2012 16:19:20 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 907,200 | 1,575,080 | 1.7362
21 Feb 2012 08:57:24 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 881,280 | 1,529,850 | 1.7359
20 Feb 2012 07:10:02 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 855,360 | 1,485,080 | 1.7362
18 Feb 2012 20:36:38 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 829,440 | 1,440,819 | 1.7371
18 Feb 2012 05:32:22 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 803,520 | 1,395,850 | 1.7372
16 Feb 2012 07:37:49 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 777,600 | 1,351,491 | 1.7380
14 Feb 2012 21:14:42 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 751,680 | 1,307,381 | 1.7393
12 Feb 2012 16:47:17 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 725,760 | 1,263,024 | 1.7403
11 Feb 2012 12:49:11 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 699,840 | 1,218,512 | 1.7411
10 Feb 2012 06:10:34 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 673,920 | 1,174,046 | 1.7421
06 Feb 2012 19:26:25 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 648,000 | 1,129,323 | 1.7428
06 Feb 2012 05:41:07 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 622,080 | 1,084,396 | 1.7432
04 Feb 2012 20:41:40 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 596,160 | 1,039,912 | 1.7444
04 Feb 2012 06:49:05 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 570,240 | 995,327 | 1.7455
03 Feb 2012 08:52:45 | 1115875 | 13828527 | hadcm3n_ycjm_1940_40_007618960_4 | 544,320 | 951,348 | 1.7478
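
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count. The page does not state this; it is inferred from the numbers. A minimal Python sketch checking that assumption against three rows copied from the table (the function name seconds_per_timestep is hypothetical, chosen for illustration):

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU seconds, reported average sec/TS).
TRICKLES = [
    (1_036_800, 1_764_653, 1.7020),
    (1_010_880, 1_718_775, 1.7003),
    (544_320, 951_348, 1.7478),
]


def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition: cumulative CPU seconds divided by timesteps."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in TRICKLES:
        computed = seconds_per_timestep(timestep, cpu_seconds)
        # Compare the recomputed value with the figure shown on the page.
        print(f"{timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}")
```

If the assumption holds, each computed value matches the reported one to four decimal places. Consecutive rows in the table are 25,920 timesteps apart.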

