Name | hadcm3n_t2p0_1940_40_007443148_3 |
Workunit | 7640651 |
Created | 20 Sep 2011, 15:21:23 UTC |
Sent | 20 Sep 2011, 15:22:21 UTC |
Report deadline | 20 Dec 2011, 22:49:32 UTC |
Received | 30 Oct 2011, 10:20:10 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 859264 |
Run time | 15 days 2 hours 21 min 59 sec |
CPU time | 13 days 2 hours 29 min 17 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.77 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4172, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2768, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4876, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4168, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4744, iMonCtr=1 Model crash detected, will try to restart... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/t2p0ko.pjg2c10 Error converting file to netcdf: dataout/t2p0ko.pig2c10 Error converting file to netcdf: dataout/t2p0ko.pfg2c10 Error converting file to netcdf: dataout/t2p0ka.phg2c10 Error converting file to netcdf: dataout/t2p0ka.pgg2c10 Error converting file to netcdf: dataout/t2p0ka.peg2c10 Error converting file to netcdf: dataout/t2p0ka.pdg2c10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4688, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4520, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77CD6E0F read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... 
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t2p0_1940_40_007443148/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
31 Oct 2011 18:50:47 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 777,600 | 1,132,138 | 1.4559 |
31 Oct 2011 18:15:54 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 751,680 | 1,094,887 | 1.4566 |
31 Oct 2011 17:41:53 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 725,760 | 1,056,909 | 1.4563 |
31 Oct 2011 17:11:14 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 699,840 | 1,019,181 | 1.4563 |
31 Oct 2011 16:30:29 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 673,920 | 980,950 | 1.4556 |
31 Oct 2011 14:03:18 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 648,000 | 942,223 | 1.4540 |
31 Oct 2011 14:03:17 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 622,080 | 904,179 | 1.4535 |
31 Oct 2011 14:03:16 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 596,160 | 866,123 | 1.4528 |
31 Oct 2011 14:03:15 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 570,240 | 828,118 | 1.4522 |
18 Oct 2011 21:11:48 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 544,320 | 789,688 | 1.4508 |
17 Oct 2011 21:14:27 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 518,400 | 751,102 | 1.4489 |
17 Oct 2011 10:17:37 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 492,480 | 713,126 | 1.4480 |
15 Oct 2011 14:58:06 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 466,560 | 676,311 | 1.4496 |
13 Oct 2011 21:39:13 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 440,640 | 639,184 | 1.4506 |
12 Oct 2011 21:57:37 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 414,720 | 602,173 | 1.4520 |
12 Oct 2011 10:21:47 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 388,800 | 564,539 | 1.4520 |
09 Oct 2011 10:33:17 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 362,880 | 526,581 | 1.4511 |
07 Oct 2011 18:42:35 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 336,960 | 488,266 | 1.4490 |
06 Oct 2011 18:27:47 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 311,040 | 451,274 | 1.4509 |
05 Oct 2011 18:33:35 | 859264 | 13402769 | hadcm3n_t2p0_1940_40_007443148_3 | 285,120 | 414,070 | 1.4523 |
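
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below reproduces that arithmetic for two rows copied from the table. The function names `parse_count` and `seconds_per_timestep` are hypothetical helpers introduced here for illustration; they are not part of BOINC or the CPDN site.

```python
def parse_count(text: str) -> int:
    """Convert a table value such as '777,600' to an integer."""
    return int(text.replace(",", ""))


def seconds_per_timestep(cpu_time_sec: int, timesteps: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timesteps


# (Timestep, CPU Time (sec)) pairs copied from the newest and oldest rows shown.
rows = [
    ("777,600", "1,132,138"),
    ("285,120", "414,070"),
]

for timestep_text, cpu_text in rows:
    timesteps = parse_count(timestep_text)
    cpu_time = parse_count(cpu_text)
    average = seconds_per_timestep(cpu_time, timesteps)
    print(f"{timesteps:>9,d} timesteps: {average:.4f} sec/TS")
```

Rounded to four decimals, the two results agree with the 1.4559 and 1.4523 values listed for those rows. Consecutive rows also differ by 25,920 timesteps, which suggests one trickle is sent per fixed block of model timesteps.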
©2024 cpdn.org