Name | hadcm3n_zjca_1960_40_008363224_0 |
Workunit | 8514083 |
Created | 10 May 2013, 16:07:14 UTC |
Sent | 10 May 2013, 16:07:38 UTC |
Report deadline | 9 Aug 2013, 23:34:49 UTC |
Received | 12 Oct 2013, 13:16:52 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 255 (0x000000FF) Unknown error code |
Computer ID | 1187332 |
Run time | 9 days 8 hours 0 min 53 sec |
CPU time | 7 days 3 hours 18 min 52 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.72 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The extended attributes are inconsistent. (0xff) - exit code 255 (0xff) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2488, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1388, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3160, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3508, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3652, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=208, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=912, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2244, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3704, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2236, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3232, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2464, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... 12:34:27 (1508): No heartbeat from core client for 30 sec - exiting 12:34:28 (1508): No heartbeat from core client for 30 sec - exiting 12:34:29 (1508): No heartbeat from core client for 30 sec - exiting 12:34:30 (1508): No heartbeat from core client for 30 sec - exiting 12:34:31 (1508): No heartbeat from core client for 30 sec - exiting 12:34:32 (1508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2016, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=208, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2820, iMonCtr=1 Model crash detected, will try to restart... 11:31:25 (2116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:31:26 (2116): No heartbeat from core client for 30 sec - exiting 11:31:28 (2116): No heartbeat from core client for 30 sec - exiting 11:31:29 (2116): No heartbeat from core client for 30 sec - exiting 11:31:31 (2116): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2016, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2440, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=820, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=840, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3920, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2848, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3372, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1032, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4016, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1484, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=580, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3936, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2268, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3560, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1168, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1812, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3924, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3612, iMonCtr=1 Model crash detected, will try to restart... 
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: C I/O Error feof - Unit 62 - Return code = 16 BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/zjcako.pjh0c10 Error converting file to netcdf: dataout/zjcako.pih0c10 Error converting file to netcdf: dataout/zjcako.pfh0c10 Error converting file to netcdf: dataout/zjcako.pch0c10 Error converting file to netcdf: dataout/zjcako.pbh0c10 Error converting file to netcdf: dataout/zjcako.pah0c10 Error converting file to netcdf: dataout/zjcaka.phh0c10 Error converting file to netcdf: dataout/zjcaka.pgh0c10 Error converting file to netcdf: dataout/zjcaka.peh0c10 Error converting file to netcdf: dataout/zjcaka.pdh0c10 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77798851 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x774F7DAA read attempt to address 0x00000004 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]> |
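The stderr text above repeats a small set of messages many times, so a tally by message type is easier to read than the raw log. The following is a minimal sketch, assuming the log has been saved as plain text; the labels, the `tally_events` function, and the short `sample` string are illustrative and not part of the CPDN or BOINC software. The message fragments themselves are copied from the log.

```python
import re
from collections import Counter

# Message fragments copied from the stderr text; the labels are illustrative.
PATTERNS = {
    "model_restart": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client",
    "suspend_request": r"Suspend request from BOINC",
    "io_error": r"BUFFIN: C I/O Error",
    "netcdf_error": r"Error converting file to netcdf",
    "access_violation": r"Access Violation \(0xc0000005\)",
}

def tally_events(stderr_text: str) -> Counter:
    """Count how often each known message fragment occurs in the text."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        counts[label] = len(re.findall(pattern, stderr_text))
    return counts

# A short excerpt in the same shape as the log above (not the full log).
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=3648, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "12:34:27 (1508): No heartbeat from core client for 30 sec - exiting "
    "BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 "
    "Error converting file to netcdf: dataout/zjcako.pjh0c10 "
    "Reason: Access Violation (0xc0000005) at address 0x77798851"
)

for label, count in tally_events(sample).items():
    print(f"{label}: {count}")
```

Running the same function on the full stderr text would show how the restarts, heartbeat losses, and final I/O errors are distributed.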
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
11 Oct 2013 12:33:20 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 259,200 | 614,036 | 2.3690 |
04 Oct 2013 13:56:27 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 233,280 | 553,230 | 2.3715 |
29 Sep 2013 10:49:34 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 207,360 | 492,189 | 2.3736 |
11 Sep 2013 16:53:16 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 181,440 | 430,925 | 2.3750 |
31 Aug 2013 14:22:28 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 155,520 | 369,002 | 2.3727 |
19 Aug 2013 13:40:41 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 129,600 | 307,566 | 2.3732 |
09 Jul 2013 18:14:45 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 103,680 | 245,406 | 2.3670 |
21 Jun 2013 11:14:29 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 77,760 | 184,289 | 2.3700 |
02 Jun 2013 20:37:38 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 51,840 | 122,222 | 2.3577 |
17 May 2013 16:38:51 | 1187332 | 15772317 | hadcm3n_zjca_1960_40_008363224_0 | 25,920 | 60,113 | 2.3192 |
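
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this, so the formula below is an assumption; the sketch checks it against three rows copied from the trickle table. The function name is illustrative.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (259_200, 614_036, 2.3690),
    (51_840, 122_222, 2.3577),
    (25_920, 60_113, 2.3192),
]

def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds per cumulative model timestep."""
    return round(cpu_time_sec / timestep, 4)

for timestep, cpu_time_sec, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>8} {cpu_time_sec:>8} "
          f"computed={computed:.4f} reported={reported:.4f}")
```

Under this assumption the computed values agree with the reported column to four decimal places for the rows checked.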