Name | hadcm3n_p4jk_1900_40_007223472_1 |
Workunit | 7421712 |
Created | 26 Apr 2011, 15:29:37 UTC |
Sent | 29 Apr 2011, 12:19:03 UTC |
Report deadline | 29 Jul 2011, 19:46:14 UTC |
Received | 7 May 2011, 12:03:05 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1108969 |
Run time | 5 days 6 hours 40 min 29 sec |
CPU time | 2 days 0 hours 0 min 33 sec |
Validate state | Invalid |
Credit | 1,866.24 |
Device peak FLOPS | 1.93 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
15:06:45 (8124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:07:10 (3580): No heartbeat from core client for 30 sec - exiting
10:07:14 (3580): No heartbeat from core client for 30 sec - exiting
10:07:16 (3580): No heartbeat from core client for 30 sec - exiting
10:07:17 (3580): No heartbeat from core client for 30 sec - exiting
10:07:18 (3580): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:09:05 (5940): No heartbeat from core client for 30 sec - exiting
10:09:06 (5940): No heartbeat from core client for 30 sec - exiting
10:09:08 (5940): No heartbeat from core client for 30 sec - exiting
10:09:09 (5940): No heartbeat from core client for 30 sec - exiting
10:09:10 (5940): No heartbeat from core client for 30 sec - exiting
10:09:11 (5940): No heartbeat from core client for 30 sec - exiting
10:09:12 (5940): No heartbeat from core client for 30 sec - exiting
10:09:14 (5940): No heartbeat from core client for 30 sec - exiting
10:09:15 (5940): No heartbeat from core client for 30 sec - exiting
10:09:16 (5940): No heartbeat from core client for 30 sec - exiting
10:09:17 (5940): No heartbeat from core client for 30 sec - exiting
10:09:19 (5940): No heartbeat from core client for 30 sec - exiting
10:09:20 (5940): No heartbeat from core client for 30 sec - exiting
10:09:21 (5940): No heartbeat from core client for 30 sec - exiting
10:09:23 (5940): No heartbeat from core client for 30 sec - exiting
10:09:24 (5940): No heartbeat from core client for 30 sec - exiting
10:09:25 (5940): No heartbeat from core client for 30 sec - exiting
10:09:27 (5940): No heartbeat from core client for 30 sec - exiting
10:09:28 (5940): No heartbeat from core client for 30 sec - exiting
10:09:29 (5940): No heartbeat from core client for 30 sec - exiting
10:09:30 (5940): No heartbeat from core client for 30 sec - exiting
10:09:32 (5940): No heartbeat from core client for 30 sec - exiting
10:09:33 (5940): No heartbeat from core client for 30 sec - exiting
10:09:34 (5940): No heartbeat from core client for 30 sec - exiting
10:09:35 (5940): No heartbeat from core client for 30 sec - exiting
10:09:36 (5940): No heartbeat from core client for 30 sec - exiting
10:09:37 (5940): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:09:39 (5940): No heartbeat from core client for 30 sec - exiting
10:09:40 (5940): No heartbeat from core client for 30 sec - exiting
10:09:41 (5940): No heartbeat from core client for 30 sec - exiting
10:09:42 (5940): No heartbeat from core client for 30 sec - exiting
10:09:43 (5940): No heartbeat from core client for 30 sec - exiting
10:10:55 (4548): No heartbeat from core client for 30 sec - exiting
10:10:57 (4548): No heartbeat from core client for 30 sec - exiting
10:10:58 (4548): No heartbeat from core client for 30 sec - exiting
10:10:59 (4548): No heartbeat from core client for 30 sec - exiting
10:11:00 (4548): No heartbeat from core client for 30 sec - exiting
10:11:01 (4548): No heartbeat from core client for 30 sec - exiting
10:11:02 (4548): No heartbeat from core client for 30 sec - exiting
10:11:04 (4548): No heartbeat from core client for 30 sec - exiting
10:11:05 (4548): No heartbeat from core client for 30 sec - exiting
10:11:06 (4548): No heartbeat from core client for 30 sec - exiting
10:11:07 (4548): No heartbeat from core client for 30 sec - exiting
10:11:09 (4548): No heartbeat from core client for 30 sec - exiting
10:11:10 (4548): No heartbeat from core client for 30 sec - exiting
10:11:12 (4548): No heartbeat from core client for 30 sec - exiting
10:11:13 (4548): No heartbeat from core client for 30 sec - exiting
10:11:14 (4548): No heartbeat from core client for 30 sec - exiting
10:11:15 (4548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
18:45:58 (6228): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
forrtl: Not enough storage is available to process this command.
forrtl: Not enough storage is available to process this command.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6196, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: error reading file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_se_6.07_windows_intelx86.dll
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6196, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6196, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: error reading file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day
forrtl: Not enough storage is available to process this command.
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p4jk_1900_40_007223472/dataout/ocean_restart.day after 11 attempts
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
06 May 2011 08:44:35 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 155,520 | 347,366 | 2.2336 |
05 May 2011 15:15:30 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 129,600 | 289,363 | 2.2327 |
04 May 2011 21:58:41 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 103,680 | 231,279 | 2.2307 |
04 May 2011 04:28:08 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 77,760 | 172,903 | 2.2235 |
02 May 2011 18:39:35 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 51,840 | 111,026 | 2.1417 |
30 Apr 2011 17:38:32 | 1108969 | 12826870 | hadcm3n_p4jk_1900_40_007223472_1 | 25,920 | 68,791 | 2.6540 |
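
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal Python sketch that checks this assumption against the six rows shown; the function and variable names are illustrative only and are not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
trickles = [
    (155_520, 347_366, 2.2336),
    (129_600, 289_363, 2.2327),
    (103_680, 231_279, 2.2307),
    (77_760, 172_903, 2.2235),
    (51_840, 111_026, 2.1417),
    (25_920, 68_791, 2.6540),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep (assumed definition of the column)."""
    return cpu_time_sec / timestep


for timestep, cpu_sec, reported in trickles:
    computed = sec_per_timestep(cpu_sec, timestep)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"{timestep:>8,}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption, the computed values agree with the reported column to four decimal places. Consecutive trickles in the table are 25,920 timesteps apart.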
©2024 cpdn.org