Name | hadcm3n_o28e_2140_40_008270209_2 |
Workunit | 8425333 |
Created | 6 Feb 2013, 3:12:57 UTC |
Sent | 6 Feb 2013, 3:13:03 UTC |
Report deadline | 8 May 2013, 10:40:14 UTC |
Received | 3 Mar 2013, 18:00:40 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 836088 |
Run time | 19 days 17 hours 37 min 14 sec |
CPU time | 15 days 5 hours 10 min 37 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 1.90 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.33</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 12:56:50 (5100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:29:12 (552): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:29:13 (552): No heartbeat from core client for 30 sec - exiting 14:29:14 (552): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 22:31:22 (4480): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 22:31:24 (4480): No heartbeat from core client for 30 sec - exiting 22:31:25 (4480): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:23:37 (6164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:23:38 (6164): No heartbeat from core client for 30 sec - exiting 21:23:39 (6164): No heartbeat from core client for 30 sec - exiting 21:24:20 (8184): No heartbeat from core client for 30 sec - exiting 21:24:21 (8184): No heartbeat from core client for 30 sec - exiting 21:24:22 (8184): No heartbeat from core client for 30 sec - exiting 21:24:23 (8184): No heartbeat from core client for 30 sec - exiting 21:24:24 (8184): No heartbeat from core client for 30 sec - exiting 21:24:25 (8184): No heartbeat from core client for 30 sec - exiting 21:24:26 (8184): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... MainError: 09:33:13 PM No files match the supplied pattern. MainError: 09:33:13 PM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 07:41:11 PM No files match the supplied pattern. MainError: 07:41:11 PM No files match the supplied pattern. 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 03:28:11 PM No files match the supplied pattern. MainError: 03:28:11 PM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 04:07:26 PM No files match the supplied pattern. MainError: 04:07:26 PM No files match the supplied pattern. 16:11:27 (2988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 09:33:12 AM No files match the supplied pattern. MainError: 09:33:12 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 07:05:24 AM No files match the supplied pattern. MainError: 07:05:24 AM No files match the supplied pattern. 08:28:35 (3908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:30:30 (2036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 04:28:11 AM No files match the supplied pattern. MainError: 04:28:11 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 12:33:02 AM No files match the supplied pattern. MainError: 12:33:02 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 10:10:03 PM No files match the supplied pattern. MainError: 10:10:03 PM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
MainError: 05:56:15 PM No files match the supplied pattern. MainError: 05:56:15 PM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Error converting file to netcdf: dataout/o28eka.ph11c10 Error converting file to netcdf: dataout/o28eka.pg11c10 Error converting file to netcdf: dataout/o28eka.pe11c10 MainError: 04:45:05 PM No files match the supplied pattern. MainError: 04:45:05 PM No files match the supplied pattern. BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 
67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
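The stderr text above repeats a small number of message types many times, so a tally is easier to read than the raw dump. The following is a minimal sketch, assuming the stderr text has been saved to a string; the category labels and the `summarize_stderr` function name are illustrative and are not part of BOINC or the CPDN application. Only substrings that appear verbatim in the log above are matched.

```python
from collections import Counter

# Hypothetical labels mapped to substrings copied from the stderr text above.
MESSAGE_MARKERS = {
    "suspend_request": "Suspend request from BOINC",
    "quit_request": "Quit request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "no_files_match": "No files match the supplied pattern",
    "netcdf_error": "Error converting file to netcdf",
    "buffin_io_error": "BUFFIN: C I/O Error",
    "model_crash": "Model crashed:",
}


def summarize_stderr(text: str) -> Counter:
    """Count non-overlapping occurrences of each known marker in the log text."""
    counts = Counter()
    for label, marker in MESSAGE_MARKERS.items():
        counts[label] = text.count(marker)
    return counts


if __name__ == "__main__":
    # A short excerpt in the same shape as the log above.
    excerpt = (
        "CPDN Monitor - Suspend request from BOINC... Suspended "
        "CPDN Monitor - Suspend request from BOINC... "
        "12:56:50 (5100): No heartbeat from core client for 30 sec - exiting "
        "Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048"
    )
    for label, n in summarize_stderr(excerpt).items():
        print(f"{label:16s} {n}")
```

Run against the full stderr text, a tally like this would show how often each message type occurs, including the `Model crashed` entries that precede "Sorry, too many model crashes!".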
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Mar 2013 16:46:00 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 777,600 | 1,575,921 | 2.0266 |
02 Mar 2013 17:58:34 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 751,680 | 1,520,373 | 2.0226 |
01 Mar 2013 22:13:35 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 725,760 | 1,465,916 | 2.0198 |
01 Mar 2013 00:35:29 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 699,840 | 1,414,284 | 2.0209 |
28 Feb 2013 04:31:38 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 673,920 | 1,361,527 | 2.0203 |
27 Feb 2013 07:06:26 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 648,000 | 1,309,986 | 2.0216 |
26 Feb 2013 14:50:40 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 622,080 | 1,258,353 | 2.0228 |
25 Feb 2013 18:02:46 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 596,160 | 1,205,568 | 2.0222 |
24 Feb 2013 15:31:14 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 570,240 | 1,151,163 | 2.0187 |
23 Feb 2013 19:42:38 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 544,320 | 1,092,971 | 2.0080 |
22 Feb 2013 21:34:09 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 518,400 | 1,039,371 | 2.0050 |
22 Feb 2013 02:05:33 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 492,480 | 986,741 | 2.0036 |
21 Feb 2013 03:08:25 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 466,560 | 935,308 | 2.0047 |
20 Feb 2013 06:27:36 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 440,640 | 884,132 | 2.0065 |
19 Feb 2013 17:01:16 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 414,720 | 832,825 | 2.0082 |
18 Feb 2013 17:26:16 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 388,800 | 780,137 | 2.0065 |
18 Feb 2013 02:01:46 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 362,880 | 729,350 | 2.0099 |
17 Feb 2013 09:37:07 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 336,960 | 674,369 | 2.0013 |
16 Feb 2013 11:59:37 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 311,040 | 618,859 | 1.9896 |
15 Feb 2013 17:36:20 | 836088 | 15585445 | hadcm3n_o28e_2140_40_008270209_2 | 285,120 | 565,150 | 1.9821 |
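The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table and also computes the rate between two consecutive trickles; the function names are illustrative, not taken from CPDN.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative average: total CPU seconds divided by total timesteps."""
    return cpu_time_sec / timestep


def interval_rate(newer: tuple[int, int], older: tuple[int, int]) -> float:
    """CPU seconds per timestep between two trickles, each given as (timestep, cpu_time_sec)."""
    d_steps = newer[0] - older[0]
    d_cpu = newer[1] - older[1]
    return d_cpu / d_steps


if __name__ == "__main__":
    # (timestep, cpu_time_sec) pairs copied from the table above.
    rows = [
        (777_600, 1_575_921),  # 03 Mar 2013
        (751_680, 1_520_373),  # 02 Mar 2013
        (285_120, 565_150),    # 15 Feb 2013
    ]
    for timestep, cpu in rows:
        print(f"{timestep:>9,} {cpu:>11,} {seconds_per_timestep(cpu, timestep):.4f}")

    # Rate over the last 25,920-timestep interval only.
    print(f"last interval: {interval_rate(rows[0], rows[1]):.4f} sec/TS")
```

The cumulative values reproduce the table's 2.0266, 2.0226, and 1.9821. The per-interval figure isolates the most recent stretch of the run, which the cumulative average smooths over.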