Name | hadcm3n_o2nt_2140_40_008269849_1 |
Workunit | 8424973 |
Created | 11 Jan 2013, 19:54:47 UTC |
Sent | 11 Jan 2013, 19:54:59 UTC |
Report deadline | 13 Apr 2013, 3:22:10 UTC |
Received | 7 Feb 2013, 1:25:14 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1027182 |
Run time | 18 days 22 hours 3 min 26 sec |
CPU time | 14 days 20 hours 10 min 9 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.50 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
04:13:30 (21604): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
15:05:36 (4640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:06:51 (5412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:35:46 (20668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:25:45 (5368): No heartbeat from core client for 30 sec - exiting
20:25:46 (5368): No heartbeat from core client for 30 sec - exiting
20:25:47 (5368): No heartbeat from core client for 30 sec - exiting
20:25:49 (5368): No heartbeat from core client for 30 sec - exiting
20:25:50 (5368): No heartbeat from core client for 30 sec - exiting
20:25:51 (5368): No heartbeat from core client for 30 sec - exiting
20:25:52 (5368): No heartbeat from core client for 30 sec - exiting
20:25:53 (5368): No heartbeat from core client for 30 sec - exiting
20:25:54 (5368): No heartbeat from core client for 30 sec - exiting
20:25:55 (5368): No heartbeat from core client for 30 sec - exiting
20:25:56 (5368): No heartbeat from core client for 30 sec - exiting
20:25:57 (5368): No heartbeat from core client for 30 sec - exiting
20:25:58 (5368): No heartbeat from core client for 30 sec - exiting
20:25:59 (5368): No heartbeat from core client for 30 sec - exiting
20:26:01 (5368): No heartbeat from core client for 30 sec - exiting
20:26:02 (5368): No heartbeat from core client for 30 sec - exiting
20:26:03 (5368): No heartbeat from core client for 30 sec - exiting
20:26:04 (5368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:02:56 (2584): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
21:02:58 (2584): No heartbeat from core client for 30 sec - exiting
21:02:59 (2584): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 11:15:04 AM No files match the supplied pattern.
MainError: 11:15:04 AM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 04:18:48 AM No files match the supplied pattern.
MainError: 04:18:48 AM No files match the supplied pattern.
MainError: 08:45:59 PM No files match the supplied pattern.
MainError: 08:45:59 PM No files match the supplied pattern.
MainError: 04:05:06 PM No files match the supplied pattern.
MainError: 04:05:06 PM No files match the supplied pattern.
MainError: 08:12:18 AM No files match the supplied pattern.
MainError: 08:12:18 AM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 04:22:41 AM No files match the supplied pattern.
MainError: 04:22:41 AM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 09:47:14 PM No files match the supplied pattern.
MainError: 09:47:14 PM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 03:27:45 AM No files match the supplied pattern.
MainError: 03:27:45 AM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:03:20 (8132): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 05:15:27 PM No files match the supplied pattern.
MainError: 05:15:27 PM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
MainError: 08:12:07 AM No files match the supplied pattern.
MainError: 08:12:07 AM No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
Error converting file to netcdf: dataout/o2ntka.ph11c10
Error converting file to netcdf: dataout/o2ntka.pg11c10
Error converting file to netcdf: dataout/o2ntka.pe11c10
MainError: 11:25:05 PM No files match the supplied pattern.
MainError: 11:25:05 PM No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
06 Feb 2013 23:39:51 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 777,600 | 1,382,034 | 1.7773 |
06 Feb 2013 08:13:04 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 751,680 | 1,335,516 | 1.7767 |
05 Feb 2013 17:16:23 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 725,760 | 1,294,510 | 1.7837 |
04 Feb 2013 03:56:44 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 699,840 | 1,249,967 | 1.7861 |
02 Feb 2013 21:49:16 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 673,920 | 1,203,111 | 1.7852 |
02 Feb 2013 04:29:39 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 648,000 | 1,154,830 | 1.7821 |
01 Feb 2013 08:41:48 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 622,080 | 1,105,632 | 1.7773 |
31 Jan 2013 16:05:49 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 596,160 | 1,058,268 | 1.7751 |
30 Jan 2013 20:47:45 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 570,240 | 1,010,464 | 1.7720 |
30 Jan 2013 04:47:38 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 544,320 | 962,617 | 1.7685 |
29 Jan 2013 11:29:56 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 518,400 | 914,355 | 1.7638 |
27 Jan 2013 12:16:41 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 492,480 | 869,434 | 1.7654 |
26 Jan 2013 21:28:13 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 466,560 | 823,491 | 1.7650 |
26 Jan 2013 08:14:49 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 440,640 | 777,032 | 1.7634 |
25 Jan 2013 18:38:10 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 414,720 | 730,431 | 1.7613 |
25 Jan 2013 06:24:54 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 388,800 | 684,062 | 1.7594 |
24 Jan 2013 15:55:42 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 362,880 | 637,578 | 1.7570 |
23 Jan 2013 22:44:11 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 336,960 | 590,334 | 1.7519 |
23 Jan 2013 06:27:49 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 311,040 | 540,503 | 1.7377 |
22 Jan 2013 15:20:43 | 1027182 | 15529221 | hadcm3n_o2nt_2140_40_008269849_1 | 285,120 | 493,030 | 1.7292 |
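
The Average (sec/TS) column appears to be the cumulative CPU Time (sec) divided by the cumulative Timestep count, rounded to four decimal places. The sketch below checks that reading against two rows of the table; the function name `average_sec_per_timestep` is hypothetical and chosen only for this illustration, and the rounding rule is an assumption inferred from the displayed values rather than anything the page states.

```python
# Illustrative sketch: the function name and the rounding to 4 places are
# assumptions inferred from the table, not taken from CPDN code.
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec) pairs copied from the newest and oldest rows above.
rows = [(777_600, 1_382_034), (285_120, 493_030)]

for timestep, cpu_time_sec in rows:
    print(timestep, average_sec_per_timestep(cpu_time_sec, timestep))
```

For these two rows the computed values should match the 1.7773 and 1.7292 shown in the table.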