Name | hadcm3n_o2qp_2020_40_007956539_0 |
Workunit | 8111651 |
Created | 9 May 2012, 3:31:52 UTC |
Sent | 9 May 2012, 3:35:41 UTC |
Report deadline | 8 Aug 2012, 11:02:52 UTC |
Received | 29 May 2012, 13:46:10 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1191945 |
Run time | 10 days 19 hours 13 min 43 sec |
CPU time | 10 days 7 hours 56 min 27 sec |
Validate state | Invalid |
Credit | 8,709.12 |
Device peak FLOPS | 3.27 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:45:57 (14176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:45:58 (14176): No heartbeat from core client for 30 sec - exiting
08:45:59 (14176): No heartbeat from core client for 30 sec - exiting
08:46:00 (14176): No heartbeat from core client for 30 sec - exiting
08:46:01 (14176): No heartbeat from core client for 30 sec - exiting
08:46:02 (14176): No heartbeat from core client for 30 sec - exiting
08:46:03 (14176): No heartbeat from core client for 30 sec - exiting
08:46:04 (14176): No heartbeat from core client for 30 sec - exiting
08:46:05 (14176): No heartbeat from core client for 30 sec - exiting
08:46:06 (14176): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9140, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:10:31 (1992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10528, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4108, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Model crashed: SETPOS: Unit 63 to Word Address -198 Failed with Error Code -1
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 May 2012 02:19:41 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 725,760 | 864,158 | 1.1907 |
24 May 2012 16:30:20 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 699,840 | 833,126 | 1.1905 |
24 May 2012 06:56:58 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 673,920 | 801,195 | 1.1889 |
23 May 2012 22:41:20 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 648,000 | 769,644 | 1.1877 |
23 May 2012 12:57:08 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 622,080 | 738,112 | 1.1865 |
23 May 2012 04:16:25 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 596,160 | 707,697 | 1.1871 |
22 May 2012 19:06:41 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 570,240 | 677,325 | 1.1878 |
22 May 2012 09:19:09 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 544,320 | 646,754 | 1.1882 |
22 May 2012 01:02:57 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 518,400 | 616,573 | 1.1894 |
21 May 2012 16:10:11 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 492,480 | 586,628 | 1.1912 |
18 May 2012 14:15:57 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 466,560 | 555,849 | 1.1914 |
18 May 2012 05:37:35 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 440,640 | 524,987 | 1.1914 |
17 May 2012 20:40:22 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 414,720 | 494,387 | 1.1921 |
17 May 2012 11:18:20 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 388,800 | 463,673 | 1.1926 |
17 May 2012 03:21:56 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 362,880 | 433,079 | 1.1934 |
16 May 2012 17:11:34 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 336,960 | 402,141 | 1.1934 |
16 May 2012 08:11:36 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 311,040 | 371,073 | 1.1930 |
16 May 2012 00:12:30 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 285,120 | 340,141 | 1.1930 |
15 May 2012 15:00:30 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 259,200 | 308,886 | 1.1917 |
15 May 2012 05:03:38 | 1191945 | 14645369 | hadcm3n_o2qp_2020_40_007956539_0 | 233,280 | 277,877 | 1.1912 |
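
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The following is a minimal Python sketch of that check, using three rows copied from the table above. The function name `avg_sec_per_timestep` is illustrative only and is not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
ROWS = [
    (725_760, 864_158, 1.1907),
    (699_840, 833_126, 1.1905),
    (233_280, 277_877, 1.1912),
]


def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the table's average is simply cumulative CPU time
    divided by the cumulative timestep count.
    """
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in ROWS:
        computed = avg_sec_per_timestep(cpu_time_sec, timestep)
        print(f"timestep={timestep} computed={computed} reported={reported}")
```

For the three sampled rows the computed values agree with the reported column; the remaining rows can be checked the same way.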