Name | hadcm3n_o1rj_1940_40_008379353_4 |
Workunit | 8530212 |
Created | 3 Dec 2013, 12:46:18 UTC |
Sent | 3 Dec 2013, 12:46:32 UTC |
Report deadline | 4 Mar 2014, 20:13:43 UTC |
Received | 23 Dec 2013, 22:32:05 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1163925 |
Run time | 18 days 11 hours 24 min 47 sec |
CPU time | 17 days 9 hours 37 min 19 sec |
Validate state | Invalid |
Credit | 9,953.28 |
Device peak FLOPS | 2.99 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.31</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 07:21:27 (10152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Ocean Restart file copy failed on o1rjko.daf56g0 06:26:14 (17028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:30:14 (20976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2948, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
21 Dec 2013 21:36:45 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 829,440 | 1,482,869 | 1.7878 |
21 Dec 2013 08:26:18 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 803,520 | 1,437,640 | 1.7892 |
20 Dec 2013 19:18:06 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 777,600 | 1,392,737 | 1.7911 |
20 Dec 2013 05:29:52 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 751,680 | 1,345,198 | 1.7896 |
19 Dec 2013 16:22:12 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 725,760 | 1,299,411 | 1.7904 |
19 Dec 2013 03:08:42 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 699,840 | 1,254,164 | 1.7921 |
18 Dec 2013 13:08:21 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 673,920 | 1,209,043 | 1.7940 |
17 Dec 2013 23:43:54 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 648,000 | 1,162,893 | 1.7946 |
17 Dec 2013 09:51:49 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 622,080 | 1,117,526 | 1.7964 |
16 Dec 2013 20:47:20 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 596,160 | 1,073,850 | 1.8013 |
16 Dec 2013 07:43:38 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 570,240 | 1,028,462 | 1.8036 |
15 Dec 2013 17:44:06 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 544,320 | 979,956 | 1.8003 |
15 Dec 2013 03:50:15 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 518,400 | 931,477 | 1.7968 |
14 Dec 2013 15:41:40 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 492,480 | 881,506 | 1.7899 |
13 Dec 2013 19:51:21 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 466,560 | 833,367 | 1.7862 |
13 Dec 2013 05:27:25 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 440,640 | 784,647 | 1.7807 |
12 Dec 2013 14:53:33 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 414,720 | 736,555 | 1.7760 |
12 Dec 2013 01:17:53 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 388,800 | 687,977 | 1.7695 |
11 Dec 2013 15:59:04 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 362,880 | 654,463 | 1.8035 |
11 Dec 2013 05:06:51 | 1163925 | 16102819 | hadcm3n_o1rj_1940_40_008379353_4 | 336,960 | 620,153 | 1.8404 |
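
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against the newest and oldest rows of the table. The function name `seconds_per_timestep` is hypothetical and the rounding rule is an assumption inferred from the listed values, not something the page states.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep (assumed definition of the column)."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, listed average) copied from the trickle table
rows = [
    (829_440, 1_482_869, 1.7878),  # 21 Dec 2013 21:36:45
    (336_960, 620_153, 1.8404),    # 11 Dec 2013 05:06:51
]

for timestep, cpu_sec, listed in rows:
    computed = seconds_per_timestep(cpu_sec, timestep)
    # Compare the recomputed value with the one shown on the page
    print(timestep, computed, abs(computed - listed) < 1e-9)
```

Both rows reproduce the listed averages under this assumption, which suggests the column is a running average over the whole task and not a per-trickle rate.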
©2024 cpdn.org