Name | hadsm3dhet2_jqrf_006597677_0 |
Workunit | 6801050 |
Created | 15 Mar 2010, 12:04:10 UTC |
Sent | 27 Sep 2010, 5:41:34 UTC |
Report deadline | 9 Sep 2011, 11:01:34 UTC |
Received | 21 Dec 2010, 15:11:53 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1093542 |
Run time | 32 days 4 hours 31 min |
CPU time | 28 days 20 hours 58 min 34 sec |
Validate state | Invalid |
Credit | 1,687.14 |
Device peak FLOPS | 1.60 GFLOPS |
Application version | UK Met Office HadSM3 Slab Model v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3272, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2672, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2672, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2672, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3324, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1388, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1388, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1388, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1736, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1736, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1736, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3408, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2024, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4192, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1900, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1900, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CCPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1300, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1300, iMonCtr=1 Model crash detected, will try to restart... 
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
09 Oct 2010 23:31:59 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 183,634 | 310,199 | 1.6892 |
09 Oct 2010 23:31:59 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 172,832 | 292,324 | 1.6914 |
09 Oct 2010 23:31:59 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 162,030 | 274,801 | 1.6960 |
09 Oct 2010 23:31:59 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 151,228 | 256,996 | 1.6994 |
06 Oct 2010 03:45:43 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 140,426 | 239,285 | 1.7040 |
05 Oct 2010 04:42:22 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 129,624 | 221,581 | 1.7094 |
03 Oct 2010 19:27:18 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 118,822 | 204,210 | 1.7186 |
02 Oct 2010 19:03:16 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 108,020 | 185,824 | 1.7203 |
02 Oct 2010 13:18:09 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 97,218 | 167,692 | 1.7249 |
02 Oct 2010 03:07:36 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 86,416 | 148,909 | 1.7232 |
01 Oct 2010 13:46:38 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 75,614 | 130,141 | 1.7211 |
30 Sep 2010 11:07:44 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 64,812 | 111,969 | 1.7276 |
30 Sep 2010 06:08:07 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 54,010 | 94,070 | 1.7417 |
29 Sep 2010 13:13:32 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 43,208 | 75,546 | 1.7484 |
28 Sep 2010 14:12:30 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 32,406 | 57,479 | 1.7737 |
28 Sep 2010 06:39:04 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 21,604 | 38,267 | 1.7713 |
27 Sep 2010 20:57:51 | 1093542 | 11038498 | hadsm3dhet2_jqrf_006597677_0 | 10,802 | 18,745 | 1.7353 |
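
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The page does not state this; it is an assumption inferred from the figures. A minimal Python sketch that checks it against three rows copied from the table (the function name `sec_per_timestep` is hypothetical, not part of BOINC or CPDN):

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
rows = [
    (183_634, 310_199, 1.6892),
    (97_218, 167_692, 1.7249),
    (10_802, 18_745, 1.7353),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this is how the 'Average (sec/TS)' column is derived.
    """
    return round(cpu_time_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in rows:
        computed = sec_per_timestep(cpu_time_sec, timestep)
        # Print the computed value next to the value shown in the table.
        print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

If the assumption holds, the computed and reported values agree to four decimal places for each row.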