Name | hadam3p_pnw_bq49_1968_1_007920152_0 |
Workunit | 8075264 |
Created | 18 Apr 2012, 11:49:06 UTC |
Sent | 5 May 2012, 15:08:42 UTC |
Report deadline | 17 Apr 2013, 20:28:42 UTC |
Received | 8 Jun 2012, 20:49:15 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 984314 |
Run time | 6 days 1 hours 10 min 23 sec |
CPU time | 6 days 1 hours 10 min 23 sec |
Validate state | Invalid |
Credit | 2,755.56 |
Device peak FLOPS | 2.19 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.2.28</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4812, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3400, selfPID=5344, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2364, selfPID=2472, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2324, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5628, selfPID=5220, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 3 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4404, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4484, selfPID=5320, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 3 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5440, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Regional yearly means requires 12 input files got 4 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5268, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 4 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1432, selfPID=5860, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4408, selfPID=5960, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 6 GlobaController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2888, selfPID=4396, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4636, selfPID=6136, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3860, selfPID=4480, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5204, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 7 CCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3368, selfPID=1364, iMonCtr=1 Model crash detected, will try to restart... GClntroller::oCbDNaplo eWsoir koe ru:n:n ,CePiDin ,pbreoVac e s,sc eiksI =n,ostl PrD=u18n,iiMgnC r=x iodilncga,h dbtRceet,Vwall ry=t re,t rth.e ckPID=0, selfPID=5328, iMonCtr=2 Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 9 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6096, selfPID=6096, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1656, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2176, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2548, selfPID=4176, iMonCtr=1 Model crash detected, will try to restart... 20:27:33 (4636): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5348, selfPID=5348, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1464, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4184, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... </stderr_txt> ]]> |
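
The Exit status row gives the same code twice: as a signed decimal (-226) and as an unsigned 32-bit hexadecimal value (0xFFFFFF1E). A minimal sketch of the conversion between the two forms follows; the helper names `to_unsigned32` and `to_signed32` are illustrative and are not part of BOINC or CPDN.

```python
def to_unsigned32(code: int) -> str:
    """Format a signed exit status as an 8-digit unsigned 32-bit hex string."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(hex_text: str) -> int:
    """Interpret a 32-bit hex string as a two's-complement signed integer."""
    value = int(hex_text, 16)
    # If the top bit is set, the value is negative in two's complement.
    return value - (1 << 32) if value & 0x80000000 else value


# Values taken from the Exit status row above.
exit_hex = to_unsigned32(-226)
exit_signed = to_signed32("0xFFFFFF1E")
print(exit_hex, exit_signed)
```
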
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
06 Jun 2012 13:32:22 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 126,816 | 491,911 | 3.8789 |
05 Jun 2012 16:47:04 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 115,299 | 448,337 | 3.8885 |
04 Jun 2012 21:04:59 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 115,296 | 447,814 | 3.8840 |
02 Jun 2012 14:25:52 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 103,776 | 403,309 | 3.8863 |
28 May 2012 19:31:46 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 92,256 | 359,111 | 3.8925 |
21 May 2012 16:46:05 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 80,739 | 315,273 | 3.9048 |
20 May 2012 19:59:19 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 80,736 | 314,766 | 3.8987 |
19 May 2012 20:24:36 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 69,216 | 270,779 | 3.9121 |
19 May 2012 07:58:43 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 57,696 | 226,834 | 3.9315 |
16 May 2012 18:54:14 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 46,176 | 181,541 | 3.9315 |
12 May 2012 17:04:31 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 34,656 | 136,124 | 3.9279 |
08 May 2012 19:25:45 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 23,136 | 90,861 | 3.9273 |
06 May 2012 17:06:01 | 984314 | 14461690 | hadam3p_pnw_bq49_1968_1_007920152_0 | 11,616 | 45,572 | 3.9232 |
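
The Average (sec/TS) column appears to be CPU Time (sec) divided by Timestep. The sketch below checks that reading against three rows copied from the table; the rounding to four decimals and the names `rows` and `average_sec_per_ts` are assumptions made for illustration, not something the page states.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
rows = [
    (126_816, 491_911, 3.8789),
    (115_299, 448_337, 3.8885),
    (11_616, 45_572, 3.9232),
]


def average_sec_per_ts(timestep: int, cpu_time_sec: int) -> float:
    """CPU seconds spent per model timestep, rounded to four decimals."""
    return round(cpu_time_sec / timestep, 4)


computed = [average_sec_per_ts(ts, cpu) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, computed):
    # A small tolerance allows for differences in how the site rounds.
    status = "matches" if abs(value - reported) < 1e-4 else "differs"
    print(f"timestep={ts} computed={value:.4f} reported={reported:.4f} {status}")
```
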
©2024 cpdn.org