Longer Deadlines - That's What I'd Like To See
Author | Message |
---|---|
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
I've noticed lately that work units' deadlines are earlier than they used to be and somewhat unrealistic. We aren't all running supercomputers 24/7 and not all running CPDN only. My current work unit which is now running at high-priority, meaning something else is waiting as a result, has run, at last check, for 213.13.30 with 147.34.56 to go and a deadline of 25/09/2013 which by my calculations it will not meet. I recall much longer deadlines being given in the past which meant the work could progress at normal priority and get done, even allowing for other projects, and the fact that I like to conserve electricity, and money, by shutting down at night, not to mention give my machine a well-earned rest. Why the sudden urgency? Peter Toronto, Canada |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
There was a time when one participant ran the full 160-year model on a single machine. Thanks to years of whining/whinging on the boards, the models were broken into four tasks each, with consequent larger down-/up-loads to pass start-up dumps between machines. (We all know what the added load did to the servers' lifespans.) Deadlines were shortened to allow reasonable completion dates for the four-task set -- can't start the second 40 years until the first 40 are in hand, etc. It's a balancing act, science requirements vs. processing realities. Not so sudden, really. It's been like this for quite a while. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Well I didn't mean to come across as whinging and whining - just opining and asking. ;-) Thanks for the explanation. Not sure I totally understand it but I'll take your word for it. I guess I'm thinking back quite a while. Not sure I'm happy with any project that decides to pre-empt other projects by going high-priority right from the start but it seems to be becoming more and more prevalent so I guess we have to put up with it. It could mean people cutting back on numbers of projects and/or how much cache they decide to keep. Peter Toronto, Canada |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Keep in mind that: a) the deadlines aren't enforced - the models will still be accepted regardless. b) in the long run, CPDN will still end up with your preferred resource-share even if it has a high-priority task. It'll grab more than its fair share in the short term, but then other projects will get the priority for a long while until it has evened out again. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
OK, thanks. Peter Toronto, Canada |
Send message Joined: 17 Nov 07 Posts: 142 Credit: 4,271,370 RAC: 0 |
"science requirements vs. processing realities. [...] Not so sudden, really. It's been like this for quite a while." Processing realities, indeed. And limitations of BOINC when used for long-duration tasks. The problem that CPDN has is with task re-issue. The BOINC system won't re-issue a task until either the client reports failure or the deadline date has passed. The second case is unfortunately quite common with CPDN's work, even with generous deadlines: a volunteer computer starts on a task, and then ... gets redeployed to do something else. As far as CPDN knows, the task is still "in progress". If CPDN has set the deadline years into the future, it has to wait years just to learn that it needs to re-issue the task ... and then wait some more. But even scientists have to work to a schedule. ;-) Until reasonably recently, the way that CPDN compensated for this time-out problem was to issue tasks to 5 or 7 computers at a time. The odds were that one of them would process the task to completion. But of course the odds were also that more than one, or more than two, of them would process the task to completion, too. There was duplication of effort by crunchers, and extra data transmission, processing and storage costs for the project. The new method, of short deadlines and issuing work to only one computer (reissuing if timed out or failed), fixes these problems (and may have been intended to). But now we have this "high priority" problem... It's all an illustration of Eric Sevareid's line, "the chief cause of problems is solutions". |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
"the chief cause of problems is solutions"- yes indeed. At least CPDN is not carrying it to extremes as did another project I since left whereby several WU's would regularly arrive with just hours left for the deadline, naturally all would run at high priority. I complained and explained in detail why and was told, "tough", so left them to their own devices. I guess it's striking a happy balance between deadline and the level of urgency. However, contrary to an earlier statement, things don't seem to level off because CPDN's WU continues to run 100% of the time, so it really is depriving something else of "having a go" when the hourly changeover occurs and will continue to do so it seems, until 25 September deadline and beyond. It's only 1 WU so I guess it's not too serious a problem as I can run 8 at once. Peter Toronto, Canada |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
... However, contrary to an earlier statement, ... Which statement are you referring to? If it's mine, 'short term' = duration of the WU, 'long term' is the year or so afterwards that it will take the processing-debt to sort itself out. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Sorry, I was a bit vague. I was referring to: "It'll grab more than its fair share in the short term, but then other projects will get the priority for a long while until it has evened out again." In my experience that doesn't happen. That could be a problem with BOINC itself. Peter Toronto, Canada |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
What I meant was this: It'll grab more than its fair share until the workunit has finished, but then other projects will get the priority for a long while until it has evened out again. Boinc has a complicated 'debt' system which means that CPDN will 'owe' the other projects a lot of CPU time. Until that debt has been paid back, Boinc should prevent CPDN from downloading new units. But it has been years since I looked at this 'debt' functionality last. It may have changed. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
It's a weird system but must make sense to someone I suppose. Of course hanging on until that work is finished means others may miss a deadline as a result, but I'm learning (slowly) to disregard it. Some project managers unfortunately exploit the situation as I mentioned earlier. Peter Toronto, Canada |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
A lot of it depends on how many Boinc projects you are running simultaneously. The more there are on a single machine, the more likely that work units are going to go into high priority mode. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Agreed and I have a feeling I'm guilty of that. I signed on to a lot of projects simply to guarantee work, but maybe I went overboard. Peter Toronto, Canada |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Well I've been proven wrong and I apologise. CPDN just relinquished its grip and now some other work, due tomorrow, has taken over. Peter Toronto, Canada |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,089,004 RAC: 2,537 |
One of the things that I have noticed affecting the tendency of Boinc to go into "high priority" mode is the size of the work buffer. I mainly run CPDN. When I am trying to get new work from CP I keep the work buffer at 10 days. When I run another project (usually when CPDN has no work) I reset the buffer to 1 day and go to WCG. At 1 day on the buffer settings, CPDN and WCG projects such as SN2S will share the available cores. If, however, I reset the work buffer to 10 days (with WCG set to "no new tasks"), WCG will immediately go into "high priority" and not allow CPDN to run. There has been no change in the number of tasks the machine has to finish before WCG deadlines, but it suddenly gets paranoid about finishing in time. |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
BOINC is a work in progress obviously. |
Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
"BOINC is a work in progress obviously." There are a lot of things about Boinc that I am personally unhappy with. It was designed for running short, disposable jobs which can be validated by bytewise comparison of result files, and does not cope well with jobs which can last for weeks or even months of CPU time. However, it's simple, widespread, and easy to set up, and that's why CPDN uses it. The bottom line is that CPDN benefits overall using Boinc, despite the various issues. I'm a volunteer and my views are my own. News and Announcements and FAQ |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Yet another WU is about to miss its deadline...a day from now and it's only 65% there, so has weeks to go at the rate it is going, even at high-priority (at which it has been running for quite some time now so BOINC agrees with me, or at least it can't be blamed). This never used to happen and I've always had multi-projects. Are you absolutely sure that longer deadlines aren't in order or possible? Am I the only one this is happening to? I seem to recall deadlines of almost a year at one time. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,089,004 RAC: 2,537 |
The old 160 year hadcm WU's had a deadline of about 14 months. Even the relatively short hadam3p WU's have a deadline of about 9 months. The hadcm3n (aka RAPIT) WU's, on the other hand, only give about a 3 month deadline. This is because of the segmental nature of the models. The results of each segment are used to generate the next. The researchers don't want to wait forever for their results. The only solution that I can think of (short of buying a faster computer) is either to run more hours each day or to restrict your machine to the hadam3p's. Computers don't really need to rest and every time you shut down (even doing it the "right" way) there is a small chance of the hadcm3n's crashing. I have found that my computer running 24/7 uses about $6 USD per month. Shutting down 6 hours/day would only save me about $2/month. |
Send message Joined: 26 Aug 04 Posts: 84 Credit: 351,331 RAC: 0 |
Thanks. I do run it most of the time anyway, and it is a fast machine; however, I do prevent BOINC tasks from using my graphics card, which may or may not apply here. I'll look into your suggestions. |
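
The deadline worry raised in this thread comes down to simple arithmetic: remaining CPU time divided by the hours per day the task actually gets to run. Below is a minimal Python sketch of that estimate; it is not how BOINC itself decides when to go high priority. The function names, the `share` parameter, the assumed "today" of 16 September 2013 and the 16 hours/day figure are all hypothetical, chosen only for illustration. Only the remaining time (read here as 147 h 34 min 56 s) and the 25/09/2013 deadline come from the first post.

```python
from datetime import datetime, timedelta


def projected_finish(now, remaining_cpu_hours, hours_per_day, share=1.0):
    """Estimate the wall-clock finish time of a task.

    hours_per_day: hours the machine is switched on each day (hypothetical input).
    share: fraction of that time the task gets one core to itself
           (1.0 roughly corresponds to running at high priority).
    """
    if remaining_cpu_hours < 0:
        raise ValueError("remaining_cpu_hours must not be negative")
    if not 0 < hours_per_day <= 24:
        raise ValueError("hours_per_day must be in (0, 24]")
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    cpu_hours_per_day = hours_per_day * share
    return now + timedelta(days=remaining_cpu_hours / cpu_hours_per_day)


def meets_deadline(now, deadline, remaining_cpu_hours, hours_per_day, share=1.0):
    """True if the projected finish falls on or before the deadline."""
    finish = projected_finish(now, remaining_cpu_hours, hours_per_day, share)
    return finish <= deadline


# Figures from the first post: 147.34.56 to go, deadline 25/09/2013.
remaining = 147 + 34 / 60 + 56 / 3600
deadline = datetime(2013, 9, 25)

# Assumed, not from the thread: the date of the estimate and the daily uptime.
today = datetime(2013, 9, 16)

print(meets_deadline(today, deadline, remaining, hours_per_day=16))  # shut down at night
print(meets_deadline(today, deadline, remaining, hours_per_day=24))  # run around the clock
```

Under these assumed figures the task misses the deadline by a few hours with overnight shutdowns and finishes with days to spare when run around the clock. As noted earlier in the thread, a late model is still accepted, so a missed date in this estimate does not mean the work is lost.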
©2024 climateprediction.net