Message boards :
Number crunching :
New error message
Author | Message |
---|---|
Send message Joined: 5 Jul 09 Posts: 63 Credit: 6,091,274 RAC: 0 |
Just out of curiosity, what is the definition around here of a hoarder? Kevin |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
The scientists would have to define it, but I would say along the lines of someone who holds more work units in their buffer than they can finish in the time it takes another user to do them. In other words, hoarding slows down the science rather than expediting it. |
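To make that rule of thumb concrete, here is a minimal sketch of the arithmetic. It is not a project definition: the function names, the assumption that each task fully occupies one core, and the reference turnaround figure are all illustrative assumptions.

```python
def buffer_clear_days(tasks, hours_per_task, cores, hours_on_per_day=24.0):
    """Estimate how many days a host needs to finish everything in its buffer.

    Assumes (hypothetically) one task per core and a constant run time per task.
    """
    if cores <= 0 or hours_on_per_day <= 0 or hours_per_task < 0 or tasks < 0:
        raise ValueError("inputs must be positive")
    if tasks == 0:
        return 0.0
    # A host cannot use more cores than it has tasks.
    busy_cores = min(cores, tasks)
    total_cpu_hours = tasks * hours_per_task
    return total_cpu_hours / (busy_cores * hours_on_per_day)


def looks_like_hoarding(tasks, hours_per_task, cores, reference_days,
                        hours_on_per_day=24.0):
    """True if the buffer takes longer to clear than the reference turnaround.

    reference_days stands in for "the time it takes another user to do them";
    the right value is a judgment call, not something the project publishes.
    """
    days = buffer_clear_days(tasks, hours_per_task, cores, hours_on_per_day)
    return days > reference_days
```

For example, 20 buffered tasks at 240 hours each on a 4-core machine that is always on gives 4800 / 96 = 50 days, which would count as hoarding against a 30-day reference.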
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I recently suggested shorter times, but didn't get anywhere, which is why I think what I said in my post may be "how things are". Yes, all you need for a start is just to shorten the deadlines. The reliable computers are used when they need a quorum of two results from different users to agree (i.e., be validated against each other). The reliable machine is just a way to identify a third party who usually gets the results back in time to meet the deadline (usually seven days in their case), in case one of the first users does not return the result in a timely manner. But given the long deadlines here, and that you don't use the quorum system anyway, it is quite unnecessary. I just mention it to point out some of the techniques that some of the other projects use. |
Send message Joined: 3 Sep 04 Posts: 126 Credit: 26,610,380 RAC: 3,377 |
It might help giving a bonus for a completed result. Now it doesn't matter for the credits if one finishes tasks or not. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
"It might help giving a bonus for a completed result." This has been brought up a few times over the years. And if you stand well back, and look at the matter sideways, (sort of thing), then there IS a bonus for finishing quickly - you get to run more models in a given time, thus increasing your monopoly money faster. Unlike most, if not all, other projects, cpdn is all statistics: throw a lot of data sets at the problem, wait to see how they fare, and then throw more data sets at it if necessary. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, Jim's pretty right with that. I recently came across one model that had been sent in early February last year, and was only run and completed in January this year. And looking at the dates/times of trickle returns it was run fast. And all of my recent posts in this thread are just my thoughts on the matter. I could be way out. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
47an That's the well-known Intel FORTRAN error. There are a number of threads about it on these boards. One is here in the Windows section: Visual Fortran Runtime error |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
Hi, Jim, "Yes, all you need for a start is just to shorten the deadlines." We went through that battle early in the project and a few times since. The problem with shorter deadlines, given relatively long CPDN tasks (though they are MUCH shorter now), is that CPDN too quickly gets to "High Priority" for CPU time with short deadlines. People running multiple projects objected -- strenuously. The 'solution' was longer deadlines for CPDN -- good for other projects, not so good for CPDN. Jim "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 9 Dec 05 Posts: 116 Credit: 12,472,534 RAC: 1,333 |
If I have understood correctly, the current versions of BOINC want to work through all (or almost all) of the work from one project before switching projects. That is at least what I am seeing for CPDN. If I have CPDN work on my host, BOINC will crunch it in a single go until there is only one or two hours left to calculate on that WU before switching projects. Projects are usually not switched every 60 minutes like the preferences are set to do. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The new versions initially only run a single task from a newly added project. And I think that it also only runs one project at a time. This is so that BOINC can get an idea of how long these new-to-it tasks will take to run. With the climate models taking so long to complete, this learning can take a long time, especially if the computer is not on all the time, or is heavily used by the owner. "Projects are usually not switched every 60 minutes like the preferences are set to do." The "switch time" is only a timer that tells BOINC the earliest moment that it can start to consider switching. When it gets to this time, other factors come into play which can delay the switching for longer, so, yes, the projects don't get switched at exactly that interval, e.g. 60 minutes. |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
I'm stealing your Thread for the moment, Les, because the Board wouldn't allow me to create a new one (Missing parameter error -- though nothing was missing.)
Re.: wah2_sas50_
This error sequence occurred on two machines (I didn't give a third box the satisfaction.) Staff notified.
3/20/2016 12:30:30 PM|climateprediction.net|Sending scheduler request: To fetch work. Requesting 2322726 seconds of work, reporting 0 completed tasks
3/20/2016 12:30:35 PM|climateprediction.net|[error] Can't parse file info in scheduler reply: file name is empty or has '..'
(Ditto -- for another 13 lines)
3/20/2016 12:30:35 PM|climateprediction.net|Scheduler request succeeded: got 5 new tasks
3/20/2016 12:30:35 PM|climateprediction.net|[error] State file error: missing file
3/20/2016 12:30:35 PM|climateprediction.net|[error] Can't handle task wah2_sas50_fpqd_201412_13_367_010380127_1 in scheduler reply
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest. |
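For anyone wanting to pull the error lines out of a longer event log, here is a minimal sketch. It assumes only the `timestamp|project|message` layout and the US-style date format visible in the lines quoted above; other BOINC versions or locale settings may format the log differently, and the function names are made up for illustration.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one 'timestamp|project|message' line into its three fields.

    The date format is inferred from the quoted log and is an assumption.
    """
    stamp, project, message = line.split("|", 2)
    when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
    return when, project.strip(), message.strip()


def error_messages(lines):
    """Return the message text of every line tagged [error]."""
    found = []
    for line in lines:
        if "|" not in line:
            continue  # skip notes such as "(Ditto -- for another 13 lines)"
        _, _, message = parse_log_line(line)
        if message.startswith("[error]"):
            found.append(message)
    return found
```

Feeding it the lines quoted above would return the three `[error]` messages and skip the two ordinary scheduler lines.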
Send message Joined: 29 May 08 Posts: 128 Credit: 6,289,876 RAC: 0 |
astroWX wrote:
I see that I've returned two of these with similar stderr output. Any advice on what to do with any in our queues ready to start? |
Send message Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0 |
In my case, nothing can be done because they don't exist on my machines -- though my server account shows them alive and well -- and assigned to me. However, there is a rather strong indication that they are not being confused with anything else on the machine -- one box had no main-site work on it and still has no main-site work on it, despite server evidence that that machine owns one. (The other box supposedly caught five new tasks - only four are listed on the machine [three EU and one AFR]...) If you have (non-returned) live ones on your machine, my suggestion is to hold them until we hear from staff, probably Monday. "We have met the enemy and he is us." -- Pogo Greetings from coastal Washington state, the scenic US Pacific Northwest. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
There's been an email back about it - looks like a template error. So, dump any of these you have and try again. If trouble persists, post back here. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
"There's been an email back about it - looks like a template error." What batches are the bad WUs from so they can be dumped? |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Looking at the last line in Astro's post, they're sas50, batch 367. |
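If you want to check a queue for tasks from that batch, the same reading of the task name can be sketched in code. The field positions are an assumption inferred from the single name in Astro's post (second field `sas50`, sixth field `367`); they are not a documented naming scheme, and the function name is made up for illustration.

```python
def region_and_batch(task_name):
    """Pick the region code and batch number out of a task name.

    Assumes underscore-separated fields laid out like
    wah2_sas50_fpqd_201412_13_367_010380127_1, i.e. region in the
    second field and batch in the sixth.
    """
    parts = task_name.split("_")
    if len(parts) < 6:
        raise ValueError("unexpected task name layout: " + task_name)
    return parts[1], int(parts[5])


def tasks_in_batch(task_names, region, batch):
    """Return the task names whose region and batch match."""
    return [name for name in task_names
            if region_and_batch(name) == (region, batch)]
```

Calling `tasks_in_batch(names, "sas50", 367)` on a list of queued task names would list the ones to abort.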
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
Thanks. Fortunately I only had 1 from that batch. It has been aborted. |
Send message Joined: 3 Sep 04 Posts: 105 Credit: 5,646,090 RAC: 102,785 |
test post... unable to create new thread ... But I can post to this thread?? I get "Unable to handle request missing or bad parameter: id; supplied:" |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Yes, that's a known problem. It's been on the back burner while the credits problem was/is being worked on. |