climateprediction.net home page
New work
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54840 - Posted: 26 Sep 2016, 23:16:25 UTC

A batch of Africa models showed up a few hours ago, and now they're gone.

There are over 200,000 tasks out there somewhere, so some people must have stockpiles.
That may be why some of these latest batches are showing up: the researchers aren't getting their data back.

Byron Leigh Hatch @ team Carl Sagan
Joined: 17 Aug 04
Posts: 270
Credit: 38,039,661
RAC: 64,767
Message 54842 - Posted: 27 Sep 2016, 0:49:56 UTC - in response to Message 54840.

Hi Les, hi everyone. Would it be possible, or would it cause problems for some crunchers, to change the deadline in BOINC from a year to, say, 3 or 4 months for Climate Prediction? I have two computers: one crunches only Climate Prediction, the other only SETI@home. I think the deadline at SETI@home is two months. Come to think of it, I guess there might be a problem for people who crunch Climate Prediction alongside other projects on one computer; I'm not sure. Anyway, just a thought.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54843 - Posted: 27 Sep 2016, 1:29:00 UTC - in response to Message 54842.

Hi Byron

I asked about this some time ago, and apparently it's not going to change.
This is to do with the second part of your question: It does affect those who are running lots of projects.

But there is a down side for those people who take months to complete their tasks: if the researchers don't get their results in a reasonable time, they can just re-issue them (plus a few more perhaps), as a new batch, and then forget about the earlier unreturned tasks.

Byron Leigh Hatch @ team Carl Sagan
Joined: 17 Aug 04
Posts: 270
Credit: 38,039,661
RAC: 64,767
Message 54849 - Posted: 27 Sep 2016, 14:21:47 UTC - in response to Message 54843.

if the researchers don't get their results in a reasonable time, they can just re-issue them (plus a few more perhaps), as a new batch, and then forget about the earlier unreturned tasks.

Hi Les, I agree with you 100%, in the last few weeks I have come across many computers with big stockpiles, 200 tasks or more sitting on their hard drives since March or April, and they have not uploaded a single Zip file for those tasks. What can be done to solve this?

John Eric Hopkinson
Joined: 27 Jan 05
Posts: 74
Credit: 1,028,011
RAC: 0
Message 54850 - Posted: 27 Sep 2016, 16:26:27 UTC

I have an afr50 in the "ready to start" status while 3 others are crunching with as much as 2 days to completion. If this is common practice, then that adds to the load of idle outstanding work, correct?
This does not seem reasonable when there are no WUs available but other computers are looking for work.
Is the afr50 an orphan from some rejected event?

WB8ILI
Joined: 1 Sep 04
Posts: 98
Credit: 40,088,982
RAC: 54,475
Message 54851 - Posted: 27 Sep 2016, 16:42:14 UTC - in response to Message 54843.

Les -

Is it possible to run some kind of script against the database and find all the tasks that were sent out at least 3 or 4 months ago and have no trickles? Then maybe re-issue them?
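As a rough illustration of the kind of search being asked about, here is a minimal Python sketch. It filters an in-memory list of made-up task records, not the real CPDN database; the field names (`sent`, `trickles`), the task names, and the 90-day cutoff are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def stale_tasks(tasks, now, max_age_days=90):
    """Return names of tasks sent more than max_age_days ago with no trickles."""
    cutoff = now - timedelta(days=max_age_days)
    return [t["name"] for t in tasks if t["sent"] < cutoff and t["trickles"] == 0]

# Hypothetical records; a real server would read these from its own database.
tasks = [
    {"name": "task_a", "sent": datetime(2016, 3, 15), "trickles": 0},
    {"name": "task_b", "sent": datetime(2016, 3, 15), "trickles": 4},
    {"name": "task_c", "sent": datetime(2016, 9, 1), "trickles": 0},
]
now = datetime(2016, 9, 27)
print(stale_tasks(tasks, now))  # only task_a is both old and silent
```

Whether anything found this way should then be re-issued is a separate policy question that the sketch does not try to answer.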

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 877
Credit: 100,083
RAC: 3,242
Message 54852 - Posted: 27 Sep 2016, 16:48:52 UTC - in response to Message 54850.

Is the afr50 an orphan from some rejected event?

The name of that AFR50 is wah2_afr50_a16w_201312_13_451_010735872_0. The trailing "_0" indicates it's the first model issued in that work unit. It might have to wait a few days but it stands a much better chance of completing on your PC than elsewhere. BOINC Manager allows users to build a work buffer to smooth out the supply of models: if the supply of models was smooth in itself then perhaps the buffers would empty.
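The naming convention described above can be sketched in a few lines of Python. This assumes only what the post says, namely that the part after the last underscore is the issue number within the work unit; the function name is made up for the example, and the meaning of the other fields in the name is not decoded here.

```python
def split_task_name(task_name):
    """Split a task name into (work_unit_name, issue_index).

    Assumes the text after the final underscore is an integer issue index.
    """
    work_unit, _, index = task_name.rpartition("_")
    return work_unit, int(index)

name = "wah2_afr50_a16w_201312_13_451_010735872_0"
work_unit, issue_index = split_task_name(name)
is_first_issue = issue_index == 0  # "_0" means the first model issued
print(work_unit, issue_index, is_first_issue)
```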

Les Bayliss
Volunteer moderator
Send message
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54855 - Posted: 27 Sep 2016, 20:57:29 UTC

WB8ILI and Byron

This may not have an elegant solution.
e.g. People may download a few models, then need to be away for a while (hospital, business trip), and have their computer turned off.
Plus there are those who cycle through lots of projects, possibly on a slow computer, so they'd take a while too.
I don't think the project people want to have to decide on a strategy for how long is acceptable.

One of the things that physicists need to be good at, is computer programming, so I'm sure that "our" people can and probably do create and run lots of scripts and searches for lots of reasons.
But what I said before, about issuing a new batch, is what I'd do.
Then it'd only be the hoarders who suffered by way of wasted power costs and bandwidth time.

I wasn't sure where to put this, so I'll just write it here. It occurred to me that all of this is probably a mindset problem, aka "set and forget". And some people have to learn the hard way, and things like that. :)

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54856 - Posted: 27 Sep 2016, 21:12:51 UTC

John

Having waiting tasks can be normal; it all depends on your number of processors and how your cache is set up.
If you're set for, say, 10 days but only have 4 cores, then any tasks beyond those 4 will go into a waiting state. Or they'll swap between them and take turns to run.

I've left the first 2 Network usage options at blank, so I only get a new task when one already running finishes.
Just my way of doing things.

When the new batch of afr50's showed up, I needed some more, so I got some more. But only one was an afr50; the rest were re-sends of other batches. So I've got a motley collection running.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54857 - Posted: 27 Sep 2016, 22:55:16 UTC

John

Another thing.
Your "waiting" task could just as well have ended up on a computer where it would sit for e.g. 6-8 months before starting, so waiting on your computer for a week or 2 isn't a bad thing.

John Eric Hopkinson
Joined: 27 Jan 05
Posts: 74
Credit: 1,028,011
RAC: 0
Message 54859 - Posted: 28 Sep 2016, 13:58:54 UTC - in response to Message 54857.

Thank you Iain and Les.
This Dell XPS has an i7 core but I am reluctant to monkey around with the workload because of temperature concerns. (Actually, I admit that I don't know how to adjust core performance or overclocking.)
The Extreme Tuning Utility shows temps of 80-85C, and that is mildly alarming. I am more familiar with 60-70C on the older equipment. And according to Dell, the XPS clock speed is 1.90 GHz as a means of temperature control, i.e. slower all-round performance but longer life.
This machine is dedicated to CPDN so if you guys are not concerned about heat, I will try some adjustments to allow more cores and more work units.
There are now 3 waiting to start.

Art Masson
Joined: 16 Oct 11
Posts: 163
Credit: 12,090,256
RAC: 5,899
Message 54868 - Posted: 29 Sep 2016, 23:33:47 UTC - in response to Message 54859.

Hi John,

As a data point, I run an i7-3770 at 3.4GHz with six of the eight cores available to BOINC. These six cores run virtually 100% of the time, crunching multiple projects. All these CPUs run between 69 and 71 degrees C with no problems and have been running this way for over 2 years.

Art

klepel
Joined: 9 Oct 04
Posts: 15
Credit: 40,909,273
RAC: 19,723
Message 54869 - Posted: 30 Sep 2016, 15:57:59 UTC

@Bayliss

Another spin for the long deadlines:
I had to change a hard disk on one of my computers. Before the change, I copied the whole ProgramData/BOINC/projects folder to a USB stick. I then changed the disk, installed BOINC, connected to the project again, downloaded some new WUs, merged the two computers with the same name on the homepage of every project, and copied the old WUs from the USB stick to the corresponding folders under the projects folder.

The old WUs are visible under the computer name on the homepage, but not on the computer itself (in BOINC Manager). So if nothing happens, they will be reported as “in process” until the deadline but will never be processed. If someone could point out how to get them recognized by BOINC on this particular computer, I would be very grateful. If this also worked for transferring WUs from one computer to another, that would be even better, as I have some WUs on slow computers that I would like to transfer to faster ones.

Finally, I also had a hard disk crash, so I lost all those WUs, but as the deadline is so long, they won't be reissued for a year.


Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54871 - Posted: 30 Sep 2016, 20:28:53 UTC - in response to Message 54869.

It's necessary to start the copying from, and including, the boinc folder, because that contains lots of important info, such as client_state.xml, which is BOINC's "To Do" list. That's where BOINC stores information about tasks, where they're from, and where to send the results.

Also in there, is the folder slots, which runs parallel to the projects folder.

So without all of this, it's bye bye models.
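For anyone wanting to script the copy, here is a minimal Python sketch of the idea: take the whole BOINC data folder in one go, and refuse to start if client_state.xml isn't in it. The function name and the check are made up for this example and are not part of BOINC; it is presumably safest to shut the BOINC client down before copying so the files aren't changing underneath you.

```python
import shutil
from pathlib import Path

def backup_boinc_data(data_dir, backup_dir):
    """Copy the whole BOINC data directory (client_state.xml, projects, slots).

    Raises FileNotFoundError if client_state.xml is missing, which suggests
    the wrong folder (e.g. only the projects subfolder) was chosen.
    """
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not (data_dir / "client_state.xml").is_file():
        raise FileNotFoundError(f"no client_state.xml in {data_dir}")
    # copytree copies every file and subfolder; backup_dir must not exist yet.
    shutil.copytree(data_dir, backup_dir)
    return backup_dir
```

A call might look like `backup_boinc_data(r"C:\ProgramData\BOINC", r"E:\boinc_backup")`, where both paths are only examples. As the rest of this post notes, restoring such a copy onto a different computer can still cause problems.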

****************

Tasks are logged by the server as being sent to a particular computer, and getting the results back from a different computer can cause problems.

Moving models to a faster computer has, in the past, caused the model to error out with something about a time limit being exceeded. It's been a long while, and I forget what the message said.

John Eric Hopkinson
Joined: 27 Jan 05
Posts: 74
Credit: 1,028,011
RAC: 0
Message 54872 - Posted: 30 Sep 2016, 23:40:34 UTC - in response to Message 54868.

@ Art Masson
Thanks for the info.
The "Brand String" for the CPU on this Dell XPS is "i7-3517U, CPU @ 1.90GHz"; now you may understand the comparison to your setup, but I don't.
The curious thing here is that Intel shows the CPU at 1.90 GHz, but their monitor shows a Max Core Frequency (operational, I assume) of 2.76 GHz. CPU utilization is very low, varying between 15% and 45%.
So the CPU may respond to the demands of the CPDN programs and jump up as required (i.e. auto overclocking?), hence the heat, which is never < 67C.
If I solve some of this I will pass it on, because your operating characteristics indicate that something is amiss here and I could be more useful to the cause.

Alan K
Joined: 22 Feb 06
Posts: 203
Credit: 10,727,650
RAC: 8,191
Message 54876 - Posted: 2 Oct 2016, 22:07:17 UTC - in response to Message 54872.

You could try installing the Intel Extreme Tuning Utility. This can be configured to show which cores are being used, how much, and what temperatures the cores and whole package are getting up to. You can also tweak the CPU usage in BOINC Manager via Computing preferences in the options menu.

Art Masson
Joined: 16 Oct 11
Posts: 163
Credit: 12,090,256
RAC: 5,899
Message 54879 - Posted: 4 Oct 2016, 18:51:27 UTC - in response to Message 54872.

Hi John,

Your Dell motherboard may restrict the CPU speed that you can achieve with your CPU. The Intel Extreme Tuning Utility may provide you some ways to "overclock" but I would advise you to be conservative on increasing the speed. The temperature your CPU runs at is not only a function of the load (% utilization and type of processing) you put on it, but is also constrained by how good a hardware design (i.e. heat sink design/effectiveness) Dell did for you in that machine. I would be careful not to push the utilization or CPU speed up so much that your CPU runs hotter than 85 degrees C or so -- because running that hot will likely cause an early CPU failure eventually. It's better to have a reliable machine consistently crunching CPDN tasks than one that is pushed so hard you get errors -- that's why I've constrained my CPU utilization to 75% in BOINC, so that only six CPUs are available. I've found that if I push to more than 6 CPUs I get consistent but random CPDN work unit processing failures.

Good luck!!!

Art Masson
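The arithmetic behind "75% gives six CPUs" can be sketched in Python as below. It assumes the percentage is applied to the CPU count and rounded down, with a floor of one CPU; that rounding rule is an assumption for this example, so check what BOINC Manager actually reports on your own machine.

```python
def usable_cpus(total_cpus, percent):
    """CPUs available under a 'use at most N% of the CPUs' setting.

    Assumes the result is rounded down and never drops below one.
    """
    return max(1, int(total_cpus * percent / 100))

print(usable_cpus(8, 75))  # the eight-CPU, 75% case described above
```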

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6408
Credit: 16,839,542
RAC: 21,887
Message 54880 - Posted: 4 Oct 2016, 21:33:13 UTC

And to bring this back to topic, there's some more work.
Weather At Home European region.

Dave Jackson
Joined: 15 May 09
Posts: 1790
Credit: 2,671,578
RAC: 898
Message 54881 - Posted: 5 Oct 2016, 8:21:29 UTC - in response to Message 54880.

Hopper now empty again but I suspect there is more on the way soon.

JIM
Joined: 31 Dec 07
Posts: 982
Credit: 14,320,108
RAC: 19,627
Message 54890 - Posted: 7 Oct 2016, 18:45:47 UTC
Last modified: 7 Oct 2016, 18:46:52 UTC

Anyone have any idea when new work will appear? I have a brand new computer and it's hungry to sink its teeth into some nice juicy Climate Models. It’s having to make do with backup projects. Please feed my computer. ;)



Copyright © 2017 climateprediction.net