climateprediction.net (CPDN) home page
Thread 'Boinc config question'

Message boards : Number crunching : Boinc config question

JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 60694 - Posted: 22 Jul 2019, 17:01:00 UTC

With the lack of CPDN work recently, I wanted to set up a 2nd project that would run smaller tasks only when CPDN was doing nothing. Is there a good way to do this? I've set CPDN to have 90% processing and the other to 10%, yet routinely CPDN tasks are set aside and all running tasks (32 cores) are the side app.
ID: 60694
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 60695 - Posted: 22 Jul 2019, 18:35:51 UTC - in response to Message 60694.  

One way is to set the "other" project to 0% resource share on its project preference page (not in BOINC). Then BOINC will download work from the other project only when CPDN does not have work. However, that works best when the other project has relatively long work units (several hours in length). Otherwise, BOINC is constantly having to download new work units, since it downloads only one at a time per free core.

But you could still do what you want via the BOINC manager. In fact, I would leave CPDN at 100% and set the other to 1%. It will of necessity be working on the "other" project most of the time; that is what you want. There is no point leaving the CPU unused when CPDN does not have work. Then, when CPDN does get work, it will bump off the other project and take precedence.
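
For what it's worth, resource shares are relative weights, so the long-run target for each project is its share divided by the sum of all shares. A minimal Python sketch of that arithmetic (the function name and project labels are made up for illustration; nothing here is BOINC's own code):

```python
def share_fractions(shares):
    """Turn BOINC-style resource shares (relative weights) into fractions.

    `shares` maps a project label to its resource share. A share of 0 gets
    no long-run allocation here, which matches the "backup project" idea.
    """
    total = sum(shares.values())
    if total == 0:
        raise ValueError("at least one project needs a non-zero share")
    return {name: value / total for name, value in shares.items()}


# The 90/10 split from the first post versus a 100/1 split.
print(share_fractions({"CPDN": 90, "other": 10}))
print(share_fractions({"CPDN": 100, "other": 1}))
```

These are targets the client works toward over time, not a per-core split at any given moment, so it is quite possible to see every core on one project for a while.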

None of this is relevant unless CPDN sends out work, however.
ID: 60695
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 60696 - Posted: 22 Jul 2019, 19:30:33 UTC - in response to Message 60695.  

I am slowly gaining free cores but am going to wait until the weather cools down before letting WCG back on my boxes.
ID: 60696
JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 60697 - Posted: 22 Jul 2019, 19:34:00 UTC - in response to Message 60695.  

>Then, when CPDN does get work, it will bump off the other project and take precedence.

This doesn't seem to be happening tho. I currently have 8 jobs for CPDN locally and Rosetta's jobs keep hogging the cores. Usually there are about 20 jobs running, but if i let Rosetta run at all it takes over all cores, not just the available ones.

I currently have Rosetta set to 'no new tasks' so these 8 can actually run.

I'll set Rosetta to 0 and see what happens.
ID: 60697
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 60698 - Posted: 22 Jul 2019, 20:01:51 UTC

I admit to ignorance when it comes to how this sharing actually works. I think one of the problems is that cpdn tasks have incredibly long deadlines (most 11+ months) compared to other projects.
ID: 60698
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 60699 - Posted: 22 Jul 2019, 20:08:10 UTC - in response to Message 60697.  

> This doesn't seem to be happening tho. I currently have 8 jobs for CPDN locally and Rosetta's jobs keep hogging the cores. Usually there are about 20 jobs running, but if i let Rosetta run at all it takes over all cores, not just the available ones.

It takes time (several weeks) for BOINC to adjust to the settings. I speed it up to a couple of days by the use of a cc_config.xml file. You create it in a text editor (Notepad), but use the "save as" function to save it as an ".xml" file, not a .txt file. Then, you place the resulting file in the BOINC Data folder, and restart BOINC (or read the file in via "Options/Read config files"). Usually, I just reboot.

It looks like this:
<cc_config>
  <options>
    <rec_half_life_days>1.000000</rec_half_life_days>
  </options>
</cc_config>
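
To get a feel for why a shorter half-life speeds things up, here is a rough Python sketch. It assumes the client's record of recent work fades by plain exponential decay with the configured half-life, and the 10-day figure used for comparison is what I understand the default to be, so treat both as assumptions. The function names are made up for this example.

```python
import math


def remaining_fraction(days_elapsed, half_life_days):
    """Fraction of old scheduling history still counted after `days_elapsed`."""
    return 0.5 ** (days_elapsed / half_life_days)


def days_until_fraction(half_life_days, fraction):
    """Days until old history has faded to `fraction` of its original weight."""
    return half_life_days * math.log2(1.0 / fraction)


# Compare an assumed 10-day default with the 1-day value in the file above.
for half_life in (10.0, 1.0):
    days = days_until_fraction(half_life, 0.10)
    print(f"half-life {half_life:4.1f} d: old history below 10% after {days:.1f} days")
```

Under those assumptions the old history takes about a month to fade to 10% with a 10-day half-life and a little over three days with a 1-day half-life, which lines up with "several weeks" versus "a couple of days".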
ID: 60699
JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 60700 - Posted: 22 Jul 2019, 21:35:26 UTC

So I've got Rosetta at 0% and BOINC filled up the free cores with Rosetta, but left CPDN running which is what i want.

It didn't get itself any queue of work, which isn't great in case of network outage. I'm happy with where it's at tho. Now if CPDN could just publish some more work so i could see that it pushes out Rosetta, I'd feel all warm and fuzzy.

Thanks for the help everyone!
ID: 60700
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 60701 - Posted: 22 Jul 2019, 21:57:22 UTC - in response to Message 60700.  

> It didn't get itself any queue of work, which isn't great in case of network outage. I'm happy with where it's at tho.

Sure. At 0%, you don't get any extra, only when it is needed.


> Now if CPDN could just publish some more work so i could see that it pushes out Rosetta, I'd feel all warm and fuzzy.

I think that would be useful too.
ID: 60701
JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,363,583
RAC: 5,022
Message 60702 - Posted: 23 Jul 2019, 14:43:02 UTC - in response to Message 60701.  

I’ve found that the only way to make BOINC share time between CPDN and other projects is to set the work buffer very small, at around 1 day or less. When the work buffer is set higher than this, it tends to give all the time to the other non-CPDN projects.
ID: 60702
JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 60703 - Posted: 23 Jul 2019, 15:07:46 UTC - in response to Message 60702.  

The prioritization between projects is strange. Anyone know a BOINC dev I could talk to about it? Just so I can understand how it works and see if there is any way to get what i want out of it.
ID: 60703
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 60704 - Posted: 23 Jul 2019, 20:56:11 UTC - in response to Message 60703.  

> The prioritization between projects is strange. Anyone know a BOINC dev I could talk to about it?

I have given up trying to figure out the BOINC scheduler. Some combinations of projects work, while others don't.
That is, you may get all of one project running, then all of another, without much mixing of them.

You can ask on the BOINC forum.
https://boinc.berkeley.edu/dev/forum_forum.php?id=10
But there are no longer any paid developers; it is all volunteer, so there may be no one to fix any particular problem. And I think no one really understands the BOINC scheduler anyway, which doesn't make it any easier. But if you can cite a particular problem, maybe someone can help.
ID: 60704
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 60706 - Posted: 24 Jul 2019, 1:21:11 UTC - in response to Message 60696.  

> I am slowly gaining free cores but am going to wait until the weather cools down before letting WCG back on my boxes.


There is a new feature for WCG* that allows you to set a process limit for each of their (5) current projects. You can set from 0 to some relatively large number (64? I do not remember). If you do that, it will download only that many. So if you have 32 cores and actually want to run all the WCG projects that are active, you can set them all to 1 and be sure to have 27 spare cores for other projects.

_____
* You do it on their web site. Select Settings near the top right of their page, select Device Manager from the menu on the left of the screen, then select the appropriate profile (I select the default). You will get to a page full of stuff. Near the bottom it says:

Project Limits

The following settings allow you to set the maximum number of tasks assigned to one of your devices for a project.

Please note that use of these settings could cause your device to not always have work to run if one or more of the projects does not have work available at the time your device requests work.
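
The arithmetic behind the 27 spare cores is easy to check for other setups. A tiny Python sketch, assuming each task occupies one core and every limited project actually has work to hand out (the function name is just for illustration):

```python
def spare_cores(total_cores, project_limits):
    """Cores left for other projects if every limited project is at its cap.

    `project_limits` is a list of per-project task limits, one task per core.
    """
    used = min(total_cores, sum(project_limits))
    return total_cores - used


# Five WCG projects capped at one task each on a 32-core machine.
print(spare_cores(32, [1, 1, 1, 1, 1]))  # prints 27
```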

ID: 60706
mmonnin

Joined: 28 May 17
Posts: 49
Credit: 17,422,443
RAC: 2,301
Message 60710 - Posted: 24 Jul 2019, 22:41:44 UTC

I have no idea why it was set to take so long for resource share to get to the desired ratio between projects. I don't see the benefit of having a setting that takes a month to get to the desired effect.

Very long tasks like CPDN's can mess with that as well. Users are better off setting a project_max_concurrent. That can, however, put a host in a spot where there are CPDN tasks available but limited by that option, and no other tasks will download from other projects because the queue is full. Cores can end up idle.

If I end up downloading more CPDN tasks than I want to run at once, I will suspend any buffer and extra tasks I do not want to run. That stops more from downloading and clears up the queue for another project. Set CPDN to something like 1000 resource share so its tasks never get interrupted, and another project at 1% so it won't take over anything but the leftover cores. It's above 0% so a queue will download.

When some CPDN tasks complete I release some more to run. Repeat.
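
For anyone wanting to try the project_max_concurrent route, here is a small Python sketch that produces the text of an app_config.xml. As far as I know the file belongs in that project's folder under the BOINC data directory and is picked up by "Options/Read config files"; the limit of 8 is a placeholder, not a recommendation, and the helper name is made up.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_concurrent):
    """Return app_config.xml text that caps a project's running tasks."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config(8)  # 8 is only an example value
print(xml_text)
# To use it, save xml_text as app_config.xml in the project's folder
# (the exact path depends on your install, so it is not hard-coded here).
```

Keep in mind the caveat above: a capped project with a full queue can still leave cores idle.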
ID: 60710
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4542
Credit: 19,039,635
RAC: 18,944
Message 60712 - Posted: 25 Jul 2019, 5:44:01 UTC - in response to Message 60710.  

> When some CPDN tasks complete I release some more to run. Repeat.


That system works well. I only run projects other than CPDN when there is nothing from CPDN so I work things a bit differently, suspending other projects when I have work from here.

I think the major problem with getting resource share to work is that there are so many different possibilities among what users want to do with it, as well as the effect of CPDN having very long work units and even longer deadlines. Experienced users mostly find what works for them, but unless someone decides they have the time to document this with lots of examples for different situations, most of those wanting help will always be relying on the various forums. (Actually, I think this would be the case even if it did have such extensive documentation!) Hands up all those who have read more than three pages of the BOINC user docs?
ID: 60712
JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 61042 - Posted: 27 Sep 2019, 16:48:38 UTC

So it looks like there is new work showing up finally, but my client refuses to take any. CPDN is set up with 100% resource share, and Rosetta has 0. This arrangement worked well as my work dried up, but now that there is new CPDN work it's not pulling new tasks. I've set Rosetta to not pull tasks and let it run for a while, so there are free cores now. I've removed and re-added the project to my client. Restarted the computer. Still nothing from CPDN. Any thoughts on what's going on?
ID: 61042
mngn

Joined: 13 Jul 18
Posts: 38
Credit: 62,933,508
RAC: 84,702
Message 61043 - Posted: 27 Sep 2019, 16:58:28 UTC - in response to Message 61042.  

The UK Met Office HadCM3 short WUs are Linux/x86 only.
https://www.cpdn.org/apps.php
ID: 61043
JustSomeGuy

Joined: 20 May 19
Posts: 14
Credit: 6,708,175
RAC: 0
Message 61045 - Posted: 27 Sep 2019, 19:01:37 UTC - in response to Message 61043.  

Ahhh, Thank you.
ID: 61045
