Why doesn't BOINC send smaller units of work?

old_user502027
Joined: 15 Feb 08
Posts: 2
Credit: 1,711
RAC: 0
Message 32653 - Posted: 18 Feb 2008, 16:41:22 UTC

I am seeing the following two things:

1. If I reboot my computer or put it on standby and then back on, the work starts over from 0%.

2. If the internet goes down in the middle of the night, the work starts over from 0%.

My assessment: BOINC is not robust enough to handle any errors and stops all work, restarting from scratch.

The problem: I had 48 hours of work done over this weekend on my laptop, and then I moved it to take it to work and lost all the work. If BOINC is not robust enough to handle real-life situations, then the work should be chunked into real-life-sized pieces, such as 1-4 hours. I think it's ridiculous otherwise. I have a high-powered laptop and can really help on this project, but not if the program cannot handle real-life situations.

Jennifer

2/18/2008 1:27:01 AM||[error] Couldn't write state file: system rename
2/18/2008 1:27:22 AM|climateprediction.net|Deferring communication for 7 min 52 sec
2/18/2008 1:27:22 AM|climateprediction.net|Reason: scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed: system fopen
2/18/2008 1:27:24 AM|climateprediction.net|[file_xfer] Finished upload of file hadam3h_n_118s2_009c_009c_0_4_4.zip
2/18/2008 1:27:24 AM|climateprediction.net|[file_xfer] Throughput 6074 bytes/sec
2/18/2008 1:27:30 AM||Can't delete previous state file; The process cannot access the file because it is being used by another process. (0x20)
2/18/2008 1:27:42 AM||Can't rename current state file to previous state file; The process cannot access the file because it is being used by another process. (0x20)
2/18/2008 1:27:53 AM||Can't rename state file; Cannot create a file when that file already exists. (0xb7)
2/18/2008 1:27:53 AM||[error] Couldn't write state file: system rename
2/18/2008 1:28:02 AM|climateprediction.net|[file_xfer] Finished upload of file hadam3h_n_118s2_009c_009c_0_4_5.zip
2/18/2008 1:28:02 AM|climateprediction.net|[file_xfer] Throughput 5155 bytes/sec
2/18/2008 1:28:08 AM||Can't delete previous state file; The process cannot access the file because it is being used by another process. (0x20)
2/18/2008 1:28:21 AM||Can't rename current state file to previous state file; The process cannot access the file because it is being used by another process. (0x20)
2/18/2008 1:28:33 AM||Can't rename state file; Cannot create a file when that file already exists. (0xb7)
2/18/2008 1:28:33 AM||[error] Couldn't write state file: system rename
2/18/2008 1:35:20 AM|climateprediction.net|Deferring communication for 19 min 26 sec
2/18/2008 1:35:20 AM|climateprediction.net|Reason: scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi failed: system fopen
2/18/2008 1:54:45 AM|climateprediction.net|Sending scheduler request: To send trickle-up message
2/18/2008 1:54:45 AM|climateprediction.net|Requesting 95040 seconds of new work, and reporting 1 completed tasks
2/18/2008 1:54:55 AM|climateprediction.net|Scheduler RPC succeeded [server version 509]
2/18/2008 1:54:57 AM|climateprediction.net|[file_xfer] Started download of file hadcm3istd_01lk_1920_160_15925688.zip
2/18/2008 1:54:57 AM|climateprediction.net|[file_xfer] Started download of file SULPC_OXIDANTS_19_A2_1990.mod.gz
2/18/2008 1:55:00 AM|climateprediction.net|[file_xfer] Finished download of file hadcm3istd_01lk_1920_160_15925688.zip
2/18/2008 1:55:00 AM|climateprediction.net|[file_xfer] Throughput 2447 bytes/sec
Profile MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 32654 - Posted: 18 Feb 2008, 18:17:19 UTC


Hi Jennifer,

It may be worth browsing the various readme files to get some background info on the climate models. Some machines do kill BOINC as they shut down, but these are usually Vista machines rather than XP, and the result is error code 0, 1, 0xC0000142, or 0x40001004.

Your models are listed here:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?hostid=835685

Your actual error codes (mostly -185) are very unusual: mostly 'cannot read init file', plus another model which also appears to have had a file-permissions problem.
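The three "Can't ..." lines in the log in the first post read like a write-then-rotate update of the client state file: delete the previous copy, rename the current copy to previous, then rename the freshly written file into place. The following is only a minimal sketch of that pattern, not BOINC's actual code; the function name and the file names (`state.xml`, `state_prev.xml`, `state_next.xml`) are hypothetical. It shows why one locked file can make every later step fail too.

```python
import os
import tempfile


def rotate_state_file(directory, new_contents,
                      current="state.xml",        # hypothetical file names,
                      previous="state_prev.xml",  # chosen for illustration
                      pending="state_next.xml"):
    """Write new state to a pending file, then rotate it into place.

    Returns a list naming the steps that failed (empty on success).
    """
    cur = os.path.join(directory, current)
    prev = os.path.join(directory, previous)
    nxt = os.path.join(directory, pending)
    failures = []

    # Write the new state to a separate file first, so a crash during the
    # write never damages the current state file.
    with open(nxt, "w", encoding="utf-8") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())

    # Step 1: delete the previous copy. On Windows this fails if another
    # process (a virus scanner or backup tool, say) has the file open.
    try:
        if os.path.exists(prev):
            os.remove(prev)
    except OSError:
        failures.append("delete previous")

    # Step 2: current -> previous. If step 1 failed, the target still
    # exists, and on Windows os.rename refuses to overwrite it.
    try:
        if os.path.exists(cur):
            os.rename(cur, prev)
    except OSError:
        failures.append("rename current to previous")

    # Step 3: pending -> current. If step 2 failed, the current file is
    # still there, so on Windows this rename fails as well.
    try:
        os.rename(nxt, cur)
    except OSError:
        failures.append("rename pending to current")

    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(rotate_state_file(d, "<state>1</state>"))
        print(rotate_state_file(d, "<state>2</state>"))
```

If something holds the previous state file open, the failures cascade in the same order as the log lines, which would fit a file-locking or permissions problem on the BOINC data directory rather than a fault in the model itself.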

Do you always run the models with the same user id + domain? (For example, do you log into a company domain at work, but into a local account at home?) If this is the case, then one solution might be to install as a 'service' using a particular account.

If not, then I would suggest right-clicking on the BOINC icon and clicking 'exit' just before shutting down or going into standby. If you run with 'advanced/network activity disabled' most of the time (turning it on again briefly every few days to allow the climate data to upload), that should solve the problem with the internet access failing.

Nice fast looking machine by the way.

I'm a volunteer and my views are my own.
old_user502027
Joined: 15 Feb 08
Posts: 2
Credit: 1,711
RAC: 0
Message 32706 - Posted: 22 Feb 2008, 19:38:27 UTC - in response to Message 32654.  

Yeah, I think it's too much trouble to figure this out. If I can't install it and have it just work and submit work in small chunks, then I can't really have it on the machine. It wastes my CPU cycles without producing any value at all for this worthy project. I hope someone at BOINC reads this feedback and makes a more robust and user-friendly program. Good luck in the future, folks! Maybe I'll check back in a few years and see if you've improved your code.

(uninstalling now) Jen
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 32707 - Posted: 22 Feb 2008, 19:58:27 UTC


The problem with your last post is that you're confusing BOINC (which just looks after the uploads and downloads) with the science application, which does the actual work.
BOINC is an American program, used by nearly 100 different projects, whereas the science app here was developed by the UK's Met Office.

This climate program is huge, and normally runs on supercomputers for climate and weather forecasting; getting it to run on a desktop/laptop requires some TLC.

But it can be done; thousands of people HAVE got it running stably.
And the work IS submitted in small chunks: every model year for the smallest type of model.

Your main problems may be that you're not allowing the model to run long enough to reach a "checkpoint" (where it saves the data), and the way in which you shut down.
The former will force the model to restart from the beginning, because there's nothing later to start with, and the latter may cause a model to crash.
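
To illustrate the checkpoint behaviour described above, here is a toy sketch, not the climate model's actual code. The function name `run_model`, the JSON checkpoint file, and the step counts are all assumptions made for the example. A run that stops before its first checkpoint has nothing saved and starts again from zero, while a run that stops later resumes from the last saved step.

```python
import json
import os
import tempfile


def run_model(total_steps, checkpoint_every, checkpoint_path, stop_after=None):
    """Advance a toy 'model' one step at a time, saving progress periodically.

    stop_after simulates an interruption (reboot, standby) after that many
    steps in this session. Returns the step reached when the session ends.
    """
    # Resume from the last checkpoint if one exists; otherwise start at 0.
    step = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as f:
            step = json.load(f)["step"]

    done_this_session = 0
    while step < total_steps:
        if stop_after is not None and done_this_session >= stop_after:
            break  # interrupted: steps since the last checkpoint are lost
        step += 1
        done_this_session += 1
        if step % checkpoint_every == 0:
            # Write to a temporary file, then swap it in, so an interruption
            # during the write cannot corrupt the existing checkpoint.
            tmp = checkpoint_path + ".tmp"
            with open(tmp, "w", encoding="utf-8") as f:
                json.dump({"step": step}, f)
            os.replace(tmp, checkpoint_path)
    return step


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.json")
        # Interrupted before the first checkpoint: nothing is saved.
        print(run_model(100, 10, path, stop_after=7))
        # Interrupted after two checkpoints: the next session resumes there.
        print(run_model(100, 10, path, stop_after=25))
        print(run_model(100, 10, path))
```

In this sketch, the gap between checkpoints sets how much work an interruption can cost, which is why letting the model reach a checkpoint before shutting down matters.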
