Is the Rosetta client "linear" ?

Message boards : Number crunching : Is the Rosetta client "linear" ?

mikus

Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 16664 - Posted: 19 May 2006, 22:13:48 UTC

Rosetta is the only BOINC project on my Linux computer. I normally run off-line. My 'work cache' is specified as six days, and my 'target CPU run time' is specified as 12 hours.

When I had BOINC 5.2.13 installed, with stable WUs I don't remember BOINC going into earliest-deadline-first scheduling mode. But now that I have installed BOINC 5.4.9 (which has implemented "tightened" scheduling policies), I've seen EDF mode entered following the downloading of new work. Although in my environment EDF mode makes no difference, this change in client system behavior made me curious.

In discussions on the BOINC list, it was suggested that the BOINC client gets "nervous" about scheduling when the 'progress on the result is non-linear'. [An example of non-linear would be if after the first hour of crunching a 12-hour WU, only 3% 'progress toward the result' were being reported.]


My question: Does the __Rosetta__ client behave linearly -- is the value being reported, EVERY TIME it uses the boinc_fraction_done API, accurate for (accumulated time spent crunching this WU / expected total time spent crunching this WU) ?
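A minimal sketch of what "linear" means in the question above: every reported fraction should match CPU time spent divided by expected total CPU time. The function name, the sample data and the 2% tolerance are hypothetical, chosen only for illustration; nothing here is taken from the Rosetta or BOINC source.

```python
def is_linear(samples, expected_total_s, tolerance=0.02):
    """Return True if every (cpu_seconds, fraction_done) sample stays within
    `tolerance` of the ideal ratio cpu_seconds / expected_total_s."""
    for cpu_s, fraction_done in samples:
        ideal = cpu_s / expected_total_s
        if abs(fraction_done - ideal) > tolerance:
            return False
    return True


TWELVE_HOURS = 12 * 3600

# Hypothetical run that reports exactly h/12 after h hours.
linear_run = [(3600 * h, h / 12) for h in range(1, 13)]

# Hypothetical run matching the example in the post: 3% after the first hour.
lagging_run = [(3600, 0.03), (6 * 3600, 0.50), (TWELVE_HOURS, 1.0)]

print(is_linear(linear_run, TWELVE_HOURS))
print(is_linear(lagging_run, TWELVE_HOURS))
```

The second run fails the check because one hour of a 12-hour WU should show roughly 8.3%, not 3%.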
Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16667 - Posted: 20 May 2006, 0:17:45 UTC - in response to Message 16664.  
Last modified: 20 May 2006, 0:18:11 UTC

The CPU time is linear so long as the work unit is never removed from memory. The percent complete is not necessarily linear. If you change the time setting while the work unit is being processed, it will affect the percent complete. If you remove the work unit from memory, it will affect the CPU time count and the percent complete. The time to completion is almost never correct.

This is why we recommend you decide how you want to configure your computer, and then let it run for some time with those settings so it will stabilize. Depending on what you set up, it could take 2-5 days for it to become stable.


Moderator9
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16713 - Posted: 20 May 2006, 16:25:35 UTC
Last modified: 20 May 2006, 16:26:20 UTC

I think actually it is not linear, if you look at the time to completion. But once a model is completed (generally within an hour or so) the completion time is reduced again. So that wouldn't affect your cache size... for more than an hour anyway. I think the description you are reading is geared more towards the climate projects where you can crunch on a WU for 250hrs and have the estimated runtime be greater than it was when you started.

So I believe the answer is that R@H is "linear enough" for your cache to be properly sized. But it often takes BOINC several days to "learn" how you use your PC and how long the WUs take to run.

Is your PC showing the 12hr runtime as the initial estimate? If so, you will see that estimate increase until model 1 is completed for a given WU.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
mikus

Joined: 7 Nov 05
Posts: 58
Credit: 700,115
RAC: 0
Message 16721 - Posted: 20 May 2006, 18:18:51 UTC - in response to Message 16667.  

The percent complete is not necessarily linear. If you change the time setting while the work unit is being processed, it will affect the percent complete. ... The time to completion is almost never correct.

I think it's been many many days since I've changed ANYTHING. My last change was from BOINC 5.2.13 to BOINC 5.4.9. I have not changed the values in my preferences in weeks; there has been *plenty* of time for a "history" of my processing (with the 12 hour time setting) to stabilize. Plus, the work unit does *not* get removed from memory.

I corresponded with the BOINC developers about seeing a download and a couple of hours later seeing the BOINC client set EDF mode. (Rosetta is the *only* BOINC project on that computer.) They told me that the BOINC client __does__ use the value reported using boinc_fraction_done(). In particular, I interpreted what they said as: "If after one hour of processing it is reported that the result is 3.333% done, the BOINC client will test for deadlines using a formula for completing THAT result which evaluates closer to 30 hours than to the workunit's estimated time."
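The 30-hour figure follows from simple extrapolation: elapsed time divided by the fraction reported so far. A minimal sketch of that arithmetic, assuming pure extrapolation (the real client may weigh in other estimates as well; the function name is hypothetical):

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Extrapolate total run time from elapsed time and the reported fraction."""
    if fraction_done <= 0:
        raise ValueError("fraction_done must be positive to extrapolate")
    return elapsed_hours / fraction_done


# One hour in, 3.333% reported: the projection is about 30 hours.
print(round(projected_total_hours(1.0, 0.03333), 1))

# One hour in, 1/12 reported: the projection matches the 12-hour target.
print(round(projected_total_hours(1.0, 1 / 12), 1))
```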

I believe that (immediately following the download) a report DURING THE PROCESSING OF THE CURRENT WORKUNIT that "inflated" its 'time to completion' would be *enough* to explain why my system was set to EDF mode. The download fetched so many workunits that (given a 14-day deadline but my 6-day cache size) the "safety margin comparing completion time vs. deadline" was only a matter of hours !! Thus "very little" added (e.g., induced by non-linearity) inflation would have been needed to trigger EDF mode.
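A rough sketch of why a thin safety margin makes the queue sensitive to one inflated estimate. The queue size (27 work units of 12 hours each) is a hypothetical number picked so that the margin is "a matter of hours"; it is not taken from the actual download. The sketch also assumes one CPU crunching around the clock, and it is a plain sum, not the simulation the BOINC client actually runs.

```python
def deadline_margin_hours(estimates_h, hours_until_deadline):
    """Hours to spare if the queued work runs back to back on one CPU."""
    return hours_until_deadline - sum(estimates_h)


DEADLINE_H = 14 * 24           # 14-day deadline, in hours

queue = [12.0] * 27            # hypothetical queue: 27 WUs at 12 h each
margin_before = deadline_margin_hours(queue, DEADLINE_H)

# The running WU is now projected at 30 h instead of 12 h.
inflated = [30.0] + queue[1:]
margin_after = deadline_margin_hours(inflated, DEADLINE_H)

print(margin_before)   # 12.0 hours to spare
print(margin_after)    # -6.0, the queue no longer fits before the deadline
```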
Moderator9
Volunteer moderator

Joined: 22 Jan 06
Posts: 1014
Credit: 0
RAC: 0
Message 16722 - Posted: 20 May 2006, 18:33:19 UTC - in response to Message 16721.  
Last modified: 20 May 2006, 18:37:21 UTC

Your basic premise is correct. Making adjustments to the time parameter on the fly will produce the result you are seeing, as would a larger protein that progresses more slowly than more typical work. BOINC is a dynamic scheduler. It makes decisions based on what it sees at the moment it needs to decide on a scheduling issue. If it finds a work unit that is progressing slower than it thought based on the last time it looked, it could very easily fall into EDF mode.

As I said, the CPU time is linear. The percent complete is only partially so, and the completion time is not. The time to completion will climb higher until the percent complete changes, then it will fall back to a more accurate value. If BOINC looks at the work only moments prior to a jump up in the percent complete, it can make what appears to a human to be the incorrect decision in terms of EDF mode. But in fact it does all average out.
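A small sketch of that sawtooth, assuming the percent complete only moves when a model finishes and that the remaining time is extrapolated from elapsed time. The model count (12), the one-hour model length and the function name are hypothetical.

```python
def remaining_estimate_h(elapsed_h, models_done, total_models, initial_estimate_h):
    """Remaining-time estimate when progress is reported once per finished model."""
    fraction_done = models_done / total_models
    if fraction_done == 0:
        # Nothing to extrapolate from yet, so fall back to the initial estimate.
        return initial_estimate_h
    return elapsed_h / fraction_done - elapsed_h


# Just before model 2 finishes: 1.99 h elapsed, still only 1 of 12 models reported.
before = remaining_estimate_h(1.99, 1, 12, 12.0)

# Just after model 2 finishes: 2.0 h elapsed, 2 of 12 models reported.
after = remaining_estimate_h(2.0, 2, 12, 12.0)

print(round(before, 2))   # 21.89
print(round(after, 2))    # 10.0
```

A scheduler that samples at the "before" moment sees more than twice the remaining time it would see a minute later.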

A lot of people who like to micromanage their machines are finding this to be disconcerting, but BOINC really can take care of itself. On my systems I set my time preferences to create a run time similar to other projects I am running. I set it to connect about every .25 days, and then I just let it run. I have never had any issues with those settings. The only time I make adjustments is when I am trying to duplicate an error situation to help a user in the forums. Even then the system adjusts and stabilizes over time. But keep in mind that BOINC adjusts its expectations faster for increases than it does for decreases. So a longer than expected work unit will have more impact than one that is shorter than expected.
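One way to picture that asymmetry is a correction factor that jumps up at once but drifts down slowly. This is an illustrative sketch only: the function name and the 10% downward rate are assumptions, not values taken from the BOINC client.

```python
def update_correction(current, observed_ratio, down_rate=0.1):
    """Asymmetric update of a run-time correction factor.

    observed_ratio is actual run time divided by estimated run time.
    Increases are adopted immediately; decreases are applied gradually.
    """
    if observed_ratio > current:
        return observed_ratio
    return current + down_rate * (observed_ratio - current)


# A WU that took twice as long as estimated doubles the factor in one step.
print(update_correction(1.0, 2.0))

# A WU that came in on time afterwards only pulls the factor down slightly.
print(round(update_correction(2.0, 1.0), 2))
```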

One thing to note here is that for a single project machine EDF mode is not a problem, it is an advantage.

Moderator9
Jose

Joined: 28 Mar 06
Posts: 820
Credit: 48,297
RAC: 0
Message 16829 - Posted: 22 May 2006, 12:07:07 UTC

I would like to know if people running multiple applications have had the experience of BOINC preempting a work unit in one application to run a work unit in another application that has an earlier report date? Have you had problems getting back to the application that was preempted?
“This and no other is the root from which a Tyrant springs; when he first appears he is a protector.”
Plato
Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 16833 - Posted: 22 May 2006, 15:05:34 UTC - in response to Message 16829.  

Well... yes and no. The BOINC debt is a tricky thing. This is normal for BOINC in some situations. And if you suspend WUs and projects to try and get Ralph work etc. this can sort of confuse BOINC as to what your goals are. But, in the end, the deadline is your failsafe. If the project is ignored for too long, and has WUs downloaded, eventually BOINC will see the looming deadline and use earliest deadline first (EDF) mode to get the work done... thus perhaps furthering a debt imbalance. Then the next thing that happens is "why won't BOINC download any work for that project?" Because it got CPU time to meet its deadline, it incurs a debt to the other projects. And so BOINC "knows" it needs to spend time crunching the other projects to balance things out. Then it will bring down more work.

It really does a great job with what are often conflicting goals. But if you look at any given hour of time and ask "why?" it can be confusing. I prefer to look at a 100 hour timeframe. In that timeframe, more or less regardless of the debt situation, you will see your resource shares reflected.
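A toy model of the debt idea, to show why the hour-by-hour picture looks odd while the 100-hour picture matches the resource shares. It is deliberately simplified and is not BOINC's actual debt algorithm; the 3:1 share split between the two projects is a hypothetical setting.

```python
def simulate(shares, hours):
    """Each hour every project is owed its share of the CPU; the project
    owed the most gets the whole hour and pays that hour back."""
    total = sum(shares.values())
    debt = {p: 0.0 for p in shares}
    cpu = {p: 0.0 for p in shares}
    for _ in range(hours):
        for p in shares:
            debt[p] += shares[p] / total
        run = max(debt, key=debt.get)   # project owed the most CPU time
        debt[run] -= 1.0
        cpu[run] += 1.0
    return cpu


# Hypothetical resource shares: Rosetta 3, Ralph 1.
result = simulate({"rosetta": 3, "ralph": 1}, 100)
print(result)   # {'rosetta': 75.0, 'ralph': 25.0}
```

In any single hour one project gets 100% of the CPU, yet over 100 hours the split comes out at 75/25.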


©2024 University of Washington
https://www.bakerlab.org