Switching between applications

Martin Johnson
Joined: 18 Oct 05
Posts: 19
Credit: 171,164
RAC: 0
Message 1774 - Posted: 26 Oct 2005, 5:49:15 UTC

I believed that the idea of using BOINC was to allow the sharing of resources. If, for instance, I wish to switch between applications every 30 minutes, then I should be allowed to do so.
But not with Rosetta! Some work units are so large that they do not write to disc within the 30 minute share period, and so start again at the previous disc-write. This means the same calculations are done again and again without the percentage increasing, thus wasting much crunch time.
But I have already specified in my preferences that I wish all applications to write to disc every 10 seconds, so that a re-start can be accomplished at virtually any point. All other applications seem to do this.
May I ask if Rosetta will eventually allow this?
ID: 1774 · Rating: 0

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 1777 - Posted: 26 Oct 2005, 7:29:48 UTC

Um, none of the applications will guarantee that.

The checkpoint setting is not a command, as it seems; it is a, um, suggestion. The "save every X" value is used by the science application in a test, when IT is ready to checkpoint, to see whether a checkpoint is allowed (if I understand the code correctly).

Rosetta@Home right now does not checkpoint as often as it "should" and the project is supposed to be looking into making a change for that. With that, you would be able to checkpoint more often, but you are never going to get all science applications to checkpoint as often as you are suggesting.

In some cases (CPDN comes to mind), a checkpoint may require such a large write of data that it would take less time to recalculate than to recover it ...
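
To make the "suggestion, not command" point concrete, here is a minimal Python sketch of how such a check could work. It is a toy model under stated assumptions, not BOINC's actual code: the class and method names are hypothetical stand-ins, and the times are invented for illustration.

```python
class CheckpointGate:
    """Toy model of the 'write to disk at most every N seconds' preference.

    Hypothetical names; this only illustrates the idea described above.
    """

    def __init__(self, min_interval_s: float):
        self.min_interval_s = min_interval_s
        self.last_checkpoint_s = 0.0

    def time_to_checkpoint(self, now_s: float) -> bool:
        # The preference only answers "is a checkpoint allowed yet?".
        return now_s - self.last_checkpoint_s >= self.min_interval_s

    def checkpoint_completed(self, now_s: float) -> None:
        self.last_checkpoint_s = now_s


def run(safe_points_s, min_interval_s):
    """Return the times (seconds) at which checkpoints are actually written.

    safe_points_s: times at which the application reaches a state it is
    able to save. The application, not the user, decides these.
    """
    gate = CheckpointGate(min_interval_s)
    written = []
    for t in safe_points_s:
        if gate.time_to_checkpoint(t):
            written.append(t)
            gate.checkpoint_completed(t)
    return written


# The app reaches a saveable state only every 40 minutes; the user asked for 10 s.
print(run([2400, 4800, 7200], min_interval_s=10))        # [2400, 4800, 7200]

# The app could save every 5 s; a 60 s preference thins the writes out.
print(run(list(range(5, 185, 5)), min_interval_s=60))    # [60, 120, 180]
```

In the first case the 10-second preference changes nothing, because the application never offers a checkpoint more often than every 40 minutes. In the second case the preference does its real job, which is to reduce disk writes.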
ID: 1777 · Rating: 0

Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,177
RAC: 21
Message 1805 - Posted: 26 Oct 2005, 19:29:47 UTC

The more often you switch, the more work you're going to lose, on all projects... I believe the SETI app checkpoints every three minutes (I could be wrong) so no matter what your MINIMUM checkpoint time is set to, you could still lose up to 3 min of work. Rosetta's interval is obviously even longer. (Yes, too long IMHO, because it's based on 'where' it is and not 'when' it is, and 'where' moves slowly in some WUs; this may not be fixable!) The bottom line is that the preference allows you to set the minimum, to reduce writes to disk - it doesn't direct the application to write at the time you set, just to make sure that at least that much time has elapsed since its last write.

Someone did a lot of research and for _their_ mix of projects, on their computer, they determined that the best switch-time was an hour and a half. You CAN switch every 30 minutes if you like - you aren't being prevented from doing this - but that doesn't mean that you SHOULD, or that it's a good idea. Basically, the slower the computer, the longer you should give between switches; but even on faster computers, the hour "default" is probably a good amount to give each project. Leaving the applications in memory will reduce the amount of lost time as well...
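
A rough way to see why short switch times hurt is to count how much of a time slice survives a restart from the last checkpoint. The sketch below is a simplified model, not a measurement: it assumes checkpoints happen at a fixed interval counted from the start of the slice, and the 20 and 45 minute intervals are made-up examples (only the 3 minute figure comes from the guess above).

```python
def progress_per_slice(slice_min, checkpoint_every_min, keep_in_memory):
    """Minutes of work that survive one time slice."""
    if keep_in_memory:
        # A preempted task is only paused, so nothing is recomputed.
        return slice_min
    # Otherwise the task restarts from its last checkpoint next time.
    completed_checkpoints = slice_min // checkpoint_every_min
    return completed_checkpoints * checkpoint_every_min


for checkpoint in (3, 20, 45):
    kept = progress_per_slice(30, checkpoint, keep_in_memory=False)
    print(f"checkpoint every {checkpoint} min: keeps {kept} of 30 min")
```

With a 45 minute checkpoint interval and a 30 minute slice, nothing is kept at all, which is the "same calculations again and again" behaviour described in the first post. Leaving the application in memory keeps the full 30 minutes in this model.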

ID: 1805 · Rating: 0

Martin Johnson
Joined: 18 Oct 05
Posts: 19
Credit: 171,164
RAC: 0
Message 1861 - Posted: 28 Oct 2005, 2:27:08 UTC

Thank you both for this information. The problem is that there are no readily available guidelines. And Boinc seems to be in its infancy.
For instance, on my faster computer, I have 5 projects running, with roughly equal share percentage. Rosetta units have bottomed out (on Boinc 5.2.2) at a projected duration time of about 30 minutes. Because I cannot ensure that I can connect every day, I set the connect time to every 2 days. So what does Rosetta download today? 51 units! This is a bit silly. I really only wanted 21! Some more tweaking still needs to be done.
ID: 1861 · Rating: 0

Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,177
RAC: 21
Message 1862 - Posted: 28 Oct 2005, 7:21:36 UTC - in response to Message 1861.  
Last modified: 28 Oct 2005, 7:22:22 UTC

Rosetta units have bottomed out (on Boinc 5.2.2) at a projected duration time of about 30 minutes. Because I cannot ensure that I can connect every day, I set the connect time to every 2 days. So what does Rosetta download today? 51 units! This is a bit silly. I really only wanted 21!


At 30 min/WU, that's 48 per day, so 96 would be two days' worth. Realize that the program has no way of knowing what "will be" available from other projects when it's looking for work, or what the deadlines are; only what is "on hand" at the moment. If you have _nothing_ in the queue on your computer, then the first project it connects to will be asked for two days' work. The second will be asked for about one day's work. (I think? Not sure on this.) Then the third through fifth will be asked for (two days/5), call it 10 hours each. The program has to assume that when you connect every two days, you really will be "unplugged" in between, and it has to make sure it has plenty of work to last two days, even if only one project has work available at the moment it connects. If you already had one day's work waiting to run, then when you connected to Rosetta, it wanted one more day's worth, but you were expecting 10 hours' worth...
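
The arithmetic above can be checked with a few lines of Python. This is only the back-of-the-envelope calculation from this post, not the client's real work-fetch logic; the helper name `units_for` is made up for the example.

```python
def units_for(hours_of_work, minutes_per_unit):
    """Number of work units needed to fill the given number of hours."""
    return round(hours_of_work * 60 / minutes_per_unit)


MINUTES_PER_UNIT = 30   # projected Rosetta duration from the quoted post
CONNECT_DAYS = 2        # "connect every 2 days" preference
PROJECTS = 5            # roughly equal resource shares

two_days = units_for(CONNECT_DAYS * 24, MINUTES_PER_UNIT)
one_day = units_for(24, MINUTES_PER_UNIT)
equal_share = units_for(CONNECT_DAYS * 24 / PROJECTS, MINUTES_PER_UNIT)

print(two_days, one_day, equal_share)   # 96 48 19
print(51 * MINUTES_PER_UNIT / 60)       # 25.5 hours in the 51 units received
```

So the 51 units that arrived amount to about 25.5 hours, close to one day's work, while an equal one-fifth share of two days would have been about 19 units.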

As with most things BOINC, the design is for everything to work out in the LONG run, not necessarily at every single connection, app-switch, etc. In this case, on your next connection, it will ask for less Rosetta - probably - but possibly more of something else.

Oh - the Wiki is a good source for detailed information on this and almost any other BOINC topic!

ID: 1862 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1865 - Posted: 28 Oct 2005, 7:40:02 UTC - in response to Message 1805.  

Leaving the applications in memory will reduce the amount of lost time as well...

Um, are you sure about this? My impression is that leaving the applications in memory eliminates the lost time completely.
ID: 1865 · Rating: 0

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 1867 - Posted: 28 Oct 2005, 7:51:55 UTC

Depending on version of the BOINC Client, benchmarks will remove the work from memory ...
ID: 1867 · Rating: 0

Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,177
RAC: 21
Message 1880 - Posted: 28 Oct 2005, 18:15:40 UTC - in response to Message 1865.  

Leaving the applications in memory will reduce the amount of lost time as well...

Um, are you sure about this? My impression is that leaving the applications in memory eliminates the lost time completely.


If you're talking about checkpoint time, re-crunching the same data - yes.

If you're talking about CPU time lost by switching - it can't be eliminated 100% completely - even if "in memory", it's probably been swapped out to VM and must be reloaded from disk, plus there is some time involved in switching processes, signalling one to stop, another to start... but it can be a significant reduction, "almost" completely, not even counting checkpoint issues. It eliminates the need to launch a new thread AND start from the last checkpoint. So it saves minutes, but there is still at least a few seconds of cost to any switch.

ID: 1880 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1890 - Posted: 28 Oct 2005, 20:00:51 UTC - in response to Message 1880.  

If you're talking about CPU time lost by switching - it can't be eliminated 100% completely...


Yes, I'm talking about CPU time lost by switching between applications, the title of this thread and what I thought was the context of your remark that I originally quoted. I guess I should have been clearer about that. I'm emphasizing that keeping suspended and preempted processes in memory virtually eliminates the cost of switching between applications. I consider your estimate of a few seconds to be negligible compared to the several minutes or more that could be lost if the process is not kept in memory, even if it's swapped out to the paging file due to a shortage of RAM.

That said, my system has 1024 MB RAM, none of its four projects are ever swapped out to the paging file, and BOINC doesn't even need to use up a few seconds to switch between applications.
ID: 1890 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1900 - Posted: 29 Oct 2005, 7:55:21 UTC - in response to Message 1774.  

I'm not sure I like the tone of my last post. All I want to do is address the original poster's concern about wasting a lot of crunch time every time BOINC switches applications.
If, for instance, I wish to switch between applications every 30 minutes, then I should be allowed to do so.
But not with Rosetta! Some work units are so large that they do not write to disc within the 30 minute share period, and so start again at the previous disc-write. This means the same calculations are done again and again without the percentage increasing, thus wasting much crunch time.
But I have already specified in my preferences that I wish all applications to write to disc every 10 seconds, so that a re-start can be accomplished at virtually any point.

1) If you set "Leave applications in memory while preempted?" to YES, then you won't lose any of the crunch time you've already done when BOINC switches applications.
2) Set write to disc back to every 60 seconds. If it really did checkpoint every 10 seconds, the application would be wasting too much time checkpointing.
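
As a rough illustration of point 2, the sketch below estimates what share of the time would go to checkpointing if an application really did write at the requested interval. The 2 second write cost is a hypothetical figure chosen for the example; real checkpoint costs vary by project and machine.

```python
def checkpoint_overhead(interval_s, write_cost_s):
    """Fraction of wall-clock time spent writing checkpoints.

    Assumes one checkpoint of write_cost_s seconds after every
    interval_s seconds of computing.
    """
    return write_cost_s / (interval_s + write_cost_s)


WRITE_COST_S = 2.0  # hypothetical cost of one checkpoint write

for interval_s in (10, 60):
    share = checkpoint_overhead(interval_s, WRITE_COST_S)
    print(f"every {interval_s:>2} s: {share:.1%} of the time goes to checkpointing")
```

Under that assumption, a 10 second interval would spend roughly 17% of the time on checkpoints, against about 3% for a 60 second interval.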

Hope this helps.
ID: 1900 · Rating: 0

Snake Doctor
Joined: 17 Sep 05
Posts: 182
Credit: 6,401,938
RAC: 0
Message 1917 - Posted: 29 Oct 2005, 15:10:24 UTC - in response to Message 1900.  

I'm not sure I like the tone of my last post. All I want to do is address the original poster's concern about wasting a lot of crunch time every time BOINC switches applications.
If, for instance, I wish to switch between applications every 30 minutes, then I should be allowed to do so.
But not with Rosetta! Some work units are so large that they do not write to disc within the 30 minute share period, and so start again at the previous disc-write. This means the same calculations are done again and again without the percentage increasing, thus wasting much crunch time.
But I have already specified in my preferences that I wish all applications to write to disc every 10 seconds, so that a re-start can be accomplished at virtually any point.

1) If you set "Leave applications in memory while preempted?" to YES, then you won't lose any of the crunch time you've already done when BOINC switches applications.
2) Set write to disc back to every 60 seconds. If it really did checkpoint every 10 seconds, the application would be wasting too much time checkpointing.

Hope this helps.


In my observation and research most of the information these guys are suggesting is correct, including the above. The thing to remember is that the BOINC client itself is a scheduler; it also keeps track of the stats (% complete, CPU time, time to completion, etc). Many settings that, in human terms, seem to adjust particular features of BOINC and the science applications really do not work the way we would intuitively think they should. For example, the "write to disk" setting in prefs is really there so a guy running a laptop can allow the drive to spin down. Most laptops do this to save energy. But this really has absolutely nothing to do with when the programmers have told a particular science app to checkpoint. It is only an instruction to the BOINC software telling it not to spin up the drive any more often than whatever the setting is.

Since some of you are running laptops you can test this yourself. If you set that time to a very short interval, the drive will stay on almost all the time while BOINC lays down the running stats for the apps. Set it to a long time and it will spin down and stay down unless something like an app switch or a non-BOINC app needs to use it. This can cause problems if you have very little real memory and rely a lot on virtual memory.

But keeping the applications in memory is not an issue for most people. While many think it will impact the system performance, in fact it does not, unless you are really memory bound, and these days most systems are not. While the application may page in and out of real memory from virtual memory, from a systems perspective it is all memory. If a paging event occurs at the system level, the write to disk BOINC setting will be overridden because paging is a system function outside the control of BOINC with a high level system flag.

At app switch time BOINC is SUPPOSED to be sensitive to the drive setting, and work accordingly. But this is in fact a maximum-frequency setting: in other words, "don't use my drive more often than every X min". It is not a minimum-frequency setting, which would mean "use it at least every X min".

The bottom line is that while simple in concept, the BOINC system and its operation are really somewhat complicated, and a lot of stuff is still being worked out. Application switching is one area where there are still a few insects to hunt down. It would help a lot if the application designers and the BOINC folks would revisit the "standards" and provide more organization to the application switching and checkpointing parts of the programs. Your point is well taken that as a user you should be able to restart your system at any desired time, and while you may lose some reasonable amount of application work on a WU, you should not have to lose more than a min or two. The current R@H application design can lose over an hour on some slower machines, and that needs some work. All is not lost; they are working on it.

Regards
Phil


We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
ID: 1917 · Rating: 1

Martin Johnson
Joined: 18 Oct 05
Posts: 19
Credit: 171,164
RAC: 0
Message 1925 - Posted: 29 Oct 2005, 20:43:47 UTC

OK, thanks again. I will try leaving it in memory.
ID: 1925 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1926 - Posted: 29 Oct 2005, 21:27:45 UTC - in response to Message 1925.  

OK, thanks again. I will try leaving it in memory.

Please let us know how well this works out for you.
ID: 1926 · Rating: 0

Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 1937 - Posted: 30 Oct 2005, 4:09:24 UTC

Snake Doctor,

Excellent explanation...

One of my observations is that there is more "switching" than "should" be taking place ... and especially frustrating is the work unit that is switched out minutes or seconds before it completes.

I was thinking tonight, I really ought to make a short study to see if the switching is far more common than it should be ... It is not so much the loss of time in switching that bothers me as the number of work units that are held in flight longer than is really needed...

I was also looking forward to the potentials of "intelligent" scheduling so that the system tried to "pair" up work so as to take the best advantage of available resources on the CPUs of the current and next generations where we see more things like HT being added to increase parallelism.

What I mean by this would be to pair, for example, work that is primarily integer in nature (say PrimeGrid) with work that is floating point so as to keep more execution units busy ... heck, with more insight into operational characteristics, it might even be rational for a project to use scaled numbers instead of floating point so as to take maximum advantage.
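
A minimal sketch of that pairing idea, in Python, might look like the following. This is purely hypothetical: BOINC does not expose an "integer" or "floating point" label for tasks as far as this thread says, and apart from PrimeGrid (named above as integer work) the task names and labels are invented for the example.

```python
from collections import deque


def pair_tasks(tasks):
    """Pair integer-heavy with floating-point-heavy tasks where possible.

    tasks: list of (name, kind) tuples, kind being "int" or "fp".
    Returns a list of (task_a, task_b) pairs; task_b is None if a task
    is left without a partner.
    """
    ints = deque(name for name, kind in tasks if kind == "int")
    fps = deque(name for name, kind in tasks if kind == "fp")

    pairs = []
    # First choice: one task of each kind per pair.
    while ints and fps:
        pairs.append((ints.popleft(), fps.popleft()))

    # Whatever remains is all of one kind; pair it up as-is.
    rest = list(ints) + list(fps)
    for i in range(0, len(rest), 2):
        partner = rest[i + 1] if i + 1 < len(rest) else None
        pairs.append((rest[i], partner))
    return pairs


queue = [("PrimeGrid", "int"), ("fp_task_a", "fp"),
         ("fp_task_b", "fp"), ("fp_task_c", "fp")]
print(pair_tasks(queue))
# [('PrimeGrid', 'fp_task_a'), ('fp_task_b', 'fp_task_c')]
```

Each pair would be the two tasks run side by side on a CPU with two logical processors; a real scheduler would also have to respect deadlines and resource shares, which this sketch ignores.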
ID: 1937 · Rating: 0

Aurora Borealis
Joined: 7 Oct 05
Posts: 15
Credit: 352,300
RAC: 0
Message 1939 - Posted: 30 Oct 2005, 4:43:55 UTC

I'm using W98SE. I haven't been running Rosetta because in the past, leaving the applications in memory caused problems such as multiple applications running simultaneously and producing very long crunch times.

Does anyone know if this problem still exists?

I am interested in this project, but I'm not willing to stop running other projects to do so.
Questions? Answers are in the BOINC Wiki.

Boinc V6.12.41
Win 7 i5 GPU Nvidia 470
ID: 1939 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1940 - Posted: 30 Oct 2005, 5:35:29 UTC - in response to Message 1939.  

I think setting the General Preference: Do work while computer is in use? to "no" might alleviate a "multiple applications running simultaneously" problem. I don't see why "leaving the applications in memory" would make any difference.
ID: 1940 · Rating: 0

Tern
Joined: 25 Oct 05
Posts: 576
Credit: 4,695,177
RAC: 21
Message 1948 - Posted: 30 Oct 2005, 20:29:51 UTC - in response to Message 1940.  

I think setting the General Preference: Do work while computer is in use? to "no" might alleviate a "multiple applications running simultaneously" problem. I don't see why "leaving the applications in memory" would make any difference.


I would think it would be the other way around... "do work" being "no" will at best send another "stop" signal to any process that's running and shouldn't be, when you move the mouse - but if the process ignored the first signal when another process was started, it might ignore this one as well. On the other hand, removing a process from memory is a different "signal", "stop and exit", which just might be "listened" to better. Both reduce efficiency tremendously, so it's better to avoid doing either if you can.

In general, the multiple apps problem was caused by a bug in the BOINC libraries. Any project application compiled since the bug was fixed SHOULD exit correctly. I personally haven't seen two running for a month or so now, about when I moved to BOINC 4.72, although that alone shouldn't have solved it. Before that, I was seeing it with SETI/Einstein, SETI/SZTAKI, and Predictor/SZTAKI - wasn't on Rosetta then. My guess is that the optimized SETI apps I installed were from newer source than the 'standard' app or the older optimized ones, and I know SZTAKI recompiled around then too.

ID: 1948 · Rating: 0

Alan Baxter
Joined: 7 Oct 05
Posts: 6
Credit: 7,749
RAC: 0
Message 1952 - Posted: 30 Oct 2005, 22:19:04 UTC - in response to Message 1948.  
Last modified: 30 Oct 2005, 22:20:41 UTC

In general, the multiple apps problem was caused by a bug in the BOINC libraries. Any project application compiled since the bug was fixed SHOULD exit correctly. I personally haven't seen two running for a month or so now, about when I moved to BOINC 4.72, although that alone shouldn't have solved it. Before that, I was seeing it with SETI/Einstein, SETI/SZTAKI, and Predictor/SZTAKI - wasn't on Rosetta then. My guess is that the optimized SETI apps I installed were from newer source than the 'standard' app or the older optimized ones, and I know SZTAKI recompiled around then too.

I did not know that. Thank you for the explanation. Then again, I'm running a dual-processor system, so seeing double is normal for me.
ID: 1952 · Rating: 0

Martin Johnson
Joined: 18 Oct 05
Posts: 19
Credit: 171,164
RAC: 0
Message 1965 - Posted: 31 Oct 2005, 1:55:12 UTC

I am reporting back that "Leaving in Memory" seems to have no detrimental effect on the running of the computer, and it does allow switching more often. I am now set back to my preference of every 20 minutes, and it's nice to see Rosetta carrying on where it left off, with no "re-crunching".
Why every 20 mins? Well, it keeps me interested in all the 5 projects on this machine, and stops the "Debts" building up to enormous proportions.
ID: 1965 · Rating: 0

Aurora Borealis
Joined: 7 Oct 05
Posts: 15
Credit: 352,300
RAC: 0
Message 1988 - Posted: 31 Oct 2005, 14:33:37 UTC
Last modified: 31 Oct 2005, 14:34:21 UTC

Ok, I tried the 'leave in memory' option last night, but the old problem still exists.
I currently have Seti and LHC ticking away at the same time.
I guess that with W98, leaving in memory still isn't an option that can be used.
I won't be crunching for Rosetta until someone at Rosetta or Boinc addresses this problem.


ID: 1988 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org