Message boards : Number crunching : Switching between applications
Author | Message |
---|---|
Martin Johnson Joined: 18 Oct 05 Posts: 19 Credit: 171,164 RAC: 0 |
I believed that the idea of using BOINC was to allow the sharing of resources. If, for instance, I wish to switch between applications every 30 minutes, then I should be allowed to do so. But not with Rosetta! Some work units are so large that they do not write to disc within the 30 minute share period, and so start again at the previous disc-write. This means the same calculations are done again and again without the percentage increasing, thus wasting much crunch time. But I have already specified in my preferences that I wish all applications to write to disc every 10 seconds, so that a re-start can be accomplished at virtually any point. All other applications seem to do this. May I ask if Rosetta will eventually allow this? |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Um, none of the applications will guarantee that. The checkpoint setting is not a command, as it seems; it is a, um, suggestion. The "save every x" value is used by the science application in a test, when IT is ready to checkpoint, to see whether it is allowed to (if I understand the code correctly). Rosetta@Home right now does not checkpoint as often as it "should", and the project is supposed to be looking into changing that. With that change, you would be able to checkpoint more often, but you are never going to get all science applications to checkpoint as often as you are suggesting. In some cases (CPDN comes to mind) the checkpoint may require such a large write of data that it would take less time to recalculate than to recover it ... |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,177 RAC: 21 |
The more often you switch, the more work you're going to lose, on all projects... I believe the SETI app checkpoints every three minutes (I could be wrong), so no matter what your MINIMUM checkpoint time is set to, you could still lose up to 3 minutes of work. Rosetta obviously is even longer. (Yes, too long IMHO, because it's based on 'where' it is and not 'when' it is, and 'where' moves slowly in some WUs; this may not be fixable!) The bottom line is that the preference allows you to set the minimum, to reduce writes to disk - it doesn't direct the application to write at the time you set, just to make sure that at least that much time has elapsed since its last write. Someone did a lot of research and for _their_ mix of projects, on their computer, they determined that the best switch time was an hour and a half. You CAN switch every 30 minutes if you like - you aren't being prevented from doing this - but that doesn't mean that you SHOULD, or that it's a good idea. Basically, the slower the computer, the longer you should give between switches; but even on faster computers, the one-hour "default" is probably a good amount to give each project. Leaving the applications in memory will reduce the amount of lost time as well... |
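A minimal sketch of the "minimum interval" behaviour described in the posts above, assuming the application only asks whether it may checkpoint when it reaches a safe point of its own choosing. The class and function names here are invented for illustration and are not the BOINC API; the 180-second figure is Tern's guess for SETI, and the other timings are made-up examples.

```python
class CheckpointGate:
    """Toy model of a 'write to disk at most every N seconds' preference."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_checkpoint_s = 0

    def time_to_checkpoint(self, now_s):
        # True only if at least min_interval_s has elapsed since the last write.
        return now_s - self.last_checkpoint_s >= self.min_interval_s

    def checkpoint_completed(self, now_s):
        self.last_checkpoint_s = now_s


def checkpoints_written(safe_points_s, min_interval_s):
    """Return the times (in seconds) at which a checkpoint is actually written.

    safe_points_s: the moments at which the application is *able* to checkpoint.
    The preference never creates new safe points; it can only veto some of them.
    """
    gate = CheckpointGate(min_interval_s)
    written = []
    for now_s in safe_points_s:
        if gate.time_to_checkpoint(now_s):
            written.append(now_s)
            gate.checkpoint_completed(now_s)
    return written


# Safe point every 180 s, preference 10 s: still only one write per 180 s.
seti_like = checkpoints_written(range(180, 1801, 180), 10)

# Safe point every 40 minutes, preference 10 s: the 10 s setting changes nothing.
slow_app = checkpoints_written([2400, 4800], 10)

# Safe point every 5 s, preference 60 s: the preference throttles the writes.
fast_app = checkpoints_written(range(5, 301, 5), 60)
```

In this model a 10-second preference cannot make a slow application checkpoint sooner; it only matters for an application whose safe points come more often than the preference allows.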
Martin Johnson Joined: 18 Oct 05 Posts: 19 Credit: 171,164 RAC: 0 |
Thank you both for this information. The problem is that there are no readily available guidelines. And Boinc seems to be in its infancy. For instance, on my faster computer, I have 5 projects running, with roughly equal share percentage. Rosetta units have bottomed out (on Boinc 5.2.2) at a projected duration time of about 30 minutes. Because I cannot ensure that I can connect every day, I set the connect time to every 2 days. So what does Rosetta download today? 51 units! This is a bit silly. I really only wanted 21! Some more tweaking still needs to be done. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,177 RAC: 21 |
"Rosetta units have bottomed out (on Boinc 5.2.2) at a projected duration time of about 30 minutes. Because I cannot ensure that I can connect every day, I set the connect time to every 2 days. So what does Rosetta download today? 51 units! This is a bit silly. I really only wanted 21!" At 30 min/WU, that's 48 per day, so 96 would be two days' worth. Realize that the program has no way of knowing what "will be" available from other projects when it's looking for work, or what the deadlines are; only what is "on hand" at the moment. If you have _nothing_ in the queue on your computer, then the first project it connects to will be asked for two days' work. The second will be asked for about one day's work. (I think? Not sure on this.) Then the third through fifth will be asked for (two days/5), call it 10 hours each. The program has to assume that when you connect every two days, you really will be "unplugged" in between, and it has to make sure it has plenty of work to last two days, even if only one project has work available at the moment it connects. If you already had one day's work waiting to run, then when you connected to Rosetta, it wanted one more day's worth, but you were expecting 10 hours' worth... As with most things BOINC, the design is for everything to work out in the LONG run, not necessarily at every single connection, app-switch, etc. In this case, on your next connection, it will ask for less Rosetta - probably - but possibly more of something else. Oh - the Wiki is a good source for detailed information on this and almost any other BOINC topic! |
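The arithmetic in the post above can be checked with a few lines. This is only a worked example of the numbers quoted in the thread (30-minute work units, a 2-day connect interval, 5 equally weighted projects); the helper name is hypothetical, and the request sizes follow Tern's description rather than any confirmed BOINC work-fetch rule.

```python
def units_in(buffer_minutes, minutes_per_wu):
    """Whole work units that fit in a buffer of the given length."""
    return int(buffer_minutes // minutes_per_wu)


MINUTES_PER_DAY = 24 * 60
WU_MINUTES = 30   # projected duration reported in the thread
PROJECTS = 5      # projects attached, roughly equal shares
BUFFER_DAYS = 2   # "connect every 2 days"

per_day = units_in(MINUTES_PER_DAY, WU_MINUTES)                 # 48
two_days = units_in(BUFFER_DAYS * MINUTES_PER_DAY, WU_MINUTES)  # 96

# An equal share of the two-day buffer, which is roughly what the user expected.
share_minutes = BUFFER_DAYS * MINUTES_PER_DAY // PROJECTS       # 576 minutes
share_hours = share_minutes / 60                                # 9.6 hours
share_units = units_in(share_minutes, WU_MINUTES)               # 19

# "Call it 10 hours each", as rounded in the post.
ten_hour_units = units_in(10 * 60, WU_MINUTES)                  # 20
```

On these figures, one extra day of Rosetta is 48 units and a 10-hour share is about 20, which is close to the 51 received and the 21 expected.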
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
"Leaving the applications in memory will reduce the amount of lost time as well..." Um, are you sure about this? My impression is that leaving the applications in memory eliminates the lost time completely. |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Depending on version of the BOINC Client, benchmarks will remove the work from memory ... |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,177 RAC: 21 |
"Leaving the applications in memory will reduce the amount of lost time as well..." If you're talking about checkpoint time, re-crunching the same data - yes. If you're talking about CPU time lost by switching - it can't be eliminated 100% - even if "in memory", it's probably been swapped out to VM and must be reloaded from disk, plus there is some time involved in switching processes, signalling one to stop, another to start... but it can be a significant reduction, "almost" complete, not even counting checkpoint issues. It eliminates the need to launch a new thread AND start from the last checkpoint. So it saves minutes, but there is still at least a few seconds of cost to any switch. |
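The re-crunching cost discussed above can be sketched with a toy model. It assumes checkpoints fall at fixed multiples of progress (real applications checkpoint by "where" they are, not by the clock), that an application removed from memory restarts from its last checkpoint, and it ignores the few seconds of switching overhead mentioned in the post. The function name and the 45-minute checkpoint spacing are made up for illustration.

```python
def progress_kept(total_cpu_min, switch_min, checkpoint_min, leave_in_memory):
    """Minutes of useful progress retained after total_cpu_min of CPU time.

    The application is preempted after every switch_min minutes of CPU time.
    If it is not left in memory, each preemption rolls its progress back to
    the most recent multiple of checkpoint_min.
    """
    progress = 0
    remaining = total_cpu_min
    while remaining > 0:
        slice_min = min(switch_min, remaining)
        progress += slice_min
        remaining -= slice_min
        if not leave_in_memory:
            # Everything since the last checkpoint is lost on restart.
            progress = (progress // checkpoint_min) * checkpoint_min
    return progress


# Checkpoints 45 minutes apart, switching every 30 minutes, removed from memory:
# the application never reaches a checkpoint, so nothing is ever kept.
stuck = progress_kept(120, 30, 45, leave_in_memory=False)

# Same work unit, but left in memory while preempted: nothing is re-crunched.
in_memory = progress_kept(120, 30, 45, leave_in_memory=True)

# Switching every 60 minutes instead: some work is lost, but progress is made.
hourly = progress_kept(120, 60, 45, leave_in_memory=False)

# An application that checkpoints every 3 minutes loses nothing in this model.
frequent = progress_kept(120, 30, 3, leave_in_memory=False)
```

The first case reproduces the original complaint: when the gap between checkpoints is longer than the switch interval, the percentage never increases unless the application stays in memory.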
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
"If you're talking about CPU time lost by switching - it can't be eliminated 100% completely..." Yes, I'm talking about CPU time lost by switching between applications, the title of this thread and what I thought was the context of your remark that I originally quoted. I guess I should have been clearer about that. I'm emphasizing that keeping suspended and preempted processes in memory virtually eliminates the cost of switching between applications. I consider your estimate of a few seconds to be negligible compared to the several minutes or more that could be lost if the process is not kept in memory, even if it's swapped out to the paging file due to a shortage of RAM. That said, my system has 1024 MB RAM, none of its four projects are ever swapped out to the paging file, and BOINC doesn't even need to use up a few seconds to switch between applications. |
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
I'm not sure I like the tone of my last post. All I want to do is address the original poster's concern about wasting a lot of crunch time every time BOINC switches applications. "If, for instance, I wish to switch between applications every 30 minutes, then I should be allowed to do so." 1) If you set "Leave applications in memory while preempted?" to YES, then you won't lose any of the crunch time you've already done when BOINC switches applications. 2) Set write to disc back to every 60 seconds. If it really did checkpoint every 10 seconds, the application would be wasting too much time checkpointing. Hope this helps. |
Snake Doctor Joined: 17 Sep 05 Posts: 182 Credit: 6,401,938 RAC: 0 |
"I'm not sure I like the tone of my last post. All I want to do is address the original poster's concern about wasting a lot of crunch time every time BOINC switches applications." In my observation and research, most of the information these guys are suggesting is correct, including the above. The thing to remember is that the BOINC client itself is a scheduler; it also provides the function of keeping track of the stats (% complete, CPU time, time to completion, etc.). Many things that in human terms seem to be adjustments to particular features of the operation of BOINC and the science applications really do not work the way we would intuitively think they should. For example, the "Write to disk" setting in prefs is really there so a guy running a laptop can allow the drive to spin down. Most laptops do this to save energy. But in human terms this really has absolutely nothing to do with when the programmers have told a particular science app to checkpoint. It is only an instruction to the BOINC software telling it not to spin up the drive any more often than whatever the setting is. Since some of you are running laptops, you can test this yourself. If you set that time to a very short interval, the drive will stay on almost all the time while BOINC lays down the running stats for the apps. Set it to a long time and it will spin down and stay down unless something like an app switch or a non-BOINC app needs to use it. This can cause problems if you have very little real memory and rely a lot on virtual memory. But keeping the applications in memory is not an issue for most people. While many think it will impact the system performance, in fact it does not, unless you are really memory bound, and these days most systems are not. While the application may page in and out of real memory from virtual memory, from a systems perspective it is all memory.
If a paging event occurs at the system level, the write-to-disk BOINC setting will be overridden, because paging is a system function outside the control of BOINC with a high-level system flag. At app switch time BOINC is SUPPOSED to be sensitive to the drive setting, and work accordingly. But this is in fact a maximum frequency setting; in other words, "don't use my drive more often than every X min". It is not a minimum setting; in other words, it is not "use it at least every X min". The bottom line is that while simple in concept, the BOINC system and its operation are really somewhat complicated, and a lot of stuff is still being worked out. The application switching is one area where there are still a few insects to hunt down. It would help a lot if the application designers and the BOINC folks would revisit the "standards" and provide more organization to the application-switching and checkpointing parts of the programs. Your point is well taken that as a user you should be able to restart your system at any desired time, and while you may lose some reasonable amount of application work on a WU, you should not have to lose more than a minute or two. The current R@H application design can lose over an hour on some slower machines, and that needs some work. All is not lost; they are working on it. Regards, Phil. We must look for intelligent life on other planets as it is becoming increasingly apparent we will not find any on our own. |
Martin Johnson Joined: 18 Oct 05 Posts: 19 Credit: 171,164 RAC: 0 |
OK, thanks again. I will try leaving it in memory. |
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
"OK, thanks again. I will try leaving it in memory." Please let us know how well this works out for you. |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Snake Doctor, excellent explanation... One of my observations is that there is more "switching" than "should" be taking place ... and especially frustrating is the work unit that is switched out minutes or seconds before it completes. I was thinking tonight that I really ought to make a short study to see if the switching is far more common than it should be ... It is not so much the loss of time in switching that bothers me; it is the number of work units that are held in flight longer than is really needed... I was also looking forward to the potential of "intelligent" scheduling, so that the system tried to "pair" up work to take the best advantage of available resources on the CPUs of the current and next generations, where we see more things like HT being added to increase parallelism. What I mean by this would be to pair, for example, work that is primarily integer in nature (say PrimeGrid) with work that is floating point, so as to keep more execution units busy ... heck, with more insight into operational characteristics, it might even be rational for a project to use scaled numbers instead of floating point so as to take maximum advantage. |
Aurora Borealis Joined: 7 Oct 05 Posts: 15 Credit: 352,300 RAC: 0 |
I'm using W98SE. I haven't been running Rosetta because in the past, leaving the applications in memory caused problems such as multiple applications running simultaneously and producing very long crunch times. Does anyone know if this problem still exists? I am interested in this project, but I'm not willing to stop running other projects to do so. Questions? Answers are in the BOINC Wiki. Boinc V6.12.41 Win 7 i5 GPU Nvidia 470 |
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
I think setting the General Preference: Do work while computer is in use? to "no" might alleviate a "multiple applications running simultaneously" problem. I don't see why "leaving the applications in memory" would make any difference. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,177 RAC: 21 |
"I think setting the General Preference 'Do work while computer is in use?' to 'no' might alleviate a 'multiple applications running simultaneously' problem. I don't see why 'leaving the applications in memory' would make any difference." I would think it would be the other way around... "do work" being "no" will at best send another "stop" signal to any process that's running and shouldn't be, when you move the mouse - but if the process ignored the first signal when another process was started, it might ignore this one as well. On the other hand, removing a process from memory is a different "signal", "stop and exit", which just might be "listened" to better. Both reduce efficiency tremendously, so it's better to avoid doing either if you can. In general, the multiple apps problem was caused by a bug in the BOINC libraries. Any project application compiled since the bug was fixed SHOULD exit correctly. I personally haven't seen two running for a month or so now, about when I moved to BOINC 4.72, although that alone shouldn't have solved it. Before that, I was seeing it with SETI/Einstein, SETI/SZTAKI, and Predictor/SZTAKI - wasn't on Rosetta then. My guess is that the optimized SETI apps I installed were from newer source than the 'standard' app or the older optimized ones, and I know SZTAKI recompiled around then too. |
Alan Baxter Joined: 7 Oct 05 Posts: 6 Credit: 7,749 RAC: 0 |
"In general, the multiple apps problem was caused by a bug in the BOINC libraries. Any project application compiled since the bug was fixed SHOULD exit correctly. I personally haven't seen two running for a month or so now, about when I moved to BOINC 4.72, although that alone shouldn't have solved it. Before that, I was seeing it with SETI/Einstein, SETI/SZTAKI, and Predictor/SZTAKI - wasn't on Rosetta then. My guess is that the optimized SETI apps I installed were from newer source than the 'standard' app or the older optimized ones, and I know SZTAKI recompiled around then too." I did not know that. Thank you for the explanation. Then again, I'm running a dual-processor system, so seeing double is normal for me. |
Martin Johnson Joined: 18 Oct 05 Posts: 19 Credit: 171,164 RAC: 0 |
I am reporting back that "Leaving in Memory" seems to have no detrimental effect on the running of the computer, and it does allow switching more often. I am now set back to my preference of every 20 minutes, and it's nice to see Rosetta carrying on where it left off, with no "re-crunching". Why every 20 mins? Well, it keeps me interested in all 5 projects on this machine, and stops the "Debts" building up to enormous proportions. |
Aurora Borealis Joined: 7 Oct 05 Posts: 15 Credit: 352,300 RAC: 0 |
OK, I tried the 'leave in memory' option last night, but the old problem still exists. I currently have SETI and LHC ticking away at the same time. I guess that with W98, leaving applications in memory still isn't an option that can be used. I won't be crunching for Rosetta until someone at Rosetta or BOINC addresses this problem. Questions? Answers are in the BOINC Wiki. Boinc V6.12.41 Win 7 i5 GPU Nvidia 470 |
©2024 University of Washington
https://www.bakerlab.org