Message boards : Number crunching : Lost 2h of work
Author | Message |
---|---|
eL_nino Send message Joined: 20 Jan 06 Posts: 10 Credit: 45,343 RAC: 0 |
OK, WTF is this? I have had Target CPU run time set to 16h for the last few days. My WU had reached 4h of crunching when I restarted my computer (because of some new software), and when Windows came back up and I started BOINC, this WU was back at 2h of progress! WTF is that?! And it is not the first time something like that has happened! Now I will set my target WU time to 1h so this shi* won't happen again; in the last 5-6 days I have lost around 10h of work because of this. It really sucks when you do the same work twice... :( |
Maxxou59 Send message Joined: 5 May 06 Posts: 10 Credit: 84,743 RAC: 0 |
Because the Rosetta software saves a backup between steps, if a step is not finished when you restart, that step will be recalculated from its start ... Maxxou59-Lille-France Student at University of Chemistry |
tralala Send message Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
Rosetta does checkpoint often. On my machine usually between 5 and 20 minutes but that might take longer on slow computers and on certain WUs. If you restart your machine, some work is inevitably lost. Your computers are hidden, so we can't look at them to see what the problem is. One idea is to time your restarts for when a WU has finished, but that might be inconvenient. |
ronald Send message Joined: 6 Jun 06 Posts: 1 Credit: 448 RAC: 0 |
Question: I'm running another program called United Devices. Before I turn my computer off I always close it, since I have lost time with that program too, and since I started closing it I have not lost any computer time with it. I'm new to your program. Since I turn my system off at night, will closing down your program first save it from messing up? Just wondering. Thank you, Ronald Schwarz |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
In short, it's not messing up. It's just at a point in the exploration of the protein that can't easily be preserved. So, when you restart your computer, you pick up at the last checkpoint where it was possible to save your place. Think of it this way: you drop breadcrumbs as you explore a forest, and then you turn off your computer. When you enter the forest again tomorrow, you follow all the breadcrumbs and end up at the last one... you've lost everything after that and have to explore it again. Bottom line, more work gets done if it is not interrupted by powering off your computer. But, reality being what it is (I've been powering mine off during the day as we are now into air conditioning season), it is able to run just fine as long as you have enough runtime to drop a breadcrumb before you turn off your machine. Add this signature to your EMail: Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might! https://boinc.bakerlab.org/rosetta/ |
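The breadcrumb picture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the checkpoint times are made-up numbers, and `resume_point` and `work_lost` are hypothetical helper names, not anything taken from Rosetta or BOINC.

```python
from bisect import bisect_right

def resume_point(checkpoints, stopped_at):
    """Latest checkpoint at or before the stop time, in minutes.

    `checkpoints` is a sorted list of minutes into the WU at which a
    checkpoint (breadcrumb) was written. Returns 0 if none was written yet.
    """
    i = bisect_right(checkpoints, stopped_at)
    return checkpoints[i - 1] if i else 0

def work_lost(checkpoints, stopped_at):
    """Minutes of crunching that must be redone after a restart."""
    return stopped_at - resume_point(checkpoints, stopped_at)

# Hypothetical WU: frequent checkpoints early on, then none after 120 minutes.
checkpoints = [15, 35, 50, 120]
print(work_lost(checkpoints, 40))   # stopped shortly after a breadcrumb
print(work_lost(checkpoints, 240))  # stopped long after the last breadcrumb
```

With these invented numbers, stopping at 40 minutes costs 5 minutes, while stopping at 240 minutes costs 120, which is the kind of 4h-back-to-2h jump described at the top of the thread.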
NJMHoffmann Send message Joined: 17 Dec 05 Posts: 45 Credit: 45,891 RAC: 0 |
Rosetta does checkpoint often. On my machine usually between 5 and 20 minutes but that might take longer on slow computers and on certain WUs. E.g. some of the t296__CASP_ABINITIO_SAVE_ALL_OUT workunits here have their first checkpoint after about 90-120 min. (2200+ AMD, 2400 INTEL) Norbert |
eL_nino Send message Joined: 20 Jan 06 Posts: 10 Credit: 45,343 RAC: 0 |
I have now set my "target WU time" to 1h... So no problem now. I hate to lose 1-2h of work every time I turn off or restart my computer; it is not very nice... |
AMD_is_logical Send message Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
I have now set my "target WU time" to 1h... So no problem now. I hate to lose 1-2h of work every time I turn off or restart my computer; it is not very nice... Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. In fact, it will increase the average amount of work lost. Here's why. Assume that half the WUs crunch models very fast, and half take 4 hours per model. If the target WU time is long, then the computer will be spending half its time on each type. But with a 1 hour target time, the fast WUs will take 1 hour and the slow ones will take 4 hours (as they must complete at least 1 model no matter how short the target time). Thus, the computer will now be spending 1/5 of its time on fast, frequently checkpointing WUs and 4/5 of its time on slow WUs. |
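The arithmetic in the post above can be checked with a small sketch. It assumes a simplified stopping rule (a WU runs whole models until the target is reached, and always at least one model), an even mix of the two WU types, and a made-up 5 minutes per model for the fast type; the function names are hypothetical and this is not how Rosetta itself schedules work.

```python
def wu_runtime(target_min, model_min):
    """Minutes a WU runs: whole models until the target is met, at least one."""
    models = max(1, -(-target_min // model_min))  # integer ceiling division
    return models * model_min

def slow_time_fraction(target_min, fast_model_min=5, slow_model_min=240):
    """Fraction of total crunching time spent on the slow WU type,
    assuming fast and slow WUs arrive in equal numbers."""
    fast = wu_runtime(target_min, fast_model_min)
    slow = wu_runtime(target_min, slow_model_min)
    return slow / (fast + slow)

print(slow_time_fraction(16 * 60))  # long target: time splits evenly
print(slow_time_fraction(60))       # 1h target: slow WUs dominate
```

Under these assumptions the 16h target gives a fraction of 0.5 and the 1h target gives 0.8, matching the 1/5 versus 4/5 split in the post.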
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. AMD is correct. Your runtime preference doesn't affect when checkpoints can occur. Sorry, it just doesn't work that way. The FAQ on WU runtime preference explains that you must crunch a complete model, regardless of whether or not the time to do that exceeds your 1hr preference. |
eL_nino Send message Joined: 20 Jan 06 Posts: 10 Credit: 45,343 RAC: 0 |
Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. YES, I know all that, but when the WU is on 1h (so far I have crunched around 20 WUs like that and they all took from 50 to 70 minutes), even when I turn my computer off or restart, the maximum I can lose is 10-20 minutes and not 2-3h like happened a few times. |
tralala Send message Joined: 8 Apr 06 Posts: 376 Credit: 581,806 RAC: 0 |
Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. edit: I actually had nothing to say, sorry. |
Ricardo Send message Joined: 9 Dec 05 Posts: 26 Credit: 24,039 RAC: 0 |
Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. Hi, I am not sure, but couldn't this be solved by leaving the application in memory (swap file) while preempted? Regards, Ricardo (Ex Seti cruncher) |
Ricardo Send message Joined: 9 Dec 05 Posts: 26 Credit: 24,039 RAC: 0 |
Reducing the target WU time will not reduce the average amount of work lost when the computer is turned off. Forget my earlier comment, because the swap file is deleted when the computer is turned off. |
Feet1st Send message Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0 |
...when I turn my computer off or restart, the maximum I can lose is 10-20 minutes and not 2-3h like happened a few times. Yes, R@H has recently added additional checkpointing... and just in time for these large WUs they are getting from CASP. The objective is that when R@H is ended (either due to turning off the PC, or removing the app from memory when you switch to crunch another project) you would lose, on average, only 10 or 20 minutes. If you would, keep some notes. If you find another case where you lose 2 hours, please note the following: Rosetta application release (shown in the work tab), WU name, % complete shown at the time you ended, and which step you were on when it ended. Then report those details in the appropriate thread about problems with a given release of Rosetta. Losing 2 hours of work should be considered a problem. They may not have an immediate solution, but if a specific WU has such a problem, perhaps they can learn more about why it isn't checkpointing more and later find ways to do more checkpointing on such WUs. |
©2024 University of Washington
https://www.bakerlab.org