Message boards : Number crunching : If You Don't Know Where to Put it, Post it here.
Author | Message |
---|---|
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Greetings, Here's a weird issue that I haven't seen before in my 16+ years of using BOINC: I just noticed that when I restarted BOINC, after logging back in, all the tasks I was previously working on ran back to 0 (zero), starting over. Is this a Rosetta thing? Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1667 Credit: 17,446,841 RAC: 24,735 |
|
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Losing WU progress Hi Grant, Thanks! I didn't see that thread. I did change my memory and disk usage per your post over on SETI. I have 32GB RAM and 1TB M.2 NVMe SSD. I'll see if this fixes it. :) It just seemed weird when I started BOINC and all 6 tasks started over at zero. Thanks again and have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Millenium Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0 |
Yup, had the same problem with a bunch of WUs that eventually took over 12 hours to complete, with an 8-hour target time. Simply put, a single decoy of these WUs took 12 hours, so the task could not stop before then, nor checkpoint. Very rare anyway. |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Greetings, I'm done with Rosetta! I have it set to NNT and if and when I get done with the current tasks, I will be detaching from Rosetta. I'm tired of having these 7 hr tasks get to 5+ hours just to rewind to zero % done when I log out and log back in and then have to make up for the 5+ hours already done that's lost. I have a dual boot system and I log into Windows 10 once or twice a day to play World of Warcraft. If Rosetta cannot set/or respect a checkpoint so that I can begin where I left off, then this project is NOT for me. That, in my opinion, is disrespectful to the user and the device doing the work. And to think I really, really wanted to do this to help with COVID-19... :( Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
yoerik Joined: 24 Mar 20 Posts: 128 Credit: 169,525 RAC: 0 |
"I'm tired of having these 7 hr tasks get to 5+ hours just to rewind to zero % done when I log out and log back in and then have to make up for the 5+ hours already done that's lost. I have a dual boot system and I log into Windows 10 once or twice a day to play World of Warcraft. If Rosetta cannot set/or respect a checkpoint so that I can begin where I left off, then this project is NOT for me. That, in my opinion, is disrespectful to the user and the device doing the work." That sounds like a glitch. What is your "request tasks to checkpoint at most every" set to, in computing preferences in BOINC Manager? |
dcdc Joined: 3 Nov 05 Posts: 1831 Credit: 119,445,013 RAC: 10,974 |
If you can hibernate instead of shutting down then it will restart from where it left off. If the models are large then it might not be able to checkpoint very often. |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Hi Yoerik, "That sounds like a glitch. What is your 'request tasks to checkpoint at most every' set to, in computing preferences in BOINC Manager?" It is set to 600 seconds (10 minutes). There was a reason, which I cannot remember offhand now, for setting it to 600 when all I was doing was SETI. I don't remember ever seeing SETI tasks running at high priority or losing the checkpoints. I don't think they (the Rosetta crew) set a long enough deadline. I'm also back to running high priority again. I have 16 tasks due tomorrow and 8 due the day after. If I were to be running more than one BOINC project, no other project would get a chance to get tasks done. Anyway, it's all a moot point right now since I've got it set to NNT and will detach if and when the tasks get done. Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
"If you can hibernate instead of shutting down then it will restart from where it left off." It's a dual boot system; hibernation does not work when booting into another OS. At least not that I know of. Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1667 Credit: 17,446,841 RAC: 24,735 |
"It is set to 600 seconds (10 minutes). There was a reason, which I cannot remember offhand now, for setting it to 600 when all I was doing was SETI. I don't remember ever seeing SETI tasks running at high priority or losing the checkpoints." Set the checkpoint to every 60 seconds. It doesn't necessarily checkpoint at that time, it just asks the Application to checkpoint if it is able to. "I don't think they (the Rosetta crew) set a long enough deadline. I'm also back to running high priority again. I have 16 tasks due tomorrow and 8 due the day after. If I were to be running more than one BOINC project, no other project would get a chance to get tasks done." 4 days is plenty. Of course the more projects you run, the smaller your cache should be. The fact you have just joined up to another project means it will have to sort out just how long the Tasks run for (yes, they are set for 8 hours by default, but the BOINC Manager and the Rosetta servers need to work things out so the Estimated times match reality). The more projects you have, and the larger your cache, the longer it will take for the Manager to sort things out. Tasks running High priority isn't an error, it isn't a problem (normally). It's just the Manager trying to honour your cache & resource share settings. "Anyway, it's all a moot point right now since I've got it set to NNT and will detach if and when the tasks get done." Or you can just leave and not help. *shrug* Grant Darwin NT |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
"It is set to 600 seconds (10 minutes). There was a reason, which I cannot remember offhand now, for setting it to 600 when all I was doing was SETI. I don't remember ever seeing SETI tasks running at high priority or losing the checkpoints." "Set the checkpoint to every 60 seconds." Hi Grant, I set the checkpoint setting to 60 seconds. Rosetta is the only project I am running. I still have SETI set for any GPU resends, but have not gotten anything since about the 31st. Actually I ran Rosetta way back in late 2006. I don't remember why I quit Rosetta back whenever I did. I don't know, perhaps to do SETI solo? I don't think I have had more than 30 tasks this time around at any given time since I restarted doing Rosetta. 4 days? The deadlines for SETI were in the weeks and the tasks took MUCH less time to do than here, even the CPU tasks I was doing in about 40 minutes give or take. Wasn't it 11 of each type of app, at SETI, that was needed to figure out the estimated run time? I have done WAY more than that on some of these here and it still takes 7 to 15 hours to do them. Rosetta-mini is my lowest, I've done 1 of those. The rest are 10, 15, 28, 39 (not in that order). How many do I have to do before the servers and BOINC figure out this PC can do them faster? Last night before bed, I saw that my tasks had been running about 8 hours each with 3 hours left. This morning new tasks were being processed so I logged into Windows 10 and when done there logged back into here and 4 of the 6 tasks restarted at zero % and 2 continued from where they left off. This is ridiculous. It's like I'm running on a 486 instead of an i7 8th gen. This PC is no slouch. It's a Gen 8 8086K running at 4 GHz with 32GB RAM. I was doing SETI GPU tasks in seconds to a minute and as I mentioned CPU tasks in about 40 minutes give or take. It's NOT that I don't want to help with COVID-19, it's the fact that I'm tired of redoing and redoing and redoing work that takes too long to do on this PC when it should be doing the work in much less time and NOT starting over because BOINC "forgets" the checkpoints. If I see a major difference between now and the last tasks listed, perhaps I will stay with Rosetta a while longer. If, however, checkpoints are still not honored and it still takes 8 to 15 hours to do a task, I'm gone. I'm sorry, but my PC's time is worth a LOT more than this. And as an aside, I really hate ALL white websites and fora... ;) Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Grant (SSSF) Joined: 28 Mar 20 Posts: 1667 Credit: 17,446,841 RAC: 24,735 |
"I have done WAY more than that on some of these here and it still takes 7 to 15 hours to do them. Rosetta-mini is my lowest, I've done 1 of those. The rest are 10, 15, 28, 39 (not in that order). How many do I have to do before the servers and BOINC figure out this PC can do them faster?" Still around 10 non-error Tasks; however, there have been new applications in the last few days, so everything starts from scratch for them. Likewise, different Tasks may take more or less processing, so it takes time for things to adjust to those as well. The fact is Tasks run for a set time, the default being 8 hours. It takes a while for the Estimated times to match up with the actual times. "This PC is no slouch. It's a Gen 8 8086K running at 4 GHz with 32GB RAM. I was doing SETI GPU tasks in seconds to a minute and as I mentioned CPU tasks in about 40 minutes give or take." Once again, Tasks run for a set time. Some may bail out early. Some may run longer (but there is a 4 hour cutoff). There are some where the default time was set longer than usual, but most of those have gone now. But the fact is Tasks run for a set time. "It's NOT that I don't want to help with COVID-19, it's the fact that I'm tired of redoing and redoing and redoing work that takes too long to do on this PC when it should be doing the work in much less time and NOT starting over because BOINC 'forgets' the checkpoints." I have no idea what settings you have on your system to make checkpoints not work. I've set it for the default of 60 seconds, and the most time I have lost on a restart has been 5 minutes. Usually it's only a couple of minutes. I've got 6c/12t all in use, 32GB RAM, with no issues. These are my settings. Computing preferences, Usage limits: Use at most 100 % of the CPUs; Use at most 100 % of CPU time. When to suspend: Basically, never. Other: Store at least 1 days of work; Store up to an additional 0.02 days of work; Switch between tasks every 60 minutes; Request tasks to checkpoint at most every 60 seconds. Disk: Use no more than 20 GB; Leave at least 2 GB free; Use no more than 60% of total. Memory: When computer is in use, use at most 95 %; When computer is not in use, use at most 95 %; Leave non-GPU tasks in memory while suspended (not selected); Page/swap file: use at most 75 %. Rosetta@home preferences: Percentage of CPU time used for graphics: not selected; Number of frames per second for graphics: not selected; Target CPU run time: not selected. Grant Darwin NT |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Hi Grant, With the exception of 3 settings, all were basically the same as yours. The ones I changed were: Use at most 60 % of CPUs, changed to 100 %; Store up to an additional 0.05 days of work, changed to 0.02. My disk usage is set to 100GB since I have a 1TB SSD. The graphics settings are now set to "Not selected" even though it says that "Not selected" will default to 10. So, basically there should be no problem with checkpoints and running high priority since my settings were basically identical to yours. ;) We'll see what happens in a few hours when I log into Windows 10 to do some World of Warcraft stuff for about an hour and a half. ;) Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
Greetings, Ok. I just logged back in and 11 of my 12 tasks started over at ZERO! A few were getting somewhat close to finishing when I shut down BOINC and logged into Windows 10. For those here, including Grant, that think I might be blowing smoke about this, I have a little test for you to perform and I bet that what happens when I log back in will happen to many of you. It's really simple. Take note of where your tasks are at in elapsed and remaining time. It doesn't need to be precise, just a mental note. Shut down BOINC including the app(s), wait a few seconds or so, then restart BOINC. I'll bet $10 that some, if not all, of your tasks will restart at zero. @Grant: My settings are damn near identical to yours. I really would like to continue with Rosetta, but if something isn't done about the checkpoints... forget it. I can live with the tasks running in high priority, I'm just tired of the wasted work because no checkpoints are being set. Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Are all of the machines discussed in this thread running the i686 Linux application? It seems to presently have an issue where it runs long enough that the watchdog ends it, and doesn't complete the first model. Rosetta Moderator: Mod.Sense |
JohnDK Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
So, is anybody getting new work in the last few hours? |
Millenium Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0 |
It is dry, we crunched everything, time to wait for new work. |
JohnDK Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
Server status says 13355 tasks ready to send, if you can count on that. |
Siran d'Vel'nahr Joined: 15 Nov 06 Posts: 72 Credit: 2,674,678 RAC: 0 |
"Are all of the machines discussed in this thread running the i686 Linux application? It seems to presently have an issue where it runs long enough that the watchdog ends it, and doesn't complete the first model." Hi Mod, Ok, now we seem to be getting somewhere. I checked the properties on a couple of the tasks I have running and they are running on the i686 app. Is this why no checkpoints are set and the tasks start over from zero % when restarted? In case it helps, this is my current system. Have a great day! :) Siran CAPT Siran d'Vel'nahr XO USS Vre'kasht NCC-33187 "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
JohnDK Joined: 6 Apr 20 Posts: 33 Credit: 2,390,240 RAC: 0 |
Got work now :) |
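
The checkpoint behaviour described in this thread can be illustrated with a small Python sketch. It is a simplified model and not BOINC's or Rosetta's actual code: it assumes, as the posts above suggest, that a task can only save progress when a decoy (model) finishes, and that the "request tasks to checkpoint at most every" preference only limits how often such a save is allowed. The function name `last_checkpoint_min` and all of its parameters are hypothetical.

```python
def last_checkpoint_min(elapsed_min: int, decoy_min: int, interval_min: int) -> int:
    """Return the minutes of work that survive a restart.

    Simplifying assumptions (not taken from BOINC or Rosetta source code):
    - a checkpoint can only be written when a decoy finishes;
    - a checkpoint is only written if at least `interval_min` minutes
      have passed since the previous one.
    """
    if decoy_min <= 0:
        raise ValueError("decoy_min must be positive")

    last_saved = 0          # progress saved so far, in minutes
    boundary = decoy_min    # time at which the next decoy finishes

    while boundary <= elapsed_min:
        # The client only *requests* a checkpoint; the application
        # honours it here because it is at a decoy boundary.
        if boundary - last_saved >= interval_min:
            last_saved = boundary
        boundary += decoy_min

    return last_saved


# A 12-hour decoy stopped after 5 hours: nothing was saved.
print(last_checkpoint_min(elapsed_min=300, decoy_min=720, interval_min=1))

# 20-minute decoys stopped after 5 h 10 min: only the last 10 minutes are lost.
print(last_checkpoint_min(elapsed_min=310, decoy_min=20, interval_min=10))
```

Under these assumptions, lowering the checkpoint interval from 600 to 60 seconds cannot help a task whose first decoy never finishes before BOINC is shut down, which matches the restart-from-zero symptom reported above.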
©2024 University of Washington
https://www.bakerlab.org