Message boards : Number crunching : Question for developers - Does the New Versions on the 20Th have stuck at 1% fix?
Author | Message |
---|---|
Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0 |
I found one waiting to run and deleted it. Can we now assume that there are no more waiting to be downloaded, just in case I go to bed and get one overnight? I think that they are no longer in the queue, so you should be able to rest easy. |
Pixiebot Send message Joined: 6 Nov 05 Posts: 50 Credit: 60,515 RAC: 0 |
Unfortunately not: result 4468774, work unit 3761481, sent 20 Dec 2005 19:23:37 UTC, deadline 17 Jan 2006 19:23:37 UTC, server state In Progress, outcome Unknown, client state New. |
Lee Carre Send message Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0 |
The results would be valid, though I'm not sure how your or our computers would like the files that are 100 times as big. ... It's probably better for everybody to abort and get new WUs. We are still investigating the issue with the WUs finishing too quickly. Well, my concerns are: has the work been sent out again as the "correct" smaller WUs? And have the "DEFAULT" units been cancelled on the server/database, so that they aren't sent out anymore? I, for one, am going to leave mine running until I hear a definite answer from the officials. |
Scribe Send message Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
Having now aborted the 'default...' WU, it now sits in the work area with a status of 'Aborted by User'. There have been a couple of updates since then where results have been reported but it still sits there. How can I get rid of it? |
Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0 |
I for one, am going to leave mine running until I hear a definite answer from the officials. I am an official, and I can tell you that the right thing to do is abort the Work Unit. They will be sent again, with the correct arguments. |
River~~ Send message Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
IF ANYONE SEES A "DEFAULT_xxxxx_205_.........." (batch 205) WORKUNIT PLEASE ABORT IT. When Einstein made a mistake like this, they managed to give everyone credit for the aborted WU. If you ask them nicely they may still have the script handy. (If they are not sure when this was, it was when they issued WUs whose names differed only by upper-vs-lower case, which confused the Windows machines.) People initially got 0, but after the script was run everyone ended up getting what their client claimed for the result. Just a thought; I don't think it affects me personally. R~~ |
Desti Send message Joined: 16 Sep 05 Posts: 50 Credit: 3,018 RAC: 0 |
IF ANYONE SEES A "DEFAULT_xxxxx_205_.........." (batch 205) WORKUNIT PLEASE ABORT IT. The DEFAULT__206 workunits are ok? I have just finished one of them without any problems :) https://boinc.bakerlab.org/rosetta/workunit.php?wuid=3768678 LUE |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Batch 206 is okay, ONLY ABORT 205. |
Lee Carre Send message Joined: 6 Oct 05 Posts: 96 Credit: 79,331 RAC: 0 |
I am an official, and I can tell you that the right thing to do is abort the Work Unit. They will be sent again, with the correct arguments. Sorry, I didn't mean to imply that you weren't; I just meant someone from the group of admins generally. Since I've been out for a bit, the WU timed out anyway (I think it exceeded the max CPU time allowed), so there we go lol |
MikeX Send message Joined: 17 Sep 05 Posts: 1 Credit: 16,201 RAC: 0 |
WUs 4469584, 4436212 and 4399861 had an 'Unrecoverable error' after just a few minutes of crunching. |
Jack Schonbrun Send message Joined: 1 Nov 05 Posts: 115 Credit: 5,954 RAC: 0 |
Just to let everyone know, we are closing in on the bug causing many jobs to exit after a minute or so. We have a workaround until the bug is found, so we should be able to keep sending reliable jobs over the holidays. |
Scribe Send message Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
Thanks Jack |
Scribe Send message Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
Still getting lots of the ones that abort after a few seconds... |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
IF ANYONE SEES A "DEFAULT_xxxxx_205_.........." (batch 205) WORKUNIT PLEASE ABORT IT. Is there any way to stop these from getting re-issued after someone processes them until they default? I got a second go-around on one that someone else had crunched until it was defaulted. I guess they could be re-issued up to 5 times? Regards, Bob P. |
Scribe Send message Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
5 times? I have some that have been issued 10 times...that is why I have suspended Rosetta till they are fixed....it is a waste of my bandwidth. (this is with ref to the 'crashing' ones after a few seconds) |
Paul D. Buck Send message Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Having now aborted the 'default...' WU, it now sits in the work area with a status of 'Aborted by User'. There have been a couple of updates since then where results have been reported but it still sits there. Wait, it will try to run and immediately hit a client error ... then reporting will clear it like other failed work units. |
Scribe Send message Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
Yeah thanks Paul...it had gone when I got up this morning! |
rbpeake Send message Joined: 25 Sep 05 Posts: 168 Credit: 247,828 RAC: 0 |
5 times? I have some that have been issued 10 times...that is why I have suspended Rosetta till they are fixed....it is a waste of my bandwidth. I was referring to the 100X as long ones, the 205 series that run for hours before they default out, but your point is valid too! Regards, Bob P. |
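For anyone who wants to check a list of work unit names against the advice in this thread (abort batch 205, keep batch 206), here is a rough Python sketch. It assumes the names look like `DEFAULT_<label>_<batch>_<rest>`, which is only a guess based on the names quoted above; the function name `should_abort` and the sample names are made up for illustration and are not real work units or part of any BOINC tool.

```python
import re

# Batches the project staff asked volunteers to abort (from this thread).
BAD_BATCHES = {"205"}

# ASSUMPTION: names follow DEFAULT_<label>_<batch>_<rest>, inferred only from
# the "DEFAULT_xxxxx_205_..." and "DEFAULT__206" names quoted in the thread.
# The match ignores case because one post writes the prefix as 'default...'.
NAME_PATTERN = re.compile(r"^DEFAULT_(?P<label>[^_]*)_(?P<batch>\d+)_", re.IGNORECASE)


def should_abort(workunit_name: str) -> bool:
    """Return True if the name looks like a DEFAULT work unit from a bad batch."""
    match = NAME_PATTERN.match(workunit_name)
    return bool(match) and match.group("batch") in BAD_BATCHES


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for name in ["DEFAULT_abcde_205_example_0", "DEFAULT_abcde_206_example_0"]:
        print(name, "-> abort" if should_abort(name) else "-> keep")
```

This only tells you which names match; the actual abort still has to be done by hand in the BOINC client.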
©2024 University of Washington
https://www.bakerlab.org