Author | Message |
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
There are a couple of things to do with Rosetta that "concern" me, and I'd like to see some discussion of them.
1) When I set a machine to "crunch" a work unit that you have sent, it is with the assumption that it can be crunched (for lack of a better word).
Many times there are "issues" with work units and, for whatever reason, they fail to crunch to completion.
My machine has done its part of the bargain, and yet we receive no credit for the time involved.
Personally, this strikes me as a very unfair way of handling the situation. I have done what you asked with the work unit you sent me, so I do not see why I am not afforded the credit for my work.
2) I may be incorrect in this, and please correct me if I am wrong. The way I interpreted the info I read is that Rosetta sends out a work unit to at least two different users; then, when they are returned, if the data is the same on both returned work units, the credit given is based on the smaller of the 2 returned units.
Example: The work unit is sent to "Jimmy", running a P4-2000, and he receives X points.
The work unit is also sent to "Peter", who is running a highly modified AMD X2 4600 that's water-cooled and stable at 3100 MHz.
Should Peter receive points based on Jimmy's capabilities?
Seems a bit unfair to me. One person is donating time on a machine that costs AND does mathematical computations at approximately twice the rate of the other, but receives the same credit?
Again, if I misread the last point, please forgive me and please correct me.
We all get into DC for the science as the first goal but the "points" are what make it possible to attract a lot of people to these types of projects. When they see things that don't seem to be fair I think it's important to address them.
Thank you for your time.
Movieman from XS
|
|
Dimitris Hatzopoulos
Joined: 5 Jan 06 Posts: 336 Credit: 80,939 RAC: 0
|
Movieman, a few quick comments (FWIW, I'm just a participant like you, not part of the project):
1/ The Rosetta@home project doesn't send the same work to many PCs, just to ONE PC. Hence no donor resources are "wasted". Only if that PC fails to come back with a result within 7 days (the shortest WU deadline, although most WUs have a deadline of 2 weeks in my experience) is the same WU re-sent to a different PC.
2/ For all BOINC projects, the "credits" ("points") are computed based on a benchmark that is run periodically on each PC. A fast PC will claim more than a slower one per unit of CPU time, e.g.
one of your PCs, a P4/3.4GHz (166404), benchmarks at 2163.27 million ops/sec
whereas one of mine, a P4/2.53GHz (166828), benchmarks at 1198.13 million ops/sec
So your PC will claim 1.8 times the credits mine does per unit of time (e.g. 1hr), which you can easily check (btw, I've changed my preferences to crunch the same WU for 8hr vs the latest default of 2hr).
So the example of Jimmy's slow PC and Peter's fast PC getting same credits isn't applicable here.
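To make the arithmetic concrete, here is a minimal sketch of a benchmark-based claim. It assumes only that the claim is proportional to benchmark score times CPU time; the function name and the `scale` constant are made up for illustration and are not BOINC's actual formula. The two benchmark figures are the ones quoted above.

```python
def claimed_credit(benchmark_mops: float, cpu_hours: float, scale: float = 1.0) -> float:
    """Hypothetical claim: proportional to benchmark score times CPU time.

    `scale` stands in for whatever constant the real client uses (assumption).
    """
    return scale * benchmark_mops * cpu_hours


# Benchmark scores quoted in the post, in million ops/sec.
fast_claim = claimed_credit(2163.27, cpu_hours=1.0)
slow_claim = claimed_credit(1198.13, cpu_hours=1.0)

# The scale constant cancels out, so the ratio depends only on the benchmarks.
ratio = fast_claim / slow_claim
print(f"fast/slow claim ratio per hour: {ratio:.2f}")
```

Under this model the faster PC claims roughly 1.8 times as much per hour, whatever the real constant is, because the constant cancels in the ratio.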
What happens in that case is that some other BOINC projects send the very same WU to, e.g., 3 different PCs, then ignore the highest and lowest point claims and award the points claimed by the PC in the middle (all credit claims are normalised for speed). This approach wouldn't work in Rosetta@home anyway, because R's WUs are not fixed-length; their runtime can vary based on user preferences.
The issue of "inflated point claims" arises when people use modified versions of BOINC (an open source app, common to all BOINC projects). This particular issue has been discussed extensively lately here, and apparently some change is on its way in the BOINC software to alleviate this problem.
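The "drop the highest and lowest, award the middle" rule can be sketched in a few lines. This is only an illustration of the idea as described in this post for a quorum of exactly three results; the function name and the sample claims are hypothetical, and it is not taken from any project's validator code.

```python
def granted_credit(claims: list[float]) -> float:
    """Return the middle of exactly three credit claims.

    Sorting puts the lowest claim first and the highest last,
    so index 1 is the claim that survives once both extremes are ignored.
    """
    if len(claims) != 3:
        raise ValueError("this sketch assumes a quorum of exactly 3 results")
    return sorted(claims)[1]


# Hypothetical claims from three PCs for the same WU.
print(granted_credit([50.0, 250.0, 120.0]))
```

Every PC in the quorum is granted the same middle value, which is why a single inflated or deflated claim has no effect on what gets awarded.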
Hope this clears things up.
Best UFO Resources
Wikipedia R@h
How-To: Join Distributed Computing projects that benefit humanity
|
|
Moderator9 Volunteer moderator
Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0
|
This is all basically correct. All projects have error WUs for a variety of reasons. On occasion it is the project's fault, and Rosetta does try to provide credit when it is possible to do so. But just to put this in perspective: my systems have had only two errors in the last few weeks, and one of them was a download problem on my own network. So individual system conditions play a significant role in this issue.
Moderator9
ROSETTA@home FAQ
Moderator Contact
|
|
BennyRop
Joined: 17 Dec 05 Posts: 555 Credit: 140,800 RAC: 0
|
I was involved in a bit of case 2 there.
The person before me ran the WU for some time, then turned it in after the deadline and got 50 points. (BOINC and the validator weren't supposed to allow that, and it is being fixed.)
I ran it for 24 hours and returned it, claiming 250 points, but received the previous fellow's 50 points. Hopefully, that means that I created 5 times as many models for that WU as (s)he did.
Comparing scores, as mentioned earlier, is much tougher now, since there's the speed of the machine, the number of hours it's set to work on a WU (always doing at least one model), and the Linux/Windows and normal client/optimized client issues. (Which are supposedly also being worked on.)
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
Thank you for your input. That answers my questions on the " Jimmy VS Peter" issue.
I'm still left with the question of why we don't get credit for the work we've done when it's a bad work unit. It's not a matter of whose fault it is that the work unit hasn't crunched itself to completion; it's a matter of "I've done what you asked me to, and through no fault of my own the WU failed. I still did the same amount of work, donated the same PC, and paid for the electricity to do it." I.e., I figure that in all fairness I'm entitled to the points for doing so.
I got a message tonight from one of the guys on my team. He's telling me that unfortunately he has to turn off 2 PCs, as his electric bill, which normally runs $280.00 a month, just came in today at $420.00. He runs 4-5 machines normally. To get to the point bluntly: there is a considerable cost to running Rosetta, and all we ask in return is the recognition (points) for doing so. A very small thing in my estimation. There are a lot of blue collar people out there who run Rosetta and cut back in other areas to be able to afford the electrical cost to do so.
Where I live, electricity is the 3rd most expensive of the 50 states in the USA. My average electric bill is $180.00 per month without Rosetta running and $270.00 with Rosetta running. This is with 4 machines running here 24/7/365:
2-DX P4 Xeons, a P3 Xeon and a Pentium M set up in a desktop board.
I think you can see my point that it is frustrating to walk in from work, find that a machine has just spent 15 hours on a WU and is at 1% and even though I can abort it at that point, I receive nothing for the 15 hours work.
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
Moderator#9:
I understand what you're saying in relation to there being many factors involved in bad WUs, and at the risk of sounding like I'm blowing my own horn, I don't think that's the issue in my case. What I have here are not the average home PCs on an average home network. I build high-end business machines for a living and also do networking.
My home is wired (by me) with CAT6 to all commercial components. My main PC is a Supermicro X6DA8-G2 w/2-3600/2mb/800 Irwindales, 550 watt EPS12 PS, with Raptors in RAID 0. It is spot-on dead stable. It is Prime95 24-hour tested stable, which puts more strain on a system than you'd believe. We're talking close to $5000.00 in parts alone. I don't say this to brag; there are many that I know with equal or better systems. My point is that when a WU errors out, I am fairly certain that it is not my equipment causing the issue. If it was, I wouldn't be writing this, I'd be replacing the faulty equipment. Over 2 weeks ago I submitted a list to David Kim of over 3000 points that I felt my team had been shortchanged on, and I have yet to receive a reply on this, nor, to my knowledge, have those points been awarded.
I can't speak for others but those are my " individual system conditions".
|
|
reddwarf
Joined: 5 Nov 05 Posts: 11 Credit: 228,026 RAC: 0
|
|
|
David Baker Volunteer moderator Project administrator Project developer Project scientist
Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0
|
We appreciate all of your concerns and are working to address them. I really hope we can fix the sources of the problems very soon--this is our number one priority now, and with Rom on board hopefully this will move quickly. After these are fixed we will get back to the lost credit issue.
|
|
Los Alcoholicos~La Muis
Joined: 4 Nov 05 Posts: 34 Credit: 1,041,724 RAC: 0
|
I think it is impossible to award points for the stuck_at_1% WUs. If you don't babysit your PC, it can run for ages. And there isn't a way for Rosetta to know how much CPU time it wasted. So how much credit should they award? Whatever a participant claims?
It would be nice if BOINC would kill those stuck WUs automagically after a fair amount of time (like the no_heartbeat_from_client_core function), so there would be a limit to the wasted time.
|
|
AMD_is_logical
Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0
|
It would be nice if BOINC would kill those stuck WUs automagically after a fair amount of time (like the no_heartbeat_from_client_core function), so there would be a limit to the wasted time.
They tried using the BOINC timeout function, but it worked really badly. They do need something, though, because they will never be able to track down all stuck-at bugs in code that is continually being developed.
Perhaps they could have their own watchdog thread. A flag could be set whenever the program makes real progress, and the watchdog could wake up every now and then and would keep track of how much CPU had been used since the last bit of progress.
That would keep WUs that were making slow progress from timing out, while a truly stuck WU would time out fairly quickly.
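A rough sketch of that watchdog idea, in Python for illustration only: the class name, the `cpu_limit_s` parameter, and the injectable clock are all hypothetical and have nothing to do with Rosetta's or BOINC's actual code. Instead of a separate flag, this variation records the CPU time at the last sign of progress, which gives the watchdog the same information when it wakes up.

```python
import time
from typing import Callable


class ProgressWatchdog:
    """Tracks CPU time consumed since the worker last reported real progress."""

    def __init__(self, cpu_limit_s: float,
                 cpu_clock: Callable[[], float] = time.process_time) -> None:
        self.cpu_limit_s = cpu_limit_s          # allowed CPU seconds without progress
        self._cpu_clock = cpu_clock             # injectable so it can be tested
        self._cpu_at_last_progress = cpu_clock()

    def mark_progress(self) -> None:
        """Called by the science code whenever it makes real progress."""
        self._cpu_at_last_progress = self._cpu_clock()

    def cpu_since_progress(self) -> float:
        return self._cpu_clock() - self._cpu_at_last_progress

    def is_stuck(self) -> bool:
        """Called by the watchdog each time it wakes up."""
        return self.cpu_since_progress() > self.cpu_limit_s
```

A background thread could call `is_stuck()` every few minutes and end the task when it returns True. Because the limit applies to CPU time since the last progress mark rather than to total runtime, a slow WU that keeps reporting progress is never killed, while a truly stuck one hits the limit quickly.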
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
I think it is impossible to award points for the stuck_at_1% WUs. If you don't babysit your PC, it can run for ages. And there isn't a way for Rosetta to know how much CPU time it wasted. So how much credit should they award? Whatever a participant claims?
It would be nice if BOINC would kill those stuck WUs automagically after a fair amount of time (like the no_heartbeat_from_client_core function), so there would be a limit to the wasted time.
I took 3 hours and went through the logs to compile the list of which WUs errored out, a simple copy/paste. As to the issue of Rosetta staff knowing how much time was wasted, they have access through the node to the same information.
They could check the accuracy of anything that I or you submitted. Understandably, to check each and every one would be incredibly time consuming, but a spot check here and there would not be. I look at this from the perspective of honor. I would not submit anything I didn't feel was due to me or my team, and I make the assumption that no one else would either. If they do, they are just cheating themselves. This "game" we play to draw in new people, with the back and forth between us, is in all reality there to relieve the boring reality of what we do.
It adds a little spice to the game, gives one an opportunity to meet people from other cultures. There is a huge benefit to the points and competitions.
Mr. Baker:
Thank you for your comments. I applaud you for your checking in on this issue.
I wish you the best of luck in solving the issue of the stuck work units.
I am glad to hear that the issue of credits for lost work is on your agenda. This is a very sore point for many in the DC field. I cannot emphasize enough how many are upset over this issue. We had a team member leave Rosetta, taking 62 machines with him to F@H, just yesterday over this.
I don't know this person "Rom" you mentioned but I assume that he is a member of your team brought in to help out.
My best wishes to both of you,
Movieman from XS
|
|
Moderator9 Volunteer moderator
Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0
|
This issue has actually been discussed before, but it might help to see it again in case you missed it.
The project IS committed to giving legitimate credits. In some cases it is simply not possible to do an accurate job of this. For example, when you outright abort a WU, that will be listed as a user-aborted WU. There is no way to determine a credit value for that. IF a WU gets stuck and you restart it, in almost every case it will run to completion, but it will claim credit based on the restart from the last checkpoint. Usually this will be a normal credit claim, and will not include the time during which it was stuck. In that case there is no way for the project to know how long it was stuck.
BUT in the case of bad WUs that crash, they will usually cycle through 3 computers before the server "retires" them. In that case there will be three computers that will have claimed some credit. The project CAN and HAS awarded credits for this type of problem. BUT they have to wait for all the results to move through the system and land in the backend part of the project database before they can be processed for credit awards. This of course assumes that none of the three systems completed the WU successfully; if one did, it is not a WU issue but a client computer issue. At this time I am aware of three batches that are in this "bad WU" category, where no one was able to process them. It takes about three to four weeks for those results to get to the place where they can be processed for credit awards, and some additional time to set up and run the process.
The last type of WU/credit problem is the "Max Time" error, where the WU runs for a long time and then fails near the end of the process because it ran longer than the system thought it should. These would normally have completed OK if the system had not aborted them. There is at least one large group of these awaiting credit awards.
I have been told, and I think Dr. Baker's message confirms this, that credits will be awarded for all of the above categories of WU issues where it is possible to determine what should be awarded. But it takes someone a few days to process each of these sets of results for those awards, and right now there are simply not enough people to do it on demand. Since the project has stated that they will do this as soon as practical, and they have honored their word on this in the past, I think we can all expect them to honor their word this time. The real issue is: if you get the credit in the end, does it really matter if it is today or two weeks from now, as long as you get it?
I know the project would like credit to be awarded as soon as the WU result is submitted. As users we would all like credit to be awarded as soon as the right conditions are met as well. But for errors this is simply not possible to do in the same time frame as would be the case for a normal WU. On some of the projects I run, I have credit awards outstanding for over 4 weeks waiting for a quorum in the normal BOINC award system. So personally I am content to wait. Since the project has never said they would not award the credit, I do not think waiting is unreasonable.
As an aside, while I know that we ALL spend money on running these projects, it is an individual choice as to what level of resource input you want to put into it. Since no one is getting paid for this unless they are working for the project, it is really after all a hobby, akin to flying for pleasure or restoring old cars. While your restored classic car may not get an award for every show it enters, in no case is there a responsibility for the show to pay the cost of your restoration or a part of the gate receipts of the show when you enter your car in the competition. As a volunteer, you have to look at this as a volunteer effort. And while I would agree that the project should put forth the effort to reward participation wherever they can, our expectations of reward as volunteers should be tempered with a little perspective of the realities of running these projects, and the fact that we participate willingly. If people are spending the rent money to participate solely for the credits, they really need to rethink what they are doing.
Moderator9
ROSETTA@home FAQ
Moderator Contact
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
Moderator9:
Before I respond to your comments, I'd like to know the following:
Do you speak officially for the rosetta project and/or are you employed by them or are you a volunteer that just acts as a moderator on this forum?
Thank you,
Movieman from XS
|
|
FluffyChicken
Joined: 1 Nov 05 Posts: 1260 Credit: 369,635 RAC: 0
|
I think it is impossible to award points for the stuck_at_1% WUs. If you don't babysit your PC, it can run for ages. And there isn't a way for Rosetta to know how much CPU time it wasted. So how much credit should they award? Whatever a participant claims?
It would be nice if BOINC would kill those stuck WUs automagically after a fair amount of time (like the no_heartbeat_from_client_core function) so there would be a limit to the wasted time.
I took 3 hours and went through the logs to compile the list of which WUs errored out, a simple copy/paste. As to the issue of Rosetta staff knowing how much time was wasted, they have access through the node to the same information.
They could check the accuracy of anything that I or you submitted. Understandably, to check each and every one would be incredibly time consuming, but a spot check here and there would not be. I look at this from the perspective of honor. I would not submit anything I didn't feel was due to me or my team and I make the assumption that no one else would either. If they do, they are just cheating themselves. This "game" we play to draw in new people, with the back and forth between us, is in all reality there to remove the boring reality of what we do.
It adds a little spice to the game, gives one an opportunity to meet people from other cultures. There is a huge benefit to the points and competitions.
Mr. Baker:
Thank you for your comments. I applaud you for your checking in on this issue.
I wish you the best of luck in solving the issue of the stuck work units.
I am glad to hear that the issue of credits for lost work is on your agenda. This is a very sore point for many in the DC field. I cannot emphasize enough how many are upset over this issue. We had a team member leave Rosetta, taking 62 machines with him to F@H, just yesterday over this.
I don't know this person "Rom" you mentioned but I assume that he is a member of your team brought in to help out.
My best wishes to both of you,
Movieman from XS
The moderators are volunteers that help out in the forum.
BUT
When using the 'moderator' name as opposed to their real nickname on the forum, they are speaking in an official capacity for Rosetta@home and hence represent the project. That is why the moderator nicknames were created. If it were their own opinion and not Rosetta@home's, they should be responding under their own individual nickname.
Note: but give them some slack, they're only human ;-)
Note2: If you want true 'official-ness' you will see Project developer, Project scientist or Project something else under their name.
Team mauisun.org
|
|
Moderator9 Volunteer moderator
Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0
|
I am not employed by the project, nor do I get paid to moderate the forums. While some of the Moderators are employed by the project or the University, I am a volunteer. I have four primary areas of participation as a Moderator in no particular order.
1) Keep things as organized as possible and assist the project team and the user community in finding what they need in the forums.
2) Respond to questions and provide direct assistance to the user community.
3) Draw project attention to information and issues of import on the forums (usually through off-forum contacts), and distribute project responses to those issues as necessary.
4) Maintain a polite, professional "G" or at most "GP" rated environment through monitoring of forum activities, and if necessary removal of inappropriate materials.
As to speaking for the project, I believe the project expects that of all of the moderators. I routinely ask the project team for feedback on my work on the forums, and so far the response has been "keep going, you are doing fine". I usually do not take a firm position on a topic unless I have had some guidance from the project team. When I have taken a position without such guidance, I have indicated that was the case. I try not to get too deeply involved in issue discussions, but I am human and I am a Rosetta participant, so I sometimes do get involved.
That said, everything I post eventually gets read by someone on the project team. In fact one of the reasons for the color is to help them locate my posts quickly, and if I have said something they do not agree with they can respond to it quickly. So far that has not happened.
Everything in my posts on this thread has been gleaned from project guidance and prior posts from the project with the exception of the last paragraph of the post you quoted, which as I indicated when I posted it, was an aside.
Moderator9
ROSETTA@home FAQ
Moderator Contact
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
Moderator9:
Thank you for your reply.
From reading your comments I get the impression that you saw my comments as "wanting some financial reward" for participating in the project.
That is not the case at all. The ONLY reason I brought up a dollar value at all was to try to impart to you that I wasn't using shoddy equipment and that what I was using was stable and tested. The comment about entering an auto in a car show has no relevance to what I mentioned at all. I'm not trying to be combative here. Quite the contrary, I'm trying to get discussion on what I saw as an "issue" in Rosetta. Dr. Baker answered those questions, and I was surprised to see your comments made after he had answered them.
I have no problem at all with "waiting" for credits to be issued, whether it takes a month or three.
I do think, however, that since there are problems with some of the work units that have nothing to do with the user's equipment or client, credit should be issued for the period of time spent on those units.
I look upon DC projects as a partnership. We provide the computing power that otherwise would cost hundreds of thousands of dollars to a lab and might make the project financially undoable.
Perhaps the following will give you a better insight into my thinking.
What I look for in a project is the following:
1) It has a goal that will benefit mankind in a direct way. Medical research into the diseases that affect and destroy lives is, to me, what should get the most support from the DC community.
2) That the project be professionally run and properly funded.
3) That the people running the project respect and understand the commitment that is made by the people that support them. By this I mean a quiet understanding. I do not have, nor do I think most people have, a desire or a need to be publicly thanked for what we do. All the thanks I would ever need would be in hearing that a project I support has made a breakthrough that benefited people's lives. On the XS team, we have a sub-team called LBM (Local Business Men). This sub-team is composed of 2 young college-age students who have worked very hard to put together over 85 computers for Rosetta. From what I understand, last week one of the WUs they crunched had some positive outcome. This was brought up in the XS forum and the pride of accomplishment was something you could "see".
4) That the project have in place some sort of points system that may be used by the DC community to rally other users to the cause, and that it be a solid, well-working system that gives points based upon time involved and the computing power of the machines involved.
I don't expect perfection from anyone. You will never see me post on a server being down or a slow web page. I understand that these things do happen and are most times caused by issues beyond anyone's control.
Back to my main point, if we look upon this as a partnership, together we can do what neither of us can do alone.
Thank you for your time.
Movieman from XS
|
|
Stwainer
Joined: 9 Nov 05 Posts: 27 Credit: 4,406,829 RAC: 0
|
We get points for doing this? I just thought it was fun.
|
|
Moderator9 Volunteer moderator
Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0
|
Please don't misunderstand my comments. The reason I posted after Dr. Baker was simply to provide more background to his comments, and I certainly did not intend to either offend you or attack your position on this issue. What I was trying to do was draw attention to a range of comments and communications from all over the forums, and in some offline communications, and draw them together in one place to crystallize the positions I have seen from the project. As you might notice, some of the posts from the project team are quite brief, so I do sometimes post afterward with additional information.
As for my opinion on all this, I realize you were not suggesting you be paid for participation; that was not my intended point. My point was that I frequently read how much people are paying for their participation, and while that is admirable, it is also a personal choice. Many of us build systems for this purpose, and are happy to do so. But it is supposed to be fun, as well as helpful to humanity.
Moderator9
ROSETTA@home FAQ
Moderator Contact
|
|
XS_Vietnam_Soldiers
Joined: 11 Jan 06 Posts: 240 Credit: 2,880,653 RAC: 0
|
Fair enough..
yes, supposed to be fun..exactly my point..and that's where the competitions come into play and the points necessary for running those said competitions..and the stability of the points system is crucial to that..
IE: you run a given speed computer for a given time, you expect it to produce a given number of points..not to find that it has run 15 hours at 1% and you have lost that time and ergo the points that go with it..That just leaves the participant with an empty feeling and also asking themselves "Why did I just run my machine for nothing?" and the nothing I refer to is not just the loss of points, but the fact that the machine did not contribute anything to the science.
I not only want to see credit issued for time spent on a WU, but also a fix for the problem of WUs being sent that can't be crunched. That's the core issue; fix that and there are no issues about lost time.
Again, we're back to partnership. The closer we can work together on issues like this the more computing power that can be brought to this project.
I'd like nothing more than to see Dr. Baker on the cover of Time with a breakthrough that greatly benefits mankind. That to me would be the best "payment" that I could think of!
I'm glad to see that we agree at least in most areas..
Thank you for your time,
Movieman from XS
|
|
Moderator9 Volunteer moderator
Joined: 22 Jan 06 Posts: 1014 Credit: 0 RAC: 0
|
We actually agree on almost everything, and please feel free to consider the questions I am going to raise as rhetorical, as they are not directed at you specifically.
The difficulty I see is that many people do not realize that a big part of Rosetta is development of the software required to run accurate models. Because of this, there is a lot of variability in the WUs and the way they are run. While many people expect the project to be as mature and stable as SETI and some of the other projects that have been around for a while, that is not the case here. So at least for now, a number of people cannot simply load the software and let it run without SOME monitoring. But many more can. This will change. That is the focus of Ralph.
I think your question is fair. If a WU is flat out bad, that is an issue. But it is also fair to say that even if it fails on two systems, if it runs to success on the third, then it might not be the WU that is at fault. It could be, but more than likely it is not. So what is the fair ground here? Considering the extra work for the project to award credits outside the normal process, and the fact that it diverts them from fixing the problem itself, should there not be some point at which it is fair to say a user might lose some computer time?
I see many people trying to run the project with systems that are clearly not up to the task. They say they run other projects just fine. But this project is not like other projects. The requirements are clear for Rosetta, and people should expect to have trouble if they do not meet those requirements. When they don't, they may lose credits. Is this the fault of the project? Many of these users are among the loudest on this issue. How far should the project go to address these demands for credit?
Many users complain about leaving the application in memory, and they criticize Rosetta for that. But the fact is that if you do not leave applications in memory you will lose CPU cycles at application switches on ALL of the projects. With Rosetta the loss is more significant, but even CPDN will lose up to 15 minutes for each swap. It is not a requirement to keep applications in memory, but it improves the success rate and saves cycles, so it is suggested. If people do not follow that suggestion, and they crash a WU, or lose CPU time, or hang a WU, who is at fault? Not all systems crash WUs on restart after a swap, and not all are running with apps in memory.
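As a rough illustration of the cost described above: an application that is not kept in memory has to redo the work since its last checkpoint every time it is swapped out, so the loss adds up with the number of swaps. The sketch below is just that arithmetic; the function name and the number of swaps per day are hypothetical, and the 15 minutes is the worst case quoted above for CPDN.

```python
def cpu_minutes_lost(swaps, minutes_lost_per_swap):
    """Total CPU minutes redone when the app is unloaded at every swap."""
    return swaps * minutes_lost_per_swap

# Hypothetical day: 8 application switches, each losing the worst-case
# 15 minutes mentioned above.
print(cpu_minutes_lost(8, 15))  # 120
```

With those assumed numbers, two hours of CPU time a day would go to redoing work rather than to new results, which is the reason for the suggestion to leave applications in memory.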
So I guess what I am saying is, that asking for fair credit awards is reasonable, but there also has to be some consideration given as to what is a fair request, and the time frame in which the project must actually award the credits once they have committed to do so. The answer to that varies from user to user, so there are about 46,000 opinions.
Moderator9
ROSETTA@home FAQ
Moderator Contact
|
|