Message boards : Number crunching : Recent Average Credit
Author | Message |
---|---|
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
OK, to sum up for my own feeble mind - 1. RAC is really a 'Project' issue and saves them a great deal of database 'issues'. 2. 'Normal' averages that 'normal' folk understand are available at other stats sites. 3. Everybody gets what they want somewhere! 4. No Problem! |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Must be time again to mention the use of "hobbyist" databases (MySQL) instead of real ones (Informix, et al). As soon as you agree to pay the licence cost for all projects that want to use BOINC, I am sure they will all be willing to move to Oracle. Until then, MySQL is free, and does the job ... |
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
OK, to sum up for my own feeble mind - Yes |
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
Bill, excuse me if I appear ignorant... Yes, the WUs are the most important, agreed. BUT the data for the RAC is output from the server how many times a day for stats; it must make that calc and lookup at some point? "The project sites are running in 'real time' - every time the validator awards credit for a WU, the total credits and RAC for those hosts and participants and teams is recalculated." Is that not saying the server has to make a complicated calculation, instead of 6+1 /7? P D Buck said: "A point I have tried to make in the past with people, usually unsuccessfully, is that the project is not about *US*, but about getting work done for the project at minimal cost." A company that does not look after its workforce will eventually find itself in trouble! Ask how many current crunchers here were originally at United Devices!! Paul, if you want my money, which essentially is what the project gets, don't then stick two fingers up at me (or give me the finger in U.S. terms). The project is US and the D.Baker team together, don't ever think otherwise. We pull our money... no project! "Adding a moving average sounds great to us, but does nothing for the project except increase the cost of operation ..." So the RAC doesn't move? Don't be absurd... the calculation for RAC is done somewhere; all I'm saying is that 6+1 /7 is definitely easier and carries more weight with the workforce than an RAC which everyone ignores cos it's crap. To even say it is an average one must have a key to state a starting point and an end. To state it is a recent average is just a lie. You want to save money? Cut the RAC cos it's worthless, it's crap, it ain't worth it, it's bunkum. Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
6 +1/7 of what exactly? The scores are updated on the project every time a validator finishes. You might have one new score in a day, or none, or several. The RAC formula is designed so that the same formula can run as often as needed. Any form of moving average is easier sums but harder on the database, as you need access to a range of past scores each time the figure changes. RAC just needs the old figures for running total and RAC. Sums are fast, database access is slow. The stats projects do this by holding all the stats in memory all the time - BoincStats Willy is complaining now that he is filling up his 3 GB of memory just holding stats. That is not a huge problem for him; his server is a dedicated stats server. To do what you are asking on a main project database server would cripple the system for all of us. "The project is US and the D.Baker team together, don't ever think otherwise." You correctly said that the project is the scientists plus us. For me the "us" includes people like Willy, Mundayweb, BoincSynergy etc. who provide volunteer services to enhance the enjoyment of the projects. There is absolutely no need, in my opinion, for the mainstream BOINC team to duplicate the excellent work of volunteers. Each of the three stats sites I've mentioned here allows anyone to use their stats whether you are a member of their team or not. Try the following links, browse around. In the first site you can find your own stats by putting your user name in the search box on the left hand side - the other sites have similar facilities. Boinc Stats, Boinc Synergy, Munday Web. River~~ |
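To illustrate the point that this kind of update needs only the figures already stored in the record, here is a minimal Python sketch of an exponentially decayed average. It is not taken from the BOINC source: the one-week half-life, the `CreditRecord` field names, and the `award_credit` helper are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400.0
HALF_LIFE = 7 * SECONDS_PER_DAY  # assumption: one-week half-life


@dataclass
class CreditRecord:
    """The only per-record state the update needs (hypothetical field names)."""
    total_credit: float = 0.0
    rac: float = 0.0        # decayed average, in credits per day
    rac_time: float = 0.0   # time of the last update, in seconds


def award_credit(rec: CreditRecord, now: float, credit: float) -> None:
    """Fold one credit award into the record without reading any history."""
    elapsed = max(0.0, now - rec.rac_time)
    # Weight given to the old average: halves for every HALF_LIFE that passed.
    weight = math.exp(-elapsed * math.log(2) / HALF_LIFE)
    rec.rac *= weight
    if 1.0 - weight > 1e-6:
        # Spread the new credit over the elapsed time as a per-day rate.
        rec.rac += (1.0 - weight) * credit / (elapsed / SECONDS_PER_DAY)
    else:
        # Almost no time has passed: use the limit of the expression above.
        rec.rac += math.log(2) * credit * SECONDS_PER_DAY / HALF_LIFE
    rec.total_credit += credit
    rec.rac_time = now
```

Under these assumptions, a host that earns a steady 100 credits a day settles at an average close to 100, and a week with no credit halves it. The function can be called once a day or twenty times a day without any change, because the elapsed time is part of the formula.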
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
River~~ Exactly my point: the RAC is not needed and is wasting cycles; it is a complex piece of math for its own sake. It is superfluous. It is like wanting to know the speed of a car by measuring distance travelled over a period, then adding in all previous journeys with a half-life... "RAC just needs the old figures for running total and RAC." And over which period is this credit average? Or do we need a start date? So accordingly you just need a total (WUs) and RAC... is that correct? Here you are then, here's my total: 7,037... my RAC: 344.54. Show me your working out please; as it's easier on the server, it must be easy to show. How about my way... My total on 9th Jan was 6,037 (<-- one figure per day). My total on 16th Jan was 7,037. 1000/7 = 142.86 average... Are you going to tell me the figures (total) are not held on the server? Not talking about forever here, am I? Question: if the Inland Revenue or IRS worked out your pay like this, would you be happy? Would you say it is correct? Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
Member example: a team member in the last 7 days put in... 37, 30, 22, 0, 0, 0, 180 = total 269. His RAC is 116.66... which is in reference to what exactly? He can use this figure in what respect? Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
Vester Joined: 2 Nov 05 Posts: 258 Credit: 3,651,260 RAC: 521 |
Why don't you post at BOINC instead of beating a dead horse in Seattle? I like RAC because it shows me when one of my remote crunchers isn't running continuously. |
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
How does RAC show when one of your remotes isn't continuous? I'm interested... I want to know why everyone thinks RAC is easier when I can't see it, nor can I see to what degree it is useful. Going to learn nothing by sitting back and accepting, am I? Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
I think what Vic means is that as it is a general BOINC question and not specific to Rosetta, the best place to query it would be at BOINC FORUMS rather than here ;-))) |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,450 RAC: 13 |
"River~~ Exactly my point the RAC is not needed and is wasting cycles, it is a complex piece of Math for its own sake."

I'm not going to defend the RAC; I don't like it either. But it is a single number that gives an _idea_ of how much you are producing for the project at any moment. Yes, it's worthless for a new participant because it climbs too slowly, and it doesn't instantly show when someone is producing less, because it decays too slowly. For the _average_ "set it and forget it" participant, however, who starts everything up and doesn't micromanage it, it is a single figure that can be used to determine if they're doing "more" or "less" than someone else. Your point is that you would prefer a _different_ number to do the same thing; a different way of calculating. We've seen that this different number is already available to you, done the way you want it, at Free-DC. You're still saying the project should use this method and drop RAC.

"RAC just needs the old figures for running total and RAC."

There are three other items we need: the date/time of the last update of RAC, the current date/time, and how much credit you're being given right this second. That's it; NO history, NO "how many credits you had at any point in the past", NO start date. What the server has to store and retrieve to do RAC is RAC, date/time of RAC, and total credits. Current date/time and current additional credit are available to the program that is recalculating the RAC.

"How about my way...."

You're right, I'm going to tell you the figures are not held on the server. Your method would require not only changing the code, and adding a new (MASSIVE) program that would run every day, but adding six additional fields in the database, for the last six days' total credits. Even ignoring the additional megabytes of data storage, that's not the problem. Let me attempt to explain in "pseudo-code" why your simple request is incredibly complicated...

Current RAC:

Validator program runs to add credit. Reads database record it's about to update and gets RAC, RAC_DATE, and TOTAL_CREDIT. Using those and CURRENT_DATE and CURRENT_CREDIT, it calculates the new RAC, and writes the new value into RAC, the CURRENT_DATE into RAC_DATE, and adds CURRENT_CREDIT to TOTAL_CREDIT and writes it back in. One database read of three fields, one database write of three fields. (Repeated for participant, host, and team records.)

Weekly, SOME projects run a "decay/cleanup" routine. This is the same code as above, but the CURRENT_CREDIT input is 0. These projects show a decreasing RAC when someone quits; if this is not run, then the RAC "freezes" when someone quits. Some projects do not run this routine, simply because it requires a LOT of database reads and writes: one for every participant, host, and team record. This can take hours to run, and severely slows down the servers while it's running.

Moving Average Method:

We'll assume the 7-day average _replaces_ current RAC; if it's in addition to, then the following is done in addition to the preceding.

Validator program runs to add credit. Reads database record it's about to update and gets TOTAL_CREDIT. Adds CURRENT_CREDIT to TOTAL_CREDIT and writes it back in. One database read of one field, one database write of one field. (Repeated for participant, host, and team records.) No database load change from current approach, just less math. No gain, no loss.

New "Carl's Average Calculating Program" runs. This MUST be done every day, and at the exact same time every day. Validator must be stopped while it runs. If it does not run "on time" for some reason, the Validator must not run until CACP can be run; otherwise "today" will get too much credit and "tomorrow" too little. This program is based on the old "decay" weekly program. It reads and writes every record in the three tables, as follows:

Read TOTAL_CREDIT, CREDIT_BACK_1, CREDIT_BACK_2, CREDIT_BACK_3, CREDIT_BACK_4, CREDIT_BACK_5, and CREDIT_BACK_6.
Subtract CREDIT_BACK_6 from TOTAL_CREDIT and divide by 7. Write this value to RAC.
Write CREDIT_BACK_5 into CREDIT_BACK_6, CREDIT_BACK_4 into CREDIT_BACK_5, etc., shifting all the numbers "back a day".

Trivial math; but requires a read and a write for every record in the participant, host, and team tables.

-----

So - maybe you can see that what you are asking for is not just a "change in the math", but for every project to run a program that reads and writes tens or hundreds of thousands of records - EVERY DAY. There is _already_ a program that was originally supposed to be run weekly to do "RAC decay", reading and writing those records; but some projects have HAD to stop running it, simply because even once a week, and even though the "decay" program doesn't require shutting off the validator, it puts too heavy a load on their servers.

MAYBE you could get by with running this program weekly instead of daily... but people are NOT going to like having their RAC changed only once a week. Or you could just drop RAC completely and let the 3rd-party sites do it; but that will not reduce the project's load at all (still have to do the one read/write for each new credit) and participants who _do_ use the RAC will be screaming.

The current method of RAC was chosen _because_ it can be updated "only when credit is added" - when the system is _already_ reading and writing that record. Any other method of coming up with a RAC that _I_ can think of requires some program to read and write _every_ record, on some regular schedule. Therefore, the formula for RAC may be "tweaked" a bit here and there, but it's not going to be changed significantly. |
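The load argument in the pseudo-code above can be made concrete with a small Python sketch that counts record accesses. Everything here is hypothetical: the `Table` class is an in-memory stand-in for a database table, and `validator_award` and `daily_average_job` are illustrative names. The sketch keeps seven daily snapshots rather than the six CREDIT_BACK fields named in the post, so that the subtraction spans seven full days; that is a choice of this example, not something stated in the thread.

```python
from dataclasses import dataclass, field

WINDOW_DAYS = 7


@dataclass
class Record:
    total_credit: float = 0.0
    average: float = 0.0
    # Totals as of the last seven daily runs, newest first (hypothetical layout).
    snapshots: list = field(default_factory=lambda: [0.0] * WINDOW_DAYS)


class Table:
    """In-memory stand-in for a database table that counts record accesses."""

    def __init__(self, size: int) -> None:
        self.rows = [Record() for _ in range(size)]
        self.reads = 0
        self.writes = 0

    def read(self, key: int) -> Record:
        self.reads += 1
        return self.rows[key]

    def write(self, key: int, rec: Record) -> None:
        self.writes += 1
        self.rows[key] = rec


def validator_award(table: Table, key: int, credit: float) -> None:
    """Per-credit path: touches only the record that earned the credit."""
    rec = table.read(key)
    rec.total_credit += credit
    table.write(key, rec)


def daily_average_job(table: Table) -> None:
    """Daily batch path: must read and write every record in the table."""
    for key in range(len(table.rows)):
        rec = table.read(key)
        # Credit earned since the snapshot taken seven runs ago, per day.
        rec.average = (rec.total_credit - rec.snapshots[-1]) / WINDOW_DAYS
        # Shift the snapshots "back a day" and store today's total first.
        rec.snapshots = [rec.total_credit] + rec.snapshots[:-1]
        table.write(key, rec)
```

With a table of 1,000 records, three credit awards cost three reads and three writes, while a single run of the daily job costs 1,000 of each regardless of how many records actually earned anything that day.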
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
At BOINC: "Please use these boards only for messages related to BOINC software: questions, bug reports, feature requests, etc." Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
Thanks Bill... I'm working on it! Though I see flaws in the working out of my way, this is an initial view. Doesn't have to read 1 through 6... just the total on a given day. Will study closer. Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
READ_1, READ_7, calculate, WRITE-AVERAGE. No? Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
Tern Joined: 25 Oct 05 Posts: 576 Credit: 4,695,450 RAC: 13 |
What fields are required, and how the math is done, is not the point. The problem with _any_ approach other than the current one, is simply that some program must run daily, and read and write EVERY RECORD. The current approach reads NO records that wouldn't already be read. If you don't read BACK_4, then how do you move it to BACK_5? It's not needed in the calculation, but it's needed because it has to be moved. Reading 1 or 10 fields in a record is irrelevant; reading another record is very relevant. |
Scribe Joined: 2 Nov 05 Posts: 284 Credit: 157,359 RAC: 0 |
At BOINC RAC is BOINC RAC is software You are QUESTIONING.. You are REQUESTING a FEATURE .......Bleagh! |
Vester Joined: 2 Nov 05 Posts: 258 Credit: 3,651,260 RAC: 521 |
"How does RAC show when one of your remotes isn't continuous?" By viewing the RAC of the individual computers, I can see that the RAC decreases when a machine is not running Rosetta full-time. |
River~~ Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
"READ_1, READ_7"

No. On a weekly running average, if it is Tuesday today and your last update was Sunday, I need to subtract from the total the credit you got on Sunday (9 days ago) and on Monday (8 days ago) before I add the new credit.

READ_LAST_DATE, READ_SUN, READ_MON, READ_WKAVG
WRITE_AVG, WRITE TIME_OF_AVG

This means I must store each day's numbers separately, so they can be subtracted when they 'expire'.

OR I must store each day's credit separately, and add up seven numbers to get each average. This means I must store each day's numbers separately so they can be added together as needed.

READ_6, READ_5, READ_4, READ_3, READ_2, READ_1, READ_0
ADD, WRITE-AVG

Notice that whichever way you do it, you end up storing different days' credits in different slots. There is no way to calculate a running average without doing this.

And notice that if you want the result credited at 15:07 UTC on that Sunday to stay in the weekly stats for exactly a week, i.e. not to drop out at 15:06 a week later but to drop out at 15:08 a week later, then it is not just a case of storing each day separately, but each actual result separately together with its time awarded. At present the dates and times of a credit award only stay in the database till the results are removed - then nobody knows when or how you got those scores; it just carries forward the two numbers.

As I said, the issue is not the sums; it is not even really the storage, as hard drives are cheap enough - the issue is the retrieval from storage of many more numbers than are used for the RAC. Even if (say) we locked all results into the database for a week after credit was awarded, every time credit was awarded the database would need to re-read many or all of the credit awards for the last week. Reading seven items of data from seven different parts of the disk takes a *lot* longer than seven times the time to use numbers in the current record.

In all fairness Carl, these complexities are not obvious. Humans have good memory and poor calculating speed: just think how fast you can recall two 7-digit phone numbers you use regularly, and compare that to the time it takes you to add them together in your head! OK: and now multiply them ;-) With current computer tech, computers are the other way round. It turns out it is a good deal to do a dozen multiplies to save one fetch. That is strongly counterintuitive. But anyone who has worked with databases will confirm it ain't silly; it is the surprising truth. It isn't "BS baffles brains", but the skill of experience versus the natural assumptions of inexperience. River~~ |
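The first scheme described in the post above (separate daily slots, with old days subtracted when they expire) can be sketched in Python as follows. The `DailyBuckets` class, its field names, and the use of integer day numbers are assumptions made for this illustration; nothing here is taken from the BOINC code.

```python
from dataclasses import dataclass, field

WINDOW_DAYS = 7


@dataclass
class DailyBuckets:
    """One slot per day of the window, plus the day of the last update."""
    last_day: int = 0
    buckets: list = field(default_factory=lambda: [0.0] * WINDOW_DAYS)

    def _expire(self, day: int) -> None:
        # Clear the slot of every day that has started since the last update;
        # each of those slots still holds credit from a week earlier.
        gap = day - self.last_day
        for d in range(self.last_day + 1, self.last_day + 1 + min(gap, WINDOW_DAYS)):
            self.buckets[d % WINDOW_DAYS] = 0.0
        self.last_day = max(self.last_day, day)

    def award(self, day: int, credit: float) -> None:
        """Drop expired days, then add the credit to today's slot."""
        self._expire(day)
        self.buckets[day % WINDOW_DAYS] += credit

    def average(self, day: int) -> float:
        """Seven-day mean in credits per day, as seen on the given day."""
        self._expire(day)
        return sum(self.buckets) / WINDOW_DAYS
```

Two things follow from this layout. Each record carries seven numbers and a date instead of the two numbers and a date that the decayed average needs. And the window is only accurate to the day: making an award drop out exactly one week after the minute it was granted would require storing every award with its own timestamp, as the post points out. An idle record's stored figure also stays stale until it is next touched, much like the RAC "freeze" described earlier in the thread.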
Paul D. Buck Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
One last point that is also somewhat counterintuitive. Database systems are almost always I/O bound, so computations are "cheap" and accesses are expensive. So, a complex formula is less of a burden than one read. Adding fields to a record makes the record longer, and that increases the I/O, assuming we use denormalized methods and add daily totals for today, yesterday, etc. But, as was pointed out, we also have to track the day that this was done, turning 7 fields into 14. If we create a new table, now the rows are narrow, like we want (ID, date, value); but now I have to do a scan to pull 7 records, do the math, and make any updates (assuming we delay updates to the point of next use), turning a single read and update into the original read and update along with additional reads and updates to the added records. |
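The "new table" option described above (narrow rows of ID, date, value) might look like the following Python sketch using the standard `sqlite3` module. The table name `daily_credit`, its columns, and the helper functions are hypothetical, and SQLite merely stands in for whatever database a project actually runs.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one narrow row per host per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_credit ("
        " host_id INTEGER NOT NULL,"
        " day INTEGER NOT NULL,"
        " credit REAL NOT NULL,"
        " PRIMARY KEY (host_id, day))"
    )
    return conn


def award(conn: sqlite3.Connection, host_id: int, day: int, credit: float) -> None:
    """Add credit to today's row, creating the row if it does not exist yet."""
    cur = conn.execute(
        "UPDATE daily_credit SET credit = credit + ? WHERE host_id = ? AND day = ?",
        (credit, host_id, day),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO daily_credit (host_id, day, credit) VALUES (?, ?, ?)",
            (host_id, day, credit),
        )


def weekly_average(conn: sqlite3.Connection, host_id: int, today: int) -> float:
    """Scan up to seven rows for this host and return credits per day."""
    row = conn.execute(
        "SELECT COALESCE(SUM(credit), 0.0) FROM daily_credit"
        " WHERE host_id = ? AND day BETWEEN ? AND ?",
        (host_id, today - 6, today),
    ).fetchone()
    return row[0] / 7.0
```

The arithmetic stays trivial, but every average now costs a range scan over as many as seven extra rows on top of the update to the main record, which is the additional I/O the post is warning about.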
carl.h Joined: 28 Dec 05 Posts: 555 Credit: 183,449 RAC: 0 |
Guys, guys, ease back now; my lads asked me in a nice sort of way to pull out. I can come across somewhat abrasive when I'm trying to get at something. I have learnt somewhat that Rosetta does not keep what I thought it did (maybe), but am far from convinced. Let's leave it at that. I'm sure if I get my old Tandy and my BASIC book out... Not all Czechs bounce but I'd like to try with Barbar ;-) Make no mistake This IS the TEDDIES TEAM. |
©2024 University of Washington
https://www.bakerlab.org