Tell us your thoughts on granting credit for large protein, long-running tasks

Author	Message
spRocket Send message Joined: 23 Mar 20 Posts: 22 Credit: 3,008,018 RAC: 0	Message 95135 - Posted: 22 Apr 2020, 15:05:03 UTC I think I've picked up a 4 GB work unit on one of my systems - it has 8 GB RAM, but at the moment, only a single task is running, and I haven't touched its settings. The "top" command shows a resident size of 2.874GB. ID: 95135 · Rating: 0 · rate: / Reply Quote

[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 2151 Credit: 12,881,353 RAC: 4,337	Message 95138 - Posted: 22 Apr 2020, 15:59:22 UTC - in response to Message 95123. We only have a few top notch 32+ cores machines with beefy GPUs around the world, Try hundreds of thousands, at the least. Yeap, see here ID: 95138 · Rating: 0 · rate: / Reply Quote

bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0	Message 95140 - Posted: 22 Apr 2020, 16:27:28 UTC - in response to Message 95123. Last modified: 22 Apr 2020, 16:37:26 UTC Yes, I agree that we should not crunch on everything. I meant to say on every computer where it is worth it, as per my thread "The most efficient cruncher rig possible". Sorry this part of the sentence got lost - I had to retype this message because no drafts are saved on this forum. We should do exact computations on this, but my gut feeling is that crunching on normal, non-extreme, non-server hardware can be at least somewhat efficient if it is:
- more recent than 5 years
- more recent than 10 years underclocked
- more recent than 10 years portable
You could actually produce a histogram/median/average of our current fleet from this data: https://boinc.bakerlab.org/rosetta/cpu_list.php Although I think the machine distribution is quite skewed towards the higher end compared to the general population, so it shouldn't be considered representative. Also note that computers are usually recycled after about 15-20 years of age in general, so you shouldn't see a large number of them in operation anyway. By "only a few top notch computers" I meant that I expect them to be much less than 1% of the population according to my gut feeling. Also, I expect that most deployed high performance computers already serve a purpose and usually couldn't offer their unused capacity for volunteer computing, as a given company made a big investment to purchase and operate them. On the other hand, there exists a vast number of computers just sitting there all day long in businesses, schools and homes. If we assume that we are only talking about the more efficient crunchers, the benefit of their computation should far outweigh their cost in electricity.
And if you are not running 24/7 but are running BOINC in the background with low priority, it still has higher energy efficiency due to the components that are shared between a given project and the user. For example, if a user's machine idles at 30W, then the +60W CPU power cost would be less than operating a dedicated cruncher at 90W either in their own home, or in a separate lab. This reduces global warming and also produces less electronic waste (fewer servers to manufacture, fewer of them to dispose of). ID: 95140 · Rating: 0 · rate: / Reply Quote
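The 30W/60W/90W comparison in the post above can be worked through as a small sketch. It assumes, as the post does, that the shared machine would be switched on anyway, so only the extra CPU load counts against crunching; the function names and the 24-hour window are illustrative, not taken from the thread.

```python
def extra_power_w(idle_w: float, cpu_load_w: float, machine_already_on: bool) -> float:
    """Power attributable to crunching, in watts.

    If the machine is on anyway, its idle draw is shared with the user and
    only the added CPU load is charged to the project (assumption from the post).
    """
    if machine_already_on:
        return cpu_load_w
    return idle_w + cpu_load_w


def energy_kwh(power_w: float, hours: float) -> float:
    """Convert a constant power draw over a period into kilowatt-hours."""
    return power_w * hours / 1000.0


# Figures from the post: 30 W idle, +60 W under CPU load.
shared = extra_power_w(30, 60, machine_already_on=True)      # user's own PC
dedicated = extra_power_w(30, 60, machine_already_on=False)  # dedicated cruncher

# Illustrative 24-hour window (hypothetical choice of period).
saved_kwh_per_day = energy_kwh(dedicated - shared, 24)
print(f"shared: {shared} W, dedicated: {dedicated} W, saved per day: {saved_kwh_per_day} kWh")
```

Under these assumptions the saving is simply the idle draw that the dedicated box would otherwise burn on its own.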

allen Send message Joined: 14 Apr 20 Posts: 1 Credit: 61,472 RAC: 0	Message 95141 - Posted: 22 Apr 2020, 17:36:45 UTC Hello all: I'm new here and am wondering how Rosetta determines the amount of wu's to send each computer. The reason I ask is because I have had wu's cancelled before they are finished since they ran out of time. I have a system that is receiving 8 hour wu's that are continuously taking over 24 hours to run. Hopefully one of you will fill me in on what's happening here. Thanks a bunch, Allen ID: 95141 · Rating: 0 · rate: / Reply Quote

bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0	Message 95142 - Posted: 22 Apr 2020, 17:39:30 UTC - in response to Message 95140. Last modified: 22 Apr 2020, 17:39:45 UTC I've processed the CPU list table from the above post. Because the sum is much less than the one on the homepage, I think this may include any registered member on the project, not only the active members. Also note that HT CPU's are overrated at least 50% in the total stats (they simply multiply thread count by per-thread flops). As HT is much more prominent at the high end than the low end (envision Celerons/Pentiums), this skews the stats even more towards the right.
21428.9 TFlops; 97.8928 GFlops/host mean; 218902 host
20.34 GFlops/host median
64915 < 5 GFlops
10834 < 10 GFlops
19593 < 15 GFlops
12726 < 20 GFlops
11298 < 25 GFlops
10666 < 30 GFlops
8273 < 35 GFlops
5766 < 40 GFlops
1993 < 45 GFlops
2626 < 50 GFlops
1451 < 55 GFlops
3696 < 60 GFlops
2406 < 65 GFlops
1783 < 70 GFlops
1363 < 75 GFlops
1437 < 80 GFlops
2547 < 85 GFlops
1959 < 90 GFlops
4437 < 95 GFlops
198 < 100 GFlops
332 < 105 GFlops
133 < 110 GFlops
22 < 115 GFlops
298 < 120 GFlops
28904 < 125 GFlops
102 < 135 GFlops
452 < 140 GFlops
404 < 145 GFlops
228 < 150 GFlops
22 < 160 GFlops
355 < 165 GFlops
14 < 175 GFlops
15 < 180 GFlops
23 < 195 GFlops
21 < 200 GFlops
20 < 205 GFlops
20 < 210 GFlops
11 < 215 GFlops
19 < 220 GFlops
16 < 225 GFlops
174 < 245 GFlops
126 < 250 GFlops
12 < 275 GFlops
19 < 290 GFlops
30 < 315 GFlops
55 < 335 GFlops
135 < 380 GFlops
14 < 405 GFlops
11 < 630 GFlops
47 < 645 GFlops
16686 < 830 GFlops
ID: 95142 · Rating: 0 · rate: / Reply Quote
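As a rough cross-check of the summary figures in the post above, the mean can be recomputed from the quoted totals, and the median can be located from the first few histogram buckets. This is only a sketch: the bucket list is truncated to the six lowest buckets from the post, and the helper names are made up for illustration.

```python
# Totals quoted in the post.
TOTAL_TFLOPS = 21428.9
TOTAL_HOSTS = 218902

# (upper bound in GFlops, host count) for the six lowest buckets in the post.
LOW_BUCKETS = [
    (5, 64915),
    (10, 10834),
    (15, 19593),
    (20, 12726),
    (25, 11298),
    (30, 10666),
]


def mean_gflops_per_host(total_tflops: float, hosts: int) -> float:
    """Mean per-host speed: total TFlops converted to GFlops, divided by hosts."""
    return total_tflops * 1000.0 / hosts


def median_bucket_upper_bound(buckets, total_hosts):
    """Return the upper bound of the bucket in which the cumulative host
    count first reaches half of all hosts, or None if the supplied buckets
    do not reach that far."""
    half = total_hosts / 2
    running = 0
    for upper, count in buckets:
        running += count
        if running >= half:
            return upper
    return None


print(round(mean_gflops_per_host(TOTAL_TFLOPS, TOTAL_HOSTS), 2))
print(median_bucket_upper_bound(LOW_BUCKETS, TOTAL_HOSTS))
```

The median falling in the 20 to 25 GFlops bucket while the mean sits near 98 GFlops is what one would expect from a distribution with a long right tail, which matches the skew the post describes.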

bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0	Message 95143 - Posted: 22 Apr 2020, 17:46:12 UTC - in response to Message 95141. Last modified: 22 Apr 2020, 17:48:53 UTC I think your question is off-topic here, but let me give a TL;DR. I can see under your account that you have dozens of in progress WU's. Please visit computing preferences under your account and reduce your store at least ... and store up to additional ... values. They should probably sum to be less than 1 day, even down to 0.1+0.1days during debugging while BOINC is learning your processing rate. According to this task, it indeed took 24 hours of CPU to complete 195 decoys: https://boinc.bakerlab.org/rosetta/result.php?resultid=1153332354 Please double check the target CPU runtime in your Rosetta@home preferences under your account. It defaults to 8 hours, although 24 hours should be still doable. Deadlines are around 3 days I think. ID: 95143 · Rating: 0 · rate: / Reply Quote

Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 95159 - Posted: 22 Apr 2020, 22:32:24 UTC - in response to Message 95141. Last modified: 22 Apr 2020, 22:51:22 UTC I'm new here and am wondering how Rosetta determines the amount of wu's to send each computer. The reason I ask is because I have had wu's cancelled before they are finished since they ran out of time. It's pretty much the same for all projects- they send a rough Estimate of how long it thinks it will take your system to return work. But since you're new to the project, it doesn't have any history for work done, and so that estimate can be way off. Since you are running more than 1 project, you would be much better off with no cache at all. At the very least, an extremely small one. On the top of this page, click on your name at the top right, then in your Account, under Preferences, When and how BOINC uses your computer, click on "Computing preferences." Down the bottom is a link to Edit. Computing Usage limits Use at most 100% of the CPUs Use at most 100% of CPU time When to suspend Suspend when computer is on battery (not selected) Suspend when computer is in use (not selected) Suspend GPU computing when computer is in use (not selected) 'In use' means mouse/keyboard input in last 3 minutes Suspend when no mouse/keyboard input in last --- minutes Suspend when non-BOINC CPU usage is above --- % Compute only between --- Other Store at least 0.1 days of work Store up to an additional 0.02 days of work Switch between tasks every 60 minutes Request tasks to checkpoint at most every 60 seconds Disk Use no more than 20 GB Leave at least 2 GB free Use no more than 60 % of total Memory When computer is in use, use at most 95 % When computer is not in use, use at most 95 % Leave non-GPU tasks in memory while suspended (not selected) Page/swap file: use at most 75 % Click on "Update changes." In the BOINC Manager, View, Advanced. 
Select Rosetta in the Project tab, then update. Those changes will then take effect. See how those settings go, particularly the Other settings. I have a system that is receiving 8 hour wu's that are continuously taking over 24 hours to run. In your account, under Preferences for this project, click on "Rosetta@home preferences". Set the Target CPU run time to "not selected" and Update to save them. That way it will use the default which is presently 8 hours. Any currently running tasks will use the old value, any non-running Tasks will use the new value when they start (once the Manager has contacted the Scheduler, or you have pressed Update in the Manager). Some Tasks will run longer than their Target CPU Runtime. They are able to run for up to 10 more hours, after which time the Watchdog timer will end the Task. Grant Darwin NT ID: 95159 · Rating: 0 · rate: / Reply Quote
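To see why a large cache combined with long-running tasks ends in missed deadlines, here is a back-of-the-envelope sketch. The 10-hour watchdog margin and the 8-hour default come from the post above, and the roughly 3-day deadline is the figure guessed earlier in the thread; the core count and queue size are hypothetical, and the real BOINC scheduler is more involved than this.

```python
import math

WATCHDOG_MARGIN_HOURS = 10  # "up to 10 more hours", per the post above


def watchdog_limit_hours(target_runtime_hours: float) -> float:
    """Latest point at which the watchdog is expected to end a task."""
    return target_runtime_hours + WATCHDOG_MARGIN_HOURS


def hours_to_drain_queue(tasks: int, cores: int, hours_per_task: float) -> float:
    """Wall-clock hours to finish a queue if each core runs one task at a time."""
    return math.ceil(tasks / cores) * hours_per_task


# Hypothetical host: 4 cores, 36 queued tasks, each taking 24 h of CPU time.
needed = hours_to_drain_queue(tasks=36, cores=4, hours_per_task=24)
deadline_hours = 3 * 24  # assumed 3-day deadline

print(f"watchdog limit at the 8 h default: {watchdog_limit_hours(8)} h")
print(f"queue needs {needed} h, deadline is {deadline_hours} h, "
      f"fits: {needed <= deadline_hours}")
```

With these made-up numbers the queue would need several times the deadline, which is the situation a small "store at least" value is meant to avoid.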

Michael E. Send message Joined: 5 Apr 08 Posts: 16 Credit: 2,007,414 RAC: 65	Message 95330 - Posted: 24 Apr 2020, 22:46:38 UTC I use a lot of BOINC projects. PrimeGrid applies a bonus for long-running tasks because most people like short-running tasks. For example, looking at CPU-only tasks: Subprojects with a 10% long job credit bonus have a recent average CPU time of 41:29:00 and 60:40:12 hours Subprojects with a 20% long job credit bonus have a recent average CPU time of 107:29:32 and 125:37:06 hours Other subprojects with longer run-times have long job and conjecture bonuses. To see details, create a PrimeGrid account and choose Your Account > PrimeGrid Preferences. Or send me a message and ask for a text/screen cap. The preferences also show completion times. I used to choose projects in part by measuring the points per CPU hour to find those with a high reward. Now I am concerned about medical science more than points. ID: 95330 · Rating: 0 · rate: / Reply Quote

RME Send message Joined: 4 Mar 20 Posts: 12 Credit: 1,211,010 RAC: 0	Message 95339 - Posted: 25 Apr 2020, 8:42:02 UTC - in response to Message 95330. I can't wait to get to 1,000,000 points so I can get my reward. ID: 95339 · Rating: 0 · rate: / Reply Quote

teacup_DPC Send message Joined: 3 Apr 20 Posts: 6 Credit: 2,744,282 RAC: 0	Message 95343 - Posted: 25 Apr 2020, 11:17:00 UTC - in response to Message 95123. but if we contributed every phone, tablet and low-mid end office machine, typically with 2-4 cores, our computing capacity could increase by orders of magnitude. (I.e., we have way less than a million hosts and there exist billions of personal computing devices in the world) For as many of those devices as there are, many are of such low capability they are of no use to many projects. And for those that are of use, their frequent use for what they were designed for by the users means they often can't contribute much during those periods, compared to more capable systems. Just to nuance this, I know people getting their old phones from below a layer of dust out of the chest of drawers and setting them to work. As I've understood they only can be functional with their display turned off, so I doubt if that very phone is available for normal use at all. And you need to keep in mind efficiency isn't actually about low peak or maximum power use- it is about energy used over time to complete a task. It's no good having a device use 1W if it takes 1 month to produce a result when something that uses 1kW can produce the same result in a matter of seconds. Yeah, its instantaneous power consumption is a lot higher. But it uses less energy to do the same work. And the fact it can do so much more work over the same period of time as the slower device makes it even more useful to a project. I read your point, and it sounds logical, but that coin has two sides. Phone hardware is also tailored to be super efficient, since it continuously needs to run on battery. Desktop hardware does not necessarily have this efficiency pedigree, though large steps have been made in miniaturizing the processor circuits. This phone sideline is a bit off topic perhaps, I admit. 
But your remark made me a bit curious; I need to find a GFLOP/W ratio or similar somewhere. Maybe you're completely right after all; I only caught myself thinking that I was not able to quantify your argument. I think it is an interesting topic in itself. But there is no need to marginalize our beloved Behemoth machines. I am always impressed by their work throughput in my team (Dutch Power Cows), saliva dripping from the corners of my mouth looking at those numbers. My older i5 and i7 processors stand their ground, but they are of another order. Independent of this 4GB discussion, my next processor will be a big Ryzen, that's for sure. Behemoths and more potent desktops will always remain a pillar of the capacity of distributed computing. Rosetta is stretching herself by trying to serve both the phone clients and the potent desktop clients with those 4GB jobs. Whether support for phones proves to be a long-term investment, time will tell, but there are a lot of (old) phones out there, and they represent a huge capacity. That an attempt is made to harvest this I can fully understand. (sorry, a bit off topic I fear) ID: 95343 · Rating: 0 · rate: / Reply Quote
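The quoted 1W-for-a-month versus 1kW-for-seconds argument can be quantified with energy = power × time. The sketch below takes "1 month" as 30 days and "a matter of seconds" as 10 seconds; both are assumptions made only for illustration, as are the function names.

```python
SECONDS_PER_DAY = 24 * 3600


def energy_wh(power_w: float, seconds: float) -> float:
    """Energy in watt-hours for a constant power draw over a duration."""
    return power_w * seconds / 3600.0


def break_even_seconds(slow_power_w: float, slow_seconds: float, fast_power_w: float) -> float:
    """How long the fast device may run before it has used as much energy
    as the slow device needs for the same job."""
    return slow_power_w * slow_seconds / fast_power_w


slow = energy_wh(1, 30 * SECONDS_PER_DAY)   # 1 W for an assumed 30-day month
fast = energy_wh(1000, 10)                  # 1 kW for an assumed 10 seconds
limit = break_even_seconds(1, 30 * SECONDS_PER_DAY, 1000)

print(f"slow device: {slow:.0f} Wh, fast device: {fast:.2f} Wh")
print(f"the 1 kW device stays ahead as long as it finishes within {limit / 60:.1f} minutes")
```

So under these assumptions the high-power machine wins on energy by a wide margin, and would still win even if the job took it tens of minutes rather than seconds. Whether a real phone and a real desktop stand in that ratio is exactly the GFLOP/W question raised in the post.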

Tom M Send message Joined: 20 Jun 17 Posts: 178 Credit: 36,855,866 RAC: 3,779	Message 95345 - Posted: 25 Apr 2020, 12:22:08 UTC - in response to Message 94950. I expect you do not want to end up with a bias toward 1GB or 4GB jobs, while both are needed. For the clients that can handle the 4GB jobs the bias should be neutral. Unless you expect a tendency towards more 4GB jobs with respect to 1 GB jobs, or the other way around, then you want a bias. That's the thinking. Overall, the effect should be neutral. People shouldn't lose out for processing these larger RAM requirement Tasks, and they shouldn't get a boost either. All the work is important, so if a Task stops 2 or more others from being processed at that time, it needs to offset that loss in production. Credits can't buy you a toaster, but they can let you see how you are doing, and how much you have done to help Rosetta. +1 Proud member of the O.F.A. (Old Farts Association) ID: 95345 · Rating: 0 · rate: / Reply Quote

bkil Send message Joined: 11 Jan 20 Posts: 97 Credit: 4,433,288 RAC: 0	Message 95348 - Posted: 25 Apr 2020, 13:27:12 UTC - in response to Message 95343. Last modified: 25 Apr 2020, 13:32:25 UTC I've already answered some of your questions above regarding efficiency and whatnot: - https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13833&postid=95140 If battery use is an issue for you, see also: - Running on a 4GB Raspberry Pi 4 - How to? - Running Rosetta on Raspberry Pi 3B+ (how to guide) - How to Recycle Android Phones for BOINC or Folding Rig Without Using Batteries, also runs on Amlogic Smart TV Boxes You can actually compute an approximate performance/watt quite easily from the CPU list shared earlier in this thread and some Wikipedia or datasheet lookups for power consumption. BOINC can run on many Android phones in the background regardless of whether you are using it or not. An aging phone from many years ago can still crank out as much RAC as a Raspberry Pi 4. It is usually set up so it only computes when it is on the charger and has finished its charging cycle during the night. At the same time, phones with iOS can only compute with DreamLab while the screen is on. As you've rightly noted, a PC is more universal and supports more projects. An SBC can be more power efficient in credit/watt or credit/$ and could take up less space, but you will need to maintain more nodes. With the right tools and experience, this shouldn't be an issue, but you should keep this in mind. So although we may not be able to declare a clear winner, it's good to be aware of all the options. ID: 95348 · Rating: 0 · rate: / Reply Quote

Ged Send message Joined: 17 Apr 06 Posts: 2 Credit: 1,034,115 RAC: 0	Message 95350 - Posted: 25 Apr 2020, 15:03:23 UTC - in response to Message 95082. For me, personally, I'm not driven by the credits granted for running work units; It's about contributing to the science, either by running work units which model a particular behaviour or sheer crunching of data for further treatment or research candidate selection/rejection. I'd rather see the application development and testing effort be expended producing efficient and effective code. I'd also like to see more realistic operational criteria being assigned to work units so as not to 'waste' computing effort (and electricity) by having my machines swamped with, often, spuriously defined deadlines, maybe by including some operational acceptance testing rather than just functional tests. That's my 10c's worth ;-) Ged Ged, I just wanted to clarify, are you basically suggesting that you'd like to see some way to control the deadline of the work you receive? Or have a way to only be assigned WUs that have 8 day deadlines? Or are you referring to cases where the BOINC Manager gets tricked into requesting more R@h work than is required to fill your work cache, and to complete before the 3 day deadlines? Mod.Sense Not to control the deadline of received WUs nor to only accept 8-day deadline WUs; it's more the latter case, but with some means to ensure that a WU has a realistic deadline for a given WU's payload. ID: 95350 · Rating: 0 · rate: / Reply Quote

teacup_DPC Send message Joined: 3 Apr 20 Posts: 6 Credit: 2,744,282 RAC: 0	Message 95422 - Posted: 27 Apr 2020, 14:17:18 UTC - in response to Message 95348. Last modified: 27 Apr 2020, 14:18:23 UTC Hi sangaku I found your https://boinc.bakerlab.org/rosetta/forum_thread.php?id=13791#94266 thread, read some of its first posts. I liked the questioning approach of it, and will direct responses concerning what hardware to use to that topic. Your Raspberry Pi 4 remark did set me thinking. Without doing the math I got a vision of a stack of these things, each taking 2 or 3 threads. Being Dutch, my financial domain, like yours, is Euros, and a Pi4 can be fetched in Holland for around 50-60 Euros. Storage and PSU for all those Pis should be approached in some clever combined way. First I will read that topic completely; probably the math will not add up, making a Pi 4 a no go. But only fantasizing about that pile of Pis made my morning a good one, though it probably was not the aim of your post :\|. Thanks! ID: 95422 · Rating: 0 · rate: / Reply Quote

Millenium Send message Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0	Message 95429 - Posted: 27 Apr 2020, 16:52:59 UTC I don't really care about credits; as long as they are consistent, so we can use them to judge the performance of different computers, it's fine. Instead, the main problem for WUs whose models take too much time is the checkpointing. Shutting down a PC and losing 6 hours of work isn't good. To solve this problem, other than of course changing how checkpointing works, a good idea is to let us choose if we want to get these WUs where checkpointing is problematic. If someone keeps their PC running 24/24 then they can get these WUs without problems. If instead someone shuts it down every day then it's better to avoid them. Sure, if checkpointing can be changed to save the progress no matter if a model is completed or not then no problem. ID: 95429 · Rating: 0 · rate: / Reply Quote

lazyacevw Send message Joined: 18 Mar 20 Posts: 12 Credit: 93,576,463 RAC: 0	Message 95449 - Posted: 27 Apr 2020, 22:36:03 UTC Last modified: 27 Apr 2020, 22:37:55 UTC My question about credits is, what is up with this guy? Within 3 days, he has the top three "fastest" computers by nearly a factor of 6. ID: 95449 · Rating: 0 · rate: / Reply Quote

Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1925 Credit: 18,534,891 RAC: 0	Message 95461 - Posted: 28 Apr 2020, 5:13:59 UTC - in response to Message 95449. My question about credits is, what is up with this guy? Within 3 days, he has the top three "fastest" computers by nearly a factor of 6. They are returning a lot of Tasks for such a small number of cores/threads. 0.72 day turn around. 8 hour runtime. 4,600 Tasks in progress on one system, over 6000 Valid. 0.72 day turn around. 8 hour runtime. 1,300 Tasks in progress on the others, roughly 1,650 each Valid on the others. Number of times client has contacted the server, 3 for one system. 0 for the others? Some sort of CPU compute cluster feeding its results through those host IDs? Grant Darwin NT ID: 95461 · Rating: 0 · rate: / Reply Quote
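The figures in the post above allow a rough estimate of how many threads must be working behind each host ID, using Little's law (items in the system = arrival rate × time in the system). This is only a sketch: it assumes the host is in steady state and that the quoted average turnaround applies to the tasks currently in progress, neither of which the thread confirms. The function name is made up for illustration.

```python
def implied_busy_threads(tasks_in_progress: int,
                         turnaround_days: float,
                         runtime_hours: float) -> float:
    """Estimate concurrently busy threads from queue depth and turnaround.

    Little's law gives throughput = tasks_in_progress / turnaround.
    Each task occupies one thread for runtime_hours, so the number of busy
    threads is throughput multiplied by the runtime (in the same time unit).
    """
    tasks_per_day = tasks_in_progress / turnaround_days
    return tasks_per_day * (runtime_hours / 24.0)


# Figures quoted in the post: 0.72 day turnaround, 8 hour runtime.
big_host = implied_busy_threads(4600, 0.72, 8)
other_hosts = implied_busy_threads(1300, 0.72, 8)

print(f"host with 4,600 tasks in progress: about {big_host:.0f} threads")
print(f"hosts with 1,300 tasks in progress: about {other_hosts:.0f} threads each")
```

If the assumptions hold, the estimate lands in the hundreds to thousands of threads per host ID, which is consistent with the suggestion that a cluster is reporting through those hosts.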

strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0	Message 95467 - Posted: 28 Apr 2020, 9:06:58 UTC Yes, pretty clearly using those hosts to somehow feed work to other cpus. Very clever but clearly not actually the top cpu. Does anyone know how he can do that out of interest? ID: 95467 · Rating: 0 · rate: / Reply Quote

Millenium Send message Joined: 20 Sep 05 Posts: 68 Credit: 184,283 RAC: 0	Message 95472 - Posted: 28 Apr 2020, 11:59:35 UTC Over 750.000 RAC with a single computer? Even a dual EPYC 7702 computer has no way to get such a high RAC. And his pc seems to have a single EPYC 7702P ID: 95472 · Rating: 0 · rate: / Reply Quote

[DPC]_Fatal_Error_Group~Bubbles Send message Joined: 17 Mar 06 Posts: 1 Credit: 382,602 RAC: 0	Message 95488 - Posted: 28 Apr 2020, 17:06:07 UTC Someone from DPC over here: we've notified the guy running the Nifhack account of this thread and asked if he wants, and is able to, clarify this. He's known for having access to huge amounts of computational power (at work, I believe) but can't deploy all of it all the time. He's also known to rarely part with specifics. My guess as well is that those machines are indeed some sort of front-end hosts for the computers behind them. ID: 95488 · Rating: 0 · rate: / Reply Quote