Minimum energy function.

Author	Message
adrianxw Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 268	Message 48585 - Posted: 12 Nov 2007, 8:53:11 UTC

How is the energy of a particular folded configuration calculated? I have seen several references which state that "given a configuration the energy can be quickly determined", but they don't go into any detail. Can I, for example, get a callable C/C++ function which, when presented with a sequence of amino acids and the angles between them, will return the energy of the configuration?

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
ID: 48585 · Rating: 0

hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0	Message 48605 - Posted: 13 Nov 2007, 7:01:19 UTC Last modified: 13 Nov 2007, 7:04:21 UTC

This has never been comprehensively explained. What it is not is the energy it takes to fold a protein. As I understand it, it has to do with the electrostatic potential, hydrophilic and hydrophobic effects, and some Newtonian (molecular dynamics) type forces. I don't think it is as easy as using the Poisson-Boltzmann equation, but I could be wrong; I'm just guessing.

From what I have read, there are "force fields"* around each atom, so atoms may attract each other, but squeezing them together causes them to repel very strongly (which increases the energy).

*Molecular dynamics terminology.
ID: 48605 · Rating: 0
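The attract-at-a-distance, repel-when-squeezed behaviour described above can be illustrated with the textbook Lennard-Jones 12-6 term used in many molecular mechanics force fields. This is only a sketch under stated assumptions: it is not Rosetta's energy function, the names `lennard_jones` and `pairwise_energy` are made up for this example, and the `epsilon` and `sigma` values are arbitrary placeholders rather than real atomic parameters.

```python
import math

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones energy for two atoms a distance r apart.

    epsilon (well depth) and sigma (zero-crossing distance) are
    placeholder values, not parameters from any real force field.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    # The r^-12 term is the steep repulsion, the r^-6 term the attraction.
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def pairwise_energy(coords, epsilon=1.0, sigma=1.0):
    """Sum the Lennard-Jones term over every unique pair of atoms."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, epsilon, sigma)
    return total
```

With these placeholder values the energy is positive for atoms closer than `sigma`, reaches its minimum of `-epsilon` at `r = 2**(1/6) * sigma`, and decays towards zero at long range. A real force field adds many more terms (electrostatics, solvation, bonded terms) on top of this.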

Michael G.R. Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0	Message 48631 - Posted: 13 Nov 2007, 21:37:48 UTC

The general idea, as I understand it, is that proteins are most stable in shapes that require the least energy to maintain. That's what they're calculating, but don't ask me about the details.
ID: 48631 · Rating: 0

hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0	Message 48690 - Posted: 15 Nov 2007, 7:07:41 UTC

If anybody would like to do some light reading :) It was written in 2001 by Richard Bonneau and David Baker, and covers a lot of things in the protein prediction field. The energy function is in there, but not spelled out: here. There is more around; here, to name one.
ID: 48690 · Rating: 0

adrianxw Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 268	Message 48691 - Posted: 15 Nov 2007, 8:32:17 UTC Last modified: 15 Nov 2007, 8:33:44 UTC

I have found various sites talking about energy functions. The common theme is that they tend to want to explain how they work rather than be a practical tool one can use. Thus, you need to be a biologist to get anything out of them.

My feeling is, and if you search the boards here you'll find it has been so since the start of the project, that an approach from a non-biologist may reveal ideas and techniques that someone already burdened with a wealth of scientific baggage may never consider.

I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, manipulation, etc.) could be applied to this mound of data.

My own "folder" converges a sequence from a random start point to a structure within a few % of the known structure in a few minutes, BUT, and it really is a huge but, I can only work with protein sequences where the end structure is already known, which really is never going to help anyone. The reason for this is that I have to estimate the energy of a configuration based on the amount of difference there is between my specimen and the target. I can never do a de-novo/ab-initio on an unknown because there is no way I can tell which configuration has the lower free energy.

Even a decent approximation would be useful, as I could then use my program as an input filter to provide "promising" start points for more sophisticated folders.
ID: 48691 · Rating: 1
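To make the "difference from the target" estimator concrete, here is a minimal sketch of one way such a stand-in score could be computed when a configuration is described by a list of angles in degrees. It is a guess at the general idea, not adrianxw's actual code; `angle_diff` and `target_deviation` are hypothetical names, and the mean squared angular deviation is just one plausible choice of measure.

```python
def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def target_deviation(angles, target):
    """Mean squared angular deviation from a known target structure.

    This is a stand-in score, not an energy: it can only be computed
    when the target angles are already known.
    """
    if len(angles) != len(target) or not angles:
        raise ValueError("angle lists must be non-empty and equal in length")
    return sum(angle_diff(a, t) ** 2 for a, t in zip(angles, target)) / len(angles)
```

The wrap-around matters because 170 and -170 degrees are only 20 degrees apart. A score of this kind also shows the limitation described in the post: without the target there is nothing to compare against, which is why a real energy function is needed for unknown structures.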

Mod.Sense Volunteer moderator Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 48697 - Posted: 15 Nov 2007, 14:33:47 UTC

Quote (adrianxw): I can only work with protein sequences where the end structure is already known, which really is never going to help anyone.

I wasn't clear on whether you were explaining why you want to understand the energy function, or whether you have a point of confusion about the project as a whole. So I'll add what I can.

Because Rosetta does have an energy function, it can predict protein conformation. As predictions are generated, if the structure is known, then another figure (RMSD) is computed which compares the prediction to the known structure. So Rosetta models can fly just fine without a known structure. Indeed, that is the whole point.

If you think about it, the energy function is the one thing that is probably changed with every new release of Rosetta. That is the heart of what drives the search. And if they spelled out every detail of it, they would be giving away a life's work. So, in addition to being complex to describe, it is constantly changing, and any sufficiently detailed description is instantly obsolete.

The good news is that the Rosetta game they are working on will give you a way to interact with the energy function without having to deal with the minutiae of all the numbers.

Rosetta Moderator: Mod.Sense
ID: 48697 · Rating: 0
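For readers unfamiliar with RMSD (root-mean-square deviation), the basic calculation can be sketched as below. This is a simplified illustration, not Rosetta's code: it assumes the two structures are given as equal-length lists of (x, y, z) coordinates that have already been superimposed, whereas structure comparison tools normally find the best rigid-body superposition first. The function name `rmsd` is chosen for this example.

```python
import math

def rmsd(predicted, known):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumes both structures are already superimposed; no alignment
    step is performed here.
    """
    if len(predicted) != len(known) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (px - kx) ** 2 + (py - ky) ** 2 + (pz - kz) ** 2
        for (px, py, pz), (kx, ky, kz) in zip(predicted, known)
    )
    return math.sqrt(squared / len(predicted))
```

Note that RMSD measures how close a prediction is to a known answer, while the energy function scores a prediction without needing the answer. The two play different roles: energy drives the search, RMSD checks it afterwards.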

svincent Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0	Message 48789 - Posted: 18 Nov 2007, 21:09:24 UTC - in response to Message 48697.

Quote (Mod.Sense): If you think about it, the energy function is the one thing that is probably changed with every new release of Rosetta. That is the heart of what drives the search. And if they spelled out every detail of it, they would be giving away a life's work. So, in addition to being complex to describe, it is constantly changing. So any sufficiently detailed description is instantly obsolete.

This bothers me a bit. Why wouldn't the Rosetta team want to publish their methods (once they're stable)? It's been made clear that Rosetta is a scientific project and not a commercial one, a point emphasized in large bold type at the top of the home page. So there's no financial reason to keep the methods secret, and it's normal practice in science to publish results in as much detail as is necessary to let others reproduce the work; in fact there's a lot of incentive to do this.

A bit of clarification on this point would be appreciated.
ID: 48789 · Rating: 0

Luuklag Joined: 13 Sep 07 Posts: 262 Credit: 4,171 RAC: 0	Message 48837 - Posted: 19 Nov 2007, 20:37:43 UTC - in response to Message 48789.

Quote (Mod.Sense): If you think about it, the energy function is the one thing that is probably changed with every new release of Rosetta. That is the heart of what drives the search. And if they spelled out every detail of it, they would be giving away a life's work. So, in addition to being complex to describe, it is constantly changing. So any sufficiently detailed description is instantly obsolete.

Quote (svincent): This bothers me a bit. Why wouldn't the Rosetta team want to publish their methods (once they're stable)? It's been made clear that Rosetta is a scientific project and not a commercial one, a point emphasized in large bold type at the top of the home page. So there's no financial reason to keep the methods secret, and it's normal practice in science to publish results in as much detail as is necessary to let others reproduce the work; in fact there's a lot of incentive to do this. A bit of clarification on this point would be appreciated.

Well, they constantly find better and faster ways to predict, so there won't be a really stable version in the near future, and sending out stuff that changes every 2 months isn't really helping much.
ID: 48837 · Rating: 0

proxima Joined: 9 Dec 05 Posts: 44 Credit: 4,148,186 RAC: 0	Message 48858 - Posted: 20 Nov 2007, 11:05:30 UTC

Quote (adrianxw): I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, manipulation, etc.) could be applied to this mound of data.

Me too, and that's a very interesting thought. For example, no doubt someone somewhere has tried applying Genetic Algorithms to protein folding? GAs are a very efficient search algorithm when applied to certain kinds of problems, and they need a "fitness function" to measure the "error" of a particular solution; presumably the energy of a folded conformation would be a usable fitness function.

As I said, I'm sure this has been tried many times before by people who know far more about biology and Genetic Algorithms than me. Still an interesting thought though...

Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.
ID: 48858 · Rating: 0
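The idea of using an energy as a GA fitness function can be sketched in a few lines. Everything here is an assumption for illustration: `toy_energy` is a made-up stand-in with no physical meaning, `evolve` is a hypothetical name, and the selection, crossover and mutation settings are arbitrary. Any callable that maps a list of angles to a number could be passed in place of `toy_energy`.

```python
import random

def toy_energy(angles):
    """Stand-in fitness, NOT a physical energy: zero when every angle is -60."""
    return sum((a + 60.0) ** 2 for a in angles)

def evolve(energy, n_angles, pop_size=30, generations=50, sigma=10.0, seed=0):
    """Minimise energy(angles) with a very small genetic algorithm.

    Requires pop_size >= 4 so that two distinct parents can be drawn.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=energy)
        survivors = pop[: pop_size // 2]  # the best half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_angles) if n_angles > 1 else 0
            child = mum[:cut] + dad[cut:]  # one-point crossover
            child = [a + rng.gauss(0.0, sigma) for a in child]  # mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=energy)
```

A call such as `evolve(toy_energy, 5)` returns the lowest-scoring angle list found. Because the best half of each generation survives untouched, the best score never gets worse from one generation to the next; how quickly it improves depends entirely on the fitness function supplied.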

adrianxw Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 268	Message 48910 - Posted: 21 Nov 2007, 17:49:44 UTC Last modified: 21 Nov 2007, 17:53:44 UTC

My own "folder" mentioned earlier is a mutating genetic evolver. Getting a better energy function was my motivation for the thread.
ID: 48910 · Rating: 0

hugothehermit Joined: 26 Sep 05 Posts: 238 Credit: 314,893 RAC: 0	Message 49260 - Posted: 1 Dec 2007, 6:18:21 UTC Last modified: 1 Dec 2007, 6:23:20 UTC

I guess no one is going to answer this here or at POEM@home :(

I've had a bit to drink, but here goes. I believe Tinker has an energy function, and I also believe I read somewhere that it has been converted into C or C++.

Quote (adrianxw): I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, manipulation, etc.) could be applied to this mound of data.

You should be able to do this with the protein structures in the PDB. I'm not sure how.

Quote (adrianxw): I can never do a de-novo/ab-initio on an unknown because there is no way I can tell which configuration has the lower free energy. Even a decent approximation would be useful, as I could then use my program as an input filter to provide "promising" start points for more sophisticated folders.

The way (I think) they do this is to take homologous proteins (use PSI-BLAST at PDB), cut them into pieces (somewhere between 3 and 12 amino acids, though of course this would depend on lots of things), fit those pieces back together, then check whether enough water-hating amino acids are going to be on the inside of the protein structure. I don't think that minimum energy has anything to do with this step.

You should also disregard the known structure so as not to contaminate your data.

Hopefully that made sense.
ID: 49260 · Rating: 0
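The "are enough water-hating amino acids on the inside" check could be approximated very crudely as below. This is a sketch under several assumptions and not how Rosetta does it: each residue is reduced to a single (x, y, z) point, "inside" is taken to mean closer to the centroid than the average residue, and the `HYDROPHOBIC` set is one common textbook grouping of one-letter codes. The name `buried_hydrophobic_fraction` is invented for this example.

```python
import math

# One common grouping of hydrophobic residues (an assumption, groupings vary).
HYDROPHOBIC = set("AVLIMFWC")

def buried_hydrophobic_fraction(sequence, coords):
    """Fraction of hydrophobic residues lying closer to the centroid
    than the mean residue-to-centroid distance."""
    if len(sequence) != len(coords) or not sequence:
        raise ValueError("sequence and coords must be non-empty and equal in length")
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    dists = [math.dist(c, centroid) for c in coords]
    cutoff = sum(dists) / n
    hydrophobic = [d for aa, d in zip(sequence, dists) if aa in HYDROPHOBIC]
    if not hydrophobic:
        return 0.0
    return sum(d < cutoff for d in hydrophobic) / len(hydrophobic)
```

A value near 1.0 suggests the hydrophobic residues sit mostly in the core, and a value near 0.0 suggests they are mostly on the surface. A filter of this kind needs no known structure, which is what makes it usable on reassembled fragments.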

adrianxw Joined: 18 Sep 05 Posts: 662 Credit: 12,140,580 RAC: 268	Message 49352 - Posted: 3 Dec 2007, 15:00:30 UTC Last modified: 3 Dec 2007, 15:04:54 UTC

Quote (hugothehermit): I guess no one is going to answer this here or at POEM@home :(

I had a PM from POEM explaining why they could not give me their function at this time, but they did not rule out making it available at a future date when the causative issue was resolved.

In the meantime, I have been talking to an acquaintance here who was part of a team involved in one of the earlier CASP experiments. Obviously, the energy functions in use back then would be inferior, but it may give me a handle on how much CPU time I would need to use my algorithms with a real function rather than my somewhat crude estimator.

Quote (hugothehermit): I read somewhere that it has been converted into C or C++.

Unless the original source is in a really esoteric language, I doubt there would be a problem adapting it. I know a lot of scientific code is still written in Fortran, but I coded in Fortran IV, 66, 77 and 9x from the late 1970s until about 1996. I also have a good Fortran compiler and have never had problems linking across languages after recognising a few representational issues which I can correct for.
ID: 49352 · Rating: 0