Minimum energy function.

Message boards : Rosetta@home Science : Minimum energy function.

adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 62
Message 48585 - Posted: 12 Nov 2007, 8:53:11 UTC

How is the energy of a particular folded configuration calculated? I have seen several references which state that "given a configuration the energy can be quickly determined", but they don't go into any detail.

Can I, for example, get a callable C/C++ function which, when presented with a sequence of amino acids (AAs) and the angles between them, will return the energy of the configuration?
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 48605 - Posted: 13 Nov 2007, 7:01:19 UTC
Last modified: 13 Nov 2007, 7:04:21 UTC

This has never been comprehensively explained.

What it is not: the energy it takes to fold a protein.

As I understand it, it has to do with the electrostatic potential, hydrophilic and hydrophobic effects, and some Newtonian (molecular dynamics) type forces.

I don't think that it is as easy as using the Poisson-Boltzmann equation.

I could be wrong, I'm just guessing.

From what I have read there are "force fields"* around each atom, so atoms may attract each other, but squeezing them together causes them to repel very strongly (which increases the energy).

*Molecular Dynamics terminology.
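
The attract-at-a-distance, repel-when-squeezed behaviour described above is often illustrated with the Lennard-Jones 12-6 potential. The sketch below is a textbook form in reduced units, not Rosetta's actual energy term; the function name and the default `epsilon` and `sigma` values are assumptions chosen for illustration.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at separation r (same length unit as sigma).

    epsilon sets the depth of the attractive well, sigma the separation at
    which the energy crosses zero. Both defaults are illustrative only.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    s6 = (sigma / r) ** 6
    # s6 * s6 is the steep repulsive term, -s6 the weaker attractive term.
    return 4.0 * epsilon * (s6 * s6 - s6)


# Separation of minimum energy when sigma = 1.
r_min = 2 ** (1 / 6)

for r in (0.9, 1.0, r_min, 1.5, 2.5):
    print(f"r = {r:.3f}  E = {lennard_jones(r):+.4f}")
```

Below `sigma` the energy is positive and climbs steeply (the atoms are being squeezed together); beyond it the energy is negative (a weak attraction) and fades towards zero at large separation.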

Michael G.R.
Joined: 11 Nov 05
Posts: 264
Credit: 11,247,510
RAC: 0
Message 48631 - Posted: 13 Nov 2007, 21:37:48 UTC

The general idea, as I understand it, is that proteins are most stable in shapes that require the least energy to maintain. That's what they're calculating, but don't ask me about the details.

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 48690 - Posted: 15 Nov 2007, 7:07:41 UTC

If anybody would like to do some light reading :)

It was written in 2001 by Richard Bonneau and David Baker, and covers a lot of things in the protein prediction field.
The energy function is in there, but not spelled out.
here

There is more around, here to name one

adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 62
Message 48691 - Posted: 15 Nov 2007, 8:32:17 UTC
Last modified: 15 Nov 2007, 8:33:44 UTC

I have found various sites talking about energy functions. The common theme is that they tend to want to explain how they work rather than be a practical tool one can use. Thus, you need to be a biologist to get anything out of them.

My feeling is, and if you search the boards here, you'll find it has been so since the start of the project, that an approach from a non-biologist may reveal ideas and techniques that someone already burdened with a wealth of scientific baggage may never consider.

I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, data manipulation, etc.) could be applied to this mound of data.

My own "folder" converges a sequence from a random start point to a structure within a few % of the known structure in a few minutes, BUT, and it really is a huge but, I can only work with protein sequences where the end structure is already known, which really is never going to help anyone. The reason for this is that I have to estimate the energy of a configuration based on the amount of difference there is between my specimen and the target.
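
A target-based estimate of this kind can be sketched as follows. This is only an illustration, assuming each configuration is stored as a flat list of backbone angles in degrees; the post does not say how the real estimator is built, and both function names are hypothetical.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def pseudo_energy(candidate, target):
    """Root-mean-square angular deviation between two angle lists (degrees).

    Lower is "better", but the value is only defined when the target
    structure is already known, which is the limitation described above.
    """
    if not candidate or len(candidate) != len(target):
        raise ValueError("angle lists must be non-empty and of equal length")
    total = sum(angle_diff(c, t) ** 2 for c, t in zip(candidate, target))
    return math.sqrt(total / len(candidate))


print(pseudo_energy([-60.0, -45.0, 179.0], [-57.0, -47.0, -179.0]))
```

The wrap-around in `angle_diff` matters because 179 degrees and -179 degrees are only 2 degrees apart, not 358.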

I can never do a de-novo/ab-initio on an unknown because there is no way I can tell which configuration has the lower free energy. Even a decent approximation would be useful as I could then use my program as an input filter to provide "promising" start points for more sophisticated folders.

Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 48697 - Posted: 15 Nov 2007, 14:33:47 UTC

I can only work with protein sequences where the end structure is already known, which really is never going to help anyone.


I wasn't clear on whether you were explaining why you want to understand the energy function, or if you have a point of confusion about the project as a whole. So I'll add what I can.

Because Rosetta *does* have an energy function, they *can* predict protein conformation. As predictions are generated, *if* the structure is known, then another figure (RMSD) is computed which compares the prediction to the known structure.
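
As an illustration of the comparison figure mentioned above, here is a minimal RMSD sketch. It assumes both structures are given as equal-length lists of (x, y, z) coordinates that have already been superimposed; a real comparison would first find the best rotation and translation, which this sketch skips. The function name is hypothetical and is not taken from Rosetta.

```python
import math


def rmsd(predicted, known):
    """Root-mean-square deviation between two pre-aligned coordinate lists.

    Each argument is a list of (x, y, z) tuples, one per atom, in the same
    atom order. No superposition is performed here (an assumption).
    """
    if not predicted or len(predicted) != len(known):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(predicted, known):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(predicted))


prediction = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.5, 0.2, 0.0), (3.0, 0.0, 0.0)]
print(f"RMSD = {rmsd(prediction, reference):.3f}")
```

Note that RMSD needs the known structure, whereas the energy function does not; that is why the energy can drive the search on an unknown protein and RMSD can only be used to check the result afterwards.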

So Rosetta models can fly just fine without a known structure. Indeed that is the whole point.

If you think about it, the energy function is the one thing that is probably changed with every new release of Rosetta. That is the heart of what drives the search. And if they spelled out every detail of it, they would be giving away a life's work. So, in addition to being complex to describe, it is constantly changing. So any sufficiently detailed description is instantly obsolete.

The good news is that the Rosetta game they are working on will give you a way to interact with the energy function without having to deal with the minutiae of all the numbers.
Rosetta Moderator: Mod.Sense

svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 48789 - Posted: 18 Nov 2007, 21:09:24 UTC - in response to Message 48697.  

If you think about it, the energy function is the one thing that is probably changed with every new release of Rosetta. That is the heart of what drives the search. And if they spelled out every detail of it, they would be giving away a life's work. So, in addition to being complex to describe, it is constantly changing. So any sufficiently detailed description is instantly obsolete.


This bothers me a bit. Why wouldn't the Rosetta team want to publish their methods (once they're stable)? It's been made clear that Rosetta is a scientific project and not a commercial one, a point emphasized in large bold type at the top of the home page. So there's no financial reason to keep the methods secret, and it's normal practice in science to publish results in as much detail as is necessary to let others reproduce the work: in fact there's a lot of incentive to do this. A bit of clarification on this point would be appreciated.

Luuklag
Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48837 - Posted: 19 Nov 2007, 20:37:43 UTC - in response to Message 48789.  

Well, they constantly find better/faster ways to predict, so there won't be a really stable version in the near future, and sending out stuff that changes every 2 months isn't really helping much.

proxima
Joined: 9 Dec 05
Posts: 44
Credit: 4,148,186
RAC: 0
Message 48858 - Posted: 20 Nov 2007, 11:05:30 UTC

I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, data manipulation, etc.) could be applied to this mound of data.


Me too, and that's a very interesting thought. For example, no doubt someone somewhere has tried applying Genetic Algorithms to protein folding? (GAs can be a very efficient search algorithm when applied to certain kinds of problems, and they need a "fitness function" to measure the "error" of a particular solution; presumably the energy of a folded conformation would be a usable fitness function.)

As I said, I'm sure this has been tried many times before by people who know far more about biology *and* Genetic Algorithms than me. Still an interesting thought though...
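
To make the idea concrete, here is a minimal genetic-algorithm sketch in which the energy is the fitness function (lower is fitter). Everything in it is an assumption made for illustration: a conformation is a flat list of angles in degrees, `toy_energy` is a placeholder bowl-shaped function and not a real protein energy, and the selection, crossover and mutation choices are just one simple option.

```python
import random


def evolve(energy, n_angles, pop_size=30, generations=200, sigma=10.0, seed=0):
    """Minimise energy(angles) with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=energy)
        parents = pop[: pop_size // 2]  # truncation selection: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover (degenerates to a copy of b for one angle).
            cut = rng.randrange(1, n_angles) if n_angles > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutate one randomly chosen angle with Gaussian noise.
            # Angles are not wrapped back into [-180, 180] in this sketch.
            i = rng.randrange(n_angles)
            child[i] += rng.gauss(0.0, sigma)
            children.append(child)
        pop = parents + children
    return min(pop, key=energy)


def toy_energy(angles):
    """Placeholder fitness: a quadratic bowl with its minimum at -60 degrees."""
    return sum((a + 60.0) ** 2 for a in angles)


best = evolve(toy_energy, n_angles=5)
print([round(a, 1) for a in best], round(toy_energy(best), 3))
```

Because the best half of the population always survives unchanged, the best energy found can never get worse from one generation to the next. Swapping `toy_energy` for a real energy function is exactly the missing piece this thread is about.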
Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365.

adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 62
Message 48910 - Posted: 21 Nov 2007, 17:49:44 UTC
Last modified: 21 Nov 2007, 17:53:44 UTC

My own "folder" mentioned earlier is a mutating genetic evolver. Getting a better energy function was my motivation for the thread.

hugothehermit
Joined: 26 Sep 05
Posts: 238
Credit: 314,893
RAC: 0
Message 49260 - Posted: 1 Dec 2007, 6:18:21 UTC
Last modified: 1 Dec 2007, 6:23:20 UTC

I guess no one is going to answer this here or at POEM@home :(

I've had a bit to drink, but here goes.

I believe Tinker has an energy function; I also believe that I read somewhere that it has been converted into C or C++.

I am a software engineer. If you forget that amino acids are chemicals and that sequences become proteins, and simply think of the parameters as numbers, a host of computer science techniques (data mining, data manipulation, etc.) could be applied to this mound of data.


You should be able to do this with the protein structures in the PDB. I'm not sure how.


I can never do a de-novo/ab-initio on an unknown because there is no way I can tell which configuration has the lower free energy. Even a decent approximation would be useful as I could then use my program as an input filter to provide "promising" start points for more sophisticated folders.


The way (I think) they do this is to take homologous proteins (found with PSI-BLAST against the PDB), cut them into pieces (somewhere between 3 and 12 amino acids, though of course this would depend on lots of things), fit those pieces back together, and then check whether enough water-hating (hydrophobic) amino acids are going to be on the inside of the protein structure. I don't think that minimum energy has anything to do with this step. You should also disregard the known structure so as not to contaminate your data.
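
The final check in that description could be sketched roughly as below. This is a guess at the idea, not Rosetta's method: it assumes one coordinate per residue, treats "inside" as "closer to the centroid than the median residue", and uses a simplified hydrophobic set; classifications vary between sources, and all names here are hypothetical.

```python
import math

# Simplified set of hydrophobic one-letter residue codes (an assumption).
HYDROPHOBIC = set("AVILMFWC")


def buried_hydrophobic_fraction(sequence, coords):
    """Fraction of hydrophobic residues lying nearer the centroid than the median.

    sequence is a string of one-letter codes; coords is a list of (x, y, z)
    tuples, one per residue, in the same order.
    """
    n = len(sequence)
    if n == 0 or n != len(coords):
        raise ValueError("need one coordinate per residue")
    centroid = tuple(sum(p[i] for p in coords) / n for i in range(3))
    dists = [math.dist(p, centroid) for p in coords]
    cutoff = sorted(dists)[n // 2]  # upper median distance from the centroid
    hydrophobic = [d for aa, d in zip(sequence, dists) if aa in HYDROPHOBIC]
    if not hydrophobic:
        return 0.0
    return sum(d < cutoff for d in hydrophobic) / len(hydrophobic)


coords = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (-5.0, 0.0, 0.0)]
print(buried_hydrophobic_fraction("LKLK", coords))  # leucines near the centre
print(buried_hydrophobic_fraction("KLKL", coords))  # leucines on the outside
```

A candidate assembly with a low fraction could be discarded before any more expensive scoring is attempted.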

Hopefully that made sense.

adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 62
Message 49352 - Posted: 3 Dec 2007, 15:00:30 UTC
Last modified: 3 Dec 2007, 15:04:54 UTC

I guess no one is going to answer this here or at POEM@home :(

I had a PM from POEM explaining why they could not give me their function at this time, but did not rule out making it available at a future date when the causative issue was resolved.

In the meantime, I have been talking to an acquaintance here who was part of a team involved in one of the earlier CASP experiments. Obviously, the energy functions in use back then would be inferior, but it may give me a handle on how much CPU time I would need to use my algorithms with a real function rather than my somewhat crude estimator.

I read somewhere that it has been converted into C or C++.

Unless the original source is in a really esoteric language, I doubt there would be a problem adapting it. I know a lot of scientific code is still written in Fortran, but I coded in Fortran IV, 66, 77 and 9x from the late 1970s until about 1996. I also have a good Fortran compiler and have never had problems linking across languages once I recognised a few representational issues, which I can correct for.



©2024 University of Washington
https://www.bakerlab.org