protein-protein docking at Rosetta@Home

Author	Message
Chu Send message Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0	Message 29146 - Posted: 11 Oct 2006, 1:56:50 UTC Last modified: 13 Oct 2006, 1:26:39 UTC Protein-protein docking is a computational task that aims to predict the structure of a protein complex, given that the structures of the individual protein partners have been solved. In update 5.32, we made the Rosetta protein docking protocol compatible with Rosetta@Home so that we can take advantage of the computational power of BOINC distributed computing and, of course, the generous contribution of users all over the world. Thank you, everyone, for the help! My name is Chu Wang and I am a graduate student in Dr. Baker's lab, working on new methodology to better understand the protein docking problem. Below is some background information about this project.
Docking (in biology) refers to computational techniques that aim to predict the interaction between two or more biological molecules. Such interactions can be between proteins, between proteins and DNA (or RNA), or between proteins and small chemical compounds (ligands). Docking can therefore be classified as protein-protein docking, protein-DNA docking, and protein-ligand docking. Protein interactions are very important in biology because proteins do not act alone; they have to "talk" to each other to accomplish any biological process. Solving the structure of a protein complex can provide a mechanistic basis for understanding how a biological signal is transduced or how a biological function is carried out. It is much more time-consuming and technically challenging to solve the structure of a protein complex experimentally, so developing computational methods for this problem is of great interest to many groups. Besides predicting the structure of a protein from its sequence, Rosetta has also been developed to handle the various types of docking problems mentioned above, including protein-protein docking.
In our standard protein-protein docking protocol, we start with two protein structures in space and first carry out a very fast but crude search to find a rough shape fit between the two proteins. During this first stage, the proteins are represented only by their backbones (which define the shape) plus one pseudo-atom per sidechain (which is why it is fast). Afterwards, the sidechain atoms are added back and the protocol enters the full-atom refinement stage, in which the relative orientation of the two proteins and the detailed sidechain interactions across the interface are optimized simultaneously. Each trajectory ends with a model in a particular docking orientation, and we use an energy function to rank the models.
The complexity of a docking problem can vary a lot. Proteins are flexible and dynamic biological molecules, which means that their 3-D structures may change under different conditions. Such flexibility can be observed at both the backbone and the sidechain level. So it is quite possible that the protein structures we start with, in their isolated (unbound) form, look different from those in the final complex (bound) form.
1. If no internal degrees of freedom are considered for each protein, it is like docking two "rocks" together, and this is called "rigid-body" docking. Only six parameters are variable: the three translations and three rotations that describe the relative orientation of the two proteins.
2. As mentioned above, the current standard Rosetta docking method takes sidechain flexibility into account, though the protein backbones are still kept fixed. We may call this approach "semi rigid-body" docking.
3. The next level of the docking problem is "flexible-backbone" docking, which allows the protein backbones to vary as well. This is very challenging because, in addition to sampling the rigid-body orientation, we also have to take care of the "folding" problem of TWO proteins.
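To make the "six parameters" of rigid-body docking concrete, here is a minimal Python sketch. It is not Rosetta code: the function names, the choice of rotation angles about the x, y and z axes, and the order in which they are applied are all assumptions made only for illustration. It moves one partner (a list of atom coordinates) as a single rigid object using three rotations and three translations.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Build a 3x3 rotation matrix from three angles in radians.

    Assumed convention (illustrative only): rotate about x, then y, then z,
    i.e. the matrix product Rz * Ry * Rx.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def move_rigid_body(coords, rx, ry, rz, tx, ty, tz):
    """Apply the six rigid-body parameters to every atom of one partner.

    coords is a list of (x, y, z) tuples. Internal geometry is untouched,
    so all distances between atoms of the moved partner are preserved.
    """
    rot = rotation_matrix(rx, ry, rz)
    moved = []
    for x, y, z in coords:
        moved.append((
            rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + tx,
            rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + ty,
            rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + tz,
        ))
    return moved


# Hypothetical two-atom "protein": rotate 90 degrees about z, shift 5 units along z.
partner = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(move_rigid_body(partner, 0.0, 0.0, math.pi / 2, 0.0, 0.0, 5.0))
```

A search over docking orientations would vary only these six numbers. Sidechain or backbone flexibility adds further degrees of freedom on top of them, which is why the second and third levels in the list above are progressively harder.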
Similar to the CASP experiment for structure prediction, there is a blind docking prediction experiment, CAPRI, in which two protein structures are provided and participants are asked to predict the structure of the complex. Using Rosetta, the Baker Lab team has submitted high-quality predictions for several targets whose backbones do not vary much between the unbound and bound forms, and some of these predictions are even accurate at atomic resolution (both backbone and sidechains are correct). This shows the strength of our protocol in allowing sidechains to be optimized. However, an important lesson learned from the CAPRI experiments is that the current bottleneck in developing protein docking methods is how to treat backbone flexibility, as there was almost unanimous failure on the targets with backbone movements upon complex formation. Currently, I am working to develop new approaches that take backbone flexibility into account in our Rosetta docking method, and I believe the computational power provided by BOINC and the many people who volunteer to donate their computer resources is a key factor in the success of this challenging project. Again, I would like to thank all the Rosetta@Home users for their generous help and contribution. ID: 29146 · Rating: 6 · rate: / Reply Quote

ronalds8 Send message Joined: 14 Nov 06 Posts: 10 Credit: 744 RAC: 0	Message 31172 - Posted: 15 Nov 2006, 6:46:51 UTC Great explanation, thanks. This type of research really has wide practical potential. ID: 31172 · Rating: 0 · rate: / Reply Quote

Hypermarkup Send message Joined: 3 Mar 06 Posts: 7 Credit: 112,275 RAC: 0	Message 31427 - Posted: 19 Nov 2006, 17:59:44 UTC Yeah, good work. Go on. Hypermarkup Fotowing ID: 31427 · Rating: 1 · rate: / Reply Quote

EdMulock Send message Joined: 14 Mar 06 Posts: 30 Credit: 2,347,485 RAC: 0	Message 31465 - Posted: 20 Nov 2006, 16:33:51 UTC Great. This kind of scientific effort is the reason I donate my crunch time to Rosetta. ID: 31465 · Rating: 0 · rate: / Reply Quote

AnRM Send message Joined: 18 Sep 05 Posts: 123 Credit: 1,355,486 RAC: 0	Message 31615 - Posted: 24 Nov 2006, 6:37:10 UTC Last modified: 24 Nov 2006, 6:40:30 UTC I really appreciate the time you took to explain to us 'crunchers' what you hope to accomplish and how we can help in your quest. Starting today, we are running 100% Rosetta.....Thanks, Rog. ID: 31615 · Rating: 0 · rate: / Reply Quote

288VKYUjwsXfAaTXn6SFJC4LVPRf Send message Joined: 16 Dec 05 Posts: 31 Credit: 153,110 RAC: 0	Message 31624 - Posted: 24 Nov 2006, 9:06:06 UTC Thank you for the daily updates and info on the project. It makes us feel we really can make a difference. Other projects are not so open about their goals. At Rosetta we really got the impression we are participating. ID: 31624 · Rating: 0 · rate: / Reply Quote

Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0	Message 46755 - Posted: 22 Sep 2007, 4:43:01 UTC Bump. Many of you probably have not seen this info. before. Rosetta Moderator: Mod.Sense ID: 46755 · Rating: 0 · rate: / Reply Quote

jtibble Send message Joined: 8 Oct 06 Posts: 4 Credit: 570,722 RAC: 0	Message 52214 - Posted: 3 Apr 2008, 4:24:22 UTC With all the CAPRI 15 WUs I've been getting lately, I think this thread needs a well-deserved BUMP. And one quick question... Even though the RMSD value isn't known for each target, are we able to see all of our results grouped together with our own contributions as the red dots on the "Results and Plots of Active Work Units" section of our profiles? Keep up the great work everyone! -John ID: 52214 · Rating: 0 · rate: / Reply Quote

Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0	Message 52226 - Posted: 3 Apr 2008, 23:07:25 UTC Thanks for bumping it up. I remember reading it back then, but most of it went over my head. Now that I know a bit more biology, I found Chu's explanation fascinating and I'm happy that our CPUs are helping. I would love an update from Chu on how much progress has been made with regards to flexible-backbone docking & rosetta@home. Thanks, ID: 52226 · Rating: 0 · rate: / Reply Quote

proxima Send message Joined: 9 Dec 05 Posts: 44 Credit: 4,148,186 RAC: 0	Message 52462 - Posted: 15 Apr 2008, 9:05:17 UTC What an excellent article - can't believe I missed it all that time ago, and have been wondering what CAPRI is ever since. (Although not curious enough to Google it, I guess). Thanks for the explanation in language I can just about manage, and thanks to the various people who've bumped it over the months. I have a couple of CAPRI WUs on the go at the moment, and it's great to know what they are. Any update would be great when there's any news. Alver Valley Software Ltd - Contributing ALL our spare computing power to BOINC, 24x365. ID: 52462 · Rating: 0 · rate: / Reply Quote

Chu Send message Joined: 23 Feb 06 Posts: 120 Credit: 112,439 RAC: 0	Message 56503 - Posted: 29 Oct 2008, 6:08:26 UTC Hello everyone, sorry that I have not been around in this forum for a while. Here are some updates on my work on protein-protein docking, which would not have been possible without your contributions. Based on the data produced by Rosetta@Home, we published a research article last year, and I am posting the abstract below. In summary, we found that introducing backbone flexibility into Rosetta docking simulations helps, to some extent, to improve both sampling around the native conformation (that is, the new treatment gives us access to native conformational space that was inaccessible under the rigid-backbone assumption) and energy discrimination between native-like and non-native-like models (which means we are more likely to pick out correct structures based on energy). However, the overall problem is still quite challenging because the sampling space is dramatically enlarged when backbone flexibility is introduced, especially when there is no external information that can be used as a hint to derive possible types of backbone movement. This is essentially the problem of folding two proteins independently and then docking them. This published paper concluded my Ph.D. thesis, and I am personally very grateful to all Rosetta@Home users for their generous help. While I have gone on to take up some different projects (structure prediction of zinc-binding proteins is just one of them!), other people in the lab are continuing to develop new algorithms in Rosetta to improve our docking method. All the existing functionality in Rosetta has been successfully ported over to the minirosetta application (now with the long-awaited docking graphics), and they are ready to test it out on Rosetta@Home.
Actually, there is another round of blind docking prediction competition CAPRI (similar to CASP) which was just announced and I believe soon you will get new WUs crunching models to help us find the correct solution for a protein complex with unknown (at least currently to us) structure. Thanks again for everyone's contribution. J Mol Biol. 2007 Oct 19;373(2):503-19. Epub 2007 Aug 2. Protein-protein docking with backbone flexibility. Wang C, Bradley P, Baker D. Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA. Computational protein-protein docking methods currently can create models with atomic accuracy for protein complexes provided that the conformational changes upon association are restricted to the side chains. However, it remains very challenging to account for backbone conformational changes during docking, and most current methods inherently keep monomer backbones rigid for algorithmic simplicity and computational efficiency. Here we present a reformulation of the Rosetta docking method that incorporates explicit backbone flexibility in protein-protein docking. The new method is based on a "fold-tree" representation of the molecular system, which seamlessly integrates internal torsional degrees of freedom and rigid-body degrees of freedom. Problems with internal flexible regions ranging from one or more loops or hinge regions to all of one or both partners can be readily treated using appropriately constructed fold trees. The explicit treatment of backbone flexibility improves both sampling in the vicinity of the native docked conformation and the energetic discrimination between near-native and incorrect models. ID: 56503 · Rating: 0 · rate: / Reply Quote
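The two ideas in Chu's summary, ranking models by energy and "energy discrimination" between near-native and incorrect models, can be illustrated with a small Python sketch. This is not the analysis from the paper: the function names, the 5.0 RMSD cutoff, the simple "difference of mean energies" measure, and the four toy models are all assumptions invented for illustration. It also only applies to benchmark cases where the native structure is already known, so that an RMSD to native can be computed for each model.

```python
from statistics import mean


def rank_by_energy(models):
    """Return the models sorted from lowest (best) to highest energy."""
    return sorted(models, key=lambda m: m["energy"])


def discrimination_gap(models, rmsd_cutoff=5.0):
    """Mean energy of incorrect models minus mean energy of near-native ones.

    A positive gap means near-native models score better (lower) on average.
    Returns None when one of the two groups is empty. The cutoff and the
    measure itself are illustrative assumptions, not Rosetta's definition.
    """
    near = [m["energy"] for m in models if m["rmsd"] <= rmsd_cutoff]
    far = [m["energy"] for m in models if m["rmsd"] > rmsd_cutoff]
    if not near or not far:
        return None
    return mean(far) - mean(near)


# Toy, made-up models: energy in arbitrary units, rmsd relative to a known native.
models = [
    {"name": "model_a", "energy": -12.0, "rmsd": 1.5},
    {"name": "model_b", "energy": -8.0, "rmsd": 12.0},
    {"name": "model_c", "energy": -10.0, "rmsd": 3.0},
    {"name": "model_d", "energy": -6.0, "rmsd": 20.0},
]

print([m["name"] for m in rank_by_energy(models)])
print(discrimination_gap(models))
```

In a blind prediction such as CAPRI the RMSD column is unknown, so only the ranking step is available. That is why good energy discrimination matters for picking the models to submit.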

robertmiles Send message Joined: 16 Jun 08 Posts: 1265 Credit: 14,424,358 RAC: 0	Message 56737 - Posted: 6 Nov 2008, 21:10:39 UTC Some of the people crunching here would like to know if those revisions require more RAM memory to run properly. ID: 56737 · Rating: 0 · rate: / Reply Quote

Sarel Send message Joined: 11 May 06 Posts: 51 Credit: 81,712 RAC: 0	Message 56747 - Posted: 7 Nov 2008, 0:34:29 UTC Hello, Round 16 of CAPRI, the experiment comparing computational docking methods has started recently, and we're excited to test new algorithms. Over the next few days you will see jobs going out where two protein chains are docked with respect to one another. In one of the chains you will see a ligand showing up as an object comprised of spheres attached to one of the protein partners. Our challenge is to predict the conformation of the two protein partners, and much of the biological data point to the possibility that the two partners should bind close to the ligand. Hence, it was important that we get the modeling of the ligand right for this to work, and we will now be able to test that! We are not excluding other binding possibilities, though, so many of the simulations will dock the two protein partners in alternative sites. An important issue in protein-protein docking is motion of the backbone of the protein chain. In this round of CAPRI we plan on experimenting with such backbone motion during the docking simulation. So, in some of the simulations that you get, you will see that after the two partners find a particular orientation the backbones wiggle. Such wiggling motion is often observed when protein structures are compared between their bound and unbound states, and it is widely accepted that such motion is a crucial component of protein recognition events. Finally, these changes should not introduce any new memory restrictions compared to previous docking simulations. Thank you all for participating! ID: 56747 · Rating: 0 · rate: / Reply Quote

Sarel Send message Joined: 11 May 06 Posts: 51 Credit: 81,712 RAC: 0	Message 58763 - Posted: 12 Jan 2009, 17:41:20 UTC Hello, At this point, the coordinates for the previous target in CAPRI have not been released, so I can't yet report on whether our predictions, made thanks to ROSETTA@HOME, have been successful. I'll let you know as soon as I hear from the organizers. In the meantime, another round of CAPRI (round 17) was announced today. The modeling task this time involves homology modeling and docking at the same time. So we will be able to test both aspects of ROSETTA as well as the ability to dovetail the two tasks. Such modeling challenges are quite common in real-world scenarios, where we know the structure of one of the partners but only have a template for the other. So this round of CAPRI offers us a chance to see how well we do at such a task. Over the next few days, I'll send out a very small number of simulations on ROSETTA @ HOME to test the protocols that I intend to use this time. These will involve both rigid-body docking of the partners and backbone motions (and you'll be able to see how those combine using the graphics application). Later in the week I will submit a large batch of trajectories to carry out the actual prediction task. I'm excited to see how well we do at this and will keep you posted on the results from the previous round of CAPRI. ID: 58763 · Rating: 0 · rate: / Reply Quote

Sarel Send message Joined: 11 May 06 Posts: 51 Credit: 81,712 RAC: 0	Message 59194 - Posted: 30 Jan 2009, 23:48:43 UTC Hello, The second stage of this round of CAPRI has started. This targets the same two partners as the previous stage, but we have been given the coordinates of the structure which previously we had to predict based on homology modeling. From a preliminary analysis James (who had produced this homology model) and I found that the homology model that we used in the previous stage was largely correct except for one long loop which we modeled incorrectly. Nevertheless, it seems like the main binding modes that we predicted previously would be unaffected by the modeling of that loop. Now that we have the coordinates of that structure we should be much better placed to make high-accuracy predictions. One word of caution is that as in the previous stage the system that we're modeling is quite large (450 residues across both protein partners). We've run such simulations before using ROSETTA @ HOME with no memory issues but I expect result files to be quite large. Please let me know if you run into any trouble. ID: 59194 · Rating: 0 · rate: / Reply Quote

Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0	Message 59196 - Posted: 31 Jan 2009, 9:26:39 UTC Hi, Are the CAPRI work units named something specific? Is there CAPRI in there, or something else we can look for to know if we're crunching some. Thanks. ID: 59196 · Rating: 0 · rate: / Reply Quote

Saharak Send message Joined: 28 Apr 07 Posts: 7 Credit: 1,170,212 RAC: 0	Message 59216 - Posted: 1 Feb 2009, 10:32:09 UTC - in response to Message 59196. Hi, Are the CAPRI work units named something specific? Is there CAPRI in there, or something else we can look for to know if we're crunching some. Thanks. My computer is crunching this one: _CAPRI17_T39_1_.sjf_br_both_docking.protocol__6483_8004 ID: 59216 · Rating: 0 · rate: / Reply Quote

Sarel Send message Joined: 11 May 06 Posts: 51 Credit: 81,712 RAC: 0	Message 59223 - Posted: 1 Feb 2009, 17:29:20 UTC Yes, sorry for not mentioning. As posted by Saharak, the jobs have _CAPRI prominently in the name. They would also be visually recognizable because you will see a small protein 'orbiting' around a very large protein (that's the docking step). Then, side chains will come up (packing) and finally you will see some backbone motions. I'm running two prediction strategies here, one where only one of the partners undergoes backbone motions (named _br_one_docking) and the other with both partners (_br_both_docking). This is because in this stage of the experiment we were given the exact coordinates of one of the partners and I would like to see whether backbone motions are any help in such cases (my hypothesis is that backbone motions shouldn't help and this is a great way to find out!). Thanks for participating! ID: 59223 · Rating: 0 · rate: / Reply Quote
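For anyone who wants to pick the CAPRI jobs out of a task list automatically, the naming convention described above can be checked with a few lines of Python. This is only a sketch based on the single work-unit name quoted in this thread and on the `_br_one_docking` / `_br_both_docking` labels Sarel mentions. The function name is hypothetical, and reading the digits after `_CAPRI` as the round number and `T39` as a target label is an assumption, not a documented format.

```python
import re

# Assumed pattern: "_CAPRI<round>_T<target>_", inferred from one example name.
CAPRI_PATTERN = re.compile(r"_CAPRI(\d+)_T(\d+)_")


def describe_work_unit(name):
    """Return a small dict describing a CAPRI work unit, or None if the
    name does not contain the assumed _CAPRI..._T..._ marker."""
    match = CAPRI_PATTERN.search(name)
    if match is None:
        return None
    if "_br_both_docking" in name:
        backbone = "both"   # backbone motions on both partners
    elif "_br_one_docking" in name:
        backbone = "one"    # backbone motions on only one partner
    else:
        backbone = "unknown"
    return {
        "round": int(match.group(1)),
        "target": int(match.group(2)),
        "backbone": backbone,
    }


# The work-unit name quoted by Saharak earlier in the thread.
print(describe_work_unit("_CAPRI17_T39_1_.sjf_br_both_docking.protocol__6483_8004"))
```

Names from other batches may follow a different layout, in which case the function simply returns None or reports the backbone strategy as "unknown".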

Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0	Message 59250 - Posted: 3 Feb 2009, 4:27:49 UTC Thank you for the precision. Very interesting. ID: 59250 · Rating: 0 · rate: / Reply Quote

shilei Volunteer moderator Project developer Project scientist Send message Joined: 25 Aug 11 Posts: 5 Credit: 1,014,314 RAC: 0	Message 72147 - Posted: 15 Jan 2012, 18:01:56 UTC Last modified: 15 Jan 2012, 18:02:21 UTC Hi Rosetta@home, my name is Lei Shi, and I am a postdoc in the Baker lab. My work is on protein-protein docking with sparse experimental data. Many proteins carry out their functions by interacting with other proteins. Predicting protein interactions is thus important for understanding their function. Computational prediction of protein complex structures using docking is an important approach to this problem. The successes and challenges of Rosetta docking have been highlighted in Chu's previous posts. Successful prediction of complex structures requires correctly capturing the protein-protein interactions at the interface and the structure of each protein in its bound form. This is still an unsolved problem because of the enormous number of degrees of freedom. For this reason, scientists most of the time solve these structures using tedious and costly experimental approaches, such as X-ray crystallography or NMR. In my project, I will work to incorporate sparse experimental data into Rosetta docking. Such experimental data are usually easy to obtain in the early stages of an experiment. Although sparse and ambiguous in nature, these limited data are very powerful in guiding computational modeling. Combined with Rosetta methodology, this approach has proven very successful in protein structure prediction, as highlighted in the 2010 Science paper from the Baker group (NMR Structure Determination for Larger Proteins Using Backbone-Only Data, link at: http://www.sciencemag.org/content/327/5968/1014.short). My goal is to use similar information from experiments to improve the accuracy of protein complex structures predicted with Rosetta docking. This will help speed up the determination of high-resolution complex structures with limited experimental effort.
Many previous Baker lab members, including Prof. Jeffrey Gray and Dr. Chu Wang, have laid the groundwork. As a relatively new member of the Baker lab (since 08/2011), I am very excited about this project. Of course, the work cannot be done without the contributions and donations of your computational resources. Thank you all for your participation. Lei ID: 72147 · Rating: 0 · rate: / Reply Quote