
Research Abstracts 2007
Modeling Ensembles of Transmembrane β-barrel Proteins
Jérôme Waldispühl, Charles W. O'Donnell, Nathan Palmer, Srinivas Devadas, Peter Clote & Bonnie Berger

The goal of this project is to develop a new family of algorithms for exploring the folding landscape of transmembrane β-barrel (TMB) proteins. Using the Boltzmann partition function and a bottom-up energy model that does not rely on machine learning, we are able to compute, from sequence, meaningful physical properties and the weighted ensemble of supersecondary structures that a particular TMB might adopt. Current progress on this project can be found at http://protein.csail.mit.edu/ and http://theory.csail.mit.edu/~bab.

Introduction
Transmembrane β-barrel (TMB) proteins are embedded in the outer membrane of Gram-negative bacteria, mitochondria and chloroplasts. The cellular location and functional diversity of β-barrel outer membrane proteins (omps) make them an important protein class. At present, very few non-homologous TMB structures have been determined by X-ray diffraction because of the experimental difficulty of crystallizing transmembrane proteins. Our recent algorithm, transFold, computes the minimum energy supersecondary structure for omps, identifying β-strands and their interstrand H-bond contacts from sequence alone [1]. To do this, we use a multi-tape S-attribute grammar to model a TMB structure, and use dynamic programming to find the minimum energy structure according to statistical potentials of interstrand residue contacts. This approach was the first to predict such long-range interactions for TMBs, and differs significantly from others by not requiring a training phase on known TMB structures. A web server has been set up at http://theory.csail.mit.edu/transFold/ where the algorithm can be run on any given sequence [2].
Unfortunately, transFold can only determine minimum folding energy structures, and says nothing about the rest of the folding landscape. For the first time, we now move beyond classical single-structure prediction methods and introduce a family of algorithms for investigating the folding landscape of TMBs using the Boltzmann partition function. Inspired by previous work on RNA secondary structure, we have developed a recursive algorithm for computing the partition function value, and from this we are able to compute the probabilities of individual inter-β-strand residue-residue contacts. Further, we have devised an algorithm for rigorously sampling structural conformations from the Boltzmann low energy ensemble. By creating stochastic contact maps from the residue-residue contact probabilities, sampling structural conformations according to their probability in the Boltzmann distribution, and computing physical properties from the Boltzmann partition function value, we can gain a much clearer understanding of the entire folding landscape of an outer membrane protein, rather than just the minimum energy conformation.

Methods
Transmembrane β-barrel proteins can be modeled using a multi-tape S-attribute grammar by linearly representing the constituent β-strands by their secondary structure assignment, duplicating this linear assignment to form two tapes, and aligning the two tapes against each other to indicate the alignment of β-strands in a β-sheet and the location of H-bonds. Using this form, the barrel structure of a TMB (requiring a barrel-shaped β-strand alignment) can be defined by a 2-tape grammar (as shown by the G_{barrel} and G_{couple} rules). Using this grammar, a parse tree can then be derived, and S-attributes within the tree that correspond to energy potentials can be used to find the optimal supersecondary structure. More details on how to use S-attribute grammars to represent protein structure can be found in [1] and [3].
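The Boltzmann quantities described above (partition function value, residue-residue contact probabilities, and sampling by Boltzmann weight) can be illustrated on a toy example. The following is a minimal sketch, not the recursive algorithm of this project: it enumerates a tiny ensemble by brute force. The structures, contact indices, energies, the value of RT, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

RT = 0.6  # assumed thermal energy in kcal/mol, roughly room temperature

# Hypothetical toy ensemble: each structure is a set of residue-residue
# contacts (i, j) together with an assumed energy. Illustrative values only.
ENSEMBLE = [
    (frozenset({(1, 10), (2, 9), (3, 8)}), -3.0),
    (frozenset({(1, 10), (2, 9)}), -2.0),
    (frozenset({(2, 10), (3, 9)}), -1.5),
    (frozenset(), 0.0),  # no contacts formed
]


def boltzmann_weights(ensemble, rt=RT):
    """Unnormalized Boltzmann weight exp(-E/RT) of every structure."""
    return [math.exp(-energy / rt) for _, energy in ensemble]


def partition_function(ensemble, rt=RT):
    """Partition function value Z: the sum of all Boltzmann weights."""
    return sum(boltzmann_weights(ensemble, rt))


def contact_probabilities(ensemble, rt=RT):
    """Probability of each contact: total weight of the structures that
    contain it, divided by Z. This is the data behind a stochastic
    contact map."""
    weights = boltzmann_weights(ensemble, rt)
    z = sum(weights)
    probs = {}
    for (contacts, _), w in zip(ensemble, weights):
        for contact in contacts:
            probs[contact] = probs.get(contact, 0.0) + w / z
    return probs


def sample_structures(ensemble, n, rng, rt=RT):
    """Draw n independent structures, each with probability weight / Z."""
    weights = boltzmann_weights(ensemble, rt)
    structures = [contacts for contacts, _ in ensemble]
    return rng.choices(structures, weights=weights, k=n)


def free_energy(ensemble, rt=RT):
    """Ensemble free energy -RT ln Z, one physical property derived from Z."""
    return -rt * math.log(partition_function(ensemble, rt))


if __name__ == "__main__":
    print("Z =", partition_function(ENSEMBLE))
    for contact, p in sorted(contact_probabilities(ENSEMBLE).items()):
        print(contact, round(p, 3))
    print(sample_structures(ENSEMBLE, 3, random.Random(0)))
```

Brute-force enumeration is only feasible for toy ensembles like this one. For real sequences the number of candidate structures is far too large to list, which is why the partition function has to be computed by a recursion over substructures instead.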
Following McCaskill's work on RNA structure prediction [4], we compute the Boltzmann partition function for a TMB by projecting the β-barrel structure onto a 2D lattice and solving a dynamic programming recursion. Although this problem is NP-hard for a 3D model, Istrail has shown that the 2D lattice case can be solved in polynomial time [5]. In a similar fashion, individual residue-residue contact probabilities can be computed according to each contact's weight in the Boltzmann distribution. Finally, independent random structural conformations can be sampled from the ensemble of possible conformations, again according to their weight in the Boltzmann distribution. These techniques are inspired by the work of Ding and Lawrence [6].

Results
Using our methods [7], a stochastic contact map can be created for any given TMB residue sequence, identifying likely structural conformations in the Boltzmann ensemble. In the figure below, both axes delineate the amino acid sequence, and darker regions in the map represent high-probability interactions according to our derived Boltzmann distribution. The minimum energy structure, as would be found by transFold, is also shown in red. Initial results suggest that our residue-residue contact predictions are the best for transmembrane β-barrel proteins when compared to contacts found in X-ray crystal structures. Since residue-residue contact probabilities can indicate regions of a TMB protein that are more flexible, we can compare the per-residue contact probability with the Debye-Waller factor (B-value) found in X-ray crystal structures. As seen in the figure below, sequential residue regions of high flexibility should maintain a low per-residue contact probability (in green). These low-probability regions along the sequence (x-axis) correspond very well with regions of high experimental B-value, which also indicate flexible and disordered structure.

References:
[1] J. Waldispühl, B. Berger, P. Clote, and J.-M. Steyaert.
Predicting Transmembrane Beta-barrels and Interstrand Residue Interactions from Sequence. In PROTEINS: Structure, Function and Bioinformatics, 65(1) pp. 61-74 (2006)
[2] J. Waldispühl, B. Berger, P. Clote, and J.-M. Steyaert. transFold: a web Server for Predicting the Structure and Residue Contacts of Transmembrane β-barrels. In Nucleic Acids Research, 34 (Web Server issue) pp. W189-W193 (2006)
[3] J. Waldispühl and J.-M. Steyaert. Modeling and Predicting all-Alpha Transmembrane Proteins Including Helix-Helix Pairing. In Theoretical Computer Science, special issue on Pattern Discovery in the Post Genome, pp. 67-92 (2005)
[4] J. S. McCaskill. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. In Biopolymers, 29 pp. 1105-1119 (1990)
[5] S. Istrail. Statistical mechanics, three-dimensionality and NP-completeness: I. Universality of intractability of the partition function of the Ising model across non-planar lattices. In Proceedings of the 32nd ACM Symposium on the Theory of Computing, pp. 87-96 (2000)
[6] Y. Ding, C. E. Lawrence. A statistical sampling algorithm for RNA secondary structure prediction. In Nucleic Acids Research, 31(24) pp. 7280-7301 (2003)
[7] J. Waldispühl, C. W. O'Donnell, N. Palmer, S. Devadas, P. Clote, B. Berger. Modeling Ensembles of Transmembrane β-barrel proteins. Under review.

