Biophysics

From the hydrophobic effect to protein–ligand binding, statistical physics is relevant in almost all areas of molecular biophysics and biochemistry, making it essential for modern students of molecular behavior. But traditional presentations of this material are often difficult to penetrate. Statistical Physics of Biomolecules: An Introduction brings “down to earth” some of the most intimidating but important theories of molecular biophysics.

With an accessible writing style, the book unifies statistical, dynamic, and thermodynamic descriptions of molecular behavior using probability ideas as a common basis. Numerous examples illustrate how the twin perspectives of dynamics and equilibrium deepen our understanding of essential ideas such as entropy, free energy, and the meaning of rate constants. The author builds on the general principles with specific discussions of water, binding phenomena, and protein conformational changes/folding. The same probabilistic framework used in the introductory chapters is also applied to non-equilibrium phenomena and to computations in later chapters.

The book emphasizes basic concepts rather than cataloguing a broad range of phenomena. Students build a foundational understanding by initially focusing on probability theory, low-dimensional models, and the simplest molecular systems. The basics are then directly developed for biophysical phenomena, such as water behavior, protein binding, and conformational changes. The book’s accessible development of equilibrium and dynamical statistical physics makes this a valuable text for students with limited physics and chemistry backgrounds.

ISBN: 978-1-4200-7378-2
www.crcpress.com

Statistical
Physics of Biomolecules
AN INTRODUCTION

Daniel M. Zuckerman

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2010 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20150707

International Standard Book Number-13: 978-1-4200-7379-9 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without
intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com

For my parents, who let me think for myself

Contents

Preface
Acknowledgments

Chapter 1 Proteins Don’t Know Biology
1.1 Prologue: Statistical Physics of Candy, Dirt, and Biology
    1.1.1 Candy
    1.1.2 Clean Your House, Statistically
    1.1.3 More Seriously
1.2 Guiding Principles
    1.2.1 Proteins Don’t Know Biology
    1.2.2 Nature Has Never Heard of Equilibrium
    1.2.3 Entropy Is Easy
    1.2.4 Three Is the Magic Number for Visualizing Data
    1.2.5 Experiments Cannot Be Separated from “Theory”
1.3 About This Book
    1.3.1 What Is Biomolecular Statistical Physics?
    1.3.2 What’s in This Book, and What’s Not
    1.3.3 Background Expected of the Student
1.4 Molecular Prologue: A Day in the Life of Butane
    1.4.1 Exemplary by Its Stupidity
1.5 What Does Equilibrium Mean to a Protein?
    1.5.1 Equilibrium among Molecules
    1.5.2 Internal Equilibrium
    1.5.3 Time and Population Averages
1.6 A Word on Experiments
1.7 Making Movies: Basic Molecular Dynamics Simulation
1.8 Basic Protein Geometry
    1.8.1 Proteins Fold
    1.8.2 There Is a Hierarchy within Protein Structure
    1.8.3 The Protein Geometry We Need to Know, for Now
    1.8.4 The Amino Acid
    1.8.5 The Peptide Plane
    1.8.6 The Two Main Dihedral Angles Are Not Independent
    1.8.7 Correlations Reduce Configuration Space, but Not Enough to Make Calculations Easy
    1.8.8 Another Exemplary Molecule: Alanine Dipeptide
1.9 A Note on the Chapters
Further Reading

Chapter 2 The Heart of It All: Probability Theory
2.1 Introduction
    2.1.1 The Monty Hall Problem
2.2 Basics of One-Dimensional Distributions
    2.2.1 What Is a Distribution?
    2.2.2 Make Sure It’s a Density!
    2.2.3 There May Be More than One Peak: Multimodality
    2.2.4 Cumulative Distribution Functions
    2.2.5 Averages
    2.2.6 Sampling and Samples
    2.2.7 The Distribution of a Sum of Increments: Convolutions
    2.2.8 Physical and Mathematical Origins of Some Common Distributions
    2.2.9 Change of Variables
2.3 Fluctuations and Error
    2.3.1 Variance and Higher “Moments”
    2.3.2 The Standard Deviation Gives the Scale of a Unimodal Distribution
    2.3.3 The Variance of a Sum (Convolution)
    2.3.4 A Note on Diffusion
    2.3.5 Beyond Variance: Skewed Distributions and Higher Moments
    2.3.6 Error (Not Variance)
    2.3.7 Confidence Intervals
2.4 Two+ Dimensions: Projection and Correlation
    2.4.1 Projection/Marginalization
    2.4.2 Correlations, in a Sentence
    2.4.3 Statistical Independence
    2.4.4 Linear Correlation
    2.4.5 More Complex Correlation
    2.4.6 Physical Origins of Correlations
    2.4.7 Joint Probability and Conditional Probability
    2.4.8 Correlations in Time
2.5 Simple Statistics Help Reveal a Motor Protein’s Mechanism
2.6 Additional Problems: Trajectory Analysis
Further Reading

A Statistical Perspective on Biomolecular Simulation

[Figure: probability plotted against $x$ for two curves, labeled "Targeted distribution, $w(x)$" and "Sampling distribution, $w'(x)$".]

FIGURE 12.7 Reweighting from one distribution to another. The targeted distribution $\rho(x) \propto w(x)$ may differ significantly from the distribution used to sample it, $w'(x)$. If few configurations from $w'$ are important in $w$, based on a necessarily finite sample, the reweighted configurations may not be valid; that is, there may not be sufficient “overlap” between the distributions. Ideally, one designs $w'$ to be as similar to $w$ as possible.

[...] or MC simulation at temperature $T_j$ is launched from each configuration and run for a short time. The modified ensemble consists of the final relaxed configurations, one from each short simulation, along with the weights. In this way, the ensemble becomes more
adapted to each successively lower temperature, ameliorating the overlap problem. Lyman and Zuckerman showed this procedure can be applied to biomolecular systems.

12.5.2 POLYMER-GROWTH IDEAS

The reweighting idea can be used to generate ensembles of molecular configurations without any kind of dynamics at all, not even MC. The ideas come from algorithms for growing polymers. See, for instance, the paper by Garel and Orland.

In our case, imagine that the full set of molecular coordinates $r^N$ is divided into a set of $n$ molecular fragments, so that $r^N = (s_1, s_2, \ldots, s_n)$, with $s_i$ being the coordinates for fragment $i$. Also assume an ensemble of each fragment has been prepared in advance, with configurations distributed according to the Boltzmann factor of $U_i$ for each fragment $i$. The potential $U_i$ will be assumed to correspond to all interactions among atoms internal to fragment $i$. If we consider the configuration space of our full system and imagine selecting one configuration at random from each fragment ensemble, we will have configurations for the whole molecule distributed according to

$$w'(r^N) = w(s_1)\, w(s_2) \cdots w(s_n), \tag{12.9}$$

where $w(s_i) = \exp[-U_i(s_i)/k_B T]$ is the Boltzmann factor for fragment $i$. On the other hand, we really want configurations distributed according to the Boltzmann factor of the full potential $U(r^N)$, which is given by the fragment terms, plus all interaction terms $U_{ij}$ among fragments $i$ and $j$:

$$U(r^N) = U(s_1, \ldots, s_n) = \sum_i U_i(s_i) + \sum_{i<j} U_{ij}(s_i, s_j). \tag{12.10}$$

[Figure: three panels, each labeled "Fragment".]

FIGURE 12.8 Polymers constructed by fragments. A library of each molecular fragment is generated in advance. For example, fragments can correspond to amino acids. The fragment configurations can be assembled into full molecular configurations, which must be reweighted to account for interactions among fragments.
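To make the fragment-assembly idea concrete, here is a minimal Python sketch. It is not taken from the text. It assumes a toy model in which each fragment is a single coordinate with a harmonic internal potential $U_i$, and every pair of fragments interacts through a harmonic $U_{ij}$. All names and parameter values (`KT`, `K_INTERNAL`, `C_PAIR`, `run_demo`, and so on) are hypothetical. Configurations are built by drawing one member from each pre-sampled library, as in Equation 12.9. Each configuration then receives the weight $w/w' = \exp[-\sum_{i<j} U_{ij}/k_B T]$, because the internal terms of Equation 12.10 cancel. The effective sample size printed at the end is a common overlap diagnostic (the Kish estimate) and is an addition here, not something the text prescribes.

```python
import math
import random

KT = 1.0           # thermal energy k_B T, arbitrary units (assumed)
K_INTERNAL = 1.0   # stiffness of each fragment's internal potential U_i (assumed)
C_PAIR = 0.5       # stiffness of each pair interaction U_ij (assumed)
N_FRAGMENTS = 3    # number of fragments n (assumed)


def u_pair(si, sj):
    """Toy interaction U_ij between two fragments: a harmonic coupling."""
    return 0.5 * C_PAIR * (si - sj) ** 2


def build_library(rng, size):
    """Pre-sample one fragment from exp(-U_i/kT) with U_i = K_INTERNAL*s^2/2.

    For a harmonic U_i the Boltzmann distribution is Gaussian, so the
    library can be drawn directly, without any dynamics.
    """
    sigma = math.sqrt(KT / K_INTERNAL)
    return [rng.gauss(0.0, sigma) for _ in range(size)]


def assemble(rng, libraries, n_configs):
    """Draw one member per library (Eq. 12.9) and attach reweighting factors."""
    configs, weights = [], []
    n = len(libraries)
    for _ in range(n_configs):
        s = [rng.choice(lib) for lib in libraries]
        # Only the inter-fragment terms of Eq. 12.10 enter the weight; the
        # internal terms U_i are already accounted for by the libraries.
        u_inter = sum(u_pair(s[i], s[j])
                      for i in range(n) for j in range(i + 1, n))
        configs.append(s)
        weights.append(math.exp(-u_inter / KT))
    return configs, weights


def effective_sample_size(weights):
    """Kish estimate: near len(weights) for good overlap, near 1 for poor."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def run_demo(seed=0, library_size=20000, n_configs=40000):
    """Return (reweighted <s_1^2>, effective sample size, n_configs)."""
    rng = random.Random(seed)
    libraries = [build_library(rng, library_size) for _ in range(N_FRAGMENTS)]
    configs, weights = assemble(rng, libraries, n_configs)
    # Reweighted average of s_1^2 under the full potential U.
    estimate = (sum(w * s[0] ** 2 for s, w in zip(configs, weights))
                / sum(weights))
    return estimate, effective_sample_size(weights), n_configs


if __name__ == "__main__":
    est, ess, n = run_demo()
    print(f"<s_1^2> estimate: {est:.3f}")
    print(f"effective sample size: {ess:.0f} of {n}")
```

Because this toy model is Gaussian, the reweighted average can be checked against the exact result $\langle s_1^2 \rangle = k_B T\,(k + c)/[k\,(k + nc)]$, where $k$ and $c$ are the internal and pair stiffnesses. For the values above this equals 0.6, and the estimate should land near it up to sampling noise. Increasing `C_PAIR` makes the full distribution differ more from the product of fragment distributions, so the effective sample size drops. This is the overlap problem of Figure 12.7 in miniature.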