METHOD VALIDATION AND APPLICATION
by
Demian Riccardi
A dissertation submitted in partial fulfillment of
the requirements for the degree of
UMI Microform 3245613 Copyright 2007 by ProQuest Information and Learning Company
COMPUTATIONAL INVESTIGATIONS OF LONG-RANGE PROTON TRANSFER:
METHOD VALIDATION AND APPLICATION
submitted to the Graduate School of the University of Wisconsin-Madison
in partial fulfillment of the requirements for the Degree of Doctor of Philosophy
by
DEMIAN RICCARDI
Date of Final Oral Examination: November 16, 2006
Month & Year Degree to be awarded: December 2006
The validated method is then applied to simulate long-range proton transfers in solution and carbonic anhydrase II. These investigations emphasize that the microscopic, mechanistic details of long-range proton transfer are most sensitive to the free energy of protonation or deprotonation for all groups involved in the transfer pathway. This naturally leads to two mechanisms for a long-range proton transfer between two groups: the most common Grotthuss mechanism and the less recognized “proton hole” mechanism. The latter involves the protonation of the acceptor prior to the deprotonation of the donor, thereby creating a “proton hole” in the mediating pathway. Overall, fair agreement with experiment is attained, and the “proton hole” mechanism is supported as the dominant path in carbonic anhydrase II and is found to be independent of the distance between the zinc-bound water and His 64.
Published Work and Work in Preparation
[1] D. Riccardi, G. Li, and Q. Cui, “Importance of van der Waals interactions in QM/MM simulations,” J. Phys. Chem. B, vol. 108, pp. 6467-6478, 2004.
[2] P. Schaefer, D. Riccardi, and Q. Cui, “Reliable treatment of electrostatics in combined QM/MM simulation of macromolecules,” J. Chem. Phys., vol. 123, p. 014905, 2005.
[3] D. Riccardi, P. Schaefer, and Q. Cui, “pKa calculations in solution and proteins with QM/MM free energy perturbation simulations,” J. Phys. Chem. B, vol. 109, pp. 17715-17733, 2005.
[4] D. Riccardi, P. Schaefer, Y. Yang, H. Yu, N. Ghosh, X. Prat-Resina, P. Konig, G. Li, D. Xu, H. Guo, M. Elstner, and Q. Cui, “Development of effective quantum mechanical/molecular mechanical (QM/MM) methods for complex biological processes (feature article),” J. Phys. Chem. B, vol. 110, pp. 6458-6469, 2006.
[5] D. Riccardi, P. Koenig, X. Prat-Resina, H. Yu, M. Elstner, T. Frauenheim, and Q. Cui, “‘Proton holes’ in long-range proton transfer reactions in solution and enzymes: A theoretical analysis,” J. Am. Chem. Soc. (in press).
[6] D. Riccardi and Q. Cui, “Insights for carbonic anhydrase II from pKa computations for the zinc bound water” (in preparation).
[7] D. Riccardi, P. Koenig, and Q. Cui, “Characterizing the molecular details of the rate limiting long-range transfer in carbonic anhydrase II” (in preparation).
TABLE OF CONTENTS
Published Work and Work in Preparation
3.3 Generalized Solvent Boundary Potential (GSBP)
4.3.1 Small Molecules in solution
5 QM/MM Simulations of Human Carbonic Anhydrase II
5.1 Introduction
5.2 Computational Setup
5.2.1 Spherical Boundary Conditions
5.2.2 Periodic Boundary Conditions
5.2.3 Linear response pKa Calculations
5.3 Results and discussion
5.3.1 Root Mean Square Deviations for backbone atoms
5.3.2 Flexibility of H64
5.3.3 Behavior of water in the active site
5.3.4 Weakness of GSBP: dynamic properties of the system
5.4 Concluding Remarks
6 Insights for CAII from pKa computations for the zinc-bound water
6.2 Computational Methods
6.2.1 GSBP setup for 20 and 25 Å inner regions
6.2.2 pKa calculations with FEP and charge perturbations
6.3.2 Contrasting errors within free energy derivatives and between free energies determined from independent simulations
6.3.3 Comparison with experiment
6.3.4 Dissecting the pKa in terms of water and protein electrostatic contributions
6.4 Concluding Remarks
7 “Proton holes” in long-range proton transfer reactions in solution and enzymes: A theoretical analysis
7.1 Introduction
7.2 Systems and Simulation Methods
7.2.1 Solution systems
7.2.2 Carbonic Anhydrase II (CAII)
7.2.3 QM/MM partitioning and switching
7.2.4 Potential of mean force (PMF) calculations and analysis
7.2.5 Electrostatic potential calculations for relevant protonation states
7.3 Results and Discussions
7.3.1 Solution cases
7.3.2 Carbonic Anhydrase II
7.4 Concluding Remarks
8 Characterizing the molecular details of the rate limiting long-range transfer in Carbonic Anhydrase II
8.1 Introduction
8.2 Methods: “TS-reorganized” simulations and energetic decomposition
8.2.1 “TS-reorganized” simulations
8.2.2 Harvesting water wires and determining the MEP
8.2.3 Electrostatic Perturbation analysis of the PMF
8.3 Results and Discussion
8.3.1 Ensemble of minimum energy paths
8.3.2 Potential of mean force for the long-range proton transfer in CAII
9 Conclusions
LIST OF TABLES
Table
2.1 Summary of interaction energies and bond lengths: optimization set
2.2 Summary of interaction energies and bond lengths: test set
2.3 LJ parameter sets for SCC-DFTB atoms in SCC-DFTB/MM calculations
2.4 RMS errors for interaction energy and hydrogen bond length
2.5 Gas and condensed phase comparisons for the reduction of FAD
2.6 Gas and condensed phase barriers for the intramolecular proton transfer in enediolate
4.1 Gas phase proton affinities (in kcal/mol) calculated at the SCC-DFTB, B3LYP, and CCSD levels
4.2 The ΔG^(1) (in kcal/mol) component (Figure 4.1) for the five small molecules studied
4.3 Various bulk solvation contributions (in kcal/mol) to the deprotonation free energy considered for the small molecules in solution
4.4 The effect of the van der Waals parameters for the acidic proton on ΔG^(1) and ΔG^(2) for acetic acid and imidazole (in kcal/mol)
4.5 Bonded, Zero-Point-Energy (ZPE) contributions to the free energy of deprotonation and the QM/MM free energy correction as well as the QM proton affinity correction (in kcal/mol)
4.6 The pKa shifts (in pKa units), relative to the COOH group in glycine, and the RMS difference from experimental values for the five small molecules studied
4.7 The pKa for His 31 and Lys 102 in the M102K mutant of the T4-Lysozyme
5.1 Root mean square deviations (RMSD) of backbone atoms and a summary of the “IN” and “OUT” sampling for H64
5.2 Percentages of water bridge types
5.3 pKas calculated for H64 and the zinc-bound water with the Linear Response Approximation relative to 4-methyl imidazole in solution
6.1 Example of statistical analysis used to determine the free energy derivatives for the deprotonation of the zinc-bound water in the E106Q mutant of CAII with a 20 Å
8.3 Contribution from protein and water to the barrier and reaction energy (in kcal/mol) associated with the potential of mean force for the proton transfer from the zinc-bound water in CAII
Appendix Table
A.1 Errors in proton affinities for SCC-DFTB relative to G3B3 calculations
A.2 Error analysis for the proton exchange reactions between a solute and water
B.1 Proton affinities for models of proton donor and acceptor groups in CAII
B.2 Energetics for proton transfer in a gas phase active site model for CAII
LIST OF FIGURES
Figure
2.1 Training set molecules for optimization of SCC-DFTB/MM LJ parameters
2.2 Additional molecules not included in the training set
2.4 Schematic of enediolate intramolecular proton transfer
2.5 Comparison of interaction energies between various SCC-DFTB/MM LJ parameter
2.6 Radial distribution functions of TIP3P waters about the model FAD molecule
2.7 Radial distribution functions of TIP3P water about an enediolate
2.8 Examples of the free energy derivative convergence and integration for the reduction of FAD
2.9 Typical thermodynamic cycle for determining the free energy required to convert A to
2.10 Plot of the temperature dependence of the free energy used to determine entropic and enthalpic contributions to the reduction of FAD
2.11 The Potential of Mean Force for the Intramolecular Proton Transfer in an Enediolate
3.1 Schematic representation of the GSBP partition of a solvated biomolecule in the QM/MM framework
3.2 Convergence of the GSBP results versus the size of the basis set for a solvated imidazole
3.3 Flow chart representation of a GSBP set-up for a biological molecule
4.1 Thermodynamic cycle for DTSC pKa calculations
4.2 The five small molecules studied here for pKa values in solution
4.3 GSBP setups for T4 Lysozyme
4.4 Representative linear fits and convergence of the free energy derivatives for solution
4.5 Effect of van der Waals parameters on the computed pKa of imidazole and acetic acid
4.6 Schematic plot of the RMSD for pKa shifts relative to experiment as different proto-
4.7 The structure of water and stability of the protein in the pKa calculations for His 31 in the M102K mutant of T4 Lysozyme
4.8 The average structure for the H31 protonated simulation of the M102K mutant of T4 Lysozyme overlaid with the crystal structure
4.9 Convergence of the free energy derivatives for the H31 pKa simulation in the M102K T4 Lysozyme mutant with the 0.1 M GSBP simulation
4.10 The structure of water and stability of the protein in the protonated state of K102 for the M102K mutant of T4 Lysozyme
4.11 The average structure for the K102 protonated simulation of the M102K mutant of T4 Lysozyme overlaid with the crystal structure
4.12 The structure of water and stability of the protein in the deprotonated state of K102 for the M102K mutant of T4 Lysozyme
4.13 The average structure for the K102 deprotonated simulation of the M102K mutant of T4 Lysozyme overlaid with the crystal structure
4.14 Convergence of the free energy derivatives for the K102 pKa simulation in the M102K T4 Lysozyme mutant with the 0.1 M GSBP simulation
5.1 Active site of CAII rendered from the crystal structure
5.4 Representative plots of the RMSD and the H64 “IN” and “OUT” configurations plot-
5.5 Superposition of a few key residues from two stochastic boundary SCC-DFTB/MM simulations with the x-ray structure
5.6 Comparison of radial distributions of water oxygens for PBC and GSBP
5.7 Comparison of radial distribution of water oxygens about the Zinc atom for different
5.8 Statistics for productive water-bridges (only from two and four shown here) between the zinc bound water and His 64 with different electrostatics protocols
5.9 Diffusion coefficients for TIP3 water molecules as a function of the distance from
5.10 Root mean square fluctuations of backbone Cα atoms
6.1 Detailed rendering of CAII active site
6.2 Cα RMSF comparisons: 20 and 25 Å GSBP
6.3 Linear fit of the free energy derivatives with respect to λ for the WT and E106Q mutants of CAII computed with 20 and 25 Å GSBP inner regions
6.4 Example of QM/MM partitioning with a single link atom for a histidine residue
6.5 Linear dependence of water and protein electrostatic contributions to the free energy derivatives
6.6 The contribution to the free energy of deprotonation from water and protein MM charges in the WT and E106Q mutant CAII
7.1 Proton transfer: the Grotthuss and “proton hole” mechanisms
7.2 The “excess coordination” for the donor atom, acceptor atom and oxygen atoms in the mediating water molecules during the proton transfer reactions
7.3 Calculated potential of mean force for the various proton transfer reactions in solution and CAII based on SCC-DFTB/MM simulations
7.4 Snapshot and electrostatic potentials of the carbonic anhydrase II active site
7.5 Radial distribution functions for the solvent oxygen around the “proton hole” for the proton transfer reaction
8.1 Snapshot from CHOH containing a two water bridge with synchronous proton transfer
Appendix Figure
A.1 Calculated potential of mean force for the various “half” proton transfer reactions in solution and the His64A mutant of CAII based on SCC-DFTB/MM simulations
B.1 2-dimensional adiabatic mapping of proton transfer and O-O vibration in (H5O2)+
B.2 2-dimensional adiabatic mapping of double proton transfers with different sets of O-O distances in (H7O3)+
B.8 2-dimensional adiabatic mapping of proton transfer and N-O vibration in a complex between water and protonated imidazole
B.9 2-dimensional adiabatic mapping of proton transfer and O-O vibration in a complex
ACKNOWLEDGMENTS
During my many childhood visits to my grandparents’ house in New Jersey, my grandfather and I would watch TV and talk about all sorts of things during the muted commercials and sometimes, to my annoyance, during the shows; he introduced me to the concept of “infinity”, and as a result, I think of him whenever I’m stopped in traffic. At the time, I remember occasionally complaining to him that he talked too much, but I’d give anything to talk to him again. There were many things he needed to talk about, and I was too young to understand.
I would like to express my deepest gratitude to my advisor, Qiang Cui. His door has always been open to me. He seems to have infinite patience and insight, and as a result, I don’t think I could have learned more from anyone else.
I would like to thank my family and friends for all their support. My parents, Celia and Shawn Upton (“Big Shawn”), sacrificed a lot to send me to college, and I am grateful. From them, I’ve gained an appreciation for hard work; my days shoveling snow during Nor’easters with my step-brother Shawn while having Big Shawn barking something on the order of “hurry up!” from the Scrambler Jeep were some of the most pleasant, profitable, and exhausting days I’ve experienced. I’m proud of many of the figures I created for this thesis, and I have my mother’s creative genes to thank; I hope she enjoys them. The friends I’ve gained in graduate school have all been truly remarkable individuals. Eric Fulmer, Jocelyn Cox, Michael Birkeland, Govardhan Reddy, and Sai Ramesh joined the program with me; they helped make this process enjoyable, and I’ve learned a lot from them. I have also enjoyed my time spent with all the people on the 8th floor, especially Amber Krummel, Dave Strasfeld, and of course, Eric Fulmer. Dave will always be “Dr. D” to me; on the other hand, Eric will not be Dr. Fulmer, to me, until he sings a certain song at karaoke... he knows which.
I would be nowhere without the members of the Cui group. Mark Formaneck, Adam van Wynsberghe, and Gouhui Li helped me extensively in the beginning of my graduate student career. They’re great friends and resources. Sharing an office with Mark was very enjoyable, and I am grateful for his willingness to discuss research and a multitude of other things. I’ve learned a lot from him. It is too bad Adam never migrated up to the eighth floor because I’ve always enjoyed discussing various things with him as well. Gouhui implemented the dual topology single coordinate, free energy perturbation approach, which has provided the tools that enabled me to push the quantitative details used in validating the SCC-DFTB/MM simulations. I would also like to thank Patti Schaefer for her contributions to the implementation of the generalized solvent boundary potential (GSBP) for use with SCC-DFTB; without an accurate treatment of long-range electrostatics, there may not have been any quantitative details. I would also like to thank Nilanjan Ghosh for collaborations on some of the gas phase validations of SCC-DFTB and for our many colorful discussions regarding his work and my own. I would like to thank Haibo Yu for implementing the fix to the hydrogen bond corrected SCC-DFTB; the day the minimum energy path barriers disappeared was an interesting one! Sharing an office with Haibo is Yang Yang; I have had many fruitful discussions in this office with the both of them! Yang has been the source of many interesting conversations, and I admire his pride and ability in teaching and presenting research. I would like to thank Xavier Prat-Resina for his work implementing the QM/MM switching for QM and MM waters. I first thought QM/MM switching would be necessary back in my second year while fooling around with water in nanotubes; I forgot about it, and then it proved essential to studying these proton transfer reactions in solution and in CAII. Xavier had the expertise to implement this quickly, which helped accelerate the PT project immensely.
Peter Koenig has been incredibly helpful. I’ve learned a lot from Peter; all of the work on the long-range proton transfers was the product of a wonderful collaboration. I would probably still be pushing protons through wires if it were not for his development and implementation of the general reaction coordinate for proton transfer. Further, he was responsible for the implementation of the force-based perturbative analysis I used to understand the electrostatic contributions from the protein and water molecules to the energetics of the process. The picture I present for the long-range proton transfer in carbonic anhydrase II is largely the result of our lasting collaboration with Peter.
I would also like to thank my undergraduate advisor, Maria Gomez, for helping me get started on this path. I still regret not going to Los Alamos to work with her during the summer of 2000; that would have been a great experience. She has always been, and continues to be, very supportive.
Last but not least, I would like to thank my life partner, Laurel Pegram, for all her support and kindness. We met at Vassar and have had wonderful times these past years growing up together. I’m pleased she joined the same program and chose to work for Tom Record; we may have many future collaborations as a result! If she didn’t have such contempt for the cold winter, we may have tried to stay here in Madison forever. We also have four great, mostly gray cats that I should thank. Thank you, Piggins, Bones, Sardines, and Jolene, for being such fine beasts!
General Introduction
As Joe Hirschfelder agreed[1] with what Joel Hildebrand said, “Chemistry is fun!”
1.1 Long range proton transfer
Acid-base chemistry plays a key role in many enzyme-catalyzed reactions. While the majority of the associated proton transfers are direct, from donor to acceptor, there are many important enzymes (such as the ATP synthase and cytochrome c oxidase)[2] wherein the proton transfer occurs over distances on the order of tens of Angstroms. The large number of chemical groups that potentially participate in the transfer process makes it challenging to pinpoint the transfer mechanism and critical factors that limit the corresponding energetics and kinetics. Often, the Grotthuss[4] mechanism, where a series of sequential “proton hops” transfers the positively charged proton from the donor to the acceptor, is invoked to explain these processes. A common prerequisite for such mechanisms has been the identification of hydrogen bonded chains (HBC)[3] that consist of groups that are able to accept and donate one hydrogen bond unidirectionally. Indeed, long “water-wires”, intermittently mixed with titratable amino acid sidechains, are commonly seen in the X-ray structure of biomolecules that transport protons[5, 6]. Typically, X-ray crystal structures of biological molecules do not have high enough resolution to resolve hydrogen atoms, which can lead to ambiguities in defining the HBC. In addition, there is some controversy over when the HBC model can be applied. For example, aquaporins are transmembrane pores that allow the passage of water, but not protons or other ions. As reviewed recently by Warshel[7], initial efforts were made to explain the proton blockage, structurally, in terms of a bifurcated HBC that prevents the Grotthuss
Carbonic anhydrase II (CAII) provides another example of the controversy between the electrostatic and the HBC, “proton wire” views. CAII is a Zn(II)-containing metalloenzyme that moderates respiration by catalyzing the conversion between CO₂ and HCO₃⁻. An important step in the functional cycle of CAII involves a PT between the zinc-bound water and a histidine (H64) near the protein surface; the transferred proton is released into solution as the now doubly-protonated H64 sidechain flips from a buried (“IN”) conformer to a solvent-exposed (“OUT”) conformer. Since the distance between the zinc-bound water and His 64 in the X-ray structure [9] was observed to be too long (7.5 Å) for a direct transfer, water molecules in the active site are assumed to provide the HBC needed to relay the proton[10, 11, 9]. However, the molecular nature of the proton transfer bottleneck has been under heated debate due to the large number of waters in the active site[11, 12, 13, 14].
In this thesis, it will be shown that in order to more fully understand the long-range proton transfer process in CAII and other biological systems, all accessible protonation states of the mediating groups must be considered. This is exemplified by proton transfer through water, in which the individual water molecules can exist in three protonation states (water, hydronium and hydroxide). An alternative to the Grotthuss mechanism for proton transfer through water occurs when a hydroxide is generated first by a water that protonates the acceptor and then transferred towards the donor through the mediating water molecules. The latter mechanism can most generally be described as the transfer of a “proton hole” from the acceptor to the donor, where the “hole” characterizes the deprotonated state of any mediating molecule. This pathway is distinct and is rarely considered in the discussion of proton transfer processes.
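The difference between the two pathways can be made concrete with a toy bookkeeping model. The sketch below is purely illustrative and is not one of the simulation methods used in this thesis: it assumes a strictly stepwise transfer along a linear chain (donor, n waters, acceptor) and tracks only the excess proton count on each site, with +1 on a water meaning hydronium and -1 meaning hydroxide. The function name `pathway` and the state encoding are hypothetical choices made for this example.

```python
def pathway(n_waters, mechanism):
    """Return the sequence of protonation states for a stepwise transfer.

    Each state is a list [donor, water_1, ..., water_n, acceptor] of excess
    proton counts: donor/acceptor are 1 when protonated and 0 otherwise;
    a water is 0 (neutral), +1 (hydronium), or -1 (hydroxide).
    """
    n_sites = n_waters + 2
    state = [1] + [0] * (n_sites - 1)      # proton starts on the donor
    hops = list(range(n_sites - 1))        # hop i moves a proton from site i to i+1
    if mechanism == "proton_hole":
        hops.reverse()                     # acceptor is protonated first
    elif mechanism != "grotthuss":
        raise ValueError("mechanism must be 'grotthuss' or 'proton_hole'")

    states = [list(state)]
    for i in hops:
        state[i] -= 1                      # site i gives up a proton
        state[i + 1] += 1                  # its neighbor toward the acceptor takes it
        states.append(list(state))
    return states


if __name__ == "__main__":
    for name in ("grotthuss", "proton_hole"):
        print(name)
        for s in pathway(2, name):
            print("  ", s)
```

In the Grotthuss sequence a +1 (hydronium) moves from the donor side to the acceptor side, whereas in the “proton hole” sequence a -1 (hydroxide) appears next to the acceptor and moves back toward the donor. Both sequences share the same end points, which is why the relative free energies of the intermediate protonation states decide between them.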
1.2 Overview
When a proton is transferred over a long distance, several bonds are formed and broken, and a positive charge is shifted. In order to study these processes computationally, methods are required that 1) accurately reproduce the potential energy surface and 2) allow sufficient sampling for the
tions in solution and biological systems; the approximate density functional method, SCC-DFTB (self-consistent-charge density functional tight-binding approach)[17], is applied as implemented in CHARMM[18, 19] throughout this work due to its balance of accuracy and efficiency.
In order to simulate long-range proton transfer with SCC-DFTB/CHARMM, the accuracy of the coupled potential energy function must be carefully benchmarked. This thesis consists of two major parts. The first part contains the validation of the SCC-DFTB/CHARMM method and the second part includes the application of the method to study long-range proton transfer in CAII. In Chapter 2, the van der Waals interactions between SCC-DFTB atoms and TIP3 waters are systematically studied. The SCC-DFTB atom parameters for these interactions are first optimized, and then the overall sensitivity of QM/MM molecular dynamics free energy simulations to these parameters is evaluated. In Chapter 3, the generalized solvent boundary potential (GSBP) and Ewald sums methods for the treatment of long-range electrostatics in SCC-DFTB/MM simulations are reviewed. In Chapter 4, the treatments of long-range electrostatics are applied to pKa calculations where direct comparisons to experimental results can be made. First, GSBP and Ewald are used to calculate the relative pKas for a series of small molecules in solution. Then, the pKa shifts of two amino acids in T4-Lysozyme are calculated relative to their solution values. In Chapter 5, multiple simulations of the protonation states relevant to the PT are carried out with both GSBP and Ewald treatments of long-range electrostatics. In Chapter 6, the pKas of CAII and a mutant form of CAII are calculated to explore how the enzyme regulates the pKa of the zinc-bound water. Electrostatic contributions from the protein and waters are evaluated. In Chapter 7, the mechanisms of long-range proton transport are investigated for transfers in solution and in CAII. In Chapter 8, the long-range proton transfer in CAII is explored in detail. Electrostatic contributions from the protein and waters are evaluated.
van der Waals Interactions in QM/MM Simulations
2.1 Introduction
The analysis of chemical events in complex systems requires a potential function that can describe electronic changes in the region of interest. Current quantum mechanical approaches provide such descriptions, but there are severe limitations in the size of the systems that can be treated[20, 21]. In order to investigate the effects of the environment on chemical events, approximations must be made with either an implicit or explicit approach. For biological systems, the inclusion of the protein and solvent environment is paramount[22], and combined quantum mechanical and molecular mechanical (QM/MM) methods[23, 24, 25] are a popular choice for such investigations. QM/MM methods enable theoretical computations of complex chemical events in large systems by partitioning the system into a quantum region and a molecular mechanics region. The application of the QM/MM method can provide atomistic details of catalytic mechanisms corresponding to experimental observables, which is valuable for both fundamental understanding of enzyme catalysis and realistic applications such as protein engineering.
The total Hamiltonian for the molecular system under consideration in the QM/MM framework is:
the molecular mechanical bonding term is retained between the boundary QM and MM atoms while the valency of the QM region is satisfied with the addition of link atoms[23] or frontier bonds[26, 27]. The purpose of the van der Waals term is to estimate dispersion attractions that fall off as $r^{-6}$ and to prevent molecular collapse by being strongly repulsive at short interaction distances. Typical QM/MM potentials approximate the van der Waals interactions with a Lennard-Jones (LJ) potential containing predetermined parameters for QM and MM atoms; as will be described below, the same approach is used here and “LJ” will be used to refer to these interactions. In terms of the magnitudes at typical interatomic distances, electrostatic interactions usually overwhelm the LJ contributions, especially in polar systems such as enzymes. Due to the rapid variation at short interatomic distances, however, the LJ interactions (and therefore the LJ parameters) are important to the equilibrium geometries of molecules treated by QM/MM in the gas phase [23, 16, 28, 29, 30].
In the condensed phase, on the other hand, properties are averaged over a large number of configurations, so how much the precision of the QM/MM van der Waals interactions matters for the molecular properties of common interest is an open question that has not been systematically explored in the past. Moreover, the LJ interaction between QM and MM atoms in the condensed phase may not only affect the enthalpy of direct interactions but also have a substantial entropic component, and the possibility of enthalpy/entropy compensation has not been analyzed in detail in previous work.
This chapter consists of two major components: first, the LJ parameter optimization/testing for the interaction of SCC-DFTB[17] (Self Consistent Charge Density Functional Tight Binding method) atoms (C, O, N, and H) with MM atoms, and second, the systematic analysis of the sensitivity of computed condensed phase observables to these parameters. The parameters are optimized using a set of hydrogen bonding complexes chosen to represent the interaction of selected amino acid side chains with water molecules; the transferability of the parameters is verified for interactions of water molecules with a different set of molecules with similar atomic environments. Then a comparison between the optimized parameters (set Opt) and two extreme sets (set A and set
(isoalloxazine) and the potential of mean force for an intramolecular proton transfer in enediolate is computed with SCC-DFTB/MM. For the redox potential of isoalloxazine, the interplay between enthalpic and entropic components is investigated.
While the comparison of computational with experimental results is paramount for realistic applications, the goal of this analysis is not to compare with experiment but rather to determine the variation of observables with QM-LJ parameters. In the following, we will describe the LJ parameter optimization and computational details in Section 2.2 and the results and discussion in Section 2.3. This work will be summarized with conclusions in Section 2.4.
2.2 Computational Methods
2.2.1 QM/MM Energy Evaluation
According to Eqn 2.1, the energy of QM/MM simulations is determined by combining the Hamiltonians of the quantum mechanical and molecular mechanical regions with a QM/MM coupling term composed of electrostatic, bonded, and van der Waals contributions:
$$U_{TOT} = \langle \Psi | \hat{H}^{QM} + \hat{H}^{QM/MM}_{elec} | \Psi \rangle + U^{QM/MM}_{vdW} + U^{QM/MM}_{bonded} + U^{MM} \qquad [2.3]$$
The QM approach used throughout this thesis is SCC-DFTB,[17] which is very efficient due mainly to approximations to the two-electron integrals. This method introduces the charge self-consistency at the level of Mulliken population and, accordingly, the QM atoms interact with the MM sites electrostatically through Mulliken partial charges[19],
$$\hat{H}^{QM/MM}_{elec} = \sum_{A \in MM} \sum_{B \in QM} \frac{Q_A \, \Delta q_B}{|\mathbf{R}_A - \mathbf{R}_B|} \qquad [2.4]$$
where $Q_A$ and $\Delta q_B$ are the MM partial charges and Mulliken partial charges, respectively. We note that although other definitions of charges in SCC-DFTB and SCC-DFTB/MM calculations can in principle be used instead of the simple Mulliken charges, important parameters in SCC-DFTB (e.g., repulsive potentials) were optimized within the Mulliken framework[17, 31]. Therefore,
using Mulliken charges in $U^{QM/MM}_{elec}$ is that the anisotropy of interactions tends to be underestimated.
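The point-charge form of the QM/MM electrostatic coupling in Eqn 2.4 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the SCC-DFTB/CHARMM implementation: it takes the Mulliken charges as fixed inputs (in the actual method they are determined self-consistently), it assumes the coupling is the plain pairwise Coulomb sum over MM sites and QM atoms, and it reports the energy in units of e²/Å with no unit conversion. The function name `qmmm_elec` and the input layout are hypothetical.

```python
import math


def qmmm_elec(mm_sites, qm_sites):
    """Pairwise Coulomb coupling between MM point charges and QM Mulliken charges.

    mm_sites: iterable of ((x, y, z), Q_A) with positions in Angstrom, charges in e
    qm_sites: iterable of ((x, y, z), dq_B) using the same units
    Returns the sum of Q_A * dq_B / |R_A - R_B| in e^2/Angstrom.
    """
    qm_sites = list(qm_sites)              # allow generators; reused in the inner loop
    energy = 0.0
    for pos_a, q_a in mm_sites:
        for pos_b, dq_b in qm_sites:
            r_ab = math.dist(pos_a, pos_b)  # Euclidean distance (Python 3.8+)
            energy += q_a * dq_b / r_ab
    return energy


if __name__ == "__main__":
    # One MM site and one QM atom, 2 Angstrom apart; charges are illustrative only.
    mm = [((0.0, 0.0, 0.0), -0.834)]
    qm = [((0.0, 0.0, 2.0), 0.4)]
    print(qmmm_elec(mm, qm))
```

Because each QM atom enters only through a single scalar charge, the sum carries no information about the shape of the electron density around that atom, which is one way to see why anisotropic interactions are hard to capture at this level.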
The van der Waals interactions are approximated by a Lennard-Jones (LJ) term with pre-determined parameters, summed over all pairs of QM and MM nuclei (Equation 2.5), where A and B are the indices for the QM and MM nuclei, respectively, and $R_{AB}$ is the distance between QM and MM nuclei. The LJ parameters are defined by the standard combination rules: $\epsilon_{AB} = (\epsilon_A \epsilon_B)^{1/2}$ and $\sigma_{AB} = \frac{1}{2}(\sigma_A + \sigma_B)$, where $\epsilon$ and $\sigma$ describe the well depth and atomic radius, respectively. Different QM methods require, in principle, different LJ parameters for optimal results[28, 16], and the optimization of these parameters for SCC-DFTB is the first goal in this investigation. This will be followed by a systematic study on the quantitative importance of the precision of QM/MM van der Waals interactions to the results.
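The combination rules can be illustrated with a short sketch. The pair energy below assumes a CHARMM-style 12-6 form, $\epsilon_{AB}[(\sigma_{AB}/R_{AB})^{12} - 2(\sigma_{AB}/R_{AB})^{6}]$, in which $\sigma_{AB}$ is the separation at the energy minimum; this functional form, the function names, and all numerical values are assumptions made for illustration and are not the optimized parameters of this work.

```python
import math

def combine(eps_a, sigma_a, eps_b, sigma_b):
    """Standard combination rules: geometric mean for the well depth,
    arithmetic mean for the radius."""
    return math.sqrt(eps_a * eps_b), 0.5 * (sigma_a + sigma_b)

def lj_pair_energy(eps_ab, sigma_ab, r_ab):
    """12-6 pair energy in the ASSUMED CHARMM-style form, where sigma_ab
    is the separation at the minimum and eps_ab is the well depth."""
    x6 = (sigma_ab / r_ab) ** 6
    return eps_ab * (x6 * x6 - 2.0 * x6)

def qmmm_vdw_energy(qm_atoms, mm_atoms):
    """Sum the pair energy over all QM/MM atom pairs.
    Each atom is a tuple (eps, sigma, (x, y, z)); values are hypothetical."""
    total = 0.0
    for eps_a, sig_a, pos_a in qm_atoms:
        for eps_b, sig_b, pos_b in mm_atoms:
            eps_ab, sig_ab = combine(eps_a, sig_a, eps_b, sig_b)
            total += lj_pair_energy(eps_ab, sig_ab, math.dist(pos_a, pos_b))
    return total

# Hypothetical single QM atom and single MM atom placed at the minimum distance
qm = [(0.1, 3.0, (0.0, 0.0, 0.0))]
mm = [(0.4, 4.0, (3.5, 0.0, 0.0))]
energy = qmmm_vdw_energy(qm, mm)
```

Under the assumed form, a pair separated by exactly $\sigma_{AB}$ contributes $-\epsilon_{AB}$, which makes the role of the two parameters easy to check by hand.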
2.2.2 Optimization of van der Waals Parameters
A training set of hydrogen-bonded complexes (see Figure 2.1) is used to optimize the LJ parameters for SCC-DFTB atoms in Equation 2.5. In addition to the training set, a set of hydrogen-bonded complexes not used in the parameter optimization (see Figure 2.2) is used to verify the transferability of the parameters. The complexes are chosen to capture amino acid interactions with water; a subset of the complexes was used previously by M. Freindorf and J. Gao in the optimization of LJ parameters for the HF/3-21G/MM potential [28].
For all the gas phase complexes (Figures 2.1 and 2.2), the water molecule is treated with MM using a modified TIP3P model, as implemented in CHARMM [33], while the organic molecule is treated with SCC-DFTB. To ensure the largest transferability, the LJ parameters are chosen to depend only on the element type rather than atom type. The exception is hydrogen, for which only the LJ parameters for polar hydrogen atoms are optimized; for non-polar hydrogen, the standard
defined as the inverse of a weighted sum of differences between values that are determined from the SCC-DFTB/MM and the reference calculations:
$\chi^{-1} = \sum_{i=1} w_i \left[ Y_i(\mathrm{reference}) - Y_i(\mathrm{SCC\text{-}DFTB/MM}) \right]^2$ [2.6]
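A minimal sketch of such a fitness evaluation is given below, assuming that the fitness is the reciprocal of the weighted sum of squared deviations. The function name and the observable values are hypothetical; the weights of 3.0, 5.0, and 1.0 are those quoted later in this section for energies, angles, and distances.

```python
def fitness(reference, model, weights):
    """Inverse of the weighted sum of squared deviations between the
    reference and model observables (larger is better)."""
    penalty = sum(w * (ref - mod) ** 2
                  for ref, mod, w in zip(reference, model, weights))
    # A perfect match has zero penalty; report it as infinite fitness.
    return float("inf") if penalty == 0.0 else 1.0 / penalty

# Hypothetical observables: interaction energy (kcal/mol),
# hydrogen bond angle (degrees), hydrogen bond distance (Angstrom)
reference_values = [-5.0, 170.0, 1.9]
model_values = [-4.0, 168.0, 2.0]
weights = [3.0, 5.0, 1.0]
score = fitness(reference_values, model_values, weights)
```

Because the deviations are squared before weighting, a property with a naturally small numerical spread needs a larger weight to influence the ranking of candidate parameter sets.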
The reference values (Tables 2.1 and 2.2) are determined from B3LYP[35, 36, 37] calculations using B3LYP/6-31+G** for geometry optimizations and B3LYP/6-311++G**[38, 39] for single point energy evaluations for each complex. The reliability of density functional theories for describing hydrogen-bonded complexes has been discussed rather extensively in the literature [40, 41]. Although it is well known that local DFT methods tend to overestimate intermolecular interactions, calculations using non-local functionals (especially hybrid functionals) were found to be reliable compared to high-level ab initio methods. For example, an early study by Salahub and co-workers [40] on prototypical hydrogen-bonding complexes such as the water dimer and the formamide-water cluster found that DFT calculations using several popular non-local functionals agreed well with MP2 and experimental results. More recently, Jorgensen and co-workers [41] have compared the performance of the popular B3LYP method, MP2, and various high-level extrapolation-based methods (CBS) [42] for more than fifty hydrogen-bonding complexes involving organic molecules and water. It was also found that the performance of B3LYP was quite impressive compared to more sophisticated calculations, provided that basis sets larger than 6-31+G** are used. Therefore, although it is crucial to remember that the dispersion interaction is missing from popular implementations of DFT and could be important in many cases of biological significance (e.g., nucleic acid base stacking [43]), one should note that B3LYP with a decent basis set is a fairly robust approach for treating typical hydrogen-bonding interactions (~10 kcal/mol).
Figure 2.1 Training Set Molecules. These complexes were chosen to represent typical amino acid side chain interactions with water. This set was used for the optimization of the van der Waals parameters for SCC-DFTB atoms as required for SCC-DFTB/MM calculations, treating the organic molecule with SCC-DFTB and the water with TIP3P. The reference stationary point geometries presented here were optimized at the B3LYP/6-31+G** level. Distances are in Angstroms and angles are in degrees.
Figure 2.2 Complexes Not Included in the Optimization of the van der Waals Parameters for SCC-DFTB Atoms. The stationary point geometries presented here were optimized at the B3LYP/6-31+G** level. These molecules were chosen to represent similar hydrogen bonding interactions in complexes different from those of the training set, in order to test the transferability of the van der Waals parameters, treating the organic molecule with SCC-DFTB and the water with TIP3P. Distances are in Angstroms and angles are in degrees.
Table 2.2 Summary of interaction energies and bond lengths: test set
In the genetic algorithm, weighting ($w_i$ in Equation 2.6) is used to adjust the sensitivity of selected properties to the optimization; for this study, weights of 3.0, 5.0, and 1.0 are used for the interaction energies, hydrogen bond angles, and distances, respectively. The different weights are chosen so that the interaction energy evaluation is the most important, followed by the gradient calculations (where the gradient about the angle needs a higher weight to compensate for its smaller value in units of kcal/mol-degree). Similar results are obtained for minimizations with different weights. During the GA optimization, consistent LJ values are found with various GA settings; the values reported here are obtained with a micro-GA technique with a population of 5 chromosomes that is allowed to operate for 400 generations with uniform crossovers; see [44] for detailed descriptions and recommendations for the GA options.
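The micro-GA settings quoted above (5 chromosomes, 400 generations, uniform crossover) can be sketched as follows. This is a simplified illustration under stated assumptions and not the implementation of [44]: it uses real-valued genes rather than a binary encoding, binary tournament selection, elitism, no mutation, and a random restart once the population has converged. The toy fitness function and its target values are hypothetical.

```python
import random

def converged(population, bounds, tol=1e-3):
    """True when every gene spans less than tol of its allowed range."""
    for i, (lo, hi) in enumerate(bounds):
        genes = [ind[i] for ind in population]
        if max(genes) - min(genes) > tol * (hi - lo):
            return False
    return True

def micro_ga(fitness, bounds, pop_size=5, generations=400, seed=0):
    """Simplified micro-GA: tiny population, elitism, uniform crossover,
    no mutation, random restart after convergence (assumed design)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            # uniform crossover: each gene comes from either parent
            children.append([g1 if rng.random() < 0.5 else g2
                             for g1, g2 in zip(p1, p2)])
        population = children
        if converged(population, bounds):
            # restart: keep the elite, refill the rest at random
            population = [best] + [random_individual()
                                   for _ in range(pop_size - 1)]
    return max(population, key=fitness)

def toy_fitness(ind):
    """Hypothetical target (eps, sigma) = (0.1, 3.5); larger is better."""
    return 1.0 / (1e-12 + (ind[0] - 0.1) ** 2 + (ind[1] - 3.5) ** 2)

toy_bounds = [(0.0, 0.5), (2.0, 5.0)]
best_parameters = micro_ga(toy_fitness, toy_bounds, seed=1)
```

Because the elite individual is carried into every new generation, the best fitness never decreases; the restarts supply the diversity that mutation would provide in a conventional GA.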
2.2.3 Gas Phase Comparisons
Hydrogen bond distances and energies are compared between full B3LYP, full SCC-DFTB, and SCC-DFTB/MM with optimized LJ parameters. In addition, SCC-DFTB/MM calculations are also carried out with two “extreme” sets, one with “maximal” LJ interaction parameters (set A) and one with “minimal” LJ interaction parameters (set B). The values for sets A and B are taken directly from the CHARMM22[33] force field, selected in such a way that set A contains the smallest $\sigma_{AB}$ and largest $\epsilon_{AB}$ to promote closer packing, while set B contains the largest $\sigma_{AB}$ and smallest $\epsilon_{AB}$, thereby pushing out the MM solvent and providing a molecular surface that tends to be more “slippery”. The gas phase comparisons are made for both the training set (Figure 2.1) and the set of hydrogen-bonded complexes not used in the parameter optimization (Figure 2.2).
2.2.4 Condensed Phase Observables
In order to determine the effect of LJ parameters on condensed phase observables, a series of free energy and potential of mean force (PMF) calculations are carried out using the SCC-DFTB/MM approach described above with each of the three sets of LJ parameters. The free energy and PMF calculations are chosen because they are the typical type of simulations that characterize the energetics of chemical events in the condensed phase. The free energy is determined for the process of reducing the isoalloxazine ring of flavin adenine dinucleotide (FAD, see Figure 2.3) in solution using a thermodynamic integration scheme; this is compared to the gas phase potential energy difference to determine the energetic effect of solvation. The PMF is determined for the proton transfer process in a model enediolate in solution (see Figure 2.4).
2.2.4.1 Reduction Potential of a Model FAD
Figure 2.3 Schematic of FAD reduction
All free energy calculations for the model FAD reduction are carried out with CPT[45] molecular dynamics under periodic boundary conditions. An FAD molecule is solvated in a 38 Å water box with 1769 water molecules. The nonbonded interactions are determined on an atom-atom pair basis with pair interactions terminated at 12 Å, and reduced to zero from 8-12 Å using a shift function with a dielectric constant of one. Bonds involving hydrogen are constrained using the SHAKE algorithm[46]; the system is heated to 285, 300, or 325 K (see below) and equilibrated for 50 ps. The free energy derivatives are calculated with a thermodynamic integration method[47] until convergence (usually about 200 ps) is achieved for three windows ($\lambda$ = 0, 0.5, and 1.0) converting the system from FAD to FAD$^-$. Three windows are expected to be sufficient for the free energy calculation due to the expected linear response of the solvent[48, 47]. The free energy derivative is then integrated to determine the free energy change.
Using a modified QM/MM potential, thermodynamic integration calculations for the reduction of FAD are carried out using the dual topology single coordinate (DTSC) approach; the DTSC method has been recently described in detail [47] and will only be briefly summarized here. This approach combines the ideas of the dual-topology and single-topology schemes into an approach that describes the system with two electronic Hamiltonians and one set of coordinates at each time step. A series of simulations convert the potential energy of the system from $\lambda = 0$ (A = FAD) to $\lambda = 1$ (B = FAD$^-$), where the QM/MM potential of the system is defined as:
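$U(X_A, X_B, X_C) = (1-\lambda) \langle \Phi_A | \hat{H}_{AA}(X_A) + \hat{H}_{AC}(X_A, X_C) | \Phi_A \rangle + \lambda \langle \Phi_B | \hat{H}_{BB}(X_B) + \hat{H}_{BC}(X_B, X_C) | \Phi_B \rangle + U_{CC}(X_C)$ [2.7]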
where A, B, and C correspond to FAD, FAD$^-$, and TIP3P water, respectively. The QM/MM interaction energies do not depend on $\lambda$, and the free energy derivative is determined from Equation 2.7. If the same set of LJ parameters is used for the interaction between the MM and the two QM states, only the electrostatic term contributes explicitly to the free energy derivative:
$\frac{\partial G(X; \lambda)}{\partial \lambda} = \left\langle \frac{\partial U(\lambda)}{\partial \lambda} \right\rangle_{\lambda} = \left\langle U^{FAD^-/MM}_{elec} - U^{FAD/MM}_{elec} \right\rangle_{\lambda}$
where $U^{FAD/MM}_{elec}$ ($U^{FAD^-/MM}_{elec}$) is the total electronic energy of the model FAD (FAD$^-$) including the electrostatic interaction with the MM solvent. The value of $\lambda$ defines the solute-solvent interaction for which the difference between the potential energies for both end states is averaged. The free energy of reduction is then calculated from the free energy derivative by determining the functional dependence on $\lambda$ and integrating from $\lambda = 0.0$ to $\lambda = 1.0$. As emphasized in previous work [47, 49], although a single set of coordinates is used for both electronic states at each time step, the method is formally exact because the free energy is a state function. The error in practical simulations arises from the fact that bonds involving hydrogen are usually constrained to be the same length in both end states; this error is usually of negligible magnitude.
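The integration over the three windows can be sketched numerically. The example below swaps in a simple trapezoidal rule, which is exact when the free energy derivative varies linearly with $\lambda$ (the linear-response expectation mentioned above); the function name and the derivative values are hypothetical and are not results of this work.

```python
def integrate_ti(lambdas, derivatives):
    """Trapezoidal integration of the free energy derivative over lambda.
    Exact if the derivative is linear between neighbouring windows."""
    points = list(zip(lambdas, derivatives))
    total = 0.0
    for (l0, d0), (l1, d1) in zip(points, points[1:]):
        total += 0.5 * (d0 + d1) * (l1 - l0)
    return total

# Three windows, as in the text; derivative values are hypothetical (kcal/mol)
lambdas = [0.0, 0.5, 1.0]
dG_dlambda = [-40.0, -70.0, -100.0]
delta_G = integrate_ti(lambdas, dG_dlambda)
```

If the midpoint value deviates noticeably from the average of the two end-point values, the linear-response assumption is questionable and additional windows would be a natural check.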
In addition to determining the reduction potential, entropic and enthalpic contributions to the reduction potential are calculated based on a finite difference approach assuming a linear dependence of the free energy change on temperature (285, 300, and 325 K). The entropic term is determined from the negative slope of the linear fit, as represented by the thermodynamic relationship,
$\Delta S = -\frac{\partial \Delta G}{\partial T}$
and the enthalpic term is determined from the intercept of the linear fit of the temperature dependence of the free energy change. While being computationally expensive, the finite difference approach provides relatively accurate estimates of thermodynamic quantities, although it is clearly limited by the accuracy of the method used to determine the free energy; methods for determining the enthalpy and entropy changes have been surveyed in a recent article[50]. Once again, although the accuracy of the SCC-DFTB/MM approach is an important issue when comparing with experiment[49, 47], this study aims to determine the relative variation of observables between LJ parameter sets, and changes in entropic and enthalpic contributions allow for a more quantitative look at the free energy change.
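The finite difference analysis amounts to a linear fit of $\Delta G$ against $T$, with $\Delta G = \Delta H - T \Delta S$ so that the slope gives $-\Delta S$ and the intercept gives $\Delta H$. A minimal sketch is shown below; the function names are hypothetical and the free energy values are synthetic numbers chosen for illustration, not results of this work.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def entropy_enthalpy(temperatures, delta_G):
    """Delta S from the negative slope and Delta H from the intercept,
    assuming Delta H and Delta S are constant over the temperature range."""
    slope, intercept = linear_fit(temperatures, delta_G)
    return -slope, intercept

# Temperatures from the text; free energies are synthetic (kcal/mol),
# generated from Delta H = -80.0 and Delta S = -0.02 kcal/(mol K)
temperatures = [285.0, 300.0, 325.0]
free_energies = [-74.3, -74.0, -73.5]
delta_S, delta_H = entropy_enthalpy(temperatures, free_energies)
```

With only three temperatures spanning 40 K, the intercept is an extrapolation far outside the sampled range, so small statistical errors in $\Delta G$ translate into much larger uncertainties in $\Delta H$ and $T \Delta S$.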
2.2.4.2 Potential of Mean Force Calculations
Figure 2.4 Schematic of enediolate intramolecular proton transfer
To illustrate the effect of the LJ parameters on QM/MM calculations of chemical reactions, the potential of mean force (PMF) for the proton transfer in a model enediolate molecule in water is studied. As discussed in previous work, this model is motivated by a possible step in the catalytic cycle of triosephosphate isomerase [51]. As a first approximation, adiabatic maps and PMF profiles are compared for the three sets of LJ parameters in the presence of water in a 30 Å box with periodic boundary conditions. The adiabatic map consists of energy minimizations along the reaction pathway, as defined by the difference between the O$_1$-H and H-O$_2$ distances as the proton shifts from one oxygen to the other (Figure 2.4). Then, for each set of LJ parameters, PMF calculations are carried out with a set of SCC-DFTB/MM CPT molecular dynamics simulations with umbrella sampling [52] along the same proton transfer reaction coordinate ($\delta = r_{O_1-H} - r_{H-O_2}$)
$W(\delta) = -k_B T \ln \rho(\delta)$
where $W(\delta)$ depends on temperature and the probability distribution ($\rho(\delta)$) of observing the system along the reaction coordinate. To prepare the system, the unconstrained enediolate system is heated and equilibrated for 50 ps, and then a series of 50 ps simulations with harmonic potentials applied to restrain the transfer proton at different points along the reaction coordinate are carried out. The actual distances between the enediolate oxygens and the transfer proton are recorded at each step for all windows, and the weighted histogram analysis method (WHAM)[53] is used to determine the PMF along the approximate reaction coordinate, $\delta$.
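The relation between the PMF and the probability distribution, $W(\delta) = -k_B T \ln \rho(\delta)$ up to an additive constant, can be sketched for a single unbiased histogram. The example below does not implement WHAM, which is needed to combine the biased umbrella sampling windows; the function name and the bin counts are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def pmf_from_histogram(counts, temperature=300.0):
    """W = -kT ln(rho) for an unbiased histogram along the reaction
    coordinate, shifted so that the lowest value is zero.
    Empty bins are assigned an infinite free energy."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram contains no samples")
    kT = K_B * temperature
    w = [-kT * math.log(c / total) if c > 0 else math.inf for c in counts]
    w_min = min(w)
    return [wi - w_min for wi in w]

# Hypothetical counts: two equally populated wells separated by a barrier bin
counts = [100, 10, 100]
profile = pmf_from_histogram(counts, temperature=300.0)
```

A tenfold drop in population corresponds to $k_B T \ln 10$, roughly 1.4 kcal/mol at 300 K, which is why barriers of more than a few kcal/mol are essentially never sampled without restraining potentials.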
2.3 Results and Discussion
2.3.1 Gas Phase Comparisons
2.3.1.1 Hydrogen Bonding Complexes
Overall, the optimized parameters for SCC-DFTB atoms are found to increase the accuracy of gas phase hydrogen bond lengths and interaction energies when they are compared to the two extreme parameter sets (A and B) taken from the CHARMM27 force field; see Table 2.3 for parameters. The optimized parameters are found to be transferable when applied to the set of molecules not included in the training set (Figure 2.2).
More specifically, the optimized parameters recognize more of the reference stationary points than either of the other two parameter sets (Figure 2.5). For the training set of complexes, the RMS error in the hydrogen bond distance (interaction energy) is 0.08 (1.2), 0.10 (1.2), and 0.40 (3.4) Å (kcal/mol) for set Opt, set A, and set B, respectively (Table 2.4). Similar results are found when the parameters are applied to the set of molecules not included in the optimization set. All RMS calculations only include complexes with SCC-DFTB/MM stationary points. Parameter set Opt fails to locate stationary points corresponding to molecules 6 and 23 (Figure 2.1 and Figure 2.2); parameter set A fails to locate stationary points for molecules 6, 21, 23, and 33; parameter set B fails to locate stationary points for molecules 6, 11, 23, and 33. For the three sets, molecule 6 shifts the in-plane water to a configuration similar to molecule 5, but with the water at a further distance and rotated into the same plane as the carbocation containing the lysine analog (Figure 2.1). For all three sets, the starting geometry of molecule 23 minimizes to molecule 24 (Figure 2.2). For set A, molecule 21 minimizes to molecule 19; for set B, molecule 11 minimizes to molecule 10.
Table 2.3 LJ parameter sets for SCC-DFTB atoms in SCC-DFTB/MM calculations
Atom  Set Opt$^a$  Set Opt$^b$  Set A$^a$  Set A$^b$  Set B$^a$  Set B$^b$
SCC-DFTB/MM with optimized vdW parameters is found to yield more accurate hydrogen bond interaction energies but less accurate geometries than full SCC-DFTB in comparison with the reference results from B3LYP calculations (Tables 2.1 and 2.2). Full SCC-DFTB has difficulty describing the hydrogen bond interaction with N as the acceptor; the bond lengths in molecules 2, 11, 12, and 24 are all about 0.1 Å too long. The full SCC-DFTB hydrogen bond interaction energies are underestimated by about 2-3 kcal/mol, yielding an RMS error of 3.0 kcal/mol for the training set, while the geometries are the most accurate compared to the reference geometries, yielding an RMS error of 0.07 Å for the hydrogen bond lengths (Table 2.4).
While interaction energies and bond lengths improve for the optimized set of LJ parameters, the angles do not (which most likely contributes to the inability to recognize some stationary points), and still remain a limitation for the SCC-DFTB/MM approach due to the coupling between the quantum and molecular mechanics regions. As mentioned above (Equation 2.4), SCC-DFTB/MM uses the Mulliken partial charges to determine the electrostatic interaction between the QM and MM atoms[19], and thus ignores the shape of the QM molecular orbitals. As a result, the anisotropy of the QM/MM interactions is underestimated with the current implementation. An improved SCC-DFTB/MM approach with an orbital description (e.g., using the analytical one-electron operator for the interaction between the QM electrons and MM partial charges as in most ab initio or DFT based QM/MM implementations [24, 54]) may improve bond angles for the gas phase, but this may prove to be of limited practical value when applied to condensed phase systems. Future work may investigate this issue further.
Figure 2.5 Interaction Energies For the Full SCC-DFTB and SCC-DFTB/MM with different van der Waals parameters (Set Opt, A, and B) Plotted Against the B3LYP results. The direct comparison between B3LYP/6-311++G** interaction energies (kcal/mol), plotted on the x-axis, and the (a) Full SCC-DFTB, (b) SCC-DFTB/MM set Opt, (c) SCC-DFTB/MM set A, and (d) SCC-DFTB/MM set B interaction energies. Those values for the SCC-DFTB/MM interaction energy equal to zero correspond to the stationary points not observed (see text).
Table 2.4 RMS errors for interaction energy and hydrogen bond length of SCC-DFTB and SCC-DFTB/MM compared to B3LYP$^a$ reference values determined for the set of complexes used for optimizing (testing) the van der Waals parameters
2.3.1.2 FAD Water Cluster
Potential energy difference calculations for a small FAD-water cluster are carried out with full SCC-DFTB and SCC-DFTB/MM (Set Opt, Set A, and Set B) to compare the case in which the interaction of water molecules with isoalloxazine is treated at the full SCC-DFTB level with the SCC-DFTB/MM treatment, where the isoalloxazine is treated with SCC-DFTB and the waters are