
Doctoral dissertation: Computational studies to understand molecular regulation of the TRPC6 calcium channel, the mechanism of purine biosynthesis, and the folding of azobenzene oligomers


DOCUMENT INFORMATION

Basic information

Title Computational studies to understand molecular regulation of the TRPC6 calcium channel, the mechanism of purine biosynthesis, and the folding of azobenzene oligomers
Author Peng Tao
Advisors Christopher M. Hadad (Advisor), Russell M. Pitzer, James V. Coe
School The Ohio State University
Field Chemistry
Type Dissertation
Year of publication 2007
City Ann Arbor
Number of pages 500
File size 14.96 MB

Structure

  • 1.1 TRPC ion channels
  • 1.2 N5-CAIR mutase mechanism in purine biosynthesis pathway
  • 1.3 Foldamers
  • 1.4 Regioselective ring-opening of epoxides in 2,3-anhydrosugars
  • 1.5 References for Chapter 1
  • 2.1 Density functional theory
  • 2.2 Monte Carlo simulations
  • 2.3 Molecular dynamics simulations
  • 2.4 Quantum molecular dynamics
  • 2.5 References for Chapter 2
  • 3.1 Introduction
  • 3.2 Computational methods
    • 3.2.1 TRPC6 peptide docking
    • 3.2.2 Molecular dynamics simulation of FKBP12-TRPC6 peptide
    • 3.2.3 MM/GB-SA free energy of binding calculations
  • 3.3 Results and discussion
    • 3.3.1 TRPC6 peptide docking
    • 3.3.2 RMSD fluctuation of complexes
    • 3.3.3 Residue fluctuation in each simulation
    • 3.3.4 Binding free energy calculations by MM-GBSA based on Single-Trajectory
    • 3.3.5 Free energy of binding calculations of FKBP12 with
    • 3.3.6 Free energy of binding calculations of Ser768Asp and
    • 3.3.7 Entropic contribution of each complex based on Single-
    • 3.3.8 Free energy of binding calculations of WT complex based
    • 3.3.9 Free energy of binding calculations of mutant complexes
    • 3.3.10 Decomposition analysis of the free energy of binding
    • 3.3.11 Decomposition analysis of the free energy of binding
    • 3.3.12 Spatial distribution of residues with major contributions to
    • 3.3.13 Interactions among Lys44, Lys47 and Residue768 75
    • 3.3.14 Comparison of RMSD fluctuation between bound and
    • 3.3.15 Comparison of RMSD fluctuation between bound and
  • 3.4 Conclusions
  • 3.5 References for Chapter 3
  • 4.1 Introduction
  • 4.2 Computational Methods
  • 4.3 Results and Discussions
    • 4.3.1 Simplified Models
    • 4.3.2 Thermochemistry for transformation of AMI to AMIC
    • 4.3.3 From MICA to AMIC through cationic or anionic pathways: a brief glance
    • 4.3.4 Exploring the PES of carboxyl unit migration
    • 4.3.5 Complete mechanism involved with enzyme active site
    • 4.3.6 Verification of the Simplified Model
  • 4.4 Conclusions
  • 4.5 References for Chapter 4
  • 5.1 Introduction
  • 5.2 Computational Methods
  • 5.3 Results
    • 5.3.1 Model structures
    • 5.3.2 Optimization in the gas phase
    • 5.3.3 Aqueous phase calculation with the PCM method
    • 5.3.4 PCM calculations in chlorobenzene
    • 5.3.5 Transition states search related to previous calculations
  • 5.4 Discussion
    • 5.4.1 Thermochemistry of N5-CAIR and CAIR
    • 5.4.2 Transition States of Carboxyl Group Migrations
  • 5.5 Conclusions
  • 5.6 References for Chapter 5
  • 6.1 Introduction
  • 6.2 Computational Methods
  • 6.3 Results
    • 6.3.1 Analogs with either R1 or R2 as a methyl group
    • 6.3.2 Analogs with either R1 or R2 as a trifluoromethyl group
    • 6.3.3 Analogs with either R1 or R2 as a cyano group
    • 6.3.4 Analogs with either R1 or R2 as a carboxylic ester group
    • 6.3.5 Analogs with either R1 or R2 as a nitro group
  • 6.4 Conclusions
  • 6.5 References for Chapter 6
  • 7.1 Introduction
  • 7.2 Computational methods
  • 7.3 Results and Discussion
    • 7.3.1 Monte Carlo Conformational Search of 7.1
    • 7.3.2 Monte Carlo Conformational Search of 7.2
    • 7.3.3 Molecular Dynamics of 7.1
    • 7.3.4 Molecular Dynamics of 7.2
    • 7.3.5 Replica Exchange Molecular Dynamics Simulation (REMD) of 7.1
    • 7.3.6 Replica Exchange Molecular Dynamics Simulation (REMD) of 7.2
  • 7.4 Conclusions
  • 7.5 References for Chapter 7
  • 8.1 Introduction
  • 8.2 Methods
  • 8.3 Results and Discussion
    • 8.3.1 Replica Exchange Diagnostics
    • 8.3.2 REMD simulations of 7.1
    • 8.3.3 REMD simulations of 7.2
    • 8.3.4 REMD simulations of 8.1
    • 8.3.5 REMD simulations of 8.2
  • 8.4 Conclusions
  • 8.5 References for Chapter 8
  • 9.1 Introduction
  • 9.2 Methods
    • 9.2.1 Epoxide ring-opening reactions
    • 9.2.2 Molecular Dynamics simulation of sparteine, sugar ring
    • 9.2.3 CPMD simulation method
  • 9.3 Results and Discussion
    • 9.3.1 Gas-phase and PCM potential energy surfaces
    • 9.3.2 Molecular Dynamics simulations
      • 9.3.2.1 Molecular Dynamics Simulation for MD9.6 (C5=O-)
      • 9.3.2.2 Molecular Dynamics Simulation for MD9.7 (C5=OH)
      • 9.3.2.3 Molecular Dynamics Simulation for MD9.8 (C5=NH2)
      • 9.3.2.4 Molecular Dynamics Simulation for the System
      • 9.3.2.5 Molecular Dynamics Simulation for the System
    • 9.3.3 ab initio Molecular Dynamics Simulation Results
  • 9.4 Conclusions
  • 9.5 References for Chapter 9
  • 3.2 Binding free energy components of FKBP12 and wild peptides
  • 3.3 Calculated entropic change (Delta) by NMODE
  • 4.1 Thermochemistry (kcal/mol) of model reactions for AMI to MICA and
  • 4.2 Energy comparison of different products generated via methylation,
  • 4.3 Transition-state calculation results for different protonation states of
  • 5.1 DFT Computational results in the gas phase
  • 5.2 DFT single-point energetic results using the PCM model for water (ε=78.39) at the B3LYP/6-31+G(d) level of theory (298 K, kcal/mol)
  • 5.3 DFT single-point energetic results using the PCM model for
  • 5.4 Transition states for carboxylate group migration using the full model
  • 6.2 Substituent calculation with R1 = H, R2 = CH3
  • 6.3 Substituent calculation with R1 = CF3, R2 = H
  • 6.4 Substituent calculation with R1 = H, R2 = CF3
  • 6.5 Substituent calculation with R1 = CN, R2 = H
  • 6.6 Substituent calculation with R1 = H, R2 = CN
  • 6.7 Substituent calculation with R1 = CO2CH3, R2 = H
  • 6.8 Substituent calculation with R1 = H, R2 = CO2CH3
  • 6.9 Substituent calculation with R1 = NO2, R2 = H
  • 6.10 Substituent calculation with R1 = H, R2 = NO2
  • 7.1 First 20 Conformers in Monte Carlo Simulations of
  • 9.1 Transition-state Preferences for C3 vs. C2 ring-opening of the epoxide
  • 9.2 Transition-state Preferences for C3 vs. C2 ring-opening of the epoxide
  • 9.3 MD simulation details
  • A.1 MD simulation details
  • B.1 Energies for model reactions B3LYP/6-311+G(3df,2p)//B3LYP/6-31+G(d)
  • B.3 Energies for carboxylate group addition products on different sites of AMI at B3LYP/6-31+G(d) level of theory
  • B.4 Energies for protonation products on different sites of AMI at B3LYP/6-31+G(d) level of theory
  • B.5 Energies for deprotonation products on different sites of AMI at B3LYP/6-31+G(d) level of theory
  • B.6 Energies for histidine model structures at B3LYP/6-31+G(d) level of
  • B.7 Energies for structures along carboxylate group migration through
  • B.8 Energies for structures along carboxylate group migration through
  • B.9 Energies for structures along carboxylate group migration through
  • B.10 Energies for structures along carboxylate group migration through
  • B.11 Energies for structures along carboxylate group migration through
  • B.12 Energies for structures along carboxylate group migration through
  • B.13 Energies for structures along carboxylate group migration through

Content


TRPC ion channels

Transient receptor potential-canonical (TRPC) channels belong to the mammalian transient receptor potential (TRP) channel superfamily, which consists of cation channels homologous to the Drosophila TRP and TRPL channels. Based on sequence homology, the 28 members of the mammalian TRP family can be categorized into seven subfamilies. The TRPC subfamily, with seven members (TRPC1-C7), is the most closely related to the original Drosophila TRP and TRPL channels.

Based on sequence similarities, the seven members of the TRPC subfamily can be further subdivided into four groups: (a) TRPC1; (b) …

The members of the TRPC subfamily (TRPC1 through TRPC7) share similar structural characteristics: each is a polypeptide that spans the membrane six times, with the N- and C-termini located in the cytoplasm. A notable proline-rich motif, found downstream of the last transmembrane domain, plays a crucial role in the regulation of these ion channels. Recent studies have revealed that the immunophilins FKBP52 and FKBP12 interact with TRPCs via this proline-rich domain: TRPC1, TRPC4, and TRPC5 specifically bind FKBP52, while TRPC3, TRPC6, and TRPC7 associate with FKBP12.

The regulation of TRPC6 channels involves a complex interplay of phosphorylation events: protein kinase C (PKC) regulates the channels negatively, while CaM kinase II and the nonreceptor tyrosine kinase Fyn regulate them positively. Shi et al. showed that calcium entry induced by the diacylglycerol (DAG) analogue OAG is diminished by a calmodulin antagonist, which inhibits CaM kinase II, as well as by the non-hydrolyzable ATP analogue AMP-PNP. Additionally, Hisatsune et al. demonstrated that TRPC6 can be phosphorylated by Fyn in response to epidermal growth factor (EGF).

Immunophilins are peptidyl prolyl cis-trans isomerases that recognize specific XP dipeptides in proteins and act as receptors for immunosuppressant drugs. They bind tightly to immunosuppressants such as cyclosporin A, FK506, and rapamycin. Cyclosporin A specifically binds to members of the cyclophilin family, while FK506 and rapamycin bind to FK506 binding proteins (FKBPs). FKBPs are associated with key intracellular calcium channels, including the ryanodine receptor and the inositol 1,4,5-trisphosphate receptor (IP3R), where FKBP12 plays a crucial role: dissociation of FKBP12 from these channels disrupts calcium flux. Additionally, the Drosophila immunophilin FKBP59 has been shown to bind Drosophila TRP channels through a conserved proline-containing dipeptide sequence in the C-terminal region, a sequence that is also conserved across all TRPCs.

Recent findings by Schilling's group indicate that TRPC1, TRPC4, and TRPC5 interact with the immunophilin FKBP52, while TRPC3, TRPC6, and TRPC7 associate with FKBP12. Disruption of FKBP12 binding delays cation entry through TRPC channels, which highlights the importance of immunophilin binding in TRPC channel activation. Kim and Saffen demonstrated that FKBP12 is part of a TRPC6-centered protein complex that forms quickly after activation of the endogenous M1 muscarinic acetylcholine receptor (M1 mAChR) in neuronal PC12D cells. The FKBP12 binding site on TRPC6 is depicted in Figure 1.1. Given FKBP12's crucial role in regulating the TRPC6 channel, computational modeling was employed to elucidate the binding of TRPC6 to FKBP12 and the influence of phosphorylation on this interaction; the details are discussed in Chapter 3.

Figure 1.1 Structure of TRPC6 Channel and FKBP12 Binding Site

1.2 N5-CAIR Mutase Mechanism in the Purine Biosynthesis Pathway

Purine is a heterocyclic aromatic organic compound whose structure consists of a pyrimidine ring fused to an imidazole ring. Purine and its derivatives, collectively known as purines, play a crucial role in biological systems; adenine and guanine, in particular, are two of the key bases found in nucleic acids.

Figure 1.2 Structures of purine, adenine and guanine

Biological systems can synthesize purines both de novo and from nucleic acid degradation products. During de novo synthesis, the atoms of the purine ring come from different precursors: N1 comes from the amine group of aspartate, C2 and C8 are derived from formate, N3 and N9 originate from the amide group of glutamine, C4, C5, and N7 are obtained from glycine, and C6 comes from bicarbonate or carbon dioxide.

Figure 1.3 Biosynthetic sources of purine ring atoms (ref. 18)

The de novo biosynthesis of purine nucleotides involves approximately fifteen enzymes, many of which are multifunctional proteins in eukaryotes. Imbalances in purine regulation can lead to various disorders, including Down's syndrome, Lesch-Nyhan syndrome, gout, and urinary stones. Additionally, elevated levels of de novo purine enzymes are found in several types of cancer cells, which makes these enzymes potential targets for chemotherapeutic drug discovery.

In the 1950s and 1960s, Buchanan and colleagues extensively characterized the enzymatic reactions and intermediates of the de novo purine biosynthetic pathway. More recently, new enzyme activities have been identified in this pathway, including PurK, which takes part in the formation of CAIR.

Enzymes that catalyze the carboxylation of 5-aminoimidazole ribonucleotide (AIR) have been studied in various prokaryotic and eukaryotic organisms. One such enzyme, known as AIR carboxylase, was believed to convert AIR and bicarbonate or CO2 into 4-carboxy-5-aminoimidazole ribonucleotide (CAIR). In E. coli, however, two distinct gene products, PurE and PurK, are essential for this reaction. Research by Stubbe and colleagues revealed that PurE and PurK have independent catalytic functions. Specifically, PurK converts AIR, in the presence of HCO3- and ATP, into N5-carboxyaminoimidazole ribonucleotide (N5-CAIR), generating ADP and Pi as byproducts.

PurE catalyzes the reversible interconversion of N5-CAIR and CAIR. Both chicken and human PurE exist as part of a bifunctional protein that also synthesizes phosphoribosylaminoimidazole-succinocarboxamide (SAICAR). Notably, genetic homologues of PurK are absent in these organisms, which indicates that PurK is not required for their purine biosynthetic pathways.

The differences in purine biosynthesis between prokaryotes and eukaryotes present an opportunity for drug design, because inhibitors can be developed that specifically target the PurK or PurE enzymes of prokaryotes. To advance rational structure-based drug design, it is essential to clarify the reaction mechanisms of these enzymes. The crystal structure of E. coli PurE is available from the Protein Data Bank (PDB) and provides a valuable resource for further study. Although the complete reaction mechanism of PurE has not yet been established experimentally, its three-dimensional structure and enzymatic function make it a promising subject for computational chemistry research. Chapters 4 to 6 explore various potential mechanisms of action for PurE using computational techniques.


Foldamers

Biological macromolecules show great structural and functional variety. Proteins are essential macromolecules in biological systems, alongside ribonucleic acids and polysaccharides. Composed of unique amino acid sequences, proteins adopt a wide range of tertiary structures, which are built from regular elements of secondary structure such as α-helices, β-turns, and β-strands. These secondary structures serve as fundamental building blocks that allow proteins to fold into specific, well-defined tertiary structures. Protein folding is a dynamic process in which the protein passes through numerous transient intermediates and molten globule states on the way from high-energy unfolded states to its most thermodynamically stable form.

Proteins, as products of biological evolution, serve a diverse range of functions within biological systems. Despite the vast diversity of organic and inorganic chemistry, proteins remain the principal class of polymers with well-defined folded structures. Significant efforts are therefore underway to discover or design unnatural folded polymers that can rival proteins. Current research focuses on creating unnatural backbones that exhibit well-defined secondary structures. Such polymers, which have a strong tendency to adopt specific compact conformations, are called foldamers. Among these, foldamers that form helical structures have attracted particular interest because of their stability and their relevance to protein structure.

All natural amino acids with chiral centers adopt the L-configuration around their chiral carbon, and as a result all naturally occurring α-helices in proteins are right-handed. Proteins can also form other secondary structures, such as β-sheets. In contrast, synthetically designed foldamers lacking chiral centers can fold into either right- or left-handed helices without a strong preference. Even when a fully extended foldamer possesses no chirality, the resulting folded helices are chiral structures: the right- and left-handed helices are mirror images of each other but cannot be superimposed.

In the absence of chiral centers in the foldamer and of chirality in the surrounding medium, the two helices are expected to exist in equal proportions. When chiral groups are present, however, the two helices become diastereomeric, which leads to a preference for either the right- or the left-handed conformation. Recent research on foldamers has demonstrated that various factors, including protonation, metal coordination, changes in solvent polarity, photoisomerization, and binding of anions or cations, can induce significant conformational changes in these oligomers. Despite these findings, the mechanisms behind these conformational interconversions are still poorly understood.
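The relation between a free-energy difference and the resulting helix populations can be made concrete with a short calculation. The sketch below assumes a simple two-state equilibrium between the right- and left-handed helices; the function name `helix_populations` and the example free-energy values are illustrative assumptions and are not taken from the dissertation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def helix_populations(delta_g_kcal, temperature=300.0):
    """Return (p_right, p_left) for a two-state helix equilibrium.

    delta_g_kcal is G(left) - G(right) in kcal/mol (hypothetical input);
    a positive value means the right-handed helix is favored.
    """
    # Boltzmann weight of the left-handed helix relative to the right-handed one
    weight_left = math.exp(-delta_g_kcal / (R_KCAL * temperature))
    p_right = 1.0 / (1.0 + weight_left)
    return p_right, 1.0 - p_right

# Achiral case: no free-energy difference, so both helices are equally populated
print(helix_populations(0.0))
# Diastereomeric case with an assumed 1.0 kcal/mol preference for one helix
print(helix_populations(1.0))
```

Under these assumptions, even a preference of about 1 kcal/mol shifts the equilibrium strongly toward one handedness at 300 K.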

Pyridine-2,6-dicarboxamide-based oligomers and dendrons featuring an azobenzene moiety have been synthesized and studied as helical foldamers. The configuration of the azo (N=N) bond can be switched between the E and Z forms by photochemical activation. In the E configuration, the azobenzene oligomers can adopt either right- or left-handed helices, and in the absence of chiral centers both are equally favored. Crystal structures of a tetraazobenzene oligomer reveal two-turn helical conformations, with both right- and left-handed helices present in the solid state.

Helical chirality is often induced in foldamers by introducing a chiral center, although the choice of chiral center can be somewhat arbitrary. The mechanisms by which left- and right-handed diastereomeric helices interconvert, and the factors that determine the proportions of the favored isomers, remain poorly understood. Deeper insight into these interconversion mechanisms could facilitate the rational design of diastereomeric foldamers. Chapters 7 and 8 describe computer simulations of helical folding and interconversion, propose a mechanism based on these simulations, and map the free energy landscapes of the oligomers.

Figure 1.4 Structure of tetraazobenzene oligomer and unit cell of crystal structure (four helices in one unit cell).

Regioselective ring-opening of epoxides in 2,3-anhydrosugars

Complex carbohydrates play crucial roles in biology, which makes their synthesis a key focus of synthetic organic chemistry. Synthetic oligosaccharides are vital for biomedical research on carbohydrate-recognizing enzymes, such as glycosyltransferases and glycosidases, which are implicated in diseases like cancer and infections. Despite the development of various methods for assembling glycosidic bonds, the synthesis of many oligosaccharides and glycosidic linkages continues to pose significant challenges.

The synthesis of complex oligosaccharides has posed significant challenges for organic chemists over the past two decades. Various synthetic methods have been developed, but they tend to be cumbersome and highly dependent on the substrate. Notably, the stereocontrolled synthesis of oligosaccharides with furanose residues remains largely unexplored and has only recently started to receive attention.

Recent advances have been made in synthesizing oligosaccharides derived from arabinofuranose-containing polysaccharides, which are crucial components of the mycobacterial cell wall. These polysaccharides feature both 1,2-trans (α-arabinofuranosyl) and 1,2-cis (β-arabinofuranosyl) linkages. The epoxy thioglycoside 1.1 and the glycosyl sulfoxide 1.2, both with 2,3-anhydro-D-lyxo stereochemistry, glycosylate alcohols with a high degree of stereoselectivity. The major product of this reaction (1.3) has the glycosidic bond cis to the epoxide, and its epoxide ring can be opened by nucleophiles to give β-arabinofuranosides (1.4). Previous studies found only modest regioselectivity in the ring-opening reaction, with attack at C-3 favored over attack at C-2 by a ratio of approximately 3:1. However, a more recent method improves the regioselectivity through a (-)-sparteine-mediated nucleophilic opening of the epoxide in the glycosides produced from 1.1 and 1.2. The mechanism by which (-)-sparteine promotes this epoxide ring-opening reaction has not been studied extensively, but understanding it could significantly improve oligosaccharide synthesis methods. Various computational techniques were used to simulate the epoxide ring opening, and the findings are discussed in Chapter 9.
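A product ratio such as 3:1 corresponds to a small difference in activation free energy between the two competing pathways. The sketch below converts a product ratio into that difference, assuming kinetic control with both pathways starting from the same reactant, so that the product ratio equals the ratio of rate constants from transition-state theory. The function names and the temperature of 298.15 K are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def ddg_from_ratio(ratio, temperature=298.15):
    """Activation free-energy difference (kcal/mol) implied by a product ratio.

    ratio is [major product]/[minor product]; kinetic control is assumed.
    """
    return R_KCAL * temperature * math.log(ratio)

def ratio_from_ddg(ddg_kcal, temperature=298.15):
    """Inverse relation: product ratio implied by a barrier difference."""
    return math.exp(ddg_kcal / (R_KCAL * temperature))

# Barrier difference implied by the approximately 3:1 C-3 vs. C-2 preference
print(round(ddg_from_ratio(3.0), 2))
# Ratio that an assumed 2.0 kcal/mol barrier difference would produce
print(round(ratio_from_ddg(2.0), 1))
```

Under these assumptions a 3:1 preference amounts to roughly 0.65 kcal/mol at room temperature, which illustrates how small energy differences between transition states translate into observable selectivity.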

Figure 1.5 Synthesis of β-arabinofuranosides (1.4) from 2,3-anhydrosugar glycosylating agents 1.1 and 1.2

Density functional theory

The wave function of a molecule contains all the information about the system, and molecular properties can be extracted from it by applying the corresponding operators. However, calculating wave functions for large molecular systems is challenging and time-consuming. Density functional theory (DFT) has emerged as an effective alternative for calculating molecular properties efficiently.

Pierre Hohenberg and Walter Kohn demonstrated that for molecules with a nondegenerate ground state, the ground-state molecular energy is uniquely determined by the electron probability density. In principle, all ground-state molecular properties can therefore be calculated directly from the electron density, without determining the molecular wave function. Furthermore, the Hohenberg-Kohn variational theorem states that the true ground-state electron density minimizes the energy functional. Levy has also proven these theorems for cases involving a degenerate ground state.

The Kohn-Sham method offers a practical approach for performing DFT calculations. It introduces a hypothetical reference system that has the same number of electrons and the same electron density as the real molecular system but in which the electrons do not interact with each other. With this reference system, the energy functional can be partitioned as

\[ E[\rho(\mathbf{r})] = T_{ni}[\rho(\mathbf{r})] + V_{ne}[\rho(\mathbf{r})] + V_{ee}[\rho(\mathbf{r})] + \Delta T[\rho(\mathbf{r})] + \Delta V_{ee}[\rho(\mathbf{r})] \qquad (2.1) \]

where \( T_{ni}[\rho(\mathbf{r})] \) is the kinetic energy of the non-interacting electrons, \( V_{ne}[\rho(\mathbf{r})] \) is the nuclear-electron interaction, \( V_{ee}[\rho(\mathbf{r})] \) is the classical electron-electron repulsion energy (including self-interaction), \( \Delta T[\rho(\mathbf{r})] \) is the correction to the kinetic energy arising from electron interactions, and \( \Delta V_{ee}[\rho(\mathbf{r})] \) collects all non-classical corrections to the electron-electron repulsion energy.

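The electron density \( \rho \) can be expressed in terms of the Kohn-Sham orbitals \( \theta_i^{KS} \) as

\[ \rho(\mathbf{r}) = \sum_{i=1}^{N} \left| \theta_i^{KS}(\mathbf{r}) \right|^2 \qquad (2.2) \]

Therefore, equation (2.1) can be rewritten (in atomic units) as equation (2.3):

\[ E[\rho(\mathbf{r})] = \sum_{i=1}^{N} \left\langle \theta_i^{KS} \left| -\tfrac{1}{2}\nabla_i^2 \right| \theta_i^{KS} \right\rangle - \sum_{i=1}^{N} \left\langle \theta_i^{KS} \left| \sum_{k}^{\mathrm{nuclei}} \frac{Z_k}{|\mathbf{r}_i - \mathbf{r}_k|} \right| \theta_i^{KS} \right\rangle + \sum_{i=1}^{N} \left\langle \theta_i^{KS} \left| \frac{1}{2} \int \frac{\rho(\mathbf{r}')}{|\mathbf{r}_i - \mathbf{r}'|} \, d\mathbf{r}' \right| \theta_i^{KS} \right\rangle + E_{xc}[\rho(\mathbf{r})] \qquad (2.3) \]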

To determine the ground-state energy, the functional \( E[\rho(\mathbf{r})] \) can be minimized by varying the electron density, subject to the constraint that the density integrates to the number of electrons N. Equivalently, the Kohn-Sham orbitals can be varied to minimize this functional.

Equation (2.3) contains four contributions to the energy functional \( E[\rho(\mathbf{r})] \): the kinetic energy of the non-interacting electrons, \( T_{ni}[\rho(\mathbf{r})] \); the nuclear-electron attraction, \( V_{ne}[\rho(\mathbf{r})] \), which is the same as in the real system; the classical electron-electron repulsion, \( V_{ee}[\rho(\mathbf{r})] \); and the exchange-correlation energy, \( E_{xc}[\rho(\mathbf{r})] \), which collects the corrections \( \Delta T[\rho(\mathbf{r})] \) and \( \Delta V_{ee}[\rho(\mathbf{r})] \). The first and third terms closely approximate the kinetic energy and the electron-electron repulsion of the real system, so the corrections gathered in \( E_{xc} \) are not the primary contributors to the energy; they are nevertheless large enough to affect the overall energy functional significantly.

The exchange-correlation energy cannot be calculated exactly, so approximations are necessary. Hartree-Fock (HF) theory treats electron exchange exactly but neglects electron correlation. In DFT, hybrid methods are commonly employed, which compute the exchange-correlation energy by mixing a portion of the exact HF exchange into the DFT exchange-correlation functional. Among the hybrid methods, B3LYP is one of the most widely used; it has three parameters and gives results in satisfactory agreement with experimental data in many cases.
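For reference, the three-parameter mixing usually quoted for B3LYP is written out below. This is the standard textbook form rather than an equation taken from the dissertation, and the symbols a, b, and c are introduced here for illustration. The commonly cited parameter values are a = 0.20, b = 0.72, and c = 0.81.

```latex
E_{xc}^{\mathrm{B3LYP}} = (1 - a)\, E_{x}^{\mathrm{LSDA}} + a\, E_{x}^{\mathrm{HF}}
  + b\, \Delta E_{x}^{\mathrm{B88}}
  + (1 - c)\, E_{c}^{\mathrm{LSDA}} + c\, E_{c}^{\mathrm{LYP}}
```

Here a sets the fraction of exact HF exchange, b scales the Becke gradient correction to the exchange energy, and c mixes the LYP correlation functional with the local correlation functional.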

Monte Carlo simulations

Monte Carlo methods are computational algorithms widely used for simulating the behavior of physical and chemical systems. Unlike molecular dynamics simulations, they are stochastic rather than deterministic. In computational chemistry, the Monte Carlo approach typically simulates a system with a fixed number of identical particles (N) in a given volume (V) at a given temperature (T). The starting point is the classical expression for the partition function Q:

\[ Q = c \int d\mathbf{p}^N \, d\mathbf{r}^N \exp\left[ -\frac{H(\mathbf{r}^N, \mathbf{p}^N)}{k_B T} \right], \qquad H(\mathbf{r}^N, \mathbf{p}^N) = \sum_{i=1}^{N} \frac{|\mathbf{p}_i|^2}{2m} + U(\mathbf{r}^N) \]

where \( \mathbf{r}^N \) denotes the coordinates of all N particles and \( \mathbf{p}^N \) the corresponding momenta, \( \mathbf{p}_i \) is the momentum of particle i, m is the mass of the particles, \( k_B \) is the Boltzmann constant, \( U(\mathbf{r}^N) \) is the potential energy, and c is a constant of proportionality. Using the partition function, the thermal average of any observable A, expressed as a function of the particles' coordinates and momenta, can be written as

\[ \langle A \rangle = \frac{\int d\mathbf{p}^N \, d\mathbf{r}^N \, A(\mathbf{r}^N, \mathbf{p}^N) \exp[-\beta H(\mathbf{r}^N, \mathbf{p}^N)]}{\int d\mathbf{p}^N \, d\mathbf{r}^N \exp[-\beta H(\mathbf{r}^N, \mathbf{p}^N)]} \]

where \( \beta = 1/k_B T \). The kinetic energy is quadratic in the momenta, so the integration over momenta can be carried out analytically, and averages of momentum-dependent functions are usually straightforward to compute. Averages of coordinate-dependent functions are more difficult, because the multidimensional integrals over particle coordinates can in general only be evaluated numerically. Monte Carlo methods were developed to address this problem.

With the integration over momenta evaluated analytically, the average of a coordinate-dependent observable can be expressed as

\[ \langle A \rangle = \frac{\int d\mathbf{r}^N \, A(\mathbf{r}^N) \exp[-\beta U(\mathbf{r}^N)]}{\int d\mathbf{r}^N \exp[-\beta U(\mathbf{r}^N)]} \]

Monte Carlo simulation estimates such averages by sampling the configuration space of the system, which avoids the impractical task of evaluating all possible configurations. The Metropolis scheme is the most widely used sampling protocol in Monte Carlo methods.

The probability density of finding the system in a configuration around \( \mathbf{r}^N \) is denoted as

\[ \mathcal{N}(\mathbf{r}^N) = \frac{\exp[-\beta U(\mathbf{r}^N)]}{\int d\mathbf{r}^N \exp[-\beta U(\mathbf{r}^N)]} \]

In the Metropolis scheme, the system starts in a configuration \( o \) (old), and a random trial move is made from \( o \) to a neighboring configuration \( n \) (new). The trial move is accepted with a probability that depends on the energy difference between the two configurations:

\[ \mathrm{acc}(o \to n) = \min\left( 1, \exp\{ -\beta [ U(n) - U(o) ] \} \right) \]

The efficiency of a Monte Carlo simulation depends on the choice of the transition probability \( \pi(o \to n) \). Among the possible choices, the Metropolis protocol samples configuration space more efficiently than most alternative strategies.
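The Metropolis scheme described above can be sketched in a few lines. The example below is a minimal illustration for a single coordinate in a toy harmonic potential, not the conformational search used in the dissertation; the function names, the step size, and the reduced units (k_B T = 1) are assumptions made for this sketch.

```python
import math
import random

def metropolis_sample(potential, x0, beta, n_steps, max_step=0.5, seed=0):
    """Sample configurations of one coordinate with the Metropolis scheme.

    potential is a callable U(x); beta is 1/(k_B T) in matching units.
    Returns the list of visited configurations, one per step.
    """
    rng = random.Random(seed)
    x, u = x0, potential(x0)
    samples = []
    for _ in range(n_steps):
        # Trial move o -> n: symmetric random displacement
        x_new = x + rng.uniform(-max_step, max_step)
        u_new = potential(x_new)
        # Accept with probability min(1, exp(-beta * (U(n) - U(o))))
        if u_new <= u or rng.random() < math.exp(-beta * (u_new - u)):
            x, u = x_new, u_new
        # A rejected move counts the old configuration again
        samples.append(x)
    return samples

def harmonic(x):
    """Toy potential U(x) = x^2 / 2 (force constant 1, reduced units)."""
    return 0.5 * x * x

samples = metropolis_sample(harmonic, x0=0.0, beta=1.0, n_steps=200_000, max_step=1.5)
mean_x2 = sum(s * s for s in samples) / len(samples)
print(mean_x2)  # should approach 1/beta = 1 for this potential
```

For the harmonic potential the exact average of x squared is 1/beta, which gives a simple check that the sampling reproduces the Boltzmann distribution.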

This dissertation uses Monte Carlo simulation to address the multiple-minimum problem in conformational searches of organic molecules. Performing the search in internal coordinates samples molecular conformations more efficiently than using Cartesian coordinates. Only the basic principles of Monte Carlo simulation are outlined in this section.

Molecular dynamics simulations

Molecular dynamics is a computational technique that uses classical mechanics to investigate the structure and dynamics of molecular assemblies. A simulation consists of the numerical, step-by-step solution of the classical equations of motion. Each particle i is characterized by its position \( \mathbf{R}_i \) and momentum \( \mathbf{P}_i = m_i \mathbf{v}_i \), where \( m_i \) is its mass and \( \mathbf{v}_i \) its velocity. The complete set of positions (momenta) is denoted \( \mathbf{R}^N \) (\( \mathbf{P}^N \)). The Hamiltonian H of the system is the sum of the kinetic and potential energies:

\[ H(\mathbf{R}^N, \mathbf{P}^N) = \sum_{i=1}^{N} \frac{|\mathbf{P}_i|^2}{2 m_i} + U(\mathbf{R}^N) \]

Here, the potential U is assumed to be a function of position only. The force on each particle can be derived from the potential:

\[ \mathbf{F}_i = -\frac{\partial U(\mathbf{R}^N)}{\partial \mathbf{R}_i} \]

The dynamics of the system are then described by Hamilton's equations of motion:

\[ \frac{d\mathbf{R}_i}{dt} = \frac{\partial H}{\partial \mathbf{P}_i} = \frac{\mathbf{P}_i}{m_i}, \qquad \frac{d\mathbf{P}_i}{dt} = -\frac{\partial H}{\partial \mathbf{R}_i} = \mathbf{F}_i \]

The state of the system at any later time can be determined from the initial conditions using these equations of motion. Besides specifying the initial position and velocity of each atom, a crucial step is to define the potential function \( U(\mathbf{R}^N) \) acting on the atoms.
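The step-by-step solution of the equations of motion can be illustrated with a small integrator. The sketch below uses the velocity Verlet algorithm, a common choice in molecular dynamics codes, although the text does not state which integrator was used in the dissertation. The one-dimensional harmonic oscillator, the time step, and all function names are assumptions made for illustration.

```python
def velocity_verlet(force, x, v, mass, dt, n_steps):
    """Integrate the equations of motion for one particle in one dimension.

    Returns lists of positions and velocities, including the initial state.
    """
    xs, vs = [x], [v]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # second half-step velocity update
        xs.append(x)
        vs.append(v)
    return xs, vs

K, M = 1.0, 1.0  # toy force constant and mass (reduced units)

def harmonic_force(x):
    """Force F = -dU/dx for U(x) = K x^2 / 2."""
    return -K * x

def total_energy(x, v):
    """Hamiltonian: kinetic plus potential energy."""
    return 0.5 * M * v * v + 0.5 * K * x * x

xs, vs = velocity_verlet(harmonic_force, x=1.0, v=0.0, mass=M, dt=0.01, n_steps=10_000)
drift = abs(total_energy(xs[-1], vs[-1]) - total_energy(xs[0], vs[0]))
print(drift)  # stays small because the total energy is conserved
```

Monitoring the total energy in this way is a standard check that the time step is small enough for the chosen potential.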

Molecular mechanics approaches use potential energy functions that include bond stretching, valence angle bending, dihedral angle torsions, van der Waals interactions, and electrostatic interactions. To improve accuracy, cross terms reflecting the coupling among these terms may also be included. These methods can rapidly compute the energy of a system from a large number of atomic coordinates.
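As an illustration of such a potential energy function, one commonly used functional form (the form of the AMBER-type force fields, written here without cross terms) is shown below. The symbols \( k_b \), \( k_\theta \), \( V_n \), \( \gamma \), \( A_{ij} \), \( B_{ij} \), \( q_i \), and \( \varepsilon \) are generic force-field parameters introduced for this sketch; the dissertation does not list them in this section.

```latex
U(R^N) = \sum_{\text{bonds}} k_b (r - r_0)^2
       + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2
       + \sum_{\text{dihedrals}} \frac{V_n}{2} \left[ 1 + \cos(n\phi - \gamma) \right]
       + \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
       + \frac{q_i q_j}{\varepsilon r_{ij}} \right]
```

The first three sums are the bonded terms (stretching, bending, torsion), and the last sum collects the nonbonded van der Waals and electrostatic interactions.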

Direct dynamics methods, which calculate the energy using ab initio or DFT techniques at each time step of a molecular dynamics simulation, are feasible but significantly more resource-intensive. These direct dynamics approaches will be explored in Chapter 9.

Molecular mechanics simulations significantly lower computational demands by treating atoms as classical particles, allowing for the analysis of very large systems. However, these simulations do not account for electronic structure, which limits their ability to model bond-breaking or bond-forming events.

Quantum molecular dynamics

The advancements in modern computing power have made it feasible to apply quantum theory to molecular dynamics by solving the time-dependent Schrödinger equation. This approach is extensively used in atomic and molecular spectroscopy, where photon absorption or emission occurs over very short timeframes, allowing molecules to be modeled as two- or three-level systems. However, for studying molecular dynamics over longer time scales, this simple model becomes inadequate, prompting the development of alternative quantum molecular dynamics methods that can simulate molecular behavior over these extended time frames.

Born–Oppenheimer molecular dynamics is a quantum extension of classical molecular dynamics that propagates the coordinates and momenta of nuclei based on initial configurations and current electronic structures. This method re-optimizes the electronic structure after each nuclear movement by solving the time-independent Schrödinger equation.

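The Born–Oppenheimer molecular dynamics equations read:

\[ M_I \ddot{R}_I(t) = -\nabla_I \min_{\Psi_0} \langle \Psi_0 | H_e | \Psi_0 \rangle \]

\[ E_0 \Psi_0 = H_e \Psi_0 \]

where \( M_I \) is the mass of nucleus \( I \), \( R_I(t) \) are the nuclear coordinates at time \( t \), \( \Psi_0 \) is the electronic wave function at time \( t \) with energy \( E_0 \), and \( H_e \) is the electronic Hamiltonian.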

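The Car–Parrinello method is a widely used ab initio molecular dynamics technique that employs density functional theory to simulate the time evolution of molecules. In this approach, the Kohn–Sham energy functional \( E_{KS} \) depends on the single-electron wave functions \( \{\Phi_i\} \) and the nuclear coordinates \( R^N \). Just as the forces on the nuclei are obtained from the derivative of a Lagrangian with respect to the nuclear positions, a functional derivative with respect to the orbitals yields the forces acting on the orbitals. The Lagrangian used in the Car–Parrinello method is

\[ L_{CP} = \sum_I \frac{1}{2} M_I \dot{R}_I^2 + \sum_i \mu \langle \dot{\Phi}_i | \dot{\Phi}_i \rangle - E_{KS}[\{\Phi_i\}, R^N] + \sum_{i,j} \Lambda_{ij} \left( \langle \Phi_i | \Phi_j \rangle - \delta_{ij} \right) \qquad (2.14) \]

where \( \mu \) is the “fictitious mass” or inertia parameter assigned to the orbital degrees of freedom, and the Lagrange multipliers \( \Lambda_{ij} \) enforce the orthonormality of the orbitals. The corresponding Newtonian equations of motion are obtained from the associated Euler–Lagrange equations:

\[ \frac{d}{dt} \frac{\partial L_{CP}}{\partial \dot{R}_I} = \frac{\partial L_{CP}}{\partial R_I}, \qquad \frac{d}{dt} \frac{\delta L_{CP}}{\delta \dot{\Phi}_i^*} = \frac{\delta L_{CP}}{\delta \Phi_i^*} \]

Therefore the Car–Parrinello equations of motion have the form:

\[ M_I \ddot{R}_I(t) = -\frac{\partial E_{KS}}{\partial R_I}, \qquad \mu \ddot{\Phi}_i(t) = -\frac{\delta E_{KS}}{\delta \Phi_i^*} + \sum_j \Lambda_{ij} \Phi_j \]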

Car–Parrinello simulations keep the electronic subsystem near its instantaneous minimum energy, so that a wave function optimized to the ground state for the initial nuclear configuration remains close to the ground state throughout the simulation. Practical applications of the Car–Parrinello molecular dynamics method have yielded several intriguing results.

Introduction

The seven subtypes of transient receptor potential-canonical (TRPC) channels (TRPC1-7) are prevalent in various cells and tissues, facilitating the influx of extracellular calcium (Ca2+) and sodium (Na+) upon activation of cell surface receptors. This influx plays a critical role in essential cellular processes such as smooth muscle contraction, immune cell activation, neuronal growth-cone mobility, and cell proliferation and migration. Given the significance of these functions in relation to human diseases, there is a growing interest in developing therapeutic agents that can selectively activate or inhibit TRPC channels.

Native TRPC channels consist of four protein subunits symmetrically arranged around a central pore, with each subunit featuring six transmembrane domains and a membrane-loop domain that contributes to the channel's structure. The amino- and carboxyl-termini of these subunits are positioned on the intracellular side of the membrane. TRPC3, TRPC6, and TRPC7 are closely related in structure and function, forming a subfamily of TRPC channels that can create homotetrameric channels or combine with other subfamily members to form heterotetrameric channels.

TRPC3, TRPC6, and TRPC7 channels are activated by Gq/11-linked G protein-coupled receptors (GPCRs) and growth factor receptors that stimulate phospholipase C-beta (PLC-β) and phospholipase C-gamma (PLC-γ). This activation leads to the hydrolysis of membrane phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2], resulting in the production of two key second messengers, inositol (1,4,5)-trisphosphate [Ins(1,4,5)P3] and diacylglycerol (DAG).

TRPC3, TRPC6, and TRPC7 channels are believed to be directly activated by diacylglycerol (DAG) within seconds following receptor-mediated activation of phospholipase C (PLC)-β/γ. Additionally, on a slightly slower timescale of minutes, the DAG released by PLC-β/γ also stimulates protein kinase C (PKC), which phosphorylates TRPC3/6/7 channels and inhibits their activity. 11,14-17

TRPC6 subunit-containing channels are found in various cell types, including vascular endothelial and smooth muscle cells, pulmonary smooth muscle, neutrophils, platelets, kidney podocytes, and central nervous system neurons. These channels are activated by endothelin-1, angiotensin II, and adrenergic receptors, playing a significant role in blood pressure regulation. Recent studies have linked gain-of-function variants of TRPC6 channels with high basal activity to familial focal segmental glomerulosclerosis, a leading cause of kidney failure. Additionally, TRPC6 channels may serve as downstream effectors of M1 muscarinic acetylcholine receptors in hippocampal neurons, suggesting their involvement in cognitive functions such as attention, learning, and memory. Two splice variants of TRPC6 have been identified: TRPC6A, which consists of 933 amino acids, and TRPC6B, which lacks 54 amino acids at the amino terminus.

Understanding the molecular mechanisms regulating TRPC channels is crucial for identifying new drug targets related to TRPC channel processes. This exploration has led to the investigation of posttranslational modifications and interacting proteins that affect TRP channel function. TRPC3/6/7 channels are regulated by PKC, which phosphorylates specific serine residues in their carboxyl-terminal region, while Src family tyrosine kinases are essential for maximal TRPC3 and TRPC6 activation. Additionally, these channels directly interact with several proteins, including calmodulin, the IP3 receptor, and the adapter protein Homer. Recent research has identified a binding site for the immunophilin FKBP12 in the carboxyl-terminal domain of TRPC3/6/7 channels, with site-specific mutagenesis revealing that FKBP12 binds to a specific consensus sequence, where the serine residue is a target for PKC phosphorylation in different TRPC6 splice variants.

Recent research by Kim and Saffen has identified FKBP12 as a key component of a TRPC6-centered protein complex that forms rapidly after M1 mAChR activation in PC12D neuronal cells. Upon activation with carbachol, a series of events occurs. First, a protein complex comprising M1 mAChRs, TRPC6 channels, and PKC assembles in non-lipid raft domains of the cell membrane. Next, PKC phosphorylates TRPC6 channels at Ser768 and Ser714, creating a binding site for FKBP12, which is located in lipid raft domains. The binding of FKBP12 retains the M1 mAChR/TRPC6/PKC complex within the lipid rafts and facilitates the recruitment of calcineurin/calmodulin. Subsequently, calcineurin dephosphorylates the channels, leading to the release of M1 mAChR from the complex. Finally, M1 mAChRs and TRPC6 channels are internalized through distinct pathways: M1 mAChRs via clathrin-coated pits and TRPC6 channels through a non-clathrin-dependent mechanism.

Recent findings reveal that the phosphorylation of TRPC6 channels by PKC is essential for FKBP12 binding. This conclusion is supported by evidence showing that the coimmunoprecipitation of TRPC6 channels and FKBP12 is inhibited when channel phosphorylation is disrupted, either through the PKC inhibitor GF109203X or by mutating Ser768 and Ser714 to alanine or glycine.

Recent studies highlight the crucial role of PKC-mediated phosphorylation of TRPC6 channels in regulating their interaction with FKBP12. The current research aims to use computational modeling to gain deeper insight into the binding mechanism of TRPC6 channels with FKBP12. Findings suggest that specific amino acid residues are involved in this interaction, indicating that phosphorylation at Ser768 and Ser714 is essential for binding.

Computational methods

TRPC6 peptide docking

To effectively apply molecular dynamics in the study of protein-peptide interactions, it is essential to have a structural template that details the location and characteristics of the peptide binding site on the receptor protein. Typically, this template is obtained from experimental techniques, with X-ray crystallography being a primary source of structural information.

Limited structural information is available regarding the interaction of FKBP12 with other proteins. The notable exception is the crystal structure (PDB ID: 1B6C) determined by Huse and colleagues, which reveals FKBP12 bound to an unphosphorylated fragment of the TGF-β receptor Type I (TGFβTRI) that includes both the GS region and the catalytic domain. In this protein complex, FKBP12 interacts specifically with an α-helix of sequence 193 LPLLVQRTIAR 203. Sinkins and co-workers demonstrated that the analogous leucyl-prolyl initiated peptide 759 LPVPFNLVPSP 769 of TRPC6 mediates its interaction with FKBP12.

To generate the initial geometries of the TRPC6 peptide, the 193 LPLLVQRTIAR 203 helix was therefore excised from the crystal structure of the FKBP12-TGF-β receptor Type I fragment complex, and the amino acids corresponding to those found in TRPC6 were introduced. In addition to the unphosphorylated wild-type peptide, three peptide mutants were prepared: phosphoserine-768, Ser768Asp, and Ser768Glu. Each peptide was capped with methyl and acetyl groups at the N- and C-terminal ends, respectively. The peptides were fully optimized using the AMBER ff94 force field and default atomic charges. The relative orientations of the peptides bound to FKBP12 were predicted with the DOCK 5.2.0 suite of programs, providing initial structures for the subsequent molecular dynamics simulations.

The FKBP12 receptor was obtained from the PDB structure 1B6C, with protons added to align with physiological pH and charges applied using the ff94 force field. A Connolly solvent-accessible surface was created for FKBP12 using a probe radius of 1.4 Å, leading to the generation of 57 overlapping spheres that define the binding pocket. DOCK scoring grids measuring 42×34×24 Å were developed with the GRID program, utilizing electrostatic potential charges from the ff94 force field and van der Waals parameters from the ff99 force field. A total of 48 peptides were docked as a single anchor structure, exploring up to 2×10^6 orientations, while minimizing torsion angles in the peptide for each binding mode.

Molecular dynamics simulation of FKBP12-TRPC6 peptide

Separated FKBP12 and TRPC6 Peptides

In addition to the four FKBP12-TRPC6 peptide complexes, molecular dynamics (MD) simulations were performed on isolated FKBP12 and the isolated TRPC6 peptides using the AMBER 8 software suite. The FKBP12 structure was derived from the crystal structure (PDB ID: 1B6C), while the initial configurations of the peptide complexes were prepared accordingly. The simulations used the all-atom force field ff03 and explicit water solvent described by the TIP3P model, with proteins and complexes placed in a water box maintaining a minimum distance of 10 Å from the solute surface, totaling approximately 20,000 atoms. Periodic boundary conditions were applied, and long-range electrostatics were treated using the particle mesh Ewald (PME) method. The SHAKE algorithm constrained the lengths of bonds involving hydrogen atoms, and a non-bonded interaction cutoff of 8 Å was used. Following initial optimization, the systems were equilibrated over 4000 steps and heated from 0 to 300 K in the NVT ensemble, followed by 20 ns production runs under NPT conditions at 300 K and 1 bar. The simulations employed a time step of 1 fs, with coordinates saved every 100 steps. Additional details are available in the Supporting Information.

MM/GB-SA free energy of binding calculations

For each peptide, the free energy of binding to FKBP12 was computed using the MM-GBSA method,60 available in the AMBER program suite. This method uses a thermodynamic cycle to calculate the free energy of binding for each ligand, in this case the TRPC6 peptides, to the FKBP12 receptor.61,62 The free energies of binding are computed using the equation:

\[ \Delta G_{binding}^{sol} = \Delta G_{complex}^{sol} - \Delta G_{receptor}^{sol} - \Delta G_{ligand}^{sol} \qquad (1) \]

where \( \Delta G_{binding}^{sol} \) is the total free energy of binding in solution, and \( \Delta G_{complex}^{sol} \), \( \Delta G_{receptor}^{sol} \), and \( \Delta G_{ligand}^{sol} \) are the free energies in solution of the complex, receptor, and ligand, respectively.

The free energy in solution of each entity (\( \Delta G^{sol} \)) is calculated by the following equations:

\[ \Delta G^{sol} = \Delta G_{gas} + \Delta G_{solvation} \qquad (2) \]

\[ \Delta G_{gas} = E_{internal} + E_{vdw} + E_{electrostatic} - T\Delta S \qquad (3) \]

\[ \Delta G_{solvation} = \Delta G_{GB} + \Delta G_{nonpolar} \qquad (4) \]

where \( \Delta G_{gas} \) is the free energy in the gas phase and \( \Delta G_{solvation} \) is the solvation energy. \( \Delta G_{gas} \) comprises the total internal energy (\( E_{internal} \)), the van der Waals energy (\( E_{vdw} \)), the Coulombic interactions (\( E_{electrostatic} \)), and the entropic contributions (\( T\Delta S \)), with the energy terms evaluated as molecular mechanics (MM) energies. The internal energy includes bond stretching, angle bending, and torsional contributions. The solvation energy (\( \Delta G_{solvation} \)) consists of a polar (\( \Delta G_{GB} \)) and a nonpolar (\( \Delta G_{nonpolar} \)) component.

The MM-GBSA method uses snapshots extracted from the molecular dynamics (MD) trajectories of the FKBP12-TRPC6 peptide complexes, as well as of FKBP12 and the peptides alone. The initial 2 ns of each MD simulation were discarded, and the free energy of binding calculations used 1,000 snapshots extracted at even intervals from the subsequent 18 ns. The total free energy of binding comprises contributions from Coulombic interactions (\( E_{electrostatic} \)), van der Waals interactions (\( E_{vdw} \)), internal energies (\( E_{internal} \)), hydrophobic effects (\( \Delta G_{nonpolar} \)), solvation effects (\( \Delta G_{GB} \)), and entropic effects (\( T\Delta S_{total} \)). Analyzing the contribution of each component offers insight into the interactions between FKBP12 and the peptide.
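The bookkeeping behind this procedure can be sketched in a few lines. The example below first works out the snapshot stride implied by the stated protocol (1 fs time step, coordinates saved every 100 steps, first 2 ns discarded, 1,000 snapshots), and then assembles a binding free energy from per-snapshot energy components. All numerical energies and the entropy terms are hypothetical placeholders, and the function names are illustrative; they are not output from, or an interface of, the AMBER MM-GBSA scripts.

```python
from statistics import mean

# Snapshot selection implied by the protocol (times in femtoseconds).
total_fs, discard_fs, save_every_fs = 20_000_000, 2_000_000, 100
frames_total = total_fs // save_every_fs          # saved frames in 20 ns
frames_discarded = discard_fs // save_every_fs    # frames in the first 2 ns
stride = (frames_total - frames_discarded) // 1000  # keep every stride-th frame

COMPONENTS = ("internal", "vdw", "elec", "gb", "nonpolar")


def solution_free_energy(snapshots, ts):
    """Average the summed energy components over snapshots, then subtract T*S."""
    return mean(sum(s[c] for c in COMPONENTS) for s in snapshots) - ts


def binding_free_energy(cpx, rec, lig, ts_cpx, ts_rec, ts_lig):
    """Complex minus receptor minus ligand, each as a free energy in solution."""
    return (solution_free_energy(cpx, ts_cpx)
            - solution_free_energy(rec, ts_rec)
            - solution_free_energy(lig, ts_lig))


# Hypothetical energies (kcal/mol) for two snapshots of each species.
cpx = [dict(internal=120.0, vdw=-70.0, elec=-210.0, gb=-150.0, nonpolar=5.0),
       dict(internal=118.0, vdw=-72.0, elec=-214.0, gb=-152.0, nonpolar=5.0)]
rec = [dict(internal=100.0, vdw=-50.0, elec=-180.0, gb=-124.0, nonpolar=4.0),
       dict(internal=99.0, vdw=-51.0, elec=-182.0, gb=-124.0, nonpolar=4.0)]
lig = [dict(internal=20.0, vdw=-8.0, elec=-30.0, gb=-23.0, nonpolar=1.0),
       dict(internal=19.0, vdw=-9.0, elec=-31.0, gb=-24.0, nonpolar=1.0)]

dg_bind = binding_free_energy(cpx, rec, lig, ts_cpx=50.0, ts_rec=45.0, ts_lig=15.0)
```

In the placeholder data the internal energies of receptor and ligand add up to that of the complex in every snapshot, which mimics the cancellation that occurs when all three species are taken from the same trajectory.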

The entropies for the free energy calculations were calculated with the normal mode (NMODE) module available in the AMBER package.63,64

Results and discussion

TRPC6 peptide docking

To validate our docking protocol and the computational definition of the FKBP12 receptor and its peptide ligands, we docked the 193 LPLLVQRTIAR 203 peptide from TGFβTRI to a computational model of the FKBP12 receptor. The docking reproduced the experimentally derived binding mode of the natural peptide with FKBP12 with a low root-mean-square deviation (RMSD) of 1.7 Å, as displayed in Figure 3.1, which supports the accuracy of our docking procedure.

Figure 3.1 Comparison of docked TGFβ receptor peptide (green) and crystal structure (purple). The RMSD value between the two structures is 1.72 Å.

With the docking protocol validated, we evaluated the docking of four TRPC6 peptide models with FKBP12: the unphosphorylated wild-type TRPC6 peptide (LPVPFNLVPSP) and three mutant variants, phosphoserine-768, Ser768Asp, and Ser768Glu. The top-scoring binding modes for both the TGFβTRI and TRPC6 models, with and without phosphorylation, are illustrated in Figure 3.2, while their corresponding DOCK energy scores are detailed in Table 3.1.

Table 3.1 Energy Scores (kcal/mol) from DOCK for the preferred modes of TGF-β receptor Type I and TRPC6 peptides binding to the FKBP12 receptor

The DOCK energy score analysis reveals that the TRPC6 peptides exhibit energetically favorable binding modes that differ from the experimental binding mode of the TGFβTRI peptide. The computed energy scores for the five peptides are closely aligned, ranging from -37.6 to -43.6 kcal/mol. Notably, a negatively charged amino acid at position 768 contributes significantly to the electrostatic interactions, as its anionic side chain binds between two lysine residues (Lys44 and Lys47) on the FKBP12 surface. Importantly, these lysine residues do not play a role in the binding of FK506 or the TGFβTRI peptide, as indicated by their respective crystal structures.

Molecular dynamics (MD) simulations of FKBP12-TRPC6 peptide complexes were conducted to investigate the molecular recognition elements on the FKBP12 surface crucial for TRPC6 binding. These simulations assessed the stability of the protein-peptide complexes in an aqueous environment and identified key residues involved in the molecular recognition of TRPC6 peptides by FKBP12. Additionally, combined with simulations of the isolated species, they allow the free energy of binding to be calculated for both the wild-type peptide and its mutants.

RMSD fluctuation of complexes

We analyzed the RMSD of the FKBP12-peptide complexes throughout the 20 ns MD simulations, using the initial coordinates of the production MD simulation as the reference. The RMSD values rose from zero to approximately 3 Å in the first 2 ns. After this period, each of the four complexes exhibited distinct behavior, yet all remained relatively stable during the entire simulation. These variations in RMSD fluctuations highlight notable differences in the flexibility of the complexes.
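For reference, the RMSD of a frame with respect to a reference structure can be computed as in the sketch below. It assumes that every frame has already been superimposed on the reference (in practice a least-squares fit is normally applied first), and the tiny two-atom trajectory is a hypothetical example, not data from the simulations.

```python
import math


def rmsd(coords, ref):
    """RMSD between two equally sized lists of (x, y, z) tuples, without fitting."""
    if len(coords) != len(ref):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum((a - b) ** 2
                  for atom, ref_atom in zip(coords, ref)
                  for a, b in zip(atom, ref_atom))
    return math.sqrt(squared / len(ref))


# Hypothetical trajectory: the first frame is the reference, and the second
# frame has every atom displaced by 2 length units along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [reference, [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]]
rmsd_series = [rmsd(frame, reference) for frame in trajectory]
```

Plotting such a series against simulation time gives the kind of RMSD trace discussed in this section.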

The RMSD fluctuation of the phosphoserine-768 peptide complex is approximately 3 Å, the lowest among the four FKBP12-peptide complexes, indicating that phosphorylation of Ser768 limits the conformational dynamics of the entire protein-peptide complex. This restriction may enhance the formation of the multiprotein complex observed in experiments. In contrast, the unphosphorylated wild-type peptide-FKBP12 complex exhibits a stable RMSD fluctuation around 3.5 Å. Despite potentially weak interactions with FKBP12, the consistent stability of the wild-type peptide complex throughout the simulation suggests that additional nonbonded interactions may play a role in peptide binding beyond phosphorylation.

The RMSD of the four complexes over 20 ns of molecular dynamics simulations shows distinct behavior for each complex. The wild-type TRPC6 peptide complex is shown in black, the phosphorylated wild-type TRPC6 peptide complex in red, the mutant Ser768Asp complex in green, and the mutant Ser768Glu complex in blue. This comparison shows how phosphorylation and mutation alter the structural stability of the TRPC6 peptide complexes.

We hypothesized that introducing a negatively charged amino acid at position 768 would mimic phosphoserine and enhance binding. However, the mutant peptides Ser768Asp and Ser768Glu exhibited greater RMSD fluctuations in their MD trajectories compared to the native and phosphorylated Ser768 peptides. Despite the presence of a negatively charged side chain, the simulations of Ser768Asp and Ser768Glu showed RMSD fluctuations reaching up to 4 Å. This suggests that factors beyond the electrostatic contributions of the negatively charged side chain, possibly intrinsic properties of the phosphate functionality, are essential for TRPC6 binding.

Residue fluctuation in each simulation

The average fluctuation of individual residues in the four complexes and isolated FKBP12 was analyzed over 20 ns of molecular dynamics simulations. Each region of the complexes exhibits distinct fluxional behavior, with residues from various parts of FKBP12 showing significantly different fluctuation patterns in the bound and unbound states.
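A per-atom fluctuation of this kind is commonly evaluated as the root-mean-square fluctuation (RMSF) about the time-averaged position. The sketch below assumes frames that are already superimposed and uses a hypothetical two-atom, two-frame trajectory; averaging the per-atom values over the atoms of a residue would give a per-residue value. The exact definition used in the dissertation is not stated in this section, so this is an assumption.

```python
import math


def rmsf(trajectory):
    """Per-atom RMS fluctuation about the mean position over all frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean_pos = [sum(frame[i][k] for frame in trajectory) / n_frames
                    for k in range(3)]
        # Mean squared displacement of atom i from its average position.
        msf = sum(sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Hypothetical data: atom 0 is fixed, atom 1 moves between x = -1 and x = +1.
trajectory = [[(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
              [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
fluct = rmsf(trajectory)
```

A rigid atom yields a fluctuation of zero, while mobile loop atoms yield larger values, which is the quantity color coded by region in the figures of this section.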

The average residue fluctuation of FKBP12 during the molecular dynamics simulations shows distinct patterns for the different peptide complexes. The analysis, based on the 20 ns simulations, covers FKBP12 in the complex with the wild type TRPC6 peptide (black), the phosphorylated wild type TRPC6 peptide (red), the mutant Ser768Asp (green), and the mutant Ser768Glu (blue), as well as in the unbound state.

Five amino acid regions (10’s, 30’s, 40’s, 50’s, and 80-90’s) exhibited increased flexibility compared to the rest of FKBP12. All molecular dynamics simulations, including that of unbound FKBP12, showed this consistent pattern; however, the FKBP12 complexes with the TRPC6 peptide mutants Ser768Asp and Ser768Glu displayed significantly greater average fluxional behavior than the other complexes and unbound FKBP12. The loop structures of FKBP12 from each complex as well as unbound FKBP12 are color coded in Figure 3.5 to reflect the different levels of flexibility.

The four FKBP12 segments adjacent to the binding site for the TRPC6 peptide exhibit a general fluctuation of 1 to 2 Å, with the amino acids in the 10's loop being the exception.

[Figure 3.5 panel labels: (b) FKBP12 with wild type; (c) FKBP12 with phosphorylated wild type, pSer768 marked; (d) FKBP12 with mutant Ser768Asp; (e) FKBP12 with mutant Ser768Glu. The 10's loop is labeled in the panels.]

The fluctuation of amino acid residues in FKBP12-TRPC6 peptide complexes is illustrated in Figure 3.5, with color coding to indicate varying degrees of fluctuation: amino acids with fluctuations under 1 Å are shown in gray, those between 1 and 2 Å in green, between 2 and 3 Å in red, and between 3 and 4 Å in yellow.

The complex of FKBP12 with the wild-type (WT) TRPC6 peptide shows a fluctuation pattern similar to that of unbound FKBP12, but with slightly higher flexibility (Figure 3.5b).

The loop residues of FKBP12 in the 80s to 90s region (84 ATGHPGIIPH 94) exhibit fluctuations between 1 and 2 Å, indicating increased flexibility in this loop compared to the unbound state. These hydrophobic residues maintain close contact with the hydrophobic region of the TRPC6 peptide during the molecular dynamics simulations. In contrast, the Arg42 residue from the 40s loop (41 DRNK 44) shows a significant decrease in fluctuation relative to unbound FKBP12. The hydrophobic residues Val55 and Ile56 from the 50s loop (52 KQEVIR 57) showed enhanced flexibility compared to unbound FKBP12, highlighting the impact of binding on FKBP12's structure. Conversely, the 10's region followed the same pattern as in unbound FKBP12, as it is geometrically distant from the binding pocket and does not directly interact with the TRPC6 peptide during binding.

The FKBP12 complex with the phosphorylated wild-type TRPC6 peptide (pWT) exhibits a fluctuation pattern similar to both the wild-type (WT) complex and unbound FKBP12. Notably, five regions within FKBP12, specifically the 10’s, 30’s, 40’s, 50’s, and 80~90’s, demonstrate increased flexibility compared to other areas. Phosphorylation introduces significant changes, particularly in the 40’s and 80~90’s regions: the Arg42 residue in the 40’s region shows reduced fluctuation, while the 80~90’s region experiences enhanced fluctuation. Overall, the pWT complex closely resembles unbound FKBP12, exhibiting lower fluctuations than the WT complex.

The carboxylate mutants significantly disrupted the structure of FKBP12, particularly in the complex with mutant Ser768Asp. Notably, the 80–90's region exhibited dramatic flexibility, with a substantial portion of the loop showing fluctuations exceeding 2 Å (red), and several amino acid residues fluctuating more than 3 Å (yellow).

The Ser768Asp mutant complex exhibited significant fluctuations in the 30's region compared to both the unbound FKBP12 and the complexes with the wild-type peptide. Similarly, the FKBP12-Ser768Glu peptide complex demonstrated substantial reorganization, highlighting the dynamic nature of these interactions.

The 80-90's and 30's amino acid regions of the complexes with the two peptide mutants, which share a similar electrostatic character at position 768 with the pWT peptide, did not exhibit the expected dynamical coupling with FKBP12. In contrast to the phosphorylated TRPC6 peptide, the WT and the Ser768Asp and Ser768Glu mutant peptides demonstrated comparable flexibility. However, the two mutants showed significant differences from the phosphorylated pWT peptide, contrary to our initial expectations.

The fluctuations of the four TRPC6 peptides were analyzed over the 20 ns MD simulations, with results plotted in Figure 3.6. Notably, Pro760 in the phosphorylated TRPC6 peptide exhibited a significantly lower fluctuation of around 1 Å, whereas Pro769 showed a higher fluctuation of approximately 4.5 Å. The residue-759 end of the phosphorylated TRPC6 peptide thus displayed notably lower fluctuation than the other end.

Gohlke and Case 66 introduced two methods for calculating binding free energies: the "Single-Trajectory" approach and the "Three-Trajectory" approach. The Single-Trajectory method relies exclusively on the molecular dynamics (MD) trajectory of the binding complex, extracting the trajectories of each binding partner from it for the MM-GBSA calculations. In contrast, the Three-Trajectory approach requires separate trajectories for the complex and for each binding partner.

In the Three-Trajectory approach, molecular dynamics (MD) simulations of the complex and of its two binding partners are performed, and the trajectories from these separate simulations are used for the MM-GBSA calculations. This method demands greater computational resources than the Single-Trajectory approach, as it requires two additional MD simulations. Accurate Single-Trajectory calculations of binding free energies are contingent on the binding partners maintaining stable conformations and dynamics during complex formation.
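The difference between the two approaches can be written compactly. The notation below is introduced here for illustration and is an assumption, not the dissertation's own: \( G_X(x) \) denotes the free energy in solution evaluated for species \( X \) at snapshot \( x \), and the subscript on each average indicates the trajectory from which the snapshots are drawn.

```latex
\Delta G_{binding}^{\text{single}} =
  \left\langle G_{complex}(x) - G_{receptor}(x) - G_{ligand}(x)
  \right\rangle_{x \,\in\, \text{complex trajectory}}

\Delta G_{binding}^{\text{three}} =
  \left\langle G_{complex} \right\rangle_{\text{complex trajectory}}
  - \left\langle G_{receptor} \right\rangle_{\text{receptor trajectory}}
  - \left\langle G_{ligand} \right\rangle_{\text{ligand trajectory}}
```

In the first expression the receptor and ligand geometries are cut out of each complex snapshot, so intramolecular terms cancel; in the second, conformational changes on binding enter through the separate averages.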

Binding free energy calculations by MM-GBSA based on Single-Trajectory

The calculated binding free energies, including energetic and entropic contributions for the four complexes, indicate that the MM-GB/SA binding free energy for the unphosphorylated wild-type peptide to FKBP12 is -5.4 kcal/mol, while the Ser768 phosphorylated peptide shows a slightly more favorable binding energy of -5.7 kcal/mol. This suggests that FKBP12 may have a greater binding affinity for the phosphorylated TRPC6 peptide. However, the difference of only 0.3 kcal/mol between the pWT and WT complexes is small and does not fully account for the experimental observations.

Figure 3.6 Average residue fluctuation of each TRPC6 peptide in MD simulations of four complexes (analysis based on 20 ns): wild type (black), phosphorylated wild type (red), mutant Ser768Asp (green), mutant Ser768Glu (blue).

Free energy of binding calculations of FKBP12 with

The calculated free energy of binding between FKBP12 and the wild-type TRPC6 peptide is -5.4 kcal/mol. With the phosphorylated peptide, the free energy of binding is -5.7 kcal/mol, which suggests that FKBP12 binds the phosphorylated form of the TRPC6 peptide somewhat more strongly.

The calculated free energy of binding for the pWT complex is thus lower than that of the WT complex by just 0.3 kcal/mol. This difference is considerably smaller than what is observed experimentally.

The binding of FKBP12 to the phosphorylated TRPC6 peptide (pWT) is primarily driven by electrostatic interactions due to the negatively charged phosphate group, which contributes significantly to the total free energy of binding. However, the desolvation of the phosphoserine group during this process introduces unfavorable solvation effects. The van der Waals interactions (–49.4 kcal/mol) and hydrophobic effects (–8.2 kcal/mol) are thermodynamically favorable, suggesting that the binding site is hydrophobic. Ultimately, the combination of favorable Coulombic, van der Waals, and hydrophobic interactions compensates for the substantial desolvation energy encountered during FKBP12 binding.

Unlike pWT, the WT peptide has no phosphate group, so its binding to FKBP12 lacks significant Coulombic contributions and desolvation effects. Instead, van der Waals interactions dominate the total free energy of binding and are potentially the primary driving force behind this interaction.

Free energy of binding calculations of Ser768Asp and

The Ser768Asp mutant exhibits stronger Coulombic interactions with FKBP12 than the wild-type (WT) peptide, yet weaker than the phosphoserine (pWT) analog. Despite these interactions, the desolvation effect for the Ser768Asp mutant remains thermodynamically unfavorable, resulting in a combined energy contribution of 19.6 kcal/mol, surpassing the 14.6 kcal/mol of pWT. The van der Waals contributions for this complex are less favorable than those of the WT or pWT complexes, making its free energy of binding the least thermodynamically favorable among the four peptides studied. This is primarily because the carboxylate group of aspartic acid does not provide sufficient Coulombic interactions to counteract the unfavorable desolvation effects, and because of the significant disturbances introduced in the 80-90 amino acid region of FKBP12, which lead to less effective conformational adjustment between the receptor and ligand. Further analysis of these binding interactions is provided below.

The Ser768Glu mutant exhibits a binding free energy similar to that of the wild-type (WT) peptide. The carboxylate group in Glu768 contributes favorable Coulombic interactions, although its desolvation effects are less significant than those of the Ser768Asp mutant. This difference in energetic contributions may stem from the structural variation between Asp and Glu, which differ by just one methylene (CH2) unit.

Neither the Ser768Asp nor the Ser768Glu mutant shows an improved binding affinity over the WT peptide.

Entropic contribution of each complex based on Single-

The binding process of each complex incurs an entropic penalty, which is thermodynamically unfavorable. This entropy change arises from translational, rotational, and vibrational contributions. The entropic contributions to binding were determined using the molecular dynamics (MD) trajectories of the complexes, as detailed in Table 3.3.

The translational entropy changes (TΔStranslational) for all four complexes are approximately –14.1 kcal/mol, while the rotational entropy changes (TΔSrotational) are around –12.6 kcal/mol, reflecting the consistent physical properties of the protein-ligand binding process. However, the total entropic changes vary among the complexes due to their distinct binding modes. The WT complex exhibits the smallest entropic change, indicating minimal dynamical conformational changes during binding. In contrast, the pWT complex shows a 5.1 kcal/mol higher entropic penalty, likely due to the strong interaction between the phosphate group and the FKBP12 binding site. Additionally, the entropic contributions to the free energy of binding are –7.2 kcal/mol for the Ser768Asp mutant and –11.0 kcal/mol for the Ser768Glu mutant.

Free energy of binding calculations of WT complex based

Gohlke and Case (67) emphasized that calculations of binding free energies from molecular dynamics trajectories are accurate only when the binding partners maintain stable conformations and dynamics during complex formation.

Molecular dynamics (MD) simulations were conducted on isolated FKBP12 and on the isolated model peptides, and evenly spaced snapshots were extracted from each trajectory. These snapshots, along with those from the FKBP12-peptide complexes, were used to calculate the free energy of binding, giving a value of –4.8 kcal/mol for the complex of FKBP12 and pWT. Notably, the difference in free energy of binding between the two methods (single versus separated MD trajectories) was only 0.9 kcal/mol, indicating that binding between FKBP12 and pWT must not involve significant conformational changes.
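The two protocols compared here can be written compactly. The following is a sketch in commonly used MM-GBSA notation; the symbols $E_{\mathrm{MM}}$, $G_{\mathrm{GB}}$, $G_{\mathrm{SA}}$ and the subscripted angle brackets (an average over snapshots of the named trajectory) are assumed conventions, not definitions taken from this section.

```latex
\begin{align}
G &= E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - TS \\
\Delta G_{\mathrm{bind}}^{\mathrm{three}} &=
  \langle G_{\mathrm{complex}} \rangle_{\mathrm{complex}}
  - \langle G_{\mathrm{receptor}} \rangle_{\mathrm{receptor}}
  - \langle G_{\mathrm{ligand}} \rangle_{\mathrm{ligand}} \\
\Delta G_{\mathrm{bind}}^{\mathrm{single}} &=
  \langle G_{\mathrm{complex}} - G_{\mathrm{receptor}} - G_{\mathrm{ligand}} \rangle_{\mathrm{complex}}
\end{align}
```

In the single-trajectory form every term is evaluated on the same snapshots of the complex, so the receptor and ligand have identical geometries on both sides of the subtraction. In the three-trajectory form the free partners are sampled separately, so any conformational difference between bound and unbound states enters the result.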

The binding free energy of FKBP12 with the WT peptide changes significantly, from –5.4 kcal/mol in Single-Trajectory calculations to 4.5 kcal/mol in Three-Trajectories calculations, highlighting the substantial conformational changes in the WT complex compared to the pWT complex. This discrepancy arises because, in Single-Trajectory calculations, the intramolecular energy is fully canceled out due to the identical conformations of the binding partners in each snapshot. However, when large conformational changes occur during binding, the ligand and receptor adopt notably different conformations in their bound and unbound states, leading to significant differences in intramolecular energies that affect the calculated free energy of binding. Notably, there is a 9.8 kcal/mol difference in free energies between single and separated MD trajectories for the WT complex, translating to a factor of 10^7 in the binding equilibrium constant. This finding aligns with experimental evidence indicating that FKBP12 does not bind to unphosphorylated TRPC6.
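The factor of 10^7 follows from the relation K = exp(–ΔG/RT). The short sketch below reproduces the arithmetic; the temperature of 298.15 K is an assumption, since the simulation temperature is not stated in this section, and the function name is illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant_ratio(ddg_kcal_per_mol, temperature_k=298.15):
    """Ratio of binding constants implied by a free energy difference.

    Uses K1/K2 = exp(ddG / RT); the default temperature is an assumption.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# 9.8 kcal/mol is the single- versus separated-trajectory difference quoted
# for the WT complex.
ratio = equilibrium_constant_ratio(9.8)
print(f"K ratio = {ratio:.2e} (log10 = {math.log10(ratio):.1f})")
```

Because the dependence is exponential, each additional 1.4 kcal/mol at room temperature changes the equilibrium constant by roughly one order of magnitude.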

Free energy of binding calculations of mutant complexes

The mutant Ser768Asp complex demonstrated an unfavorable binding affinity to FKBP12, with a calculated free energy of binding of 1.3 kcal/mol from a single complex trajectory. Separated trajectory calculations gave an even less favorable binding free energy of 28.0 kcal/mol, consistent with the significant RMSD fluctuations of this complex. The disparity between the binding free energies from single and separated trajectories underscores the thermodynamic instability of the complex.

This mutant appears unable to form a stable complex with FKBP12. Our hypothesis was that the carboxylate group of the aspartic acid residue would mimic the phosphate group of pSer768 and form a similar complex with FKBP12, albeit with a reduced binding affinity. The Coulombic interaction plays a crucial role in this process.

In this complex, the electrostatic contribution of the carboxylate group in the mutation is significant, at –9.4 kcal/mol. This group exhibits weaker Coulombic interactions with FKBP12 than the phosphate group of pSer768 in the pWT peptide, yet it interacts more strongly than the hydroxyl group of the Ser768 residue in the WT peptide. As in the pWT complex, solvation effects (ΔG_GB) make an unfavorable contribution to the total free energy of binding. Although the Coulombic contribution is favorable, it cannot compensate for the desolvation energy of the aspartate residue, and the total binding is thermodynamically unfavorable.

The binding free energy of the Ser768Glu mutant to FKBP12 is 14.1 kcal/mol. Although the two mutations involve similar residues (Asp and Glu), the Coulombic interaction in the Ser768Glu mutant complex (–45.1 kcal/mol) is significantly weaker than that of the Ser768Asp mutant complex (–79.4 kcal/mol). Additionally, the desolvation energy of the Ser768Glu complex (64.4 kcal/mol) is notably lower than that of Ser768Asp (103.8 kcal/mol). Understanding this discrepancy requires further structural analysis of these complexes, which is addressed later.
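The competition between Coulombic attraction and desolvation can be made explicit by summing the two terms quoted above. This is only a sketch of that arithmetic: it leaves out the van der Waals, nonpolar solvation, and entropic terms, so the result is not the total free energy of binding, and the dictionary layout and function name are illustrative.

```python
# Coulombic and desolvation terms quoted in the text, in kcal/mol.
terms = {
    "Ser768Asp": {"coulomb": -79.4, "desolvation": 103.8},
    "Ser768Glu": {"coulomb": -45.1, "desolvation": 64.4},
}


def net_electrostatics(term):
    """Sum of the Coulombic and desolvation terms for one complex."""
    return term["coulomb"] + term["desolvation"]


net = {name: net_electrostatics(term) for name, term in terms.items()}
for name, value in net.items():
    print(f"{name}: Coulomb + desolvation = {value:+.1f} kcal/mol")
```

For both mutants the sum is positive, which matches the statement that the favorable Coulombic term does not compensate for the desolvation penalty.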

Decomposition analysis of the free energy of binding

To analyze the binding interactions, we decomposed the total free energy of binding for each complex into contributions from individual amino acid residues. Because intramolecular energies cancel out in the free energy calculations derived from the MD trajectory of the complex, we focus solely on the contribution of each residue. This approach makes it easy to identify the residues that significantly affect the binding energy.
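Once per-residue contributions are available, picking out the important residues is a simple filter and sort. The sketch below assumes the contributions are held in a dictionary keyed by residue label; the function name, the 1.0 kcal/mol threshold, and the example values are hypothetical placeholders and are not results from this work.

```python
def split_contributions(per_residue, threshold=1.0):
    """Return (favorable, unfavorable) residues with |contribution| >= threshold.

    Negative values are favorable for binding. Each list is sorted so that
    the largest contribution in magnitude comes first.
    """
    favorable = sorted(
        ((res, e) for res, e in per_residue.items() if e <= -threshold),
        key=lambda item: item[1],
    )
    unfavorable = sorted(
        ((res, e) for res, e in per_residue.items() if e >= threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    return favorable, unfavorable


# Placeholder values in kcal/mol, NOT data from the dissertation.
example = {"resA": -2.5, "resB": -0.3, "resC": 1.4, "resD": -1.1}
favorable, unfavorable = split_contributions(example)
print("favorable:", favorable)
print("unfavorable:", unfavorable)
```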

The binding of the wild-type TRPC6 peptide is dominated by hydrophobic amino acids, particularly Leu759, Pro760, Phe763, and Pro767, which significantly enhance binding affinity. In contrast, the hydrophilic residues Asn764 and Ser768 contribute unfavorably to the total free energy of binding, indicating that the FKBP12 binding site is predominantly hydrophobic. The phosphorylated wild-type (pWT) peptide also benefits from its hydrophobic residues, while its two hydrophilic residues make opposing contributions to binding. Asn764 enhances stability, while the phosphorylated residue pSer768 contributes unfavorably to the overall free energy of binding. This apparent paradox shows that the unfavorable contribution of pSer768 itself does not mean that phosphorylation of Ser768 reduces the stability of the complex.

Free energy of binding calculations reveal that the phosphate group contributes a significant and thermodynamically unfavorable desolvation energy, alongside favorable Coulombic interactions, during the binding process. Notably, the unfavorable desolvation energy is attributed solely to pSer768, while the favorable Coulombic interactions are shared among various residues of the model peptide and the FKBP12 receptor. As a result, the pSer768 residue by itself shows a considerable unfavorable contribution to the total free energy of binding. The residue distributions in the Ser768Asp and Ser768Glu mutants closely resemble those of the wild-type peptide, indicating no substantial improvement associated with the amino acid change at position 768.

Figure 3.7 Decomposition of binding free energies into each single residue from TRPC6 peptides: wild type (blue), phosphorylated wild type (red), mutant Ser768Asp (yellow), mutant Ser768Glu (cyan). Residue axis: Ac, Leu759, Pro760, Val761, Pro762, Phe763, Asn764, Leu765, Val766, Pro767, 768*, Pro769, Nme.

The energy contributions from amino acid residues in FKBP12 are unevenly distributed, and come primarily from residues 24 to 60 and 79 to 107. Notably, residues Asp37, Lys44, and Lys47 play a crucial role in differentiating FKBP12 binding with pWT from the other peptide complexes. Lys44 and Lys47 are positioned at the binding pocket and are both positively charged under physiological conditions, so they likely engage in strong Coulombic interactions with the phosphate group of pSer768. In the pWT complex, the energy contributions from Lys44 and Lys47 are similar, differing by only 0.5 kcal/mol. However, in the WT complex, their contributions have opposite signs, with Lys47 providing a slightly favorable effect and Lys44 an unfavorable one. In the Ser768Asp mutant, Lys44 significantly enhances binding energy, while Lys47 contributes favorably, albeit to a lesser extent (~ –0.5 kcal/mol).

Figure 3.8 Decomposition of binding free energies into each single residue from FKBP12 (residues 24 to 60): wild type (blue), phosphorylated wild type (red), mutant Ser768Asp (yellow), mutant Ser768Glu (cyan)

Figure 3.9 Decomposition of binding free energies into each single residue from FKBP12 (residues 61 to 107): wild type (blue), phosphorylated wild type (red), mutant Ser768Asp (yellow), mutant Ser768Glu (cyan)

Figure 3.10 illustrates the residues that significantly contribute to the total binding free energy at the FKBP12 binding site. The hydrophobic residues, highlighted in green, include Tyr26, Phe36, Phe39, Phe46, Phe48, Met49, Trp59, Val55, Ile56, Tyr82, His87, Ile90, and Ile91. The hydrophilic residues, marked in blue, are Asp37, Arg42, Lys44, Lys47, Glu54, and Gln53. The model peptide depicted in the figures is the phosphorylated wild type in its docked orientation.

The binding interactions in the Ser768Asp and Ser768Glu mutants reveal that the two lysine residues, despite their proximity, are not equally significant for binding to the carboxylate group. In the Ser768Glu mutant, Lys44 and Lys47 each contribute approximately –0.5 kcal/mol to the total free energy of binding, which is minimal compared to their contributions in the pWT complex. Notably, Asp37 shows a substantial favorable contribution only in the pWT complex, while its contribution is unfavorable in the mutant complexes. The binding distinctions among these complexes primarily stem from the contributions of Lys44, Lys47, and Asp37. The two carboxylate mutants therefore do not exhibit a binding pattern akin to that of the pWT peptide. The hydrophobic residue Phe46, positioned between Lys44 and Lys47, along with other hydrophobic residues such as Val55, Ile56, and Trp59, is crucial for binding across these complexes.

The FKBP12 segment from residue 79 to 107 contains five key residues (Tyr82, His87, Ile90, Ile91, and Phe99) that significantly influence the binding free energies across all four complexes. Most of these residues are hydrophobic, with His87, Ile90, and Ile91 situated in the 80-90 region (84 ATGHPGIIPPHA 95). This region predominantly comprises hydrophobic residues, apart from His87 and two glycine residues, which are only weakly hydrophilic. The model peptides also contain more hydrophobic residues than hydrophilic ones, indicating that hydrophobic interactions between FKBP12 and the ligand peptides are crucial for the binding process. Additionally, the contribution of the hydrophilic residue Glu107 is thermodynamically unfavorable in all four complexes due to the hydrophobic nature of the binding site.

3.3.11 Decomposition analysis of the free energy of binding based on Three-Trajectory

The binding free energies of the four complexes were also decomposed into per-residue contributions using the separated molecular dynamics (MD) trajectories. The structures of FKBP12 and the phosphorylated TRPC6 peptide are relatively rigid before and after binding, so the distribution of significant contributions to the total binding energy is comparable between the single and separated MD simulations.

The other three complexes exhibit decomposition patterns distinct from those of the previous analysis, primarily due to a higher number of residues with thermodynamically unfavorable contributions to the free energies of binding. These unfavorable contributions are attributed to significant conformational changes between the bound and unbound FKBP12 and TRPC6 peptides, as evidenced by the larger RMSD fluctuations in these complexes. Despite this, the residues that previously contributed favorably to the binding energies remain favorable in the separated trajectory analysis.

3.3.12 Spatial distribution of residues with major contributions to the free energy of binding

Our analysis of the free energy of binding reveals that both hydrophobic and hydrophilic residues play crucial roles in binding affinity. The spatial distribution of these residues at the FKBP12 binding site shows that hydrophobic residues are positioned at the bottom, creating a hydrophobic pocket for the model peptide. Key residues such as Phe36, Val55, Ile56, Trp59, and Phe99 are situated at the base of the pocket, while His87, Ile90, and Ile91 from the 80-90's loop form a hydrophobic wall on one side. Phe46, Phe48, and Met49 form a wall on the opposite side. Collectively, these residues establish a large "U" shaped hydrophobic pocket that accommodates the binding peptide from head to tail.

The two open sides of the pocket provide access for hydrophilic residues, with Asp37, Arg42, Lys44, and Lys47 forming an arc that interacts with the side chains of the model peptide. Free energy decomposition analysis indicates that these four residues contribute significantly to the binding affinity in the pWT complex and its mutants, in contrast to the complex with the wild-type TRPC6 peptide. This distinction highlights the critical role of hydrophilic interactions in molecular recognition. Although the side chain of Glu54 points toward the TRPC6 peptide, it contributes unfavorably to binding affinity in all complexes. Additionally, Gln53 and Glu107, positioned away from the binding pocket, exhibit similar unfavorable energy contributions in all complexes.

The visualization of the surfaces formed by hydrophobic and hydrophilic residues reveals a specific binding pocket for the molecular recognition domain of TRPC6, both before and after TRPC6 phosphorylation. However, this pocket alone does not allow FKBP12 to establish a thermodynamically favorable complex with TRPC6; phosphorylation at the Ser768 residue is essential for this binding.

Interactions among Lys44, Lys47 and Residue768

The binding modes of the unphosphorylated and phosphorylated TRPC6 peptides differ significantly, particularly in the interactions between residue 768 and the Lys44 and Lys47 residues of FKBP12. Distances between these residues were measured in each complex. In the wild type (WT) complex, the initial distances between Ser768-OG and Lys44-NZ, Ser768-OG and Lys47-NZ, and Lys44-NZ and Lys47-NZ were all approximately 8 Å, forming an equilateral triangle. However, after 4 ns of molecular dynamics (MD) simulation, Ser768 shifted closer to Lys47, and the distances between Ser768 and Lys44 and between the two lysine residues increased to about 12 Å, disrupting the triangular arrangement. After 8 ns, the distances gradually returned to around 8 Å, reestablishing the equilateral triangle, which persisted for approximately 6 ns before being disrupted again.

Figure 3.11 Atomic pair distance analysis for Lys44, Lys47 and residue 768 from model peptides: distance between Lys44 and residue 768 (black), Lys47 and residue 768 (red) and Lys44 and Lys47 (blue)
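The distance analysis amounts to computing three pairwise distances per snapshot and checking whether they are roughly equal. The following is a minimal sketch for a single frame; the coordinates are hypothetical (an ideal triangle with 8 Å sides), and the function names and the 1 Å tolerance are assumptions made for illustration.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Distances (in Å) between every pair of labeled atoms in one frame."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(coords, 2)
    }


def is_near_equilateral(distances, tolerance=1.0):
    """True if the longest and shortest distances differ by <= tolerance."""
    values = list(distances.values())
    return max(values) - min(values) <= tolerance


# Hypothetical coordinates in Å, not taken from any trajectory.
frame = {
    "Lys44-NZ": (0.0, 0.0, 0.0),
    "Lys47-NZ": (8.0, 0.0, 0.0),
    "Ser768-OG": (4.0, 8.0 * math.sqrt(3) / 2, 0.0),
}
distances = pair_distances(frame)
for pair, d in distances.items():
    print(f"{pair[0]} to {pair[1]}: {d:.2f} Å")
print("near-equilateral:", is_near_equilateral(distances))
```

Applying the same two functions to every snapshot of a trajectory would give the three distance traces of the kind plotted in Figure 3.11.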

Although both Lys44 and Lys47 can be at similar distances from Ser768 simultaneously, Ser768 showed an obvious interaction with Lys47 during most of the simulation. Consistent with this, the free energy decomposition showed that Lys47 contributes favorably to binding of the wild-type peptide, while Lys44 makes a slightly unfavorable contribution. This distance-dependent behavior suggests that Lys44 could play a role in binding if it were to switch roles with Lys47. However, the timescale required to observe such a switch between the lysine residues in simulations remains uncertain.

The complex of FKBP12 with the Ser768-phosphorylated (pWT) peptide exhibited distinct behavior during the molecular dynamics (MD) simulation. Initially, the phosphoserine residue was tightly coupled to Lys47, with a distance of approximately 5 Å between pSer768-OG and Lys47-NZ, while the distances to Lys44-NZ were around 12 Å. After 6 ns, pSer768 began to approach Lys44 and move away from Lys47, as the two lysine residues also came closer together. By 7 ns, the distances among the three residues converged to about 8 Å, forming an approximate equilateral triangle. Subsequently, pSer768 continued to move toward Lys44 and away from Lys47, so that the two lysine residues switched positions relative to pSer768.

During the 20 ns molecular dynamics simulation, switching occurred four times, with either Lys44 or Lys47 maintaining close contact with the phosphoserine residue. This finding highlights the importance of both lysine residues in the interaction with the phosphate group. Lys47 spent a longer time in proximity to pSer768 than Lys44, which may account for the greater contribution of Lys47 to the overall free energy of binding.

Switching between the lysine residues relies on the proximity of their ε-NH3+ groups, which must come within approximately 5 Å of each other during each switching event. In the phosphorylated peptide, the negative charge of the phosphate group mitigates the repulsion between the lysines, allowing them to approach closely and transfer the salt bridge. In the mutant peptides, the lysine residues can approach each other but do not reach the necessary threshold distance, so the transfer does not succeed.

In the FKBP12 complex with the Ser768Asp mutant, the Asp768 residue closely interacts with Lys44 of FKBP12, supporting the hypothesis that strong interactions between the lysine residues and the phosphate group of phosphorylated serine are favorable. However, this interaction is insufficient for both lysine residues to engage during binding. In the fifth nanosecond of the MD simulation, Lys47 moved closer to Asp768, while the distance between Lys44 and Asp768 increased. From 5 to 7 ns, the side chains of the three residues remained within 5 to 8 Å of each other, yet contact switching between Lys44 and Lys47 with Asp768 did not occur during this period.

Lys44 maintained close proximity to Asp768 over 7 ns, while the distance between Lys47 and Asp768 gradually increased. The distance between Lys44-NZ and Asp768-CG fluctuated in a roughly sinusoidal pattern with an approximate period of 8 ns. The distance between Lys44 and Lys47 fluctuated with a similar pattern and magnitude. When the distances from Lys47 to Asp768 or Lys44 reached their minimum, the residues formed a configuration resembling an equilateral triangle, akin to the pWT complex, in which Lys44 and Lys47 interchange roles with respect to pSer768. However, during the two periods of equilateral triangle formation, the expected switching between Lys44 and Lys47 did not occur. This suggests that the aspartic acid residue closely mimicked the phosphorylated serine, yet the interaction between the carboxylate group of aspartic acid and the ammonium groups of the lysine residues was insufficient to promote switching between the two lysine residues.

The Ser768Glu mutant (Figure 3.11d) behaved quite differently from the Ser768Asp mutant. First, Glu768 could not maintain close contact with Lys44. The distance between the ammonium nitrogen of Lys44 and the carboxylate carbon of Glu768 fluctuated in a roughly sinusoidal pattern, mostly between 5 and 15 Å, although it came as close as 3 Å at 5 ns. Contrary to expectations, Lys44 and Glu768 did not exhibit strong coupling, and neither did Lys47 and Glu768: the distance between the ammonium group of Lys47 and the carboxylate carbon of Glu768 ranged from 12 to 20 Å, so their coupling was even weaker.

Both Lys44 and Lys47 carry positive charges under physiological conditions, so the Coulombic interaction between the two side chains requires an external driving force to bring them closer together. Glu768 does not effectively contribute to altering the relative orientation of the two lysine residues.

Comparison of RMSD fluctuation between bound and unbound TRPC6 peptides

MD simulations were conducted on the four isolated TRPC6 peptides, enabling a comparison of their flexibility before and after FKBP12 binding. The root mean square deviation (RMSD) fluctuations of each peptide (wild type, phosphorylated wild type, mutant Ser768Asp, and mutant Ser768Glu) were calculated to assess the impact of FKBP12 binding (Figure 3.13).
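For reference, the RMSD of a snapshot relative to a reference structure is the square root of the mean squared atomic displacement. The sketch below assumes the two coordinate sets have already been superimposed (no fitting is performed here), and the coordinates are made-up values chosen only to make the result easy to check.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets.

    Assumes the structures are already superimposed; no alignment is done.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(total / len(coords))


# Made-up coordinates in Å: every atom is displaced by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
value = rmsd(shifted, reference)
print(f"RMSD = {value:.2f} Å")
```

Evaluating this quantity for each snapshot against a common reference structure gives an RMSD trace over simulation time.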

Figure 3.13 Unbound and bound wild type peptide RMSD in each complex: bound mode

The wild type (WT) peptide exhibits higher RMSD values in its unbound state than in its bound state, indicating significant conformational changes upon binding. When bound to FKBP12, the average RMSD ranges from 2 to 3 Å, while the unbound peptide maintains an average RMSD of approximately 5 to 6 Å throughout the simulation. Binding constrains the flexibility of the wild-type TRPC6 peptide, so that its conformation changes in discrete, stable steps. At the start of the MD simulation, the RMSD of the bound peptide quickly stabilizes around 2 Å, with minor fluctuations thereafter.

After 6 ns, the RMSD of the bound wild-type TRPC6 peptide rose sharply to 3 Å and remained near this level for more than 11 ns.

FKBP12 effectively stabilizes the peptide, preventing the large conformational changes seen in its unbound state. This stabilization is consistent with the observed difference between the free energies of binding calculated from a single complex molecular dynamics (MD) trajectory and from separate MD trajectories.

The RMSD analysis of the unbound and bound phosphorylated TRPC6 (pWT) peptide revealed similar fluctuation patterns, particularly between 6 and 10 ns, where significant RMSD fluctuations occurred in both states. During this period, Lys44 and Lys47 switched in forming salt bridges with pSer768, which is reflected in conformational changes of the TRPC6 peptide. The phosphorylated TRPC6 peptide maintained consistent conformational and dynamical properties regardless of FKBP12 binding, indicating a well-defined binding interface. Although the unbound peptide exhibited larger RMSD fluctuations because it lacks the constraints imposed by FKBP12, both forms displayed comparable fluctuation patterns throughout the simulation. The unique switching between Lys44 and Lys47 in the FKBP12 complex could stem from the conformational influence of the phosphoserine residue or from its strong Coulombic interactions with the lysine residues. Ultimately, phosphorylation emerges as a crucial factor for the unique and efficient binding characteristics of the TRPC6 structure.

The Ser768Asp mutant exhibits minimal fluctuations in both the unbound and bound states, with RMSD values ranging from 3 to 4 Å. While the bound mode shows slightly tighter RMSD fluctuations, the mutant remains largely unconstrained by FKBP12, indicating no significant conformational changes between the two states. This finding is consistent with the free energy calculations based on separate molecular dynamics (MD) trajectories, which indicate that this mutant does not bind favorably to FKBP12.

The unbound Ser768Glu mutant exhibited significant RMSD fluctuations, while the bound peptide maintained steady RMSD levels. Both states displayed similar flexibility for the first 9 ns; however, the unbound peptide underwent a conformational change after this period, resulting in an RMSD fluctuation of approximately 5 Å. Another substantial conformational shift occurred after 14 ns in the unbound simulation, which was not observed in the bound peptide. The mutant does interact with FKBP12, and this interaction limited these conformational changes. Despite these differences, the overall fluctuations in the two simulations were comparable, in line with the free energy calculations indicating that FKBP12 cannot bind to the Ser768Glu mutant.

Comparison of RMSD fluctuation between bound and unbound FKBP12

The RMSD values of FKBP12 were analyzed in its unbound state and in its complexes with the four peptides. Unbound FKBP12 exhibited the smallest RMSD fluctuation, demonstrating structural stability with average values below 2 Å. In contrast, FKBP12 bound to the WT TRPC6 peptide showed increased fluctuation during the simulation, indicating that the interaction with the TRPC6 peptide significantly affected its conformation.

Figure 3.14 RMSD of FKBP12 in four complexes and unbound mode: unbound (black), in complex with wild type (red), with phosphorylated wild type (green), with mutant Ser768Asp (green), with mutant Ser768Glu (yellow)

The average RMSD of FKBP12 when bound to the phosphorylated (pWT) peptide is higher than when bound to the wild-type (WT) peptide. Despite this increase, FKBP12 does not undergo a noticeable conformational change when interacting with the pWT peptide. Once FKBP12 has adapted to the pWT peptide, it maintains its conformation without significant alterations.

Binding of FKBP12 to the Ser768Asp mutant resulted in significant conformational changes, with RMSD values exceeding 3 Å, distinguishing it from the other complexes. The lack of favorable binding affinity for the Ser768Asp mutant may lead to unfavorable intramolecular interactions. In contrast, FKBP12 in the Ser768Glu complex exhibited minimal RMSD fluctuations, similar to the pWT complex. The primary distinction between the Asp and Glu mutants lies in the length of their carboxylate side chains, highlighting the sensitivity of FKBP12-TRPC6 interactions to structural changes at the binding site.

Conclusions

This study uses molecular dynamics simulations to investigate the protein-protein interactions between FKBP12 and TRPC6 channels. Experimental findings indicate that FKBP12 can only bind to TRPC6 following the phosphorylation of Ser768 within the intracellular domain. To explore the binding domain of TRPC6 with FKBP12, four model peptides were constructed: wild-type TRPC6, phosphorylated TRPC6, and two proposed mutants, Ser768Asp and Ser768Glu.

The FKBP12 complex with the phosphorylated wild-type TRPC6 peptide exhibits the smallest RMSD values, indicating that it is the most stable and least flexible among the studied complexes. In contrast, the complexes containing the Asp and Glu mutants at position 768 show larger RMSD values than the wild-type peptide, which is unexpected given that these mutations were intended to mimic the negatively charged residue critical for binding. Free energy calculations reveal that only the phosphorylated wild-type TRPC6 peptide has a thermodynamically favorable binding affinity for FKBP12. The variation in the binding energies calculated for the wild-type peptide arises from significant conformational changes between its bound and unbound states. Neither the Asp nor the Glu mutant exhibits a favorable binding affinity for FKBP12 based on the free energy calculations.

The binding interaction between FKBP12 and TRPC6 involves approximately twenty key residues, including both hydrophobic and hydrophilic types, which are crucial for complex stability. Hydrophobic residues create a "U" shaped binding pocket, while hydrophilic residues, particularly Lys44 and Lys47, contribute significantly to the overall free energy of binding and are positioned at the ends of this pocket. The phosphate group of the phosphoserine residue in the pWT peptide forms a strong salt bridge with either Lys44 or Lys47 during the simulations, and these residues can switch roles over a timespan of 5 to 6 ns. This switching between Lys44 and Lys47 with respect to the hydrophilic residue at position 768 was not seen in the other peptide complexes, highlighting the distinct nature of this interaction.

Ser768 in the wild-type TRPC6 peptide did not show coupling with either Lys44 or Lys47.

The carboxylate group of Asp768 in the Ser768Asp mutant formed a salt bridge with Lys44 during the simulation, but Lys47 did not replace Lys44 in this interaction. While the aspartic acid residue in the FKBP12 complex could partly mimic phosphoserine through interactions with lysine residues, it failed to couple effectively with both Lys44 and Lys47. This limitation resulted in a reduced binding affinity of the Ser768Asp mutant for FKBP12. The Ser768Glu mutant exhibited different behavior despite having only one additional methylene unit.

MD simulations have identified a specific binding pocket for the interaction of FKBP12 with the TRPC6 intracellular domain, featuring both hydrophobic and hydrophilic residues. Critical to the stabilization of the complex of FKBP12 with the phosphorylated TRPC6 peptide are the Lys44 and Lys47 residues, which form strong salt bridges with the phosphate group. The simulation results indicate that the two proposed carboxylate mutants of the TRPC6 peptide have unfavorable binding affinity for FKBP12. Ongoing biological experiments aim to assess the expression of these mutants for comparison with the simulation findings. Further simulations and analyses may be necessary to better understand the molecular recognition between FKBP12 and the intracellular domain of TRPC6.

Introduction

Currently, new potential drug leads are designed based on crystal structures and/or reaction mechanisms of the chosen targets, which are mostly proteins (1, 2, 3).

The potential of a protein to be targeted for de novo drug design is largely determined by the understanding of its reaction mechanism. Purines, essential components of DNA and RNA, play a crucial role in biochemistry. For over four decades, inhibitors of purine biosynthesis have proven effective in treating cancer, inflammatory disorders, and various infections. Research into the purine biosynthetic pathways and their mechanisms has been ongoing since the early 1960s.

Recent studies have highlighted the differences in purine biosynthetic pathways between eukaryotes and prokaryotes. Specifically, the conversion of 5-aminoimidazole ribonucleotide (AIR) to 4-carboxyaminoimidazole ribonucleotide (CAIR) illustrates this divergence. In most prokaryotes, as well as in plants and yeast, the synthesis of N5-carboxyaminoimidazole ribonucleotide (N5-CAIR) from AIR is catalyzed by the enzyme N5-CAIR synthetase (PurK). This reaction requires one molecule of ATP and one molecule of HCO3-, and produces one molecule each of ADP and inorganic phosphate (Pi).

Figure 4.1 Reaction catalyzed by PurK and PurE Class I in prokaryotes

N5-CAIR mutase, classified as PurE Class I, converts N5-CAIR to CAIR. These two proteins have garnered significant attention due to their cooperative function, despite the lack of evidence for direct binding affinity between PurK and PurE Class I. Their interaction is essential for the effective transfer of the unstable intermediate N5-CAIR from the active site of PurK to PurE Class I. The mechanisms of the carboxylation catalyzed by PurK and the CO2 migration catalyzed by PurE are also of chemical interest (8).

In eukaryotes, the vertebrate AIR carboxylase, known as PurE Class II, efficiently converts AIR to CAIR in the presence of CO2. Sequence alignment of PurE Class I and II reveals significant similarities in their amino acid sequences and conserved residues, indicating potential structural and functional parallels. The differences between prokaryotic and eukaryotic systems present an opportunity to develop selective inhibitors that target prokaryotic purine biosynthetic enzymes while minimizing interactions with their eukaryotic counterparts.

Figure 4.2 Reaction catalyzed by PurE Class II in eukaryotes

The X-ray crystal structure of PurE Class I is now accessible from the Protein Data Bank (PDB), providing valuable insights into the enzyme's functionality Recent experiments have further illuminated the protein's mechanism, notably demonstrated by Meyer et al., who found that the CO2 unit of N5-CAIR carbamate directly migrates to the C4 position within the same molecule, as indicated by the lack of isotopic exchange between N5-CAIR and CO2 from the reaction medium Given this wealth of information, the reaction mechanism of PurE Class I presents an excellent opportunity for computational studies and detailed atomistic structural analysis.

Figure 4.3 AMI, MICA and AMIC models represent AIR, N5-CAIR and CAIR, respectively, where the CH3 group replaces the ribose-5-phosphate unit

Over the last ten years, advancements in computational power have made computational chemistry and biology essential for investigating enzymatic reaction mechanisms. Density Functional Theory (DFT) has emerged as a leading quantum mechanical method, effectively clarifying the reaction mechanisms of numerous enzymatic processes. This study focuses on applying DFT to explore the fundamental thermodynamic properties of the PurE enzyme.

This chapter discusses the PurK reactions in eukaryotes and prokaryotes and presents a computational investigation of the reaction mechanisms of PurE Class I. It draws on experimental results, including protein sequence analyses and crystal structures, to offer a novel atomistic description of the catalytic mechanism of the PurE Class I enzyme.

Computational Methods

In computational chemistry, selecting an appropriate functional is crucial for effective DFT calculations. The Becke3-Lee-Yang-Parr (B3LYP) hybrid functional has gained popularity due to its successful application in various studies, particularly in enzymatic systems, where it has given results that align well with experimental data. Consequently, B3LYP was selected for the DFT calculations in this research.

The calculations in this study encompass various species and processes, so the basis set must be selected with care. Specifically, anions, which possess loosely bound electrons, require a basis set that incorporates diffuse functions to give accurate results.

To accurately characterize hydrogen bonds, which are crucial in our system, a polarized basis set is essential for the calculations. For this purpose, the split-valence basis set 6-31+G(d), which incorporates both polarization and diffuse functions, was selected for the DFT calculations.

In this study, all structures were fully optimized using the B3LYP/6-31+G(d) method. To confirm minimum energy structures and provide zero-point vibrational energy corrections, analytical vibrational frequencies were calculated at the same theoretical level, applying a scaling factor of 0.9806. Thermal and entropic contributions to free energies were derived from the unscaled vibrational frequencies obtained from the B3LYP/6-31+G(d) calculations. Additionally, single-point energy calculations at the B3LYP/6-311+G(3df,2p) level were performed to refine the energies of many structures based on the geometries obtained from the B3LYP/6-31+G(d) method.
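
As a minimal sketch of the zero-point correction described above, the snippet below sums harmonic frequencies into a zero-point vibrational energy and applies the 0.9806 scaling factor. The frequency list is hypothetical (it is not taken from this work) and the function name is illustrative only.

```python
# 1 cm^-1 of vibrational energy corresponds to about 2.859144e-3 kcal/mol.
CM1_TO_KCAL_PER_MOL = 2.859144e-3
ZPE_SCALE = 0.9806  # scaling factor quoted in the text for B3LYP/6-31+G(d)


def scaled_zpe(frequencies_cm1, scale=ZPE_SCALE):
    """Return the scaled harmonic ZPE in kcal/mol: scale * 1/2 * sum(h*c*nu)."""
    # Imaginary modes are conventionally printed as negative numbers; skip them.
    real_modes = [f for f in frequencies_cm1 if f > 0.0]
    return scale * 0.5 * sum(real_modes) * CM1_TO_KCAL_PER_MOL


# Hypothetical frequencies (cm^-1), chosen only to exercise the arithmetic.
hypothetical_freqs = [500.0, 1000.0, 1500.0, 3000.0]
zpe_scaled = scaled_zpe(hypothetical_freqs)
zpe_unscaled = scaled_zpe(hypothetical_freqs, scale=1.0)
```

Passing `scale=1.0` gives the unscaled sum, which mirrors the text's use of unscaled frequencies for the thermal and entropic terms.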

Transition states (TS) were determined for the conversion of N5-CAIR to CAIR by identifying geometries that exhibit a single imaginary vibrational frequency. The intrinsic reaction coordinate (IRC) paths were followed in both forward and reverse directions for each reaction.
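
The single-imaginary-frequency criterion can be sketched as a small check on a list of computed frequencies. This is an illustration under the common convention that imaginary modes are printed as negative wavenumbers; the function names and the example frequencies are hypothetical.

```python
def count_imaginary(frequencies_cm1):
    """Count modes reported as negative, i.e. imaginary, frequencies."""
    return sum(1 for f in frequencies_cm1 if f < 0.0)


def classify_stationary_point(frequencies_cm1):
    """Classify a stationary point by its number of imaginary modes."""
    n_imag = count_imaginary(frequencies_cm1)
    if n_imag == 0:
        return "minimum"
    if n_imag == 1:
        return "transition state"
    return "higher-order saddle point"


# Hypothetical frequency lists (cm^-1) for illustration.
minimum_like = [120.0, 450.0, 1600.0]
ts_like = [-350.0, 200.0, 1500.0]
```

A structure passing this check still needs the IRC calculation mentioned above to confirm which reactant and product it connects.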

The reactants and products connected to each TS were then optimized for each reaction, which allowed the potential energy surface (PES) to be explored. This approach helps identify the most energetically favorable pathway for the overall reaction.

Tomasi and colleagues developed the polarizable continuum model (PCM) to address solvation effects in organic molecules. This model allows for the adjustment of the solute's electron density from the gas phase to the condensed phase. The PCM was applied for chlorobenzene (ε=5.62) and water (ε=78.39) through single-point energy calculations on gas-phase optimized structures, enabling the evaluation of solvation effects in both non-polar and polar solvent models.

All geometry optimizations, transition state searches, frequency calculations and PCM single-point energy calculations were performed using Gaussian 03 (ref. 30) at the Ohio Supercomputer Center.

Results and Discussions

Simplified Models

The three substrates CAIR, N5-CAIR, and AIR each feature a ribose-5-phosphate (R5P) group attached to their imidazole ring, which remains unchanged during their transformations. To simplify the system and reduce computational demands, the R5P group was replaced with a methyl group (-CH3) in all three substrates. In this simplified model, 5-amino-1-methyl-imidazole (AMI) corresponds to AIR, (1-methyl-1H-imidazol-5-yl)-carbamic acid (MICA) represents N5-CAIR, and 5-amino-1-methyl-1H-imidazole-4-carboxylic acid (AMIC) denotes CAIR.

Thermochemistry for transformation of AMI to AMIC

As shown in Figure 4.1, the transformation of AIR to N5-CAIR, which is catalyzed by PurK, needs one unit of ATP. However, neither the transformation of N5-CAIR to CAIR nor the conversion of AIR to CAIR consumes ATP, indicating that these reactions do not require energy input. This suggests that the first reaction, AIR to N5-CAIR, is endothermic. To test this hypothesis, the thermodynamic profiles of the model reactions were calculated and are detailed in Table 4.1.

                   Gas Phase (a)        PCM (b) (ε=5.62)     PCM (b) (ε=78.39)
Model Reaction     Δ298H    Δ298G       Δ298H    Δ298G       Δ298H    Δ298G
AMI to MICA        21.1     31.7        2.9      12.6        -1.9     8.6

(a) For the gas-phase calculations, the theoretical level is B3LYP/6-311+G(3df,2p)//B3LYP/6-31+G(d).
(b) Single-point energy calculations were carried out with the PCM methods at the B3LYP/6-311+G(3df,2p) level using gas-phase optimized structures at the B3LYP/6-31+G(d) level.

Table 4.1 Thermochemistry (kcal/mol) of Model Reactions for AMI to MICA and AMIC

The transformation of AMI to AMIC is endothermic in the gas phase, with an enthalpy change of +11.7 kcal/mol, but becomes exothermic in chlorobenzene (ε=5.62) and water (ε=78.39), with enthalpy changes of -3.9 and -6.0 kcal/mol, respectively. Solvation effects therefore significantly influence the reaction, shifting it from an endothermic process in the gas phase to an exothermic one in solution. Because the reaction combines two reactants to form one product, it carries an unfavorable entropy change, resulting in Gibbs free energy changes of +7.8 kcal/mol in chlorobenzene and +5.7 kcal/mol in water.
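
The entropic penalty implied by these numbers can be made explicit with the relation ΔG = ΔH - TΔS. The sketch below uses only the enthalpy and free energy changes quoted above for AMI to AMIC; the helper name and dictionary layout are illustrative.

```python
def entropy_term(delta_h, delta_g):
    """Return -T*dS = dG - dH in kcal/mol, rounded to the quoted precision."""
    return round(delta_g - delta_h, 1)


# (dH, dG) in kcal/mol for AMI to AMIC, as quoted in the text.
ami_to_amic = {
    "chlorobenzene": (-3.9, 7.8),
    "water": (-6.0, 5.7),
}

penalties = {
    solvent: entropy_term(dh, dg) for solvent, (dh, dg) in ami_to_amic.items()
}
```

Both solvents give the same -TΔS of 11.7 kcal/mol, which is consistent with the thermal and entropic corrections being taken from the gas-phase frequency calculations while PCM changes only the single-point energy.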

The transformation of MICA to AMIC is an exothermic reaction with enthalpy changes of -9.4 kcal/mol in the gas phase, -6.1 kcal/mol in chlorobenzene, and -4.0 kcal/mol in water, indicating the instability of N5-CAIR. In experiments without PurE Class I, N5-CAIR rapidly decomposes into AIR in aqueous solution. However, in the presence of PurE Class I, N5-CAIR converts to CAIR at a much faster rate than its aqueous decomposition. Notably, PCM calculations reveal that the MICA to AMIC transformation is more exothermic and exoergic in a hydrophobic environment, which better represents the enzyme's active site. Consequently, this reaction is thermodynamically driven without the need for ATP coupling.

The transformation of AMI to MICA is thermodynamically unfavorable due to the instability of N5-CAIR: the calculated enthalpy change is endothermic in both the gas phase and chlorobenzene, and only slightly exothermic in water (-1.9 kcal/mol). Considering the hydrophobic nature of the enzyme's active site, this agrees with the experimental finding that one unit of ATP is required to drive this reaction step.

Thus, using simplified models, the thermodynamics of the three reactions of the purine biosynthetic pathway were investigated. The calculated results qualitatively agree with the experimental results.

A further goal of this study is to understand the catalytic mechanisms of PurE (Class I and II) and PurK, which are shown in Figures 4.1 and 4.2. Since the PurE Class I enzyme uses only N5-CAIR as its substrate and its crystal structure is available, it is more accessible for study than the other systems. Consequently, this research concentrates primarily on the enzymatic mechanism of PurE Class I.

From MICA to AMIC through cationic or anionic pathways: a brief glance

The conversion of MICA to AMIC involves the relocation of the CO2 unit from N6 to C4, which can occur via cationic or anionic pathways. In the cationic pathway, the reactant is protonated to form a positively charged pre-reactant, followed by the migration of the carboxyl group, resulting in cationic intermediates. Additionally, various side reactions may take place, prompting the investigation of three alternative structures.

Figure 4.4 Model structures related to the cationic pathways for carboxyl group migration

To generalize the findings, the migrations of methyl and hydrogen were analyzed alongside that of the carboxyl group. To compare the thermodynamics of the various cationic reactions, the energies of the structures depicted in Figure 4.4a were set to zero; the corresponding pathways and computational results are presented in Table 4.2.

Migrations of the carboxyl group, methyl group, and hydrogen from N6 to N1 are endothermic, while migrations to N3 or C4 are exothermic in the gas phase, chlorobenzene, and water. The most exothermic migration is to N3, although migration to C4 yields the desired product. Notably, the side product from N3 migration has not been detected experimentally, suggesting that this side reaction is not enzymatically competitive, assuming the reaction mechanism involves cationic intermediates. The N3 position is three bonds away from N6, and the migration requires the carboxyl unit to traverse the imidazole ring over a distance of approximately 5 Å, making this reaction kinetically unfavorable compared to migration to C4.

Table 4.2 Energy comparison of different products generated via methylation, carboxylation and protonation reactions of AMI (B3LYP/6-31+G(d)) (a)

(a) Energies are in kcal/mol; see Figure 4.4 for the labeled structures. Methylation at the N1 site yields a single isomer (structure e is equivalent to structure d), and protonation at the C4 site also yields a single isomer (structure c is equivalent to structure b).

PCM calculations indicate that the enthalpy and Gibbs free energy changes for the cationic reactions are significantly more positive in chlorobenzene than in the gas phase, by approximately 7 kcal/mol on average, and by about 11 kcal/mol in aqueous solution. This suggests that solvation effects stabilize the cationic pre-reactants and their analogs, with greater stabilization in more polar solvents. Experimental data reveal that N5-CAIR is an unstable species, with a half-life of only 0.9 minutes at pH 7.8 and 30 °C in aqueous solution. These findings imply that solvation effects may enhance the stability of N5-CAIR, enabling it to reach the active site of the target enzyme, PurE Class I.
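
To give a feel for how short-lived N5-CAIR is, the quoted half-life can be converted into a rate constant and a surviving fraction. This sketch assumes simple first-order decay, which is an assumption made here for illustration and is not stated in the text; the function names are hypothetical.

```python
import math

HALF_LIFE_MIN = 0.9  # half-life of N5-CAIR quoted for pH 7.8 and 30 °C


def rate_constant(half_life_min):
    """First-order rate constant k = ln(2) / t_half, in min^-1."""
    return math.log(2.0) / half_life_min


def fraction_remaining(t_min, half_life_min=HALF_LIFE_MIN):
    """Fraction of the starting material left after t_min minutes."""
    return math.exp(-rate_constant(half_life_min) * t_min)


k_decay = rate_constant(HALF_LIFE_MIN)
after_one_half_life = fraction_remaining(HALF_LIFE_MIN)
after_three_minutes = fraction_remaining(3.0)
```

Under this assumption the rate constant is about 0.77 per minute, so only about a tenth of the intermediate would survive three minutes in free solution.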

The carboxyl, methyl, and hydrogen groups exhibit similar energetic profiles when migrating to various sites on the imidazole ring, suggesting that the C-C and C-H single bonds possess comparable chemical characteristics. Notably, carboxyl group migrations are typically more exothermic than those involving hydrogen or methyl groups, which is attributed to the carboxyl group's high polarity and its capacity to form intramolecular hydrogen bonds.

The migration of the carboxyl group via an anionic pathway is less complex than the cationic route. Initially, the C4 site undergoes deprotonation to form an anionic intermediate, enabling the carboxyl group to migrate from N6 to C4. The most challenging aspect of this process is the deprotonation of C4, as the C-H bond is typically strong. Comparison of the deprotonation products of C4 and N6 indicates that deprotonation of C4 is significantly less favorable, by over 30 kcal/mol, in a hydrophobic environment. Further mechanistic studies of this pathway are presented in the following section.

The relative energies (kcal/mol) of the N6 and C4 deprotonation products were obtained from structures optimized in the gas phase at the B3LYP/6-31+G(d) level of theory. Single-point energy calculations were also performed in chlorobenzene and water, using the gas-phase optimized geometries at the same theoretical level.

Exploring the PES of carboxyl unit migration

Because the pH at the active site of the PurE Class I protein is not known, no protonation state was presumed, and eight distinct pathways, labeled MICAp1 to MICAp8, were set up for the transition state searches of the MICA conversion. These pathways feature various protonation and deprotonation sites for MICA, as illustrated in Figure 4.6.

Figure 4.6 Possible protonation states of MICA for carboxyl group migration

The transition states for carboxyl unit migration were searched at the B3LYP/6-31+G(d) level of theory, and transition states were located for MICA structures p1 to p8, with the exception of p3. Together with the corresponding reactants and products, these structures map out the potential energy surface (PES) of the model system. Because the different protonation states contain different numbers of hydrogen atoms, the calculated energies of these structures are not directly comparable. The crystal structure of PurE Class I reveals two histidine residues at the active site, and histidine is well known as a hydrogen donor/acceptor in enzymatic reactions. Separately optimized histidine side chains in different protonation states were therefore used as reference points, which allows the MICA structures p1 to p8 to be compared. The optimized structure of MICAp1 plus three neutral histidine side chains served as the reference state, and the energies of the other structures are reported relative to it. The computational results are detailed in Table 4.3.
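
The reference-state bookkeeping described above can be sketched as follows: each MICA structure is combined with enough neutral and protonated histidine side chains that every state has the same atoms and total charge as MICAp1 plus three neutral histidines. The energies below are placeholders chosen only to exercise the arithmetic, not values from this work, and the function name is hypothetical.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per hartree


def relative_energy(e_species, n_his, n_his_h, e_ref_species, e_his, e_his_h,
                    n_total=3):
    """Energy of (species + histidines) relative to (reference + neutral histidines).

    n_his and n_his_h are the numbers of neutral and protonated histidine
    side chains paired with the species; they must add up to n_total.
    """
    if n_his + n_his_h != n_total:
        raise ValueError("histidine count must match the reference state")
    state = e_species + n_his * e_his + n_his_h * e_his_h
    reference = e_ref_species + n_total * e_his
    return (state - reference) * HARTREE_TO_KCAL


# Placeholder energies in hartree (hypothetical, for illustration only).
E_MICAP1, E_NEUTRAL_SPECIES = -1.1, -1.0
E_HIS, E_HIS_H = -2.0, -2.2

rel_reference = relative_energy(E_MICAP1, 3, 0, E_MICAP1, E_HIS, E_HIS_H)
rel_neutral = relative_energy(E_NEUTRAL_SPECIES, 2, 1, E_MICAP1, E_HIS, E_HIS_H)
```

The `(2;1)` style labels used for the structures in Table 4.3 correspond to the `n_his` and `n_his_h` arguments here.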

Figure 4.7 Different protonation states of histidine side chain: (a) side chain of histidine; (b) protonated side chain of histidine

The migration of the carboxyl group starting from MICAp1 is exothermic and therefore thermodynamically favorable, with an enthalpy change of -31.7 kcal/mol in the gas phase, -23.9 kcal/mol in chlorobenzene, and -21.1 kcal/mol in water. The activation barrier is 31.0 kcal/mol in the gas phase, increasing to 36.7 kcal/mol in chlorobenzene and 39.0 kcal/mol in water. In MICAp1, the protonated carboxyl group likely prevents the release of CO2 from the active site, in line with experimental findings that show no isotopic exchange of the carboxyl group in aqueous solution. Following the migration, C4 on the imidazole ring must be deprotonated to form CAIR, which is also discussed in this study.

In MICAp2, protonation of the amide nitrogen and deprotonation of C4 result in a zwitterionic structure. The carboxyl group migration from MICAp2 is highly exothermic, with enthalpy changes of -109.7 kcal/mol in the gas phase, -92.9 kcal/mol in chlorobenzene and -86.2 kcal/mol in water, indicating thermodynamic favorability. The reaction barrier is low, under 3 kcal/mol, suggesting kinetic favorability as well. However, MICAp2 has a relative enthalpy of 62.8 kcal/mol in the gas phase compared to the reference state, which solvation lowers to 48.3 kcal/mol in chlorobenzene and 43.4 kcal/mol in water. The large energy difference between MICAp1 and MICAp2 arises from the deprotonation of the C4 site: the C4-H bond is stronger than the N-H bond of the amide group, so the deprotonation is endothermic and raises the energy of the imidazole ring. This high-energy starting point is ultimately what makes the carboxyl group migration so exothermic with such a low reaction barrier.

PCM (chlorobenzene, ε = 5.62)
                 Enthalpy                                   Gibbs free energy
Structure        R        TS             P                  R        TS             P
MICAp1(3;0)      0        36.7 (36.7)    -23.9 (-23.9)      0        37.1 (37.1)    -23.0 (-23.0)
MICAp2(2;1)      48.3     50.9 (2.6)     -44.6 (-92.9)      48.0     52.1 (4.1)     -43.6 (-91.6)
MICAp4(1;2)      38.0     40.3 (2.3)     -21.9 (-59.9)      37.1     38.5 (1.4)     -21.0 (-58.1)
MICAp5(2;1)      -44.5    30.0 (74.5)    -27.3 (17.2)       -44.0    31.1 (75.1)    -27.0 (17.0)
MICAp6(1;2)      31.8     61.3 (29.5)    -4.1 (-35.9)       32.0     62.6 (30.6)    -3.6 (-35.6)
MICAp7(1;2)      -24.1    10.2 (34.3)    -9.4 (14.7)        -23.9    8.6 (32.5)     -9.2 (14.7)

PCM (water, ε = 78.39)
                 Enthalpy                                   Gibbs free energy
Structure        R        TS             P                  R        TS             P
MICAp1(3;0)      0        39.0 (39.0)    -21.1 (-21.1)      0        39.4 (39.4)    -20.2 (-20.2)
MICAp2(2;1)      43.4     45.8 (2.4)     -42.8 (-86.2)      43.1     46.9 (3.8)     -41.9 (-85.0)
MICAp4(1;2)      21.9     23.3 (1.4)     -38.6 (-60.5)      21.0     21.5 (0.5)     -37.8 (-58.8)
MICAp5(2;1)      -41.9    31.8 (73.7)    -25.0 (16.9)       -41.3    32.9 (74.2)    -24.7 (16.6)
MICAp6(1;2)      14.2     45.1 (30.9)    -20.4 (-34.6)      14.4     46.4 (32.0)    -19.8 (-34.2)
MICAp7(1;2)      -39.1    -3.1 (36.0)    -25.5 (13.6)       -38.9    -4.7 (34.2)    -25.3 (13.6)

For MICAp8(0;3) in water, the relative enthalpies are 19.0 (R), 73.1 (TS) and -9.6 (P). Energies are in kcal/mol relative to the reference state, which consists of MICAp1 and three neutral histidine side chains. The two numbers in parentheses after each structure name are the numbers of neutral and protonated histidine side chains, respectively. "R" denotes reactant, "TS" transition state and "P" product. Values in parentheses after the TS entries are activation enthalpies (or activation Gibbs free energies), and values in parentheses after the P entries are reaction enthalpies (or reaction Gibbs free energies), each relative to the corresponding reactant. The gas-phase calculations use the B3LYP/6-31+G(d) level of theory, and the PCM single-point energy calculations were carried out on gas-phase optimized structures at the same level.
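
The relationship between the relative energies and the parenthesized values in the table above can be checked directly: the barrier is TS minus R and the reaction enthalpy is P minus R. The sketch below does this for the chlorobenzene enthalpy entries; the data are copied from the table and the helper names are illustrative.

```python
# name: ((R, TS, P) relative enthalpies, (quoted barrier, quoted reaction enthalpy))
# Chlorobenzene PCM values in kcal/mol, copied from the table.
chlorobenzene_h = {
    "MICAp1": ((0.0, 36.7, -23.9), (36.7, -23.9)),
    "MICAp2": ((48.3, 50.9, -44.6), (2.6, -92.9)),
    "MICAp4": ((38.0, 40.3, -21.9), (2.3, -59.9)),
    "MICAp5": ((-44.5, 30.0, -27.3), (74.5, 17.2)),
    "MICAp6": ((31.8, 61.3, -4.1), (29.5, -35.9)),
    "MICAp7": ((-24.1, 10.2, -9.4), (34.3, 14.7)),
}


def derived(r, ts, p):
    """Return (barrier, reaction enthalpy) measured from the reactant."""
    return (round(ts - r, 1), round(p - r, 1))


mismatches = [
    name
    for name, (relative, quoted) in chlorobenzene_h.items()
    if derived(*relative) != quoted
]
```

For these rows the list of mismatches comes out empty, so the parenthesized values are fully determined by the three relative enthalpies.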

Relative to the reference state, the transition state of the MICAp2 pathway lies at 50.9 kcal/mol in chlorobenzene and 45.8 kcal/mol in water, which is higher than the reaction barriers for carboxyl migration from MICAp1 (36.7 and 39.0 kcal/mol, respectively). This suggests that the pathway involving MICAp1 is more feasible.

Calculations indicate that no transition state exists for the migration of CO2 from the deprotonated carboxyl group in MICAp3. Instead, MICAp3 decomposes into carbon dioxide and 1-methyl-5-aminoimidazole. The experimental observation that no CO2 exchange occurs further supports the conclusion that MICAp3 cannot serve as a reaction intermediate for CO2 migration.

Deprotonation of the carboxyl group in MICAp2 gives MICAp4, which has a highly exothermic migration pathway with an enthalpy change of -56.1 kcal/mol in the gas phase, and -59.9 and -60.5 kcal/mol in chlorobenzene and water, respectively. The reaction barrier remains low, under 3 kcal/mol in both solvation models. MICAp4 has a relative enthalpy of 119.5 kcal/mol in the gas phase compared to the reference state and is strongly polarized due to its dianionic character. Solvation notably reduces the relative enthalpy of MICAp4 to 38.0 kcal/mol in chlorobenzene and 21.9 kcal/mol in water, suggesting that the reaction pathway involving MICAp4 is more favorable than that of MICAp2. However, because the carboxyl group is deprotonated, the transition state for this pathway exhibits characteristics of free CO2. As with MICAp3, this pathway could be excluded by the experimental results.

The carboxyl migration beginning with MICAp5 is an endothermic process, with enthalpy changes of approximately 17 kcal/mol in both the gas phase and the PCM models. The reaction barrier for this migration is 76.4 kcal/mol in the gas phase and around 74 kcal/mol in the solvation models, indicating that the reaction is neither thermodynamically nor kinetically favorable. Relative to the reference structure (MICAp1), the enthalpy of MICAp5 is -47.9 kcal/mol in the gas phase, and -44.5 and -41.9 kcal/mol in chlorobenzene and water, respectively, making it the most stable pre-reactant structure on the potential energy surface (PES). The relative enthalpy of the transition state is 28.5 kcal/mol in the gas phase and increases slightly to 30.0 and 31.8 kcal/mol in chlorobenzene and water. Compared to carboxyl group migration starting with MICAp1, which is so far the most favorable pathway, the overall barrier is significantly higher. Additionally, since MICAp5 is neutral, solvation does not provide substantial stabilization for the reactant, product, or transition state. This pathway is therefore not feasible for carboxyl unit migration.

The migration pathway of the carboxyl group in MICAp6 is exclusively anionic, with C4 being deprotonated prior to migration. In the gas phase, the relative energy of MICAp6 is 117.4 kcal/mol. However, solvation in chlorobenzene and water significantly stabilizes this structure, reducing its relative energy to 31.8 kcal/mol and 14.2 kcal/mol, respectively.

The reaction barrier along this pathway is 26.0, 29.5, and 30.9 kcal/mol in the gas phase, chlorobenzene, and water, respectively. Relative to the reference state, the transition state lies at 143.4 kcal/mol in the gas phase; solvation lowers this relative enthalpy to 61.3 kcal/mol in chlorobenzene and 45.1 kcal/mol in water. This pathway is exothermic, with enthalpy changes of -40.7, -35.9, and -34.6 kcal/mol in the gas phase, chlorobenzene, and aqueous solution, respectively. Given that the active site of the target enzyme is hydrophobic, chlorobenzene may mimic the active site environment better than water. In chlorobenzene, the transition state of the MICAp6 pathway lies 24.6 kcal/mol above that of the MICAp1 pathway, although its barrier relative to its own starting structure is only 29.5 kcal/mol. Both pathways are exothermic, but the enthalpy change of the MICAp6 pathway is 12.0 kcal/mol more negative than that of MICAp1, which keeps MICAp6 competitive with MICAp1. In the transition state of the MICAp6 pathway, a four-membered ring is formed, which prevents the release of a carbon dioxide unit into solution, in line with experimental findings. Overall, the MICAp6 pathway is thermodynamically more favorable than MICAp1 but kinetically less favorable.

The CO2 migration pathway through MICAp7 is an anionic pathway, in which the deprotonated carboxyl group carries the negative charge. This process is endothermic and thermodynamically unfavorable, with enthalpy changes of 17.0, 14.7, and 13.6 kcal/mol in the gas phase, chlorobenzene, and aqueous solution, respectively. The reaction barrier is 26.3 kcal/mol in the gas phase, rising to 34.3 kcal/mol in chlorobenzene and 36.0 kcal/mol in aqueous solution.

The reaction barrier for MICAp7 is relatively low compared to the MICAp1 and MICAp6 pathways. The relative energy of MICAp7 decreases significantly from 56.1 kcal/mol in the gas phase to -24.1 kcal/mol in chlorobenzene, and further to -39.1 kcal/mol in aqueous solution. Similarly, the relative energy of the transition state drops from 82.4 kcal/mol in the gas phase to 10.2 kcal/mol in chlorobenzene and -3.1 kcal/mol in aqueous solution, indicating that this pathway is highly competitive with MICAp1 and MICAp6. However, in addition to the pathway being endothermic, the transition state exhibits characteristics of free carbon dioxide, which contradicts experimental observations.

The MICAp8 structure is a dianion, and its CO2 migration pathway has high relative energies for the reactant, transition state, and product in the gas phase due to strong repulsive electrostatic interactions. While solvation stabilizes these dianions, the energy of the transition state remains significantly higher than in the other pathways in both chlorobenzene and aqueous solution. The reaction barrier is 50.4 kcal/mol in the gas phase, 55.8 kcal/mol in chlorobenzene, and 54.1 kcal/mol in aqueous solution, the second-highest barrier among the pathways examined. This pathway is exothermic in the condensed phases, with enthalpy changes of -27.6 kcal/mol in chlorobenzene and -28.6 kcal/mol in aqueous solution. Notably, the transition state shows more characteristics of free carbon dioxide than of a migrating carboxyl group, leading to the conclusion that this pathway is not viable for carboxyl group migration.

In this section, the PES for carboxyl unit migration of the model molecule MICA was explored. The neutral MICAp5 pathway is endothermic and has a high reaction barrier, making it both thermodynamically and kinetically unfavorable. Experimental evidence rules out pathways starting from structures with deprotonated carboxyl groups, such as MICAp4, MICAp7, and MICAp8, because their transition states have the character of free CO2. MICAp2 and MICAp6 involve deprotonation at position C4 on the imidazole ring, which requires breaking a strong carbon-hydrogen bond. The transition states for MICAp2 and MICAp6 have significantly higher relative enthalpies than the other condensed-phase transition states, except for MICAp8. This does not rule out MICAp2 or MICAp6 as potential pathways, but it does make them less favorable than the pathway starting with MICAp1. The geometric information is shown in Figure 4.9.
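
The screening logic summarized above can be sketched as a filter followed by a ranking. The code below uses the chlorobenzene relative enthalpies (R, TS, P) from Table 4.3, drops pathways whose transition states resemble free CO2 and pathways that are endothermic, and sorts the rest by the relative enthalpy of the transition state. The selection criteria are a simplified reading of the discussion, not a procedure stated by the author, and MICAp8 is omitted because its chlorobenzene entries are not available here.

```python
# (R, TS, P) relative enthalpies in chlorobenzene, kcal/mol, from Table 4.3.
pathways = {
    "MICAp1": (0.0, 36.7, -23.9),
    "MICAp2": (48.3, 50.9, -44.6),
    "MICAp4": (38.0, 40.3, -21.9),
    "MICAp5": (-44.5, 30.0, -27.3),
    "MICAp6": (31.8, 61.3, -4.1),
    "MICAp7": (-24.1, 10.2, -9.4),
}

# Pathways whose transition states are described as resembling free CO2.
free_co2_like_ts = {"MICAp4", "MICAp7"}


def is_viable(name):
    """Keep exothermic pathways whose TS does not resemble free CO2."""
    reactant, _, product = pathways[name]
    return name not in free_co2_like_ts and (product - reactant) < 0.0


ranked = sorted(
    (name for name in pathways if is_viable(name)),
    key=lambda name: pathways[name][1],  # TS enthalpy relative to the reference
)
```

With these inputs the surviving pathways are ordered MICAp1, MICAp2, MICAp6, which matches the conclusion drawn in the text.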

Complete mechanism involved with enzyme active site

The transformation of N5-CAIR to CAIR through the MICAp1 structure involves three key steps. First, N5-CAIR is protonated at the amide nitrogen (N6), forming a pre-reactant for CO2 migration. Next, the carboxyl group migrates from N6 to C4 on the imidazole ring, giving a pre-product intermediate. Finally, this intermediate is deprotonated at C4, yielding the final product, CAIR, as illustrated in Figure 4.10.

In the reaction catalyzed by PurE Class I, the substrate undergoes protonation and deprotonation, which requires an external hydrogen donor and acceptor. PurE Class I itself serves this role; the structure of its complex with AIR is available as PDB ID 1D7A. Although AIR is not the direct substrate, its structural similarity to N5-CAIR helps to identify the enzyme's active site. Notably, the imidazole ring of AIR aligns with histidine 45 (His45), which is conserved across species, pointing to His45 as a crucial hydrogen donor/acceptor in catalysis.

Figure 4.10 Proposed mechanism of PurE Class I (N5-CAIR Mutase): protonation of N5-CAIR by His45, carboxyl group migration, and deprotonation of CAIR by His45

To examine the role of histidine as a hydrogen acceptor in this pathway, the deprotonation process was modeled with the side chain of His45 and the pre-product intermediate MICAp1 (P). At the B3LYP/6-31+G(d) level, a transition state for hydrogen transfer was located, and the reactant and product complexes were optimized, as shown in Figure 4.12a. The reaction is highly exothermic, with enthalpy changes of -29.7 kcal/mol in chlorobenzene and -26.8 kcal/mol in aqueous solution. The activation barriers are also low, at 8.7 kcal/mol in chlorobenzene and 9.9 kcal/mol in aqueous solution, supporting the viability of the proposed pathway.

Figure 4.11 Active Site of PurE Class I (PDB ID: 1D7A), with a close-up view of the active site

The initial step in the catalytic reaction is the protonation of MICA to form MICAp1. In this mechanism, His45 becomes protonated when it deprotonates MICAp1 (P), which prepares it for the next catalytic cycle. To investigate the protonation of MICA by His45, a model complex combining the protonated His45 side chain with neutral MICA was built for the transition state search. At the B3LYP/6-31+G(d) level, one transition state for the protonation process was located, and the corresponding reactant and product complexes were optimized.

The protonation step is endothermic in chlorobenzene and water, with enthalpy changes of 28.5 kcal/mol and 27.7 kcal/mol, respectively. The activation enthalpy is 28.0 kcal/mol in chlorobenzene and 30.4 kcal/mol in aqueous solution, close to the reaction enthalpy. In this model the step is therefore neither thermodynamically nor kinetically favorable, which indicates that the actual enzymatic mechanism is more complex than the model used in this study. Further research will be conducted to explore this issue.

The investigation of the carboxyl unit migration pathway through MICAp1 shows that the first step, protonation of neutral MICA, is endothermic with an enthalpy change of 28.5 kcal/mol in chlorobenzene. In contrast, the second step, carboxyl group migration, is exothermic, with an enthalpy change of -23.9 kcal/mol. The final step, deprotonation of the pre-product intermediate MICAp1 (P), is also exothermic, with an enthalpy change of -29.7 kcal/mol. Overall, the combined reaction of these three steps is exothermic, with an enthalpy change of -25.1 kcal/mol. Further investigation is required to explore the complexities associated with the deprotonation of MICAp1 (P) and the protonation of neutral MICA to generate MICAp1.
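
The overall enthalpy change follows from adding the three step enthalpies, and the running sum gives the enthalpy profile along the pathway. A short sketch with the chlorobenzene values quoted above (the step labels are paraphrased and the variable names are illustrative):

```python
from itertools import accumulate

# Step enthalpies in chlorobenzene (kcal/mol), as quoted in the text.
steps = [
    ("protonation of MICA by His45", 28.5),
    ("carboxyl group migration", -23.9),
    ("deprotonation of MICAp1 (P) by His45", -29.7),
]

# Running enthalpy after each step, relative to neutral MICA.
profile = [round(h, 1) for h in accumulate(dh for _, dh in steps)]
overall = profile[-1]
```

The profile rises to 28.5 kcal/mol after the first step, falls to 4.6 kcal/mol after the migration, and ends at -25.1 kcal/mol, the overall value given in the text.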

Figure 4.12 Relative energies of His45 as proton acceptor/donor

The structures were optimized in the gas phase using the B3LYP/6-31+G(d) method, with energy values reported in kcal/mol. The values in brackets and parentheses for each structure correspond to PCM single-point energy calculations in chlorobenzene (ε = 5.62) and aqueous solution (ε = 78.39), respectively.

Verification of the Simplified Model

To validate the PES calculations derived from the simplified model, the complete structure of N5-CAIR, including ribose-5-phosphate (R5P), was used for transition state searches and structural optimizations at the same level of theory as the simplified model. Given the significant computational cost of a comprehensive analysis of the full N5-CAIR system, the investigation focused solely on the pathway through MICAp1. A transition state for the migration of the carboxyl group in N5-CAIR was successfully located.

Figure 4.13 Transition state for carboxyl group migration with full N5-CAIR structure

The initial structure of N5-CAIR in this pathway is protonated on the amine nitrogen, as in the MICAp1 case discussed earlier. Both the reactant and the product associated with this transition state were optimized, and their calculated energies were used to determine the thermodynamics of the reaction (Figure 4.14). Notably, the structures of the reactant and product in this pathway closely resemble those in the simplified MICAp1 model, and the geometric data for the transition states of the two systems were also compared.

The reactions for (R=CH3) and (R=R5P) are very similar, and their thermodynamic properties align closely. The reaction barriers for the complete N5-CAIR pathway are 36.7, 38.0, and 39.4 kcal/mol in the gas phase, chlorobenzene, and aqueous solution, respectively. The simplified (R=CH3) model pathway through MICAp1 gives barriers of 31.0, 36.7, and 39.0 kcal/mol in the same environments. The enthalpy changes for the complete N5-CAIR pathway are -20.0, -18.6, and -17.9 kcal/mol in the gas phase, chlorobenzene, and aqueous solution, while the model system gives -31.7, -23.9, and -21.1 kcal/mol. These comparisons indicate that the simplified model is a reasonable approximation to the full N5-CAIR computations, particularly in the condensed phases, which supports the use of this model in the preceding sections to represent the potential energy surface of the complete N5-CAIR substrate.

To validate the thermodynamic properties computed using the B3LYP/6-31+G(d) method, the CBS-QB3 method was used for gas-phase calculations of carboxyl group migration starting from the MICAp1 structure. The relative enthalpies (Gibbs free energies) of the transition state and product with respect to MICAp1 were found to be 32.3 (32.6) and -30.5 (-29.7) kcal/mol, respectively. In comparison, the B3LYP/6-31+G(d) calculations gave 31.0 (31.5) and -31.7 (-30.9) kcal/mol, differing from the CBS-QB3 results by less than 1.5 kcal/mol. These findings support the accuracy and reliability of the B3LYP/6-31+G(d) method used in this research.
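
The size of the discrepancy can be read off by subtracting the two sets of values term by term. The sketch below uses only the gas-phase numbers quoted above; the dictionary keys are illustrative labels.

```python
# Gas-phase values relative to MICAp1 (kcal/mol), as quoted in the text.
cbs_qb3 = {"TS dH": 32.3, "TS dG": 32.6, "P dH": -30.5, "P dG": -29.7}
b3lyp = {"TS dH": 31.0, "TS dG": 31.5, "P dH": -31.7, "P dG": -30.9}

# Signed deviation of B3LYP/6-31+G(d) from CBS-QB3 for each quantity.
deviations = {key: round(b3lyp[key] - cbs_qb3[key], 1) for key in cbs_qb3}
max_abs_deviation = max(abs(value) for value in deviations.values())
```

All four deviations are negative and lie between 1.1 and 1.3 kcal/mol in magnitude, within the 1.5 kcal/mol bound stated in the text.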

Conclusions

In this computational study, DFT methods were used to investigate three enzymatic reactions in the purine biosynthetic pathway, focusing on the conversion of AIR to CAIR. Simplified models were used to represent the substrates: AMI for AIR, MICA for N5-CAIR, and AMIC for CAIR. The thermodynamic calculations are consistent with the observation that producing one unit of N5-CAIR from AIR requires the consumption of one ATP, while the biosynthesis of CAIR from either AIR or N5-CAIR does not require ATP. Additionally, the reactivity analysis of the model molecule AMI indicates that AMIC is more stable than MICA. The transformation from MICA to AMIC faces a potentially competitive reaction at the N3 site on the imidazole ring; however, structural analysis suggests this side reaction is kinetically unfavorable.

The transformation from N5-CAIR to CAIR does not consume ATP, and the availability of the enzyme's crystal structure has prompted extensive study of its mechanism. Without assuming the initial protonation state of N5-CAIR prior to CO2 migration from N6 to C4, eight MICA model structures were created to explore various reaction pathways on the potential energy surface (PES). Of the seven transition states identified for CO2 migration, four were ruled out because no exchange with isotopically enriched CO2 is observed experimentally. Three pathways (MICAp1, MICAp2, and MICAp6) remained consistent with the experimental findings, and MICAp1 emerged as the most favorable because its overall reaction barrier is lower than those of MICAp2 and MICAp6.

In the MICAp1 pathway, the reaction requires the protonation of N6 in MICA prior to CO2 migration, followed by the deprotonation of C4. A conserved histidine residue, His45, located at the active site, plays a crucial role as both the proton donor and acceptor in this process.

To validate the simplified model used in this study, transition state searches were carried out with the full structure of the N5-CAIR substrate (R = R5P). The optimized structures along the CO2 migration pathway for the complete substrate, including reactants, products, and transition states, closely resembled their counterparts in the simplified model. The thermodynamic properties, such as reaction barriers and enthalpy changes, were also consistent between the two pathways, particularly when solvation effects were taken into account.

The enthalpies and free energies of the MICAp1 carboxyl group migration pathway calculated with CBS-QB3 and B3LYP/6-31+G(d) in the gas phase differ by 1.1 to 1.3 kcal/mol. This indicates that the B3LYP/6-31+G(d) method provides both qualitatively and quantitatively reliable results for this system.

This computational study proposes a stepwise enzymatic reaction mechanism for PurE Class I. Protonated His45 at the active site first protonates N5-CAIR. The carboxyl unit (-CO2H) then migrates to carbon 4 (C4), forming the C-C bond and giving a protonated CAIR intermediate. In the final step, His45 deprotonates this intermediate, yielding the product CAIR and restoring His45 to its initial state. This is the first atomistic description of the catalytic mechanism of the PurE Class I enzyme. The insights gained may aid the development of transition state analogs and mechanism-based inhibitors, which could serve as new drug leads for optimization in synthetic and medicinal chemistry.

Introduction

In the previous chapter, the thermodynamics of the reactions catalyzed by PurE Class I and II, as well as PurK, in the purine biosynthesis pathway were examined by computational methods. A simplified model of the reaction catalyzed by PurE Class I, known as N5-CAIR mutase, was developed and assessed using DFT calculations. The key simplification was the substitution of the ribose-5-phosphate group with a methyl group. This simplified model allowed an analysis of the potential energy surface (PES) of the reaction and led to the proposal of a molecular reaction mechanism.

The model lacked any intramolecular interactions associated with the ribose-5-phosphate (R5P) group, suggesting that the potential energy surface (PES) could vary significantly due to the presence of the R5P group, particularly when considering the various charge states of the phosphate unit.

This chapter presents a more complete model of the PurE Class I reaction that retains the R5P group and considers the protonation states of the substrate. The computational findings from this model are compared with those obtained from the simplified model. The reaction model under investigation is illustrated in Figure 5.1.

Figure 5.1 Reaction catalyzed by PurE Class I (pKa of R5P = 6.22) 1

Computational Methods

In this study, DFT methods were used to optimize all structures of interest at the B3LYP/6-31+G(d) level of theory. Vibrational frequencies were calculated for each optimized structure to confirm a minimum energy configuration and to provide zero-point vibrational energy corrections, applying a scaling factor of 0.9806.
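The role of the frequency calculation can be illustrated with a short sketch: a stationary point is accepted as a minimum when it has no imaginary frequency, and the harmonic zero-point energy is half the sum of the vibrational quanta, scaled by 0.9806. The function names and the three example wavenumbers are hypothetical and chosen only for illustration; they are not data from this work.

```python
# Conversion of 1 cm^-1 to kcal/mol (h * c * N_A / 4184 J per kcal).
CM1_TO_KCAL_PER_MOL = 2.859144e-3
ZPE_SCALE = 0.9806  # scaling factor quoted in the text

def is_minimum(frequencies_cm1):
    """True if no imaginary mode is present.

    Imaginary frequencies are assumed to be listed as negative
    numbers, as quantum chemistry programs conventionally print them.
    """
    return all(f > 0 for f in frequencies_cm1)

def scaled_zpe(frequencies_cm1, scale=ZPE_SCALE):
    """Scaled harmonic zero-point energy, 0.5 * sum(h c nu), in kcal/mol."""
    real_modes = [f for f in frequencies_cm1 if f > 0]
    return 0.5 * scale * sum(real_modes) * CM1_TO_KCAL_PER_MOL

# Hypothetical wavenumbers (cm^-1) for a small molecule, for illustration only.
example = [1600.0, 3700.0, 3800.0]
print(is_minimum(example), round(scaled_zpe(example), 2))
```

A transition state would instead show exactly one negative entry in the list, corresponding to motion along the reaction coordinate.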

Transition states (TS) were analyzed for the migration pathways of the carboxylic group. The intrinsic reaction coordinate (IRC) paths in both forward and reverse directions were calculated and optimized for each TS. This approach allowed the exploration of the potential energy surface (PES) of the reaction model and aided the identification of the most favorable reaction pathway.

The polarizable continuum model (PCM) was used to estimate solvation effects, with chlorobenzene (ε = 5.62) and water (ε = 78.39) as model solvents. Single-point energy calculations were carried out at the B3LYP/6-31+G(d) level of theory on the gas-phase optimized structures to evaluate the effects of a non-polar and a polar solvent.

All geometry optimizations, TS searches, frequency calculations and PCM single-point energy calculations were performed by Gaussian 03 13 at the Ohio Supercomputer Center.

Results

Model structures

Both N5-CAIR and CAIR can exist in various protonation states, which prompted a detailed examination of the reaction's potential energy surface. A total of 48 distinct structures were generated, each representing a different protonation configuration (see Figure 5.2). These structures were systematically labeled according to the protonation states of five sites, designated R1 through R5, on the imidazole moiety and the phosphate group.

[Figure 5.2 panels: for both N5-CAIR and CAIR, eight protonation states of the imidazole moiety, each labeled with a three-digit protonation code (for example, R1 = no H (0), R2 = no H (0), R3 = H (1) gives 001), combined with three protonation states of the R5P phosphate group.]

Figure 5.2 N5-CAIR and CAIR Structures in Different Protonation States

(Reference structure is set as zero for energy comparison.)

The protonation states of R1, R2, and R3 are each represented by one digit: 0 (no hydrogen at the site) or 1 (a hydrogen atom at the site). The phosphate group sites R4 and R5 are described together by a single digit that takes the values 0, 1, or 2: 0 for both hydroxyl groups deprotonated (Pi(2-)), 1 for one hydroxyl group deprotonated (Pi(1-)), and 2 for both hydroxyl groups protonated (Pi). This nomenclature gives a four-digit code for the protonation state of N5-CAIR or CAIR. For example, in N5-CAIR110-2, R1 and R2 are hydrogen atoms, R3 is deprotonated, and the phosphate group is neutral. In total there are 24 structures of N5-CAIR, coded from 000-0 to 111-2, each of which is paired with a CAIR structure, because migration of the CO2 unit transforms each N5-CAIR into a specific CAIR structure.
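The labeling scheme lends itself to a small enumeration. The sketch below generates the 24 codes and derives a net charge for each, assuming that every proton removed relative to the neutral reference 101-2 lowers the charge by one. The site assignments (R1 = C4, R2 = amine nitrogen, R3 = carboxyl group) are inferred from the surrounding discussion, and all function names are illustrative.

```python
from itertools import product

# The neutral reference structure is coded 101-2, i.e. its code counts
# 1 + 0 + 1 + 2 = 4 protons. This count is used as the zero of charge.
REFERENCE_PROTONS = 4

def all_codes():
    """The 24 protonation codes, from 000-0 to 111-2."""
    return [f"{r1}{r2}{r3}-{p}"
            for r1, r2, r3 in product((0, 1), repeat=3)
            for p in (0, 1, 2)]

def decode(code):
    """Split a code such as '110-2' into its site-by-site meaning."""
    ring, phosphate = code.split("-")
    r1, r2, r3 = (int(c) for c in ring)
    return {"C4_has_H": bool(r1),
            "amine_N_protonated": bool(r2),
            "carboxyl_protonated": bool(r3),
            "phosphate_OH_groups": int(phosphate)}

def net_charge(code):
    """Net charge under the assumption stated in the lead-in."""
    ring, phosphate = code.split("-")
    return sum(int(c) for c in ring) + int(phosphate) - REFERENCE_PROTONS

codes = all_codes()
print(len(codes), codes[0], codes[-1])
print(decode("110-2"), net_charge("110-2"))
```

Under this assumption the 000-0 pair carries four negative charges and the 010-1 pair two, in line with the charges described for these structures elsewhere in the chapter.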

To ensure comparability between the structures of N5-CAIR and CAIR in various protonation states, histidine residues were used as reference structures. This approach standardized the stoichiometry of each species, so that all protonation states correspond to the same overall formula.

Optimization in the gas phase

The optimized neutral structure N5-CAIR101-2 serves as the reference point for all N5-CAIR and CAIR structures. All 48 structures were subjected to optimization at the B3LYP/6-31+G(d) level of theory in the gas phase, and the results are given in Table 5.1. During optimization, some structures could not sustain their protonation state and are therefore not listed in the table. Most of the unstable structures contain a completely deprotonated phosphate group (two negative charges on R4 and R5); the accumulation of negative charge led to a change of protonation state or to fragmentation during structural optimization. The N5-CAIR and CAIR pair coded 000-0, which did remain stable, has a Pi(2-) group, a negatively charged C4 position on the imidazole ring, and a negatively charged carboxylate group. Each of these two structures therefore carries four negative charges, which accounts for their high enthalpies relative to the reference structure N5-CAIR101-2.

The transformation from N5-CAIR000-0 to CAIR000-0 is exothermic, with ΔH = -43.2 kcal/mol. With the phosphate group containing one deprotonated hydroxyl group, N5-CAIR000-1 and CAIR000-1 remained stable during geometry optimization, and the enthalpy of CAIR000-1 is 28.9 kcal/mol lower than that of N5-CAIR000-1. This enthalpy difference is comparable to that between N5-CAIR000-0 and CAIR000-0. Notably, N5-CAIR000-0 has an enthalpy approximately 300 kcal/mol higher than N5-CAIR000-1, while N5-CAIR000-2 has an enthalpy about 200 kcal/mol lower than N5-CAIR000-1. This stepwise trend highlights the strong effect of the protonation state of the phosphate unit on the potential energy surface (PES) of the N5-CAIR/CAIR system. CAIR000-0 and CAIR000-1 follow a similar enthalpic trend, but CAIR000-2 was unstable during structural optimization and could not be compared with the other structures in this series.

Both N5-CAIR001-0 and CAIR001-0 were unstable during optimization. The pair N5-CAIR001-1 and CAIR001-1 has high relative enthalpies, most likely because both the imidazole ring and the phosphate group are negatively charged. The enthalpy of CAIR001-1 is 50.6 kcal/mol lower than that of N5-CAIR001-1, which makes pathway 001-1 exothermic. With a neutral phosphate group, N5-CAIR001-2 has an enthalpy about 140 kcal/mol lower than that of N5-CAIR001-1.

N5-CAIR010-0 and CAIR010-0 were unstable during structural optimization: with a deprotonated C4 site and a deprotonated carboxyl group, the phosphate unit abstracted a proton from a hydroxyl group on the ribose ring. In contrast, the optimizations of N5-CAIR010-1(2) and CAIR010-1(2) were successful. The 010-1 pair carries two negative charges and has high relative enthalpies comparable to those of the 001-1 pair. The enthalpy change between N5-CAIR010-1 and CAIR010-1 is -63.3 kcal/mol, similar to that of the 001-1 pair. The 010-2 pair, which has one negative charge, has lower enthalpies than 010-1, and the enthalpy change for CO2 migration along the 010-2 pathway is -82.3 kcal/mol. Both pathways are therefore exothermic.

The N5-CAIR011-0(1) structures were unstable during optimization; they combine a deprotonated C4 position with a protonated carboxyl group and a protonated amine nitrogen. The corresponding optimized CAIR structures have high relative enthalpies, with a notable decrease from 011-0 to 011-1. The optimized N5-CAIR011-2 and CAIR011-2 structures provide an additional exothermic pathway, with an enthalpy change of -90.0 kcal/mol. CAIR011-2 continues the stepwise trend: its relative enthalpy is lower than that of CAIR011-1.

The four series (000, 001, 010, and 011) illustrate potential pathways for migration of the carbon dioxide unit. The protonation state of the phosphate group influences the relative energies of the individual structures. However, the reaction enthalpies remain largely unchanged across the different protonation states of the phosphate group, indicating that it may not significantly affect the relative potential energy surfaces (PES) in this reaction. All four series share a common feature: a deprotonated C4 position. Regardless of the protonation states of the carboxylic group and the amine nitrogen, all calculated pathways are exothermic when the C4 site is deprotonated.

All remaining structures have a neutral C4 site on the imidazole ring. N5-CAIR100-0 has a high relative enthalpy because of its three negative charges, and CAIR100-0 was unstable, which may render the 100-0 pathway inaccessible. Both N5-CAIR100-1 and CAIR100-1 have elevated relative enthalpies, attributed to strong and unfavorable intramolecular Coulombic interactions. The enthalpy of CAIR100-1 is 10.6 kcal/mol higher than that of N5-CAIR100-1, so this pathway is endothermic and thermodynamically unfavorable. The 100-2 pair has low relative enthalpies, consistent with the trend of decreasing relative enthalpy as the phosphate group carries less negative charge. The enthalpy change for the 100-2 pathway is also positive, so deprotonation of the carboxylic group alone does not make the pathways in this series exothermic.

The N5-CAIR101-0 and CAIR101-0 structures were unstable during optimization. The optimized 101-1 and 101-2 pairs, in contrast, correspond to endothermic routes. The N5-CAIR structures in this series have a neutral imidazole ring, which is the most stable configuration of N5-CAIR.

Structures with a deprotonated carboxyl group and a protonated amine nitrogen (the 110 series) are unstable because CO2 is readily released from the carboxylate unit in this protonation state. Experimentally, however, isotopic exchange with labeled CO2 did not occur between N5-CAIR and CO2 in the reaction medium.14 Therefore, this series of structures is not a feasible pathway for CO2 migration.

The last series of optimized structures has a neutral C4 site on the imidazole ring and a neutral carboxylic group, with the amine nitrogen protonated and carrying a positive charge. Among the N5-CAIR structures, 111-0 and 111-1 were unstable and could not be optimized. The only successfully optimized pair in this series is N5-CAIR111-2 and CAIR111-2. The enthalpy change from N5-CAIR to CAIR is -18.1 kcal/mol, which makes this the only exothermic pathway with a neutral C4 site on the imidazole ring in these gas-phase calculations.

The gas-phase calculations above show that deprotonation of the C4 site of the imidazole ring could lead to many exothermic reaction pathways from N5-CAIR to CAIR. Protonation of the amine nitrogen can also lead to an exothermic reaction pathway. The protonation state of the ribose-5-phosphate (R5P) group has a significant effect on the energies of the structures relative to the reference structure. However, the enthalpy changes of the different pathways remain relatively stable as the protonation state of the phosphate group varies. The crystal structure of the enzyme PurE Class I with its substrate shows that the phosphate unit of R5P can form a salt bridge with the guanidine group of the conserved residue Arg46, while a hydroxyl group of the ribose ring forms a hydrogen bond with the carboxylate group of the conserved residue Asp19. These interactions promote a "rigid" orientation of the substrate within the enzyme's active site, and the anionic R5P group may also stabilize the positively charged protonated amine group.

The R5P group may significantly influence the enzymatic mechanism of the CO2 migration process, despite our gas-phase calculations indicating no direct energetic involvement of this group.

Aqueous phase calculation with the PCM method

To investigate the effect of aqueous solvation on the reaction thermodynamics, PCM single-point energy calculations were carried out on the gas-phase optimized structures. Water strongly stabilizes the negatively charged structures: all calculated structures have relative enthalpies below 90 kcal/mol, compared with over 788 kcal/mol in the gas phase. The results of the PCM calculations in the aqueous phase are given in Table 5.2. Structures that were unstable in the gas phase, and that converted during optimization to stable structures with a different protonation state, were excluded from the PCM calculations.
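Why a continuum solvent stabilizes highly charged species so strongly can be rationalized with the textbook Born model of an ion in a dielectric. This is not the PCM used in this work; it is a much cruder picture, shown here only as a sketch. The cavity radius of 4.0 Å is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
# e^2 * N_A / (8 * pi * eps0), expressed in kcal * angstrom / mol.
BORN_PREFACTOR = 166.03

def born_solvation_energy(charge, radius_angstrom, epsilon):
    """Born estimate of the electrostatic solvation free energy (kcal/mol)
    of a spherical ion of the given charge and cavity radius."""
    return (-BORN_PREFACTOR * charge ** 2 / radius_angstrom
            * (1.0 - 1.0 / epsilon))

RADIUS = 4.0  # hypothetical cavity radius in angstrom, for illustration only

for name, eps in (("chlorobenzene", 5.62), ("water", 78.39)):
    for q in (-1, -2, -4):
        dg = born_solvation_energy(q, RADIUS, eps)
        print(f"{name:14s} q = {q:+d}  dG = {dg:8.1f} kcal/mol")
```

Two features of this simple model carry over qualitatively to the results above: the stabilization grows with the square of the charge, so multiply charged structures gain the most, and the factor (1 - 1/ε) is already about 0.82 for chlorobenzene against about 0.99 for water, so the less polar solvent provides a weaker but still substantial stabilization.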

Most of the pathways that include a deprotonated (anionic) C4 site are still exothermic in the aqueous phase. The exception is the pathway through N5-CAIR000-0 and CAIR000-0, for which the enthalpy change is 21.3 kcal/mol. The enthalpy changes of the other pathways are similar to their gas-phase counterparts: for these pathways with a deprotonated C4 unit, the reaction enthalpies in the gas phase and in aqueous solution differ by less than 10 kcal/mol. Solvation therefore strongly stabilizes the structures involved but has little effect on the thermodynamics of each reaction pathway. The potential energy surface (PES) of this system may be shifted quantitatively, but it remains qualitatively similar in the gas and aqueous phases.

PCM calculations for pathways with a neutral C4 site indicate that, while solvation stabilizes all structures, the reaction enthalpies remain largely unchanged. Each pathway's reaction enthalpy retains the same sign as in the gas phase, with a deviation of approximately 10 kcal/mol. The only exothermic pathway occurs through the N5-CAIR111-2 and CAIR111-2 pair, with an enthalpy change of -17.2 kcal/mol, which differs from the gas-phase result by just 0.9 kcal/mol. Overall, aqueous solvation does not significantly alter the potential energy surface (PES) of these reactions for most calculated structures.

PCM calculations in chlorobenzene

In addition to water, chlorobenzene (ε = 5.62) was used as a solvent for PCM calculations to explore the potential energy surface (PES) in a hydrophobic environment, mimicking the nonpolar conditions expected in the enzyme's active site. The results of the PCM calculations in chlorobenzene are given in Table 5.3.

Pathways with two negative charges on the imidazole ring (000-x, where x = 0, 1, 2) show different thermochemistry in chlorobenzene than in the gas phase and in water. The 000-0 pathway is exothermic in chlorobenzene, with an enthalpy change of -37.5 kcal/mol, close to the gas-phase result of -43.2 kcal/mol. In contrast, the 000-1 pathway has a positive enthalpy change of 2.8 kcal/mol in chlorobenzene, unlike in the gas phase. In the aqueous PCM calculations, the 000-0 pathway becomes endothermic, with an enthalpy change of 21.3 kcal/mol, while the 000-1 pathway is exothermic at -29.4 kcal/mol. The thermodynamics of these pathways therefore shift significantly across the different media: the 000-0 pathway is exoergic in the gas phase and in chlorobenzene but endoergic in water, whereas the 000-1 pathway is exoergic in the gas phase, slightly endoergic in chlorobenzene with a near-zero enthalpy change, and exoergic in water. Because the structures in these pathways are highly negatively charged, solvent stabilization effects are crucial to understanding this behavior.
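The medium dependence of the 000-0 and 000-1 pathways is easier to follow in tabular form. The following sketch classifies each pathway by the sign of its reaction enthalpy using only values quoted in this chapter (the gas-phase value for 000-1 is the 28.9 kcal/mol difference reported in the gas-phase section); the data layout and function names are illustrative.

```python
# Reaction enthalpies (kcal/mol) for N5-CAIR -> CAIR, as quoted in the text.
REACTION_ENTHALPY = {
    "000-0": {"gas phase": -43.2, "chlorobenzene": -37.5, "water": 21.3},
    "000-1": {"gas phase": -28.9, "chlorobenzene": 2.8, "water": -29.4},
}

def classify(delta_h):
    """Label a reaction enthalpy by its sign."""
    return "exothermic" if delta_h < 0 else "endothermic"

def media_with_sign_change(pathway, reference_medium="gas phase"):
    """Media in which the pathway's character differs from the reference."""
    values = REACTION_ENTHALPY[pathway]
    reference = classify(values[reference_medium])
    return [m for m, dh in values.items() if classify(dh) != reference]

for pathway, values in REACTION_ENTHALPY.items():
    summary = ", ".join(f"{m}: {classify(dh)} ({dh:+.1f})"
                        for m, dh in values.items())
    print(pathway, "|", summary)
    print("  differs from gas phase in:", media_with_sign_change(pathway))
```

The 000-0 pathway changes character only in water, and the 000-1 pathway only in chlorobenzene, where its enthalpy change is close to zero.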

The other pathways investigated show thermochemistry consistent with the gas-phase results: most pathways with a deprotonated C4 on the imidazole ring are exothermic. When the amine nitrogen (the R2 site) is protonated, the products of carboxylate group migration do not carry a negative charge on the amine group. Without a proton at the R2 site, the CO2 migration products carry a negative charge on the amine group, which raises the relative energies of these species. Consequently, the pathways through the 010 and 011 series structures, in which the amine nitrogen is protonated, are more exothermic than those through the 000 and 001 series structures, in which the amine nitrogen is unprotonated.

Most pathways with a neutral C4 site are endothermic, as in the gas-phase and aqueous results. The exception is the exothermic pathway through N5-CAIR111-2 and CAIR111-2, with enthalpy changes of -18.1, -17.2, and -16.7 kcal/mol in the gas phase, water, and chlorobenzene, respectively. These values vary very little, indicating that solvation has little effect on this pathway. Although the microenvironment of the enzyme's active site is unknown, the thermochemistry of this pathway is therefore expected to be insensitive to it. The stabilization effects in chlorobenzene are weaker than those in aqueous solution. Aside from N5-CAIR000-0 and CAIR000-0, all structures have relative energies below 100 kcal/mol.

Transition states search related to previous calculations

All of the structures discussed so far are energy minima, from which the thermodynamic properties of the reaction pathways can be determined. Understanding the kinetics, however, requires the activation barriers, and therefore the transition states (TS) of these pathways. Significant effort was devoted to locating the TS associated with the most probable reaction pathways. Four TS connecting N5-CAIR and CAIR were identified. The previously optimized structure N5-CAIR101-2 again serves as the reference, so that all structures, including the transition states, share a common reference point and their relative energies are comparable. As before, model molecules based on the histidine side chain were used to keep the overall molecular formula the same across different protonation states.

All transition states were located at the B3LYP/6-31+G(d) level of theory in the gas phase. Energies in chlorobenzene and water were calculated with the PCM model at the same level of theory, using the gas-phase optimized geometries, to estimate solvation effects. Thermal and entropic corrections for enthalpies and free energies were taken from the gas-phase calculations. All values are referenced to the standard structure N5-CAIR101-2 and are reported at 298 K in kcal/mol.
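The bookkeeping described above, a solvent single-point energy combined with a gas-phase thermal correction and then referenced to a common structure, can be sketched as follows. All numerical inputs are hypothetical placeholders rather than values from this work, the function names are illustrative, and the balancing of stoichiometry with histidine model molecules is omitted for brevity.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095

def enthalpy_in_medium(e_single_point, e_gas, h_gas):
    """Enthalpy in a medium (hartree), assuming the gas-phase thermal
    correction (h_gas - e_gas) can be transferred unchanged to the
    PCM single-point electronic energy."""
    return e_single_point + (h_gas - e_gas)

def relative_enthalpy(species, reference):
    """Enthalpy of a species relative to the reference, in kcal/mol."""
    delta = enthalpy_in_medium(**species) - enthalpy_in_medium(**reference)
    return delta * HARTREE_TO_KCAL_PER_MOL

# Hypothetical energies in hartree, for illustration only.
reference = {"e_single_point": -100.050, "e_gas": -100.040, "h_gas": -99.938}
species = {"e_single_point": -100.020, "e_gas": -100.000, "h_gas": -99.900}

print(round(relative_enthalpy(species, reference), 2), "kcal/mol")
```

Because the same correction scheme is applied to every structure, differences between two media for a given pathway reflect only the change in the single-point electronic energies.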

Table 5.4 Transition states for carboxylate group migration using the full model

The reaction barrier for migration of the carboxylate group starting from structure N5-CAIR011-1 is 2.6 kcal/mol in the gas phase. In chlorobenzene and water, the barrier increases to 4.8 and 6.2 kcal/mol, respectively. Such low barriers indicate that the migration is fast from a kinetic perspective.

In N5-CAIR011-1 the imidazole ring is deprotonated at the C4 position, which requires heterolytic cleavage of a strong C-H bond. The reactant therefore has a high relative energy, close to that of the transition state (TS), and the reaction barrier is low. The product, CAIR011-1, has a lower relative enthalpy than the reactant: the process is exothermic, with an enthalpy change of -77.1 kcal/mol in the gas phase. According to Hammond's postulate, the TS of such an exothermic reaction should resemble the reactant, and the optimized TS structure for CO2 migration is indeed very similar to the reactant.

In the reactant structure of N5-CAIR011-1, the distance between the carboxylate carbon and C4 is 2.8 Å, which decreases to 2.3 Å in the transition state (TS). The TS structure closely resembles the reactant structure, in agreement with Hammond's postulate. The phosphate of the R5P group forms three hydrogen bonds: with the carboxylate group, with the amine group, and with a hydroxyl group of the ribose ring. The hydrogen bonds with the -CO2H unit and the amine group remain intact throughout the reaction, including in the transition state. These hydrogen bonds likely play a crucial role in stabilizing the protonated amine nitrogen in both the reactant and the TS, as well as the migrating carboxylate group.

Figure 5.3 illustrates the migration of CO2H through N5-CAIR011-1, calculated at the B3LYP/6-31+G(d) level of theory in the gas phase. The data in square brackets and parentheses are single-point energy values obtained at the gas-phase optimized geometry with the PCM model for chlorobenzene (ε = 5.62) and water (ε = 78.39), respectively. N5-CAIR101-2 serves as the reference structure for comparison. The distance between the carboxylate carbon and C4 is indicated for N5-CAIR011-1, CAIR011-1, and the TS. Hydrogen bonds, assigned on the basis of short contact distances, are shown as red dashed lines.

Chlorobenzene and water both stabilize the structures involved significantly, yet these effects do not substantially alter the potential energy surface (PES) across the different media. The reaction pathway is exothermic with a low activation barrier, and the reaction enthalpy differs by only about 8 kcal/mol between the gas phase and either solvent. In the reactant, the amine nitrogen is protonated and carries a positive charge, while the carboxylic acid group is neutral; together with the anionic C4 site, this makes the imidazole moiety a zwitterion.

The N5-CAIR011-2 structure has an imidazole ring identical to that of N5-CAIR011-1 and differs only in the protonation state of the phosphate group. The transition state identified for this pathway has a geometry comparable to that of the previous pathway, as illustrated in Figure 5.4.

Figure 5.4 illustrates the migration of CO2H through N5-CAIR011-2, calculated at the B3LYP/6-31+G(d) level of theory in the gas phase. The data in square brackets and parentheses are single-point energy values obtained at the gas-phase optimized geometry with the PCM model for chlorobenzene (ε = 5.62) and water (ε = 78.39), respectively. N5-CAIR101-2 serves as the reference structure for comparison. The distance between the carboxylate carbon and C4 is indicated for N5-CAIR011-2, CAIR011-2, and the transition state (TS). Hydrogen bonds, assigned on the basis of short contact distances, are shown as red dashed lines.

The reaction barrier is -0.4, 2.6, and 4.0 kcal/mol in the gas phase, chlorobenzene, and water, respectively; the negative gas-phase barrier arises from the thermal corrections. This reaction pathway is highly exothermic, with enthalpy changes of -99.3, -88.4, and -83.6 kcal/mol in the respective media. The transition state (TS) geometry closely resembles that of the reactant, as Hammond's postulate predicts for an exothermic reaction. The barrier varies by less than 5 kcal/mol across media with very different dielectric constants, which suggests that solvation does not significantly alter the potential energy surface (PES) of the reaction. Overall, the calculations through N5-CAIR011-1(2) indicate that neither the protonation state of the phosphate group nor solvation has a strong influence on the PES for migration of the CO2 unit.

The orientation of the imidazole ring in this pathway is opposite to that in the N5-CAIR011-1 pathway. In this configuration, the protonated amine nitrogen forms a hydrogen bond with a hydroxyl group of the ribose ring rather than with the phosphate moiety. In both pathways, a hydrogen bond to the protonated amine group appears to play a crucial role in stabilizing it.

The N5-CAIR101-1 pathway is endothermic, with enthalpy changes of 16.7, 14.3, and 13.4 kcal/mol in the gas phase, chlorobenzene, and water, respectively, and is therefore thermodynamically unfavorable. The reaction barrier exceeds 60 kcal/mol in all media considered, which makes migration of the carboxyl group implausible. In this structure the imidazole ring is neutral: neither the carboxyl group nor the C4 site is deprotonated, and the amine nitrogen is not protonated. This migration pathway is thus favorable neither thermodynamically nor kinetically.
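The kinetic meaning of these barriers can be gauged with the Eyring equation from transition state theory, k = (kB T / h) exp(-ΔG‡ / RT). The sketch below treats the quoted barriers as if they were activation free energies, which is an assumption made purely for illustration since entropic contributions are not considered; the function name is likewise illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
H_PLANCK = 6.62607015e-34    # J s
R_GAS = 1.987204e-3          # kcal/(mol K)

def eyring_rate(barrier_kcal_per_mol, temperature=298.0):
    """First-order rate constant (s^-1) from the Eyring equation,
    with a transmission coefficient of 1."""
    prefactor = K_BOLTZMANN * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_GAS * temperature))

# Barriers quoted in the text (kcal/mol), used here as stand-ins for the
# activation free energy.
for label, barrier in (("N5-CAIR011-1, water", 6.2),
                       ("N5-CAIR101-1, lower bound", 60.0)):
    k = eyring_rate(barrier)
    print(f"{label:28s} k = {k:.2e} s^-1, half-life = {math.log(2) / k:.2e} s")
```

On this rough estimate a barrier of about 6 kcal/mol corresponds to a sub-microsecond process, whereas a barrier of 60 kcal/mol corresponds to a half-life far longer than any experimentally relevant time scale.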

Figure 5.5 illustrates the migration of CO2H through N5-CAIR101-1, calculated at the B3LYP/6-31+G(d) level of theory in the gas phase. The data in square brackets and parentheses are single-point energy values obtained at the gas-phase optimized geometry with the PCM model for chlorobenzene (ε = 5.62) and water (ε = 78.39), respectively. N5-CAIR101-2 serves as the reference structure for comparison. The distance between the carboxylate carbon and C4 is indicated for N5-CAIR101-1, CAIR101-1, and the transition state. Hydrogen bonds, assigned on the basis of short contact distances, are shown as red dashed lines.

Migration of the CO2 unit through N5-CAIR111-2 is the only exothermic pathway without a deprotonated C4 position, as shown by the thermochemical calculations above. This pathway has enthalpy changes of -20.0, -18.5, and -17.9 kcal/mol in the gas phase, chlorobenzene, and water, respectively, with an overall reaction barrier close to 40 kcal/mol. Although this barrier is higher than those found for the N5-CAIR011-1(2) structures, the energies of the transition state (TS) connected to N5-CAIR111-2 are lower than those of the TSs connected to N5-CAIR011-1(2). Moreover, the relative energy of N5-CAIR111-2 is much lower than the relative energies of N5-CAIR011-1(2).

Figure 5.6 illustrates the migration of CO2H through N5-CAIR111-2, calculated at the B3LYP/6-31+G(d) level of theory in the gas phase. The data in square brackets and parentheses are single-point energy values obtained at the gas-phase optimized geometry with the PCM model for chlorobenzene (ε = 5.62) and water (ε = 78.39), respectively. N5-CAIR101-2 serves as the reference structure for comparison. The distance between the carboxylate carbon and C4 is indicated for N5-CAIR111-2, CAIR111-2, and the transition state. Hydrogen bonds, assigned on the basis of short contact distances, are shown as red dashed lines.

Discussion

Thermochemistry of N5-CAIR and CAIR

Using 48 structures of N5-CAIR and CAIR in different protonation states, the PES of the rearrangement of N5-CAIR to CAIR was studied. About one third of these hypothetical structures did not remain stable during geometry optimization. Most of the unstable structures had a completely deprotonated phosphate group bearing two negative charges. The highly charged phosphate group strongly perturbed the imidazole moiety, so these structures were not stable during the gas-phase optimization and changed to different protonation states. Since the ribose-5-phosphate group is generally not completely deprotonated at normal physiological pH, no further effort was devoted to these unstable structures. Some other structures with a deprotonated carboxyl group released free CO2 upon optimization. According to the simplified model, one of these pathways is endothermic, and the other two require deprotonation of the C4 site on the imidazole ring. These pathways are therefore thermodynamically unfavorable overall. The experimental observations also exclude them: since no isotopic exchange is observed between CO2 from the environment and the carboxyl group of the imidazole structure, these structures are not of interest with respect to carboxyl group migration.

PCM calculations indicate that solvation significantly stabilizes the various structures, leading to a notable decrease in their relative energies compared with the gas-phase calculations, particularly for multiply charged structures and zwitterions. The enthalpy changes for migration of the CO2 unit from each N5-CAIR structure to its corresponding CAIR structure remain consistent across the gas phase and the two condensed phases, chlorobenzene and water. These enthalpy changes reflect the thermodynamic trends of the N5-CAIR rearrangement reaction for specific protonation states. The calculations with different dielectric constants also show that the potential energy surface (PES) of this reaction keeps a relatively consistent shape.

The energy trends of N5-CAIR and CAIR structures that differ only in the protonation state of ribose-5-phosphate vary with the medium. In the gas phase, the energies decrease as the charge on the phosphate group is reduced. In water, by contrast, N5-CAIR and CAIR structures with a singly charged phosphate (Pi(1-)) have lower energies than those with a doubly charged (Pi(2-)) or neutral (Pi) phosphate. In a nonpolar solvent such as chlorobenzene, the trends are less clear. However, when the imidazole reaction centers have the same protonation state, the enthalpy changes for N5-CAIR and CAIR with different phosphate groups are comparable. This indicates that the protonation state of the ribose-5-phosphate group does not significantly affect the thermodynamics of carboxylic group migration. Even with a third of the structures missing, the fundamental thermodynamic properties of carboxylic group migration for the various protonation states of the imidazole reaction center can still be assessed.

All migration pathways through structures with a deprotonated C4 position of the imidazole ring are exothermic, primarily because deprotonation requires heterolytic cleavage of a strong C-H bond. In addition, the protonation state of the amine nitrogen significantly influences the thermodynamics of the reaction: a protonated amine nitrogen lowers the enthalpy of carboxyl group migration by at least 20 kcal/mol compared to a neutral amine nitrogen. A deprotonated C4 together with a protonated amine nitrogen gives a zwitterionic structure in which the two sites are in close geometric proximity, so a potentially complex mechanism would be needed to maintain this structure.

All pathways involving a neutral C4 and a neutral amine nitrogen are endothermic, indicating that these structures, which are close to the natural state of N5-CAIR, are more stable than the others. This suggests that migration of the carboxyl group within N5-CAIR to form CAIR is not a spontaneous reaction. For the migration to become spontaneous, the protonation state of N5-CAIR must be altered.

The N5-CAIR structures with a protonated amine nitrogen and a deprotonated carboxyl group are unstable because they tend to decompose into AIR and free CO2. Consequently, these structures are not viable participants in the mechanistic pathway.

The only exothermic pair of N5-CAIR and CAIR with a neutral C4 position has a protonated amine nitrogen. This structure is just one step away from the neutral state of N5-CAIR, and its energy is relatively low compared to structures with a deprotonated C4 position. Its stability and the exothermic nature of the migration make this pair a strong candidate for carboxyl group migration, especially when compared to pairs with a deprotonated C4 position.

Transition States of Carboxyl Group Migrations

The transition states (TS) of a reaction determine its activation barriers. A deprotonated carboxyl group leads to TS structures in which the CO2 unit is relatively free, which is inconsistent with the absence of isotopic exchange observed experimentally. Consequently, pairs featuring a deprotonated carboxyl group were excluded from the TS calculations. The TSs associated with N5-CAIR011-1(2) have similar structures, confirming that the protonation state of the R5P unit does not influence the potential energy surface (PES) of the reaction. A TS located with an arbitrary R5P protonation state can therefore represent the other two TSs. Notably, the two TSs with a deprotonated C4 unit show activation barriers approximately 10 kcal/mol lower, making migration through these pathways both thermodynamically and kinetically favorable. The critical step in these pathways is the deprotonation of C4, which requires heterolytic cleavage of the C-H bond.
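To give a feel for what a barrier difference of about 10 kcal/mol means kinetically, the following sketch applies the Eyring relation to the ratio of two rate constants. It rests on assumptions that are not part of the calculations above: the difference is treated as a difference in activation free energy, both steps are taken to share the same prefactor, and the temperature is set to 298.15 K.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def eyring_rate_ratio(barrier_gap_kcal, temperature_k=298.15):
    """Return k_fast / k_slow for two steps whose activation free energies
    differ by barrier_gap_kcal (slow minus fast), assuming equal prefactors."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(barrier_gap_kcal / (GAS_CONSTANT * temperature_k))


# Barrier gap of about 10 kcal/mol, the figure quoted in the text.
ratio = eyring_rate_ratio(10.0)
print(f"rate ratio for a 10 kcal/mol gap at 298.15 K: {ratio:.1e}")
```

Under these assumptions the lower-barrier step is faster by roughly seven orders of magnitude, which is why a difference of this size is treated as kinetically decisive once the reactant itself is accessible.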

The reaction through N 5 -CAIR101-1 has an activation barrier which is larger than

The overall pathway of the reaction is endothermic, with an energy of 60 kcal/mol, so it is favorable neither kinetically nor thermodynamically. The N5-CAIR101-1 structure has a neutral imidazole in its natural state and is more stable than CAIR. Equilibrium between N5-CAIR and CAIR would also be difficult to reach along this pathway, so it is unlikely to be the actual mechanism of carboxyl group migration.

The pathway through N5-CAIR111-2 is exothermic. The magnitude of the enthalpic change is smaller than for the N5-CAIR011-1(2) pathways. The reaction barrier

The relative energy of the transition state (TS) in the N5-CAIR011-1(2) pathways is comparable to that of the TS in the N5-CAIR111-2 pathway, but N5-CAIR111-2 itself has a significantly lower relative energy. This suggests that N5-CAIR111-2 is more accessible than the structures that require cleavage of a C-H bond. However, the high reaction barrier for this step makes it difficult to identify a definitive preferred pathway. The two favorable reaction mechanisms, along with their calculated energies, are illustrated in Figure 5.7. Further analysis of the amino acid residues within the enzyme's active site may be necessary.

Figure 5.7 Favorable reaction mechanisms of PurE Class I. Single-point energies were obtained from optimized geometries in the gas phase and with the PCM model for chlorobenzene (ε = 5.62) and water (ε = 78.39). The structure of N5-CAIR101-2 serves as the reference.

Conclusions

This chapter explores the potential energy surfaces (PES) of the molecular rearrangement of N5-CAIR to CAIR through CO2 migration, using DFT methods to analyze the thermodynamics and kinetics of the reaction. Neither solvation nor the protonation state of the ribose-5-phosphate group significantly influences the shape of the PES. Consequently, representative transition states (TS) were optimized for further study, while structures with a deprotonated carboxyl group were excluded because they yield TSs with free CO2, which the experimental isotopic exchange results rule out. The computed thermochemistry indicates two viable rearrangement pathways from N5-CAIR to CAIR: one involving a deprotonated C4 position of the imidazole ring and the other involving a protonated amine nitrogen. The TSs of these pathways have comparable relative energies. However, the energy of N5-CAIR with a deprotonated C4 position is significantly higher than that with a protonated amine nitrogen, and the reaction barrier for the latter is notably greater than that for the deprotonated C4 pathway.

No further conclusion could be drawn from these computational results. Explicit consideration of the enzyme's active site will be required to distinguish between these two possible pathways.

Introduction

In Chapter 4, we computationally investigated the mechanism of N5-CAIR mutase (PurE) using a simplified model in which ribose-5-phosphate (R5P) was replaced by a methyl group. We identified seven potential reaction pathways, the most favorable of which involves protonation of the reactant's amine group, facilitated by a nearby histidine residue in the protein's active site. CBS-QB3 calculations confirmed the energetics of this pathway and agreed well with DFT results at the B3LYP/6-31+G(d) level. In Chapter 5, DFT calculations on a complete model (R = R5P) further supported the simplified model. However, these previous studies did not account for the enzyme's full scaffold or for the amino acid residues involved in the catalytic mechanism.

Our calculations provided detailed information about the N5-CAIR mutase mechanism, and the reaction mechanism may be useful for structure-based drug design.

This chapter explores analogs of the simplified MICA substrate model, focusing on how different substituent groups influence the potential energy surface (PES). Two sites on the MICA model compound, R1 and R2, are evaluated, as illustrated in Scheme 6.1, using five substituent groups: CH3, CF3, CN, CO2CH3, and NO2. Only single substitutions at either R1 or R2 are examined; simultaneous substitution at both R1 and R2 has not been investigated at this stage.

R1, R2 = -CH3, -CF3, -CN, -CO2CH3, -NO2

Computational Methods

In this study, DFT methods (refs. 4, 5) were used, with all structures optimized at the B3LYP/6-31+G(d) level of theory. Vibrational frequencies were calculated at the same level, and the zero-point vibrational energy was scaled by a factor of 0.9806. Thermal and entropic corrections were derived from the unscaled vibrational frequencies.
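The zero-point correction described above can be sketched as follows. This is a minimal illustration of the standard harmonic formula with the 0.9806 scaling factor from the text; the wavenumbers in the example are hypothetical and are not taken from any structure in this chapter.

```python
# Scaled harmonic zero-point energy: ZPE = scale * (1/2) * h * c * sum(wavenumbers).

PLANCK = 6.62607015e-34       # J*s
LIGHT_SPEED_CM = 2.99792458e10  # cm/s
AVOGADRO = 6.02214076e23      # 1/mol
JOULE_PER_KCAL = 4184.0
ZPE_SCALE = 0.9806            # scaling factor quoted in the text


def zero_point_energy(wavenumbers_cm, scale=ZPE_SCALE):
    """Return the scaled zero-point energy in kcal/mol.

    Imaginary modes, which quantum chemistry programs commonly print as
    negative wavenumbers (e.g. the reaction coordinate of a TS), are skipped.
    """
    real_modes = [w for w in wavenumbers_cm if w > 0]
    joule_per_molecule = 0.5 * PLANCK * LIGHT_SPEED_CM * sum(real_modes)
    return scale * joule_per_molecule * AVOGADRO / JOULE_PER_KCAL


# Hypothetical wavenumbers (cm^-1) for illustration only.
example_modes = [-450.0, 520.0, 1100.0, 1650.0, 3100.0]
print(f"scaled ZPE: {zero_point_energy(example_modes):.2f} kcal/mol")
```

Thermal and entropic corrections would be evaluated separately from the unscaled frequencies, as stated above, so the scaling factor enters only through the zero-point term in this sketch.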

Transition states (TS) for carboxylic group migrations were obtained for structures with different substituent groups. In order to evaluate the effect of solvation, the

PCM models (refs. 10-13) were used for chlorobenzene (ε = 5.62) and water (ε = 78.39) in single-point energy calculations at the B3LYP/6-31+G(d) level of theory on the gas-phase optimized structures. All geometry optimizations, transition state searches, vibrational frequency calculations, and PCM single-point energy evaluations were carried out with Gaussian 03 at the Ohio Supercomputer Center.

Results

Analogs with either R1 or R2 as a methyl group

The results for the MICA analog with R1 = CH3 and R2 = H are listed in Table 6.1. The transition state (TS) obtained from this structure corresponds to migration of the carboxyl group to C5 rather than C4; repeated attempts to locate a TS for migration to C4 consistently converged to the C5 structure. For the other pathways, a TS for migration to C4 can be located, and the potential energy surfaces (PES) closely resemble those of the model structure MICA. The pathways through the MICA p2, p4, and p8 analogs remain exoergic, whereas those through MICA p5 and p7 are endoergic. The relative enthalpies and free energies of these structures differ by less than 5 kcal/mol from their model counterparts. PCM calculations in chlorobenzene and water give similar results, with maximum differences in enthalpy or free energy from the model counterparts of about 5 kcal/mol. Although the reactant structure for the MICA p6 pathway is unavailable, its TS and product have thermodynamic properties similar to those of the basic system. Overall, a methyl group at R1 does not significantly alter the shape of the PES for carboxylic group migration in either the gas phase or the condensed phases.

Table 6.1 Substituent calculation with R1 = CH3, R2 = H. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.1: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with MICA p1 as the reference structure for all other results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures at the same level of theory. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found instead of the intended reactant, product, or transition state.

The results for the MICA analogs with R1 = H and R2 = CH3 show the same issue in the MICA p1 pathway: the transition state obtained corresponds to migration of the carboxylic group to C5 instead of C4. The potential energy surface (PES) for this series of analogs follows a pattern similar to that of the series with the methyl group at the R1 site and to that of the basic MICA model. The enthalpy and free energy differences between these analogs and the basic model structures are small, not exceeding 5 kcal/mol for most of the available structures.

A methyl substituent at either site does not significantly alter the potential energy surface (PES) for carboxyl group migration. All previously identified pathways retain comparable transition states (TS) and thermodynamic properties in both series when compared to the basic model system. The exception is the pathway through the MICA p1 analogs, where different TSs leading to different migration products were obtained.

Table 6.2 Substituent calculation with R1 = H, R2 = CH3. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.2: To compare different protonation states, neutral and protonated histidine side chains are included as reference states. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 as the reference for all calculated results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found instead of the intended reactant, product, or transition state.

Analogs with either R1 or R2 as a trifluoromethyl group

The results for the MICA analog with R1 = CF3 and R2 = H are listed in Table 6.3. The correct transition state for the MICA p1 pathway was located, but this pathway becomes endoergic with trifluoromethyl at the R1 position: the free energy change is 28.4 kcal/mol, approximately 59 kcal/mol higher than the corresponding change in the parent model system. The reaction barrier for this pathway is 60.6 kcal/mol, about 30 kcal/mol greater than that of the model system. The pathways through the MICA p2 and p4 analogs are exoergic, while the pathway through the MICA p5 analog is endoergic. The reaction barriers and enthalpy changes for these pathways are comparable to those of the basic model system, but their relative energies are higher than those of their counterparts in the basic model system, by 7 to 20 kcal/mol.

Table 6.3 Substituent calculation with R1 = CF3, R2 = H. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.3: To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 as the reference for all calculated results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures. "N/A" indicates that the structure optimization was not completed.

Calculations for the pathways through the MICA p6 and p7 analogs are currently incomplete; the available structures show increased relative energies with respect to the global reference structure. In contrast, the transition state (TS) of the CF3-MICA p8 analog has a relative enthalpy approximately 9 kcal/mol lower than that of the TS in the MICA p8 pathway. The MICA p8 structure carries two negative charges, and the electron-withdrawing trifluoromethyl group helps stabilize this electron-rich configuration.

A trifluoromethyl substituent at the R2 site affects the reaction pathways quite differently from one at the R1 site, as detailed in Table 6.4. The MICA p1 analog pathway is exoergic, with an enthalpy change of -32.0 kcal/mol, and the activation barrier is lowered to 22.6 kcal/mol, 8.4 kcal/mol below that of the parent model system. Consequently, the trifluoromethyl group at the R2 site makes the MICA p1 pathway more favorable than in the model system, both thermodynamically and kinetically.

Table 6.4 Substituent calculation with R1 = H, R2 = CF3. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.4: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 as the reference for all other calculated results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures at the B3LYP/6-31+G(d) level. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found instead of the intended reactant, product, or transition state.

The pathways through MICA p5 and p7 for the R2 = CF3 analog are endoergic, with relative energies significantly lower than those of the model systems. This may reflect destabilization of the MICA p1 analog, the reference structure, in which the positively charged amine group lies close to the electron-withdrawing trifluoromethyl group.

Calculations for the MICA p2, p4, p6, and p8 analog pathways are currently incomplete. The available data indicate that the relative energies of these analogs are generally lower than those in the basic model system, with the exception of the transition state in the MICA p2 pathway. Like MICA p1, the p2 pathway carries a positive charge on its amine group, which is destabilized further by the trifluoromethyl group.

Analogs with either R1 or R2 as a cyano group

The results of calculations with a cyano group as the substituent at the R1 position are listed in Table 6.5.

Table 6.5 Substituent calculation with R1 = CN, R2 = H. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.5: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 (reactant) as the reference for all calculated results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures at the same level of theory. "N/A" indicates that the structure optimization was not completed, so no data are available.

The electron-withdrawing CN substituent in the R1 = CN analog does not significantly change the shape of the potential energy surface (PES). As indicated in Table 6.5, the pathways through the MICA p1, p2, and p8 analogs are exoergic, while those through the p5 and p7 analogs are endoergic. The relative enthalpies of the structures in this analog series are generally lower than in the basic model system, likely because the electron-withdrawing substituent destabilizes the reference MICA p1 structure.

The dianionic structures in the MICA p8 analog pathway are strongly stabilized by the electron-withdrawing cyano group. In both the gas phase and the condensed phases, the relative enthalpies of these structures are significantly lower than those in the basic model system, and the reaction barrier is notably lower than for the pathway without the cyano substituent. The MICA p8 structure carries two negative charges, on the carboxylic group and on the deprotonated C4, and the transition state (TS) involves migration of the carboxylic group to C4, so both reaction centers are negatively charged. The electron-withdrawing CN group stabilizes the TS further by stabilizing these electron-rich reaction centers.

The cyano substituent group at the R2 site destabilizes many structures and causes difficulties in the calculations for this series. The results are listed in Table 6.6.

Table 6.6 Substituent calculation with R1 = H, R2 = CN. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.6: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with MICA p1 (reactant) as the reference structure for all results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures at the same level of theory. "N/A" indicates that the structure optimization was not completed, so no data are available.

Complete results were obtained for the pathways through the MICA p1 and p5 analogs. The MICA p1 pathway is highly exoergic and has a low reaction barrier, whereas the MICA p5 pathway is slightly endoergic when compared to the basic model system. Similar trends were observed with the electron-withdrawing trifluoromethyl substituent at the R2 site, which significantly destabilizes the reactants. This destabilization results in low reaction barriers and high exothermicities.

Analogs with either R1 or R2 as a carboxylic ester group

The carboxylic ester group was first used as a substituent at the R1 site, with results detailed in Table 6.7. The pathway through the MICA p1 analog is exoergic, with a reaction enthalpy that differs from that of the MICA p1 pathway by only 1.4 kcal/mol in the gas phase. The reaction barrier for this pathway is 44.6 kcal/mol, which is 13.6 kcal/mol higher than in the parent MICA model; however, this difference becomes negligible with the PCM models. In the condensed phases, the enthalpy changes for these reactions differ from the basic model by approximately 4 kcal/mol. Consequently, the potential energy surface (PES) of the pathway through MICA p1 is largely unaffected by the ester functionality at the R1 site.
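The barrier shifts quoted for the MICA p1 pathway can be cross-checked against each other by backing out the parent barrier that each statement implies. The short sketch below uses only numbers stated in the text for the gas phase (the R2 = CF3 analog: 22.6 kcal/mol, 8.4 lower than the parent; the R1 = CO2CH3 analog: 44.6 kcal/mol, 13.6 higher than the parent). The helper name is hypothetical.

```python
# Gas-phase MICA p1 barriers quoted in the text, in kcal/mol:
# (analog barrier, shift relative to the parent model; negative = lower).
quoted_p1_barriers = {
    "R2 = CF3": (22.6, -8.4),
    "R1 = CO2CH3": (44.6, 13.6),
}


def implied_parent_barrier(analog_barrier, shift_from_parent):
    """Return the parent-model barrier implied by an analog barrier and its shift."""
    return analog_barrier - shift_from_parent


for analog, (barrier, shift) in quoted_p1_barriers.items():
    parent = implied_parent_barrier(barrier, shift)
    print(f"{analog:12s} implies a parent barrier of {parent:.1f} kcal/mol")
```

Both statements point to the same parent value, which supports reading the quoted shifts as differences in activation enthalpy measured against one common gas-phase reference pathway.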

Table 6.7 Substituent calculation with R1 = CO2CH3, R2 = H. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.7: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states; the numbers in parentheses give the number of neutral and protonated histidine. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 as the reference for all calculated results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures at the same level of theory. "N/A" indicates that the structure optimization was not completed, so no results are available.

The pathways through the MICA p2 and p5 analogs differ only slightly from the basic model. In the gas phase, the MICA p2 analogs have relative enthalpies 8 kcal/mol higher than those in the parent MICA model. In chlorobenzene or water, however, the relative enthalpies of the analogs containing the ester group are lower than those of the basic model structures. Even so, the reaction barriers and enthalpy changes for the p2 pathway are similar to those of the basic model, with differences of less than 3 kcal/mol. This indicates that the substituent group does not significantly alter the potential energy surface (PES) for this pathway. Solvation also stabilizes the ester-substituted analogs along the p2 pathway more than those along the p1 pathway.

In the gas phase, the ester analogs in the MICA p5 pathway have relative enthalpies more than 10 kcal/mol higher than those in the parent MICA model. These differences diminish significantly in the condensed phases, indicating that solvation effects counterbalance the substituent effects in this pathway. In every phase, the potential energy surface (PES) for this pathway closely resembles that of the parent MICA model.

The relative enthalpies of the analogs in pathways p4, p6, p7, and p8 closely resemble those of the basic model in the gas phase, but they decrease significantly at higher dielectric constants. While the substituent group does not alter the shape of the potential energy surface (PES), the stabilization by solvation differs between the parent MICA model and the analog series for each pathway.

The results for the carboxylic ester substituent at the R2 site are detailed in Table 6.8. In the pathway through the MICA p5 analog, the relative enthalpies of the transition state (TS) and product are approximately 10 kcal/mol lower than those in the parent MICA model. The reaction is endoergic for both the parent model and the R2 = CO2CH3 analog, and in the gas phase the reaction barrier and enthalpy change are also about 10 kcal/mol lower than in the parent model. PCM calculations with the dielectric constants of chlorobenzene and water show similar trends: the relative enthalpies of each structure are significantly lower than those of the parent MICA model, and the reaction barriers and enthalpy changes for these analogs remain around 10 kcal/mol lower at the PCM level.

Table 6.8 Substituent calculation with R1 = H, R2 = CO2CH3. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.8: The numbered structures are illustrated in Figure 6.1. To compare different protonation states, neutral and protonated histidine side chains are included as reference states. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the starting structure of MICA p1 as the reference for all other results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found instead of the intended reactant, product, or transition state.

The structures from the other reaction pathways have lower relative enthalpies than the corresponding parent MICA model structures. Solvation, at both dielectric constants, significantly stabilizes these structures. For the pathways through MICA p4, p6, p7, and p8, the gas-phase relative enthalpies are at least 15 kcal/mol lower than those of their counterparts in the parent model. When solvation is included, the relative enthalpies of these structures are again considerably lower than in the basic model.

Analogs with either R1 or R2 as a nitro group

The nitro group is another electron-withdrawing substituent used in this study. At the R1 site, it has little effect on the barrier of the pathway through the MICA p1 analog compared to the parent model. As detailed in Table 6.9, the reaction barriers for this pathway are within 0.5 kcal/mol of those of the parent model in both the gas phase and the condensed phases. The reaction is exoergic, with enthalpy changes of -27.2 kcal/mol in the gas phase and -16.5 and -12.8 kcal/mol in chlorobenzene and water, respectively. However, this pathway is less exoergic than in the parent model: the enthalpy change is 4.5 kcal/mol higher in the gas phase, and 7.4 and 8.3 kcal/mol higher in chlorobenzene and water, respectively.

For the MICA p2 analog, the barrier and enthalpy change are only 2 kcal/mol higher than those of the parent model. The relative enthalpies of the reactant, product, and transition state in this pathway are 15 kcal/mol lower than those of the corresponding basic model structures. The shape of the pathway is therefore unchanged by a nitro group at the R1 site, while the potential energy surface (PES) is shifted 15 kcal/mol lower relative to the pathway through p1. PCM calculations for both chlorobenzene and water show the same trends.

The MICA p4 analog is unstable and releases carbon dioxide from the carboxylate group. Because this behavior does not agree with experimental observations, the product and transition state of this pathway were not examined further.

The MICA p5 analog pathway is endoergic, with an enthalpy change of 22.5 kcal/mol, which is 5.4 kcal/mol greater than that of the parent model. The reaction barrier for this pathway is 2.5 kcal/mol higher than in the parent model. In the gas phase, the relative enthalpy of the MICA p5 analog is -59.9 kcal/mol, 12 kcal/mol lower than that of the parent model structure. The relative enthalpies of the transition state and product in this pathway are 18.9 kcal/mol and -37.5 kcal/mol, respectively, lower than their parent model counterparts by 9.4 and 6.7 kcal/mol. As in the p2 pathway, the energy profile is shifted to lower energy while keeping a shape similar to the potential energy surface of the basic model pathway. The PCM calculations follow the trends observed in the gas-phase results.

The TS structures of the MICA p6, p7, and p8 analogs have lower relative enthalpies than their parent model counterparts in both the gas phase and the condensed phases. These analogs carry negative charges and are significantly stabilized by the nitro group.
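Because every structure is reported as an enthalpy relative to one reference, the reaction enthalpy and the barrier of a pathway are simple differences, and the same differences applied to the substituent shifts give the change in each quantity. The sketch below uses only gas-phase numbers quoted above for the R1 = NO2 MICA p5 analog; the function names are hypothetical.

```python
def reaction_enthalpy(h_reactant, h_product):
    """Reaction enthalpy from relative enthalpies (kcal/mol)."""
    return h_product - h_reactant


def activation_enthalpy(h_reactant, h_ts):
    """Activation enthalpy from relative enthalpies (kcal/mol)."""
    return h_ts - h_reactant


# Relative enthalpies of reactant and product quoted in the text.
delta_h = reaction_enthalpy(-59.9, -37.5)

# Shifts of the analog relative to the parent model (negative = lower):
# reactant -12, TS -9.4, product -6.7.
reaction_shift = reaction_enthalpy(-12.0, -6.7)
barrier_shift = activation_enthalpy(-12.0, -9.4)

print(f"reaction enthalpy:             {delta_h:5.1f} kcal/mol")
print(f"change in reaction enthalpy:   {reaction_shift:5.1f} kcal/mol")
print(f"change in activation enthalpy: {barrier_shift:5.1f} kcal/mol")
```

The differences agree with the quoted 22.5, 5.4, and 2.5 kcal/mol to within about 0.1 kcal/mol, which is the precision expected when the reactant shift is given only as a rounded 12 kcal/mol.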

Table 6.9 Substituent calculation with R1 = NO2, R2 = H. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.9: To compare different protonation states, neutral and protonated histidine side chains are included as reference states. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with MICA p1 as the reference structure for all comparisons. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy for TS, and the reaction enthalpy and Gibbs free energy change for P. PCM single-point energy calculations were performed on the gas-phase optimized structures. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found.

Table 6.10 Substituent calculation with R1 = H, R2 = NO2. Column headings: Structure (a, b); Δ298H (kcal/mol); Δ298G (kcal/mol).

Notes to Table 6.10: To compare different protonation states, neutral and protonated histidine side chains are included as reference states. Gas-phase calculations were performed at the B3LYP/6-31+G(d) level of theory, with the MICA p1 structure as the reference for all other results. "R" denotes reactant, "P" product, and "TS" transition state. Values in parentheses are the activation enthalpy and Gibbs free energy. PCM single-point energy calculations were performed on the gas-phase optimized structures. "N/A" indicates that the structure optimization was not completed; "A/S" indicates that an alternate structure was found instead of the intended reactant, product, or transition state.

Table 6.10 lists the results with the nitro group at the R2 site. The transition state (TS) located along the reaction pathway through the MICA p1 analog led to migration of the carboxylic group to the C5 site. Comparable transition states were found with several other substituent groups in this study.

The MICA p2 analog has a gas-phase barrier of 29.8 kcal/mol, far higher than the 2.0 kcal/mol barrier of the parent model. In chlorobenzene and water the barriers are 29.2 and 28.0 kcal/mol, respectively, so solvation has little effect on the barrier height. This pathway is highly exoergic in both the basic model and the analog series, and the enthalpy changes are close to those of the parent model when solvation is included. The electron-withdrawing nitro group at the R2 site destabilizes the transition state by approximately 26 kcal/mol relative to the reactant and product in this analog's pathway.

The MICA p5 analog follows a less endoergic pathway than its parent model, with reaction enthalpy changes under 5 kcal/mol in both the gas phase and solution. These reaction enthalpies are approximately 14 kcal/mol lower than those of the basic model. In the gas phase the MICA p5 analog is destabilized relative to the parent model, although higher dielectric constants bring its relative enthalpy closer to that of the parent structure. The transition state (TS) is stabilized by the substituent group, which lowers the reaction barrier by more than 15 kcal/mol in both the gas phase and solution compared with the parent model. The product of this pathway has a relative enthalpy similar to that of the parent model in the gas phase but is stabilized by 12 kcal/mol in solution, which makes the overall pathway less endoergic.

All of the structures from pathways p4, p6, p7, and p8 carry negative charges. The electron-withdrawing nitro group stabilizes these structures relative to their parent MICA model structures.

Conclusions

The substituent effects of –CH3, –CF3, –CN, –CO2CH3, and –NO2 groups at two sites of the parent substrate structure have been studied.

A methyl group at either the R1 or the R2 site, and a carboxylic ester group or a nitro group at the R2 site, significantly altered the potential energy surface (PES) of the reaction through the MICA p1 pathway. In these analogs, the transition states (TS) for carboxylic group migration moved the CO2 unit to the C5 site instead of the intended C4 site. The specific factors behind these changes remain unclear, but the MICA p1 pathway is evidently highly sensitive to the substituents at the R1 and R2 positions.

The methyl group had little effect on the PES of the other pathways, with relative enthalpies differing by less than 5 kcal/mol from those of the parent structures. The trifluoromethyl group at the C2 position of the imidazole ring made the MICA p1 pathway endoergic and significantly raised its reaction barrier, while the other pathways were largely unaffected. The trifluoromethyl group at the R2 site produced negligible changes across the series. The cyano group, another electron-withdrawing substituent, stabilized negatively charged structures at both the R1 and R2 sites without significantly altering the shape of the PES. The carboxylic ester group enhanced stabilization through solvation effects and shifted the PES of the charged pathways toward lower energy, but it did not change the overall shape of the PES. Finally, the nitro group at the R2 site changed the transition state for carboxylic group migration in the MICA p1 pathway; the other pathways kept a PES similar to that of their parent models, with further stabilization of the negatively charged structures in this series.

Electron-withdrawing substituent groups significantly stabilize structures with negative charges, whereas structures with net positive charges do not show the same effect. The substituent groups at the R1 and R2 sites show no systematic differences. The pathways through MICA p1 and its analogs are highly sensitive to these substituent groups, which makes them important for understanding the reaction mechanism proposed for this enzyme. These substituent studies offer useful insight into the enzymatic reaction, although further work is needed for a deeper understanding of the mechanism.

Introduction

Proteins have stable tertiary structures because of well-defined secondary structures such as the α-helix and β-sheet, which are crucial for their biological functions. How these secondary structures fold into three-dimensional forms is of significant chemical and biological interest. In addition to using natural amino acids in protein-based polymers, researchers have designed and synthesized unnatural oligomers that mimic these natural secondary structures. These synthetic polymers, known as foldamers, can fold into compact three-dimensional structures. The design of foldamers relies on the universal driving forces of natural secondary-structure folding, including hydrogen bonding and hydrophobic interactions. Foldamers are classified as bioinspired or abiotic, and many abiotic foldamers incorporate aromatic rings so that hydrophobic stacking drives helical folding.

Understanding the folding and conformational changes of foldamers is crucial to their design. Research has focused on polymers and oligomers with helical structures, investigating M-P helical inversion and helical-nonhelical conformational transitions caused by structural perturbations. Various chemical methods have been established to induce conformational changes in foldamers. A recent study by the Parquette group reported the synthesis and photoisomerization of alternating pyridinedicarboxamide/m-(phenylazo)azobenzene oligomers.

Figure 7.1 Alternating Pyridinedicarboxamide/m-(Phenylazo)azobenzene

The crystal structure of 7.1 shows a distinct two-turn helical conformation with a helical pitch of about 3.4 Å. In addition, ¹H NMR spectroscopy revealed two well-separated doublets for the benzylic methylene hydrogens located at each end of the oligomer.

At low temperature the NMR sample showed distinct doublet peaks, indicating that the M and P helical structures of the oligomer do not interconvert rapidly. As the temperature increased, these peaks gradually merged into a singlet, which suggests that the energy barrier for interconversion between the M and P helical forms was overcome.

Strong experimental evidence supports the M and P helical structures of these foldamers and their interconversion. However, the mechanism of the conversion is difficult to study experimentally because of the low barriers and transient transition structures. Understanding the interconversion mechanism is crucial to understanding the folding of helical foldamers, and molecular-level detail about the important structures and barriers can help in designing next-generation foldamers. Computational chemistry offers alternative methods for investigating the conformations and folding of foldamers. This study evaluates several conformational sampling tools for exploring the folding and interconversion between the M and P helical structures of 7.1 and 7.2.

Monte Carlo (MC) conformational search methods are essential

Molecular dynamics (MD) is a widely used computational tool for investigating the conformational changes of polymers and proteins. Unlike Monte Carlo (MC) methods, an MD simulation provides a continuous trajectory that captures each step of a molecule moving between conformers. MD techniques have recently been used to explore the dynamic interchange processes of foldamers. A limitation of single-trajectory MD simulations is that they run at a single temperature, so the molecule can remain trapped in local minima for long periods. To address this sampling problem, advanced MD methods have been developed that simulate the same molecule at multiple temperatures simultaneously.

In this study, MD simulations, including replica exchange MD (also known as parallel tempering MD), were used to investigate the folding and interconversion of the M and P helical structures of 7.1 and 7.2. The results of each simulation were analyzed and compared.

Computational methods

The systematic torsional sampling Monte Carlo (SPMC) [28] search protocol in MacroModel version 8.5 was used to explore the conformational space of 7.1. The Monte Carlo search ran for 84,000 steps, and the conformer produced at each step was optimized with the AMBER* force field.

All optimized conformers within 1,000 kJ/mol of the global minimum were saved, giving a total of 70,207 unique conformations for 7.1. Because of the more complex structure of 7.2, the systematic torsional protocol is not suitable for its MC conformational search, and the MC Multiple Minimum (MCMM) approach was used instead.

The MCMM [29,30] search protocol was used for the conformational search of 7.2, with a total of 10,000 steps. All optimized conformers within 4,000 kJ/mol of the global minimum were saved, giving 9,855 unique structures for 7.2.

MD simulations were carried out with the AMBER8 program package, applying the general AMBER force field (GAFF) to 7.1 and 7.2. The generalized Born (GB) solvation model was used to represent water as the solvent. Each simulation began with solvation of the system, followed by minimization using SANDER, the primary molecular dynamics program of AMBER8, and 10^4 equilibration steps at constant volume. Production MD runs of 100 ns were then carried out for each system under the same protocol, with SHAKE bond-length constraints applied to all bonds involving hydrogen.

Replica-exchange molecular dynamics (REMD) simulations use N non-interacting replicas of a system, each held at a different temperature. During the simulation the replicas run independently at their assigned temperatures. After a set number of simulation steps, replicas at neighboring temperatures may exchange according to the Metropolis probability criterion, which depends on their potential energies. In an exchange, each replica takes on the temperature of its neighbor while keeping its own positions and momenta. For two replicas i and j at temperatures Tm and Tn, respectively, the exchange probability follows from this criterion.

The exchange depends only on the potential energies of the two replicas, and its probability is higher when the potential energy distributions of the two replicas overlap substantially. In practice, exchange probabilities are calculated only for pairs of neighboring temperatures, because these have the greatest overlap in potential energy.
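The exchange test described above can be sketched in a few lines. This is a minimal illustration that assumes the commonly used form of the Metropolis criterion for temperature REMD, P = min{1, exp[(1/kTm − 1/kTn)(Ei − Ej)]}, where Ei and Ej are the potential energies of replicas i and j. The function names, the energy unit (kcal/mol), and the value of the Boltzmann constant are choices made for this sketch, not details taken from the AMBER8 implementation.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption


def exchange_probability(e_i, e_j, t_m, t_n):
    """Metropolis probability of swapping replica i (at t_m) with replica j (at t_n).

    e_i, e_j: potential energies in kcal/mol; t_m, t_n: temperatures in K.
    """
    beta_m = 1.0 / (K_B * t_m)
    beta_n = 1.0 / (K_B * t_n)
    exponent = (beta_m - beta_n) * (e_i - e_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_exchange(e_i, e_j, t_m, t_n, rng=random):
    """Return True if the swap is accepted for one random draw."""
    return rng.random() < exchange_probability(e_i, e_j, t_m, t_n)
```

When the colder replica holds the higher potential energy, the swap is always accepted; otherwise the acceptance falls off exponentially with the energy gap, which is why only neighboring temperatures are usually paired.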

REMD simulations were conducted on 7.1 and 7.2 using the AMBER8 software package. The number of replicas for each molecule was chosen according to its size to ensure adequate temperature coverage and effective exchange among replicas. Twelve replicas were used for 7.1, with temperatures ranging from 230.0 K to 459.8 K (about 15 K apart at the low end of the range). Fourteen replicas were used for 7.2; the temperatures listed for this molecule include 361.6 K, 380.2 K, 399.8 K, 420.4 K, and 442.1 K. The temperatures were exponentially spaced to promote overlap of the potential energies of adjacent replicas. Exchanges were attempted every 10,000 MD steps, for a total of 2,200 attempts per replica pair, which corresponds to 22 ns of MD simulation per replica. In total, 264 ns of MD simulation were run for 7.1 with 12 replicas and 308 ns for 7.2 with 14 replicas.
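The replica set-up above can be checked with a short calculation. The sketch below builds an exponentially (geometrically) spaced temperature ladder between two end points and adds up the simulated time. The function names are hypothetical, and the 1 fs time step is an assumption: it is the value for which 2,200 attempts of 10,000 steps each give the stated 22 ns per replica.

```python
def temperature_ladder(t_min, t_max, n_replicas):
    """Geometric ladder: each temperature is a constant factor above the previous one."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


def sampling_time_ns(n_replicas, n_attempts, steps_per_attempt, timestep_fs=1.0):
    """Return (time per replica, aggregate time) in ns; timestep_fs is an assumed value."""
    per_replica = n_attempts * steps_per_attempt * timestep_fs * 1e-6  # fs -> ns
    return per_replica, per_replica * n_replicas


# End points and replica count quoted in the text for 7.1.
ladder_71 = temperature_ladder(230.0, 459.8, 12)
per_replica_71, total_71 = sampling_time_ns(12, 2200, 10000)
```

With a constant ratio between neighbors, the absolute spacing grows with temperature, which keeps the potential energy overlap of adjacent replicas roughly uniform along the ladder.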

Results and Discussion

Monte Carlo Conformational Search of 7.1

The MC conformational search for 7.1 produced 70,207 conformers within a 1,000 kJ/mol (238.8 kcal/mol) energy window from the global minimum, a window chosen to ensure that the left-handed (M) helical structure was included. Given the symmetry of 7.1, conformers with lower potential energies are more chemically relevant, so the higher-energy conformers were not analyzed. The potential energies of the 10,000 lowest-energy structures were plotted and span more than 110 kJ/mol (26 kcal/mol) relative to the global minimum. The three structures highlighted in Figure 7.2 (numbers 1, 5000, and 10000) show that more extended conformers have higher potential energies, while the fully compact helical conformer number 1 is the global minimum among all MC-generated structures.
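The energy-window bookkeeping used here is simple to reproduce. The sketch below ranks conformers by energy relative to the lowest one, keeps those inside a window, and converts kJ/mol to kcal/mol. The function names and the sample energies are hypothetical, and the conversion factor of 4.184 kJ/kcal (thermochemical calorie) is an assumption about the convention; it may differ in the last digit from the factor used for the values quoted in the text.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie; assumed convention


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj / KJ_PER_KCAL


def conformers_within_window(energies_kj, window_kj):
    """Return (original index, relative energy in kJ/mol) pairs, lowest energy first,
    for conformers no more than window_kj above the lowest-energy conformer."""
    e_min = min(energies_kj)
    kept = [(i, e - e_min) for i, e in enumerate(energies_kj) if e - e_min <= window_kj]
    return sorted(kept, key=lambda pair: pair[1])


# Made-up absolute energies (kJ/mol) for four conformers.
sample = [-500.0, -438.6, 700.0, -390.0]
ranked = conformers_within_window(sample, 1000.0)
```

Ranking by relative energy is what gives the conformer numbers used in the discussion: conformer 1 is the global minimum and larger numbers lie higher in energy.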

Figure 7.2 Potential Energies of Monte Carlo Searched Conformers for 7.1

The global minimum conformer is a compact left-handed helix containing seven aromatic rings. The benzyl rings at the two ends fold back toward one another, which makes the helix more compact.

Figure 7.3 Left-handed helical structure with global minimum potential energy for 7.1

The top view of the helix shows that the two six-membered rings in each stacked pair are not perfectly aligned. This resembles the crystal structure of 7.1 (Figure 7.4), in which, however, the terminal benzyl rings are oriented outward. The difference between the experimental crystal structure and the computed global minimum from the Monte Carlo (MC) search can be attributed to two factors. First, the implicit solvation model used in the MC search contains no explicit solvent molecules, so more compact conformers have lower potential energies. Second, the extensive stacking of molecules in the crystal shows that intermolecular interactions are important for stabilizing the solid-state structure, and these cannot be captured in an MC search of a single molecule.

Table 7.1 lists the root mean square deviation (RMSD) and relative potential energies of the first 20 conformers; 13 of these conformers are left-handed helices. Figure 7.5 shows the superimposition of these 13 structures, with hydrogen atoms omitted for clarity. Notably, among these

Table 7.1 First 20 conformers in Monte Carlo simulations of 7.1. Column headings include: number of terminal benzyl rings folded back toward the helix; number of terminal benzyl rings folded away from the helix.

Figure 7.5 Overlap of 13 left-handed helices conformers of 7.1 among top 20 MC structures

The other nine left-handed helical structures have at least one terminal benzyl ring folded away from the helix, as in the crystal structure.

Among the top twenty conformers, seven right-handed helices were found, and they show flexibility comparable to that of their left-handed counterparts. The benzyl rings at both ends of the helix can point either upward or downward relative to the helix. These seven right-handed helices are distributed evenly among the left-handed ones, which indicates that the two helices are similarly favored as global minimum structures.

Figure 7.6 Overlap of seven right-handed helices conformers of 7.1 among top 20 MC structures

Conformers become more extended as the potential energy rises, but the specific folding pathways for the right- and left-handed helical structures are not clear from the MC search. Visual inspection shows that the higher-energy structures are partially extended, which suggests that interconversion between right- and left-handed helices could proceed through a partially unwound helix. By partially unwinding one end of the helix, the molecule could interconvert without fully extending its helical structure. Conformers 5808 and 6289 show this unwinding at one end of the right-handed helix.

Figure 7.7 Conformer 5808 (left) and 6289 (right) of 7.1 from MC results. The potential energy with respect to the global minimum is 61.4 kJ/mol (14.7 kcal/mol) for conformer 5808, and 62.2 kJ/mol (14.9 kcal/mol) for conformer 6289

Both conformers are right-handed helices, but one end of each conformer has partially turned into a left-handed helix. If the remainder of the molecule could follow this left-handed folding, the entire structure could change from a right-handed to a left-handed helix.

The relative potential energies of conformers 5808 and 6289 are 14.7 and 14.9 kcal/mol, respectively. However, the complete pathway for interconversion between the two helices could not be identified, for two reasons: no other conformers along the proposed pathway were found in the Monte Carlo (MC) search, and the search provides no information about the pathway connecting any pair of conformers. Identifying the interconversion pathway between the two helices from these results is therefore difficult.

Because the ends of the helix are central to the proposed mechanism, the conformational distributions of the ends were analyzed for the 10,000 Monte Carlo conformers. The distance between two carbon atoms at each end of the compound was measured and plotted for each conformer. Despite the symmetry of the compound, the distributions of these distances differ significantly between the two ends.

Most conformers are extended at one end, where the benzyl ring is folded away from the helix. For the first 6,000 conformers, the distances at this end are concentrated at two values, around 7 Å and 8.5 Å. Conformers 6,000 to 10,000 show a wider range of distances, between 6 and 9 Å. The lower potential energies of the earlier conformers suggest that there are two energetically favorable geometries for the extended end. As the potential energy increases, the barrier between these preferred geometries becomes less important, and the distribution of distances becomes more varied.
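The distance analysis described above amounts to measuring one interatomic distance per conformer and binning the results. A minimal sketch follows, assuming each conformer is available as a mapping from atom label to Cartesian coordinates in Å. The atom labels "C1" and "C2", the bin width, and the coordinates are placeholders, not the atoms or values used in the original analysis.

```python
import math
from collections import Counter


def end_distance(coords, atom_a, atom_b):
    """Distance in Å between two labelled atoms of one conformer."""
    return math.dist(coords[atom_a], coords[atom_b])


def distance_histogram(conformers, atom_a, atom_b, bin_width=0.5):
    """Count conformers per distance bin; keys are the lower edges of the bins in Å."""
    counts = Counter()
    for coords in conformers:
        d = end_distance(coords, atom_a, atom_b)
        counts[math.floor(d / bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


# Three made-up conformers, each reduced to the two atoms of interest.
demo = [
    {"C1": (0.0, 0.0, 0.0), "C2": (7.1, 0.0, 0.0)},
    {"C1": (0.0, 0.0, 0.0), "C2": (7.3, 0.0, 0.0)},
    {"C1": (0.0, 0.0, 0.0), "C2": (0.0, 8.6, 0.0)},
]
histogram = distance_histogram(demo, "C1", "C2")
```

Plotting the same distance against conformer number, rather than binning it, gives the kind of scatter that separates the low-energy and high-energy conformers in the discussion.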

At the other end, conformers numbered below 1,000 have preferred carbon-carbon distances around 8 Å, while those above 1,000 show distances varying from 4 to 9 Å, with a notable preference for approximately 4.5 Å. In other words, conformers with potential energies below 12 kcal/mol favor extended geometries at this end, whereas for the other 9,000 conformers, with potential energies above 12 kcal/mol, folded geometries are more favored.

Figure 7.8 The distributions of distance between two carbons at either end of 7.1

The analysis shows that the two ends of the helical structure of 7.1 are not equivalent, and each has its own preferred geometries. Based on the proposed mechanism, interconversion between the helical structures is likely to start preferentially from the end that favors extended geometries.

Monte Carlo Conformational Search of 7.2

The MC conformational search for 7.2 yielded 9,855 distinct conformers, whose relative potential energies span approximately 350 kJ/mol (84 kcal/mol) above the global minimum, as shown in Figure 7.9. Three conformers, including numbers 5000 and 9855, show that conformers with higher potential energies have more extended geometries.

The global minimum conformation has a helical structure (Figure 7.10).

This structure combines both helical senses: the upper half is a right-handed helix and the lower half is a left-handed helix. Each half is a well-defined helix, and the two halves share a straight line as their central axis. The first 20 conformers all have helical structures similar to the global minimum, each combining a right-handed and a left-handed helix.

Figure 7.9 Potential Energies of Monte Carlo Searched Conformers for 7.2

Figure 7.10 Structure (half left-handed and half right-handed helical structure) with global minimum potential energy for 7.2: (a) side view; (b) top view

No structure with a uniformly right- or left-handed helix was found in this conformational search. Instead, the conformers with high potential energy consistently have partially extended structures. Four conformers (185, 561, 612, and 808) illustrate this, as shown in Figure 7.11. Conformers 185 and 808 have a left-handed helix at one end and an extended structure at the other, while conformers 561 and 612 have a right-handed helix at one end and an extended structure at the other. This suggests that oligomer 7.2 can unwind the helix at either end.

Inspection of the first 1,000 conformers found several compact structures without well-defined helical forms, which reflects the effectiveness of the MC conformational search. The complexity of 7.2 may nevertheless have prevented any interconversion between helical structures from appearing in the MC search.

Therefore, the mechanism of interconversion between the two helical structures remains uncertain for 7.2.

The distance between two carbon atoms at each end of the conformers shows a similar distribution at both ends, mostly between 4 and 10 Å, with a slight preference for 4 Å and 8 Å. Both ends can therefore adopt either a folded or an extended geometry across the range of potential energies. This agrees with the observation that the helix at one end can unwind while the other end keeps its helical structure.

The two ends nevertheless differ noticeably in their distance ranges. The end with a right-handed helix at low potential energy has a narrower distribution, 4 to 9 Å, as shown in Figure 7.12a, than the other end.

The right-handed helix thus has a narrower and more compact distribution than the left-handed helix. This compactness may influence the interconversion between the two helices, but this remains uncertain because no interconversion was observed. Relative to the global minimum, conformer 185 has an energy of 22.4 kcal/mol, and conformers 561, 612, and 808 have energies of 34.4, 35.2, and 39.1 kcal/mol, respectively.

Figure 7.11 Four conformers of 7.2 with partially extended geometries. The potential energy of each conformer relative to the global minimum from this Monte Carlo search is given in parentheses.

Figure 7.12 The distributions of distance between two carbons at either end of 7.2. Panel labels: end with right-handed helix; end with left-handed helix

Molecular Dynamics of 7.1

Molecular dynamics (MD) simulations of 7.1 at 300 K were run for 100 ns. The root mean square deviation (RMSD) values of 7.1 fluctuated by up to 5 Å, which indicates structural flexibility. Even so, every frame of the trajectory kept a right-handed helical structure, as confirmed by inspection with Visual Molecular Dynamics (VMD). A dihedral angle defined by two carbon atoms of the central pyridine ring and the centroids of two benzyl rings was measured and plotted for each frame along the trajectory. Positive angles, which correspond to a right-handed helix, predominate. A few frames have negative dihedral angles because of the internal twist of one benzyl ring, but the overall right-handed helical structure remained intact. The same measurement at the opposite end of the helix confirmed that the helical structure was maintained throughout the simulation.
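The handedness measure described above reduces to a signed dihedral angle through four points. The sketch below computes that angle in pure Python from two ring centroids and two atom positions. The order of the four points (centroid, carbon, carbon, centroid) and all coordinates are assumptions made for illustration; the text does not specify the exact atoms or ordering, and the mapping of positive angles to a right-handed helix is taken from the text rather than derived here.

```python
import math


def centroid(points):
    """Geometric centre of a list of (x, y, z) points, e.g. the six carbons of a ring."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180 to 180) about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Remove the components along the central axis, then measure the twist.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

Mirroring a structure flips the sign of this angle, so the fraction of frames with positive versus negative values gives a quick per-trajectory summary of which helical sense is sampled.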

Figure 7.13 RMSD of 100 ns MD simulation of 7.1

Figure 7.14 Dihedral angle on one end of 7.1: (a) dihedral angle along MD trajectory

Because the MD simulation kept a right-handed helical structure throughout, the conformational changes responsible for the observed RMSD variations were examined. The MC search had indicated that both ends of the helix are flexible, so the same carbon-carbon distance analysis was applied to the MD trajectory. The distances range from 4 to 10 Å, similar to the MC results, which confirms that the ends of the helix have the conformational flexibility seen in the MC search. The terminal benzyl group can either fold back toward the helix, giving shorter carbon-carbon distances, or extend away from it, giving longer distances.

Figure 7.15 The distribution of distance between two carbons at one end of 7.1

Molecular Dynamics of 7.2

MD simulations of 7.2 at 300 K were run for 100 ns. RMSD values for 7.2 fluctuated by up to 8 Å (Figure 7.16). There are four significant changes in the RMSD values during the simulation, which indicate sudden conformational changes of 7.2. The first sudden RMSD change occurs around 37 ns, where the RMSD changes between 3 and 6 Å.

During the first 37 ns of the simulation, 7.2 has one half as a right-handed helix and the other half as a left-handed helix. After this period, the molecule folds while each half keeps its helical sense. The two halves fold onto the same side of the central pyridine ring, which produces the significant RMSD change at 37 ns. A second notable RMSD change occurs around 40 ns, when the two halves of the molecule begin to fold onto opposite sides of the central pyridine ring, with the helices unchanged. A third significant RMSD change occurs near 60 ns, when the right-handed helix unfolds into a compact structure without distinct helical character. This intermediate conformation contains two stacked pairs of aromatic rings and may be a transition structure between the helices, although no actual conversion takes place during the MD simulation. Finally, the unfolded segment of 7.2 returns to a right-handed helix after another substantial RMSD change at approximately 60 ns.

Figure 7.16 RMSD of 100 ns MD simulation of 7.2

Figure 7.17 Three typical conformers of 7.2 from 100 ns MD Simulations (panels a, b, c)

The MD trajectory of 7.2 shows a detailed mechanism for the unfolding of a right-handed helix, illustrated by three key conformers in Figure 7.18. The unfolding, shown in side view, involves the rotation of a benzyl ring that initially points upward and forms part of the helix. As the ring rotates, the helix partially unwinds, and when the ring reaches a final position parallel to the central axis of the original helix, the helix is fully unwound. The final structure carries no indication of whether it came from a right- or a left-handed helix, which suggests a possible route for interconversion between the two forms, although no such interconversion was observed in this simulation. If the benzyl ring continued to rotate to the opposite orientation, the right-handed helix could become a left-handed helix.

Over the same MD simulation time, 7.2 was significantly more flexible than 7.1, which is approximately half its size. Each half of 7.2 behaved with relative conformational independence, and in one half the right-handed helix unfolded. This observation suggests a possible interconversion mechanism between the two helices. However, the simulation did not capture a complete transition from a right-handed to a left-handed helix, nor did it show the left-handed helix unfolding, so improved simulation methods are needed to explore the interconversion of the helices fully.

Replica Exchange Molecular Dynamics Simulation (REMD) of 7.1

REMD simulations of 7.1 were run with 12 replicas at different temperatures. The sampled conformers show more variety than those from the single-temperature MD simulation.

In the REMD simulation, the dihedral angle between two carbons of the central pyridine ring and the centroids of two benzyl rings in replica 4 varies between -50° and 50°. This variation was not seen in the 100 ns MD simulation.

The two ends of 7.1 show flexibility comparable to that seen in the single-temperature MD simulation. The carbon-carbon distances at the ends were measured and plotted (see Supporting Information Figure S6). Throughout the simulation these distances range from 4 to 10 Å, which indicates that the flexibility and orientation of the two terminal benzyl rings are not affected by the helix type at each end.

Figure 7.19 Distribution of dihedral angles in replica 4 of 7.1 (T = 295.9 K)

Helices are the common structures in the low-temperature replica trajectories. The clustering method from the Multiscale Modeling Tools for Structural Biology (MMTSB) Tool Set [38] was used to examine the trajectory of replica 0 at 230.0 K, and four conformers representing four clusters were identified. Two of these conformers are right-handed helices and the other two are left-handed helices (Figure S7).

The interconversion between the two helices did not take place within a single replica. After an exchange within a replica pair, the conformer from the lower-temperature replica is simulated at the higher temperature until the next exchange. This gives either helix the opportunity to move to a higher-temperature replica, overcome the interconversion barrier there, and then return to the lower-temperature replica.

The interconversion mechanism between the helices was mapped by careful tracing of the trajectories and is shown in Figure 7.20. Molecule 7.1 does not pass through an extended conformer; instead, a compact intermediate structure connects the helices (Figure 7.20 b and e). The conversion mainly involves the orientation of two aromatic rings, which initially align with the rest of the molecule in a right-handed helix (Figure 7.20 a). During the conversion these rings rotate counterclockwise and form an intermediate structure that remains compact, although the central helical structure unwinds. Continued counterclockwise rotation of the rings leads to a left-handed helix and completes the interconversion (Figure 7.20 c).

The top view of the process shows that the two aromatic rings rotate away from the central part of the molecule, which causes the helix to unwind. This rotation then allows the rings to fold back, giving a compact left-handed helix. The motion of the terminal benzyl rings is coupled to the conformational changes of the central section, so the orientation changes from right- to left-handed simultaneously during the helix interconversion.

This mechanism resembles the conformational change seen in the MD simulation of 7.2, although there are key differences. In that simulation, the rotation of a single benzyl ring produced an unwound helix, which shows how strongly this motion affects the helical structure. The rotation of this one ring could convert the right-handed helix in one half of 7.2 into a left-handed helix. The complete interconversion of the helix in 7.1 may provide supporting evidence for this proposed mechanism.

Replica Exchange Molecular Dynamics Simulation (REMD) of 7.2

REMD simulations of 7.2 were carried out with 14 replicas.

Conformational analysis of the REMD simulations indicates that compound 7.2 is as flexible as 7.1, with helical structures that can unwind or convert to the other helix type. The initial geometry of the REMD simulation contained one left-handed and one right-handed helical half. A conformer consisting entirely of right-handed helix was identified within the REMD trajectories, which indicates a conversion from a left- to a right-handed helix. Some trajectories also displayed conformers with unfolded helices, in which the unfolding involved the rotation of two aromatic rings, mirroring the intermediate structures seen during the helix conversion of 7.1. Although the actual conversion mechanism was not identified because of the difficulty in tracking the exchanges between replicas, it is reasonable to believe that a similar mechanism operates for the interconversion between the two helices of 7.2.

Figure 7.21 Sampled conformers of 7.2 from REMD simulation: (a) conformer with both halves in right-handed helix; (b) conformer with unfolded helix (upper part with arrows)

The REMD simulation of 7.2 shows that half of 7.2, or the whole of 7.1, serves as the conformational unit responsible for the helical structure. In the helical structures of 7.2, the two such units can have the same or different handedness. The interconversion mechanism identified in the REMD simulation of 7.1 applies to both halves of 7.2, and the conversion of one helix to the other can occur independently in either half. On browsing the REMD trajectories, many structures that are not pure helices have two aromatic rings, together with their link, rotated away from the helices (structures not shown). This double-ring unit is highly conserved whenever it is involved in the conformational change of the helices.

Conclusions

This study investigates the interconversion between right- and left-handed helices in alternating pyridinedicarboxamide/m-(phenylazo)azobenzene oligomers. A Monte Carlo (MC) algorithm with systematic torsional sampling was able to sample both helical forms of the oligomer. However, the MC results did not provide insight into the interconversion mechanism, because most high-energy conformers were extended structures. In addition, the MC simulations carry no connectivity information among the different conformers, which prevented the identification of a possible interconversion pathway between the helices.

The MC simulation of compound 7.1 revealed the limitations of applying a systematic search protocol to the larger compound 7.2. The MCMM search algorithm was therefore used for the conformational analysis of 7.2, but no change of helix was observed in the simulation results. As for 7.1, the conformers of 7.2 with elevated potential energies were extended structures. Consequently, the mechanism of helix interconversion remained undetermined in the MC simulation of 7.2 as well.

Molecular dynamics (MD) simulations of 100 ns were carried out for compounds 7.1 and 7.2, and the helices remained unchanged throughout the simulations. For 7.1, the main flexibility observed was the swinging motion of the two terminal benzyl rings, while the core right-handed helix remained stable. The flexibility of 7.2 was characterized by the movement of its two distinct helical halves, with the right-handed helix unwinding through the rotation of an aromatic ring. This unwinding did not alter the overall helical structure.

Replica-exchange molecular dynamics (REMD) was applied to simulate 7.1 and 7.2. Conversion between the two helices was observed in the REMD trajectories.

The conversion of the right-handed helix of compound 7.1 to a left-handed helix occurs through the rotation of the central pyridine ring and an adjacent benzyl ring, without passing through a fully extended structure. The intermediate structures along this conversion pathway remain compact. Similar helical conversions were observed in the simulations of compound 7.2; however, because of the complexity of the REMD simulation, the conversion path was not tracked across the different replicas. Throughout the REMD trajectories, many conformers had partially unwound helices, with the unwinding involving the rotation of two aromatic rings, mirroring the conversion mechanism observed for 7.1.

The helical structures of alternating pyridinedicarboxamide/m-(phenylazo)azobenzene oligomers can be understood on the basis of the REMD simulation of 7.1. The conversion of these helices does not require an extensive structural change; instead, a rotational motion of the two central aromatic rings unwinds and converts the helix. Because this conversion is local, any unit the size of 7.1 within an oligomer can change its helical form without disrupting the structure of the rest of the molecule.

This study presents a novel conversion mechanism for the helices of abiotic oligomers, established through molecular mechanics simulations. The findings offer insight into the structural and dynamic characteristics of these oligomers and may aid the design of new helical oligomers.

Introduction

In the previous chapter, we examined two azobenzene oligomers using various computational methods to investigate their conformational changes. Only replica-exchange molecular dynamics (REMD) effectively sampled both the right- and left-handed helices of these oligomers. The transition from the right-handed helix of 7.1 to the left-handed helix occurred without passing through a fully extended structure, with interconversion intermediates involving rotations of the central pyridine ring and an adjacent benzyl ring. A similar mechanism was observed for 7.2. These insights were obtained primarily by visual inspection of the simulation trajectories, but the large amount of data generated by REMD is harder to analyze than that from single-temperature molecular dynamics simulations. Therefore, we employed the weighted histogram analysis method (WHAM) in conjunction with principal component analysis (PCA) to analyze the REMD results and extract meaningful statistical information.

Oligomers 7.1 and 7.2 both have an odd number of pyridine rings in the linkage. The synthesis of oligomers with an even number of pyridine rings is more challenging than that of oligomers with an odd number. Two additional oligomers (8.1 and 8.2) were therefore constructed for REMD simulations, and their simulation results were analyzed and compared with those of oligomers 7.1 and 7.2.

Figure 8.1 Azobenzene oligomers with even numbers of pyridine rings

Methods

The REMD simulations of oligomers 8.1 and 8.2 were carried out using the AMBER8 program package, following the procedure described for 7.1 and 7.2. Oligomer 8.1 was simulated with 12 replicas at temperatures ranging from 230.0 K to 431.6 K, and oligomer 8.2 with 14 replicas at temperatures ranging from 230.0 K to 409.8 K.

The replica temperatures, extending up to about 430 K, were chosen to give good overlap of the potential energy distributions of adjacent temperature replicas. Exchanges were attempted every 10,000 molecular dynamics steps, equivalent to 10 picoseconds, resulting in 2,200 exchange attempts for each pair of replicas.
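The exchange step described here can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses the standard Metropolis criterion for temperature replica exchange, and the geometric temperature spacing and the function names `geometric_ladder` and `exchange_probability` are hypothetical choices for illustration, not details taken from the simulations themselves.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures spaced geometrically between t_min and t_max (an assumed spacing)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping two replicas.

    e_i, e_j: potential energies (kcal/mol) of the conformers currently
    simulated at temperatures t_i and t_j (K).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_j - e_i)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


# 12 replicas spanning the range quoted for oligomer 8.1
ladder = geometric_ladder(230.0, 431.6, 12)

# 2,200 exchange attempts, one every 10 ps, correspond to 22 ns per replica
simulated_ns = 2200 * 10 / 1000.0
```

A swap that moves the lower-energy conformer to the lower temperature is always accepted; the reverse swap is accepted with probability exp(-delta), which is why overlapping potential energy distributions between neighboring temperatures are needed for frequent exchanges.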

WHAM (ref. 1) is a general technique for combining data from several simulations to determine the thermodynamic properties of a system. It is particularly effective for calculating the potential of mean force (PMF) from histograms of a reaction coordinate obtained from multiple simulations. When the simulations are run at different temperatures, the analysis is known as the temperature weighted-histogram analysis method (T-WHAM). In this context, simulations at temperatures higher or lower than the target temperature are considered temperature-biased. The WHAM implementation used in this research was developed by Wang and colleagues. A brief overview of the WHAM formalism is given below.

For a given system, the Hamiltonian H_0(r) can be rewritten as a modified Hamiltonian in which λ_0 = 1 and V_0(r) = H_0(r). Here r denotes the coordinates, V_i(r) the biasing potentials, and λ_i the coupling parameters. For a total of R simulations, the ith simulation is carried out at temperature T_i = 1/(k_B β_i). The resulting probability histogram is P_{λ},β({V}, ξ).

The {λ}_i are the coupling parameters and n_m is the number of snapshots taken from the mth simulation. The braces in {V} denote the set of restraining potentials V_0, V_1, …, V_L. In the quantity f_j = β_j A_j, A_j is the Helmholtz free energy of the jth simulation, which is obtained by iterating equations 8.2 and 8.3. ξ is the reaction coordinate, and N_i({V}, ξ) is the histogram value at {V} and ξ for the ith simulation. To compute the properties discussed in this chapter, WHAM was used to combine the data from all replicas at the different temperatures.
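For orientation, the WHAM equations in their commonly quoted multi-histogram form are written out below using the symbols defined in this section. This is a reconstruction under assumptions rather than a transcription: the equations follow the standard formulation of Kumar and co-workers, and the equation numbers are assigned to match the references to equations 8.2 and 8.3 in the text.

```latex
% Assumed standard WHAM form; symbols follow the definitions in the text.
\hat{H}_{\{\lambda\}}(r) = \sum_{i=0}^{L} \lambda_i V_i(r),
  \qquad \lambda_0 = 1,\quad V_0(r) = H_0(r) \tag{8.1}

P_{\{\lambda\},\beta}(\{V\},\xi) =
  \frac{\displaystyle\sum_{i=1}^{R} N_i(\{V\},\xi)\,
        \exp\!\Big(-\beta \sum_{j=0}^{L} \lambda_j V_j\Big)}
       {\displaystyle\sum_{m=1}^{R} n_m\,
        \exp\!\Big(f_m - \beta_m \sum_{j=0}^{L} \lambda_{j,m} V_j\Big)} \tag{8.2}

\exp(-f_j) = \sum_{\{V\},\xi} P_{\{\lambda\}_j,\beta_j}(\{V\},\xi) \tag{8.3}
```

In the temperature-only case (T-WHAM) the coupling parameters are the same in every simulation, so the simulations differ only through β_m. Equations 8.2 and 8.3 are solved self-consistently: an initial guess for the f_m gives P, which gives updated f_j, until the values stop changing.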

To apply WHAM to the multiple simulations produced by REMD, it is essential to identify suitable reaction coordinates that represent the conformational changes of the oligomers. The variations in the helical structures are not necessarily captured by a single angle or distance. For this reason, principal component analysis (PCA), also known as quasi-harmonic analysis or the essential dynamics method, was used to derive "principal" coordinates that describe the overall conformational changes of the oligomers. This method offers a robust and efficient way to represent the conformational distribution of a multi-dimensional system through a small number of principal coordinates. PCA is based on the covariance matrix of the coordinates,

σ_ij = ⟨(q_i − ⟨q_i⟩)(q_j − ⟨q_j⟩)⟩ (8.4)

where q_1, …, q_3N are the mass-weighted Cartesian coordinates of the solute molecule and ⟨…⟩ denotes the average over all sampled conformations. Diagonalizing the covariance matrix σ gives 3N eigenvectors υ_n and eigenvalues λ_n, which are sorted in descending order so that λ_1 is the largest. The eigenvectors and eigenvalues give the modes of collective motion and their amplitudes. Previous research indicates that a large portion of the fluctuations of a system can be captured with only a few principal components. In addition to the standard PCA, various atomic pair distances and dihedral angles from each structure were used as reaction coordinates for the WHAM analysis; the definitions of these variables for each oligomer are given later in the chapter.
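The covariance and diagonalization steps of equation 8.4 can be illustrated with NumPy. The sketch below is not the analysis code used in this work; it assumes that the trajectory frames have already been superimposed on a common reference, and the function names `pca_modes` and `cumulative_fraction` are hypothetical.

```python
import numpy as np


def pca_modes(coords, masses):
    """PCA of mass-weighted Cartesian coordinates.

    coords: array (n_frames, n_atoms, 3), assumed already aligned to a reference.
    masses: array (n_atoms,).
    Returns eigenvalues (descending), eigenvectors (as columns), and the
    projection of every frame onto every eigenvector.
    """
    n_frames, n_atoms, _ = coords.shape
    # Mass-weight each atom's coordinates and flatten to 3N values per frame
    q = (coords * np.sqrt(masses)[None, :, None]).reshape(n_frames, 3 * n_atoms)
    dq = q - q.mean(axis=0)                  # q_i - <q_i>
    sigma = dq.T @ dq / n_frames             # covariance matrix, equation 8.4
    evals, evecs = np.linalg.eigh(sigma)     # symmetric matrix: ascending order
    order = np.argsort(evals)[::-1]          # re-sort so the largest comes first
    evals, evecs = evals[order], evecs[:, order]
    projections = dq @ evecs                 # principal-component coefficients
    return evals, evecs, projections


def cumulative_fraction(evals):
    """Fraction of the total fluctuation captured by the first k components."""
    return np.cumsum(evals) / np.sum(evals)
```

The columns of `projections` are the coefficients along each principal component; two of these columns can then serve as reaction coordinates for a free energy map.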

Results and Discussion

Replica Exchange Diagnostics

Several analyses support the validity of the REMD simulations. The temperature evolution of replica 6 of oligomer 7.1 (Figure 8.2) shows that the replica sampled a broad range of temperatures. This wide temperature range gives a broad distribution of potential energy over the MD frames of the replica (Figure 8.3), which allows the system to cross potential energy barriers between local minima. As a result, the conformational space is explored more extensively than in single-temperature MD simulations. In addition, the potential energy histograms of the 12 replicas of 7.1 at their target temperatures (Figure 8.4) overlap significantly, which contributes to a relatively high exchange rate among the replicas.

The Boltzmann distribution of the energy in each replica indicates that equilibrium was attained in this simulation. The overlapping potential energy distributions allow nearly constant exchange between adjacent replicas throughout the simulations.

Figure 8.5 shows the exchange rate of all replicas in the REMD simulation of 7.1. Exchange among the replicas occurred nearly continuously throughout the simulation, with only minor gaps on the line for each replica.

These diagnostics indicate that the replicas in the REMD simulations are well spaced, with balanced exchanges and a broad temperature range visited by each replica. The REMD simulations of oligomer 7.1 are therefore well behaved and should yield reliable thermodynamic information upon proper analysis. The REMD simulations of the other three oligomers (7.2, 8.1, and 8.2) show comparable diagnostics.
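One simple way to put a number on the overlap between the potential energy distributions of neighboring replicas is sketched below. The overlap measure (the summed bin-wise minimum of two normalized histograms) and the function name `histogram_overlap` are illustrative assumptions, not the diagnostic used to produce the figures in this chapter.

```python
import numpy as np


def histogram_overlap(energies_a, energies_b, bins=50):
    """Overlap of two potential energy distributions, between 0 and 1.

    Both samples are histogrammed on a common grid, each histogram is
    normalized to unit sum, and the bin-wise minimum is summed. A value of 1
    means identical distributions; 0 means no shared bins.
    """
    energies_a = np.asarray(energies_a, dtype=float)
    energies_b = np.asarray(energies_b, dtype=float)
    lo = min(energies_a.min(), energies_b.min())
    hi = max(energies_a.max(), energies_b.max())
    edges = np.linspace(lo, hi, bins + 1)
    p_a, _ = np.histogram(energies_a, bins=edges)
    p_b, _ = np.histogram(energies_b, bins=edges)
    p_a = p_a / p_a.sum()
    p_b = p_b / p_b.sum()
    return float(np.minimum(p_a, p_b).sum())
```

Applied to each pair of adjacent temperatures, a value well above zero for every pair is consistent with the frequent exchanges described in the text, while a pair with near-zero overlap would point to a gap in the temperature ladder.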

Figure 8.2 The temperature evolution of replica 6 in the REMD simulation of oligomer 7.1

Figure 8.3 Potential energy distribution of replica 6 in the REMD simulation of 7.1

Figure 8.4 Potential energy histograms of 12 replicas in the REMD simulation of 7.1

Figure 8.5 Exchange rate among the 12 replicas in the REMD simulation of 7.1. Each point represents an exchange between adjacent replicas along the molecular dynamics time. The frequent exchanges give nearly continuous lines for each replica, indicating a consistent exchange rate throughout the REMD simulation.

REMD simulations of 7.1

In the helical structures, the relative orientation of the central pyridine ring and the azobenzene rings is crucial. Two dihedral angles are defined in Figure 8.6 to describe this relative positioning of the rings; they take distinct combinations of values for the left- and right-handed helical structures, that is, the M and P configurations. The distribution of these dihedral angles therefore provides insight into the conformational space of compound 7.1.

Previous analyses of the simulation results highlighted the flexibility of the terminal regions of 7.1, where the end benzyl rings can either extend outward or fold inward toward the center of the molecule. For the T-WHAM analysis, two distances, referred to as Distance 1 and Distance 2, were therefore also selected, as shown in Figure 8.6.

For the WHAM analysis, the dihedral angles are defined by carbon atoms of the central pyridine ring and the ring centers of the azobenzenes. The distances are measured between carbon atoms of the central azobenzenes and the centers of the two terminal benzyl groups.
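A reaction coordinate of this kind can be evaluated from four points, some of which are ring centers rather than atoms. The following sketch shows one way to compute a signed dihedral angle from such points; which atoms and rings make up the four points is left open here, and the helper names `centroid` and `dihedral_deg` are hypothetical.

```python
import numpy as np


def centroid(points):
    """Geometric center of a set of atomic positions, e.g. the atoms of a ring."""
    return np.asarray(points, dtype=float).mean(axis=0)


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in [-180, 180]) defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 /= np.linalg.norm(b1)            # unit vector along the central axis
    # Components of b0 and b2 perpendicular to the central axis
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))
```

For example, `dihedral_deg(centroid(ring_a), c1, c2, centroid(ring_b))` would give an angle between two ring centers about an axis through two atoms, where `ring_a`, `ring_b`, `c1`, and `c2` are placeholder names for coordinates taken from a trajectory frame. Mirror-image conformers give angles of equal magnitude and opposite sign, which is what separates the M and P configurations on a free energy map.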

The contour plots of the Gibbs free energy (G) at 300 K projected onto dihedral angles 1 and 2 (Figure 8.7) show two basins of attraction along the diagonal. One basin, centered around -40° for both angles, corresponds to the M configuration of 7.1, a left-handed helix; the other, centered at approximately 40°, corresponds to the P configuration, a right-handed helix. The two helical structures are the only minima on the potential of mean force (PMF) surface defined by these dihedral angles, and both are global minima of equal free energy in the simulation. The symmetric attraction basins reflect the mirror-image relationship between the M and P configurations of 7.1. Although the representative structures of the M and P configurations shown in Figure 8.8 are not exact mirror images, their central parts are.
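The relation between a sampled distribution and a free energy map of this type can be sketched as G = -k_B T ln P on a two-dimensional histogram. The code below is a simplified illustration, not the T-WHAM procedure itself: the optional `weights` argument is a placeholder for per-frame weights that a WHAM analysis would supply for frames from other temperatures, and the function name `free_energy_surface` is hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_surface(x, y, weights=None, bins=36,
                        value_range=None, temperature=300.0):
    """Free energy map G(x, y) = -k_B T ln P(x, y), shifted so its minimum is 0.

    x, y: reaction-coordinate values per frame (e.g. two dihedral angles).
    weights: optional per-frame weights (assumed to come from a reweighting
    scheme such as WHAM); None treats all frames equally.
    Bins that were never visited are returned as +inf.
    """
    hist, x_edges, y_edges = np.histogram2d(
        x, y, bins=bins, range=value_range, weights=weights)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        g = -K_B * temperature * np.log(prob)
    g -= g[np.isfinite(g)].min()
    return g, x_edges, y_edges
```

With two equally populated, mirror-related clusters of dihedral angles, the resulting map has two basins of nearly equal depth on the diagonal, the pattern described for the M and P configurations of 7.1.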

Figure 8.7 The contour plots of the Gibbs free energy (G) at 300 K projected onto the dihedral angle 1 and dihedral angle 2 reaction coordinates in 7.1

Figure 8.8 Representative conformers with the M (left-handed helix) and P (right-handed helix) configurations of 7.1

Both dihedral angles span a range of approximately ±100°, and their strong coupling prevents them from taking arbitrary combinations of values, which hints at how the two helical structures interconvert. According to the PMF plot, the most efficient conversion pathway between the M and P regions runs along a straight line through the origin, where both dihedral angles are zero. At first this may seem physically impossible; however, the PMF plot shows that this region is not only accessible but also has the second lowest free energy. Because the dihedral angles are defined by the centroids of the rings, the benzyl rings can rotate out of the plane to relieve steric hindrance even when the angles are zero or close to zero. This pathway is consistent with the interconversion mechanism of 7.1 discussed previously. Note that the proposed pathway describes progress along the reaction coordinates, the two dihedral angles, rather than progress in time.

The contour plots of the free energy at 300 K projected onto the two distances (Figure 8.9) show two preferred positions for each end: an extended structure and a folded structure. These positions give four attraction basins on the PMF plot, located at the corners of a square, which indicates that the two distances are not coupled in the way the dihedral angles are. The range of each distance is approximately 4 to 9 Å. Outside this range the contour lines are dense, signifying a steep free energy surface; within it the contour line density decreases significantly, reflecting a flatter free energy surface.

Each end of the molecule can switch between two favored conformations, with preferred distances of 4.5 Å for the folded state and 8.5 Å for the extended state. The center of the PMF plot has a higher free energy than the four attraction basins at the corners. This indicates that the two ends of 7.1 seldom fold or unfold at the same time and are not correlated in their conformational changes.

Figure 8.9 The contour plots of the free energy at 300 K projected onto the distance 1 and distance 2 reaction coordinates for 7.1

The attraction basin in the bottom left corner corresponds to two folded ends, with both distances around 4.5 Å, and is significantly larger than the basin in the top right corner, which represents the conformer with both ends extended. Changing either end from folded to extended moves the structure from the bottom left basin to the top left or bottom right basin, both of which are intermediate in size. This indicates that structures with both ends folded are slightly more favorable.

PCA was carried out on the REMD simulation of 7.1, giving 369 components that correspond to the 369 degrees of freedom of the molecule. The components were sorted in descending order of their eigenvalues, and the accumulated eigenvalue plot (Figure 8.10) shows that the first few components account for the majority of the sum of all eigenvalues.

The PCA states with large eigenvalues carry most of the information about the overall conformational changes of the molecule. Visualization with Visual Molecular Dynamics (VMD) and the Interactive Essential Dynamics (IED) plug-in showed that states 1 and 3 are closely linked to the interconversion between the right- and left-handed helices. Specifically, the structure changes from a right- to a left-handed helix as the PCA1 coefficient goes from -40 to 10, and likewise as the PCA3 coefficient goes from -30 to 10. The free energy map plotted against principal components 1 and 3 (Figure 8.11) shows a single attraction basin at the center, even though two helical structures with separate free energy minima exist. This basin covers coefficient values from -40 to 10 for both components.

Figure 8.10 Accumulated eigenvalues from the principal component analysis (PCA) of 7.1. The PCA was performed on the REMD simulation of 7.1. The principal components were sorted according to their eigenvalues, and the accumulation of the eigenvalues was plotted against the principal component number. The first 35 of the 369 components are shown to illustrate the convergence of the accumulation. The first four components contribute more than 50% of the sum of all eigenvalues.

Figure 8.11 Contour plots of the free energy at 300 K projected onto PCA state 1 and 3 for 7.1

Figure 8.12 Major movements represented by PCA states 1 and 3 of 7.1. Picture generated by Visual Molecular Dynamics (VMD) with the Interactive Essential Dynamics (IED) plug-in.

Hence, the two helical structures are both covered by this large attraction basin. The modes of motion of PCA states 1 and 3, illustrated in Figure 8.12, are similar to each other.

These modes represent patterns of motion fitted to a quasi-harmonic free energy surface rather than actual molecular trajectories. In PCA state 1, most of the motion involves the two azobenzene rings: the right part of the structure exchanges position with the left, which transforms a right-handed helix into a left-handed one. PCA state 3 shows a comparable pattern of movement, but the main displacements are perpendicular to those in state 1; it also allows the molecule to switch between the two helical conformations. Together, these states describe the interconversion between the two helices, with the caveat that the depicted motions are not the actual paths taken by the molecule.

REMD simulations of 7.2

Six dihedral angles and two distances are defined in Figure 8.13 and were used for the WHAM analysis. There are three pyridine ring linkages between the azobenzene units in 7.2. Since each pyridine ring, combined with the azobenzene units on both sides, can form a helical structure, two dihedral angles are directly related to each pyridine ring. The analysis of 7.1 showed that the two molecular ends are correlated neither with each other nor with the folding and interconversion of the helical structures. A similar analysis of the correlation between the two ends was therefore not performed for 7.2. Instead, two distances between pyridine rings were defined and subjected to WHAM analysis.

In the WHAM analysis of 7.2, the six dihedral angles (Dihedral Angles 1 to 6) are defined by carbon atoms of the pyridine rings and the ring centers of the azobenzenes, and the two distances are defined by the centers of the three pyridine rings (Figure 8.13).

Figure 8.14 shows the contour plots of the free energy at 300 K projected onto dihedral angles 1 and 2, with three significant attraction basins. The central basin and the adjacent basin on the lower left lie along the diagonal.

Figure 8.14 The contour plots of the free energy at 300 K projected onto the dihedral angle 1 and dihedral angle 2 reaction coordinates for REMD simulation of 7.2

The PMF contour for these dihedral angles in the REMD simulation of 7.2 closely resembles that of 7.1, indicating that both molecules share the same structural unit. This similarity shows that this segment of 7.2 can adopt either a right- or a left-handed helix. In addition to the diagonal attraction basins representing the helical forms, there are two other basins: a major basin to the right of the central basin and a minor basin below it. These additional basins correspond to compact structures that are not helically folded and may represent intermediates on the interconversion pathway between the two helical structures. The off-diagonal attraction basin on the right contains a partially unfolded structure whose upper half, the part that defines dihedral angles 1 and 2, is unfolded, which suggests a role in the interconversion between the helices.

A conformer of 7.2 located in the off-diagonal attraction basin of Figure 8.14 has a value of 106° for dihedral angle 1 and 38° for dihedral angle 2. The upper section of the structure defines these angles.

The off-diagonal attraction basin below the central one is notably smaller, which indicates a thermodynamically less favorable configuration. In this region, dihedral angle 2 is approximately -100° and dihedral angle 1 is around 50°. Although the shapes and positions of the two off-diagonal basins are symmetric with respect to the second diagonal of the plot, one basin is significantly more favorable than the other. The difference between dihedral angles 1 and 2 lies in their connections: the structure defining dihedral angle 1 is linked to another azobenzene unit, whereas that defining dihedral angle 2 is connected to the free end of the molecule. The analysis therefore suggests that the unit defined by these dihedral angles unfolds more readily on the side connected to the azobenzene unit than on the side of the free end. This appears counterintuitive, since the free end would be expected to be more flexible. Further investigation of other parts of the molecule and of other molecules is required to confirm this finding.

The contour plots of the free energy at 300 K projected onto dihedral angles 3 and 4 (Figure 8.16) show two main attraction basins on the diagonal, indicating that this half of the molecule can adopt either a right- or a left-handed helical structure. The significant off-diagonal attraction basin lies beneath the central basin, where dihedral angle 4 ranges from -100° to -150° and dihedral angle 3 is around -50°. Conformers with dihedral angle 3 at 100° are considerably less favorable than those with dihedral angle 4 at -100°. As shown in Figure 8.13, dihedral angle 3 is associated with the free end of the molecule, while dihedral angle 4 is associated with the azobenzene moiety in the interior. This agrees with the finding above that structural units linked to interior moieties unfold more readily than those connected to the molecular ends, and suggests that the interconversion of the helical structure may start within the helix rather than at its ends.

Figure 8.16 The contour plots of the free energy at 300 K projected onto the dihedral angle 3 and dihedral angle 4 reaction coordinates for REMD simulation of 7.2

Dihedral angles 5 and 6 are not connected to a molecular end, and the structures that define them are completely symmetric. The PMF with respect to these angles, shown in Figure 8.17, has a pattern distinct from the previous plots. The contour plot has two significant attraction basins that are nearly symmetric with respect to a mirror plane parallel to the axis of dihedral angle 6.

Figure 8.17 The contour plots of the free energy at 300K projected onto the dihedral angle 5 and dihedral angle 6 reaction coordinates for REMD simulation of 7.2

The absence of a diagonal basin in the plot indicates that the molecular segment associated with dihedral angles 5 and 6 did not form helical structures during the simulation. The two attraction basins also show that the two dihedral angles behave differently: dihedral angle 6 fluctuates within ±150° with a global minimum near 0°, whereas dihedral angle 5 has two minima around ±100°. Preliminary analysis suggests that these two minima are linked to the two main configurations of 7.2 illustrated in Figure 8.18.

Configuration A is a linear structure, while in Configuration B the linearity is broken and the two halves fold toward each other into parallel positions. The dihedral angle is -120° in Configuration A and 150° in Configuration B.

The analysis shows that the two pyridine rings at either end serve as the centers of two independent helical structural units. These units can adopt different helical configurations at the same time, and together they can either form a linear structure or fold into two distinct halves, as in Configuration B of Figure 8.18. The central structural unit acts as a linkage that connects the two helical units.

The PMF plots in Figure 8.19 use the two distances defined in Figure 8.13 as reaction coordinates and show a large rectangular attraction basin in the lower right corner. This pattern suggests that the two coordinates are independent of each other. The basin extends from 5 Å to 10 Å for distance 1, while distance 2 has a narrower range of 6 Å to 9 Å. The left half of the molecule, corresponding to distance 1, is therefore more flexible than the right half, which corresponds to distance 2. This finding agrees with the WHAM analysis of dihedral angles 5 and 6, in which dihedral angle 6 has a broader range than dihedral angle 5.

Figure 8.19 The contour plots of the free energy at 300 K projected onto the distance 1 and distance 2 reaction coordinates for REMD simulation of 7.2

These analyses show that the two halves of 7.2 behaved differently, with the left half more flexible than the right half. In our simulation, the right half preferred a right-handed helical configuration, while the left half favored a left-handed one. Both halves sampled both helical forms according to the PMF plots, but these preferences and the greater flexibility of the left half persisted. The cause of this asymmetry cannot be explained on the basis of the current results.

REMD simulations of 8.1

Dihedral angles serve as the reaction coordinates for molecule 8.1 and are defined in Figure 8.20. This molecule has an even number of pyridine rings, which distinguishes it from 7.1 and 7.2. If 7.1 is regarded as the fundamental unit of the helical structures, 8.1 contains approximately one and a half such units, which makes it an interesting case for study.

The contour plots of the free energy at 300 K projected onto dihedral angles 1 and 2 (Figure 8.21) show two oval-shaped attraction basins that do not lie on the diagonal, indicating that these two dihedral angles behave differently in the simulation. The part of the molecule that defines these angles therefore does not readily adopt a proper helical structure. If the right half of the molecule is regarded as a helical structural unit, the left half acts as the linkage to another unit. On this view, the left half of 8.1 corresponds to the structure that defines dihedral angles 5 and 6 in 7.2, as shown in Figure 8.13.

Figure 8.20 Definitions of dihedral angles of 8.1 for WHAM analysis. Four dihedral angles are defined by carbon atoms from the central pyridine rings and the ring centers of the azobenzenes.

The PMF plot for dihedral angles 1 and 2 of 8.1 is notably similar to that for dihedral angles 5 and 6 of 7.2. Both plots show two attraction basins, on the left and right sides, corresponding to the two minima of dihedral angle 1 at approximately ±100°. Values near 0° are the least favorable for this angle. In addition, the range of dihedral angle 2 is reduced from ±150° to around ±50°, indicating less flexibility than dihedral angle 6 in 7.2.

Figure 8.21 The contour plots of the free energy at 300 K projected onto the dihedral angle 1 and dihedral angle 2 reaction coordinates for REMD simulation of 8.1

The PMF plots in Figures 8.14 and 8.16 indicate that an azobenzene unit linking two molecular moieties is more prone to unfold the helical structure than a unit connected to a molecular end. This is consistent with the reduced flexibility of the structure defining dihedral angle 2, which is connected to the molecular end: in the azobenzene oligomers, units at the molecular ends are less flexible than those in the interior.

The hypothesis that the right half of structure 8.1 acts as a helical structural unit while the left half serves as a linkage is supported by the PMF plots for dihedral angles 3 and 4. These plots show four significant attraction basins, two of which lie along the diagonal, indicating that the segment defined by these dihedral angles can fold into either right- or left-handed helices. The upper diagonal basin, representing the right-handed helix, is considerably larger than the lower basin for the left-handed helix, indicating a preference for right-handed helices. The off-diagonal basins suggest that the portion associated with dihedral angle 4 is more flexible than that of dihedral angle 3, which connects to the molecular end. This observation reinforces the finding that azobenzenes behave differently when linked to the molecular end. Furthermore, the attraction basin pattern in Figure 8.22 resembles that of Figure 7.16, which shows the PMF surface for dihedral angles 3 and 4 of structure 7.2.

Figure 8.22 The contour plots of the free energy at 300 K projected onto the dihedral angle 3 and dihedral angle 4 reaction coordinates for REMD simulation of 8.1

Structure 7.1 is the basic unit capable of forming helical structures in oligomers. Molecule 8.1, which comprises one and a half of these units, shows helical behavior on only one side, while the other side behaves differently.

REMD simulations of 8.2

For this molecule, which has four pyridine rings, eight dihedral angles are defined. The distances between the geometric centers of adjacent pyridine rings are also used as reaction coordinates, giving three distances. These parameters are illustrated in Figure 8.23.

If 7.1 is considered a helical structural unit, 8.2 contains two and a half such units.

Figure 8.23 Definitions of dihedral angles and distances of 8.2 for WHAM analysis. Eight dihedral angles are defined by carbon atoms of the pyridine rings and the ring centers of the azobenzenes; three distances are defined by the centers of adjacent pyridine rings.

The PMF plots in Figure 8.24 show four attraction basins on the surface of dihedral angles 1 and 2, two on the diagonal and two off the diagonal. The diagonal basins indicate that the structure can fold into right- or left-handed forms, with a clear preference for the left-handed helix, since the lower diagonal basin is significantly larger than the upper one. The off-diagonal basins are nearly equal in size, although the left-side basin is slightly more thermodynamically favorable than the bottom one, a pattern distinct from the previous plots of similar structural units. Furthermore, dihedral angle 2, which connects to the molecular end, shows flexibility comparable to that of dihedral angle 1, which connects the molecular moieties.

The structural components defined by dihedral angles 8 and 7 are symmetric with those defined by dihedral angles 1 and 2. Consequently, the PMF plots for dihedral angles 7 and 8 (Figure 8.25) are plotted so as to allow direct comparison with the PMF plots of dihedral angles 1 and 2 presented in Figure 8.24.

Figure 8.25 shows two diagonal attraction basins. The lower basin, representing a left-handed helix, is significantly more thermodynamically favorable than the upper basin, which represents a right-handed helix. The pattern of diagonal basins is similar to that in Figure 8.24, while the off-diagonal basins in the two plots have little in common. Notably, the end effect observed in simulations 6.2 and 7.1 is absent in these plots: dihedral angles 7 and 2 show flexibility comparable to that of angles 8 and 1. Comparison of structure 8.2 with the others suggests that the increase in structural size may diminish this effect.

Figure 8.24 The contour plots of the free energy at 300 K projected onto the dihedral angle 1 and dihedral angle 2 reaction coordinates for REMD simulation of 8.2

Figure 8.25 The contour plots of the free energy at 300 K projected onto the dihedral angle 7 and dihedral angle 8 reaction coordinates for REMD simulation of 8.2

Figure 8.26 shows the PMF plots projected onto dihedral angles 3 and 4. Two diagonal attraction basins indicate that helical structures are the most favorable configurations, and the right-handed helix is significantly more favorable than its left-handed counterpart. The plot also contains three main off-diagonal attraction basins, two at the far left and right and a third at the top, together with a minor basin at the bottom. These four off-diagonal basins are roughly symmetric about the diagonal, reflecting the overall symmetry of this structural segment. In addition, the proximity of the structural portion defining dihedral angle 4 to the molecular end biases the PMF surface towards certain regions.

Figure 8.26 The contour plots of the free energy at 300 K projected onto the dihedral angle 3 and dihedral angle 4 reaction coordinates for REMD simulation of 8.2

The PMF plots for dihedral angles 5 and 6, displayed in Figure 8.27, allow direct comparison with Figure 8.26, as these angles are symmetric to dihedral angles 3 and 4. The contour in Figure 8.27 shows a pattern similar to that in Figure 8.26, with four primary attraction basins: two diagonal and two off-diagonal. The diagonal basins indicate that the structure defined by dihedral angles 5 and 6 tends to adopt a helical configuration during the simulation. The larger upper diagonal basin indicates that this region samples the right-handed helix more frequently than the left-handed helix, consistent with the behavior of dihedral angles 3 and 4.

The two off-diagonal basins lie adjacent to the diagonal attraction basins and are aligned with them either vertically or horizontally. When a molecule moves from a diagonal basin to an adjacent off-diagonal basin, either dihedral angle 5 or dihedral angle 6 changes while the other remains constant. The configurations within these basins may act as intermediates for the conversion between the two helical structures. However, the present analysis provides no direct evidence for this hypothesis. At the very least, these attraction basins in the PMF plot reveal the flexibility of the folded helical structures in this foldamer.

Figure 8.27 The contour plots of the free energy at 300 K projected onto the dihedral angle 5 and dihedral angle 6 reaction coordinates for REMD simulation of 8.2

Figures 8.26 and 8.27 show that the segments defined by dihedral angles 3 and 4, as well as 5 and 6, function as helical structural units in this simulation. These adjacent units show distinct patterns in their PMF plots in the simulation of 8.1, whereas in the simulation of 8.2 they show similar patterns that reflect the symmetry of the structure. The primary difference between 8.1 and 8.2 is the molecular size. This analysis suggests that each unit, comprising one pyridine ring and its adjacent azobenzenes, mimics the behavior of helical structural units found in larger oligomers.

The PMF plots at 300 K for distances 1 and 2, shown in Figure 8.28, reveal minimal coupling between the two distances in this simulation. A prominent attraction basin lies at the lower left corner, spanning 5 Å to 9 Å for both distances, with upper limits around 15 Å. This pattern resembles the PMF plots of distances 1 and 2 for 7.2 (Figure 8.19). Similar trends are observed in the PMF plots for distances 1 and 3 and for distances 2 and 3 (not shown). The distances between adjacent pyridine rings therefore show little coupling and offer limited insight into the folding and unfolding of the helical structures.

Figure 8.28 The contour plots of the free energy at 300 K projected onto the distance 1 and distance 2 reaction coordinates for REMD simulation of 8.2

Conclusions

This chapter analyzes the REMD simulations of four azobenzene oligomers (7.1, 7.2, 8.1, and 8.2) using the T-WHAM method. All structures can form right- or left-handed helices, both of which are minima on the PMF surfaces. The 7.1 structural unit is the fundamental element capable of folding into helical forms, while the end units behave differently from the rest of the molecule, with this end effect diminishing as the oligomer size increases. Statistically, oligomers with even and odd numbers of pyridines behave similarly. Furthermore, as the oligomer size increases, all units comprising a pyridine ring and adjacent azobenzene moieties function as basic helical structural units. Notably, the interconversion between right- and left-handed helices can occur independently of the molecular ends, as each helical structural unit can locally unfold and adopt the alternate helical configuration.

In summary, the oligomers can form distinct helical structures, which can be either right- or left-handed. The local interconversion between these helices may occur through the unfolding of structural units. This understanding of azobenzene oligomers offers insights that can aid in the development of improved foldamers with tailored properties.

Introduction

Oligosaccharides are crucial in various biological processes and have recently attracted significant interest. In organic chemistry, there is a growing focus on developing stereocontrolled methods for their synthesis. While previous research concentrated on oligosaccharides with pyranose residues, the synthesis of those containing furanose residues is now receiving increased attention.

Mycobacterium tuberculosis and Mycobacterium leprae are responsible for tuberculosis and leprosy, respectively. Both pathogens have oligosaccharides in their cell walls, with furanoses serving as the primary carbohydrate units.

Oligofuranosides contribute to the low permeability of drugs through the Mycobacterium tuberculosis cell wall, complicating tuberculosis treatment. This impermeability is exacerbated by mycolic acids that are tightly packed and anchored to the cell membrane by an oligosaccharide of arabinofuranose residues. In contrast, pyranose rings are more prevalent in mammalian biology owing to their thermodynamic stability. This difference makes mycobacterial oligosaccharides key targets for drug design against tuberculosis. Our objective is to develop synthetic techniques for these oligosaccharides while gaining insight into cell permeability and conformational effects, with the ultimate aim of improving drug development and disease treatment.

Furanosides have been largely neglected, and their synthesis remains challenging. The synthesis of 1,2-cis furanosides is particularly difficult compared to that of 1,2-trans furanosides.

A novel methodology has been developed for the synthesis of β-arabinofuranosides with a high degree of stereoselectivity. The process begins with the reaction of a glycosyl sulfoxide with triflic anhydride, followed by the addition of an alcohol, leading to the formation of the desired glycoside. The subsequent step is the regioselective opening of an epoxide ring using LiOBn, with the regioselectivity significantly influenced by the presence of (-)-sparteine.

In the case of 9.2, this process leads to opening at the C3 rather than the C2 position of the anhydrosugar, resulting in the final β-arabinofuranoside, 9.3.

The mechanism of the epoxide ring-opening reaction for furanosides remains underexplored. Research indicates that regioselective ring opening at C3 occurs exclusively when a lithium alkoxide serves as the nucleophile in the presence of (-)-sparteine. Notably, earlier studies by Lowary and colleagues revealed that the chirality of sparteine does not influence this regioselectivity.

This mechanism (Scheme 9.2) suggests that the complex could selectively activate nucleophilic attack at C3 owing to the tight coordination of sparteine and Li+ to the top face of the epoxide.

The orientation of the group at the C5 position significantly affects the stability of the Li+ complex and the preference for attack at the C3 position. In our initial study, we explored the reaction pathways for C2 and C3 epoxide ring opening with various C5 substituents, both in the presence and absence of lithium cations. To further investigate how sparteine catalyzes the epoxide ring-opening reaction, we used density functional theory (DFT) to locate the transition states for the C2 and C3 reactions of different furanoses. We also used molecular dynamics (MD) methods to simulate systems with different C5 substituents in order to understand conformational flexibility and Li+ cation binding. Finally, ab initio molecular dynamics was applied to simulate the epoxide ring-opening process using a direct dynamics protocol.

Methods

Epoxide ring-opening reactions

Several systems were analyzed to investigate the influence of different components on the epoxide ring-opening reaction, focusing on the preference for C3 over C2 attack. To reduce the computational cost, N,N,N’,N’-tetramethylpropanediamine was used as a simplified model for sparteine. The DFT calculations were carried out on a model system comprising various furanose rings, a lithium ion, and the diamine. Because of the high computational demands of larger models, only a select number of structures were examined with explicit (-)-sparteine.

Geometry optimizations for minima and transition states were carried out at the B3LYP/6-31G* level of theory in the Gaussian 98 program suite. The vibrational frequency calculation for each transition state gave exactly one imaginary vibrational frequency, confirming that it is a first-order saddle point. To verify the reactants and products associated with each transition state, the geometry was displaced by 10% along the normal coordinate of the imaginary frequency in each direction and then optimized towards the reactant and the product. This procedure established the connections for the epoxide ring-opening process. The vibrational frequency calculations also provided a scaled zero-point vibrational energy correction, along with unscaled thermal and entropic corrections to the free energy.

Single-point energy calculations at the B3LYP/6-311+G** level of theory were performed for each of the stationary points found at the lower level of theory.

Single-point energies were calculated for the stationary points using the polarizable continuum model (PCM) at the B3LYP/6-311+G**//B3LYP/6-31G* level of theory to assess the effect of implicit solvation. This analysis covered systems with (a) the lithium cation positioned over the ring, (b) the diamine with lithium, and (c) (-)-sparteine. Methanol was selected as the solvent model to match the experimental conditions closely. A limitation of the DFT approach is the absence of explicit solvent molecules, which prompted the use of a molecular dynamics approach for further analysis.

Molecular Dynamics simulation of sparteine, sugar ring

Five furanose rings and (-)-sparteine were constructed and optimized using MacroModel version 6.5. These rings were then combined with the (-)-sparteine-Li+ complex to create systems analogous to structure 9.5 for molecular dynamics (MD) simulations. Mulliken atomic charges for each molecule were calculated at the B3LYP/6-31+G* level of theory. The MD simulations were carried out with the AMBER7 package, using different force fields for different components of the system: the GLYCAM force field for the sugar rings and the general AMBER force field (GAFF) for the (-)-sparteine unit. Methanol was included explicitly as the solvent, in an octahedral box with periodic boundary conditions and a distance of 12 Å between the solute and the box sides. Each MD simulation began with a minimization of the solvated system using the sander module.

The system was gradually heated from 0 to 300 K at constant volume over four runs. The production MD run was carried out at a constant pressure of 1 bar and a temperature of 300 K. SHAKE bond-length constraints were applied to all bonds involving hydrogen atoms, and a cutoff of 8.0 Å was used for nonbonded interactions.

Figure 9.1 Furanose Rings with Different C5 Substituents

CPMD simulation method

To observe the epoxide ring-opening reaction dynamically, we used the Car-Parrinello molecular dynamics method as implemented in the CPMD 3.9.1 package. Ab initio molecular dynamics simulations based on the BLYP functional were generated for the selected systems. We employed Goedecker-type norm-conserving pseudopotentials, with the electronic wave functions expanded in plane waves up to a cutoff of 70 Ry. A fictitious mass of 600 a.u. was assigned to the plane-wave coefficients, and the equations of motion were integrated numerically with a time step of 0.0968 fs. The simulations were run until ring opening of the epoxide was observed. The simulation times ranged from 38 to 97 fs.
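As a quick check on the scale of these simulations, the time step and run lengths can be converted as sketched below. The conversion factor for the atomic unit of time is a physical constant; the observation that 0.0968 fs is close to 4 atomic units is our own reading of the value and is not stated in the text. The helper names `fs_to_au` and `n_steps` are hypothetical.

```python
import math

AU_TIME_FS = 0.02418884326  # one atomic unit of time, in femtoseconds

def fs_to_au(dt_fs):
    """Convert a time in femtoseconds to atomic units of time."""
    return dt_fs / AU_TIME_FS

def n_steps(total_fs, dt_fs):
    """Number of integration steps needed to cover total_fs (rounded up)."""
    return math.ceil(total_fs / dt_fs)

dt = 0.0968  # fs, time step quoted in the text
print(f"time step = {fs_to_au(dt):.3f} a.u.")
print(f"steps for 38 fs: {n_steps(38.0, dt)}, steps for 97 fs: {n_steps(97.0, dt)}")
```

The trajectories are therefore on the order of a few hundred to about a thousand steps each, which is why only a handful of ring-opening events could be followed.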

Results and Discussion

Gas-phase and PCM potential energy surfaces

To investigate the function of sparteine at an affordable computational cost, N,N,N’,N’-tetramethylpropanediamine was used as a model system to form a complex, as shown in Figure 9.2. Calculations were limited to compounds 9.6, 9.7, and 9.8, with C5 substituents of O-, OH, and NH2, respectively, to determine how well the preferences for ring-opening in these systems were predicted before performing calculations on systems that have not yet been studied experimentally. In this vein, we hoped to direct the targeted synthesis of additional derivatives in the future.

For these species, we considered only the gg rotamers around the C4-C5 bond, as this conformer was identified as the most stable and gave the lowest-energy transition state in our lithium-complexed calculations. However, the computational results did not reproduce the experimental trends. For compound 9.6, gas-phase calculations indicated a preference for C-2 attack by 0.4 to 0.9 kcal/mol. For compound 9.8, the preference for C-3 attack was predicted correctly, but only by up to 1.4 kcal/mol, which fails to explain the exclusive formation of the C-3 product observed experimentally. Including solvation effects through PCM single-point energy calculations with methanol did not yield accurate predictions either, although the preference for C-2 attack in 9.6 was reduced to 0.2 kcal/mol. Current experimental investigations are assessing the effect of a model diamine compared to (-)-sparteine as an additive, but preliminary data suggest that the propanediamine model does not adequately reproduce the observed product formation.
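To see why an energy difference of about 1 kcal/mol cannot account for an exclusive product, the difference can be converted to a product ratio. The sketch below assumes kinetic control with both pathways starting from a common, rapidly equilibrating precursor, so that the ratio is exp(-ΔΔG‡/RT); it also treats the quoted 1.4 kcal/mol as a free energy difference, which is an assumption on our part. The function name `c3_to_c2_ratio` is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def c3_to_c2_ratio(ddg_c3_minus_c2, temperature=298.15):
    """C3:C2 product ratio from a transition-state free energy difference.

    ddg_c3_minus_c2 follows the (C3 attack) - (C2 attack) sign convention,
    so a negative value favors C3 attack. Assumes kinetic control.
    """
    return math.exp(-ddg_c3_minus_c2 / (R_KCAL * temperature))

for ddg in (-1.4, 0.4, 0.9):
    print(f"ddG = {ddg:+.1f} kcal/mol -> C3:C2 = {c3_to_c2_ratio(ddg):.2f} : 1")
```

Under these assumptions a 1.4 kcal/mol preference corresponds to a ratio of roughly 11:1, so a measurable amount of the C-2 product would still be expected.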

Figure 9.2 Transition states with N,N,N’,N’-tetramethylpropanediamine for C2 and C3 attack of the ring-opening of the epoxide at the B3LYP/6-31G* level

Columns: C-5 substituent; δΔE BW ts, gas phase (a); δΔH 0 ts, gas phase (a); δΔG 298 ts, gas phase (a)

a Differences are given as (C3 attack) - (C2 attack); a negative value indicates a preference for C3 attack. The experimental results were obtained using Li+ and (-)-sparteine. All transition states have the gg orientation around the C4-C5 bond. See Figure 9.2 for representative structures.

Table 9.1 Transition-state preferences for C3 vs C2 ring-opening of the epoxide for the anion, Li+, and tetramethylpropanediamine model at the B3LYP/6-311+G**//B3LYP/6-31G* level of theory (kcal/mol)

A simple tetramethylpropanediamine model may not accurately represent the structure of (-)-sparteine, so the complete system was also examined. The resulting complex is shown in Figure 9.3. Calculations were limited to systems with available experimental data (9.6, 9.8, and 9.11) because of the high cost of transition-state searches at the B3LYP/6-31G* level for large molecules. The results, summarized in Table 9.2, show that the gas-phase calculations failed to predict the correct trends across all systems. For system 9.8 (C5=NH2), C-3 attack was correctly predicted to be favored by approximately 1 kcal/mol (ΔG298). In contrast, for 9.6 (C5=O -), C-2 attack was favored by 0.9 kcal/mol at 298 K. Single-point energies calculated using PCM with methanol as the solvent changed these preferences but did not reproduce experiment consistently. For 9.6, C-3 attack was favored by about 0.7 kcal/mol over C-2, while for 9.8, the PCM calculations incorrectly suggested a preference for C-2 attack by 0.8 kcal/mol, contradicting the experimental observation that C-3 attack yielded the only observable product.

Figure 9.3 Transition states with (-)-sparteine for C2 and C3 attack of the ring-opening of the epoxide at the B3LYP/6-31G* level

Columns: Furanose; C-5 substituent; δΔE BW ts, gas phase (a); δΔH 0 ts, gas phase (a); δΔG 298 ts, gas phase (a)

11 N3 -1.10 -1.32 -2.06 2.53 1.0 : 8.0

a Differences are given as (C3 attack) - (C2 attack); a negative value indicates a preference for C3 attack. The experimental results were obtained using Li+ and (-)-sparteine. All transition states have the gg orientation around the C4-C5 bond. See Figure 9.3 for representative structures.

Table 9.2 Transition-state preferences for C3 vs C2 ring-opening of the epoxide for the anion, Li+, and (-)-sparteine model at the B3LYP/6-311+G**//B3LYP/6-31G* level of theory (kcal/mol)

Density functional theory (DFT) did not reliably predict the preferred site of attack on the epoxide by an alkoxide in the presence of a lithium ion. Various model systems were tested to assess the influence of different reaction components, but none captured the observed trends. This suggests that explicit solvation may significantly affect the experimental preferences, whereas our analysis included only implicit solvation through the PCM method. We therefore set out to investigate the effects of explicit solvation in this intricate system.

Molecular Dynamics simulations

Although DFT methods were used to locate the transition states for C2 and C3 attack during the epoxide ring-opening, the calculated preferences did not agree with the experimental findings, leaving the regioselectivity induced by (-)-sparteine inadequately explained. To investigate the conformational flexibility and the effect of explicit solvation, we carried out MD simulations on complexes formed by (-)-sparteine and Li+ with five furanose rings, labeled MD9.6 to MD9.10 for clarity. The details of the MD setup for each complex are given in Table 9.3.

System (with (-)-sparteine and Li+) a

Length of the simulation [ps]

MD9.10 76 461 41.2 0.614 2000 10

a See Figure 9.1 for structures.

9.3.2.1 Molecular Dynamics Simulation for MD9.6 (C5=O -)

Experimental results indicate that (-)-sparteine is an effective and stereoselective catalyst when paired with Li+. Monte Carlo conformational sampling using the MacroModel program shows that (-)-sparteine adopts a "V" shape, which accommodates the Li+ ion well. The stability of the (-)-sparteine-Li+ complex in methanol solution was confirmed by a 2 ns molecular dynamics simulation, which showed strong coordination of Li+ to the amine nitrogens of (-)-sparteine, with distances between Li+ and the nitrogen atoms (N1 and N2) ranging from 2.0 to 2.5 Å. While coordination to (-)-sparteine restricts the mobility of Li+ compared to its free state in methanol, the structure and flexibility of (-)-sparteine allow Li+ some movement around the furanoside substrate. This balance of constraints is crucial for the catalytic ring-opening mechanism. The N1-Li-N2 angle, which reflects the orientation of the two branches of (-)-sparteine, varies primarily between 110° and 120°, with extreme values of 100° and 130°, indicating the flexibility of the sparteine-Li+ complex. Because the Li-N distances remain stable, larger N1-Li-N2 angles correspond to shorter distances between (-)-sparteine and Li+.

Figure 9.4 Atomic pair distance fluctuation in MD simulation of MD9.6: Li-N1 (black), Li-N2 (red), Li-O5 (green), Li-OE (blue) and Li-OF (violet)

Figure 9.5 N1-Li-N2 angle for (-)-sparteine and Li+ complexes for (a) MD9.6 (black) and MD9.7 (red); (b) MD9.8 (black), MD9.9 (red) and MD9.10 (green)

In the MD9.6 system, the substrate 9.6 (C5=O -) carries a negative charge on the deprotonated hydroxyl group, leading to a strong electrostatic interaction with the lithium cation. This interaction significantly influences the orientation of the (-)-sparteine-Li+ complex associated with 9.6. As shown in Figure 9.4, the distance between the lithium ion (Li+) and the negatively charged oxygen at C5 (O5) remains stable, ranging from 1.8 to 2.0 Å.

Visualization of the MD trajectory with Visual Molecular Dynamics (VMD) showed that the catalytic (-)-sparteine-Li+ complex predominantly occupies the C5 side of the furanose ring, which influences the orientation of O5 relative to the ring. Analysis of the C3-C4-C5-O5 dihedral angle indicates that rotation about the C4-C5 bond can access only two of its three rotational minima, and both orientations direct the oxygen atom away from the furanose ring. This biased distribution arises from significant steric hindrance from (-)-sparteine and the C1-OCH3 glycosidic group. Consequently, Li+ cannot readily approach the furanose ring or the epoxide oxygen (OE). The Li-OE distance distribution shows that while Li+ generally remains distant from OE, it can approach within 2.5 Å, where strong coordination interactions occur. In contrast, the distance between Li+ and the furanose ring oxygen (OF) stays around 5 Å and shows minimal correlation with the Li-OE distance.
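The dihedral analysis described here can be sketched for a single frame as follows. This is a minimal illustration using the standard four-point dihedral formula, not the analysis tooling used in this work; the function names and the choice of ideal staggered wells at -60°, 60°, and 180° are assumptions made for the example.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (e.g. C3, C4, C5, O5)."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 /= np.linalg.norm(b1)              # unit vector along the central bond
    v = b0 - np.dot(b0, b1) * b1          # b0 projected onto the plane normal to b1
    w = b2 - np.dot(b2, b1) * b1          # b2 projected onto the same plane
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))

def rotamer_well(angle, centers=(-60.0, 60.0, 180.0)):
    """Nearest staggered well for an angle in degrees, respecting periodicity."""
    diffs = [abs((angle - c + 180.0) % 360.0 - 180.0) for c in centers]
    return centers[int(np.argmin(diffs))]
```

Applying `rotamer_well` to the C3-C4-C5-O5 angle of every frame and counting the occupancy of each well gives a distribution of the kind shown in Figure 9.6; a well with zero population corresponds to an inaccessible rotamer.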

Figure 9.6 Distribution of exocyclic dihedral angles from MD simulations for MD9.6 (black), MD9.7 (red), and MD9.10 (green)

At the start of the simulation, the coordination number of Li+ is three. As the distance between Li+ and the epoxide oxygen (OE) decreases to 2 Å, the coordination number increases to four. Coordination numbers below five are attributed to the strong electrostatic interaction between Li+ and the O- of the C5 substituent in MD9.6. This interaction limits the flexibility of the (-)-sparteine-Li+ complex and restricts the access of Li+ to OE, so that Li+ approaches predominantly from the C3 side.
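A coordination number of this kind can be obtained per frame by counting the donor atoms within a cutoff distance of Li+. The sketch below is illustrative only: the 2.5 Å cutoff is an assumption taken from the distance at which strong coordination is described above, and the two sets of distances are made-up values chosen to mimic the situations described, not data from the trajectories.

```python
def coordination_number(distances, cutoff=2.5):
    """Count donor atoms whose distance to Li+ (in Å) is within the cutoff.

    distances maps a donor label (e.g. "N1", "OE") to its Li+ distance.
    The default cutoff of 2.5 Å is an assumed value, not one from the source.
    """
    return sum(1 for d in distances.values() if d <= cutoff)

# Hypothetical frames: Li+ far from the epoxide oxygen, then close to it.
frame_far = {"N1": 2.1, "N2": 2.2, "O5": 1.9, "OE": 4.0, "OF": 5.0}
frame_near = {"N1": 2.1, "N2": 2.2, "O5": 1.9, "OE": 2.0, "OF": 5.0}
print(coordination_number(frame_far), coordination_number(frame_near))
```

With these illustrative distances the count goes from three to four when Li+ reaches OE, matching the change described in the text.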

To observe the dynamic behavior of the same system in the absence of this electrostatic interaction, we ran a similar MD simulation on the MD9.7 system.

9.3.2.2 Molecular Dynamics Simulation for MD9.7 (C5=OH)

The MD9.7 (C5=OH) complex remained stable throughout the simulation, while sparteine and Li+ showed significant flexibility owing to the absence of strong electrostatic interactions. The N1-Li-N2 angle showed two primary distributions, centered at approximately 100° and 120°, indicating two stable conformations of the (-)-sparteine-Li+ complex. The distances between Li+ and the nitrogen atoms remained between 2.0 Å and 2.5 Å, so these two angle distributions correspond to distinct positions of (-)-sparteine relative to Li+. A smaller N1-Li-N2 angle places Li+ further from (-)-sparteine and closer to the furanose ring, while a larger angle gives the opposite arrangement.

Figure 9.7 Atomic pair distance fluctuation in simulation of MD9.7: (a) Li-N1 (black), Li-N2 (red), and Li-O5 (green); (b) Li-OE (black), Li-OF (red) and Li-OG (green)

By bringing N1 and N2 closer together, the sparteine "fingers" push the Li+ ion away from (-)-sparteine and towards the substrate as the N1-Li-N2 angle decreases. Conversely, as the angle increases, Li+ moves back towards (-)-sparteine. This motion causes the distance between Li+ and OE to fluctuate.

We observed two contact modes with the substrate, a close contact at about 3 Å and a far contact at about 4 Å (Figure 9.7b). Our DFT calculations showed that during the epoxide ring-opening reaction, the distance between Li+ and the epoxide oxygen (OE) is approximately 2 Å. This close proximity allows coordination between OE and Li+, which is crucial for initiating the epoxide ring-opening process.

In the MD9.7 simulation (C5=OH), the coordination number of Li+ is between 4 and 5, higher than in the MD9.6 simulation. The two nitrogen atoms of (-)-sparteine bind tightly to Li+ throughout, while the furanose ring oxygen (OF) is also a stable ligand, staying within 3 Å of Li+ during most of the simulation. The distances between Li+ and the glycosidic methoxy oxygen (OG) and between Li+ and OE fluctuate in a similar manner, with closest distances around 2 Å, which allows the coordination number to reach 5. However, OF is a stronger ligand than OG and OE, likely because there is less steric hindrance around OF, which accommodates (-)-sparteine. In addition, the C3-C4-C5-O5 dihedral angle indicates that the hydroxyl group can rotate over the furanose ring and maintains this orientation throughout the simulation. The dihedral angle distributions of MD9.6 and MD9.7 show that the glycosidic methoxy group remains outside the ring to avoid interference with the sparteine unit, keeping the anomeric oxygen accessible to Li+.

9.3.2.3 Molecular Dynamics Simulation for MD9.8 (C5=NH2)

Replacing the C5 hydroxyl group in 9.7 (C5=OH) by an amino group provided 9.8 (C5=NH2), which was used to form system MD9.8. This substitution appeared to decrease the stability of the complex. After less than one nanosecond, 9.8 was released from the (-)-sparteine-Li+ complex.

Before the release of the substrate, the distances from Li+ to three oxygen atoms were between 2 and 3 Å, indicating strong coordination. In contrast, the distance between Li+ and the nitrogen of the C5 amino group ranged from 3.5 Å to 6.5 Å, significantly larger than the Li-O5 distances.

The interaction between Li+ and the amino nitrogen in MD9.8 (C5=NH2) appears to be weaker than the Li-O5 interaction in MD9.7 (C5=OH), resulting in the release of the substrate from the sparteine-Li+ complex.

The release of 9.8 (C5=NH2) revealed an intriguing conformational pattern in the (-)-sparteine-Li+ complex, characterized by two minima in the N1-Li-N2 angle at average values of 90° and 120°. In the 90° configuration, Li+ is positioned near substrate 9.8, while in the 120° configuration, it is closer to (-)-sparteine. The distance between Li+ and each nitrogen atom is approximately 2.2 Å, and the distance from Li+ to the line connecting N1 and N2 decreases by about 0.5 Å as the angle changes from 90° to 120°. Changes in the sparteine conformation can therefore move Li+ by up to 1 Å along the bisector of the N1-Li-N2 angle, towards the epoxide oxygen. Experimentally, (-)-sparteine has been shown to selectively catalyze the epoxide ring-opening reaction at the C3 position. The MD simulations above indicated that the sparteine-Li+ complex resides predominantly on the C3 side of the sugar ring, owing to steric hindrance from the glycosidic methoxy group at C1, which may lead to a preference for nucleophilic attack from below on the C3 side, resulting in C3 ring-opening.
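The shift of about 0.5 Å follows from simple geometry and can be checked as sketched below. The sketch assumes an isosceles N1-Li-N2 triangle with both Li-N distances fixed at 2.2 Å, in which case the distance from Li+ to the N1-N2 line is r cos(θ/2); the function name `li_to_nn_line` is hypothetical.

```python
import math

def li_to_nn_line(li_n_distance, n_li_n_angle_deg):
    """Distance from Li+ to the N1-N2 line for equal Li-N distances (in Å).

    Assumes an isosceles N1-Li-N2 triangle, so the foot of the perpendicular
    lies on the bisector of the N1-Li-N2 angle.
    """
    return li_n_distance * math.cos(math.radians(n_li_n_angle_deg) / 2.0)

d90 = li_to_nn_line(2.2, 90.0)
d120 = li_to_nn_line(2.2, 120.0)
print(f"90 deg: {d90:.2f} A, 120 deg: {d120:.2f} A, shift: {d90 - d120:.2f} A")
```

With these values the distance changes from about 1.56 Å to 1.10 Å, a shift of roughly 0.46 Å, consistent with the figure of about 0.5 Å quoted above.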

The MD simulations show distinct positions for Li+ in the C2 and C3 ring-opening reactions, as illustrated in Figure 9.3. The (-)-sparteine group plays a dual role in catalysis. First, through coordination of Li+ to its two nitrogen atoms, it positions Li+ on the top face of the anhydrosugar and closer to the C3 side, which increases the likelihood of C3 ring-opening. Second, (-)-sparteine can move its arms to direct Li+ towards the epoxide oxygen, enabling Li+ to approach from the C3 side.

Analysis of the trajectories of the sparteine-Li+ complex reveals the significance of its flexibility, particularly in relation to its two conformations. Here we focus on the scissoring motion of (-)-sparteine as it separates from the substrate, which highlights the dynamic interactions that influence the behavior of the complex.

The inversion of a nitrogen atom in the sparteine unit resembles the conversion between chair and boat conformations of the six-membered ring shared by rings 1 and 2. While the other rings retain their chair conformation, this nitrogen inversion significantly affects the coordination of Li+ and its orientation towards the epoxide oxygen. Throughout the molecular dynamics simulation, (-)-sparteine oscillates between these two conformations, causing Li+ to shift back and forth relative to the furanose ring.
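The oscillation between the two conformations can be quantified by assigning each trajectory frame to a state and counting the switches. The following is a minimal sketch, not the analysis used in the original work: the 105° boundary (midway between the 90° and 120° minima), the state labels, and the angle values are all assumptions chosen for illustration.

```python
def count_transitions(angles_deg, boundary=105.0):
    """Label each frame 'narrow' or 'wide' and count state changes.

    boundary is a hypothetical dividing angle between the two conformations.
    """
    states = ["narrow" if a < boundary else "wide" for a in angles_deg]
    flips = sum(1 for s, t in zip(states, states[1:]) if s != t)
    return states, flips

# Made-up N1-Li-N2 angles (degrees) for eight consecutive frames.
angles = [88, 92, 95, 118, 122, 119, 91, 89]
states, flips = count_transitions(angles)
print(states, flips)
```

In practice a single threshold can overcount switches when the angle fluctuates near the boundary, so a dead band around the dividing angle may be preferable.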

9.3.2.4 Molecular Dynamics Simulation for the System MD9.9 (C5=C6H5CH2OCH2O-)

By replacing the amino group in 9.8 (C5=NH2) with a relatively large group, C6H5CH2OCH2O-, we built our fourth glycoside, 9.9, to form the system MD9.9.

ab initio Molecular Dynamics Simulation Results

In the previous sections, we discussed the classical molecular dynamics (MD) results for five systems obtained with a molecular mechanics force-field approach. However, an empirical force field does not allow bond formation or breakage to be observed during MD simulations. To capture the epoxide ring opening, a higher-level theoretical treatment is necessary. We therefore employed the Car-Parrinello MD (CPMD) program in conjunction with the DFT method to simulate the ring-opening reactions in our selected systems. Our primary aim is to determine whether the scissoring motion of the N1-Li-N2 unit in (-)-sparteine is reproduced during the CPMD ring-opening process, since the lithium ion catalyzes this reaction by coordinating to the epoxide oxygen atom.

Because of the high computational demands of CPMD simulations, explicit solvation was not feasible, and we focused on the dynamical properties of the (-)-sparteine-Li+ unit. We added the nucleophile CH3O- to the existing systems MD9.7 to MD9.10, giving four new complexes (CPMD9.7 to CPMD9.10) for epoxide ring-opening reactions. The epoxide ring-opening process was observed in all four simulations, with some resulting in C2 attack and others in C3 attack. However, because of the high computational cost, we could not run a statistically significant number of trajectories to assess the C3:C2 selectivity for epoxide ring-opening; instead, we compared the CPMD trajectories with the classical MD simulations described above.
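To illustrate why four trajectories cannot establish a selectivity ratio, one can compute a confidence interval for the C3 fraction. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not a method used in the original work; the counts (2 of 4 trajectories opening at C3) follow the outcomes reported for CPMD9.7 to CPMD9.10.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

# 2 of the 4 CPMD trajectories opened at C3.
low, high = wilson_interval(2, 4)
print(f"C3 fraction: {low:.2f} to {high:.2f}")
```

The interval spans roughly 0.15 to 0.85, so the four trajectories are compatible with anything from a strong C2 preference to a strong C3 preference.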

In the CPMD simulations, the Kohn-Sham (K-S) energy showed only minor fluctuations before the epoxide ring-opening reactions began, as illustrated in Figure 9.11. CPMD9.7 (C5=OH) and CPMD9.8 (C5=NH2) showed C2 ring-opening, whereas CPMD9.9 (C5=C6H5CH2OCH2O-) and CPMD9.10 (C5=C5NH4CH2O-) showed C3 ring-opening. Building on the classical MD simulations of the (-)-sparteine-Li+ substrate complexes, we focus here on the geometric characteristics of the entire system during the epoxide ring-opening process.

Figure 9.11 Relative potential energy fluctuation during CPMD simulations: CPMD9.7 (C5=OH, black), CPMD9.8 (C5=NH2, red), CPMD9.9 (C5=C6H5CH2OCH2O-, green) and CPMD9.10 (C5=C5NH4CH2O-, blue)

In the CPMD9.7 (C5=OH) simulation, the initial distance between Li+ and the epoxide oxygen (OE) was 3.5 Å. As the simulation progressed, Li+ rapidly approached OE. Until the Li-OE distance fell below 2 Å, the distances from the methoxide oxygen (OM) to C2 and to C3 differed little and both remained stable, while the OE-C2 and OE-C3 distances fluctuated around 1.5 Å. Once Li+ came closer than 2 Å to OE, the OM-C2 and OM-C3 distances began to decrease, with the OM-C2 distance dropping sharply as OM attacked C2. Concurrently, the OE-C2 distance increased from 1.5 Å to 2.5 Å, while the OE-C3 bond length remained essentially unchanged at about 1.5 Å.
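The distance criteria described here lend themselves to a simple automated check of which epoxide C-O bond breaks first. The following is a minimal sketch under stated assumptions: the 2.0 Å cutoff, the function name, and the distance series are hypothetical and are not taken from the original analysis.

```python
def ring_opening_site(oe_c2, oe_c3, cutoff=2.0):
    """Return (site, frame) for the first epoxide C-O distance above cutoff.

    oe_c2, oe_c3: per-frame OE-C2 and OE-C3 distances in angstroms.
    cutoff: assumed bond-breaking threshold (hypothetical value).
    Returns (None, None) if neither bond exceeds the cutoff.
    """
    for frame, (d2, d3) in enumerate(zip(oe_c2, oe_c3)):
        if d2 > cutoff and d2 >= d3:
            return "C2", frame
        if d3 > cutoff:
            return "C3", frame
    return None, None

# Made-up series mimicking the CPMD9.7 description:
# OE-C2 stretches from 1.5 to 2.5 while OE-C3 stays near 1.5.
oe_c2 = [1.5, 1.5, 1.6, 1.9, 2.3, 2.5]
oe_c3 = [1.5, 1.5, 1.5, 1.5, 1.5, 1.5]
site, frame = ring_opening_site(oe_c2, oe_c3)
print(site, frame)
```

A fixed cutoff is crude; a real analysis would likely require the distance to stay above the threshold for several frames before calling the bond broken.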

Analysis of the CPMD trajectory yielded interesting insights. The (-)-sparteine-Li+ complex exhibits a narrow N1-Li+-N2 angle, ranging from 64° to 97°, as analyzed in the MD simulation of MD9.8 (C5=NH2). The angle reaches its maximum when Li+ is approximately 3.5 Å away from OE and its minimum when the Li-OE distance is around 2 Å.
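An angle such as N1-Li+-N2 is obtained from the Cartesian coordinates of the three atoms in each trajectory frame. The sketch below shows the standard dot-product calculation; the coordinates are invented for illustration and do not come from the CPMD trajectories.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))

# Hypothetical coordinates (angstroms): Li+ at the origin, Li-N = 2.2.
li = (0.0, 0.0, 0.0)
n1_atom = (2.2, 0.0, 0.0)
n2_atom = (0.0, 2.2, 0.0)
theta = angle_deg(n1_atom, li, n2_atom)
print(f"{theta:.1f}")
```

Applying the same function to every frame and taking the minimum and maximum would give an angle range of the kind quoted above.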

In the CPMD9.8 (C5=NH2) simulation, the epoxide ring opened at C2: the OM nucleophile attacked C2, and the C2-OE bond broke in a concerted fashion. Following the epoxide ring opening, Li+ approached OE, and once the reaction was complete, the negatively charged OE showed a strong electrostatic interaction with Li+. The orientation of the sparteine unit was consistent with Figure 9.8b.

In the CPMD9.9 (C5=C6H5CH2OCH2O-) simulation, the Li-OE distance fluctuated around 2 Å from the beginning. The trajectories suggest that this short distance between Li+ and OE results from steric hindrance between the sparteine and the larger benzyl group, which gives a shorter Li-OE distance than in CPMD9.7 (C5=OH) and CPMD9.8 (C5=NH2). Despite this reduced distance, the ring opening occurred approximately twice as slowly in CPMD9.9 as in the other simulations, indicating that steric effects do not necessarily correlate with the rate of reaction.

In the CPMD9.9 simulation, the nucleophilic OM initially remained equidistant from C2 and C3. As the simulation progressed, OM moved closer to C3 and ultimately formed a carbon-oxygen bond with it. Throughout this process, the (-)-sparteine-Li complex retained the conformation illustrated in Figure 9.8b.

Figure 9.12 Atomic pair distance fluctuation in CPMD simulations. Panels include (a) CPMD9.7 (R=OH), (b) CPMD9.8 (R=NH2) and (d) CPMD9.10 (R=C5NH4CH2O-); plotted distances are OE-C2, OE-C3, OE-Li, OM-C2 and OM-C3.

CPMD9.10 (C5=C5NH4CH2O-) has a similar structure to CPMD9.9 (C5=C6H5CH2OCH2O-) and shows similar behavior. The Li-OE distance fluctuated around 2 Å during the simulation, though to a lesser extent than in CPMD9.9. By the end of the simulation, OM was closer to C3 than to C2, resulting in epoxide ring opening at C3. In addition, the pyridine nitrogen (NP) coordinated strongly to Li+, with the Li+-NP distance varying between 2 and 2.5 Å. The pyridine ring was oriented nearly perpendicular to the plane formed by the two nitrogen atoms and the top carbon of (-)-sparteine. As a result of this orientation, (-)-sparteine sat further away from both Li+ and the substrate than in the previous three cases.

Conclusions

This study used three computational methods to investigate the regioselective epoxide ring opening of glycosides catalyzed by the (-)-sparteine-Li+ complex. First, DFT was employed to locate transition states for the C2 and C3 ring-opening reactions, using N,N,N’,N’-tetramethylpropanediamine as a simpler model for (-)-sparteine because of the high computational cost. The transition states for C2 and C3 opening had similar structures, with Li+ coordinating to the epoxide oxygen and shifting towards the target carbon, while the nucleophile was positioned below the target carbon. The analysis revealed no significant preference for C3 opening in the tetramethylpropanediamine model. (-)-Sparteine itself was then used for selected transition-state calculations, which gave results similar to those from the diamine model and therefore did not explain the catalytic role of (-)-sparteine indicated by the existing experimental data.

MD simulations were conducted on five selected systems (MD9.6-MD9.10) and showed that (-)-sparteine, with its two nitrogen atoms coordinating to Li+, effectively stabilizes the ion. The simulations also revealed interesting conformational changes of (-)-sparteine, which adopted two distinct conformations. By alternating between these conformations, (-)-sparteine can change the position of Li+ relative to the furanose ring and the epoxide oxygen. This biased positioning, influenced by the steric effects of nearby groups, alters the preferred orientations of Li+ and ultimately leads to selectivity between the C2 and C3 ring-opening reactions.

The ab initio molecular dynamics (MD) method was applied to selected systems containing an anionic nucleophile, and the epoxide ring-opening process was observed in all simulations. We focused on the dynamical properties of (-)-sparteine during this process and compared them with the classical MD simulations. Conformational analysis indicated that the (-)-sparteine unit maintained a restricted N1-Li+-N2 angle during ring opening, which reduced the distance between Li+ and the epoxide oxygen to approximately 2 Å. This finding agrees with our density functional theory (DFT) transition-state calculations, in which the distance between Li+ and the epoxide oxygen was likewise around 2 Å at the onset of the ring-opening reactions.

This agreement leads us to conclusions about two functional capabilities of (-)-sparteine during these epoxide ring-opening reactions. First, (-)-sparteine coordinates the Li+ Lewis acid through its two nitrogen atoms, forming a robust complex. Second, the (-)-sparteine-Li+ complex exhibits a distinct scissoring motion. This motion, combined with the oscillatory movement of Li+ along the bisector of the angle formed by the two arms of (-)-sparteine, allows (-)-sparteine to control the position of Li+ relative to the epoxide ring oxygen. Steric hindrance from the various substituents on the furanose ring further influences the orientation of the Li+ Lewis acid towards the epoxide oxygen. DFT transition-state calculations indicate that opening of the epoxide ring at C2 or C3 is favored when Li+ is coordinated on the corresponding side of the epoxide ring oxygen. Consequently, the (-)-sparteine-Li+ complex catalyzes the epoxide ring-opening reaction with selectivity for C2 or C3 attack, depending on the scissoring angle N1-Li-N2, the distance of Li+ from the epoxide oxygen, and the direction of approach of Li+ as influenced by substituents on the sugar ring.

Date posted: 02/10/2024, 01:50
