Introduction to Nuclear Magnetic Resonance
Christopher Jones and Barbara Mulloy
1 Introduction
This brief guide is not intended as a full explanation of the theory and practice of nuclear magnetic resonance (NMR), on which there are a large number of excellent texts (1-3), but as an introduction to the terms used in the subsequent chapters. The section as a whole does not provide a comprehensive outline of the NMR of organic compounds, which would be out of place in this volume, but is a selection of particular applications likely to be of use to molecular biologists and biochemists. Over the last few years, the number of publications dealing with NMR determinations of protein and peptide conformation in solution has increased dramatically, and this is reflected in the amount of space given to the subject in this section. Later chapters cover the use of NMR in the study of internal mobility in proteins and in interactions between molecules; structural studies on complex carbohydrates, which have thrived on recent advances in NMR; and nucleic acids and their interactions.
2 Basics of NMR
When the sample is placed in a magnetic field, the nuclei of some of its constituent atoms (usually 1H, but 13C, 15N, 19F, 31P, and 2H are also commonly encountered in biomedical research) are forced into alignment with the field. In this state, the absorption or emission of electromagnetic radiation with a suitable, resonant frequency becomes possible. The frequency of the absorbed energy is directly proportional to the strength of the magnetic field, so the resonance condition can be achieved either by scanning the frequency of the electromagnetic radiation at constant field (as effectively happens in modern Fourier transform [FT] spectrometers) or by scanning the magnetic field at a constant irradiation frequency (as usually happened in older continuous-wave [CW] instruments). The nomenclature of NMR is complicated by the fact that both options are enshrined in the terminology independently of the experimental setup used.
From: Methods in Molecular Biology, Vol. 17: Spectroscopic Methods and Analyses: NMR, Mass Spectrometry, and Metalloprotein Techniques. Edited by C. Jones, B. Mulloy, and A. H. Thomas. Copyright ©1993 Humana Press Inc., Totowa, NJ.
3 Fourier Transform NMR
The older CW instruments utilized a monochromatic irradiation frequency and observed an absorption spectrum, whereas FT machines use a broad-band pulse of radiation to equalize the populations of the high- and low-energy states and then observe an emission spectrum. This pulse methodology has proven extremely powerful in its practical application, and the subsequent chapters assume that such instruments are available.
A significant advantage of the FT method is that, since the whole spectrum is acquired in a few seconds following a single pulse, data from many such acquisitions may be added together to give much improved signal-to-noise ratios and sensitivity. Without this advantage, the use of relatively insensitive nuclei, such as 13C, in studies of biological samples would be impossible.
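The benefit of co-adding acquisitions can be illustrated numerically. The sketch below is only a toy model, assuming purely random Gaussian noise that is uncorrelated from scan to scan; the function name, point count, and seed are hypothetical choices made for the illustration. Under that assumption the noise left after averaging n scans falls roughly as the square root of n, while a reproducible signal would be unchanged.

```python
import math
import random
import statistics

def averaged_noise_sd(n_scans, n_points=2000, noise_sd=1.0, seed=0):
    """Standard deviation of the noise left after co-adding and averaging
    n_scans simulated acquisitions that contain only Gaussian noise."""
    rng = random.Random(seed)
    totals = [0.0] * n_points
    for _ in range(n_scans):
        for i in range(n_points):
            totals[i] += rng.gauss(0.0, noise_sd)
    averaged = [t / n_scans for t in totals]
    return statistics.pstdev(averaged)

single_scan = averaged_noise_sd(1)
for n in (4, 16, 64):
    gain = single_scan / averaged_noise_sd(n)
    print(f"{n:3d} scans: noise reduced {gain:.1f}-fold (sqrt(n) = {math.sqrt(n):.1f})")
```

Because the signal adds coherently and the noise does not, the same factor appears as the gain in signal-to-noise ratio, which is why a 100-fold longer accumulation buys only about a 10-fold improvement.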
4 The 1D Spectrum
The NMR peak as usually seen in the one-dimensional (1D) spectrum can be characterized by four basic parameters: the frequency (or field) at which resonance occurs, the intensity of the peak, couplings to other nuclei as revealed by the multiplicity of the peak, and a series of parameters, such as linewidth, based on relaxation behavior (see Fig. 1).
4.1 Chemical Shift
[Fig. 1: example 1D spectrum, annotated to show the chemical shift axis (ppm) and spin-spin coupling.]
[Figure: chemical shift scale diagram; labels: low field/high field, low frequency/high frequency, deshielded/shielded, TMS resonance, 10 ppm, resonance frequency (Hz).]
Fig. The chemical shift scale for protons in a 500-MHz spectrometer (in which the magnetic field is 11.744 T). The frequency of the tetramethylsilane (TMS) resonance is taken as an arbitrary reference point, and other frequencies are expressed in terms of parts per million (ppm) of this frequency.
... field of their own when placed in an external field, which affects the chemical shift of nearby nuclei in a manner that depends on the geometry of the system. This is referred to as ring-current anisotropy.
In a given magnetic field, different types of nuclei resonate at different frequencies, and instruments are usually described by their proton frequency (e.g., a 500-MHz instrument). In such an instrument, 13C nuclei resonate near 125 MHz and 31P nuclei near 202 MHz. Thus, the individual spectra are widely separated in frequency. The typical widths of the spectra of different elements can be very different, too: most of the 1H spectrum is between 0 and 10 ppm, whereas the 13C spectrum occupies 200 ppm. The chemical shift of a resonance is dependent primarily on its local chemical environment and less critically on geometric factors.
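The frequencies quoted for a 500-MHz instrument follow from the proportionality between resonance frequency and field strength. The sketch below assumes approximate literature values of the gyromagnetic ratio divided by 2π for each nucleus (these numbers, and the function names, are not taken from this chapter) and also shows the conversion from ppm to Hz that underlies the chemical shift scale.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla
# (assumed literature values, rounded).
GAMMA_BAR_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "31P": 17.235}

def larmor_mhz(nucleus, field_tesla):
    """Resonance frequency (MHz) of a nucleus in a field of field_tesla."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla

def ppm_to_hz(delta_ppm, spectrometer_mhz):
    """Frequency offset (Hz) of a chemical shift difference given in ppm.
    One ppm of a frequency expressed in MHz is that same number of Hz."""
    return delta_ppm * spectrometer_mhz

FIELD_T = 11.744  # field of the 500-MHz instrument described in the text
for nucleus in ("1H", "13C", "31P"):
    print(f"{nucleus}: {larmor_mhz(nucleus, FIELD_T):.1f} MHz")
print(f"10 ppm of 1H at 500 MHz = {ppm_to_hz(10, 500):.0f} Hz")
```

The same arithmetic shows why the 200-ppm 13C range is, in absolute terms, several times wider than the 10-ppm 1H range on the same magnet.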
4.2 Intensity
With reasonable care in the choice of experimental conditions, the intensity (integral) of a resonance is proportional to the number of nuclei contributing to it. Thus, integration of parts of the spectrum can be used to show how many of each type of nucleus are present within a complex molecule, or to quantify the amounts of two or more distinct chemical entities.
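As a minimal sketch of how integrals are turned into relative numbers of nuclei, the example below scales a set of peak areas so that a reference peak corresponds to a known count. The peak names and areas are invented for illustration (they mimic an ethanol-like pattern) and the helper name is hypothetical.

```python
def nuclei_per_peak(integrals, reference_peak, reference_count):
    """Scale raw peak areas so that reference_peak corresponds to
    reference_count nuclei; returns the scaled count for every peak."""
    scale = reference_count / integrals[reference_peak]
    return {name: round(area * scale, 2) for name, area in integrals.items()}

# Hypothetical integrals in arbitrary units.
integrals = {"CH3": 30.2, "CH2": 19.8, "OH": 10.1}
counts = nuclei_per_peak(integrals, reference_peak="CH3", reference_count=3)
print(counts)
```

The scaled values are only as good as the experimental conditions; incomplete relaxation between pulses, for example, would distort the raw areas before any such scaling is applied.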
4.3 Coupling
A single resonance may be split into several separate lines by the influence of the spins of nearby nuclei. This interaction, called J- or scalar-coupling, is small and independent of the applied magnetic field (and so is quoted in Hertz), and operates through chemical bonds. The magnitude of the splitting depends on the number of chemical bonds involved and the geometry of the interaction.
$${}^{3}J_{\mathrm{H\alpha,NH}} = 1.7 - 1.27\cos\theta + 4.1\cos^{2}\theta$$
Fig. The Karplus relationship between the three-bond proton-proton coupling constant (3J(H,H)) between the α and NH protons of amino acids in peptides, and the dihedral angle between the C-H and N-H bonds, as characterized in ref. The curve is symmetrical about 180°.
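A Karplus-type relationship is easy to evaluate numerically. The sketch below uses the general form J = a cos²θ + b cos θ + c with coefficients passed in as arguments; the values used in the loop are illustrative placeholders of a typical magnitude for peptides and are not claimed to be those of the figure. It also checks the symmetry about 180° noted in the caption.

```python
import math

def karplus(theta_deg, a, b, c):
    """Three-bond coupling constant (Hz) from a Karplus-type relation
    J = a*cos^2(theta) + b*cos(theta) + c, with theta in degrees."""
    cos_t = math.cos(math.radians(theta_deg))
    return a * cos_t ** 2 + b * cos_t + c

# Illustrative coefficients only (hypothetical for this example).
A, B, C = 6.4, -1.4, 1.9

for theta in (0, 60, 90, 120, 180):
    print(f"theta = {theta:3d} deg  J = {karplus(theta, A, B, C):.2f} Hz")

# The curve is symmetrical about 180 degrees: 150 and 210 give the same J.
print(math.isclose(karplus(150, A, B, C), karplus(210, A, B, C)))
```

Because several angles can give the same coupling constant, a measured J restricts the dihedral angle to a small set of possibilities rather than fixing it uniquely.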
... coupling is small, the populations of the two levels are only slightly different. Thus, the two peaks in a resonance (split into a doublet) have almost identical intensities, although the situation may appear more complex if a resonance is coupled to more than one other nucleus.
Conventionally, the signal arising from a particular type of nucleus is usually described as a single resonance even if it is split by scalar coupling.
When two coupled resonances have very similar chemical shifts, multiplets become distorted, and it is no longer possible to measure coupling constants directly from the spectrum. This is "strong coupling." Spectroscopists use an alphabetical convention to describe spin systems, in which letters close together in the alphabet denote strong coupling (an AB system has two strongly coupled spins) and distant letters in the alphabet denote weak coupling (an AMX system has three weakly coupled spins). 13C spectra would be greatly complicated by coupling to protons, except for the fact that they are usually recorded with some form of broad-band irradiation of the frequencies absorbed by protons, which effectively decouples the proton and carbon nuclei.
4.4 Relaxation
The equalization of the population of the nuclear spin states caused by the RF pulse (in modern instruments) creates a high-energy system that relaxes back to thermal equilibrium by a variety of mechanisms, and analysis of the relaxation rates and pathways provides a great deal of information about the geometry and dynamics of the system. The characteristic rates (R1, R2) of relaxation resulting from these mechanisms, or their reciprocal relaxation times (T1 = 1/R1, T2 = 1/R2), are important not only as data, but because their values determine optimal conditions for the acquisition of spectra. There are many NMR experiments for which it is important that a delay between pulses is incorporated sufficient to give effectively full relaxation. The spin-lattice (or longitudinal) relaxation time (T1) cannot be deduced from a simple 1D spectrum, but must be measured in a separate experiment, usually the inversion recovery experiment. The spin-spin (or transverse) relaxation time (T2) can be estimated from the width of lines in the 1D spectrum or more accurately measured by special experiments, usually based on the "spin-echo" experiment. These experiments are described in ref.
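The two estimates mentioned above can be written down compactly. The sketch below assumes an ideal Lorentzian line (for which the full width at half height is 1/(π T2*), where T2* includes any contribution from field inhomogeneity and is therefore a lower bound on T2) and a perfect 180° pulse in the inversion recovery experiment; the function names are hypothetical.

```python
import math

def t2_star_from_linewidth(linewidth_hz):
    """Apparent transverse relaxation time (s) from the full width at half
    height (Hz) of a Lorentzian line: width = 1 / (pi * T2*)."""
    return 1.0 / (math.pi * linewidth_hz)

def inversion_recovery(t, t1, m0=1.0):
    """Longitudinal magnetization a time t after a perfect inversion pulse."""
    return m0 * (1.0 - 2.0 * math.exp(-t / t1))

def t1_from_null(t_null):
    """T1 from the delay at which the recovering signal passes through zero,
    which for ideal inversion occurs at t_null = T1 * ln 2."""
    return t_null / math.log(2.0)

print(f"2-Hz line  -> T2* = {t2_star_from_linewidth(2.0) * 1000:.0f} ms")
print(f"null at 0.7 s -> T1 = {t1_from_null(0.7):.2f} s")
```

In practice a full set of recovery delays is fitted rather than relying on the null point alone, but the null gives a quick first estimate when choosing the delay between pulses.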
All these relaxation parameters are strongly influenced by the mobility in solution and, hence, the molecular size of the compound of interest. For large molecules, T1 and T2 are reduced, and the interproton NOEs become negative in sign. The magnitude of all three of these relaxation effects can be expressed in terms of the correlation time (τc), which is itself a characteristic of the rate of random reorientation of a molecule in solution.
In NMR studies of biological molecules, it is usually assumed that relaxation by a single mechanism, known as dipolar relaxation, takes place between directly bonded nuclei with magnetic spins. There are other relaxation mechanisms that become important in specific circumstances, for example, relaxation via a paramagnetic nucleus (see Chapter 2, Section 7.).
5 Transfer of Magnetization
Fig. Shaped pulses. (A) A simple square pulse; a long, "soft" pulse (i.e., of low power) with this shape will excite one part of the spectrum selectively, but will cause artifacts. (B) A Gaussian pulse will be as selective as the square pulse and avoid some of the artifacts. There are many other possible shapes for pulses.
6. 2D NMR Experiments
Modern spectrometers using pulse methods do not record the spectrum directly, but rather use an interferogram of magnetization vs time, which is digitized and Fourier transformed to a spectrum of magnetization vs frequency (Fig. 5). Excitation of the sample need not be by a single pulse, and multipulse sequences have been introduced to give a wide variety of informative experiments, including the 2D methods.
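The conversion of a digitized time-domain signal into a spectrum can be sketched in a few lines. The example below is a toy model, assuming a single resonance that decays exponentially and a naive discrete Fourier transform written out in full for clarity; the number of points, dwell time, frequency, and decay constant are arbitrary illustrative values.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))
            for k in range(n)]

N = 256            # number of digitized points
DWELL = 1.0 / 256  # time between points (s); spectral width is 1/DWELL = 256 Hz
FREQ_HZ = 40.0     # offset frequency of the single resonance
T2 = 0.1           # decay constant of the signal (s)

# Time-domain signal: a rotating, exponentially decaying magnetization.
fid = [cmath.exp(2j * math.pi * FREQ_HZ * i * DWELL) * math.exp(-i * DWELL / T2)
       for i in range(N)]

spectrum = dft(fid)
peak_bin = max(range(N), key=lambda k: abs(spectrum[k]))
peak_hz = peak_bin / (N * DWELL)  # convert the bin index to a frequency in Hz
print(f"peak found at {peak_hz:.1f} Hz")
```

A faster decay (shorter T2) leaves the peak at the same frequency but makes it broader, which is the time-domain origin of the linewidths discussed under relaxation.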
If a delay between the pulses is introduced and a series of spectra are collected at various values of this delay, a second FT of intensity vs incremented delay generates a second frequency axis. This is the basis of 2D NMR. The spectra are usually plotted as a contour map with intensity as the z axis. The most common series of 2D spectra has the original spectrum occupying the diagonal (frequency 1 = frequency 2) and a number of off-diagonal peaks with frequency coordinates connecting two peaks in the original spectrum. The position and intensity of these peaks generate additional information and extend the power of the NMR methods. Chapter in this section gives a more detailed account of the principles behind 2D NMR spectroscopy, particularly of 2D NOE spectroscopy or NOESY; here we give an overview of the range of methods available and the information they provide.
[Fig. 5: Fourier transformation of the free induction decay (FID), f(t), into the spectrum, F(ω), plotted against frequency (ω):]
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,\exp(-i\omega t)\,dt$$
... the FT are used to distinguish between absorbance and dispersion components of the spectrum. 2D spectra are sometimes recorded in the power mode, by taking the square of all the data points; this is economical in time and in computing power and memory, but does not give as good resolution and line shape as the phase-sensitive method.
Correlation spectra: COSY (COrrelation SpectroscopY): In these spectra, crosspeaks are located at the frequencies of resonances that are spin-coupled, and allow assignments of specific resonances to individual protons in the spectra by allowing a network of coupling connectivities (and hence bonded atoms) to be built up. Figure shows an example of this kind of spectrum and its relationship to the structure of a simple molecule. An extension of this experiment, the relayed COSY, generates crosspeaks where two resonances couple to a common partner and is valuable when extensive spectral overlap makes assignment difficult.
HOHAHA (HOmonuclear HArtmann-HAhn) and TOCSY (TOtal Correlation SpectroscopY) generate essentially the same information using different spin physics, although the degree of relay depends on the length of a "spin-lock" pulse rather than additional pulses in the sequence.
The individual lines making up the fine structure of a crosspeak in a phase-sensitive COSY spectrum are antiphase, with some positive and some negative. As the linewidth of these components approaches the separation between them, cancellation can occur, reducing sensitivity. In HOHAHA and TOCSY experiments, the fine structure is in phase, and cancellation does not occur.
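The cancellation of antiphase components can be demonstrated with a simple line-shape calculation. The sketch below assumes two Lorentzian absorption lines of unit height separated by the coupling constant, added either with the same sign (in phase) or with opposite signs (antiphase); normalizing each line to unit height isolates the cancellation effect from the ordinary loss of peak height on broadening. The coupling and linewidth values are arbitrary.

```python
def lorentzian(x, center, linewidth):
    """Absorption Lorentzian of unit height; linewidth is the full width
    at half height, in the same units as x (Hz)."""
    half = linewidth / 2.0
    return half ** 2 / ((x - center) ** 2 + half ** 2)

def doublet_peak_height(j_hz, linewidth_hz, antiphase):
    """Largest absolute intensity of a doublet with splitting j_hz whose
    two components have equal (in phase) or opposite (antiphase) sign."""
    sign = -1.0 if antiphase else 1.0
    xs = [i * 0.01 for i in range(-5000, 5001)]  # -50 to +50 Hz in 0.01-Hz steps
    return max(abs(lorentzian(x, -j_hz / 2.0, linewidth_hz)
                   + sign * lorentzian(x, j_hz / 2.0, linewidth_hz))
               for x in xs)

J = 7.0  # illustrative coupling constant (Hz)
for lw in (1.0, 7.0, 14.0):
    anti = doublet_peak_height(J, lw, antiphase=True)
    inph = doublet_peak_height(J, lw, antiphase=False)
    print(f"linewidth {lw:4.1f} Hz: antiphase {anti:.2f}, in phase {inph:.2f}")
```

With narrow lines the antiphase components barely overlap and retain almost their full height; once the linewidth exceeds the splitting, a large part of each component is cancelled by its neighbor, whereas the in-phase doublet merges into a single, taller line.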
NOESY (NOE SpectroscopY) generates crosspeaks at the frequencies of resonances that are close in space within the molecule, rather than linked through covalent bonding. ROESY (Rotating frame nOe SpectroscopY) generates similar information, but the dependence of the magnitude of the NOEs on molecular motion is different and spin diffusion is less of a problem, although other artifacts occur. This experiment uses a spin-locking pulse, and so is related to the HOHAHA and TOCSY experiments.
... coupling constants. These 1D equivalents are usually just called 1D-COSY, and so on, although the ROESY equivalent is also known as CAMELSPIN (7).
7 Heteronuclear Correlation
The coupling phenomenon and the correlation methods based on it are not restricted to the case where both nuclei are the same, but allow, for instance, a 13C resonance to be correlated with that from the attached proton. These methods require pulses to be applied at both the proton and heteroatom resonance frequency, and may be detected at either resonance. In practice, heteroatom detection is simpler, but less sensitive, and results in the standard heteronuclear correlation experiment (8), although modified schemes, such as COLOC (9), can be used when the experiment is tuned for the smaller couplings arising over more than one bond (e.g., 2J(C,H)); this is particularly valuable for establishing a covalent framework when quaternary carbons are present. Proton detection, often referred to as inverse detection, is more sensitive and gives better dispersion in the crowded proton domain, but requires spectrometer hardware that is not always available and pulse sequences that suppress signals from protons not attached to an NMR-active heteroatom (e.g., those attached to 12C). Since the natural abundance of the "useful" heteroatom isotopes is rarely complete, signals from protons attached at unlabeled sites must be suppressed by careful design of the pulse sequence. The standard pulse sequence for this work is called HMQC (Heteronuclear Multiple Quantum Coherence) (10), but relayed versions (11) (relayed HMQC) and long-range versions (12) (HMBC, Heteronuclear Multiple Bond Correlation) are possible.
References
1. Neuhaus, D. and Williamson, M. P. (1989) The Nuclear Overhauser Effect in Structural and Conformational Analysis. Verlag Chemie, Weinheim.
2. Abraham, R. J., Fisher, J., and Loftus, P. (1988) Introduction to NMR Spectroscopy. Wiley, Chichester.
3. Ernst, R. R., Bodenhausen, G., and Wokaun, A. (1987) Principles of Nuclear Magnetic Resonance in One and Two Dimensions. Oxford University Press, Oxford.
4. DeMarco, A., Llinas, M., and Wüthrich, K. (1978) Analysis of the proton NMR spectra of ferrichrome peptides, Part The amide resonances. Biopoly-
5. Kessler, H., Oschkinat, H., and Griesinger, C. (1986) Transformation of homonuclear two-dimensional NMR techniques into one-dimensional techniques using Gaussian pulses. J. Magn. Reson. 70, 106-133.
6. Kessler, H., Schmieder, P., Kock, M., and Kurz, M. (1990) Improved resolution in proton-detected heteronuclear long-range correlation. J. Magn. Reson. 88, 615-618.
7. Bothner-By, A. A., Stephens, R. L., Lee, J., Warren, C. D., and Jeanloz, R. W. (1984) Structure determination of a tetrasaccharide: transient nuclear Overhauser effects in the rotating frame. J. Am. Chem. Soc. 106, 811-813.
8. Bax, A. and Morris, G. A. (1981) An improved method for heteronuclear chemical shift correlation by two dimensional NMR. J. Magn. Reson. 42, 501-505.
9. Kessler, H., Griesinger, C., Zarbock, J., and Loosli, H. R. (1984) Assignment of carbonyl carbons and sequence analysis in peptides by heteronuclear shift correlation via small coupling constants with broadband decoupling in t1 (COLOC). J. Magn. Reson. 57, 331-336.
10. Bax, A. and Subramanian, S. (1986) Sensitivity enhanced two dimensional heteronuclear shift correlation NMR spectroscopy. J. Magn. Reson. 67, 565-569.
11. Lerner, L. and Bax, A. (1986) Sensitivity enhanced two dimensional heteronuclear relayed coherence transfer NMR spectroscopy. J. Magn. Reson. 69, 375-380.
12. Bax, A. and Summers, M. F. (1986) 1H and 13C assignments from sensitivity enhanced detection of heteronuclear multiple-bond connectivity by 2D mul-
Structural Studies of Proteins in Solution Using Proton Nuclear Magnetic Resonance
David Neuhaus and Philip A Evans
1 Introduction
Nuclear magnetic resonance (NMR) spectroscopy has become established in recent years as a uniquely powerful technique for studying the structures of proteins in solution. In a 1H spectrum, each hydrogen atom in the molecule gives rise to an individual signal, and in favorable cases, it is possible to resolve each of them and assign each to an identified atom. The power of the method then lies in the wealth of information that can be obtained concerning both through-bond and through-space connectivities between individual nuclei. This makes it possible to determine in detail the three-dimensional (3D) conformation from NMR data, and that is the major subject of this chapter. The feasibility of such a full structure determination depends crucially on the completeness with which signals can be assigned to individual nuclei and conformation-dependent parameters determined. The key to achieving this has been the development of two-dimensional (2D) NMR, which, by dispersing the signals more thoroughly and in a structurally significant manner, permits the resolution of large numbers of resonances and elucidation of the connectivities between them. The principles of 2D NMR and the particular experiments that are commonly employed in studies of proteins will be outlined, together with
the strategies employed to make specific resonance assignments. A number of approaches are then possible to turning the accumulated spectral information into structural detail, and these are reviewed briefly. Such a detailed analysis is not always feasible, however, because it may not be possible to resolve and assign the spectrum in sufficient detail; in that case, the structural information that can be obtained will necessarily be more limited, though it may nonetheless be valuable. We illustrate briefly how "partial answers" to some structural questions may be obtained.
This chapter is intended to give a general outline of the principles of protein structure determination from NMR data, from a practical perspective. Thus, we do not attempt to review the literature comprehensively, but rather to provide a limited number of useful references, particularly reviews, where they can helpfully expand upon a particular point. In particular, the book by Wüthrich (1) and a number of recent general reviews (2-4) give a more detailed account of resonance assignment and structure determination methodologies.
2 Scope and Limitations
The chief limitations on the applicability of NMR are sensitivity and resolution. NMR is a particularly insensitive method, so that sample concentrations much higher than those used for other spectroscopic studies are generally necessary. Protein solutions of at least mM and preferably higher are desirable for 1H NMR. Many small proteins are perfectly soluble at this concentration, but in some cases, there can be a problem both in terms of the amount of material required and its solubility. NMR of other nuclei is much more insensitive still, and in the case of 13C and 15N, the problem is compounded by low natural abundance. Heteronuclear NMR studies are therefore largely dependent on being able to enrich the protein with the isotope concerned.
Sensitivity in NMR experiments improves greatly as the magnetic field strength employed is increased. For this reason, protein NMR requires a high-field spectrometer; at present, typical fields are between about 9-
1. The number of resonances in the spectrum;
2. The dispersion of their chemical shifts; and
3. Their linewidths.
Linewidths, in particular, can also pose problems if they become comparable to the coupling constants in the spectrum; if couplings are not resolved because of broad lines, then the results of correlation experiments, which depend on these interactions, will be impaired. Each of these factors can thus be an important consideration in NMR studies of proteins, and we will consider them briefly in turn:
1. The number of resonances in a protein spectrum obviously increases more or less linearly as the number of residues increases. For a protein of 100 residues, for example, the 1H spectrum will contain in the region of 600 resonances, most of which need to be resolved and assigned if a full structural analysis is to be carried out. In some cases, it is possible to alleviate overcrowding by means of spectral editing techniques. The simplest such trick is to dissolve the protein in D2O, so that exchangeable (NH and OH) protons will be progressively replaced by deuterons and their 1H resonances will be lost from the spectrum. If some NHs are protected from the solvent (for example, by hydrogen bonding), they may be resistant to exchange and be selectively retained in the spectrum. This kind of editing has the added advantage that it provides additional structural information. Other editing techniques typically exploit couplings to specific heteronuclei to isolate particular classes of resonances in the spectrum (5,6). These methods are proving to be extremely valuable in extending the range of NMR to larger proteins, but it must be noted that they depend on being able to introduce isotopic labels, such as 13C or 15N, into the protein.
... to permit assignment of individual resonances. In proteins that do not have a fixed tertiary structure, however, dynamic averaging can greatly diminish these effects, leading to severe problems with spectral overlap; for example, this is a major difficulty in studies of residual structure in nonnative states of proteins, which are important in relation to protein folding (8). Of course, lack of a unique tertiary structure also poses fundamental problems for structure determination! Unless alternative conformations interconvert very slowly (<10 s⁻¹), only a single resonance will be observed for each proton in the spectrum, representing an average over the different states (see Chapter 7); interpretation of spectral parameters will not then be straightforward. This needs to be borne very much in mind if the protein under investigation does not have a compact globular structure.
The separation of resonances increases linearly as the field strength increases; this is a further reason why the development of spectrometers with very high magnetic fields has played an important role in the progress of protein NMR. The crucial development that has made virtually complete resolution and assignment of protein spectra possible, however, has been the introduction of 2D (and recently 3D) NMR, as described in the next section.
... undertaken, although resolution is limited under these conditions, and only rather small polypeptides have thus far been studied in this way (see, e.g., ref. 11).
In solution, lines are narrowed, because the tumbling motions of solute molecules greatly reduce the rates of dipolar relaxation (see Chapter 1). However, as the molecular size increases, tumbling becomes slower and the narrowing achieved is less. For very large molecules, the linewidths can then become too great for high-resolution studies. This, together with the increasing problem of overcrowding, effectively limits the range of proteins for which detailed structures may be determined by NMR. These effects are illustrated by the 1D spectra shown in Fig. The zinc finger from SWI5 is a rather small protein of 35 residues; it gives a well resolved spectrum with narrow lines, permitting the multiplicities of individual resolved resonances to be seen. It has been possible to analyze this spectrum in detail using 2D methods (12), and we use this as an illustration in subsequent sections. Hen egg-white lysozyme is substantially larger, at 129 residues. It is nonetheless apparent that individual signals are still reasonably narrow, and it has been possible to assign virtually the entire proton spectrum of this protein (13); proteins of this size are, however, close to the limit of the mol-wt range for which detailed assignment and analysis are possible without aid from heteronuclear studies using isotopically enriched protein. IgG is a much larger protein, comprising a total of approx 1200 residues. Even though these are organized into domains, with limited flexibility between them, the slow overall tumbling of this much larger molecule makes the majority of its resonances extremely broad, so that detailed assignment and structural analysis would not be feasible.
It should be emphasized that there are no hard and fast rules as to the size of protein for which detailed NMR investigations are applicable. The particular structure, dynamics, and intermolecular interactions of an individual protein are also of great significance in determining resolution. Some quite small proteins turn out to give rather poor 1H spectra if, for example, they are prone to aggregation at the relatively high concentrations typically required for NMR study; e.g., Fig. shows the example of glucagon (30 residues), which turns out to give rather broad lined spectra because it tends to trimerize in aqueous solution at high concentrations (14). Conversely, relatively large molecules may give surprisingly good spectra if there is significant
Fig. One-dimensional 1H NMR spectra of three proteins, illustrating the effects on resolution of increasing mol wt. (A) A zinc-finger peptide from SWI5: kDa; (B) hen egg-white lysozyme: 14 kDa; (C) bovine immunoglobulin G: 150 kDa.
... nase, which is discussed further in Section. This protein has a total mol wt of 60 kDa, but comprises three distinct domains that appear to be able to move relatively independently, with the result that some lines are quite well resolved and can be used as markers for the behavior of the individual domains (15).
Fig. Concentration dependence of the 1H spectrum of glucagon in aqueous solution. The spectra (which have been manipulated to enhance resolution) were obtained at protein concentrations of (A) 0.5 mM and (B) mM. The shifting and broadening of signals evident at the higher concentration are associated with partial trimerization of the peptide under these conditions. This illustrates that even relatively modestly sized peptides do not necessarily give very high resolution spectra and that it may be necessary to experiment in order to obtain optimal solution conditions. (Adapted, with permission, from ref. 14.)
... that may allow one to live with broad lines utilize the fact that heteronuclear coupling constants may be much larger than 1H-1H couplings, so that although homonuclear 2D experiments, which rely on scalar coupling, may fail, heteronuclear variants may still be useful.
In certain cases, lines may be broadened by processes other than dipolar relaxation. One commonly encountered in studies of metalloproteins results from interaction with a paramagnetic center. This can cause virtual obliteration from the spectrum of resonances of protons close in space to the paramagnetic group. Although this may be a problem, paramagnetic perturbations can, in some circumstances, be used to obtain valuable structural information if, for example, it is possible to interconvert between paramagnetic and diamagnetic forms of the protein. A good example is provided by Megasphaera elsdenii flavodoxin. This is a protein of 137 residues whose spectral assignment was facilitated by selective suppression of the resonances of protons close to the flavin prosthetic group, when the latter was in a paramagnetic oxidation state (17).
... of the spectrum. The degree of broadening depends on the chemical shift disparity between the different states for each nucleus, so that different signals are in general broadened to different extents, and, indeed, some may become so broad as to be undetectable. It may be possible to eliminate exchange broadening by driving the conformational equilibrium over toward a single form in some way; for example, binding of a ligand may have this effect. Alternatively, raising the temperature may increase the interconversion rates sufficiently to give fast exchange, where single, sharp lines are observed; it then has to be borne in mind, as previously discussed, that these represent an average over different conformations.
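The dependence of exchange effects on the shift difference between states can be made concrete with a two-site sketch. The example below assumes two equally populated, uncoupled sites, for which the two lines coalesce when the interconversion rate reaches π Δν/√2 (Δν being the shift difference in Hz); the factor-of-ten margins used to label the regimes are arbitrary illustrative thresholds, and the function names are hypothetical.

```python
import math

def fast_exchange_shift(shifts_ppm, populations):
    """Population-weighted average chemical shift seen in fast exchange."""
    total = sum(populations)
    return sum(s * p for s, p in zip(shifts_ppm, populations)) / total

def exchange_regime(rate_per_s, delta_hz):
    """Label two-site, equal-population exchange by comparing the rate with
    the coalescence rate pi*delta/sqrt(2). The 10-fold margins are arbitrary."""
    k_coalescence = math.pi * delta_hz / math.sqrt(2.0)
    if rate_per_s < 0.1 * k_coalescence:
        return "slow"
    if rate_per_s > 10.0 * k_coalescence:
        return "fast"
    return "intermediate"

# The same interconversion rate affects signals differently, depending on
# how far apart their shifts are in the two states.
RATE = 500.0  # hypothetical interconversion rate (per second)
for delta_hz in (5.0, 100.0, 5000.0):
    print(f"shift difference {delta_hz:6.0f} Hz: {exchange_regime(RATE, delta_hz)} exchange")

print(fast_exchange_shift([8.0, 7.0], [0.75, 0.25]))
```

This is why, within one protein at one temperature, some resonances can appear as sharp averaged lines while others with larger shift differences are broadened or split.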
The general conclusion to be drawn from these considerations is that there are many potential difficulties in NMR studies of proteins. Some can be overcome by judicious choice of experimental conditions; others are more fundamental and limit the extent to which structural information is accessible for some proteins. For all that, it is now clear that for small, nonaggregating proteins, NMR is a high resolution structural method of considerable generality.
3 Experiments
Although NMR has been used to study proteins for many years, the main impetus for the recent surge in structural applications has been the development of 2D NMR. In this section and the next, we give an overview of the principal 2D NMR experiments useful for proteins, concentrating on the type of information available from each experiment, and its place in the overall strategy of assignment and structure determination. To begin, we give as background a much simplified picture of the general features of 2D NMR, using the NOESY experiment as an example.
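[Fig. 3: schematic of a 2D experiment for a single line, showing the interferogram S(t1, F2) and, after the second Fourier transformation ("2nd FT"), the 2D spectrum S(F1, F2).]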
... spectra show the influence of the variation in t1 as a modulation of the amplitudes (or, in some experiments, the phases) of the lines in the spectrum. This stage corresponds to the "interferogram" S(t1, F2) shown in Fig. 3, where the amplitude of the single line in this spectrum varies as a cosine function of t1. In fact, the modulation arises because, during t1, the signal of interest is present in the transverse plane and therefore precesses (i.e., rotates). At the end of t1, only one component is selected by the pulse sequence to contribute to the finally observed spectrum. A trace running "backwards" through the interferogram (i.e., in the direction of increasing t1), linking the center of the peak as it appears in each spectrum, thus corresponds to an indirectly detected FID, charting the precession of the magnetization during t1.
A second Fourier transformation, this time with respect to t1, turns the indirectly detected time domain t1 into an indirectly detected frequency domain F1. In Fig. 3, this leads to a 2D spectrum containing just one line, but Figs. and show what happens for a two-line NOESY spectrum. If nothing occurs to cause appreciable interactions between the signals during the time between t1 and t2, then each will continue to precess with the same frequency during t2 as it had during t1. This gives results of the sort shown in Fig. The lower frequency signal S (assuming zero frequency to be at the left-hand edge of the spectrum, for simplicity) shows the lower frequency modulation in the interferogram, while the higher frequency signal I shows the higher. Because both signals maintained their original frequencies throughout, their positions in the 2D spectrum after the second Fourier transform both lie on the 45° diagonal defined by F1 = F2. Such behavior would be expected, for instance, for a NOESY experiment with a very short mixing time (called τm in the figures).
If these diagonal peaks were the only signals detected, 2D NMR would be of little interest. The whole value of the method arises because the spins can be made to interact during the time between t1 and t2,
Fig. 5. Schematic representation of a NOESY experiment for a two-line spectrum with τm not set to zero, to show the origin of the crosspeaks. (Reproduced, with permission, from ref. 20.)
causes the intensities of signals I and S to become interdependent during τm, provided that the corresponding spins are spatially close enough together in the structure (see discussion of the NOE later in this section). For molecules the size of small proteins, this results effectively in an exchange of magnetization between the spins. Thus, when the intensity values at the start of τm are different for signals I and S (e.g., in Fig. 4, when signal I has passed through half a cycle of modulation and is positive, but signal S is still negative), then the NOE will act to make the intensities of I and S converge during τm. As shown in Fig. 5, this results in new modulations in the interferogram, since by the end of τm the intensity of each signal depends both on its own precession frequency during t1 and on that of the other signal. It is these new modulations that give rise to crosspeaks in the 2D spectrum after the second Fourier transform.
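The convergence of the two intensities during the mixing time can be sketched with a deliberately simple model. The code below assumes a symmetrical two-spin exchange with a single rate constant and no other relaxation; this model, the rate constant, and the function names are illustrative assumptions, not taken from the text.

```python
import math


def mixing_coefficients(exchange_rate, mixing_time):
    """Fractions of magnetization kept by a spin and passed to its partner.

    Assumes dI/dt = -k (I - S) and dS/dt = -k (S - I), so the difference
    I - S decays at rate 2k while the sum I + S is conserved.
    """
    decay = math.exp(-2.0 * exchange_rate * mixing_time)
    kept = 0.5 * (1.0 + decay)         # weight of the diagonal peak
    transferred = 0.5 * (1.0 - decay)  # weight of the crosspeak
    return kept, transferred


def intensity_of_i_after_mixing(t1, freq_i, freq_s, exchange_rate, mixing_time):
    """Amplitude of signal I at the end of the mixing time, as a function of t1."""
    kept, transferred = mixing_coefficients(exchange_rate, mixing_time)
    own = math.cos(2.0 * math.pi * freq_i * t1)      # modulation at I's own frequency
    partner = math.cos(2.0 * math.pi * freq_s * t1)  # new modulation at S's frequency
    return kept * own + transferred * partner


# With zero mixing time nothing is transferred; with a finite one, a fraction is.
print(mixing_coefficients(1.0, 0.0))
print(mixing_coefficients(1.0, 0.2))
```

The second term in `intensity_of_i_after_mixing` is the "new modulation": signal I, detected at its own frequency in t2, now also carries the t1 frequency of S, which is what places a crosspeak off the diagonal.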
increments used). Normally, a 2D experiment lasts for several hours at least, and overnight periods or longer are common for protein experiments. However, when speed is essential (e.g., for kinetic measurements), sacrificing both resolution and sensitivity, combined with some tricks, can sometimes bring times down to some fraction of an hour (21). Note particularly that the length of a 2D experiment does not depend on how many signals are in the spectrum, but only on the spectral widths and resolutions in the two dimensions; the NOESY spectrum of a small molecule takes just as long to run as that of a high-molecular-weight protein at the same resolution, if the spectral widths are similar.
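As a rough illustration of why the experiment time is set by spectral width and resolution rather than by the number of signals, the arithmetic can be sketched as follows. The relation used for the number of t1 increments (one increment per resolution element across the F1 width) and all the example values are assumptions made for illustration; real acquisition schemes differ in detail.

```python
import math


def experiment_hours(spectral_width_f1_hz, resolution_f1_hz,
                     scans_per_increment, seconds_per_scan):
    """Rough duration of a 2D experiment; the number of signals never enters."""
    # Assumed relation: one t1 increment per resolution element across the F1 width.
    n_increments = math.ceil(spectral_width_f1_hz / resolution_f1_hz)
    total_seconds = n_increments * scans_per_increment * seconds_per_scan
    return total_seconds / 3600.0


# Hypothetical settings: 5000 Hz width, 10 Hz per point, 16 scans, 1.5 s per scan.
print(round(experiment_hours(5000.0, 10.0, 16, 1.5), 2))
```

Halving the required resolution or the number of scans halves the time, which is the trade-off mentioned for kinetic measurements.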
Normally, 2D spectra are displayed in the form of an intensity contour plot. Note that, provided the experiment is phase sensitive, as it should always be (see ref. 22), contours can represent either positive or negative intensity. These are usually represented using lines of different color, but this sign information is often discarded when preparing monochrome figures for publication.
The table of common 2D NMR experiments shows a few of the more important 2D experiments, together with the interactions (called mixing processes) that give rise to crosspeaks in each case. The homonuclear 2D experiments in the table fall into two distinct classes: those based on J-couplings, and those based on the NOE. J-coupling, which is transmitted between nuclei through electronic interactions in the covalent bonding network of the protein, is responsible for the splitting of signals into multiplets. If two protons are J-coupled, the corresponding signals both appear as doublets whose components are separated by the coupling constant, and more complicated patterns result when several couplings act on one proton. Since J-couplings are normally only appreciable over three or fewer bonds, interproton J-couplings can normally only be observed between protons within the same residue. Thus, the experiments based on correlation through homonuclear J-couplings (COSY, RELAY, TOCSY, and so on) provide information that links together the signals within groups called spin systems, each spin system corresponding to a particular residue in the protein.
Table
Common 2D NMR Experiments and Their Information Content

Experiment | Requirement for observing a crosspeak | Comments
-----------|----------------------------------------|---------
(1H, 1H) COSY or DQF COSY | 1H-1H J-coupling | Many variants.
(1H, 1H) RELAY | 1H-1H-1H pathway of one or two J-couplings | Also detects COSY crosspeaks. Not all possible peaks occur, dependent on setting of mixing time.
(1H, 1H) TOCSY | 1H-1H ... 1H-1H pathway of several J-couplings | Not all possible peaks occur, dependent on setting of mixing time.
(1H, 1H) Double quantum correlation | 1H-1H J-coupling | Different layout and information content from COSY. Not all possible peaks occur, depending on setting of tuned delay.
(1H, X) Heteronuclear shift correlation | 1H-X J-coupling over one bond | Selection of one-bond correlations based on large size of one-bond coupling constants.
(1H, X) Long-range heteronuclear shift correlation | 1H-X J-coupling over one, two, or three bonds | Not all possible peaks occur, dependent on setting of tuned delay. Also detects one-bond correlations.
(1H, 1H) NOESY | 1H ... 1H short distance | Relationship between intensity and distance difficult to quantify.
(1H, 1H) ROESY | 1H ... 1H short distance | Applicable to “medium-sized” molecules where NOESY fails.
at the signals owing to neighbors are called NOE enhancements. For large molecules, the NOE causes a change for the neighbor in the same sense as that of the perturbed signal (as in Fig. 5, discussed earlier), whereas for small molecules, it is in the opposite sense; for the NOESY experiment discussed earlier, this means that crosspeaks for large molecules have the same sign as the diagonal, whereas for small molecules crosspeaks and diagonal peaks have opposite signs.*
Figure 6 shows some calculated examples of how the intensities of NOESY crosspeaks change as a function of τm (23). During the early part of the buildup, the change is essentially linear and is caused only by cross-relaxation between the two spins whose signals are connected by the crosspeak (I and S in Fig. 5). This early period is called the “initial rate regime,” and its significance is that only while it lasts is there a simple relationship between crosspeak intensity and the single internuclear distance between I and S. At later times, intensity changes caused by the NOE themselves start to perturb intensities of other near neighbors, so that crosspeak intensities become dependent on many internuclear distances rather than one. This process is called spin diffusion, and intensities in NOESY spectra where appreciable spin diffusion has occurred are only calculable numerically by methods requiring knowledge of the whole structure (24).
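The difference between the initial rate regime and spin diffusion can be illustrated with a small numerical calculation. This is a sketch under strong assumptions that are not taken from the text: three spins only, the slow-tumbling limit in which cross-relaxation simply exchanges magnetization between pairs at a rate proportional to r^-6, an arbitrary rate scale, and hypothetical distances. It is not the procedure of ref. 24.

```python
def relaxation_matrix(distances, scale):
    """Rate matrix R with cross-relaxation rates proportional to r**-6."""
    n = len(distances)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                k = scale / distances[i][j] ** 6
                rates[i][j] = -k   # cross-relaxation term (large-molecule sign)
                rates[i][i] += k   # matching loss from spin i
    return rates


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def crosspeak_matrix(rates, mixing_time, terms=40):
    """Peak intensities exp(-R * tau_m), summed as a Taylor series."""
    n = len(rates)
    step = [[-rates[i][j] * mixing_time for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in matmul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


# Hypothetical linear arrangement A-B-C: A-B and B-C are 2 A apart, A-C is 4 A.
distances = [[0.0, 2.0, 4.0], [2.0, 0.0, 2.0], [4.0, 2.0, 0.0]]
rates = relaxation_matrix(distances, scale=64.0)  # gives an A-B rate of 1 per second
direct_rate_ac = 64.0 / 4.0 ** 6
for tau in (0.001, 0.05, 0.5):
    actual = crosspeak_matrix(rates, tau)[0][2]
    print(tau, actual, direct_rate_ac * tau)  # actual A-C peak vs. initial-rate estimate
```

At the shortest mixing time the A-C crosspeak follows the initial-rate estimate closely; at the longest it is several times larger, because most of its intensity has arrived by relay through the intermediate spin B.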
Until recently, protein work almost exclusively involved homonuclear (1H, 1H) experiments, such as those previously discussed, but now that many proteins are available from expression of cloned genes in microorganisms, they can be obtained containing various stable isotopic labels to facilitate heteronuclear NMR experiments. The most useful heteronuclei are 15N and 13C, which can be incorporated globally (although this is very expensive in the case of 13C) or specifically (in the sense that a particular position is labeled in all occurrences of a particular amino acid). Using suitable pulse sequences, it is possible to edit the 1H spectrum to yield signals from just those protons directly bonded to a heteronuclear label, and then to determine the interactions of these protons with others (25). Another important recent development is that of three-dimensional spectroscopy (26). Here, a second,
Fig. 6. Calculated time-course (intensity vs τm) for three crosspeaks in the NOESY spectrum of bovine phospholipase A2 (panels A and B; horizontal axis τm in seconds). The crosspeaks all involve the backbone NH of Lys 56, interacting with Ala 55 NH (curve labeled N in the figure), Lys 56 CβH1 (labeled β1), and Lys 56 CβH2 (labeled β2). The relative distances from Lys 56 NH to each of these spins are in the ratio 1.13 (N) : 1.00 (β1) : ….61 (β2). Intensities were calculated by a numerical integration procedure involving all nearby spins, using distances determined from the known crystal structure of this protein, and assuming a spectrometer frequency of 400 MHz and a correlation time for molecular tumbling of … × 10^-9 s in A and … × 10^-8 s in B. Note the effect of spin diffusion, particularly on curve β2, where a very short period of slow growth (the initial rate corresponding to the long distance between these spins) is rapidly followed by buildup of intensity transmitted through the intermediate spin β1. (Reproduced and modified, with permission, from ref. 23.)
independently incremented delay is introduced, and the resulting matrix of FIDs, written S(t1, t2, t3), is Fourier transformed to give a spectrum with three independent frequency dimensions. Two mixing periods are then available, one between t1 and t2, the other between t2 and t3,
4 Assignment
It is clearly impossible to give a detailed account here of the way in which the full assignment is carried out for a new protein, so what follows is necessarily a rather cursory overview. More detailed accounts may be found in the general references cited earlier (14). Spectra are acquired in a mixture of 90% H2O and 10% D2O, the D2O being necessary for the spectrometer’s field-frequency lock. Depending on the quality of the spectra, in particular of the suppression of the H2O signal, it may also be necessary to run spectra in 100% D2O. Spectra in D2O lack most of the signals resulting from exchangeable protons, but are more sensitive and reveal crosspeaks closer to the water signal.
Before tackling the methodology of assignment, a few points about chemical shifts in proteins are needed. Proton chemical shifts are mainly determined by electronic influences through bonds from immediately neighboring groups and atoms. Thus, for instance, backbone amide NH signals are usually in the range of approx. 10-6 ppm, CαH signals are usually near midfield at approx. 5.5-3.0 ppm, and methyl groups bound to sp3 carbon are usually at high field, approx. 2.0-0.5 ppm. However, there are important subsidiary effects on chemical shifts caused by more remote interactions, often transmitted through space rather than through bonds (e.g., aromatic ring current shifts). These depend on the wider environment of the proton concerned, and so are properties of the whole protein conformation. For example, if a protein contains several alanine residues, each will contribute NH, CαH, and CβH3 signals (usually in the gross regions just mentioned), but, barring chance coincidences, these will all be at different shifts for each residue. Such interresidue chemical shift differences cannot be predicted without extensive knowledge of the whole structure (if then), so assignments cannot be made on grounds of chemical shift alone. As will be shown, the way out of this difficulty is to base assignments on interactions between protons, as determined by 2D experiments.
CαH signal at midfield in COSY, and to the NH signal in RELAY. No other residue gives the same combination of crosspeaks; threonine gives (methyl, CβH) crosspeaks in much the same spectral region as alanine (methyl, CαH) crosspeaks, but it does not show (methyl, NH) crosspeaks in RELAY. Similarly, glycine is the only residue to have two CαH signals, and thus to give two (NH, CαH) crosspeaks in COSY or a (CαH + CαH, NH) crosspeak in a double quantum spectrum. Using distinctions based on arguments of this sort, it is often relatively simple to pick out spin systems corresponding to Gly, Ala, and Val residues, and to find at least parts of the spin systems of Leu, Ile, and Thr.
Other amino acids give patterns of crosspeaks that can only be categorized into groups. The largest such group is the “AMX” residues, so called because, in D2O, the CαH and two CβH signals form an AMX spin system (see Chapter 1). This group comprises the aromatic residues Phe, Tyr, His, and Trp, and also Asp, Asn, Cys, and Ser. Of these, often only Ser can be distinguished at this stage, based on the lower field CβH shifts and smaller geminal CβH coupling than for other AMX residues (both these differences are caused by the oxygen substitution at Cβ). The other large group comprises the residues with long side chains, namely Lys, Arg, Met, Glu, and Gln. Of these, Met, Glu, and Gln can sometimes be distinguished by their lower field CγH shifts and simpler spin systems, but correlations involving more distant side chain protons in “long side chain” residues are often ambiguous or indistinct, since they give crosspeaks in very crowded spectral regions, or involve inefficient transfer of magnetization over several couplings. By the same token, if only partial connectivities can be found for Leu residues, there may be no recognizable indication that the methyl groups are linked to the NH, CαH, and CβH signals, so that the spin system can be indistinguishable from a “long.” Finally, Pro is often the hardest spin system to characterize, since it has no NH proton, and must be identified using correlations to the CαH and CδH signals at midfield. Figure 7 shows schematically some of the patterns expected for various spin systems.
[Figure panels: schematic crosspeak patterns for alanine, the AMX spin system, arginine, the five-spin system, glycine, isoleucine, leucine, lysine, proline, threonine, and valine. Both axes run from 10 to 0 ppm and are labeled with the proton types (NH, Hα, Hβ, Hγ, Hδ, Hε, NHε, Me). Three symbols distinguish crosspeaks seen in COSY, RELAY, and TOCSY; crosspeaks seen in RELAY and TOCSY; and crosspeaks seen in TOCSY only. The proline panel also marks crosspeaks involving the Hγ’s and Hδ’s.]
and aromatic residues contribute nonexchangeable signals, usually in the low-field region. The J-coupling connectivity patterns expected for Phe, Tyr, Trp, and His are characteristically different, in that Tyr contributes only two coupled aromatic signals, Phe contributes three, and Trp four (in addition to the NHε to Hδ connectivity), whereas His contributes a pair of sharp singlets with a small (unresolved) coupling between them. For Phe and Tyr, more complex patterns can result if the rate at which the aromatic ring flips is slow on the NMR time scale. All these additional spin systems from CONH2 and aromatic groups are isolated from the rest of the molecule as far as J-coupling is concerned, but are linked in to the other assignments using NOESY connectivities during the sequential assignment stage.
As just shown, J-couplings allow one to classify spin systems according to the residue type from which they originate, but it is not possible to assign each spin system to its correct location in the sequence using homonuclear coupling data alone. For this we need to know the sequential neighbors of each spin system, and such knowledge can only come from the through-space information in NOESY (or ROESY) spectra or from heteronuclear J-couplings along the backbone. Only the former approach has been extensively used so far, mainly because of the very low sensitivity of heteronuclear experiments in the absence of isotopic enrichment and the small size of many long-range heteronuclear coupling constants (i.e., heteronuclear couplings transmitted over more than one bond).
Fig. 8. Types of short interproton distances that give rise to sequential NOESY crosspeaks involving backbone NH signals; see text for discussion.
crosspeaks and can be fitted to the sequence, the combined confidence in this stretch of assignments becomes very much higher.
To illustrate the process of fitting the spin systems to the known sequence, Fig. 9 shows the NOE connectivities that were used to make sequential assignments for a 35-residue zinc-finger peptide from SWI5 (in its metal-bound form) (12). Before turning to the NOESY spectrum, the spin systems identified from COSY, TOCSY, and RELAY spectra comprised 13 AMX systems (excluding serines), 12 “long” spin systems (among which are included one of the leucines and the three five-spin systems, not yet distinguishable from the other “longs”; the N-terminal Met of course shows no amide NH signal), two alanines, two serines, two prolines, one valine, one isoleucine, the other leucine (readily identifiable since TOCSY showed connectivities from the CβH through to the methyls), and one glycine.
[Fig. 9 labels: rows dNN, dαN, and dαN(i, i+3); connectivity intensities coded s, m, and w.]
AMX system, dαN and dβN connectivities were found to one of the Ala spin systems. Given that there are only two alanines in the sequence, this represents another “anchor point” for the assignments, and allows high confidence to be placed in the assignment of the intervening AMX system. Furthermore, there are NOESY connectivities from the CβH resonances of this AMX to the ortho protons of a phenyl ring spin system, reinforcing its assignment as a Phe, and simultaneously assigning the aromatic signals (note, however, that these aromatic signals also show NOESY connectivities to other CβH signals, so that the correct combination of an AMX with an aromatic spin system for this Phe could not have been deduced prior to the sequential assignment stage). From the NH of the Ala, the connectivities continue to a “long,” then to an AMX, followed by another AMX, and then a Gly. Once again, the Gly represents an “anchor point,” reinforcing the confidence that can be placed in the intervening three assignments. This particular stretch of connectivities involving NHs necessarily ends at P47 (although connectivities from the proline CδH protons to CαH of H46 make a bridge to the next stretch of connectivities). To the C-terminal side, from V54 connectivities are found to a “long” and then to an AMX, but here the path stops, at least in this spectrum, because H57 turns out to have a particularly weak NH signal because of rapid exchange with solvent water protons. The assignment of the AMX to N56 is strongly reinforced by observation of enhancements from these CβH signals to a pair of side chain CONH2 signals. The region of a NOESY spectrum containing the dαN and dβN crosspeaks is shown in Fig. 10; the dNN crosspeaks appear near the diagonal in a region below that shown.
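The idea of "anchoring" a chain of sequentially connected spin systems on the known sequence can be sketched as a simple pattern search. The grouping of residues into classes follows the categories described in this section; the sequence used below is hypothetical (it is not the SWI5 peptide), and the function name is invented for illustration.

```python
# Spin-system class expected for each residue type (one-letter codes).
# Ser is kept separate because it can often be told apart from other AMX residues.
SPIN_CLASS = {
    "G": "Gly", "A": "Ala", "V": "Val", "L": "Leu", "I": "Ile",
    "T": "Thr", "P": "Pro", "S": "Ser",
    "F": "AMX", "Y": "AMX", "H": "AMX", "W": "AMX",
    "D": "AMX", "N": "AMX", "C": "AMX",
    "K": "long", "R": "long", "M": "long", "E": "long", "Q": "long",
}


def find_fits(sequence, pattern):
    """Start positions (0-based) where a chain of spin-system classes fits the sequence."""
    classes = [SPIN_CLASS[residue] for residue in sequence]
    width = len(pattern)
    return [
        start
        for start in range(len(classes) - width + 1)
        if classes[start:start + width] == list(pattern)
    ]


sequence = "MRPSYEFAKHDGTLVNQ"  # hypothetical sequence, for illustration only

# A short chain of connected spin systems fits in more than one place ...
print(find_fits(sequence, ["AMX", "long"]))
# ... but a longer chain containing rare residue types (Ala, Gly) fits only once.
print(find_fits(sequence, ["AMX", "Ala", "long", "AMX", "AMX", "Gly"]))
```

The longer the stretch of sequential connectivities and the rarer the residue types it contains, the fewer positions in the sequence remain compatible, which is why such residues act as anchor points.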
Usually, it is necessary to record spectra at more than one temperature to complete the assignments. Exchangeable signals are particularly temperature sensitive, so that changing the temperature slightly achieves two things: (a) It moves the water signal, revealing previously buried CαH signals (e.g., D50 and A52 in Fig. 10), and (b) it causes differential movements among the NH signals, so that by comparing spectra at both temperatures, NH overlap can often be resolved (e.g., the degeneracy of H62 NH with Q67 NH and the near degeneracy of F53 NH with S45 NH seen in Fig. 10 are not present in a spectrum recorded at 27°C). Overlap among nonexchangeable signals is harder to deal with, since it is less likely to be affected by temperature. For example, G48 and P47 have a common CαH shift both at 10°C and 27°C, so it is impossible to tell whether the crosspeak at δ 4.18, δ 8.98 is an intraresidue (NH, CαH) interaction within G48, or a sequential dαN close contact to P47 (or a combination of both).
Similar stretches of assignments can be made for fragment H57-A70, “anchored” principally on I60, S65, and A70, fragment P41-H46, “anchored” on S43, and fragment M36-R40; as with P47, crosspeaks from P41 CδH to R40 CαH link the two N-terminal fragments. Although there is no sequential link from H57 to N56, the assignment is secure on both sides of H57, and in this instance, there is additional information from dαN(i, i + 3) connectivities, because this region of the peptide forms an α-helix.
Fig. 10. Part of a NOESY spectrum used to make the sequential assignments shown in Fig. 9. Crosspeaks corresponding to intraresidue contacts
There is no rigid division between the process of spin system assignment and sequential assignment, so that information from the sequential assignment stage often “finishes off” the spin systems. In this example, the spectra are sufficiently simple that this was hardly required, but sometimes, for instance, the second CβH signal of an AMX or the more distant signals in a long side chain are confirmed by sequential crosspeaks when the intraresidue crosspeaks are overlapped. Variations of the sequential assignment strategy have recently been proposed in which patterns of sequential crosspeaks are searched for before the side chain assignments are complete (27), but it seems likely that these methods are only applicable in regions of well ordered secondary structure, where predictable patterns occur in the NOESY spectra.
One important omission from the discussion so far is that of stereospecific assignments. For methylene groups that give two resolved signals, there is generally an ambiguity as to which signal corresponds to the pro-R proton and which to the pro-S. This applies particularly to CβH signals. There is a similar ambiguity for the diastereotopic methyl signals of Val and Leu, and also for the two CαH signals of Gly. Various methods have recently been proposed to make such assignments. For CβH signals, these involve interpreting (CαH, CβH) coupling constants, but the relationship between 3J and dihedral angle is such that the two staggered gauche conformations (g+ and g-) cannot be distinguished, and additional information is needed to break this ambiguity. Usually, such information comes from differential (NH, CβH) intraresidue NOE enhancements at short mixing times (28,29), but (15N, CβH) or (13C=O, CβH) heteronuclear three-bond couplings can also be used. For methyl groups of Val and Leu, a method based on the stereospecificity of biosynthetic incorporation of 13C has been developed (30). In cases where these methods are unsuccessful or inapplicable, it may be possible to make stereospecific assignments during the structure calculation itself, if one assignment leads to significantly smaller violations of the NOE constraints (31).
group of protons involved (e.g., for a methylene group, the midpoint between the CβH protons). A similar approach is used to handle the ambiguity that exists between the two sides of a symmetrical aromatic ring when fast ring flipping leads to averaged signals.
Clearly, the present example is a relatively simple case. In larger proteins, overlap becomes a much more serious problem, and sensitivity is likely to be lower owing both to the lower molar concentrations likely to be available and to the broader signals. Fitting to the sequence also becomes more complicated, since there will be fewer, if any, unique residue types, so that the assignments have to be “anchored” on identifiable, unique dipeptide or tripeptide fragments. However, there have been some recent developments that have improved the situation and promise to extend the size range of proteins that can be studied. If the protein is available in good yield via overexpression of a cloned gene, it may be possible to incorporate 15N globally throughout the protein with high efficiency (>95%). Heteronuclear variants of the spectra so far discussed are available, in which each directly coupled NH pair contributes signals at its 15N frequency in F1 rather than at its 1H frequency (32). Since there is generally no correlation between the two shifts, the combination of homonuclear and heteronuclear spectra provides a powerful tool to resolve overlap. Still more powerful is the application of 3D spectroscopy, both homonuclear and heteronuclear, and the first experiments (both 2D and 3D) with globally 13C-labeled proteins have recently appeared in the literature (33-35). It is very likely that developments such as these will lead to significant changes in the way in which sequential assignments are carried out for those cases where labeling is viable.
5 Structure Determination
Most of this section is concerned with the calculation of 3D structures of proteins from NMR-derived distance constraints, but first we consider what structural features can be deduced from the assigned spectra by inspection, short of actual calculation. Essentially, this is limited to characterizing secondary structural elements, in particular α-helices and β-sheets. The residues involved in an α-helix can often be recognized by the combination of:
Even more characteristic are enhancements transmitted across one turn of the helix, the most useful of which are dαN(i, i + 3) connectivities. As an example, several such connectivities are indicated on Fig. 9; not only do these indicate a helix running from N56 to Q67, but they are also a useful independent check on sequential assignments in this part of the structure (for instance, they bridge the gap in sequential connectivities at H57). In much the same way, a regular β-sheet often shows:
1. Strong sequential dαN connectivities;
2. Relatively weak dNN connectivities; and
3. Large J (NH, CαH) coupling constants.
Further evidence can sometimes be found from cross-strand NOE connectivities (although of course these represent tertiary structural information). Characterization of turns, other than in the simplest case of a tight turn linking two strands of antiparallel β-sheet, is in general more difficult and often emerges only during the calculation of the overall structure.
In addition to this evidence from NOE connectivities and J-couplings, regions of regular secondary structure are often associated with slowly exchanging NH signals. If the protein can be transferred rapidly into D2O (e.g., by lyophilization from H2O followed by dissolving in D2O), these signals can often be identified directly, since those NH protons protected from solvent exchange by hydrogen bonding within secondary structural elements may persist for some hours or even longer. However, exchange rates also depend on the particular dynamics of the protein structure, and in some cases, this may obscure the influence of the H-bonding pattern.
can be used to provide clearly identified distance constraints. It is to be hoped that, for proteins where global 13C labeling is possible, heteronuclear experiments may largely remove this problem (33-35).
The remaining task in preparing input data for structure calculation is to classify the enhancement intensities into semiquantitative groups, and to calibrate these against distance. This area poses a number of difficulties, discussed later. The result is that classification by distance can only be approximate, so that NOE-derived constraints are expressed as allowed distance ranges, rather than specific values. Within such an allowed range, all distance values are usually taken to be equally probable.
First, there is the point raised in Section 3 that NOE intensities have a simple distance dependence only during the “initial rate regime,” that is, for short mixing times τm. Within this approximation, crosspeak intensity is taken to be proportional to r^-6, that is, the inverse sixth power of internuclear separation. As τm increases, enhancements at directly neighboring protons themselves become large enough that they, in turn, disturb the balance of cross-relaxation at their near neighbors. Thus, enhancements propagate through the network of protons within the structure, and the intensity of a given crosspeak becomes a complicated function of the geometrical arrangement of all nearby protons. Still worse, new crosspeaks start to appear, corresponding to pairs of protons separated not by one short distance, but rather by a pathway of two or more short distances via intervening protons. This process is called spin diffusion, and its influence increases as the tumbling rate of the solute decreases, so the larger the molecule under study, the more severe the problem becomes. Within the initial rate regime, as the name implies, enhancements grow linearly, so one way to reject spin diffusion is to measure NOESY crosspeak intensities at several mixing times, and then to take the initial slope of the time-course as being proportional to r^-6. However, the time during which the initial rate approximation is valid is different for each proton, and for some geometries associated with rapid spin diffusion, the true initial rate may escape detection (e.g., curve β2 in Fig. 6). This leads to incorrect distances if the simple r^-6 dependence is assumed.
ing protons, so if the simple assumption that there is a single rigid structure tumbling isotropically is invalid, this will alter particular intensity values. For globular proteins, the main consideration is that of internal motions, and these have two important effects. First, if the distance between the interacting protons changes as a result of the motion, then the measured intensity represents an average over the motion. This average may be strongly weighted toward shorter distances because of the r^-6 dependence. Second, whether or not the distance changes, NOE intensity is affected by the local mobility of the interacting protons. As pointed out in Section 3, the NOE is positive for small molecules and negative for large. If a large molecule includes a region of high local mobility, NOE interactions involving one or more protons in the mobile region will behave as if they occurred in a smaller, more rapidly tumbling molecule. Since proteins are large enough to be in the negative NOE regime, this means that motion tends to reduce NOE intensities. It is quite common, for instance, for a few residues at the C or N terminus of a protein to show very weak NOESY crosspeaks, as a result of the greater flexibility in these regions.
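The weighting of a motional average toward short distances can be made concrete with a small calculation. The sketch below assumes that the observed intensity is the population-weighted average of r^-6 over a few discrete conformations (other averaging regimes exist and are not treated here); the distances and populations are hypothetical.

```python
def effective_distance(distances, weights):
    """Distance implied by an NOE that averages r**-6 over several conformations."""
    total = sum(weights)
    mean_inverse_sixth = sum(w * r ** -6 for r, w in zip(distances, weights)) / total
    return mean_inverse_sixth ** (-1.0 / 6.0)


# A proton pair that spends half its time at 2.5 A and half at 5.0 A:
distances = [2.5, 5.0]
weights = [0.5, 0.5]
print(effective_distance(distances, weights))  # apparent distance from the NOE
print(sum(distances) / len(distances))         # arithmetic mean, for comparison
```

The apparent distance comes out close to the shorter of the two values and well below the arithmetic mean, because the short-distance conformation dominates the average of r^-6.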
At a more practical level, there is also the matter of measuring NOESY crosspeak intensities. Volume integration is certainly the correct method and is being used increasingly. However, methodology is still developing in this area, and at present, volume integration can be difficult in regions of overlap, or where the base surface of the 2D spectrum is distorted or noisy. For convenience, measurement of peak height (often simply by counting contours in an evenly contoured plot) is sometimes substituted for volume integration. However, this should be combined with at least approximate corrections for individual linewidths and multiplet structures, since these factors alter the relationship between integral and peak height for each crosspeak.
then be used to estimate the ratio of an unknown distance to the reference distance, using the equation a1/a2 = (r1/r2)^-6, where a1 and a2 are the two intensities, and r1 and r2 are the two distances. A geminal methylene interaction (e.g., the [CαH, CαH] crosspeak of glycine, r = 1.75-1.8 Å) is often used as a reference, or alternatively, the interaction between adjacent aromatic ring protons, e.g., of Tyr (r = 2.8 Å). Since there is always the possibility of errors owing to internal motions and spin diffusion, it is as well to compare results with several reference distances and to assess whether the calibration is reasonable in terms of the implied range of distances observed for sequential contacts. Note also that, for proteins, it is common practice to set only the upper bounds of the distance constraints according to this calibration, the lower bounds being set in each case to the sum of the appropriate van der Waals contact radii. This is again to allow for internal motions; if a crosspeak is weak, it cannot be assumed that this implies the interacting protons are necessarily distant, since a short-range interaction could always have been quenched by high local mobility.
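Rearranging a1/a2 = (r1/r2)^-6 for the unknown distance gives r1 = r2 (a2/a1)^(1/6), which can be sketched as follows. The intensities in the example are hypothetical; the reference distance is the upper glycine geminal value quoted above.

```python
def calibrated_distance(intensity, reference_intensity, reference_distance):
    """Distance estimated from a crosspeak intensity and one reference crosspeak.

    Follows a1/a2 = (r1/r2)**-6 with 1 = unknown and 2 = reference.
    """
    return reference_distance * (reference_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical intensities: reference (glycine geminal pair, 1.8 A) = 100 units,
# unknown crosspeak = 10 units.
print(calibrated_distance(10.0, 100.0, 1.8))
```

Because of the sixth root, a tenfold drop in intensity corresponds to an increase in distance of under 50%, so even sizeable errors in measured intensity translate into modest errors in the estimated distance.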
With these factors in mind, it can be seen that quantification requires caution. Rather than attempting to find a rigid relationship between intensity and distance, crosspeak intensities are divided into semiquantitative groups (e.g., "strong," "medium," and "weak"), and each group is associated with an appropriate calibrated value of the upper bound for the corresponding distance constraints. The longer the distance these upper bounds are set to, the more certain it is that the data are not overinterpreted and that the various sources of error mentioned earlier are allowed for, but the less active the distance constraints are in determining the structure during the subsequent calculations.
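The grouping of intensities can be sketched as a simple lookup. The thresholds and upper bounds below are illustrative assumptions, not values taken from this chapter; in practice both would follow from the calibration against reference distances.

```python
# (label, minimum intensity relative to the reference peak, upper bound in A).
# All numbers are hypothetical and serve only to show the mechanism.
BINS = [
    ("strong", 0.5, 2.7),
    ("medium", 0.2, 3.5),
    ("weak", 0.0, 5.0),
]


def upper_bound(relative_intensity):
    """Return the (label, upper distance bound) for a crosspeak intensity."""
    for label, threshold, bound in BINS:
        if relative_intensity >= threshold:
            return label, bound
    raise ValueError("intensity must be non-negative")


for value in (0.8, 0.3, 0.05):
    print(value, upper_bound(value))
```

Raising the upper bounds in the table makes the constraints safer but less restrictive, which is the trade-off described in the text.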
minimizes the ambiguities and maximizes the chance that the coupling originates in a region of defined local conformation. Coupling constant data are in many ways complementary to NOE-derived constraints, since couplings relate to local structural detail, which is precisely where the approximate nature of NOE constraints leads to difficulties. For proteins that can be labeled with 13C or 15N, it is to be expected that heteronuclear coupling constant measurements will prove very useful in the future.
Several methods of calculation are available for determining structure from the NMR constraints. Some aim to tackle the purely geometrical problem of fitting the maximum number of constraints while maintaining the covalent connectivity and minimizing van der Waals contacts. Others combine this process with energy calculations, which necessitates expressing the NMR constraints as if they were additional energy terms.
Because of the number and approximate nature of the constraints, there is no one structure that uniquely fits the NMR data. For this reason, it is usual to carry out a series of calculations using randomly different starting conditions (the meaning of this depends on the particular method) and to compare the results for the whole series. Each calculated result then represents one point in the conformational space compatible with the NMR constraints. If the method used is itself unbiased, and if a sufficient number of calculations are carried out, the set of conformations represents this conformational space. Also, some judgment can then be made as to how well different parts of the structure are defined by the data, since well defined regions will vary little between the different computed conformations, whereas poorly defined regions will differ more considerably. As an example, Fig. 11 shows a set of five conformations for the small protein BUSI IIA, calculated using a distance geometry algorithm.
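One simple way to judge how well each part of a structure is defined is to measure how much the interatomic distances vary across the computed set. The sketch below is an assumption-laden illustration rather than the procedure used for Fig. 11: it avoids superposition altogether by working on distances, and the function name and toy coordinates are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import pstdev


def per_atom_spread(conformers):
    """conformers: list of structures, each a list of (x, y, z) tuples given
    in the same atom order. Returns, for every atom, the mean over all partner
    atoms of the standard deviation of their distance across the ensemble."""
    n_atoms = len(conformers[0])
    totals = [0.0] * n_atoms
    for i, j in combinations(range(n_atoms), 2):
        sd = pstdev([dist(c[i], c[j]) for c in conformers])
        totals[i] += sd
        totals[j] += sd
    return [t / (n_atoms - 1) for t in totals]


# Toy ensemble: atoms 0 and 1 are fixed, atom 2 moves between conformers.
ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.5, 1.5, 0.0)],
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, -2.0, 0.0)],
]
spread = per_atom_spread(ensemble)
print([round(s, 3) for s in spread])
```

Atoms in well defined regions give small values; atoms in poorly constrained regions give larger ones.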
Fig. 11. Five conformations calculated for the small protein BUSI IIA using the program DISGEO. In addition to the covalent structure (including disulfide bridges), the input data for these calculations comprised 202 assigned NOE constraints, several constraints derived from coupling constants, and explicit constraints for hydrogen bonds within the regular secondary structural elements once these elements had been located from initial calculations. For certain regions of the protein where there are many constraints, such as in the α-helix and triple-stranded β-sheet, there is good agreement between the individual structures, whereas in others where there are fewer constraints, such as in the N-terminal region and the loop at the bottom of the structure (as shown), there is much more divergence. (Reproduced, with permission, from ref. 36.)
Among the purely geometrical methods, there are two quite separate approaches, although in at least one case the results obtained with them do not seem to be very greatly different (37). The program DISMAN works by systematically varying torsion angles, while minimizing a target function that represents the sum of the NMR constraint violations and the van der Waals interactions (38). In order to avoid local minima, the constraints are often introduced progressively, beginning with those that span only a few residues, and including longer range constraints only later in the calculation. Because the method operates on a starting conformation, it is possible to start either with randomly generated conformations or from some known model structure if desired.
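The idea of a target function with progressively introduced constraints can be sketched as follows. This is not the DISMAN target function itself: it is a simplified illustration that evaluates Cartesian coordinates directly rather than torsion angles, and every name, atom label, and number in it is hypothetical.

```python
from math import dist


def target(coords, upper_bounds, contact_minima, residue_of, max_separation):
    """Sum of squared violations.

    coords: {atom: (x, y, z)}
    upper_bounds: {(a, b): largest allowed distance} from NMR constraints
    contact_minima: {(a, b): sum of van der Waals contact radii}
    residue_of: {atom: residue number}
    max_separation: only constraints spanning at most this many residues count
    """
    total = 0.0
    for (a, b), upper in upper_bounds.items():
        if abs(residue_of[a] - residue_of[b]) > max_separation:
            continue  # longer range constraint, not yet switched on
        excess = dist(coords[a], coords[b]) - upper
        if excess > 0:
            total += excess ** 2
    for (a, b), lower in contact_minima.items():
        overlap = lower - dist(coords[a], coords[b])
        if overlap > 0:
            total += overlap ** 2
    return total


coords = {"HA1": (0.0, 0.0, 0.0), "HN2": (4.0, 0.0, 0.0), "HB9": (0.0, 6.0, 0.0)}
residue_of = {"HA1": 1, "HN2": 2, "HB9": 9}
upper_bounds = {("HA1", "HN2"): 3.0, ("HA1", "HB9"): 5.0}
contact_minima = {("HN2", "HB9"): 2.0}

early = target(coords, upper_bounds, contact_minima, residue_of, max_separation=1)
late = target(coords, upper_bounds, contact_minima, residue_of, max_separation=10)
print(early, late)
```

A minimizer would call such a function repeatedly while raising `max_separation` in stages, so that local structure is settled before long range constraints begin to pull on it.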
The other geometrical approach is that of distance-geometry calculation. For a structure of N atoms, any geometry in (N-1)-dimensional space will be compatible with a full set of N(N-1)/2 distances measured between the atoms in three dimensions (39). Distance-geometry calculations project such a high-dimensional geometry into three-dimensional space using a process called "embedding," while minimizing the extent to which constraint violations are introduced. Further optimization of the structure is then carried out. For calculations based on NMR data, distance geometry methods also have to cope with the fact that only some small fraction of the total number of distances is known, and although the covalently bonded distances are known precisely, the NOE-derived distances are not. As mentioned earlier, this imprecision is usually handled by running several calculations. For each calculation, particular values are chosen at random from between the upper and lower bounds for each constraint, and some further checking is carried out to make these choices as mutually consistent as possible. Note that, unlike the DISMAN program, distance-geometry calculations do not operate on a starting conformation, but take only the covalent connectivities and NMR constraints as input.
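The first stage of such a calculation, choosing one trial distance per atom pair from within its bounds, can be sketched as below. The sketch deliberately omits the consistency checking and the embedding step, and the function names, bounds, and seeds are hypothetical.

```python
import random


def n_distances(n_atoms):
    """Number of distinct interatomic distances, N(N-1)/2."""
    return n_atoms * (n_atoms - 1) // 2


def trial_distances(bounds, seed):
    """bounds: {(i, j): (lower, upper)} in Angstrom. Returns one randomly
    chosen distance per pair; a different seed gives a different calculation."""
    rng = random.Random(seed)
    return {pair: rng.uniform(lo, hi) for pair, (lo, hi) in bounds.items()}


bounds = {
    (0, 1): (1.09, 1.09),  # covalent distance, known precisely
    (0, 2): (1.8, 5.0),    # NOE-derived: contact radii up to calibrated bound
}
print(n_distances(4))
for seed in (1, 2, 3):
    print(seed, trial_distances(bounds, seed))
```

Precisely known distances have equal lower and upper bounds and so are reproduced unchanged, while the NOE-derived distances differ from run to run.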
energetic terms, can also be varied, sometimes during the calculation itself. Methods in this category include restrained energy minimization, restrained molecular dynamics, and "simulated annealing," which in this context is essentially a simplified version of molecular dynamics.
In practice, several techniques are often combined. Quite often, the geometrical methods yield rather high-energy conformations. Therefore, a typical strategy might be to begin with a series of geometrical calculations, to select those that have converged, and then to refine these using restrained molecular dynamics or energy minimization.
Another method of refinement at present being developed is that of back-calculation of the NOESY spectrum. As was mentioned in Section 3., it is possible to obtain theoretical crosspeak intensities for a known structure even when there is spin diffusion, by using a relaxation matrix calculation. Thus, once an initial structure has been obtained, it can be optimized by iterative comparison of the back-calculated theoretical NOESY intensities with the real NOESY data to obtain the best match (40). This process transforms limited spin diffusion from a problem into a source of information and has the added advantage that it is largely independent of the external decisions necessary in the earlier preparation of input data, particularly in the quantification and calibration of the NOE intensities. However, there are some practical problems in implementing the method, and the problem of accounting for dynamics of the structure remains.
6 Partial Answers: Lower Resolution Information
There are numerous examples where, although the majority of lines in a spectrum remain unresolved and unassigned, a limited number of "marker" resonances can be found that are particularly well resolved and can provide useful structural information, provided that they can be assigned. Often such marker signals are those with extreme values of chemical shift that place them outside the broad envelopes of overlapping signals. Although full sequential assignments obviously cannot be made under these circumstances, it may be possible to assign marker resonances by studying chemically modified or mutant proteins, or protein fragments. This principle underlies many NMR studies of larger proteins, in particular.
Among the most widely studied of marker resonances in protein spectra are those of the C2H protons of histidine residues (42). These resonances are often relatively sharp and in an uncrowded region of the spectrum (in D2O), facilitating their observation. In addition, their chemical shifts are pH dependent, reflecting the ionization of the side chain imidazole function, and this permits their pKa values to be determined. This can provide mechanistically important information about the ionization states of specific histidines in the active site of an enzyme, for example. Conveniently observed resonances such as these can also be employed as more general probes for detecting conformational change in proteins; for example, Fig. 12 illustrates how the histidine resonances of staphylococcal nuclease turned out to be very useful in studies of the proline isomerization equilibria in this protein (18).
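How a pKa follows from the pH dependence of a chemical shift can be sketched with the standard Henderson-Hasselbalch relation, assuming fast exchange between the protonated and neutral imidazole so that the observed shift is a population-weighted average. The limiting shifts and the pKa below are hypothetical, and a real analysis would fit the limiting shifts as well.

```python
from math import log10


def observed_shift(ph, pka, shift_protonated, shift_neutral):
    """Population-weighted shift of a single titrating group (fast exchange)."""
    fraction_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka))
    return shift_neutral + (shift_protonated - shift_neutral) * fraction_protonated


def estimate_pka(ph_values, shifts, shift_protonated, shift_neutral):
    """Average of point-by-point pKa estimates; every shift must lie strictly
    between the two limiting values."""
    estimates = []
    for ph, shift in zip(ph_values, shifts):
        ratio = (shift - shift_neutral) / (shift_protonated - shift)
        estimates.append(ph + log10(ratio))
    return sum(estimates) / len(estimates)


# Hypothetical titration of one His C2H resonance (shifts in p.p.m.).
ph_values = [5.5, 6.0, 6.5, 7.0, 7.5]
shifts = [observed_shift(ph, 6.5, 8.6, 7.7) for ph in ph_values]
print(round(estimate_pka(ph_values, shifts, 8.6, 7.7), 3))
```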
Fig. 12. (A) Spectrum of staphylococcal nuclease in D2O (horizontal axis: chemical shift, p.p.m.). All of the amide NHs have been allowed to exchange out, revealing clearly the C2H resonances of the four histidines to low field in the spectrum. The smaller signals denoted by * arise from a minor conformer of the protein, which differs marginally from the major form, in particular by cis-trans isomerism about a single prolyl peptide bond. These are the only resonances of the minor form that are clearly resolved in the spectrum, demonstrating the utility of His C2H peaks as marker signals. (B) Spectrum of a mutant nuclease (P117G). The proline that was thought to isomerize was replaced by a glycine to see if this would remove the conformational heterogeneity. The minor resonances have indeed disappeared from the spectrum, supporting the hypothesis. (Reproduced, with permission, from ref. 43.)
surface residues, as histidines often are). These NMR studies, particularly measurements of histidine pKa values, have provided important information concerning the mechanism of cooperative oxygen binding and the structural changes associated with it.
this is not necessarily the case in reality, because large proteins tend to be segregated into structural domains. In some cases, the interactions between these are relatively weak, and there may then be sufficient relative mobility of individual domains to give surprisingly good spectra. Although much of the spectrum may be hopelessly overcrowded, a limited number of well resolved peaks may be sufficient to provide a useful range of structural probes. Individual domains, isolated from the remainder of the protein by, for example, proteolysis, may fold independently, and comparison of their spectra may make it possible to identify individual resonances in the intact protein spectrum. This has recently been shown for the multidomain fibrinolytic protein urokinase, as illustrated in Fig. 13A (15). In principle, it might then be possible to investigate the effects of interdomain interactions. An interesting feature of multidomain proteins that has recently been explored is that individual domains may have different thermal stabilities, so that it is possible to obtain spectra of partially unfolded states in which only certain domains remain folded; this was also demonstrated for urokinase, where independent unfolding of four separate domains was observed (see Fig. 13B) (15). Studies of this type are of interest in terms of protein folding, but they also offer the prospect of investigating the possible presence of distinct structural domains where nothing is otherwise known about the structure. Even in cases where the dimensions of a protein are such that resonances of nuclei buried in the core of the structure are hopelessly broadened, there are sometimes more mobile segments of the molecule, giving rise to well resolved lines in the spectrum. It may be possible to identify the origin of these regions, for example, by comparing spectra of partially proteolyzed derivatives. The existence of such regions of enhanced flexibility may be of functional significance, and their identification through NMR in this way thus constitutes valuable information in itself. A good example of this is the pyruvate dehydrogenase multienzyme complex, which has a mol wt of approx million, but has profitably been studied by NMR, which revealed the presence of a flexible linker segment that apparently provides for rapid conformational changes that are crucial to the catalytic mechanism (46).
studies of protein-ligand complexes, for example, one can focus on the ligand (see Chapter 7). Thus, one can study ligands by heteronuclear NMR methods if they are labeled or, in the case of metal ions, if they can be substituted by ions, such as 113Cd2+, that can be studied directly by NMR. For example, in studies of metallothionein, it was possible to determine which residues coordinate the metal ion by detecting coupling of cysteinyl CβH protons to 113Cd2+ (47). Alternatively, it may be possible to study the conformation of the bound ligand when it is in equilibrium with the free form (which may be in excess, so that its spectrum is readily observed) by the detection of transferred NOEs (48). It may also be possible to use labeled ligands to obtain structural information about the residues of the protein itself, to which they are bound. NOEs can also be detected between nuclei of the ligand and of the protein, potentially providing a very powerful specific probe of the binding site; however, the success of such experiments has so far proven to be limited in practice, principally because they do not overcome the problem of assigning the protein resonances concerned.
7 Practical Considerations
The feasibility of undertaking a detailed structural study of a protein by NMR depends, in part, on the intrinsic properties of the protein, as discussed in Section. A full 3D structure determination generally requires a level of assignment and analysis that is only currently attainable for relatively small proteins. Thus, the NMR spectroscopist's first question about a protein is always said to be "how big is it?" As we pointed out before, there are no hard and fast rules; as a guideline, we would suggest that if the mol wt is <10 kDa, it is a possibility well worth considering, although only preliminary studies to gauge the quality of the spectrum can really tell. For proteins between 10 and 15 kDa, it may still be possible, but the undertaking becomes increasingly onerous. A few spectra of proteins of this size have been assigned using only homonuclear 1H NMR, although in these cases, other tricks were generally used to obtain "edited" spectra in order to resolve problems of resonance overcrowding, for example, differential solvent exchange rates of amide protons in the case of lysozyme (13) and the variable oxidation state of the prosthetic group in the case of flavodoxin (17). In the case of proteins much larger than approx 15 kDa, 13C or 15N labeling to permit heteronuclear studies will undoubtedly become necessary if much progress with assignment is to be made. The alternative with these and still larger proteins is to settle for seeking more limited structural information, as discussed in Section.
The other major limitation of NMR is its insensitivity. Obtaining a -mM solution, the minimum desirable for 1H NMR, requires mg of protein to be dissolved in a 0.5-mL sample, in the case of a protein of 10 kDa. The overall requirement, both in terms of the amount of protein needing to be purified and its solubility, could therefore in some cases be excessive. It is also important to note that the protein must not only be soluble at this concentration, but it must also not aggregate appreciably; otherwise, the effective mol wt will of course be greatly increased, and the spectrum will correspondingly tend to be poorer. It may be necessary to experiment with a variety of solution conditions in order to optimize the spectral quality obtainable.
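The amount of protein needed for a sample follows directly from concentration, volume, and mol wt. The sketch below shows the arithmetic for a 0.5-mL sample of a 10 kDa protein; the concentration of 1 mM is an assumed value used only for illustration, and the function name is hypothetical.

```python
def protein_mass_mg(conc_mM, volume_mL, mol_wt_kDa):
    """Mass of protein required, in mg.

    mmol/L multiplied by mL gives micromoles, and a mol wt expressed in kDa
    is numerically equal to mg per micromole, so the product is in mg."""
    return conc_mM * volume_mL * mol_wt_kDa


# Assumed 1 mM concentration, 0.5 mL sample, 10 kDa protein.
print(protein_mass_mg(conc_mM=1.0, volume_mL=0.5, mol_wt_kDa=10.0), "mg")
```

The requirement scales linearly with each of the three quantities, so doubling either the concentration or the mol wt doubles the mass of protein that must be purified.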
solvent water be at least 10% deuterated to permit the field frequency lock to function; for some studies, it may be desirable to work in virtually 100% D2O. Thus, some form of buffer exchange is generally necessary. Since NMR samples are typically more concentrated than those used for other studies, this is usually associated with a concentration step. The simplest method is to freeze-dry the protein, in the absence of added buffer salts, and then redissolve the product in a buffer appropriate to NMR work. Some proteins cannot be freeze-dried, however, and in that case, it may be necessary to use some form of concentrator that employs a semipermeable membrane to effect buffer exchange and concentration.
A particular problem with protein samples is the presence of small molecule signals that can interfere with the spectrum. The most obvious of these is water. Since it is in general necessary to work in H2O rather than D2O solution, in order to observe all of the exchangeable NH signals, considerable effort has been put into developing methods for suppressing the water peak in protein spectra. This may be achieved either by selective saturation of the water or selective excitation of the remainder of the spectrum (52). Whatever technique is applied, the key to success is excellent field homogeneity, since line-shape distortions can lead to poor suppression of parts of the peak, resulting in serious baseline distortions in the protein spectra obtained. It must be remembered that any solute molecule containing protons will also give rise to signals in the 1H spectrum and, therefore, that it is desirable to remove such species as far as possible. This can usually be achieved by dialysis or gel filtration, but it can present more of a problem in relation to the buffer requirements of the particular protein. The simplest solution is to use an inorganic buffer, such as phosphate, whose only protons are in fast exchange with the water and can readily be preexchanged for deuterons if need be simply by freeze-drying from D2O solution; alternatively, several common buffer salts are available as perdeuterated derivatives, and simple compounds, such as (d4)-acetic acid, are indeed quite cheap to obtain. If it is necessary to use a protonated buffer, its concentration should be kept to an absolute minimum.
in a protein sample. To remove trace metal ions, it may be sufficient to add a low concentration of EDTA or EGTA to the solution, but it is probably preferable to remove them altogether by dialysis against one of these agents or passing the protein through a column of a metal ion sequestering resin. Of course, the problem may need more careful consideration in the case of a metalloprotein! It should be noted that molecular oxygen is itself a paramagnetic impurity, and some workers remove dissolved oxygen from NMR samples by freeze-thaw methods. However, the effects are rather marginal in practice, and since the process is time-consuming and may have deleterious effects on some proteins, it is not very common today.
Optimum linewidths in NMR spectra depend on a number of factors. One is the condition of the sample, which should be free of extraneous matter. This can usually be achieved by centrifugation prior to placing the sample in the NMR tube. Most important of all is the homogeneity of the magnetic field, which must be optimized by "shimming." This may be a very tedious process (although increasingly it is possible to get the instrument to do it, at least in part, automatically), but is absolutely necessary. Some workers find that it is best to shim using a sample of a small molecule first, where resolution of very fine couplings provides a very stringent test of homogeneity; it should then only be necessary to make small final adjustments on introducing the protein sample. Another factor that is a key to the success of 2D experiments is stability; this is to some extent dependent on the quality and situation of the instrument, but care should also be taken by the experimenter to ensure, for example, that the probe temperature is fully equilibrated before commencing acquisition. Running the experiment without spinning the sample also improves stability substantially, without degrading the resolution noticeably, provided the shimming is adequate.
technique, but the opportunities are constantly increasing to take advantage of the unique structural information that it can generate.
References
1. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids. Wiley, New York.
2. Wüthrich, K. (1989) Protein structure determination in solution by nuclear magnetic resonance spectroscopy. Science 243, 45-50.
3. Wüthrich, K. (1989) The development of nuclear magnetic resonance spectroscopy as a technique for protein structure determination. Accounts Chem. Res. 22, 36-44.
4. Clore, G. M. and Gronenborn, A. M. (1990) Determination of three-dimensional structures of proteins and nucleic acids in solution by nuclear magnetic resonance spectroscopy. CRC Crit. Rev. Biochem. 24, 419-564.
5. Griffey, R. H. and Redfield, A. G. (1987) Proton-detected heteronuclear edited and correlated nuclear magnetic resonance and nuclear Overhauser effect in solution. Q. Rev. Biophys. 19, 51-82.
6. McIntosh, L. P. and Dahlquist, F. W. (1990) Biosynthetic incorporation of 15N and 13C for assignment and interpretation of NMR spectra of proteins. Q. Rev. Biophys. 23, 1-38.
7. Perkins, S. J. (1982) Applications of ring current calculations to the proton NMR of proteins and nucleic acids, in Biological Magnetic Resonance, vol (Berliner, L. J. and Reuben, J., eds.), pp. 193-336.
8. Baum, J., Dobson, C. M., Evans, P. A., and Hanley, C. (1989) Characterisation of a partly folded protein by NMR methods. Studies on the molten globule state of α-lactalbumin. Biochemistry 28, 7-13.
9. Smith, S. and Griffin, R. G. (1988) High resolution solid state NMR of proteins. Annu. Rev. Phys. Chem. 39, 511-536.
10. Tappin, M. J., Pastore, A., Norton, R. S., Freer, J. H., and Campbell, I. D. (1988) High resolution NMR study of the solution structure of δ-hemolysin. Biochemistry 27, 1643-1647.
11. Braun, W., Wider, G., Lee, K. H., and Wüthrich, K. (1983) Conformation of glucagon in a lipid-water interphase by 1H nuclear magnetic resonance. J. Mol. Biol. 169, 921-948.
12. Neuhaus, D., Nakaseko, Y., Nagai, K., and Klug, A. (1990) Sequence-specific [1H]NMR resonance assignments and secondary structure identification for 1- and 2-zinc finger constructs from SWI5; a hydrophobic core involving four invariant residues. FEBS Lett. 262, 179-184.
13. Redfield, C. and Dobson, C. M. (1988) Sequential assignments and secondary structure of hen egg-white lysozyme in solution. Biochemistry 27, 122-136.
14. Wagman, M. E., Dobson, C. M., and Karplus, M. (1980) Proton NMR studies of the association and folding of glucagon in solution. FEBS Lett. 119, 265-270.
15. Bogusky, M., Dobson, C. M., and Smith, R. A. G. (1989) Reversible indepen-
16. LeMaster, D. (1990) Deuterium labeling in NMR structural analysis of larger proteins. Q. Rev. Biophys. 23, 133-174.
17. van Mierlo, C. P. M., Vervoort, G., Muller, F., and Bacher, A. (1990) A two-dimensional 1H NMR study on Megasphaera elsdenii flavodoxin in the reduced state; sequential assignments. Eur. J. Biochem. 187, 521-541.
18. Evans, P. A., Kautz, R. A., Fox, R., and Dobson, C. M. (1989) A magnetization transfer NMR study of the folding of staphylococcal nuclease. Biochemistry 28, 362-370.
19. Wagner, G. (1983) Characterisation of the distribution of internal motions in the bovine pancreatic trypsin inhibitor using a large number of internal NMR probes. Q. Rev. Biophys. 16, 1-58.
20. Neuhaus, D. and Williamson, M. P. (1989) The Nuclear Overhauser Effect in Structural and Conformational Analysis. VCH, New York.
21. Marion, D., Ikura, M., Tschudin, R., and Bax, A. (1989) Rapid recording of 2D NMR spectra without phase cycling. Application to the study of hydrogen exchange in proteins. J. Magn. Reson. 85, 393-399.
22. Keeler, J. and Neuhaus, D. (1985) Comparison and evaluation of methods for two-dimensional NMR spectra with absorption-mode lineshapes. J. Magn. Reson. 63, 454-472.
23. Williamson, M. P. (1987) Guidelines for the design of kinetic NOE experiments from computer simulation. Magn. Reson. Chem. 25, 356-36
24. Borgias, B. A. and James, T. L. (1989) Two-dimensional nuclear Overhauser effect: complete relaxation matrix analysis. Methods Enzymol. 176, 169-183.
25. Otting, G. and Wüthrich, K. (1990) Heteronuclear filters in 2D [1H, 1H] NMR spectroscopy. Combined use with isotopic labelling for studies of macromolecular conformation and intermolecular interactions. Q. Rev. Biophys. 23, 39-96.
26. Griesinger, C., Sorensen, W., and Ernst, R. R. (1989) Novel three-dimensional NMR techniques for studies of peptides and biological macromolecules. J. Am. Chem. Soc. 109, 7227-7228.
27. Englander, S. W. and Wand, A. J. (1987) Main-chain-directed strategy for the assignment of 1H NMR spectra of proteins. Biochemistry 26, 5953-5958.
28. Hyberts, S. G., Marki, W., and Wagner, G. (1987) Stereospecific assignments of side-chain protons and characterisation of torsion angles in Eglin c. Eur. J. Biochem. 164, 625-635.
29. Guntert, P., Braun, W., Billeter, M., and Wüthrich, K. (1989) Automated stereospecific 1H assignments and their impact on the precision of protein structure determinations in solution. J. Am. Chem. Soc. 111, 3997-4004.
30. Neri, D., Szyperski, T., Otting, G., Senn, H., and Wüthrich, K. (1989) Stereospecific nuclear magnetic resonance assignments of the methyl groups of valine and leucine in the DNA-binding domain of the 434 repressor by biosynthetically directed fractional 13C labelling. Biochemistry 28, 7510-7516.
31. Weber, P. L., Morrison, R., and Hare, D. (1988) Determining stereo-specific 1H nuclear magnetic resonance assignments from distance geometry calculations. J. Mol. Biol. 204, 483-487.
powerful method of sequential proton resonance assignment in proteins using relayed 15N-1H multiple quantum coherence spectroscopy. FEBS Lett. 243, 93-98.
33. Fesik, S. W., Eaton, H. L., Olejniczak, E. T., and Zuiderweg, E. R. P. (1990) 2D and 3D NMR spectroscopy employing 13C-13C magnetisation transfer by isotropic mixing. Spin system identification in large proteins. J. Am. Chem. Soc. 112, 886-888.
34. Wang, J., Hinck, A. P., Loh, S. N., and Markley, J. L. (1990) Two-dimensional NMR studies of staphylococcal nuclease. Sequence-specific assignments of carbon-13 and nitrogen-15 signals from the nuclease H124L-thymidine 3',5'-bisphosphate-Ca2+ ternary complex. Biochemistry 29, 102-113.
35. Ikura, M., Kay, L. E., Tschudin, R., and Bax, A. (1990) Three-dimensional NOESY-HMQC spectroscopy of a 13C labelled protein. J. Magn. Reson. 86, 204-209.
36. Williamson, M. P., Havel, T. F., and Wüthrich, K. (1985) Solution conformation of proteinase inhibitor IIA from bull seminal plasma by 1H nuclear magnetic resonance and distance geometry. J. Mol. Biol. 182, 295-315.
37. Wagner, G., Braun, W., Havel, T. F., Schaumann, T., Go, N., and Wüthrich, K. (1987) Protein structures in solution by nuclear magnetic resonance and distance geometry. The polypeptide fold of the basic pancreatic trypsin inhibitor determined using two different algorithms, DISGEO and DISMAN. J. Mol. Biol. 196, 611-639.
38. Braun, W. and Go, N. (1985) Calculation of protein conformation by proton-proton distance constraints, a new efficient algorithm. J. Mol. Biol. 186, 611-626.
39. Crippen, G. M. and Havel, T. F. (1988) Distance Geometry and Molecular Conformation. Wiley, New York.
40. Boelens, R., Koning, T. M. G., and Kaptein, R. (1988) Determination of biomolecular structures from proton-proton NOE's using a relaxation matrix approach. J. Mol. Struct. 173, 299-311.
41. Jardetzky, O. and Roberts, G. C. K. (1981) NMR in Molecular Biology. Academic, New York.
42. Markley, J. L. (1975) Observation of histidine residues in proteins by means of NMR spectroscopy. Acc. Chem. Res. 8, 70-80.
43. Evans, P. A., Dobson, C. M., Kautz, R. A., Hatfull, G., and Fox, R. (1987) Proline isomerism in staphylococcal nuclease characterised by NMR and site directed mutagenesis. Nature 329, 266-268.
44. Shulman, R. G., Hopfield, J. J., and Ogawa, S. (1975) Allosteric interpretation of haemoglobin properties. Q. Rev. Biophys. 8, 325-420.
45. Ho, C. and Russu, I. M. (1987) How much do we know about the Bohr effect in haemoglobin? Biochemistry 26, 6299-6305.
46. Radford, S. E., Laue, E. D., Perham, R. N., Miles, J. S., and Guest, J. R. (1987) Segmental structure and protein domains in the pyruvate dehydrogenase multienzyme complex of Escherichia coli. Biochem. J. 247, 641-649.
H. R., Ernst, R. R., and Wüthrich, K. (1985) Polypeptide-metal cluster connectivities in metallothionein-2 by novel 1H-113Cd heteronuclear 2D NMR experiments. J. Am. Chem. Soc. 107, 6847-6851.
48. Clore, G. M. and Gronenborn, A. (1982) Theory and applications of the transferred NOE to the study of the conformations of small ligands bound to proteins. J. Magn. Reson. 48, 402-417.
49. Roder, H. (1989) Structural characterisation of protein folding intermediates by proton magnetic resonance and hydrogen exchange. Methods Enzymol. 176, 446-473.
50. Udgaonkar, J. B. and Baldwin, R. L. (1989) NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature 335, 694-699.
51. Roder, H., Elove, G., and Englander, S. W. (1989) Structural characterisation of folding intermediates in cytochrome c by hydrogen exchange labelling and NMR. Nature 335, 700-704.
Peptide Structure Determination by NMR
Michael P. Williamson
1 Introduction
The difference between peptides and proteins (the subject of Chapter 2) is that peptides are molecules too small to have a "globular" structure. This means that the spectral assignment process is often much simpler for peptides than it is for proteins, because there are fewer signals present in peptide spectra; on the other hand, peptides seldom adopt a single, well defined structure in solution, which makes the interpretation of structural data more contentious for peptides than it is for proteins. The emphasis in this chapter is therefore different from that in Chapter 2. The acquisition of structurally relevant data is straightforward, given a familiarity with modern two-dimensional (2D) NMR techniques, and is given less emphasis here, but the analysis of the data is seen as the key to obtaining a meaningful answer, and is the area where experience and expertise are most necessary.
The difficulty in dealing with flexible structures by NMR derives from the fact that intramolecular motion (i.e., rotation about single bonds) causes most NMR parameters, such as NOE, coupling constant, and chemical shift, to be averaged, rather than giving a superposition of values, as is seen in many other branches of spectroscopy. This has a number of consequences. First, it is not at all obvious from inspection of a spectrum whether one conformation or many conformations are present (see Note 1). Second, the different conformational parameters are averaged in a very nonlinear way (1). Thus, for example, the size of the NOE depends not on the internuclear distance r, but on $r^{-6}$, so that a very close contact between two protons in a minor conformer can still give a very strong NOE after conformational averaging. Third, it is usually impossible to use the observed averaged NMR parameters to deduce the nature of the constituent conformers; in other words, the problem is underdetermined. The consequence of these three points is that, in deriving peptide structures from NMR data, it is usually assumed that only one conformation is present, often without any serious attempt to justify the assumption. Analysis of the data will then often produce a structure that may fit the data, but may possibly have very little relation to the actual conformations present. On the other hand, if one is more cautious and assumes that more than one conformation may be present, how does one limit the choice of possible conformations? In the rest of this chapter, we describe methods for tackling the problem, based on the plan:
1. Acquire as much and as varied information as possible;
2. Analyze it to see if it could fit a single conformation; and
3. Adopt a cautious approach to structure determination, bearing in mind the underdetermined nature of the problem.
Several aspects of this approach have been discussed in reviews (2,3).
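The nonlinear averaging of the NOE can be illustrated numerically. The following is a minimal Python sketch, assuming fast exchange between conformers and a simple population-weighted average of r⁻⁶. The function name, the two-conformer populations, and the distances are hypothetical values chosen for illustration, not data from any peptide.

```python
def effective_noe_distance(populations, distances):
    """Apparent distance implied by a population-weighted <r^-6> average.

    populations: fractional populations of each conformer (must sum to 1)
    distances:   interproton distance in each conformer, in angstroms
    """
    if abs(sum(populations) - 1.0) > 1e-9:
        raise ValueError("populations must sum to 1")
    mean_r6 = sum(p * r ** -6 for p, r in zip(populations, distances))
    return mean_r6 ** (-1.0 / 6.0)


# Hypothetical peptide: 90% extended conformer (5.0 A), 10% folded (2.2 A).
populations = [0.9, 0.1]
distances = [5.0, 2.2]

r_noe = effective_noe_distance(populations, distances)
r_linear = sum(p * r for p, r in zip(populations, distances))

print(f"population-weighted mean distance: {r_linear:.2f} A")
print(f"distance implied by the averaged NOE: {r_noe:.2f} A")
```

With these assumed numbers, the minor conformer dominates: the NOE-derived distance comes out at about 3.2 Å, far shorter than the mean distance of about 4.7 Å, which is the situation described above.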
2 Materials
We assume that a Fourier transform (FT) mode NMR spectrometer is available. The alternative is a continuous-wave (CW) machine, which is generally less sensitive, and is incapable of doing 2D experiments. A 300-MHz instrument would be quite adequate for almost all the experiments described here, although higher field instruments are more sensitive and have better spectral dispersion. Peptides longer than eight to ten residues would benefit greatly from the extra dispersion available from higher field machines. The volume of solution needed for NMR is approx 0.5 mL, and the minimum concentration is approx
The NMR response is proportional to the number of nuclei present. It is therefore important to ensure that the sample is free from any proton-carrying material, including buffers. For work in aqueous solutions, phosphate is a convenient nonproton-carrying buffer. Many transition metal ions broaden NMR signals and should be removed using chelating agents. The solvent used should be fully deuterated, to avoid the need for suppressing the large signal from solvent protons. In protic solvents, such as water or methanol, there is chemical exchange between solvent protons and the amide protons on peptides. The use of D2O or CD3OD therefore removes the very valuable conformational information obtainable from amide protons, and for water and methanol, it is common to put up with the need for solvent suppression and use 90% protic/10% deuterated solvent mixtures, the deuterated component being required for the field-frequency lock. This difficulty is one reason for the popularity of dimethylsulfoxide, in which this solvent exchange reaction cannot occur, as a solvent for peptide work (see Notes and 4).
3 Methods
3.1 Assignment
To complete the sequence-specific assignments, NOEs are usually also necessary, as elaborated in Chapter. There are two ways of collecting NOE information: either using the normal longitudinal cross-relaxation pathways, with 1D NOEs or NOESY, or via the rotating frame NOE, called ROE or CAMELSPIN in the 1D experiment, or ROESY in two dimensions. Both techniques are typically run overnight. ROESY is more prone to produce misleading signals and is also more difficult to set up (4), so NOESY is preferred where possible. However, the biggest factor affecting the size of the NOE is the tumbling rate of the interproton vectors concerned, as shown in Fig. 1. As a rough guide, for moderate or high-field machines operating at room temperature, small peptides in nonviscous solvents have ωτc << 1, and neither NOE nor ROE will work very well; linear peptides in nonaqueous solvents and cyclic peptides in nonviscous solvents have ωτc ~ 1, and only ROE will work; long linear peptides in water and cyclic peptides in dimethylsulfoxide or water have ωτc > 1, and NOE and ROE will both work, although NOE is usually preferable, as previously described. At lower field strengths or higher temperatures, ωτc becomes smaller. If in doubt, it is simpler to try NOESY first, but expert help is strongly advisable for any NOE experiments. If using ROESY, it is advisable to use low spin-lock field strengths and to acquire two spectra with different offsets of the spin-lock field, in order to reduce and identify some of the undesirable signals. The undesirable signals are most troublesome when the transmitter frequency is midway between the two protons giving the ROEs (4). COSY, TOCSY, and NOESY/ROESY are usually sufficient for a complete ¹H assignment; sometimes other techniques are used, particularly ones that do not make use of the NOE, such as COLOC (5).
All these techniques have 1D analogs, but in nearly all cases, the 2D version is simpler to set up (see Note 5). The exact method of implementation of these techniques relies heavily on the instrumentation available, which varies widely.
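The rough guide to choosing between NOESY and ROESY can be expressed as a short calculation of ωτc. The following Python sketch is only illustrative: the function names and the numerical cutoffs (0.3 and 2.0) separating "much less than 1", "about 1", and "greater than 1" are assumptions made here, not values given in the text, and the correlation times in the example are hypothetical.

```python
import math


def omega_tau_c(proton_freq_mhz, tau_c_ns):
    """Return omega * tau_c for a proton frequency (MHz) and a rotational
    correlation time (ns); omega is converted to rad/s."""
    omega = 2.0 * math.pi * proton_freq_mhz * 1e6  # rad/s
    return omega * tau_c_ns * 1e-9


def suggest_experiment(wt, low=0.3, high=2.0):
    """Map omega*tau_c onto the rough guide in the text.

    The cutoffs `low` and `high` are assumed, illustrative boundaries.
    """
    if wt < low:
        return "neither NOE nor ROE is likely to work well"
    if wt <= high:
        return "ROESY (NOE is close to its zero crossing)"
    return "NOESY preferred; ROESY also works"


# Hypothetical cases: (spectrometer frequency in MHz, tau_c in ns)
for freq, tau in [(300, 0.05), (300, 0.5), (500, 2.0)]:
    wt = omega_tau_c(freq, tau)
    print(f"{freq} MHz, tau_c = {tau} ns: omega*tau_c = {wt:.2f} -> "
          f"{suggest_experiment(wt)}")
```

Because ωτc scales with the spectrometer frequency, the same peptide can move between regimes when the field strength or temperature is changed, as noted above.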
3.2 NMR Parameters Available
Some of these parameters are discussed more fully in a review (6)
3.2.1 The NOE
This is generally the most useful parameter available, since it is very sensitive to internuclear distance (intensity proportional to r⁻⁶). As
Fig. 1. Dependence of NOE intensity on ωτc for longitudinal NOE (N) and transverse (rotating frame) NOE (R). ω is the observation frequency (in rad s⁻¹) and τc is the rotational correlation time. The figure depicts the maximum observable value in a 2D experiment from an isolated two-spin system. In practice, NOE values will be smaller, particularly for values of ωτc << 1, because of external relaxation. In 1D experiments, numerical values are slightly more than twice as large and inverted (i.e., R is always positive, whereas N starts off positive and goes negative with increasing ωτc).
2D experiments are generally done overnight (see Note 6). It can sometimes be useful to obtain heteronuclear {NH}-¹³CO NOEs, but these are time-consuming and difficult to interpret (4). NOE intensities from 2D spectra are best measured by integration of crosspeak volume, having first ensured that the baseplane around the crosspeak is corrected (see Notes and 8).
3.2.2 Coupling Constant (³JHNα)
Protons separated by three bonds have signals split by a coupling constant, J, which varies in a somewhat complicated fashion with the angle between the protons, as shown in Fig. 2. For peptide NH protons, ³JHNα is usually measurable directly from the normal 1D spectrum,
Fig. 2. Variation of ³JHNα with dihedral angle φ (plotted from -160° to 160°) for peptides, using the equation of ref. (7), J = 6.4 cos²θ - 1.4 cos θ + 1.9 (θ = |φ - 60°|). Typical values for α-helix, β-sheet, and random coil are indicated on the curve.
resolution, phase-sensitive COSY experiment (8), which is simple to acquire, but tiresome to analyze because of the very large data matrix needed for adequate digitization of crosspeaks
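The angular dependence of ³JHNα can be evaluated directly. The sketch below uses the coefficients 6.4, -1.4, and 1.9 Hz from the calibration of ref. (7), with θ = |φ - 60°|; the function names, the 1° scanning step, and the tolerance are assumptions introduced here for illustration. The second function shows why a measured J seldom defines φ uniquely.

```python
import math


def j_hn_alpha(phi_deg, a=6.4, b=-1.4, c=1.9):
    """3J(HN-Halpha) in Hz for backbone dihedral phi (degrees), using
    J = a*cos^2(theta) + b*cos(theta) + c with theta = |phi - 60|."""
    theta = math.radians(abs(phi_deg - 60.0))
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def phi_candidates(j_observed, tolerance=0.2):
    """All phi values (1-degree grid, -180..180) whose predicted J lies
    within `tolerance` Hz of the observed coupling constant."""
    return [phi for phi in range(-180, 181)
            if abs(j_hn_alpha(phi) - j_observed) <= tolerance]


for label, phi in [("alpha-helix", -57), ("extended", -120), ("phi = +60", 60)]:
    print(f"{label:12s} phi = {phi:5d}  J = {j_hn_alpha(phi):.1f} Hz")

# A large coupling constant restricts phi to negative values, but a
# mid-range value such as 6.5 Hz is compatible with several separate ranges.
print("J = 9.0 Hz:", phi_candidates(9.0))
print("J = 6.5 Hz:", phi_candidates(6.5))
```

Only extreme values of J narrow down φ usefully, which is consistent with the emphasis placed on extreme coupling constants when judging whether a preferred structure exists.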
3.2.3 Temperature Dependence of NH Shifts
Solvent-exposed amide protons shift roughly 0.006-0.008 ppm (6-8 ppb) upfield per degree temperature increase, whereas amide protons hidden from the solvent shift much less (see Note 9). Thus, the temperature dependence of the shift (Δδ/ΔT) is widely used as a measure of the extent of hydrogen bonding of amide protons, with values of <3 ppb/K taken as indicative of well-formed hydrogen bonds (e.g., ref. 9). This is an easy parameter to measure, which is one reason for its popularity. It is important to leave sufficient time for the temperature of the sample to equilibrate after altering the temperature (at least 15 min, depending on the spectrometer). If it has not been done recently, it is also advisable to check that the temperature reading of the spectrometer is accurate, using a reference sample of methanol or ethylene glycol.
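In practice, the temperature coefficient is obtained as the slope of a straight-line fit of amide shift against temperature. A minimal Python sketch follows; the function names are introduced here, the shift data are invented for illustration, and the 3 ppb/K cutoff is the one quoted above.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs temperature, in ppb/K.

    A negative value means the resonance moves upfield on warming.
    """
    n = len(temps_k)
    mean_t = sum(temps_k) / n
    mean_d = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_k)
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(temps_k, shifts_ppm))
    return 1000.0 * sxy / sxx  # ppm/K -> ppb/K


def looks_hydrogen_bonded(coeff_ppb_per_k, cutoff=3.0):
    """Apply the <3 ppb/K criterion to the magnitude of the coefficient."""
    return abs(coeff_ppb_per_k) < cutoff


# Invented example data for two amide protons of a hypothetical peptide.
temps = [295.0, 300.0, 305.0, 310.0, 315.0]
exposed_nh = [8.300, 8.265, 8.230, 8.195, 8.160]   # about -7 ppb/K
shielded_nh = [7.800, 7.790, 7.780, 7.770, 7.760]  # about -2 ppb/K

for name, shifts in [("exposed NH", exposed_nh), ("shielded NH", shielded_nh)]:
    coeff = temperature_coefficient(temps, shifts)
    print(f"{name}: {coeff:.1f} ppb/K, "
          f"hydrogen bonded? {looks_hydrogen_bonded(coeff)}")
```

A fit of this kind also gives a quick check on linearity: systematic curvature in the residuals is the nonlinear behavior that points to multiple conformations or thermal unfolding.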
3.2.4 NH Exchange Rate
in D2O or CD3OD (10). Exchange rates are at their slowest at a pH of 3-3.5; the peptide should therefore be lyophilized from the proton-carrying solvent at this pH, then dissolved in the deuterated solvent, and immediately observed (see Note 10). Faster exchange rates can be detected by 1D saturation transfer or 2D NOESY (see Chapter of this vol.), which are harder to perform than the straightforward exchange experiment and also harder to interpret.
3.2.5 Chemical Shift
In an unstructured peptide, protons have chemical shifts dependent only on the amino acid type. These are known as the random-coil shifts, and their values have been tabulated (11). Chemical shifts very different from these values indicate some form of preferred structure, without any indication of what that structure may be (10). Considerable care is needed, since chemical shifts may be affected by nearby aromatic rings, titratable groups, or hydrogen bonds from side chains. Chemical shifts of ¹⁵N do not seem to be very good indicators of hydrogen bond formation, and ¹³CO shifts seem to be only poor indicators, but it is still too early to say much about the usefulness of heteronuclear shifts.
3.2.6 Solvent Titration
Exposed amide protons are sensitive to the hydrogen-bonding capability of the solvent. Thus, on adding chloroform to a dimethylsulfoxide solution, exposed amide protons will become less hydrogen bonded and shift to higher field (12). The absence of a chemical shift change is indicative of shielding from solvent. Naturally, these arguments are only relevant if the conformation does not change on altering the solvent composition; this is not always an easy point to decide, as discussed in Section 3.3.
Alternatively, solutes, such as shift reagents or free radicals, can be added to perturb resonances in a more or less predictable manner, hopefully without altering the peptide conformation (6). All of these methods are difficult to interpret meaningfully.
3.3 How Many Conformations?
do have structured regions, these are likely to be in fast exchange with random-coil conformations (see discussion in Section 3.4.2). Cyclic peptides tend to be more structured, although a range of conformations can often exist in fast exchange. Side chains are likely to be in fast exchange between the two or three staggered conformations. A single conformation can be assumed if all the following hold:
1. Most structural parameters indicate a preferred structure. In other words, there should be non-random-coil NOEs, extreme values of J (<6 or >8.5 Hz), low temperature coefficients (<3 ppb/K) and slow exchange of any NH implicated in a hydrogen bond, and some unusual chemical shift values (differing from random-coil values by >0.4 ppm for NH and >0.2 ppm for other protons, which cannot be explained by ring-current or titration effects). Diastereotopic protons (especially Gly CαH) should have different chemical shifts and coupling constants. Conformational preferences of side chains are only worth considering if diastereotopic CαH have different chemical shifts and coupling constants.
Another way of putting this is to say that the amino acids in the sequence should show sequence-dependent differences in their coupling constants, NOEs, and so on. A good example of this is quoted by Kessler (3): in cyclo(Gly5), all the Gly NH are equivalent, with Δδ/ΔT equal to 2.96 ppb/K, whereas in cyclo(Ala-Gly4), all five residues are distinguishable, having temperature coefficients (starting with Ala) of 4.16, 2.45, 3.46, 3.21, and 1.87 ppb/K, respectively, thereby providing evidence for a preferred structure (but not necessarily only one single preferred structure).
2. Temperature changes do not alter the parameters, or at least alter them in a linear fashion. This applies particularly to Δδ/ΔT, for which a nonlinear variation is indicative of multiple conformations, or at least of the unfolding of a folded conformation with increase in temperature.
3. All structural parameters are self-consistent; thus, Δδ/ΔT, NH exchange, and solvent titration should all implicate the same amide protons.
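The numerical criteria in this checklist lend themselves to a simple screen. The Python sketch below applies them to the temperature coefficients quoted in the text for cyclo(Ala-Gly4); the function names and residue labels (Ala1 to Gly5) are hypothetical and introduced only for illustration. Passing such a screen is evidence for a preferred structure, not proof of a single one.

```python
# Temperature coefficients (ppb/K) quoted in the text for cyclo(Ala-Gly4),
# listed starting with Ala. The residue labels are illustrative only.
coefficients = {"Ala1": 4.16, "Gly2": 2.45, "Gly3": 3.46,
                "Gly4": 3.21, "Gly5": 1.87}


def low_coefficient_residues(coeffs, cutoff=3.0):
    """Residues whose temperature coefficient is below the cutoff (ppb/K)."""
    return [res for res, value in coeffs.items() if abs(value) < cutoff]


def coefficient_spread(coeffs):
    """Range of coefficients; a spread near zero means the residues are
    indistinguishable, as for cyclo(Gly5)."""
    values = list(coeffs.values())
    return max(values) - min(values)


def has_extreme_coupling(j_hz, low=6.0, high=8.5):
    """True if J lies outside the 6-8.5 Hz window quoted in the text."""
    return j_hz < low or j_hz > high


print("candidate hydrogen-bonded NH:", low_coefficient_residues(coefficients))
print(f"spread of coefficients: {coefficient_spread(coefficients):.2f} ppb/K")
print("J = 9.2 Hz extreme?", has_extreme_coupling(9.2))
print("J = 7.0 Hz extreme?", has_extreme_coupling(7.0))
```

The same amide protons picked out here should also be the ones implicated by NH exchange and solvent titration if the self-consistency requirement is to be met.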
Peptide conformations can depend markedly on solvent composition (see Note 3). If structural parameters (e.g., ³J and NOE) do not change as the solvent composition is altered, it can be assumed that the conformational equilibrium has not altered, and thus almost certainly a single conformation is indicated. If they change, considerable caution is called for. At the very least, it shows that several conformations are accessible, while leaving open the question of how many con-
warning given earlier: It is very unwise to assume that only a single conformation is present, without careful examination of all available data.
3.4 Structure Analysis
Here, as elsewhere in this chapter, the golden rule is that as many parameters as possible should be used to reach structural conclusions. Non-NMR parameters, such as CD and fluorescence quenching, should also be used if applicable (see refs. 13-17). We treat the simpler case first, where only one conformation is present in solution.
3.4.1 Single Conformation
There are several ways of deriving structures from NMR data. Distance geometry or molecular dynamics can be applied, as described in Chapter 2, but these methods are often unsatisfactory for cyclic peptides because of the restraints imposed by the ring system. (Not only are some programs incapable of handling cyclic systems adequately, but the high energy barriers to internal rotation of the backbone in small cyclic peptides can mean that dynamics calculations cannot access all the available conformations.) The normal approach is to go through each of the possible types of local structure in turn and see if it fits the data. This approach is risky, since it is easy to overlook other possibilities once a conformation has been identified and to ignore conflicting data. We stress again that a claim for a single conformation requires that all structural data be satisfied by the conformation postulated. A promising new approach, particularly for cyclic systems, is to calculate all the low-energy backbone conformations accessible and see which one fits the NMR data best. This approach has the major advantage that it is far less subjective than the manual approach, but it is not yet generally available.
(18), which involve a cis amide bond, and are usually only found with proline or N-alkyl amino acids. Cis amide bonds are readily recognizable by a short distance between Cα protons on either side of the bond. Particularly in cyclic pentapeptides, γ-turns and reverse γ-turns can also be found, usually with a bulky residue (e.g., Pro, Val, Phe, or Aib) in the turn. The reverse γ-turn is less sterically strained than the γ-turn for L-amino acids. These turns are depicted in Fig. 3, and some characteristic angles and distances are given in Tables 1 and 2. In crystal structures of proteins (18), local geometries can differ considerably from those used to produce the data in Tables 1 and 2, implying that the distances and angles in real peptides may vary quite markedly from those given in the tables. In practice, it is usually Δδ/ΔT, NOE, and J that are used to identify structural features, but many other techniques should be used to confirm the conclusions reached using these parameters. Δδ/ΔT is much quicker to measure than the NOE, and is therefore more often quoted (especially in earlier work), although it is not as effective at distinguishing different secondary structures as the NOE (9).
3.4.2 Multiple Conformations
As stressed earlier and in many other places (e.g., 4), it is only when a single conformation is present that structures can be derived with any degree of reliability. Because of the averaging of NMR properties by intramolecular reorientation, NMR cannot easily be used to characterize multiple conformers, unless some independent knowledge or assumptions are used as to the nature of the conformations present. For example, imagine a flexible peptide for which most of the NMR parameters are fit by a type I turn. Assume that a better fit can be obtained by including a small amount of a type II′ turn, plus smaller proportions of conformations involving hydrogen bonds from side chains to backbone atoms. By suitable juggling of the populations of these conformers, the data can be fit very well, but this has meant introducing a large number of experimentally undetermined parameters (i.e., conformations and populations): Anything can be fit in this way, provided enough new conformations are introduced, and the exercise is therefore largely meaningless.
Fig. 3. (A) A β-turn, (B) A γ-turn.
Table 1
Characteristic Angles and Coupling Constants in Secondary Structure

                 Residue 2                  Residue 3
Structure        φ       ψ       ³JHNα(a)   φ       ψ       ³JHNα(a)
α-Helix          -57     -47     3.9
Random coil                      6.5-8
Turn I           -60     -30     4.6        -90
Turn II          -60     120     4.6        80              6.2
Turn II′         60      -120    6.9        -80
Turn γ           80      -65     6.2
Reverse γ        -80     65      6.7

(a) ³JHNα values (in Hz) are calculated using the equation of ref. (7).
peptides. This is by definition a mobile structure, but a wide range of information from NMR and elsewhere implies that it is predominantly an extended (β-sheet-like) conformation. However, because of its mobility and the nonlinear averaging of NMR parameters (particularly the NOE), it has some characteristics of more folded structures; for example, low-intensity dNN NOEs are commonly found in random-coil peptides.
Structure: NH2-α2, NH2-α′2, NH3-NH2, NH3-α2, NH3-α′2, NH3-α3, NH3-α′3, NH4-NH3, NH4-α3, NH4-α′3
α-Helix: 2.7, 2.8, 3.5
Random coil: Medium, Long, Short
Turn I: 2.7, 2.3, 2.8, 3.4, 2.9, 2.9, -, 2.6, 3.3, -
Turn II: 2.8, -, 4.5, 2.2, -, 2.3, 2.9, 2.6, 3.3, 3.3
Turn II′: 2.3, 2.8, 4.5, 3.2, 2.2, 2.9, -, 2.8, 3.3
Turn γ: 2.3, -, 3.9, 3.6
Reverse γ: 2.9, -, 3.8, 2.7
such a model, it is reasonable to assume that such a conformational equilibrium is occurring, particularly if solvent titration can be used to shift the conformation from random coil to the folded structure. If the NMR data do not fit a simple random coil ⇌ folded structure model, then it is very hard to deduce anything reliable. Sometimes, comparison of different peptide sequences can be useful (10), but sequence comparisons can be misleading, for example, if interactions with the side chain lead to the perturbation of Δδ/ΔT (9). There is no established way of dealing with the problem of multiple conformations. One promising method, particularly for cyclic peptides, is to obtain the conformational models either from crystal structures or from molecular mechanics or both (1), and use the NMR data to assess the populations of each conformation. In a similar approach, Nikiforovich et al. (19) calculated a large number (nearly 15,000) of accessible conformations of the linear peptide angiotensin II, which were categorized into 12 families. They then used NMR and fluorescence data to give statistical weights to the different conformers. No single conformer could adequately describe the conformation of the peptide, but five different “indispensable” conformers were shown to be the minimum number necessary to account for the experimental data adequately.
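For the simplest case of a random coil ⇌ folded equilibrium in fast exchange, a linearly averaged parameter such as a coupling constant or a temperature coefficient gives an estimate of the folded population, and the model can be tested by checking whether different parameters give the same answer. The Python sketch below assumes such linear averaging; the function names, the limiting values for the two states, the observed values, and the 0.1 agreement tolerance are all hypothetical. NOE intensities are deliberately left out because they average nonlinearly.

```python
def folded_fraction(observed, coil_value, folded_value):
    """Population of the folded state from a linearly averaged parameter:
    observed = p * folded_value + (1 - p) * coil_value."""
    if folded_value == coil_value:
        raise ValueError("parameter does not distinguish the two states")
    return (observed - coil_value) / (folded_value - coil_value)


def consistent_two_state(fractions, tolerance=0.1):
    """True if all estimates are physical (0..1) and agree within tolerance."""
    physical = all(0.0 <= p <= 1.0 for p in fractions)
    return physical and (max(fractions) - min(fractions)) <= tolerance


# Hypothetical residue: J averages between 7.0 Hz (coil) and 4.6 Hz (turn);
# the NH temperature coefficient between 7.0 and 2.0 ppb/K.
p_from_j = folded_fraction(observed=6.0, coil_value=7.0, folded_value=4.6)
p_from_coeff = folded_fraction(observed=5.0, coil_value=7.0, folded_value=2.0)

print(f"folded fraction from J:           {p_from_j:.2f}")
print(f"folded fraction from coefficient: {p_from_coeff:.2f}")
print("two-state model plausible?",
      consistent_two_state([p_from_j, p_from_coeff]))
```

If the estimates disagree badly, or fall outside the range 0 to 1, the two-state model is not supported and, as stated above, little can be deduced reliably.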
4 Notes
1. Cis/trans isomerization about amide bonds is usually slow enough to lead to two sets of signals in the NMR spectrum. It is particularly common for proline and N-alkyl amino acids. If the rate of exchange between the two isomers is slow enough, they can be treated as two separate compounds. However, if it is faster than 1/T1 (the spin-lattice relaxation rate), NOEs will be partially or completely averaged between the two conformations, even though the two conformations give separate NMR signals (4).
2. The upper limit to the concentration suitable for NMR experiments is determined either by solubility or by intermolecular interaction, but in any case, measurement of the concentration dependence of the chemical shifts, Δδ/ΔT, and coupling constants is recommended to check that there are no overt concentration-dependent effects. Chloroform is particularly prone to aggregation phenomena.
3. The choice of solvent is crucial to a meaningful result. Because of their flexibility, peptides can often adopt different structures in different solvents. It then becomes debatable what the significance of a structure is,
3.3 of this chapter, it is in any case advisable to use several solvents or solvent mixtures to obtain a more complete picture of the conformational heterogeneity of the peptide. There is no general rule as to the “best” solvent to use. Most peptides are normally found in aqueous environments, and water would therefore seem an obvious choice. However, peptides act at protein surfaces or in membranes, which are less polar, and therefore less polar solvents may give a more relevant result; less polar solvents also tend to induce more structure in peptides, because the hydrogen-bonding potential of the solvent is weaker. Dimethylsulfoxide is a common choice and also a good solvent for peptides, whereas either methanol or 2,2,2-trifluoroethanol is often added to aqueous solutions to induce helix formation, the assumption being that the helices seen in such solvent systems are representative of the helices formed in their native environment (usually in membranes) (20). Water/dimethylsulfoxide mixtures have been suggested for use at temperatures below 273 K, as a way of increasing τc (in order to make NOESY crosspeaks larger) and to freeze out some conformational motion (21). It has been suggested that chloroform induces conformations of enkephalin analogs with a better correlation to their activities than does dimethylsulfoxide (22), whereas a study of somatostatin analogs (23) showed that conformations in dimethylsulfoxide are good predictors of the presence or absence of biological activity, although structure-activity relationships are better when conformations in aqueous solution are used. Different receptor environments are probably best modeled by different solvent systems, but many more structure-activity studies are necessary before any general conclusions can be drawn in this area.
4. Samples can be recovered from chloroform and methanol by solvent evaporation in a stream of dry nitrogen, and from water by lyophilization. Lyophilization can also be used to recover samples from dimethylsulfoxide, but only if it is frozen in a thin film and often only if water is added. Alternatively, desalting columns provide a rapid way of exchanging dimethylsulfoxide to water for subsequent lyophilization.
5. All the 2D experiments described here, with the exception of COLOC, should be performed in the phase-sensitive mode. COSY should be run as the double-quantum filtered version.
7. The NOE intensity is only proportional to r⁻⁶ at short mixing times, because at longer times, spin diffusion and magnetization decay produce intensity distortion. This topic is discussed at length in Chapter and ref. 4, but as a rough guide for peptides, mixing times of longer than 150 ms should be avoided for quantitative work.
8. When the exchange rate between conformers is faster than the overall rotational correlation rate, which is often the case (particularly for side chain rotation), the NOE should be averaged not as <r⁻⁶> but as <r⁻³>² (4).
9. Low values of amide temperature coefficients can arise from hydrogen bonding to side chains, such as glutamate H-bonding to its own amide, or aspartate H-bonding to a residue ahead in the sequence. Obviously, pH will have a marked effect on H-bonding from side chains. In some cases, anomalous results can be obtained; for example, in a 22-residue peptide from the Herpes simplex virus glycoprotein D-1 antigenic domain, the amide proton of Val-14 has a very high temperature coefficient, but is the most slowly exchanging amide proton in the peptide (10). The large coefficient was ascribed to loss of the local structure around Val-14 on increasing the temperature.
10. The exchange rate of a solvent-exposed amide proton in water depends on the nature of the side chain on either side of it. The sequence effects have been tabulated (24) and confirmed by numerous experiments since. Most sequences give exchange rates within a 10-fold range of the value for -Ala-Ala-. The major exceptions are residues at either terminus of the peptide, and His⁺, which can give a base-catalyzed exchange rate in the peptide -His⁺-His⁺- some 300 times faster than that in -Ala-Ala-.
References
1. Kessler, H., Griesinger, C., Lautz, J., Müller, A., van Gunsteren, W. F., and Berendsen, H. J. C. (1988) Conformational dynamics detected by nuclear magnetic resonance NOE values and J coupling constants. J. Am. Chem. Soc. 110, 3393-3396.
2. Kessler, H. and Bermel, W. (1986) Conformational analysis of peptides by two-dimensional NMR spectroscopy, in Applications of NMR Spectroscopy to Problems in Stereochemistry and Conformational Analysis (Takeuchi, Y. and Marchand, A. P., eds.), VCH, Weinheim, pp. 179-205.
3. Kessler, H. (1982) Peptide conformations. Part 19. Conformation and biological effects of cyclic peptides. Angew. Chem. Int. Ed. 21, 512-523.
4. Neuhaus, D. and Williamson, M. P. (1989) The Nuclear Overhauser Effect in Structural and Conformational Analysis. VCH, Weinheim.
5. Kessler, H., Griesinger, C., and Lautz, J. (1984) Determination of the connectivities of weak proton-carbon couplings with a variation of the two-dimen-
6. Smith, J. A. and Pease, L. G. (1980) Reverse turns in peptides and proteins. CRC Crit. Rev. Biochem. 8, 315-399.
7. Pardi, A., Billeter, M., and Wüthrich, K. (1984) Calibration of the angular dependence of the amide proton-Cα proton coupling constant, ³JHNα, in a globular protein. J. Mol. Biol. 180, 741-751.
8. Marion, D. and Wüthrich, K. (1983) Application of phase sensitive two-dimensional correlated spectroscopy (COSY) for measurements of ¹H-¹H spin-spin coupling constants in proteins. Biochem. Biophys. Res. Commun. 113, 967-974.
9. Dyson, H. J., Rance, M., Houghten, R. A., Lerner, R. A., and Wright, P. E. (1988) Folding of immunogenic peptide fragments of proteins in aqueous solution. I. Sequence requirements for the formation of a reverse turn. J. Mol. Biol. 201, 161-200.
10. Williamson, M. P., Hall, M. J., and Handa, B. K. (1986) ¹H-NMR assignment and secondary structure of a Herpes simplex virus glycoprotein D-1 antigenic domain. Eur. J. Biochem. 158, 527-536.
11. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids. Wiley, New York.
12. Urry, D. W. and Long, M. M. (1976) Conformations of the repeat peptides of elastin in solution: an application of proton and carbon-13 magnetic resonance to the determination of polypeptide secondary structure. CRC Crit. Rev. Biochem. 4, 1-45.
13. Urry, D. W. (1985) Absorption, circular dichroism and optical rotatory dispersion of polypeptides, proteins, prosthetic groups and biomembranes, in Modern Physical Methods in Biochemistry (Neuberger, A. and Van Deenen, L. L. M., eds.), Elsevier, Amsterdam.
14. Drake, A. F. (1993) Optical spectroscopy, in Methods in Molecular Biology: Protocols for Optical Spectroscopy and Macroscopic Techniques (Jones, C., Mulloy, B., and Thomas, A. H., eds.), Humana, Totowa, NJ, in press.
15. Varley, P. G. (1993) Fluorescence spectroscopy, in Methods in Molecular Biology: Protocols for Optical Spectroscopy and Macroscopic Techniques (Jones, C., Mulloy, B., and Thomas, A. H., eds.), Humana, Totowa, NJ, in press.
16. Drake, A. F. (1993) Circular dichroism, in Methods in Molecular Biology: Protocols for Optical Spectroscopy and Macroscopic Techniques (Jones, C., Mulloy, B., and Thomas, A. H., eds.), Humana, Totowa, NJ, in press.
17. Haris, P. I. and Chapman, D. (1993) Analysis of polypeptide and protein structure using Fourier Transform infrared spectroscopy, in Methods in Molecular Biology: Protocols for Optical Spectroscopy and Macroscopic Techniques (Jones, C., Mulloy, B., and Thomas, A. H., eds.), Humana, Totowa, NJ, in press.
18. Richardson, J. S. (1981) The anatomy and taxonomy of protein structures. Adv. Prot. Chem. 34, 167-339.
19. Nikiforovich, G. V., Vesterman, B., Betins, J., and Podins, L. (1987) The space structure of a conformationally labile oligopeptide in solution: angiotensin. J. Biomol. Struct. Dynamics 4, 1119-1135.
20. Bazzo, R., Tappin, M. J., Pastore, A., Harvey, T. S., Carver, J. A., and Campbell, I. D. (1988) The structure of melittin. A ¹H-NMR study in methanol. Eur. J.
21. Motta, A., Picone, D., Tancredi, T., and Temussi, P. A. (1987) NOE measurements on linear peptides in cryoprotective aqueous mixtures. J. Magn. Reson. 75, 364-370.
22. Temussi, P. A., Tancredi, T., Pastore, A., and Castiglione-Morelli, M. A. (1987) Experimental attempt to simulate receptor site environment. A 500-MHz ¹H nuclear magnetic resonance study of enkephalin amides. Biochemistry 26, 7856-7863.
23. Wynants, C., Coy, D. H., and van Binst, G. (1988) Conformational study of superactive analogues of somatostatin with reduced ring size by ¹H NMR. Tetrahedron 44, 941-973.
24. Molday, R. S., Englander, S. W., and Kallen, R. G. (1972) Primary structure
High-Resolution NMR of DNA and Drug-DNA Interactions
Jill Barber, Helen F. Cross, and John A. Parkinson
1 Introduction
The advantage of NMR over most other spectroscopic techniques lies in the ability to gain structural and dynamic information at atomic resolution. Every nucleus with spin gives rise to a signal that is characterized by a number of parameters (chemical shift, J-couplings, relaxation data, and NOE connectivities) that can be used to obtain quite detailed structural information about the molecule under study. They can also be used to determine kinetic properties, for example, the interconversion rates of different conformations of a molecule and the exchange rate of free with bound ligand on a macromolecule. NMR has been widely used to study both static and dynamic aspects of DNA structure and drug-DNA interactions.
Several atomic nuclei are available for the study of DNA by NMR. ¹H is the most common, but ³¹P NMR is especially useful for studying the effects of ligand binding on the phosphate groups of DNA. The simplicity and large chemical shift range of ¹³C spectra, relative to ¹H spectra, sometimes make this the nucleus of choice. Other nuclei that may be considered are ¹⁵N, a good nucleus if isotopic enrichment of the DNA is possible; ²H and ¹⁴N, which are quadrupole nuclei and only really suitable for specialized solid-state studies; ¹⁷O and ¹⁸O, which may be detected indirectly via isotopic shifts in the ¹³C or ³¹P spectrum; and ³H, the best NMR nucleus of all in terms of sensitivity, the use of which is, sadly, almost completely prohibited by financial and safety considerations.
NMR spectroscopy can, if necessary, be used as an alternative to X-ray crystallography. On rare and much publicized occasions, it has even been used to correct information obtained by X-ray diffraction. In general, however, when crystals are available, it is easier to obtain precise information, such as absolute stereochemistry and interatomic distances, from diffraction data than from NMR data. The advantage of the NMR experiment is its versatility. Information can be obtained at various temperatures, and solvents of different pH, ionic strength, or dielectric constant can be used. NMR is a particularly good technique for the study of interactions of small molecules with macromolecules, such as DNA; the effects of changing experimental conditions can be monitored relatively easily, and there is a wealth of conformational and dynamic information to be extracted.
The technique, naturally, is not without disadvantages. The most obvious drawback is the cost of a suitable spectrometer (hundreds of thousands of pounds). If the money is available, one then has to satisfy the inherent insensitivity of the technique; 100 μM solutions are usually the minimum requirement (i.e., approx 0.4 mg of a decamer duplex). The most fundamental problem, however, is that of the broad lines associated with NMR spectra of large molecules.
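The quoted figure of roughly 0.4 mg for a decamer duplex follows from simple arithmetic. The Python sketch below reproduces it under two assumptions that are not stated in the text: a sample volume of 0.6 mL and an average mass of about 330 g/mol per nucleotide. The function and constant names are introduced here for illustration.

```python
def sample_mass_mg(conc_molar, volume_l, molar_mass_g_mol):
    """Mass of solute (mg) needed for a given concentration and volume."""
    return conc_molar * volume_l * molar_mass_g_mol * 1000.0


# Assumed values (not from the text): ~330 g/mol per nucleotide, 0.6 mL sample.
NUCLEOTIDE_MASS_G_MOL = 330.0
SAMPLE_VOLUME_L = 0.6e-3

# A decamer duplex contains two strands of ten nucleotides each.
duplex_mass_g_mol = 2 * 10 * NUCLEOTIDE_MASS_G_MOL

mass = sample_mass_mg(100e-6, SAMPLE_VOLUME_L, duplex_mass_g_mol)
print(f"decamer duplex at 100 uM in 0.6 mL: {mass:.2f} mg")
```

The same function can be reused to estimate how much material a longer oligonucleotide or a different sample volume would require.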
Figure 1A shows part of a 270 MHz ¹H NMR spectrum of calf thymus DNA. Uninterpretable lumps like this are typical of NMR spectra of large molecules. For NMR spectroscopy of DNA (unlike proteins), this problem can be alleviated rather easily by the use of smaller fragments of DNA. Sonicated or sheared DNA may be used, but most usually synthetic self-complementary oligonucleotides of 4-
Fig. 1. (A) Calf thymus DNA, aromatic region. (B) d(CGCGAATTCGCG)2·Hoechst 33258 1:1 complex.
drug are, of course, normally observed, and, when the DNA fragment is self-complementary, dyad symmetry may be lifted, doubling many of the peaks.
Figure 2 shows a short section of B DNA illustrating the principal modes of binding of drugs. Many antitumor drugs are intercalators. They slip between the base pairs of DNA. In order to do this, they need to be flat, generally containing a number of fused aromatic rings. Ethidium bromide (Structure 1), acridines (Structure 2), and actinomycin D (Structure 3, which appears as part of Fig. later in this chapter) are familiar intercalators. Intercalators have limited ability to read sequence, and even those used as drugs are usually very toxic.
Drugs that bind in the minor groove of DNA are able to read sequence to some extent. Most minor-groove binders bind selectively in AT-rich regions of DNA, but there have been attempts, meeting with increasing success (14), to design molecules that read sequence from the minor groove.
Fig. 2. A short section of DNA showing schematically the modes of binding of intercalators (which fit between base pairs), minor-groove binders, and major-groove binders.
Structure 1 (ethidium bromide). Structure 2 (acridine).
Table 1
Typical Chemical Shift Ranges for Proton Resonances in NMR Spectra of Nucleic Acids

Proton type                     Expected chemical shift(a)   Distinguishing features
T CH3                           1.00-2.00 ppm                Sharp singlet
Sugar 2'H and 2''H              2.00-3.00 ppm
5'-terminal 5'H and 5''H        70 ppm
Sugar 5'H, 5''H, and 4'H        4.00-4.50 ppm
Sugar 3'H                       4.50-5.20 ppm
Sugar 1'H                       5.30-6.20 ppm
C C5H                           5.30-6.20 ppm                3J = 10-Hz doublet
A C2H, C8H; G C8H; T C6H        6.50-8.20 ppm
C C6H                           6.50-8.20 ppm                3J = 10-Hz doublet
C amino N4H (1)(b)              6.40-6.80 ppm                Exchange out in D2O
C amino N4H (2)(b)              30-8.50 ppm                  Exchange out in D2O
G imino N1H(b)                  12.50-13.00 ppm              Exchange out in D2O
T imino N3H(b)                  13.50-14.00 ppm              Exchange out in D2O

(a) Chemical shifts relative to internal TSP.
(b) For Watson-Crick base pairs.
2 Measurement and Interpretation of NMR Spectra of DNA and DNA-Ligand Complexes
2.1 Sample Conditions
It is generally accepted that, at low concentrations and in low-ionic-strength solutions, self-complementary oligonucleotides tend to form hairpin loops as well as double helices (6). Oligonucleotides required for the NMR study of double-helical structures in solution are therefore usually dissolved in a buffer containing 100 mM NaCl and 10 mM phosphate. At room temperature (295 K), such a double helix produces a characteristic 1H NMR response, details of which are shown in Table 1.
2.2 1D NMR
single-stranded DNAs is usually manifest as a shift of 1H resonances to higher frequencies. A plot of chemical shift change vs temperature for any single resonance will usually yield the melting temperature, Tm, at which half the population has converted to a second form. Dissociation enthalpies for the duplex to hairpin and duplex to single-strand conversions can also be derived using such data (7,8). Clearly, variation of solvent and solute composition can markedly alter the global DNA structure; for example, conversion of B-DNA to left-handed Z-DNA can be effected by the addition of methanol to an aqueous sample (9).
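As an illustration of how a melting temperature can be read from a shift-vs-temperature plot, the sketch below locates the temperature at which a resonance lies midway between its two limiting shifts. It assumes a simple two-state transition with flat baselines, and that the first and last data points sit on the low- and high-temperature plateaus; the function name and the data are hypothetical.

```python
def estimate_tm(temps_k, shifts_ppm):
    """Temperature at which the shift is midway between the two plateaus.

    Assumes a two-state transition, temperatures in ascending order, and
    first/last points lying on the low- and high-temperature plateaus.
    """
    low, high = shifts_ppm[0], shifts_ppm[-1]
    midpoint = 0.5 * (low + high)
    for i in range(len(temps_k) - 1):
        a, b = shifts_ppm[i], shifts_ppm[i + 1]
        if a != b and (a - midpoint) * (b - midpoint) <= 0:
            # Linear interpolation between the two bracketing points
            frac = (midpoint - a) / (b - a)
            return temps_k[i] + frac * (temps_k[i + 1] - temps_k[i])
    raise ValueError("shift curve never crosses its midpoint")

# Hypothetical melting curve for a single resonance
temps = [280, 290, 300, 310, 320, 330, 340]
shifts = [7.20, 7.21, 7.25, 7.40, 7.55, 7.59, 7.60]
tm = estimate_tm(temps, shifts)
```

Real melting curves usually have sloping baselines, so a fitted two-state model would be more robust than this midpoint interpolation.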
2.3 1D Multiple-Pulse Experiments
For most NMR studies of aqueous solutions, the sample is dissolved in D2O. This is inappropriate for observation of imino and amino protons of DNA, which exchange rapidly with the solvent even when base paired. It is therefore necessary to dissolve the sample in 90% H2O (with 10% D2O present to provide a lock signal), and to suppress or attenuate the huge water signal. This is most simply achieved by presaturation, but saturation transfer to the imino protons results in reductions in the intensity of these signals (10). Although the loss of signals can be used to one's advantage, an alternative, known as the binomial pulse sequence, can be applied (11,12). This technique attenuates the water signal without continuous irradiation of the water resonance and does not require a decoupler channel. By using a binomial pulse (1-1 or 1-3-3-1) as the observe pulse, the decoupler channel can be used to generate 1D NOEs between imino protons and near neighbors, such as adenine C2 protons. Such NOEs can be difficult to detect in 2D experiments. The NOE in such cases, which is a measure of interatomic separation <5 Å, is usually negative, owing to the slow tumbling time of the molecule in solution. Two-pulse experiments provide means of measuring other parameters, such as the spin-lattice relaxation times (T1). A variation of T1s for particular types of proton along an oligonucleotide chain can point to sequence-specific structural variations (13). Measurement of T1s can provide a means of identifying adenine C2 hydrogens, which have characteristic T1s lying between and 10 s.
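One common two-pulse experiment for measuring T1 is inversion recovery (a 180° pulse, a variable delay, then a 90° read pulse). The text does not name the experiment, so the sketch below is an assumption: it takes perfect inversion and full relaxation between scans, in which case the signal passes through zero at a delay of T1 ln 2. Function names are hypothetical.

```python
import math

def recovery(t_s, t1_s, m0=1.0):
    """Longitudinal magnetization at delay t after a perfect 180-degree pulse."""
    return m0 * (1.0 - 2.0 * math.exp(-t_s / t1_s))

def t1_from_null(t_null_s):
    """T1 estimated from the delay at which the inverted signal passes through zero."""
    return t_null_s / math.log(2)

# A resonance that nulls at a 3 s delay (hypothetical value)
t1_estimate = t1_from_null(3.0)
```

A long null time of this kind is the sort of behavior that would single out a slowly relaxing proton, such as an adenine C2 hydrogen, from its neighbors.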
low natural abundance of 15N and 13C prevents many detailed experiments from being carried out, but phosphorus can be detected readily at natural abundance and serves as a probe for DNA backbone conformation. Under normal circumstances, a right-handed B-DNA structure gives rise to 31P resonances clustered around -3 ppm relative to internal phosphate. Variations away from this central position by or ppm have been shown to be indicative of alternative DNA backbone conformations (9). Assignments of phosphorus spectra can be made either by 1H-detected HMQC (2D Heteronuclear Multiple Quantum Coherence) (14) or in 1D by specific 17O and 18O labeling (15). Labeling with 17O directly attached to phosphorus broadens the 31P signal below detection, whereas 18O labeling causes an isotopic shift to the low-frequency side of the normal 31P signal. Enrichment is necessary for the observation of 13C DNA resonances, and by combining 31P with 13C NMR, it can be possible to formulate some understanding of DNA backbone motion (16). In a similar way, base-pair mobility has been investigated by appropriate 2H and 15N enrichment (16).
Such 1D NMR applications are of benefit, but despite improved signal dispersion at higher magnetic field strengths, the combination of substantial signal overlap and poor resolution, especially for 1H nuclei, imposes major limitations on the assignment of spectra of DNA and ligand-DNA complexes. For this reason, 2D and 3D NMR methods have emerged as the principal sources of both assignments and structural and dynamic information on such materials.
2.4 2D NMR
are the most useful. For these experiments to be executed within a sensible time period (3-12 h), 1-5 mM solutions (10-20 mg oligonucleotide) are required. This provides sufficient material without causing aggregation or providing too much signal for the spectrometer ADC to handle. Data should normally be acquired in phase-sensitive mode on a nonspinning sample and, where possible, all necessary data (i.e., COSY, NOESY, TOCSY) should be acquired without removing the sample from the magnet (18). Whereas useful 1D experiments may be performed at medium field strengths, for all but the shortest oligonucleotides, 2D experiments should be carried out at 500-600 MHz.
2.5 Strategy for the 1H NMR Assignment of DNA
The strategy used for assigning 2D NMR spectra of DNA involves connecting adjacent spin systems on the basis of the spatial information inherent in the NOESY data. Three scalar-coupled spin systems (isolated from one another in terms of direct bonding) exist in DNA, namely the sugar ring (H1', H2', H2'', H3', H4', H5', H5''), cytosine C5H-C6H, and thymine CH3-C6H (4J coupling). Figure 3 shows a typical COSY spectrum for a regular B-DNA structure; off-diagonal responses correlate nuclei that are scalar-coupled.
In NOESY data sets, the off-diagonal responses correlate nuclei that are dipolar-coupled. When the NOESY experiment is run with a mixing time of 200-300 ms, dipolar couplings are seen relating all pairs of nuclei separated in space by 5 Å or less. These dipolar couplings can be predicted on the basis of known DNA structure (see Table 2) and used as the basis of a sequential method of assignment of 1H NMR data (19). Figure 4A shows one of the most useful quadrants of a DNA NOESY data set used to make the sequential assignment in the duplex [d(AGACGTCT)]2 (20).
Fig. 3. Magnitude-mode 2D 1H COSY NMR spectrum of the 8-mer [d(AGACGTCT)]2 at 296 K recorded at 500 MHz. Although the spectrum is shown symmetrized, this practice is now not recommended for 2D spectra.
Table 2
Characteristic Dipolar Couplings in a B-DNA Structure

From                        To
Pu C8H(n)/Py C6H(n)         C1'H(n), C2''H(n), C1'H(n - 1), C2''H(n - 1)
Pu C8H(n)/Py C6H(n)         C C5H(n) or T CH3(n + 1)
C1'H(n)                     C2''H(n), C3'H(n), C4'H(n)
C2'H(n)                     C2''H(n), C3'H(n)
C3'H(n)                     C4'H(n), C5'H(n), C5''H(n)
C4'H(n)                     C5'H(n), C5''H(n)
[Fig. 4A: NOESY contour plot of the 8-mer; axis labeled "Sugar ring 1'H resonances" (ppm); cross-peaks labeled by residue, including C6H.]
Fig. 5. Spatial NOE connectivities expected for 5'-CpGpC-3' in right-handed (B-DNA) and left-handed (Z-DNA) structures.
shown. This "walk" procedure can be applied not only to the aromatic-C1'H region, but also to the aromatic-C2'H, C2''H region.
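The sequential "walk" can be pictured as a list of cross-peaks to look for, alternating between intranucleotide and sequential contacts. The sketch below is only an illustration for one strand of a right-handed duplex: it generates the base proton (C8H for purines, C6H for pyrimidines) to C1'H contacts of the kind listed for B-DNA, and the function name and label format are hypothetical.

```python
def sequential_walk(sequence):
    """List the base-to-C1'H NOE cross-peaks expected along one strand (5'->3')."""
    contacts = []
    for n, base in enumerate(sequence, start=1):
        base_proton = "C8H" if base in "AG" else "C6H"  # purine vs pyrimidine
        label = f"{base}{n} {base_proton}"
        if n > 1:
            # Sequential contact to the sugar of the 5' neighbor
            contacts.append((label, f"{sequence[n - 2]}{n - 1} C1'H"))
        # Intranucleotide contact to the residue's own sugar
        contacts.append((label, f"{base}{n} C1'H"))
    return contacts

walk = sequential_walk("AGACGTCT")
for base_proton, sugar_proton in walk:
    print(base_proton, "<->", sugar_proton)
```

For an N-residue strand this gives 2N - 1 cross-peaks; an unbroken chain of them in the NOESY spectrum is what allows each spin system to be placed in the sequence.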
2.6 Structural Characterization of DNA from NMR Data
Once an assignment has been made, certain qualitative conclusions can be drawn concerning the type of structure formed in solution, based on the NMR data acquired. As shown earlier, 1D NMR studies can reveal whether a material has double-helical characteristics. The assignment "walk" from 2D NOESY data can show whether that duplex is left- or right-handed. Predictions of NOE "contacts" for both types provide a basis for making the judgment (Fig. 5) (21). Since the NMR data for a single conformation have a unique solution, the assignment falls into one of these two categories. Thus, for the example in Fig. 4, the structure is right-handed. Two different forms of right-handed duplex DNA, A- and B-DNA, show subtly different NMR responses. X-ray studies show that distances between purine H8 or pyrimidine H6 base protons and neighboring 2' protons are different for the two geometries:
            H6/H8(n) to H2'(n)    H6/H8(n) to H2'(n - 1)
A-DNA       3.9 Å                 1.7 Å
B-DNA       1.9 Å                 3.9 Å

H2'(n - 1) relates to the 2' proton on the 5' flanking nucleotides relative
These distances are clearly reflected in NOESY data. For a B-DNA structure, NOEs between base C6 and C8 hydrogens and the C2' sugar hydrogen of the same nucleoside unit are larger than NOEs between base C6 and C8 hydrogens and the C2' hydrogen of the neighboring 5' nucleoside unit. For an A-DNA structure, the opposite is true (22). Such is the power of the NOESY experiment that it is now being used as the source of quantitative structures of DNA in solution. This subject is beyond the scope of the current discussion, but references for further reading are provided (23-25).
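The size of the effect can be gauged with a rough calculation. The sketch below assumes the isolated two-spin approximation, in which NOE intensity scales as the inverse sixth power of the distance; that approximation is not stated in the text, the B-DNA intranucleotide distance is taken here as 1.9 Å, and the function names are hypothetical.

```python
def expected_noe_ratio(r_intra, r_seq):
    """Intranucleotide/sequential NOE intensity ratio, assuming intensity ~ r**-6."""
    return (r_seq / r_intra) ** 6

def classify_helix(noe_intra, noe_seq):
    """Qualitative call from base-proton to H2' NOE intensities."""
    return "B-like" if noe_intra > noe_seq else "A-like"

b_ratio = expected_noe_ratio(1.9, 3.9)   # B-DNA: short intranucleotide distance
a_ratio = expected_noe_ratio(3.9, 1.7)   # A-DNA: short sequential distance

print(classify_helix(b_ratio, 1.0), classify_helix(a_ratio, 1.0))
```

Under this assumption the two geometries differ by well over an order of magnitude in the intensity ratio, which is why a simple comparison of cross-peak sizes is enough for a qualitative assignment. Spin diffusion at long mixing times would reduce the contrast.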
3 Structural and Dynamic Studies of DNA
A number of different forms of DNA exist and even coexist in the same sample. NMR lends itself particularly well to their study, especially in cases where no suitable crystals can be obtained for X-ray work. Both structural and dynamic studies are in progress on unusual DNA structures known to be responsible for mutagenic effects.
3.1 Left-Handed Z-DNA
similar to that used for B-DNA, can be used to assign the 1H NMR spectrum of Z-DNA (21). Alternatively, a 1: mixture of Z- and B-DNA can be generated by the addition of methanol to B-DNA (9), and the Z-DNA spectrum assigned by chemical exchange using the (previously assigned) B-DNA spectrum. This procedure is possible because the B- and Z-DNA are in slow exchange (giving separate distinct signals) on the NMR time scale. Slow exchange also allows the percentage of Z-DNA to be calculated (from the integration of resolved signals). Data recorded at different temperatures have enabled Arrhenius plots of ln(%B/%Z) against 1/T (K^-1) to act as a source of ΔH and ΔS values for the interconversions. The results indicated that the enthalpy term favors Z-DNA and the entropy term, B-DNA (9).
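The extraction of ΔH and ΔS from such a plot can be sketched as a straight-line fit. The example below assumes the usual relation ln K = -ΔH/(RT) + ΔS/R with K = %B/%Z, so that the values refer to the Z to B direction; the function name is hypothetical and no data from the cited study are used.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def vant_hoff_fit(temps_k, percent_b):
    """Least-squares line through ln(%B/%Z) vs 1/T; returns (dH, dS) for Z -> B."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(b / (100.0 - b)) for b in percent_b]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals -dH / R
    intercept = y_mean - slope * x_mean    # equals dS / R
    return -slope * R, intercept * R       # dH in J/mol, dS in J/(mol K)
```

With this sign convention, positive ΔH and positive ΔS for Z to B correspond to the statement that enthalpy favors Z-DNA while entropy favors B-DNA. The percentages would come from integration of resolved B and Z signals at each temperature.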
3.2 Hairpin-Loop DNA
DNA sequences consisting of inverted repeats (such as AGCTAGCT) are often present in regulatory sequences of DNA, and are of particular interest since they mimic hairpin-loop structures present in cruciforms and RNA. The possibility that these hairpin-loop forms have some regulatory capacity has fueled interest in the duplex to hairpin-loop conversion, the factors that govern hairpin-loop stability, and general consideration of conformational pathways to chain folding in solution. Hairpin-loop DNA crystals have so far proved elusive, and NMR is therefore the only source of structural data on short DNA hairpin structures. NOESY NMR data of hairpin loops show base stacking features similar to B-DNA (7).
Usually the stem is a right-handed B-DNA duplex, with base stacking continuing into the loop region. Many thymine loop-containing oligonucleotides studied by NMR, such as d(ATCCTATTTTTAGGAT) and d(CGCGTTTTCGCG), have been proposed to be stabilized by alternative T·T base pairing in the loop region. It remains to be established how the sequence and structure of the stem region influence the nature of the loop region in such materials.
3.3 Base Mispairing and Defective DNA Structures
significant interest in the structure and stability of DNA containing GT, GA, AC, and TC mismatches. NMR studies of the GT mismatch DNA d(AAATTTTCAAA) d(TTTGAGAATTT) show that GT occupies a normal position in the helix (26). One effect of GT base-pair formation is the reduction in stability of the duplex, as evidenced by a decrease in the melting temperature of aberrant structures compared to the parent materials. The melting temperature of d(CGTGAATTCGCG) d(CGCGAATTCGCG) containing two GT mismatches was found to be 52°C compared with 72°C for the parent [d(CGCGAATTCGCG)]2 (27).
3.4 Extrahelical Bases and Frame-Shift Mutagenesis
DNA sequences that incorporate an extra unpaired base into the structure, such as the sequence shown, are of interest in the study of frame-shift mutagenesis.
C-G-C-A-G-A-A-T-T-C -G-C-G
G-C-G _ _ _ C-T-T-A-A-G-A-C-G-C
Although X-ray crystal studies of this material showed the As to be looped out (28), NMR reveals that A is stacked into the double-helical structure in solution (29). By contrast, NMR studies of d(CA3CA3G)·d(CT6G) show the extrahelical C to be looped out (30). Further studies by NMR of triple-helical DNA, α-DNA, base-alkylated DNA, and backbone-alkylated DNA are all adding to the understanding of factors influencing DNA structure and mutagenic repair.
4 Drug-DNA Interactions
4.1 Minor Groove Binders
Table 3
Examples of Ligand-DNA Complexes Studied by NMR

Ligand           DNA sequence                        Ref.
Netropsin        d(GGAATTCC)                         31
Netropsin        d(GGTATACC)                         33
Distamycin A     d(CGCGAAATTGGC)·d(GCCAATTTCGCG)     35
Lexitropsin      d(CATGGCCATG)                       34
Hoechst 33258    d(CGCGAATTCGCT)                     32
planar and crescent shaped, and possess donor/acceptor functionality, and the DNA minor groove possesses an electrostatic potential minimum attractive to many such ligands. Studies of these complexes by NMR involve analysis of binding modes, complex lifetimes, and base specificity. Examples of materials studied in this way are shown in Table 3.
4.1.1 Structural Features of Complexes from NMR Data
The broadening of DNA 1H resonances on the addition of a suitable minor-groove binding ligand has often been taken as primary evidence of complex formation. The broadening is a reflection of the increased rotational correlation time of the DNA with a ligand tightly bound to it.
Fig. One-dimensional 1H NMR spectra of mixtures of free d(CGCGAATTCGCG) and Hoechst 33258-bound d(CGCGAATTCGCG) in slow exchange, in the ratios of DNA to ligand indicated (panels labeled 0.25, 0.5, 0.75, and 1 ligand:DNA; horizontal axis in ppm).
point in toward the minor groove, and base-pair imino and adenine C2 hydrogens line the floor of the minor groove. By assessing both the complexation shifts of these resonances and NOEs between these protons and resonances assigned to the ligand, a clear picture of the binding mode is created.
[Figure residue: two plots of Δδ (ppm) against nucleotide position along the dodecamer, C1 to G12.]
Fig. NMR spectra (1D) of the thymine methyl region of d(CGCGAATTCGCG)2 in 10 mM phosphate, 100 mM NaCl, and mM NaN3 at pH apparent 0, 99.96% D2O, referenced to internal 3-trimethylsilyl[2,2,3,3-2H4]propionate. Spectra were recorded at 500 MHz in the presence of the indicated molar ratios of added Hoechst 33258.
is seen as the main cause of these shifts, C1' hydrogens lying perpendicular to the plane of the rings.
Structures 4 and 5. Examples of intermolecular NOEs observed for both netropsin (left, Structure 4) and Hoechst 33258 (right, Structure 5) bound at an AATT binding site.
may take some months to complete, following the protocol outlined in Section. A NOESY off-diagonal response that correlates a ligand proton resonance with a DNA proton resonance represents a close contact between DNA and ligand. Charting such responses for all ligand protons provides a series of structural constraints corresponding to intermolecular distances <5 Å. Typical intermolecular contacts are shown for both netropsin (Structure 4) and Hoechst 33258 (Structure 5) bound at an AATT site (31,32). (See Structures 4 and 5.)
The majority of contacts are between imino and adenine C2 hydrogens and ligand aromatic/NH hydrogens, which lie on the concave
Fig. The existence of transferred NOE arises through drug flipping on the NMR time scale. Closed and open circles represent two sets of two protons that are chemically equivalent in the free DNA, but become inequivalent in the presence of an unsymmetrical ligand. "Normal" NOEs are shown by solid double-headed arrows. "Transfer" NOEs are shown by double-headed dashed arrows.
ments and the presence of imino proton resonances in the correct position for Watson-Crick base pairs indicate that base pairing remains intact when a ligand is bound.
4.1.2 Dynamic Features of Ligand-DNA Complexes
Structure 6 (distamycin A).
process is deemed to go by one of four mechanisms, namely intermolecular exchange between DNA molecules, intramolecular sliding, "walking" of the ligand, or a flip-flop exchange (34).
The last, and most favored, is expected to go via a loosely associated complex between drug and DNA in which the rate-determining step is the departure of the drug from where it orients itself along the minor groove floor. By way of example, the complex between distamycin A (Structure 6) and [d(CGCGAATTCGCG)]2 has been calculated to flip at a rate of s-1, and exchange at a rate of s-1 (36). The figures are arrived at by use of Eq. 1, where Δν is the frequency separation of the resonances of two interconverting centers measured at coalescence.
$$k_{\mathrm{coal}} = \frac{\pi}{\sqrt{2}}\,\Delta\nu = 2.22\,\Delta\nu \tag{1}$$
The activation energy for exchange, ΔG, at the coalescence temperature, Tcoal, can be calculated from Eq. 2.
$$\Delta G = 19.14\,T_{\mathrm{coal}}\left[9.97 + \log_{10}\left(\frac{T_{\mathrm{coal}}}{\Delta\nu}\right)\right]\ \mathrm{J\,mol^{-1}} \tag{2}$$
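The two coalescence relations are easily evaluated numerically. The sketch below is illustrative: it uses the rate constant (π/√2)Δν and the constant 9.97 that follows when that rate is substituted into the Eyring equation with a transmission coefficient of 1 (an assumption), and the function names and input values are hypothetical.

```python
import math

def k_coalescence(delta_nu_hz):
    """Exchange rate (s^-1) at coalescence for two equally populated sites."""
    return math.pi / math.sqrt(2.0) * delta_nu_hz

def activation_free_energy(t_coal_k, delta_nu_hz):
    """Free energy of activation (J/mol) at the coalescence temperature."""
    return 19.14 * t_coal_k * (9.97 + math.log10(t_coal_k / delta_nu_hz))

# Hypothetical case: two resonances 100 Hz apart that coalesce at 300 K
rate = k_coalescence(100.0)
barrier = activation_free_energy(300.0, 100.0)
print(round(rate, 1), "s^-1;", round(barrier / 1000.0, 1), "kJ/mol")
```

Because the barrier depends only logarithmically on Δν, uncertainty in the coalescence temperature usually dominates the error in ΔG.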
4.2 Intercalators
NMR studies of drugs binding to DNA by intercalation now frequently parallel studies of minor-groove binders. The accessibility of milligram quantities of self-complementary oligonucleotides has greatly facilitated very detailed studies of the molecular basis of the interaction of minor-groove binders and intercalators. NMR studies of intercalators, including carcinogens, such as ethidium bromide, and antitumor drugs, such as actinomycin D, predate these developments, and many ingenious studies have been published based on relatively simple NMR techniques. It would be wrong to suggest that these have been superseded. Detailed information about binding interactions is almost always obtained from analysis of complex 2D data, but, if such detail is not required, this analysis should be avoided!
One-dimensional 31P NMR spectroscopy can provide useful information about the binding of intercalators to DNA (37). Chemical shifts of 31P are sensitive to conformational changes in DNA, and intercalating drugs cause downfield shifts in the 31P signal, whereas divalent cations cause upfield shifts. Relaxation parameters (linewidths and T1 measurements) are also sensitive to intercalating drugs.
When ethidium bromide is added to sonicated calf thymus DNA, there is a downfield shift of the (unresolved) 31P resonance, indicating that the complex is in fast exchange. As more ligand is added, the 31P resonance continues to move downfield until the DNA becomes saturated (at 0.5 molar equivalent, drug:base pairs). The linewidth also increases; when saturated with ethidium bromide, the 31P resonance of calf thymus DNA increased by about 50 Hz (37). The spin-lattice relaxation time, T1, is, however, affected very little by the presence of ethidium bromide. The authors suggested that ethidium may have little effect on the internal motion of DNA (reflected in T1), but may slow the overall motion of the molecule (reflected in the increased linewidth).
shows strong GC selectivity, and it was concluded that the GC phosphates are responsible for the shifted resonance, whereas the more distant AT phosphates give rise to the unshifted peak.
Methodology for analyzing fast-exchange intercalators has been extended (38), and the binding of ethidium to Z-DNA (39) is among the systems studied by NMR. Further notable 1D studies of intercalation include a 19F study of fluoroquinone binding to DNA (40), an investigation of the interaction of propidium with oligonucleotides containing mismatch (G-T) base pairs (41), and a 23Na study of the effect of intercalators on the association of sodium ions with DNA (42).
Particularly instructive, however, are the long-term NMR investigations of the binding of actinomycin D and the bis-intercalator echinomycin to DNA. Echinomycin intercalates with GC selectivity, with the connecting peptides lying in the minor groove. The many NMR studies leading to a detailed model for its binding have been reviewed in depth recently (43), so this discussion principally concerns actinomycin D.
In 1986, a 1D NMR study of the binding of actinomycin D to a number of self-complementary oligonucleotides all containing the (GCGC)2 sequence was published (44). In this study, 31P NMR and 1H NMR (observing the imino protons at δ 11.5-14.5) spectra were used to elucidate several points. Actinomycin D binds preferentially in 5'-GC sequences (rather than CG or any other sequence). When two GC sequences are available, both may be bound by actinomycin D, even if they are adjacent. The length of the flanking sequence has little effect on the binding. When the drug and oligonucleotides are present in a 1:1 ratio, two distinct 1:1 adducts are formed; at higher drug concentration, a unique 2:1 adduct is formed. In essence, these conclusions were arrived at by counting the imino peaks in the 1H spectrum!
Fig. The binding of actinomycin D to d(GCGC)2: (left) more favored 1:1 complex, (center) less favored 1:1 complex, and (right) 2:1 complex. Structure 3 is shown at bottom.
4.3 Major Groove Binders
Structure 7 (cisplatin).
minor groove (6). Chromomycin A3 was at one stage in its checkered history believed to be a major-groove binder, having started life as an intercalator. Recent NMR studies by two groups (43,47) provide very strong evidence for the drug binding in the minor groove.
One notable set of NMR experiments concerns the binding of cisplatin to DNA fragments. Cisplatin (Structure 7) binds to two adjacent guanine bases, linking them. Very high-resolution NMR studies have been carried out on the cisplatin d(GG) adduct (48). This is quite a small molecule and gives very sharp lines in the 1H NMR spectrum. The NMR spectrum was assigned completely using 1D techniques. Proton-phosphorus coupling constants were then converted to conformational information using an empirical mathematical relationship, a modified Karplus equation. It was concluded that the 5' G sugar moiety is distorted by the drug so that the sugar is in the N (C3'-endo) conformation, whereas the 3' G sugar adopts predominantly the normal S (C2'-endo) conformation. These studies have been extended to longer oligonucleotide chains (49). NMR studies of carcinogens binding DNA in the major groove are currently more numerous than those of drugs. The techniques involved are directly transferable, and this field has been reviewed in depth (50).
5 Conclusions
The availability of synthetic oligonucleotides, especially of self-complementary strands of DNA of defined sequence, has greatly facilitated very detailed NMR (and X-ray) studies of DNA structure and DNA-ligand binding. NMR is the only technique available to date that allows solution structures of DNA to be explored to atomic resolution. In conjunction with DNA footprinting, it is also an invaluable tool in determining the molecular basis of drug action.
design molecules that can read sequence, and so (subject to nontrivial pharmacokinetic and toxicological considerations) be used to control the replication of viruses and the expression of oncogenes. At the moment, this effort is largely concentrated in minor-groove binders and on extended intercalators, such as actinomycin D, whose peptide chains lie in the minor groove. The major groove is now attracting attention, however, and since proteins read sequence from the major groove, it is likely that drugs will be able to do the same and that NMR will be used extensively to determine their modes of binding.
References
1. Dervan, P. B. (1986) Design of sequence-specific DNA-binding molecules. Science 232, 464-47
2. Goodsall, D. and Dickerson, R. E. (1986) Isohelical analysis of DNA groove-binding drugs. J. Med. Chem. 29, 727-733.
3. Kissinger, K., Krowicki, K., Dabrowiak, J. C., and Lown, J. W. (1987) Molecular recognition between oligopeptides and nucleic acids. Monocationic imidazole lexitropsins that display enhanced GC sequence dependent DNA binding. Biochemistry 26, 5590-5595.
4. Zakrzewska, K. and Pullman, B. (1988) Theoretical study of the sequence selectivity of isolexins, isohelical DNA groove binding ligands. Proposals for the GC minor groove specific compounds. J. Biomol. Struct. Dyn. 5, 1043-1058.
5. Thurston, D. E. and Thompson, A. S. (1990) The molecular recognition of DNA. Chem. Br. 26, 767-772.
6. van de Ven, F. J. M. and Hilbers, C. W. (1988) Nucleic acids and nuclear magnetic resonance. Eur. J. Biochem. 178, 1-38.
7. Wemmer, D. E., Chou, S. H., Hare, D. R., and Reid, B. R. (1985) Duplex-hairpin transitions in DNA: NMR studies on CGCGTATACGCG. Nucleic Acids Res. 13, 3755-3772.
8. Delort, A.-M., Neumann, J. M., Molko, D., Hervé, M., Téoule, R., and Tran Dinh, S. (1985) Influence of uracil defect on DNA structure: 1H NMR investigation at 500 MHz. Nucleic Acids Res. 13, 3343-3354.
9. Feigon, J., Wang, A. H.-J., van der Marel, G. A., van Boom, J. H., and Rich, A. (1984) A one- and two-dimensional NMR study of the B to Z transition of (m5dC-dG)3 in methanolic solution. Nucleic Acids Res. 12, 1243-1263.
10. Rajagopal, P., Gilbert, D. E., van der Marel, G. A., van Boom, J. H., and Feigon, J. (1988) Observation of exchangeable proton resonances of DNA in two-dimensional NOE spectra using a presaturation pulse: application to d(CGCGAATTCGCG) and d(CGCGAm6ATTCGCG). J. Magn. Reson. 78, 1243-1263.
11. Hore, P. J. (1983) A new method for water suppression in the proton NMR spectrum of aqueous solutions. J. Magn. Reson. 54, 539-542.
13. Lefevre, J.-F., Lane, A. N., and Jardetsky, O. (1987) Solution structure of the trp operator of E. coli determined by NMR. Biochemistry 26, 5076-5090.
14. Byrd, R. A., Summers, M. F., and Zon, G. (1986) A new approach for assigning 31P NMR signals and correlating adjacent nucleoside deoxyribose moieties via 1H-detected multiple-quantum NMR. Application to the adduct of d(TGGT) with the anticancer agent (ethylenediamine)dichloroplatinum. J. Am. Chem. Soc. 108, 504, 505.
15. Shah, D. O., Lai, K., and Gorenstein, D. G. (1984) Facile synthesis and 31P NMR spectra of a double-labelled oligonucleotide d(Ap(17O)Gp(18O)Cp(16O)T). J. Am. Chem. Soc. 106, 4302, 4303.
16. James, T. L., Bendel, P., James, J. L., Keepers, J. W., Kollman, P. A., Lapidot, A., Murphy-Boesch, J., and Taylor, J. E. (1983) Conformational flexibility of nucleic acids. Jerusalem Symp. Quantum Chem. Biochem. 16, 155-167.
17. Kessler, H., Gehrke, M., and Griesinger, C. (1988) Two-dimensional NMR spectroscopy: background and overview of the experiments. Angew. Chem. Int. Ed. Engl. 27, 490-536.
18. Wuthrich, K. (1986) NMR of Proteins and Nucleic Acids. Wiley, New York.
19. Hare, D. R., Wemmer, D. E., Chou, S.-H., Drobny, G., and Reid, B. R. (1983) Assignment of the non-exchangeable proton resonances of d(CGCGAATTCGCG) using two dimensional nuclear magnetic resonance methods. J. Mol. Biol. 171, 319-336.
20. Parkinson, J. A. (1989) NMR Studies on the Met J Operator from Escherichia coli. Ph.D. Thesis, University of Leeds, UK.
21. Orbons, L. P. M., van der Marel, G. A., van Boom, J. H., and Altona, C. (1986) The B and Z forms of the d(m5C-G)3 and d(br5C-G)3 hexamers in solution: a 300 MHz and 500 MHz two-dimensional NMR study. Eur. J. Biochem. 160, 131-139.
22. Haasnoot, C. A. G., Westerink, H. P., van der Marel, G. A., and van Boom, J. H. (1984) Discrimination between A-type and B-type conformations of double helical nucleic acid fragments in solution by means of two dimensional nuclear Overhauser experiments. J. Biomol. Struct. Dyn. 2, 345-360.
23. Nerdal, W., Hare, D. R., and Reid, B. R. (1988) Three-dimensional structure of the wild-type lac Pribnow promoter DNA in solution. J. Mol. Biol. 201, 717-739.
24. Clore, G. M., Oschkinat, H., McLaughlin, L. W., Benseler, F., Happ, C. S., Happ, E., and Gronenborn, A. M. (1988) Refinement of the structure of the DNA dodecamer 5'-d(CGCGPATTCGCG)2 containing a stable purine-thymine base pair: combined use of nuclear magnetic resonance and restrained molecular dynamics. Biochemistry 27, 4185-4197.
25. Metzler, W. J., Wang, C., Kitchen, D. B., Levy, R. M., and Pardi, A. (1990) Determining local conformational variation in DNA. J. Mol. Biol. 214, 711-736.
26. Quignard, E., Fazakerley, G. V., van der Marel, G. A., van Boom, J. H., and Guschlbauer, W. (1987) Comparison of the conformation of an oligonucleotide containing a central G-T base pair with non-mismatch sequence by pro-
27. Patel, D. J., Kozlowski, S. A., Marky, L. A., Rice, J. A., Broka, C., Dallas, J., Itakura, K., and Breslauer, K. J. (1982) Structure, dynamics, and energetics of deoxyguanosine-thymidine wobble base pair formation in the self complementary d(CGTGAATTCGCG) duplex in solution. Biochemistry 21, 437-444.
28. Joshua-Tor, L., Rabinovich, D., Hope, H., Frowlow, F., Appella, E., and Sussman, J. L. (1988) The three-dimensional structure of a DNA duplex containing looped-out bases. Nature 334, 82-84.
29. Patel, D. J., Kozlowski, S. A., Marky, L. A., Rice, J. A., Broka, C., Itakura, K., and Breslauer, K. J. (1982) Extra adenosine stacks into the self complementary d(CGCAGAATTCGCG) duplex in solution. Biochemistry 21, 445-451.
30. Morden, K. M., Chu, Y. G., Martin, F. H., and Tinoco, I. (1983) Unpaired cytosine in the deoxyoligonucleotide duplex dCA3CA3G·dCT6G is outside of the helix. Biochemistry 22, 5557-5563.
31. Patel, D. J. and Shapiro, L. (1985) Molecular recognition in noncovalent antitumour agent-DNA complexes: NMR studies of the base and sequence dependent recognition of the DNA minor groove by netropsin. Biochimie 67, 887-915.
32. Parkinson, J. A., Barber, J., Douglas, K. T., Sharples, D., and Rosamond, J. (1990) Minor-groove recognition of the self-complementary duplex d(CGCGAATTCGCG)2 by Hoechst 33258: a high-field NMR study. Biochemistry 29, 10,181-10,190.
33. Patel, D. J. (1982) Antibiotic-DNA interactions: intermolecular nuclear Overhauser effects in the netropsin-d(CGCGAATTCGCG) complex in solution. Proc. Natl. Acad. Sci. USA 79, 6424-6428.
34. Lee, M., Hartley, J. A., Pon, R. T., Krowicki, K., and Lown, J. W. (1988) Sequence specific molecular recognition by a monocationic lexitropsin of the decadeoxyribonucleotide d(CATGGCATG)2: structural and dynamic aspects deduced from the high field 1H NMR studies. Nucleic Acids Res. 16, 665-684.
35. Pelton, J. G. and Wemmer, D. E. (1989) Structural characterisation of a 2:1 distamycin A·d(CGCAAATTGGC) complex by two-dimensional NMR. Proc. Natl. Acad. Sci. USA 86, 5723-5727.
36. Klevit, R. E., Wemmer, D. E., and Reid, B. R. (1986) 1H NMR studies of the interaction between distamycin A and a symmetrical DNA dodecamer. Biochemistry 25, 3296-3303.
37. Wilson, W. D. and Jones, R. L. (1982) Interaction of actinomycin D, ethidium, quinacrine, daunorubicin, and tetralysine with DNA: 31P NMR chemical shift and relaxation investigation. Nucleic Acids Res. 10, 1399-1410.
38. Chandrasekaran, S., Kusuma, S., Boykin, D. W., and Wilson, W. D. (1986) A new approach utilising high-resolution proton NMR in structural analysis on intercalation complexes of natural DNA. Magn. Reson. Chem. 24, 630-637.
39. Shafer, R. H., Brown, S. C., Delbarre, A., and Wade, D. (1984) Binding of ethidium and bis(methidium)spermine to Z DNA by intercalation. Nucleic Acids Res. 12, 4679-4690.
40. Mirau, P. A., Shafer, R. H., James, T. L., and Bolton, P. H. (1982) cal, and fluorescence properties in the presence of DNA, poly(A) and tRNA. Biopolymers 21, 909-92
4 Wilson, W D , Jones, R L , Zon, G , Banvtlle, D L , and Marzilh, L G (1986) Specrfrcity m DNA Interactions: an NMR mvestrgatton of the interaction of proptdmm wtth ohgodeoxynucleottdes contaming normal and G-T base pairs Biopolymers 25, 199 l-20 15
42 Martam, Y H and Wilson, W D (1983) Effect of mtercalatmg drugs and temperature on the assoctation of sodmm tons wtth DNA 23Na NMR studtes J Am Chem Sot 105,627,628
43 Gao, X and Pate], D J (1989)Antttumour drug-DNA mteracttons NMR studtes of echmomycm and chromomycm complexes Q Rev Biophys 22,93-l 38 44 Wtlson, W D., Jones, R L., Scott, E V., Zon, G , and Marztlh, L G (1986)
Actinomycin D binding to ohgonucleotides with S’d(GCGC)3’ sequences Definitive ‘H and 31P NMR evidence for two dtstmct d(GC) 1 adducts and for adjacent sate binding in a unique 2.1 adduct J Am Chem Sot 108,
7113,7114
45 Jones R L., Scott, E V , Zon, G., Marztlh, L G , and Wtlson, W D (1988) An NMR mvesttgatron of the binding of the anttcancer drug actmomycm D to ohgodeoxyrtbonucleosrdes wtth isolated Sd(GC)3’ bmdmg sttes Blochemls-
try 27,6021-6026
46 Scott, E V , Jones, R L., Banvtlle, D L , Zon, G , Marztlh, L G , and Wilson, W D (1986) ‘H and 31P NMR mvestrgattons of actmomycm D bmdmg selec- ttvtty wtth deoxyrrbonucleostdes containing multiple adJacent d(GC) sttes Bio- chemistry 27,9 15-923
47 Banville, D L., Kemry, M A , Kam, M , and Shafer, R H (1990) NMR stud- ies of the mteractton of chromomycm A3 wtth small DNA duplexes Bmdmg to GC-contaming sequences Biochemistry 29,652 l-6534
48 den Hartog, J H J., Altona, C , Chottard, J -C., Gtrault, J -P., Lallemand, Y , de Leeuw, F A A M , Marcehs, A T M., and ReedJtk, J (1982) Conforma- ttonal analysis of the adduct cis-[Pt(NH& d(GpG)]+ m aqueous solution A htgh field (500-300 MHz) NMR study Nucleic Ads Res lo,471513730 49 den Hartog, J H J., Altona, C , van der Marel, G A , and ReedJrk, J (1984) A
‘H and 3’P NMR study of crs-Pt(NH3)2 [d(CpGpG)-N7(2), N7(3)] Eur J Biochem 147,37 1-379
of the Carbohydrate Moieties
of Glycoproteins by High-Resolution
¹H-NMR Spectroscopy
Herman van Halbeek
1 Introduction
The biochemical/biomedical research community, the pharmaceutical industry, and, indeed, molecular biologists generally are faced with the increasing need for characterization of carbohydrate structures of recombinant glycoproteins and natural analogs. Cultured mammalian cells (such as Chinese hamster ovary [CHO] cells) are used to produce glycoproteins for therapeutic and diagnostic use because of their ability to perform glycosylation. The presence of oligosaccharide moieties is often essential in defining several biological properties of glycoproteins, including clearance rate, immunogenicity, and specific biological activity. Since a number of factors that influence glycosylation still elude our control (such as culture environment and age of the cells), the same gene expressed in the same type of cell may not always yield a product with exactly the same glycosylation pattern, presenting drug batch quality-control problems for the pharmaceutical industry. Nuclear magnetic resonance (NMR) spectroscopy provides a powerful nondestructive means to characterize glycoprotein carbohydrates structurally and is an indispensable part of the current methodology of glycosylation site mapping.
We will limit the discussion in this chapter to solution-state ¹H-NMR spectroscopy as a method for the characterization of the primary structure of N-type oligosaccharide chains of glycoproteins. In presenting NMR spectroscopy as a “fingerprinting” technique, we need only discuss single-pulse, one-dimensional (1D) ¹H-NMR spectroscopy. A slightly more complicated experiment would be performed only for solvent suppression. We will illustrate the applicability of the 1D ¹H-NMR fingerprinting method for the structural elucidation of the carbohydrate chains of three recombinant glycoproteins, namely, recombinant soluble human CD4 (rCD4) (1) and recombinant human tissue plasminogen activator (rtPA) (2), both expressed in CHO cells, and a recombinant hepatitis B virus (HBV) surface antigen glycoprotein (pre-S2 + S), expressed in cells of the mnn9 mutant strain of the yeast Saccharomyces cerevisiae (3). We will briefly compare the pre-S2 + S structures to the structures of carbohydrate chains of allergen Art v II, as elucidated by ¹H-NMR spectroscopy (4).
The 1D ¹H-NMR spectrum of an oligosaccharide or glycopeptide represents an “identity card” of the carbohydrate. A 1D ¹H-NMR study may suffice for primary structure determination if the oligosaccharide itself, or a compound of closely related structure, has been characterized previously. The usefulness of the “structural-reporter group concept” (5,6) in this context will be outlined in detail in Section 3.5. Fingerprinting a carbohydrate through 1D ¹H-NMR spectroscopy is possible if at least 10 nmol of pure oligosaccharide/glycopeptide are available.
ers often seek collaboration with laboratories specializing in NMR spectroscopy of carbohydrates. The author’s laboratory is part of the US National Institutes of Health (NIH) Resource Center for Biomedical Complex Carbohydrates; it welcomes any requests regarding structural analysis of glycoprotein carbohydrates and provides advice on preparing samples, on the suitability of samples for NMR analysis, and so on. NMR spectra of carbohydrates are run at 500 or 600 MHz at nominal costs on a nonprofit basis, and help in interpretation of the data is provided.
2 Materials
A glycoprotein, whether natural or recombinant, consists of a protein in which one or more amino acids bear carbohydrate (are glycosylated). A carbohydrate chain attached to the amide (CO-NH2) group in the side chain of an Asn residue is referred to as an N-type oligosaccharide; oligosaccharides attached to the hydroxyl group of a Ser or Thr residue are O-type chains. N-type oligosaccharide chains have the following pentasaccharide core structure in common:
        4'
Manα(1→6)
         \
          Manβ(1→4)GlcNAcβ(1→4)GlcNAc
         /
Manα(1→3)
        4
Depending on the extension of the core, N-type carbohydrates are subdivided into three types termed:
1 N-acetyllactosamine;
2 High-mannose; and
3 Hybrid (see Scheme 1)
(a) N-acetyllactosamine type

di-antennary:

NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)
                                         \
                                          Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]0-1GlcNAc
                                         /
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)

tetra-antennary:

NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→6)
                                \
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)
                                         \
                                          Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]0-1GlcNAc
                                         /
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)
                                /
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→4)

(b) high-mannose type

Manα(1→2)Manα(1→6)
                  \
                   Manα(1→6)
                  /         \
Manα(1→2)Manα(1→3)           Manβ(1→4)GlcNAcβ(1→4)GlcNAc
                            /
 Manα(1→2)Manα(1→2)Manα(1→3)

(c) hybrid type

                      Manα(1→6)
                               \
                                Manα(1→6)
                               /         \
                      Manα(1→3)           Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]0-1GlcNAc
                                         /
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)
to rigorous purification of a glycoprotein into its individual glycoforms. Nevertheless, the glycoprotein must be purified to homogeneity, because contaminating (glyco-)proteins must be eliminated before tackling the structural analysis of the carbohydrates. The mol wt of the glycoprotein and the number of glycosylation sites occupied (i.e., the carbohydrate content) determine how much sample is needed to begin structural analysis. Ordinarily, one needs about 20 mg of pure glycoprotein starting material.
Glycoproteins are in general too large to be studied as intact macromolecules by ¹H-NMR spectroscopy for the structure of their carbohydrates. The spectra of intact glycoproteins show mostly fairly wide lines and lack the resolution needed for fine structural analysis. The severe overlap of resonances, including the overlap of carbohydrate and protein signals, and the microheterogeneity of the sample render spectral analysis of an intact glycoprotein as yet impossible. Degradation to partial structures is mandatory when detailed primary structural information on oligosaccharides is desired. Partial structures suitable for NMR spectroscopy are:
1 Glycopeptides;
2 Oligosaccharides; and
3 Reduced oligosaccharides
Glycopeptides are the partial structures of choice when we must preserve the information on the glycosylation site in the protein. Glycopeptides can be generated by specific (e.g., trypsin, chymotrypsin) or nonspecific (pronase, pepsin) proteolytic digestion of the protein portion of the glycoprotein. Although larger peptides can be handled, NMR spectroscopic analysis is facilitated when the glycopeptides (once purified from the peptides) have a peptide chain no longer than ~10 amino acids. Also, preferably, the glycopeptides should be homogeneous in their peptide portion, although peptide heterogeneity is usually not an insurmountable problem, since only the positions of the signals of the first couple of monosaccharide residues attached to a peptide are affected. However, for glycopeptides to be analyzed successfully, they should contain only one glycosylation site; the heterogeneity in the carbohydrate structure at that site can then be adequately characterized by ¹H-NMR spectroscopic analysis.
N-glycanase (also known as PNGase F) or by an endo-glycosidase, such as endo-H or endo-F. Scheme 2 gives a typical example of such an approach for complete structural characterization of the carbohydrates of a recombinant glycoprotein. N-glycanase cleaves fairly aspecifically the N-glycosidic linkage between core residue GlcNAc-1 and Asn, resulting in a reducing oligosaccharide with an intact N,N'-diacetylchitobiose moiety. The endo enzymes cleave the linkage between the two GlcNAc residues in the core, producing an oligosaccharide that ends in just one GlcNAc residue at the reducing end. The endo enzymes vary in specificity. For example, endo-H cleaves high-mannose and hybrid structures, but not N-acetyllactosamine-type oligosaccharides. The advantages of oligosaccharides over glycopeptides are: (1) they show no signal overlap in the NMR spectrum with amino acid protons; and (2) they are easier than glycopeptides to purify to homogeneity of carbohydrate chain, which is reflected in the NMR spectrum. The drawback of analyzing oligosaccharides is the loss of information about the glycosylation site and, sometimes, the anomerization of the reducing oligosaccharide. The oligosaccharide in solution is present as a mixture of the α and β anomers, and this may affect the NMR spectrum to such an extent that the interpretation becomes ambiguous.
To avoid the anomerization effect on the ¹H-NMR spectrum, to simplify subsequent purification procedures, or to render the oligosaccharide amenable to techniques of structural analysis other than NMR, the oligosaccharide may be reduced by NaBH4 (or NaBD4 for mass spectrometry) to its corresponding oligosaccharide-alditol. To enhance sensitivity of detection during purification of the oligosaccharides, the reduction may be carried out with NaBT4, thereby incorporating a radioactive label in the compound. Neither the D nor the T label in the oligosaccharide-alditols has an adverse effect on ¹H-NMR spectroscopic analysis of the compounds.
It is evident that, in all cases, the partial degradation technique and subsequent chemical modifications applied should not affect the structure of the carbohydrate. Desialylation, defucosylation, deacetylation, desulfation, aspecific cleavage at the reducing end, and so forth, are
[Scheme 2 flow diagram. Labels recoverable from the diagram: rCD4; tryptic digest; peptides and glycopeptides; reversed-phase HPLC; Asn-271 glycopeptide(s); N-glycanase; Asn-300 glycopeptide(s); endo-H; glycopeptidase A; peptide and oligosaccharides; G-15 (desalting); FPLC on Mono Q; fraction Asn-300 endo-H. Analyses listed: 1) ¹H-NMR spectroscopy; 2) high-pH anion exchange chromatography (HPAE); 3) glycosyl composition analysis; 4) glycosyl linkage (methylation) analysis; 5) FAB-MS (of peptides and glycopeptides).]

Scheme 2 Release, isolation, and purification of N-type oligosaccharides from recombinant soluble human CD4. For details of the experimental procedures, the reader is referred to ref. (1). The structural characterization of fractions Asn-271 Q2 and Asn-300 endo-H is discussed in the text.
Finally, the (reduced) oligosaccharides must be purified before undergoing NMR analysis. Important in this respect are removal of any (proton-containing and nonproton-containing) contaminants and purification of the carbohydrates to virtual homogeneity (charge, size, and so on). The most frequently observed noncarbohydrate contaminants that can disturb the NMR spectrum are salts/buffers (acetate, lactate [from the fingers of the biologist], SDS, EDTA, and so forth). Even salts that do not contain protons (NaCl, phosphates, and so on) will impair the NMR spectrum (line broadening) and shift the HOD peak (effect on dielectric constant and pH of the solution). One should be aware that carbohydrates only detected by their radioactive label are usually the ones most seriously contaminated with all sorts of nonradioactive, but NMR-disturbing substances.
The amount of pure oligosaccharide needed to obtain a 1D ¹H-NMR spectrum in <6 h of data acquisition time depends on the field strength of the spectrometer available and the sensitivity of the probe. As a rule of thumb, 15 nmol of oligosaccharide (corresponding to 25-30 µg of a decasaccharide) are the minimum requirement for primary structural analysis on a 500-MHz spectrometer. For analysis at 600 MHz, 7-10 nmol (15-20 µg) are sufficient, whereas 35-40 nmol are needed at 400 MHz and approx 100 nmol at 300 MHz.
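The rule of thumb above can be turned into a quick check of whether a sample is large enough for a given spectrometer. The following is a minimal sketch, not part of the original protocol: the nmol thresholds are the ones quoted in the text, whereas the molecular weight `ASSUMED_MOL_WT` (1800 g/mol, chosen so that 15 nmol falls in the quoted 25-30 µg range for a decasaccharide) and the function names are illustrative assumptions.

```python
# Minimum amounts (nmol) quoted in the text, keyed by 1H frequency in MHz.
# Each entry is a (low, high) range; single values are repeated.
MIN_NMOL = {300: (100, 100), 400: (35, 40), 500: (15, 15), 600: (7, 10)}

# Assumption for illustration only: molecular weight of a "typical"
# decasaccharide in g/mol. Replace with the value for the actual compound.
ASSUMED_MOL_WT = 1800.0


def nmol_to_ug(nmol, mol_wt):
    """Convert an amount in nmol to a mass in micrograms.

    nmol * (g/mol) gives nanograms; dividing by 1000 gives micrograms.
    """
    return nmol * mol_wt / 1000.0


def is_enough(sample_nmol, field_mhz):
    """True if the sample meets the upper end of the quoted minimum range."""
    _, high = MIN_NMOL[field_mhz]
    return sample_nmol >= high


if __name__ == "__main__":
    for field, (low, high) in sorted(MIN_NMOL.items()):
        print(field, "MHz:",
              nmol_to_ug(low, ASSUMED_MOL_WT), "to",
              nmol_to_ug(high, ASSUMED_MOL_WT), "ug")
```

With the assumed molecular weight, 15 nmol corresponds to 27 µg, inside the 25-30 µg range given for 500 MHz.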
The other materials needed for NMR analysis of oligosaccharides are a deuterated solvent (D2O for underivatized glycoprotein oligosaccharides) and an NMR tube. As for the D2O, used for exchange of OH and NH protons for deuterium atoms, one should bear in mind that the actual spectrum is recorded using a solution of the compound in 0.4 or 0.5 mL of D2O of the highest available deuteration grade (“100.0%,” “gold-label,” in practice >99.99%), but the initial exchange steps can be performed with 99.8% D2O. Also, the deuteration percentage on the label of the bottle or ampule is only real if the D2O is handled correctly (see Section 3.1.).
3 Methods
3.1 Sample Preparation
A glycopeptide or oligosaccharide sample submitted for NMR analysis is stored in dry state at -20°C until use. The first steps in the actual preparation of the sample for NMR analysis are proton exchanges in D2O. The sample is dissolved twice in D2O (99.8 and 99.96% D, respectively) at room temperature and pD with intermediate lyophilization. The purpose of the exchange treatments is the complete conversion of OH and NH groups in the constituent monosaccharides into OD and ND (chemical exchange), and the preparation of an eventual solution of the carbohydrate with as low as possible a residual amount of HOD. The exchanges are usually performed in small glass vials; some researchers prefer to perform the exchange directly in the NMR tube. Each step in the exchange procedure (dissolving the sample in ~0.5 mL D2O, allowing the exchange process to take place, and subsequent lyophilization) may take 6-8 h. Check the pH (pD) of the solution immediately after dissolving the sample for the first exchange step. If the pH does not fall between and 8, adjust it with dilute DCl or NaOD. Remember that the glycosidic linkages of sialic acid residues tend to hydrolyze at pH or lower, whereas esters, such as O-acetates, are cleaved under both acidic (pH < 5) and basic (pH > 8) conditions.
One last purification step critical to the quality of the NMR spectrum is removal of paramagnetic impurities (metal ions). This step is typically carried out in the NMR lab just prior to the recording of the NMR spectrum; paramagnetic impurities may not bother the biologist in any other application of the glycoprotein and/or its oligosaccharides. Chelex is the best material to use for the routine removal of paramagnetic ions. A few particles of Chelex are sufficient to sequester the metals. Incubate in a small vial during the first and/or second exchange step, under slow, continuous swirling for 30 min, to remove the paramagnetics, and then pipet off the solution. In this way, a 0.5-mL sample can be processed with minimal loss from adsorption and with no significant dilution (9).
A few final remarks pertinent to sample preparation: Several companies market D2O of the quality (deuteration grade, free of paramagnetic impurities, and so on) required for this type of NMR analysis. It is best to purchase D2O (especially the 99.996% D-grade D2O) in small ampules (0.5-1.0 mL) rather than in large bottles (over 10 mL). In any case, the D2O should be handled in dry atmosphere so as to preserve its quality after the container is opened. Perform the exchanges in a glove box maintaining humidity at ~7 or 8%. Allow samples that have been stored in a freezer to warm to room temperature before dissolving them in D2O. Prerinse syringes and pipet tips in D2O just prior to use. Moisture in the air is the NMR spectroscopist’s worst enemy. Lyophilization can be replaced by flash evaporation. One way or the other, try to prevent contact of the sample with the air.
Prior to ¹H-NMR spectroscopic analysis, the sample is redissolved in 0.4 or 0.5 mL of D2O (99.996% D) and transferred into a 5-mm NMR tube. The actual volume depends on the length of the RF receiver coil in the probe that the NMR spectrometer uses. The sample should be dissolved at least 12 h before the actual spectrum is recorded to ensure complete solvation. This time lapse significantly improves the quality of the NMR spectrum over that of a spectrum recorded immediately after dissolving the sample. When transferring the sample into the NMR tube, filter it (over cotton wool, prerinsed with highest quality D2O) to remove any insoluble particulates. Check the pD of the resulting solution, either before or after the NMR spectrum is run, by putting a droplet on pH paper. The pD of the solution should be between and. It is not necessary to degas the solution and/or to seal the NMR tube for the types of NMR experiments described here.
to the solution for lock purposes (see Section 3.2.); the residual CD3COCD2H gives rise to a multiplet, centered around 2.167 ppm, and provides another means for the calibration of chemical shifts.
3.2 The NMR Spectrometer
¹H-NMR spectroscopy is performed on a pulse-FT NMR spectrometer, operating at a radio frequency in the range of 300-600 MHz for ¹H. For the purposes of this type of analysis, there are no major differences in performance of NMR spectrometers of different manufacturers (Bruker, General Electric, JEOL, Varian) of the same field strength. The spectra shown in this chapter were recorded in the author’s laboratory on a Bruker AM-500 spectrometer (operating at 500 MHz for ¹H) interfaced with an Aspect-3000 computer. It is important to use a high-sensitivity 5-mm probe for recording the ¹H-NMR spectra of the oligosaccharide. The D signal of the solvent serves as a reference for the field-frequency lock.
acetone-d6. As little as 50 µL need be added to the solution; this is sufficient for lock in the presence of 450 µL D2O.
After carefully adjusting the shims, i.e., optimizing the magnetic field homogeneity over the sample, the scene is set for data collection. During the NMR experiment, the sample is spun at a constant rate of ~15-20 Hz. If the gain in resolution does not outweigh the occurrence of spinning side bands (especially around the strong residual solvent “HOD” signal), the NMR spectra are recorded without spinning the sample. In our experience, it is not necessary to spin a sample if the nonspinning shim gradients are carefully adjusted.
3.3 Recording the ¹H-NMR Spectrum
Standard acquisition parameters for routine 1D ¹H-NMR spectroscopic analysis of oligosaccharides in D2O are as follows. With the spectral width set to 10-12 ppm and a time domain of 16K or 32K data points, we get an acquisition time of 2-4 s/scan. The flip angle of the pulse used is 70-75°. An additional relaxation delay between consecutive scans is not necessary under these conditions. (Typical T1 values for ¹H in medium-size oligosaccharides are 0.1-0.5 s.) As examples, we present the ¹H-NMR spectra of two oligosaccharide samples isolated from rCD4 (Scheme 2). The actual values for all relevant acquisition parameters can be found in the legend to Fig
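The link between spectral width, number of data points, and acquisition time can be checked with a few lines of arithmetic. This is a minimal sketch under one stated assumption: the time-domain size counts real plus imaginary points, so that the acquisition time is TD / (2 x SW in Hz), the convention used on Bruker instruments. The function name is illustrative.

```python
def acquisition_time_s(td_points, sw_ppm, spectrometer_mhz):
    """Acquisition time per scan in seconds.

    Assumes td_points counts real + imaginary points, so that
    AQ = TD / (2 * SW), with SW converted from ppm to Hz.
    """
    sw_hz = sw_ppm * spectrometer_mhz  # 1 ppm equals spectrometer_mhz Hz
    return td_points / (2.0 * sw_hz)


if __name__ == "__main__":
    # Parameter combinations from the text, for a 500-MHz spectrometer.
    for td in (16 * 1024, 32 * 1024):
        for sw in (10, 12):
            aq = acquisition_time_s(td, sw, 500)
            print(f"TD={td}, SW={sw} ppm: AQ={aq:.2f} s")
```

Under this convention the four combinations span roughly 1.4 to 3.3 s at 500 MHz, the same order as the 2-4 s/scan quoted above; a spectrometer that counts complex points in the time domain would give twice these values.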
[Figure: ¹H-NMR spectrum; labeled signals recoverable from the figure: HOD, H1 atoms, Man H2, NeuAc H3eq, Fuc CH3, and NeuAc H3ax; panel B.]

Data collection is continued until the signal-to-noise (S/N) ratio in the anomeric-proton region of the spectrum is at least 3, but preferably or better. Depending on the amount of carbohydrate material available for analysis, reaching this S/N value may require a few hundred up to several thousand scans. Thus, the total time to obtain the NMR spectrum is a few minutes (for ~1 µmol of carbohydrate) up to h (for 10 nmol of carbohydrate).
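The steep rise in measuring time for small samples follows from the fact that S/N in signal averaging grows only with the square root of the number of scans. A minimal sketch of that relationship is given below; the function names and the example numbers are illustrative assumptions, not values from the chapter.

```python
import math


def scans_needed(current_scans, current_sn, target_sn):
    """Scans required to reach target_sn, given that S/N scales as sqrt(scans)."""
    return math.ceil(current_scans * (target_sn / current_sn) ** 2)


def scans_for_smaller_sample(reference_scans, reference_nmol, sample_nmol):
    """Scans needed to keep the same S/N when less material is available.

    Signal per scan is taken as proportional to the amount of sample,
    so the scan count scales with the square of the amount ratio.
    """
    return math.ceil(reference_scans * (reference_nmol / sample_nmol) ** 2)


if __name__ == "__main__":
    # Doubling S/N costs four times as many scans.
    print(scans_needed(100, 3, 6))
    # A sample four times smaller needs sixteen times as many scans.
    print(scans_for_smaller_sample(256, 40, 10))
```

The quadratic dependence is why a hundredfold reduction in sample amount cannot be fully compensated by longer acquisition within a practical measuring time.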
Despite taking all the above precautions when preparing the sample for NMR analysis, the residual HOD signal will still appear as the dominant peak in all spectra but those of the most concentrated samples. The HOD signal is found at 4.75-4.80 ppm; its exact position varies with temperature, pH, and concentration of the solution and, therefore, should not be used for calibration of the chemical shift scale. If the residual HOD signal obscures any signals of interest, we have two ways to make signals in the immediate vicinity of the HOD peak visible. We can either modify the routine single-pulse NMR experiment (into a “water suppression” experiment) while maintaining sample temperature, or we can repeat the standard NMR experiment at higher temperature, since raising the sample temperature to 40 or 45°C is usually sufficient to observe the region around δ 4.7-4.9 undisturbed. However, solvent suppression is preferred over temperature elevation. Not only does the HOD signal shift when the sample temperature is changed, but some of the carbohydrate proton signals shift, too. Although the chemical shifts vary only slightly with temperature changes, such effects usually prevent unambiguous assignment of signals based on ambient temperature data. There is also the risk of degrading the sample (e.g., desialylation) at high temperature.
The simplest way to suppress the solvent signal is by fast pulsing. That is the major reason for not using an additional relaxation delay in data collection when the acquisition time is already on the order of a second (see earlier). Since carbohydrate protons have much shorter relaxation times than the HOD proton, the signals of the former will not be affected by “fast pulsing,” and, therefore, the sensitivity of the method will not be degraded.
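The effect of fast pulsing can be estimated from the standard steady-state expression for repeated pulsing, in which the longitudinal magnetization available before each pulse is reduced by the factor (1 - E1) / (1 - E1 cos θ), with E1 = exp(-TR/T1). The sketch below applies it to both kinds of protons. The carbohydrate T1 and the flip angle are taken from the text; the HOD T1 of 10 s and the 2-s repetition time are assumptions chosen only for illustration, and transverse magnetization is assumed to have decayed fully between pulses.

```python
import math


def steady_state_fraction(tr_s, t1_s, flip_deg):
    """Steady-state signal relative to a fully relaxed single pulse.

    tr_s: repetition time (here the acquisition time, no extra delay)
    t1_s: longitudinal relaxation time
    flip_deg: pulse flip angle in degrees
    """
    e1 = math.exp(-tr_s / t1_s)
    cos_theta = math.cos(math.radians(flip_deg))
    return (1.0 - e1) / (1.0 - e1 * cos_theta)


if __name__ == "__main__":
    TR_S = 2.0            # assumed repetition time in seconds
    FLIP_DEG = 72.5       # mid-point of the 70-75 degree range in the text
    T1_CARB_S = 0.5       # upper end of the 0.1-0.5 s range in the text
    T1_HOD_S = 10.0       # assumption: illustrative value for residual HOD

    print("carbohydrate:", round(steady_state_fraction(TR_S, T1_CARB_S, FLIP_DEG), 3))
    print("HOD:", round(steady_state_fraction(TR_S, T1_HOD_S, FLIP_DEG), 3))
```

Under these assumptions the carbohydrate signals keep nearly their full intensity, whereas the slowly relaxing HOD signal is partly saturated; the degree of attenuation depends strongly on the actual HOD T1 of the sample.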
Alternatively, presaturation, a technique very popular with peptide and nucleotide NMR spectroscopists (10), can be used to suppress the residual HOD signal. Careful adjustment of the irradiation time and power level of the presaturation pulse, usually generated by the ¹H decoupler channel of the spectrometer, is required to obtain a spectrum that still holds information on signals close to the HOD peak.
utilizes the difference in relaxation times between the HOD and carbohydrate protons. This experiment is based on the “inversion-recovery” principle. Typically, a (180° - τ - 90° - acquisition) sequence, in which the delay τ is empirically optimized, gives quite satisfactory results, especially if the 180° pulse is composite (90°x 180°y 90°x) (4). If, for sensitivity reasons, selective inversion of the HOD peak is desired, the 180° pulse in the preceding scheme may be replaced by a selective 180° pulse (usually a DANTE pulse train [11] or a shaped pulse). Both the nonselective and the selective WEFT experiment leave the regions immediately to the right and the left of the HOD signal unaffected.
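The inversion-recovery principle behind this experiment can be made concrete with the textbook recovery curve Mz(τ) = M0 [1 - 2 exp(-τ/T1)], which passes through zero at τ = T1 ln 2. The sketch below gives a starting value for the delay τ before empirical optimization. It assumes perfect inversion and full relaxation between scans; the HOD T1 of 10 s is an illustrative assumption, and the carbohydrate T1 is taken from the range quoted in Section 3.3.

```python
import math


def mz_after_inversion(tau_s, t1_s):
    """Longitudinal magnetization (in units of M0) at time tau after a 180 degree pulse."""
    return 1.0 - 2.0 * math.exp(-tau_s / t1_s)


def null_delay_s(t1_s):
    """Delay at which the inverted magnetization passes through zero."""
    return t1_s * math.log(2.0)


if __name__ == "__main__":
    T1_HOD_S = 10.0   # assumption: illustrative T1 of residual HOD
    T1_CARB_S = 0.5   # upper end of the range given for oligosaccharide protons

    tau = null_delay_s(T1_HOD_S)
    print("starting value for tau (s):", round(tau, 2))
    print("HOD magnetization at tau:", round(mz_after_inversion(tau, T1_HOD_S), 6))
    print("carbohydrate magnetization at tau:", round(mz_after_inversion(tau, T1_CARB_S), 6))
```

Because the carbohydrate protons relax much faster than HOD, they have recovered essentially completely by the time the HOD magnetization crosses zero, which is what the 90° read pulse then exploits.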
3.4 Data Processing
The result of the data collection just described is called a free induction decay (FID), or an NMR spectrum in the time domain. To convert the FID into the NMR spectrum in the frequency domain, one routinely applies FT, followed by phase correction. The FID can be manipulated before FT, depending on the aspect of the resulting NMR spectrum to be emphasized. When multiplied by an exponential function, the result after FT is a sensitivity-enhanced spectrum at the cost of resolution (line-broadening); however, when a sinusoidal or Gaussian function is used for window multiplication (S/N ratio permitting), the result is a resolution-enhanced spectrum. Often, after Gaussian multiplication, the number of data points is increased before FT (“zerofilling”) so as to ensure a sufficient number of data points to achieve a digital resolution of ~0.2-0.3 Hz/pt. Artificial resolution enhancement was applied to the spectra displayed in Figs 1B and 2B; parameters are specified in the legends. The latter technique is most useful in the methyl proton region of the ¹H-NMR spectrum, where small differences in chemical shift between sharp methyl singlets (NAc signals) or doublets (Fuc C6 protons) are very significant for structural analysis. Sometimes, spectral integration is applied in the anomeric-proton region of the spectrum, mainly to verify the purity of the sample by determining the ratio in which two or more components in the sample occur.
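The processing steps described above (window multiplication, zero-filling, FT) can be illustrated on a synthetic FID. This is a toy sketch, not the spectrometer software: it uses a single decaying sinusoid, a deliberately small spectral width of 100 Hz, and a direct discrete Fourier transform so that it runs with the standard library alone. All names and numerical values are assumptions made for illustration; only exponential multiplication is shown, and a Gaussian window would be applied at the same point.

```python
import cmath
import math


def synthetic_fid(freq_hz, t2_s, dwell_s, n_points):
    """One decaying complex sinusoid standing in for a single resonance."""
    return [cmath.exp(2j * math.pi * freq_hz * k * dwell_s)
            * math.exp(-k * dwell_s / t2_s) for k in range(n_points)]


def exponential_window(fid, dwell_s, lb_hz):
    """Exponential multiplication; adds lb_hz to the linewidth after FT."""
    return [x * math.exp(-math.pi * lb_hz * k * dwell_s)
            for k, x in enumerate(fid)]


def dft_zero_filled(fid, size):
    """Direct DFT of the FID padded with zeros up to `size` points."""
    n = len(fid)
    return [sum(fid[m] * cmath.exp(-2j * math.pi * m * k / size)
                for m in range(n)) for k in range(size)]


def digital_resolution_hz(sw_hz, size):
    """Frequency spacing between adjacent points of the spectrum."""
    return sw_hz / size


def peak(spectrum, sw_hz):
    """Frequency (Hz) and height of the largest point in the spectrum."""
    size = len(spectrum)
    k = max(range(size), key=lambda i: abs(spectrum[i]))
    signed_k = k if k < size // 2 else k - size
    return signed_k * sw_hz / size, abs(spectrum[k])


SW_HZ = 100.0          # toy spectral width, complex sampling
DWELL_S = 1.0 / SW_HZ
fid = synthetic_fid(freq_hz=20.0, t2_s=0.3, dwell_s=DWELL_S, n_points=128)

plain = dft_zero_filled(fid, 512)  # zero-filled from 128 to 512 points
broadened = dft_zero_filled(exponential_window(fid, DWELL_S, 2.0), 512)

if __name__ == "__main__":
    print("digital resolution (Hz/pt):", digital_resolution_hz(SW_HZ, 512))
    print("peak without window:", peak(plain, SW_HZ))
    print("peak with 2-Hz line broadening:", peak(broadened, SW_HZ))
```

Zero-filling from 128 to 512 points brings the digital resolution of this toy spectrum to about 0.2 Hz/pt, and the exponential window lowers and broadens the peak, which is the resolution cost mentioned in the text.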
3.5 Spectral Interpretation
interpretation) can demonstrate the identity of compounds. Thus, a 1D ¹H-NMR study may suffice for primary structure determination if the oligosaccharide has been characterized previously. Several glycoprotein carbohydrate ¹H-NMR data bases are available for N-type glycopeptides and oligosaccharides in D2O, for example (5,12,13). When the spectrum does not match any of the spectra in existing data bases, attempts can be made to interpret the ¹H-NMR spectrum in terms of (partial elements of) the primary structure of the carbohydrate (including anomeric configurations and positions of glycosidic linkages) by using the well documented structural-reporter group concept (5,6). As an example of this approach, the interpretation of the 500-MHz ¹H-NMR spectra of two oligosaccharide samples isolated from soluble human rCD4 (Figs 1 and 2) is discussed.
We basically ignore the crowded region in the center of the spectrum (between and ppm; Figs 1A and 2A), and only the positions and patterns of those signals that are individually observable are examined. Particularly useful structural reporter groups in such 1D analyses are:
1 The anomeric (H1) protons;
2 The protons attached to the carbon atoms in the direct vicinity of a substitution position;
3 The protons attached to deoxy carbon atoms; and
4 Methyl protons, e.g., in N-acetyl groups
The structural-reporter-group regions of the spectra are depicted in Figs 1B and 2B. The chemical shifts of the structural reporter groups are measured and compiled in a table. These values are then compared with literature data on similar N-type oligosaccharides and/or glycopeptides.
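[Figure: ¹H-NMR spectrum, panels A and B; labeled signals recoverable from the figure: HOD and NeuAc H3ax; chemical-shift axis in ppm; inset structure containing the elements [Manα(1→6)]0-1, Manα(1→3), Manα(1→6), Manβ(1→4)GlcNAc, and NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3).]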
Figure 1 shows the ¹H-NMR spectrum of fraction Q2 released from rCD4 glycosylation site Asn-271 by N-glycanase and purified on the basis of its charge (Scheme 2). The NMR spectrum indicates that the sample contains a mixture of two di-antennary N-acetyllactosamine-type oligosaccharides ending in N,N'-diacetylchitobiose (compare Scheme 1). The di-antennary type of branching is evident from the set of chemical shifts of the Man H1 and H2 atoms (Table 1) (cf. 1-3,5). Tri-, tri'-, and tetra-antennary structures would reveal their degree of branching by virtue of, among other features, different sets of Man H1 and H2 chemical shifts (see Table 2).
Both branches of the di-antennary oligosaccharides Q2 terminate in NeuAc attached in α(2→3)-linkage to Gal, as seen by the pair of NeuAc H3ax and H3eq signals (δ 1.80 and 2.76, respectively; Table 1). The precise position of the NeuAc H3ax signal reflects the branch location of the NeuAc residue. The position of the H3ax signal is different for NeuAc in the C3-linked (δ 1.796) vs the C6-linked branch (δ 1.799). These values, along with the positions of the Gal-6 and Gal-6' H1 doublets and the GlcNAc-5 and GlcNAc-5' NAc signals, are valuable for determining both the type of linkage of the NeuAc residue to Gal and the branch location of the residue (2,5). (It is worth noting that α(2→6)-linked NeuAc residues [although not found in the carbohydrate chains of glycoproteins expressed in CHO cells] have different characteristic chemical shifts for their H3ax and H3eq signals. They also exert different effects on the chemical shifts of the aforementioned structural reporter groups of residues in the sialylated branch; see ref. 14.)
(β); each pair of signals occurs in the intensity ratio typical of reducing oligosaccharides ending in GlcNAc, α:β ~ 2: All three of the structural-reporter-group signals of Fuc show relatively large anomerization effects (Δδ α-β). Extending the chitobiose unit by Fuc α(1→6) at GlcNAc-1 affects the chemical shifts of H1 (Δδ 0.055 ppm) and NAc protons (Δδ 0.013 ppm) of GlcNAc-2, and of H1 in the α-anomer of GlcNAc-1 (Δδ -0.008 ppm) (see Table 1). The latter effect was used to determine the ratios of fucosyl and nonfucosyl compounds in mixture Q2 (Fig 1) as a complementary aid to the intensity ratio of the NAc signals of GlcNAc-2 at δ 2.096/2.093 for the fucosyl and δ 2.082/2.081 for the nonfucosyl compound (Table 1).
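The use of the GlcNAc-2 NAc signals to estimate the proportion of fucosyl and nonfucosyl compounds can be sketched as a small lookup followed by a ratio of integrals. The reference shifts below are the four values quoted in the text; the matching tolerance, the function names, and the peak intensities in the example are hypothetical and serve only to show the bookkeeping.

```python
# Reference NAc shifts of GlcNAc-2 (ppm) quoted in the text for the two
# anomers of the fucosyl and nonfucosyl compounds.
REFERENCE_NAC_SHIFTS = {
    "fucosylated": (2.096, 2.093),
    "nonfucosylated": (2.082, 2.081),
}


def classify_nac_signal(shift_ppm, tolerance_ppm=0.003):
    """Assign a measured NAc shift to the nearest reference within tolerance.

    tolerance_ppm is an assumed matching window, not a value from the text.
    Returns None when no reference shift is close enough.
    """
    best_label, best_diff = None, None
    for label, refs in REFERENCE_NAC_SHIFTS.items():
        for ref in refs:
            diff = abs(shift_ppm - ref)
            if diff <= tolerance_ppm and (best_diff is None or diff < best_diff):
                best_label, best_diff = label, diff
    return best_label


def fucosylated_fraction(peaks):
    """Fraction of fucosyl compound from (shift_ppm, integral) pairs."""
    totals = {"fucosylated": 0.0, "nonfucosylated": 0.0}
    for shift_ppm, integral in peaks:
        label = classify_nac_signal(shift_ppm)
        if label is not None:
            totals[label] += integral
    total = sum(totals.values())
    return totals["fucosylated"] / total if total else None


if __name__ == "__main__":
    # Hypothetical integrals, for illustration only.
    example_peaks = [(2.096, 2.0), (2.093, 1.0), (2.082, 0.6), (2.081, 0.4)]
    print(fucosylated_fraction(example_peaks))
```

Signals that fall outside the tolerance of every reference shift are ignored rather than forced into a class, so an unexpected NAc signal shows up as missing intensity instead of a wrong assignment.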
for Di-antennary Oligosaccharides of the N-Acetyllactosamine Type Released by N-Glycanase
Chemical shift,b ppm, in c
QO+F QO-F Ql+F Ql-F Ql’+F Ql’-F Q2+F Q2-F
Anomer Reporter of ollgo- group Residuea sacchande H-l GlcNAc-Id
Fuccr( l-6) GlcNAc-2 Man-3 Man-4 Man-4 GlcNAc-5 GlcNAc-5’ Gal-6 Gal-6 H-2 H-3 Man-3 Man-4 Man-4’ Gal-6 Gal-6’
5 182 70 4.889 895 66 4.66 77 5.121 4.927 585 585 4.469 474
5 191 470
5 183 696 4.889 896 665 669 77 119 928 4.575 4.583 4544 474
5 190 696
5.190 696
- - -
- 615 606 77 121 927 585 4.585 469 4.474
- 614
4 605 77
5 119 928
4.575 583 4544 474 5.183 696 4.89 4.90 4.665 4.669 77 119 4.926 4.583 575 4.467 550
- 614 605 4.77 119 4.926 583 575 4.467 550 5.182 697 4.893 4900 4.663 667 77 118 924 573 573 4544 550 248 248 4.247 247 247 247 246 4.190 190 191 4.191 4.191 191 190 110 4.110 108 108 108 108 114 ride ride 113 113 nd’ nd’ 4.113 ride ride nd’ nd’ 113 113 118
H-3eq H-5
NeuAc NeuAc’ Fuca( 1+6)
CH3 Fuca( 146)
NAc GlcNAc- GlcNAc-2 GlcNAc-5 GlcNAc-5’ NeuAc Neu AC - - 4097 4.130 1.209 1220 2.039 2095 2.091 2.051 2.049 - -
- 757 - - - 4.095 - 4.135 - 1.210 - 1.220 2.039 039 2.082 2.096 082 2.093 2.05 2.048 2049 2.048 - 2.03 - -
2.757 - - 2.759 - 2.757 757 2.759 - 10 - 4.095 - 4.13 - 4.136 - 1.210 - 212 - 220 - 1.222 2.039 2.039 2.039 2.039 082 2096 082 2.096 2.080 2.093 2.080 2.093 2.048 2051 2051 2.048 2048 2.045 2.045 2.043 2031 - - 2.032 - 2.03 2.03 2.032
2.759 759 - - 039 2.082 2081 2.048 2.043 032 2.032
a The numbering system used for denoting glycosyl residues in the di-antennary oligosaccharides is as follows:

N'         6'       5'          4'
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)
                                         \ 3        2                         1
                                          Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]0-1GlcNAc
                                         /
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)
N          6        5           4
b Data were acquired at 500 MHz for neutral solutions of the compounds in D2O at 27°C.
“Oligosaccharides were released from recombinant soluble human CD4 or from recombmant human tissue plasmmogen activator (1,2); for complete structures, compare Scheme QO denotes asialo, Ql denotes mononalyl, and Q2 stands for dlsialyl ohgosaccharide Ql’ denotes a monosmlyl ch-antennary ohgosaccharide havmg Its siahc acid residue attached to Gal-6 The F stands for an a( 146)~fucosyl residue at GlcNAc-1 Structures are schematxally illustrated m the table heading usmg a shorthand symbohc notation; W = GlcNAc, = Gal, = Man; A = NeuAc, = Fuc The peripheral umt on the left corresponds to the glycosyl residues 5-6-N, the umt on the nght to the 5’-6’-N’ glycosyl resrdues dData for correspondmg, reduced ohgosacchandes are compiled m (14)
and Tetra-antennary Oligosaccharides of the N-Acetyllactosamine Type Released by N-Glycanase
Chemical shift,b ppm in c
Q3+F Q3-F Q3’+F Q3’-F Q4+F Q4-F Q(4+ l)+F 4(4+2)-F
Reporter
group Residuea
Anomer of ohgo- sacchande H-l GlcNAc-
Fucoc( 1+6) GlcNAc-2 Man-3 Man-4 Man-4 GlcNAc-5 GlcNAc-5’ GlcNAc-7 GlcNAc-7’ Gal-6 Gal-6’ Gal-8
a 181 P 690 if 4 893 899 if 4 663 668 %P 760 a$ 114 a-3 910 a$ 562 %P 573 542 %P - %P 542 @P 549 a$ 546
5 190 4.690 - - 4.615 606 760 5.114 910 562 4.573 542 - 542 549 546
5 181 190 690 690 893 - 899 - 663 614 668 605 760 760 123 123 4871 871 4.572 572 590 590
- - 562 562 546 4.546 546 546
- - 182 4.688 4.902 910 662 667 76 131 858 563 594 542 562 542 4.547 547 5.190 688 - - 615 4.606 4.76 131 4.858 563 4.594 542 562 4.542 547 547
(137)H-2 H-3 H-3ax GlcNA@ Galp4add Galp4add Man-3 Man-4 Man-4 Gal-6 Gal-6 Gal-8 Gal-8’ GalP4add Galf14add NeuAc NeuAc’ NeuAc* NeuAc*’ H-3eq H-4 H-S NeuAc Neu Ad NeuAc* NeuAc*’ Gal-6 Gal-8’ Fuca( 1+6)
CH3 Fucct( 1+6)
- - - 214 214 107 122 122 122
- - - 1813 1.813 1813 - 2.756 756 756 - ndf ndf 4095 136 1.211 1221 - - - 214 4.214 107 4.122 4.122 122
- - - 1.813 1.813 1.813 - 2.756 756 2.756 - nd ndf - - - 253 4.196 4091 122 4.122
- 122
- - 1.813 1.813 - 1813 756 2.756 - 2.756 ndf nd 4.095 136
1.211 1221
- - - 253 4.196 4091 4.122 4.122 - 4.122 - - 1.813 1.813 - 813 2.756 756
- 2.756 nd ndf - - -
4209 4.209 4.224 224 4092 4.092 120 4.120 4.120 120 120 4.120 4.120 120
- - -
1.805 1.805 805 1.805 1.805 1.805 1.805 1.805 756 756 2.756 756 756 756 2.756 2.756 ndf nB ndf nd 4.095
4 135 1.211 1221 - - - - - 556 - 4.210 4.223 4.090 117 ndf 117 117 4.117 - 1.803 803 1.803 1.803 2.756 2.756 2.756 756 162 ndf 4095 4.135 1210 1220 4.697
4.556 Ei 4.556
% 4.212 4.224 (2” 4.090 4.117 kl ndf 0’ i% 4.117 Be rid : 4.117
4 117 e g 1.803 s 1.803 R 1.803 R’ 1.803 P 757
2.757 757 2.757 162 4.162 4095 4.135
Chemical shift,b ppm in c
Reporter group NAc
Residuea GlcNAc- GlcNAc-2 GlcNAc-5 GlcNAc-5’ GlcNAc-7 GlcNAc-7’ GlcNAcP3 GlcNAcP3’ NeuAc NeuAc’ NeuAc* NeuAc*’ Anomer of ohgo- sacchande
Q3+F Q3-F Q3’+F Q3’-F Q4+F Q4-F Q(4 + l)+F Q(4 + 2)-F
2 039 039 2097 2.083 2095 2081 2044 2044 2044 2.044 2.074 074
- - -
- - 2031 2031 2031 2031 2031 2031
-
2 039 039 2095 082 2091 2081 052 052 039 039
- - 039 039
- - - 2031 2031 - 2031 - 2031 2031 - 2031
2 039 2.039 2.095 2.082 2091 080 2.048 2048 039 039 075 075 039 039
- - - - 030 030 030 2.030 030 030 030 030
2.038d 2094 2090 2047 037d 075 035d 036d - 030 030 030 030
N*'        8'       7'
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→6)
N'         6'       5'         \4'
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)
                                         \            Fucα(1→6)\
                                          Manβ(1→4)GlcNAcβ(1→4)GlcNAc
                                         /  3         2          1
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)
N          6        5          /4
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→4)
N*         8        7
The residues in the additional N-acetyllactosamine units in compounds Q(4 + 1) + F and Q(4 + 2) + F are denoted GlcNAcβ3 and Galβ4add (see footnote c).
bData were acquired at 500 MHz for neutral solutions of the compounds in D2O at 22-27°C.
cOligosaccharides were released from recombinant human tissue plasminogen activator (2) or from recombinant human erythropoietin (Watson, Blithe, and Van Halbeek, in preparation); for complete structures, compare Scheme. Q3 denotes trisialyl tri-antennary, Q3' denotes trisialyl tri'-antennary, and Q4 stands for tetrasialyl tetra-antennary oligosaccharide. Q(4 + 1) denotes a tetrasialyl tetra-antennary oligosaccharide having an additional (sialylated) N-acetyllactosamine unit β(1→3)-attached to Gal-6. Q(4 + 2) denotes a tetrasialyl tetra-antennary oligosaccharide having two additional (sialylated) N-acetyllactosamine units β(1→3)-attached to Gal-6 and Gal-8, respectively. The F stands for an α(1→6)-fucosyl residue at GlcNAc-1. Structures are schematically illustrated in the table heading using a shorthand symbolic notation, n = GlcNAc, = Gal, = Man; A = NeuAc, D = Fuc. The peripheral units, from left to right, correspond to the glycosyl residues 5-6-N, 7-8-N*, 5'-6-N' and 7'-8'-N*', respectively.
of the High-Mannose and Hybrid Types Released by Endo-H or by N-Glycanase
Chemical shift,b ppm in c
ml HM
(5 + 1) (::) (6 + 2) (?2) (:?2) (9”+“1) (!? ) (1E2) EH- 1”B) EH - 2(+W Anomer
Reporter of oligo- group Residue” sacchande H-l GlcNAc-
GlcNAc-2 Man-3 Man-4 Man-C Man-D, Man-4 Man-A H-2 Man-D, Man-B Man-D, Man-E Man-D, Glc-NAc-5 Gal-6 Man-3 Man-4 Man-C a P ; a# a,P a,P ;1 a# a.P a$ a# a3 ; a$ a$ - - 245 72 77 108
- - 874 083 108 - 4911 - - - - - 255 4244 069
-
- 189 - 698 245 - 72 597 77 4.765 352 5340 054 5.046
- - 874 870 083 - 108 093
- - 4911 909
- - - - - - - - - - 4244 - 232 4.230 4118 ndd 4069 ndd
5 189 698 597 4.765 340 5.301 5046 870
- 093
- 909
- - - - -
5 189 698 - 597 765 340 301 5.046 870 - 093 142 5046
-
5 189 4.698
- 597 765 5.340 301 5046 870
- 5404 046 142 5.046 - - - - - 249 4.720 782 5.347 5304 050 874 085 115 - 147 5042 932
- - - - - 5.192 470 - 4.602 77 5.338 301 5048 873
- 095
- 141 5.048 141 5042
-
- -
- - 4.166 - - 230 230 158 230 4.261 ndd ndd 089 ndd 106 ndd ndd 117 ndd 092
- - 252 472 77 124
- - 897 094 124 - - - - - 575 4544 4.256 239 197 -
- 248 72 77 121
- - 876 079 105 - 4911
-
- - 576 545
ii 256 s 239 22 202 cw
(141)H-3 H-fax H-3eq NAc Man-A Man-D, Man-B Man-D, Man-E Man-D, Gal-6 NeuAc NeuAc GlcNAc- GlcNAc-2 GlcNAc-5 NeuAc
a$ 4069 4.069 ndd
a$ - - -
a$ 3.99 3.98 ndd
a$ - - -
a$ - - -
a$ - - -
a$ - - -
a$ - - -
a$ - - -
43 2.039
a.P 2.043 2043 - 2064
a$ - - -
a$ - - -
ndd - ndd - - - - - - 039 2064 - - ndd - ndd ndd - - - - - 2.039 2064 - - 4.053 - 4.027 4.074 3991 - - - - - 045
- - ndd ndd ndd ndd - - - - - 2.039 2064 - - 4067 - 4018 067 4.018 4.067 - - - 038 2.065 - - 4049 - - - - - - 116 4.114
I 797 I 797 755 757
- -
2.045 2045 050 050 2031 2031
4049 - 3.98
- -
The numbering system used for denoting glycosyl residues in the high-mannose oligosaccharides is as follows:
Manα(1→2)Manα(1→6)
D3 B
>
Manα(1→6)
Manα(1→2)Manα(1→3)
D2 A Manβ(1→4)GlcNAcβ(1→4)GlcNAc
Manα(1→2)Manα(1→6) >
D4 E
>
Manα(1→3) Manα(1→2)Manα(1→2)
Q C
and in the hybrid oligosaccharides:
B
Manα(1→6) A Manα(1→6) Manα(1→3) > Manβ(1→4)GlcNAcβ(1→4)GlcNAc
NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)
N
bData were acquired at 500 MHz for neutral solutions of the compounds in D2O at 27°C.
cOligosaccharides were released from recombinant soluble human tissue plasminogen activator (2), from recombinant hepatitis B surface antigen preS2 + S (3), or from allergen Art v II (4); for complete structures, compare Scheme. HM(5 + 1) denotes high-mannose Man5GlcNAc, HM(6 + 2) denotes Man6GlcNAc2, and so on; EH-1 stands for endo-H released hybrid oligosaccharide-1. Structures are schematically illustrated in the table heading using a shorthand symbolic notation, n = GlcNAc, = Gal, = Man, A = NeuAc. The peripheral unit on the left corresponds to the glycosyl residues C-D1, and the unit on the right to the B-D3 glycosyl residues.
Man-A in the α-anomer of the respective oligosaccharides (at 5.094 and 5.079) (see Fig 2B).
The structures of the oligosaccharides released from Asn-27 and Asn-300 of rCD4, and their relative abundances, have been published (1). A pictorial representation of the site heterogeneity of the carbohydrate structures of recombinant soluble CD4 expressed in CHO cells is given in Fig
4 Notes
1. Advantages and disadvantages of NMR. NMR spectroscopy is a powerful method for primary structural characterization of glycoprotein carbohydrates, but, standing alone, the method has its limitations. Therefore, NMR should be the first, but never the only step in the structural analysis procedure. Partial or even complete primary structure determination is possible from the 1D ¹H-NMR spectrum provided that structurally related compounds have been previously characterized by ¹H-NMR spectroscopy. It is recommended that the glycosyl-residue composition be obtained independently by chemical analysis and the mol wt be verified by FAB mass spectrometry. The most important advantage of NMR spectroscopy over other techniques used for structural analysis of carbohydrates is its nondestructive nature. The oligosaccharide/glycopeptide sample, after NMR analysis, can be recovered 100% unimpaired and used for other analyses, biological activity tests, and so forth. Also, mixtures of structurally closely related components can be analyzed successfully. The most important limitation of NMR spectroscopy is its sensitivity. Not only are at least 10-15 nmol of pure carbohydrate required to record an NMR spectrum, even at 600 MHz; heterogeneity occurring in low abundance in the sample may also escape attention. For example, the occurrence of NeuGc eluded NMR analysis (4% of total sialic acid in the samples discussed in Figs and 2, as determined by sialic acid analysis, see Fig 4; cf [15]).
¹H-NMR spectroscopy may also fail to detect the presence of non-magnetically active nuclei in the carbohydrate: Although it is relatively straightforward to detect the presence of a phosphate (16) or O-acetate (17) group in an oligosaccharide by NMR, sulfate may escape detection (see, however, ref 18).
Fig Histogram showing the glycosylation site heterogeneity of recombinant soluble human CD4 expressed in CHO cells (1). Explanation of the symbolic notation: n = GlcNAc, = Gal, = Man, Cl = Fuc, A = NeuAc (compare Table 1). A small portion (9%) of the structures EH-1 and EH-2 is attached to Asn-300 via C6-fucosylated GlcNAc; the remainder (91%) of the structures is linked to Asn-300 through GlcNAc devoid of fucose. The N-acetyllactosamine-type structures occur in the glycoprotein as shown.
sable to define structural elements that extend the core and backbone of the common structures (Scheme 1), but it is not always possible to delineate unambiguously their branch location by NMR alone. A classical example of the latter situation is the so-called poly-N-acetyllactosamine type structures, i.e., extensions of the basic di-, tri-, or tetra-antennary oligosaccharides (Scheme 1) with a number of N-acetyllactosamine units (in series and/or parallel) attached β(1→3) and/or β(1→ ) to Gal residues (21) (compare Table 2). Also, blood group and other antigenic determinants in the peripheral regions of N-type oligosaccharides cannot always be located in
[Chromatogram: peak labeled NeuAc; x-axis: Time (min), 8 to 18]
Fig Determination, by high-pH anion-exchange chromatography with pulsed amperometric detection (PAD), of sialic acids in recombinant soluble human CD4 expressed in CHO cells after mild acid hydrolysis (0.1 M TFA, 80°C, h; then Dionex AS6). The glycoprotein was found to contain NeuAc (R = Ac (CO-CH3)) and NeuGc (R = Gc (CO-CH2OH)) in the ratio of 96:4.
2. Automation of the method. With a dedicated microcomputer at the heart of the NMR spectrometer, the method of recording spectra is easily automated. However, sample preparation will remain the responsibility of a researcher. The most time-consuming part of the structural analysis of carbohydrates by NMR, until now, has been spectral interpretation. It is here that efforts along two different avenues are underway to automate the method. The use of a search algorithm to compare a list of chemical shifts of structural reporter groups with all those in a data base appears to be rather straightforward. Indeed, several such computer programs have been written to assist in the interpretation of glycoprotein oligosaccharide ¹H-NMR spectra (7,23). A much more elegant and potentially faster way is to use the entire spectrum for pattern recognition, including the 3-4 ppm envelope region. The NMR spectrum, already available in digital format, would not be reduced into a list of chemical shifts, as is done for human
networks have been successfully applied for automated spectral interpretation, including NMR spectra (8). In the foreseeable future, (NMR) spectral data bases will be connected to the complex carbohydrate structure data base (CCSD) (24). The neural network search algorithms will be made available to the scientific community much like CarbBank.
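The first approach in this note, comparing a list of reporter-group chemical shifts with the entries of a data base, can be illustrated with a short sketch. This is not the algorithm of the programs cited in (7,23); it is a simple tolerance-based match written for illustration. The function names, the tolerance of 0.005 ppm, and the two-entry data base are all assumptions, with shift values chosen only to resemble those quoted for the NAc signals in Table 1.

```python
def match_score(observed, reference, tolerance=0.005):
    """Fraction of reference reporter-group shifts matched within tolerance (ppm)."""
    matched = 0
    for group, ref_shift in reference.items():
        obs_shift = observed.get(group)
        if obs_shift is not None and abs(obs_shift - ref_shift) <= tolerance:
            matched += 1
    return matched / len(reference)


def search(observed, database, tolerance=0.005):
    """Rank data base entries by match score, best first."""
    ranked = sorted(
        ((match_score(observed, ref, tolerance), name) for name, ref in database.items()),
        reverse=True,
    )
    return [(name, score) for score, name in ranked]


# Hypothetical data base: reporter group -> chemical shift (ppm).
database = {
    "fucosylated": {"GlcNAc-1 NAc": 2.039, "GlcNAc-2 NAc": 2.096},
    "nonfucosylated": {"GlcNAc-1 NAc": 2.039, "GlcNAc-2 NAc": 2.082},
}
observed = {"GlcNAc-1 NAc": 2.039, "GlcNAc-2 NAc": 2.095}
results = search(observed, database)
print(results)
```

A real data base would hold many more reporter groups per structure, and the tolerance would have to reflect the reproducibility of the chemical shifts under the recording conditions.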
3. De novo structural elucidation of carbohydrates by NMR spectroscopy. When the 1D ¹H spectrum does not resemble that of a known oligosaccharide structure, the combination of multiple-pulse ¹H-NMR spectroscopic techniques (chiefly, TOCSY and ROESY) may be applied for the de novo sequencing of the carbohydrate, provided that 1-3 μmol of pure substance are available for the analysis. The TOCSY technique permits subspectral editing of the ¹H spectrum for each constituting monosaccharide and, consequently, the virtually complete assignment of all the multiplet patterns in the ¹H-NMR spectrum. Subsequently, from the ROESY spectrum, we can deduce the sequence of the monosaccharide residues, including identification of the positions and configurations of glycosidic linkages. A discussion of more sophisticated NMR techniques is beyond the scope of this chapter. However, the interested reader is referred to recent monographs (25,26) and review articles (27-30). As mentioned earlier, for de novo sequencing of the carbohydrates by experiments, such as 1D and 2D TOCSY and ROESY, typically 100 times the amount of sample mentioned for the 1D analysis is needed (e.g., μmol at 500 MHz).
4. Solution conformation analysis by NMR spectroscopy. ¹H-NMR is presented here as a method eminently suited for the elucidation of the primary structure of glycoprotein carbohydrates. It is also the method of choice for solution conformation analysis. Complete ¹H resonance assignments and primary structure determination are a prerequisite for the analysis of the solution conformation based on quantitation of (¹H,¹H) NOEs. Oftentimes assisted by other NMR parameters (¹³C chemical shifts, heteronuclear coupling constants and NOE effects, isotope shift effects, and so on [31]) and always evaluated by theoretical conformational analysis, i.e., potential energy calculations of one sort or another (HSEA, AMBER, MM2, Monte Carlo, molecular dynamics, and so on) (32,33), 2D and 3D ¹H-NMR spectroscopy is the key experimental technique for solution conformation analysis of carbohydrates and glycoconjugates.
Acknowledgments
Research in the author's lab is supported by National Institutes of Health Grants P41-RR-05351, P01-AI-27135, and R01-HL-38213. The author is indebted to Rosemary Nuri for editing the manuscript.
Abbreviations
CHO, Chinese hamster ovary; 1D, one-dimensional; 2D, two-dimensional, and so on; CCSD, complex carbohydrate structure data base; DSS, sodium 4,4-dimethyl-4-silapentane-1-sulfonate; FAB, fast-atom bombardment; FID, free induction decay; FT, Fourier transform(ation); ¹H (H), hydrogen; D, deuterium; T, tritium; HBV, hepatitis B virus; NOE, nuclear Overhauser effect; rCD4, recombinant cluster differentiation antigen; RF, radio frequency; ROESY, rotating-frame NOE-correlated spectroscopy; rtPA, recombinant human tissue plasminogen activator; S/N, signal-to-noise ratio; TOCSY, total correlation spectroscopy; WEFT, water-eliminated FT.
References
1. Spellman, M. W., Leonard, C. K., Basa, L. J., Gelineo, I., and Van Halbeek, H. (1991) Carbohydrate structures of recombinant soluble human CD4 expressed in Chinese hamster ovary cells. Biochemistry 30, 2395-2406.
2. Spellman, M. W., Basa, L. J., Leonard, C. K., Chakel, J., O'Connor, J. V., Wilson, S., and Van Halbeek, H. (1989) Carbohydrate structures of human tissue plasminogen activator expressed in Chinese hamster ovary cells. J. Biol. Chem. 264, 14,100-14,111.
3. Yu Ip, C. C., Miller, W. J., Kubek, D. J., Strang, A.-M., Van Halbeek, H., Piesecki, S. J., and Alhadeff, J. A. (1992) Structural characterization of the N-glycans of a recombinant hepatitis B surface antigen derived from yeast. Biochemistry 31, 285-295.
4. Nilsen, B. M., Sletten, K., Smestad Paulsen, B., O'Neill, M., and Van Halbeek, H. (1991) Structural analysis of the glycoprotein allergen Art v II from the pollen of mugwort (Artemisia vulgaris L.). J. Biol. Chem. 266, 2660-2668.
5. Vliegenthart, J. F. G., Dorland, L., and Van Halbeek, H. (1983) High-resolution ¹H-nuclear magnetic resonance spectroscopy as a tool in the structural analysis of carbohydrates related to glycoproteins. Adv. Carbohydr. Chem. Biochem. 41, 209-374.
6. Van Halbeek, H. (1984) Structural analysis of the carbohydrate chains of mucin-type glycoproteins by high-resolution ¹H-NMR spectroscopy. Biochem. Soc. Trans. 12, 601-605.
8. Meyer, B., Hansen, T., Nute, D., Albersheim, P., Darvill, A. G., York, W. S., and Sellers, J. (1991) Identification of the ¹H-NMR spectra of complex oligosaccharides with artificial neural networks. Science 251, 542-544.
9. Oppenheimer, N. J. (1989) Basic techniques. Sample preparation. Methods Enzymol. 176, 78-92.
10. Hore, P. J. (1989) Basic techniques. Solvent suppression. Methods Enzymol. 176, 64-77.
11. Haasnoot, C. A. G. (1983) Selective solvent suppression in ¹H FT-NMR using a DANTE pulse; its application in normal and NOE measurements. J. Magn. Reson. 52, 153-158.
12. Carver, J. P. and Grey, A. A. (1981) Determination of glycopeptide primary structure by 360-MHz proton magnetic resonance spectroscopy. Biochemistry 20, 6607-6616.
13. Brockhausen, I., Grey, A. A., Pang, H., Schachter, H., and Carver, J. P. (1988) N-acetylglucosaminyltransferase substrates prepared from glycoproteins by hydrazinolysis of the GlcNAc-Asn linkage. Purification and structural determination of oligosaccharides with mannose and N-acetylglucosamine at the nonreducing termini. Glycoconjugate J. 5, 419-448.
14. Green, E. D., Adelt, G., Baenziger, J. U., Wilson, S., and Van Halbeek, H. (1988) The asparagine-linked oligosaccharides of bovine fetuin: Structural analysis of N-glycanase-released oligosaccharides by 500-MHz ¹H-NMR spectroscopy. J. Biol. Chem. 263, 18,253-18,268.
15. Hokke, C. H., Bergwerff, A. A., Van Dedem, G. W. K., Van Oostrum, J., Kamerling, J. P., and Vliegenthart, J. F. G. (1990) Sialylated carbohydrate chains of recombinant glycoproteins expressed in Chinese hamster ovary cells contain traces of N-glycolylneuraminic acid. FEBS Lett. 275, 9-14.
16. Couso, R. O., Van Halbeek, H., Reinhold, V. N., and Kornfeld, S. (1987) The high-mannose oligosaccharides of Dictyostelium discoideum glycoproteins contain a novel intersecting N-acetylglucosamine residue. J. Biol. Chem. 262, 4521-4527.
17. Damm, J. B. L., Voshol, H., Hård, K., Kamerling, J. P., and Vliegenthart, J. F. G. (1989) Analysis of N-acetyl-4-O-acetylneuraminic acid-containing N-linked carbohydrate chains released by N-glycanase. Application to the structure determination of the carbohydrate chains of equine fibrinogen. Eur. J. Biochem. 180, 101-110.
18. De Waard, P., Koorevaar, A., Kamerling, J. P., and Vliegenthart, J. F. G. (1991) Structure determination by ¹H-NMR spectroscopy of (sulfated) sialylated N-linked carbohydrate chains released from porcine thyroglobulin by N-glycanase. J. Biol. Chem. 266, 4237-4243.
19. Paz Parente, J., Wieruszeski, J. M., Strecker, G., Montreuil, J., Fournet, B., Van Halbeek, H., Dorland, L., and Vliegenthart, J. F. G. (1982) A novel type of carbohydrate structure present in hen ovomucoid. J. Biol. Chem. 257, 13,173-13,176.
20. Paz Parente, J., Strecker, G., Leroy, Y., Montreuil, J., Fournet, B., Van Halbeek, H., Dorland, L., and Vliegenthart, J. F. G. (1983) Primary structure of a novel N-glycosidic carbohydrate unit derived from hen ovomucoid, a 500-MHz ¹H-NMR
21. Fukuda, M., Bothner, B., Ramsamooj, P., Dell, A., Tiller, P. R., Varki, A., and Klock, J. C. (1985) Structures of sialylated fucosyl polylactosaminoglycans isolated from chronic myelogenous leukemia cells. J. Biol. Chem. 260, 12,957-12,967.
22. Finne, J., Breimer, M. E., Hansson, G. C., Karlsson, K. A., Leffler, H., Vliegenthart, J. F. G., and Van Halbeek, H. (1989) Novel polyfucosylated N-linked glycopeptides with blood group A, H, X, and Y determinants from human small-intestinal epithelial cells. J. Biol. Chem. 264, 5720-5735.
23. Bot, D. S. M., Cleij, P., Van 't Klooster, H. A., Van Halbeek, H., Veldink, G. A., and Vliegenthart, J. F. G. (1988) Identification and substructure analysis of oligosaccharide chains derived from glycoproteins by computer retrieval of high-resolution ¹H-NMR spectra. J. Chemometrics 2, 1-27.
24. Doubet, R. S., Bock, K., Smith, D. M., Darvill, A. G., and Albersheim, P. (1989) The complex carbohydrate structure database. Trends Biochem. Sci. 14, 475-477.
25. Derome, A. E. (1987) Modern NMR Techniques for Chemistry Research. Pergamon, Oxford.
26. Sanders, J. K. M. and Hunter, B. K. (1987) Modern NMR Spectroscopy: A Guide for Chemists. Oxford University Press, Oxford.
27. Bush, C. A. (1988) High-resolution NMR in the determination of structure in complex carbohydrates. Bull. Magn. Reson. 10, 73-95.
28. Dabrowski, J. (1989) Analytical methods: Two-dimensional proton magnetic resonance spectroscopy. Methods Enzymol. 179, 122-156.
29. Van Halbeek, H. (1990) NMR of complex carbohydrates, in Frontiers of NMR in Molecular Biology, UCLA Symposia Series vol 109 (Live, D., Armitage, I. M., and Patel, D., eds.), Liss, New York, pp. 195-213.
30. Van Halbeek, H. and Poppe, L. (1992) Structure elucidation of oligosaccharides by NMR spectroscopy. Adv. Carbohydr. Chem. Biochem. (in preparation).
31. Poppe, L., Stuike-Prill, R., Meyer, B., and Van Halbeek, H. (1992) The solution conformation of sialyl-α(2→6)-lactose studied by modern NMR techniques and Monte Carlo simulations. J. Biomol. NMR 2, 109-136.
32. Homans, S. W. (1990) Oligosaccharide conformations: Application of NMR and energy calculations. Progr. NMR Spectrosc. 22, 155-g
33. Meyer, B. (1990) Conformational aspects of oligosaccharides. Top. Curr. Chem.
The Application of Nuclear Magnetic
Resonance to Structural Studies
of Polysaccharides
Christopher Jones and Barbara Mulloy
1 Introduction
1.1 Polysaccharides:
Occurrence and Importance
Polysaccharides are ubiquitous components of living tissues. They are storage compounds in both animals and plants, and form important structural elements in, for example, plant cell walls, insect exoskeletons, and animal connective tissues. In bacteria, they are important both as structural elements in the cell wall (the teichoic and teichuronic acids) and as surface antigens, such as the O-antigenic oligo- or polysaccharide chain of the lipopolysaccharides (LPS) of gram-negative species, and the capsular polysaccharides (CPS) found on many pathogenic bacteria. These extracellular bacterial polysaccharides have a protective function, preventing desiccation of the organism, and are important determinants of virulence, since they shield the bacterium from the body's defenses.
Polysaccharides also occur in many mammalian and other systems as the glycosaminoglycan (GAG) side chains of proteoglycans, with both biochemical (such as those of cell-surface heparan sulfate [1]) and structural functions (for example, the chondroitin sulfates of connective tissue [22]). An increasing range of polysaccharides is now being exploited commercially. Bacterial CPS mixtures are in use as human vaccines (3),
and the glycosaminoglycan heparin has long been used clinically as an anticoagulant and antithrombotic agent (4).
1.2 Polysaccharide Structures
Two classes of polysaccharides will be considered in this chapter: first, those having a strict regular repeat unit, such as the capsular polysaccharides and LPS O-antigen, and, second, polysaccharides, such as the glycosaminoglycans, in which heterogeneity occurs as a result of varying substitution and/or epimerization of β-D-glucuronic acid to α-L-iduronic acid.
1.3 Comparison Between
Polysaccharides and Peptides
A knowledge of the ways in which polysaccharide structure differs from that of polypeptides rationalizes the different approaches used to obtain and interpret nuclear magnetic resonance (NMR) data for these two types of biopolymers. The repertoire of commonly occurring monomers is about the same size in each case, but, whereas the peptide linkage is rigidly defined, monosaccharides may be linked together in a wider variety of ways. Each sugar can be present either as the α- or β-anomer and may be linked to any of the free hydroxyl groups on the adjacent sugar residue (see Note 1).
Both linear and branched systems occur in polysaccharide systems, with a wide variety of nonsugar substituents: acetate esters, sulfate esters, pyruvate acetals, and so on. This would lead to an impossibly complex spectrum, but for the fact that even in relatively heterogeneous polysaccharides there is a strong repeating element of not more than seven sugars, rather than the nonrepeating linear sequence found in globular proteins. Consequently, a single resonance in the spectrum usually does not arise from a single residue in the primary sequence, but is the superposition of signals from similar residues at various positions along the chain. Polysaccharides are also almost invariably polydisperse; for structural studies, this does not introduce difficulties in NMR measurement or interpretation. Capsular polysaccharides have mol wt of typically hundreds of kilodaltons, but can give surprisingly sharp signals, unlike proteins of the same size (but see Note 2).
heterogeneity) and with broader signals, since steric crowding of the bulky substituents tends to make the polysaccharide chain stiffer.
1.4 Scope of NMR Studies of Polysaccharides
NMR is the single most powerful technique for solving the structures of intact polysaccharides. Information can be obtained on the composition, sequence, linkage, and substitution positions of polysaccharides, as well as the anomeric configuration. The absolute configuration of the sugar residues cannot normally be determined by NMR, for which GC or optical techniques must be used (5). The nondestructive nature of NMR spectroscopy allows it to precede other techniques, such as methylation analysis (6). Structural studies on carbohydrates by NMR involve some consideration of conformational properties as a matter of necessity, but the use of NMR techniques in the determination of "secondary" and "tertiary" structures of polysaccharides will not be dealt with here.
2 Sample Preparation
2.1 Removal of Protein and Nucleic Acid Impurities
Samples must be free from protein and nucleic acid impurities. The extent of protein and nucleic acid contamination can be readily estimated by measurement of UV absorption at 280 and 254 nm; a pure polysaccharide sample will have little or no absorption at these wavelengths. Enzymic digestion with ribonuclease, deoxyribonuclease, and proteases followed by dialysis or gel filtration is valuable, since glycosidase impurities in these enzymes are not significant.
2.2 Removal of Unwanted Counterions from Anionic Polysaccharides
overnight) immediately before use. In other sulfate-containing samples and samples prone to gelling, control of the counterion can also be important and is achieved by the same process.
2.3 Sample Quantity
For capsular polysaccharides or LPS O-antigens approx mg are required for a full proton study at high field. More material, typically 20 mg, is required for carbon analysis. These quantities depend on the sample to some extent (more is needed when a large repeat unit is present or for a very viscous sample), but methods possible on newer instruments, particularly proton-detected heteronuclear correlation spectra, are improving sensitivity here. Larger samples of glycosaminoglycans may be necessary: 10-20 mg for proton and 50-100 mg for carbon studies.
2.4 Solvents: Exchange with D2O
The NMR spectrum will almost invariably be collected in aqueous solution (see Note 3), and, since polysaccharides carry a large number of exchangeable protons, deuterium exchange is very strongly recommended. These polysaccharides are almost invariably thermally stable and not prone to denaturation. Comprehensive freeze-drying and exchange with D2O are both possible and desirable. Dissolve the sample in the minimum amount of D2O (CPSs and GAGs are very soluble), freeze, and lyophilize; repeat this process three times (see Note 4). Solvent suppression in proton spectra should then be unnecessary, thereby simplifying the experimental procedure and the final spectrum. Many important peaks are close to the solvent resonance and can be seen more clearly without solvent suppression. The information that can be obtained from the exchangeable protons (OH and NH) is relatively little, in contrast to the use made of the amide proton in peptide and protein studies.
2.5 Control of pH
3 Conditions for the Collection of Spectra
3.1 Temperature
Spectra can be obtained at high temperatures, since polysaccharides are usually quite heat stable. Increasing the probe temperature to 70-90°C considerably sharpens the resonances and increases sensitivity, particularly in 2D experiments (see Note 5).
3.2 Residual Water
Some interesting resonances may be obscured by the water peak, but this can be moved upfield by increasing the temperature. One-dimensional proton spectra should therefore be collected at more than one temperature. In general, resonances arising from the polysaccharide will show small temperature coefficients, and temperature changes do not additionally complicate the assignment.
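The effect of temperature on the residual water peak can be estimated with a linear approximation. The sketch below is illustrative only: the reference position of 4.76 ppm at 27°C, the coefficient of about -0.011 ppm per degree, the 0.05 ppm half-width of the obscured window, and the peak list are all assumptions that do not come from this chapter, and the exact values depend on the chemical shift reference and the sample.

```python
def hod_shift(temp_c, ref_shift=4.76, ref_temp_c=27.0, coeff=-0.011):
    """Approximate residual water (HOD) shift in ppm at temp_c (degrees C).

    ref_shift, ref_temp_c, and coeff are assumed, approximate values.
    """
    return ref_shift + coeff * (temp_c - ref_temp_c)


def obscured(peaks_ppm, temp_c, half_width=0.05):
    """Return the peaks lying within half_width ppm of the water resonance."""
    centre = hod_shift(temp_c)
    return [p for p in peaks_ppm if abs(p - centre) <= half_width]


# Hypothetical polysaccharide resonances near the water signal (ppm).
peaks = [4.77, 4.66, 4.47]
for temp in (27.0, 70.0):
    print(temp, round(hod_shift(temp), 3), obscured(peaks, temp))
```

Under these assumptions the peak at 4.77 ppm is hidden at 27°C but clear of the water signal at 70°C, which is the reason for recording spectra at more than one temperature.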
3.3 1D Proton and Carbon Spectra
Preliminary 1D proton spectra should be obtained for any sample, as a check on its suitability for NMR, before large amounts of spectrometer time (and spectroscopist time) are committed to more elaborate studies. Carbon spectra take much longer, but are very informative.
3.4 2D Spectra
Most of the usual repertoire of 2D experiments can be applied, but some fail because of poor sensitivity. In general, those experiments that obtain correlations through small coupling constants by using relatively long tuning delays cause problems, since the signal decays (by rapid T2 relaxation) before acquisition begins. Experiments that fall into this category include the J-resolved proton experiment, long-range correlation experiments (homo- and heteronuclear), and sometimes even the standard one-bond carbon-proton correlation experiment, which can fail with very viscous samples (7). On the other hand, rapid T1 and T2 relaxation does allow time to be saved on relaxation delays (see Chapter 5).
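The loss of signal during a coupling-dependent delay can be estimated from a simple exponential decay. The sketch below assumes a delay of order 1/(2J) and single-exponential T2 relaxation; the T2 of 20 ms and the coupling constants of 145 Hz (one-bond) and 5 Hz (long-range) are hypothetical round numbers chosen for illustration, not values given in this chapter.

```python
import math


def transfer_delay(j_hz):
    """Delay (s) of order 1/(2J) used to develop a coupling of j_hz."""
    return 1.0 / (2.0 * j_hz)


def surviving_fraction(delay_s, t2_s):
    """Fraction of transverse magnetization left after delay_s, for a given T2."""
    return math.exp(-delay_s / t2_s)


T2 = 0.020  # assumed T2 (s) for a viscous polysaccharide sample
one_bond = surviving_fraction(transfer_delay(145.0), T2)
long_range = surviving_fraction(transfer_delay(5.0), T2)
print(f"one-bond: {one_bond:.2f}, long-range: {long_range:.4f}")
```

With these numbers most of the signal survives the short one-bond delay, whereas less than 1% remains after the long-range delay, which is consistent with the failures described above.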
3.5 Planning an NMR Study
non needed for a structural study or highlight specific problems to be solved by other methods.
3.6 Field Strength and Sensitivity
Proton spectra should be run at the highest possible field (300 MHz at least), but worthwhile carbon spectra can be obtained even at a field as low as 20 MHz. Proton spectra are of course much more sensitive than carbon spectra, and sensitivity increases dramatically with increasing field strength. There have been very few studies on labeled (¹³C in this context) polysaccharides.
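Because field strengths are quoted here as resonance frequencies, it can help to convert between proton and carbon frequencies at the same field. The sketch below uses the ratio of the ¹³C and ¹H gyromagnetic ratios (about 0.2515). The sensitivity function assumes the commonly quoted idealized scaling of signal-to-noise with the 3/2 power of the field; that exponent is an assumption and real instruments may deviate from it.

```python
GAMMA_RATIO_13C_1H = 0.25145  # gyromagnetic ratio of 13C relative to 1H


def carbon_frequency(proton_mhz):
    """13C resonance frequency (MHz) at the field where 1H resonates at proton_mhz."""
    return proton_mhz * GAMMA_RATIO_13C_1H


def proton_frequency(carbon_mhz):
    """1H resonance frequency (MHz) at the field where 13C resonates at carbon_mhz."""
    return carbon_mhz / GAMMA_RATIO_13C_1H


def relative_sn(field_a_mhz, field_b_mhz, exponent=1.5):
    """Idealized signal-to-noise gain of field A over field B (assumed power law)."""
    return (field_a_mhz / field_b_mhz) ** exponent


print(round(carbon_frequency(500.0), 1))    # carbon frequency on a 500-MHz instrument
print(round(proton_frequency(20.0), 1))     # proton frequency of a 20-MHz carbon instrument
print(round(relative_sn(500.0, 300.0), 2))  # idealized gain from 300 to 500 MHz
```

The first conversion matches the pairing of a 500-MHz proton spectrum with a 125-MHz carbon spectrum used for the figures in this chapter.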
4 Interpretation of Spectra
and Structure Determination
4.1 Repeating Polysaccharides: Composition
High-mol-wt repeating polysaccharides usually show remarkably simple spectra with insignificant complications owing to end groups. Unfortunately, these spectra are very crowded, which creates a different set of problems. The structure (3) of the repeat unit of a typical bacterial polysaccharide, the CPS from Streptococcus pneumoniae Type 10A, is shown in Fig. Nearly all the nonexchangeable protons are present in either H-C(-C)(-C)-O or H-C(-C)(-C)-N systems, and give signals between 3.3 and 4.5 ppm. Other structural elements that frequently occur are uronic acids, O- and N-acetyl groups, and pyruvate acetals, all of which give rise to quite characteristic resonances.
The 500-MHz proton spectrum of pneumococcal type 10A CPS is shown in Fig and the 125-MHz carbon spectrum in Fig. The usual ranges of important peaks are shown. Counting the resonances of each type in these spectra usually answers most of the basic compositional questions, such as (Tables and 2):
How many carbon atoms are there in the repeat unit?
How many sugars are there in the repeat unit?
How many aminosugars are present?
How many 6-deoxysugars are present?
How many uronic acids are present?
-5)-β-D-Galf-(1-3)-β-D-Galp-(1-4)-β-D-GalpNAc-(1-3)-α-D-Galp-(1-2)-D-Rib-ol-(5-OPO2-
|
β-D-Galf
Fig The structure (3) of the repeat unit of the capsular polysaccharide from Streptococcus pneumoniae Type 10A. This polysaccharide contains sugars in both the pyranose and furanose ring form, in both the α- and β-anomeric configuration, and an alditol phosphate linkage.
[Spectrum. Labeled regions: HOD, anomeric hydrogens, ring hydrogens, OAc, C-Me. Horizontal axis: chemical shift (ppm), 5.5 to 1.0.]
Fig The 500-MHz proton spectrum of the pneumococcal 10A polysaccharide (obtained at 70°C) showing the typical positions of a number of common resonances. "Ring hydrogens" include all the hydrogens on the sugar ring and hydroxymethyl pendant groups. The resonance marked n is from ethanol, and the resonances marked * arise from contamination by the cell-wall polysaccharide.
Are any furanose ring-form sugars present?
Are any 13C-31P couplings apparent (typically Hz)?
(See Notes 6-8)
[Spectrum. Horizontal axis: chemical shift (ppm), 110 to 20.]
Fig The 125-MHz DEPT-135 carbon spectrum of the pneumococcal 10A polysaccharide, labeled with the positions of the resonances from a number of common groups. This spectrum was obtained at 70°C, and is phased so that resonances from methine and methyl carbons are negative, and resonances from methylene carbons are positive. Resonances from quaternary carbons will not appear. The resonance at 55 ppm arises from an impurity.
Table
Typical Proton Chemical Shifts and Coupling Patterns for Frequently Resolved Resonances in the NMR Spectra of Polysaccharides

Resonance                        Chemical shift range, ppm   Coupling pattern
H-1 of α sugars                  5.5-5.0                     d, Hz
H-1 of β sugars                  4.4-5.0                     d, Hz
H-1 of α-manno sugars            5.5-4.9                     d, <2 Hz
H-1 of β-manno sugars            5.0-4.6                     d, <2 Hz
ManNAc H-2                       4.3-4.1                     d,35Hz
α-GalA H-5                       6-5                         d, approx Hz
α-IdoA H-5                       4.6-5.0                     d, approx Hz
α-manno H-2                      3.9-4.4                     d, 4 Hz
β-Glc H-2ᵃ                       3.4-3                       t, 10 Hz
N-acetylaminosugar methyl        15-2.00
H-6 of 6-deoxysugars             1.35-1.15                   d, 6 Hz

ᵃUnsubstituted
Table
Typical Chemical Shifts in the Carbon NMR Spectra of Polysaccharides

                                         Chemical shift, ppm
Uronic acid C-6                          170-180
N-acetylaminosugar carbonyl carbon       170-180
Furanose sugar C-1 (α, β)                101-111
Pyranose sugar C-1 (α)                   91-101
                   (β)                   95-105
Unsubstitutedᵃ furanose ring carbons     70-85
Unsubstitutedᵃ pyranose ring carbons     65-75
Glycosylated hydroxymethyl groups        65-68
Unsubstitutedᵃ hydroxymethyl groups      61-64
C-N in amino sugars                      48-55
N-acetyl methyls                         23-25
C-6 in 6-deoxyhexoses                    15-19

ᵃSubstitution, e.g., with sulfate groups in the glycosaminoglycans, causes a downfield shift (of 2-7 ppm) of the resonance due to the carbon atom at the position of substitution.
dues, where the (small) 3J(H1,H2) value is difficult to measure and only poorly diagnostic of anomeric configuration.
A 1D phosphorus spectrum can usually be readily obtained if phosphorylation is suspected, and chemical shift correlation should distinguish between phosphomonoesters and phosphodiesters. More detailed structural analysis requires resorting to 2D methods.
4.2 Glycosaminoglycans
All glycosaminoglycans, except keratan sulfate, consist of a linear chain of alternating hexuronic acids and hexosamines, so that the basic repeating unit is a disaccharide. Commonly found substituents are N-acetyl and N- and O-sulfate groups. The structure of heparan sulfate is shown in Fig 4, as an example; Fransson (2) can be consulted for other aspects of GAG structure. Heterogeneity can arise in several ways. α-L-Iduronic acid in heparin, heparan sulfate, and dermatan sulfate arises from the postpolymerization epimerization of β-D-GlcA, which may not be complete. There may also be considerable variation in substitution with sulfate groups (see Fig 4). Some well defined atypical primary sequences may be found, for example, as the remnant of the linkage regions at the reducing end of the polysaccharide chain, originally attached to the protein core of the proteoglycan (9), and occasionally as the capping region (10) of the GAG side chains of proteoglycans. An unusual primary sequence is also found as the binding site on heparin for antithrombin, and contributions from this sequence can be seen in the 1H spectrum of heparin (11).
As a consequence, the emphasis in NMR studies of GAGs is rather different from that in structural determination of bacterial polysaccharides. Establishment of the main backbone structure is likely to be somewhat simpler, with the positions of substituents and identification of variant sequences a more important goal.
Figure shows a detail of the anomeric region of the high-field 1H spectrum of heparan sulfate. The NMR spectra of heparin (11,12), heparan sulfate (11,13), dermatan sulfate (14,15), keratan sulfate (7,16), and chondroitin sulfate (17) have been discussed in a number of papers.
4.3 Spectral Assignments
A: -4)-β-D-GlcA-(1-4)-α-D-GlcNAc-(1-
B: -4)-α-L-IdoA(2-OSO3-)-(1-4)-α-D-GlcNSO3-(6-OSO3-)-(1-
Fig Structure of heparan sulfate. The backbone is a block copolymer of two different repeating disaccharide sequences, A and B. In sequence A, the glucosamine residue is occasionally N-sulfated rather than N-acetylated; in sequence B, either of the O-sulfate groups may be missing. The sequence (-GlcNAc-IdoA) never occurs.
correlation methods of COSY, relayed COSY, and HOHAHA/TOCSY. This usually gives sufficient information, but problems may remain because of severe spectral overlap or when small coupling constants result in extremely weak correlation crosspeaks; the galacto H-4/H-5 correlation (3J(H4,H5) typically < Hz) is the most common problem. In cases where the connectivity is lost, NOE methods must be used, but these can create circular arguments if NOE arguments are also used to determine sugar residue sequence.
[Spectrum. Horizontal axis: chemical shift (ppm), 5.8 to 4.6.]
Fig Part of the 500-MHz NMR spectrum of heparan sulfate, showing the well resolved resonances owing to the anomeric protons (H-1) and to H-5 of α-L-iduronic acid. The signals are numbered as follows: 1, H-1 of N-sulfated glucosamine linked at C-1 to β-D-glucuronic acid; 2, H-1 of N-acetylated glucosamine linked through C-1 to glucuronic acid (as in Fig 4, sequence A) and N-sulfated glucosamine linked at C-1 to α-L-iduronic acid (as in Fig 4, sequence B); 3, H-1 of 2-O-sulfated iduronic acid (mostly as in Fig 4, sequence B); 4, H-1 of unsulfated iduronic acid linked to 6-O-sulfated glucosamine; 5, H-1 of unsulfated iduronic acid linked to glucosamine, not 6-O-sulfated; 6, H-5 of iduronic acid; 7, H-1 of glucuronic acid linked to N-acetylated glucosamine (as in Fig 4, sequence A).
phates) are a particular problem, since they usually lack well resolved, easily assigned resonances that can be used as starting points for the connectivity analysis. In such cases, proton assignments from 31P-1H or 13C-1H correlation experiments may be very useful. Pulse sequences that may simplify the spectrum and make specific assignments, such as triple-quantum filtered COSY (for hydroxymethyl H5/H6/H6' systems), are valuable; the assignment is only a means to an end.
J1,2 = 3 Hz    J2,3 = 10 Hz    J3,4 = 4 Hz
J4,5 = 1 Hz    J5,6 = J5,6' = 6 Hz    J6,6' = 12 Hz
Fig Coupling constants around the α-Gal residue
Proton-detected experiments are to be recommended, since they provide both high sensitivity and the best resolution in the crowded proton domain (19,20). The limitation on unambiguous assignment of the carbon spectrum is usually the overlap of the proton resonances, and carbon chemical shift arguments can resolve these ambiguities.
4.4 Assignments of Spin Systems to Specific Sugar Residues
In contrast to amino acids, the spin systems arising from various sugar residues show similar chemical shifts, and they must be differentiated on the basis of their interproton coupling constants. These coupling constants are often only available from 2D correlation spectra, which should therefore be obtained at the highest possible digital resolution, or from the 1D variants of the various 2D correlation techniques. Omitting well resolved and easily assigned high-field resonances reduces the spectral width and improves digitization without undue increases in data storage requirements, and the spectrum may be processed with strong resolution enhancement to resolve the fine structure of the crosspeaks, from which interproton coupling constants can be estimated.
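The trade-off between spectral width and digitization can be made concrete with a small calculation. The following Python sketch is illustrative only: the function name, the 2048-point data size, and the two spectral widths are assumptions chosen for the example, not values taken from this chapter. It simply divides the spectral width by the number of data points to give the digital resolution in Hz per point.

```python
def digital_resolution_hz_per_point(spectral_width_hz: float, n_points: int) -> float:
    """Return the digital resolution (Hz per point) along one dimension.

    Assumption: n_points is the number of real data points that span
    the full spectral width after processing.
    """
    if n_points <= 0:
        raise ValueError("n_points must be positive")
    return spectral_width_hz / n_points


# Hypothetical example: a 10-ppm window at 500 MHz (5000 Hz) versus a
# 3-ppm window (1500 Hz) restricted to the crowded ring-proton region.
full_window = digital_resolution_hz_per_point(5000.0, 2048)
narrow_window = digital_resolution_hz_per_point(1500.0, 2048)

print(f"full window:   {full_window:.2f} Hz/point")
print(f"narrow window: {narrow_window:.2f} Hz/point")
```

With the same number of points, the narrower window gives a proportionally finer digitization, which is what makes the small couplings within crosspeaks measurable.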
equatorial relationship show a 3-4 Hz coupling, dropping to ca Hz if both hydrogens are antiperiplanar to an oxygen across the bond. These relationships hold for pyranose sugars in the standard 4C1 ring forms, but are not applicable to furanose sugars (where there are fewer data available) or when a pyranose ring is not in the 4C1 conformation (see Note 9).
4.5 Substituents
The position of O-acetyl groups can be determined from the strong downfield shift on the α proton; effects in the carbon spectra are relatively small upfield shifts on the substituted carbons, although long-range C-H correlation has been used (21). De-O-acetylation and reanalysis of the sample can provide additional evidence. N-acetyl groups can be located by correlation to NH in H2O solution or from the downfield position of the N-CH proton resonance, but the proton shifts are not really diagnostic, and our preferred method is from the high-field 13C resonance of the C-N system at ca 48-55 ppm. The position of phosphorylation can be determined from coupling in the 13C spectrum if the resonance is resolved or by 1H-31P correlation, which has high sensitivity even when detected through the 31P nucleus. In some cases, such as the phosphodiester link through the anomeric position found in some capsular polysaccharides, the low-field chemical shift and 1H-31P coupling constant (typically Hz) are diagnostic. Dephosphorylation can be carried out by treatment with 48 or 60% hydrofluoric acid (22) (handle this with care; it is extremely dangerous), or phosphodiesters can be cleaved with strong base (23), to generate either oligosaccharides or a dephosphorylated polymer depending on the polysaccharide. Pyruvate acetals generate a characteristic proton resonance at ca 1.5 ppm, and the carbon chemical shift of the acetal carbon distinguishes between five- and six-membered ring systems. Various methods are available to determine the stereochemistry at the pyruvate C-2 that generally rely on comparison of chemical shifts with those of model systems (24-26). These substituents can be removed by dilute acid hydrolysis.
4.6 Linkage Position and Sequence
to locate the position of glycosylation, but can be very useful to assign terminal residues in side chains. We are not aware of any complete tabulations of proton chemical shifts for oligosaccharides (29), although they are available for carbon chemical shifts. Glycosylation causes a large (6-8 ppm) downfield shift of the α carbon and a small (1-2 ppm) upfield shift of the β carbons, and so is diagnostic of the linkage position (30,31), but the situation is more complex at branch-point residues. Interproton interresidue NOEs will occur when an anomeric proton on one residue is close in space to a ring proton on another residue, and so provide both linkage and sequence data (27). The strongest interresidue NOE is almost always observed between hydrogens attached to the carbon atoms involved in the glycosidic link. This method is quite reliable, but some linkages, especially (1-6)-links, give weak NOEs, and NOEs (usually weaker) to the proton on the carbon adjacent to the linkage can occur (32). Depending on the precise local conformation, other NOEs may be observed to confirm the analysis; well reported cases are between Hex H-1 and Gal H-6s in α-D-Hex-(1-4)-D-Gal and between Hex H-5 and Man H-2 in α-D-Hex-(1-3)-D-Man(NAc) systems, but such cases are obviously dependent on the anomeric configuration, relative configuration, and absolute configuration of the sugars concerned (33). These NOEs are usually obtained from NOESY spectra, but the 1D NOE difference experiment has its place, since the anomeric resonances are usually well resolved and the high digital resolution allows the multiplicity of enhanced peaks to be determined, which is of particular value in heavily overlapped systems. The literature on interresidue NOEs involving furanose sugars is much smaller, but the same patterns are to be expected.
Correlation experiments tuned for the small interproton coupling constant across the glycosidic link usually fail because spin-spin relaxation is too fast, but a selective 1D long-range carbon-proton correlation across the glycosidic link has been used successfully in some cases (28). These experiments require a well resolved anomeric resonance and an assigned carbon spectrum, but fail on viscous samples where proton T2s are short (i.e., where lines are broad).
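The glycosylation-shift argument lends itself to a simple tabulation. The sketch below is a hypothetical illustration in Python: the carbon shifts for the residue and for its unsubstituted reference are invented numbers, not measured data, and the 6-ppm threshold is an assumption based on the lower end of the 6-8 ppm range quoted above. It subtracts the reference shifts from the residue shifts and reports the carbon with the large downfield displacement.

```python
def glycosylation_shifts(residue_ppm: dict, reference_ppm: dict) -> dict:
    """Shift differences (residue minus unsubstituted reference), in ppm."""
    return {c: round(residue_ppm[c] - reference_ppm[c], 2) for c in reference_ppm}


def likely_linkage_carbon(deltas: dict, min_downfield_ppm: float = 6.0):
    """Return the carbon with the largest downfield shift, or None if
    no carbon moves downfield by at least min_downfield_ppm."""
    carbon, delta = max(deltas.items(), key=lambda item: item[1])
    return carbon if delta >= min_downfield_ppm else None


# Invented illustrative values (ppm); not taken from any real spectrum.
reference = {"C1": 100.1, "C2": 69.2, "C3": 70.5, "C4": 70.2, "C5": 71.6, "C6": 62.0}
residue = {"C1": 100.3, "C2": 68.4, "C3": 77.9, "C4": 69.1, "C5": 71.8, "C6": 61.9}

deltas = glycosylation_shifts(residue, reference)
linkage = likely_linkage_carbon(deltas)
print(deltas)
print("suggested linkage position:", linkage)
```

In this invented case C3 moves downfield by about 7 ppm while its neighbors C2 and C4 move upfield by about 1 ppm, the pattern described above for a substituted carbon. At branch-point residues more than one carbon would show a large shift, and a single-maximum search of this kind would not be sufficient.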
4.7 Conformation
methods developed for the conformational analysis of proteins and oligonucleotides are not applicable to polysaccharides. Shorter range interresidue NOEs may, in favorable cases, be used to determine the approximately helical "secondary" structures of polysaccharides (34).
5 Notes
1 There is, however, a wide range of rare sugars in the bacterial polysaccharides that are found occasionally and that can greatly complicate the analysis.
2 The fact that lines are sufficiently narrow to give interpretable polysaccharide spectra reflects internal mobility rather than overall tumbling; linewidth is therefore at least as much a function of chemical structure as of mol wt. Polysaccharides with a particularly flexible link (e.g., a [1-6] link, a phosphodiester, or an alditol) tend to give better spectra than those lacking this kind of feature. Conversely, highly charged polysaccharides with counterion shells and accompanying solvation, or crowding owing to many large substituents, give much poorer spectra. Both these problems are encountered in the glycosaminoglycans. Sometimes chemical modification can help (e.g., de-O-acetylation); this works by changing the internal motions rather than the overall mol wt. Sonication of very high-mol-wt samples can lead to partial, random depolymerization to approx 50 kDa, so reducing linewidths and improving the spectra (35).
3 Very few studies have been carried out in DMSO solution; the limiting factor is solubility. There is a growing body of solid-state studies (limited to 13C studies) on polysaccharides with a simple repeat unit.
4 We find a vacuum centrifuge useful; freeze-drying in a small desiccator evacuated by means of an oil pump, protected by a dry ice/methanol cold trap, seems more efficient than the big freeze-dryers.
5 We have encountered problems owing to spectrometer instability at elevated temperature, frequently because of such simple mechanical problems as unbalanced air flows or slow thermal equilibration within the probe. In such cases, obtaining the spectrum without sample spinning would be advantageous, and the effects on linewidth are not important in such high-mol-wt samples.
6 Even the regular repeating polysaccharides can suffer from incomplete or heterogeneous substitution, which complicates the spectra. This heterogeneity can sometimes be removed chemically, for example (27), by de-O-acetylation with 5% ammonia for h at 37°C. Nonintegral ratios of the areas under resonances are usually indicative of some form of incomplete substitution.
mol-wt complexes. Of these, some, such as chitin and cellulose, are insoluble; some, such as agarose and the carrageenans, form gels; and many of the fungal (1-3)-β-glucans form highly ordered triple-helical structures. These systems are amenable to 13C solid-state NMR studies, which provide a lot of structural and conformational information, but will not be considered further here (36).
8 Pulse methods: Rapid repetition, usually with a short pulse, improves signal-to-noise for protonated carbons, but unprotonated carbons (which for carbohydrates are mainly ketose C-2, carbonyls, and pyruvate acetal carbons) will give signals of much lower intensity owing to their slow relaxation rates. If your carbon spectra are poor because of small NOEs, running spectra in polarization transfer mode should be well worthwhile.
9 An example of this is the α-L-iduronate residue found in several glycosaminoglycans, where the pyranose ring is mobile, and its conformation can best be described as an equilibrium between two or three contributing forms. In such cases, interpretation is not simple, and the considerable literature on this example should be consulted (37-39).
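The check for nonintegral area ratios described in Note 6 can be sketched as follows. This Python example is an illustration under stated assumptions: the integral values are invented, the resonance names are placeholders, and the 0.1 tolerance for calling a ratio nonintegral is an arbitrary choice. Each area is divided by the number of protons it should represent and then normalized to a reference resonance.

```python
def per_proton_ratios(areas: dict, protons: dict, reference: str) -> dict:
    """Normalize each area to one proton, relative to a reference resonance."""
    ref = areas[reference] / protons[reference]
    return {name: (areas[name] / protons[name]) / ref for name in areas}


def nonintegral(ratios: dict, tolerance: float = 0.1) -> dict:
    """Return the resonances whose ratio is not within tolerance of an integer."""
    return {n: r for n, r in ratios.items() if abs(r - round(r)) > tolerance}


# Invented integrals for illustration only.
areas = {"H-1 residue A": 1.00, "H-1 residue B": 0.97, "O-acetyl CH3": 1.95}
protons = {"H-1 residue A": 1, "H-1 residue B": 1, "O-acetyl CH3": 3}

ratios = per_proton_ratios(areas, protons, reference="H-1 residue A")
flagged = nonintegral(ratios)
for name, value in flagged.items():
    print(f"{name}: {value:.2f} per repeat unit (possible incomplete substitution)")
```

Here the invented O-acetyl methyl resonance corresponds to about 0.65 groups per repeat unit, which would suggest partial O-acetylation; the anomeric resonances are within tolerance of one proton each.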
References
1. Gallagher, J. T., Lyon, M., and Steward, W. P. (1986) Structure and function of heparan sulphate proteoglycans. Biochem. J. 236, 313-325.
2. Fransson, L. A. (1985) Mammalian glycosaminoglycans, in The Polysaccharides, vol (Aspinall, G., ed.), Academic, New York.
3. Jennings, H. J. (1990) Capsular polysaccharides as vaccine candidates, in Current Topics in Microbiology and Immunology, vol 150 (Jann, K. and Jann, B., eds.), Springer-Verlag, Berlin, pp. 97-128.
4. Lane, D. A. and Lindahl, U. (1989) Heparin. Edward Arnold, London.
5. Gerwig, G. J., Kamerling, J. P., and Vliegenthart, J. F. G. (1979) Determination of the absolute configuration of monosaccharides in complex carbohydrates by capillary g.l.c. Carbohydr. Res. 77, 1-7.
6. Lindberg, B. and Lonngren, J. (1978) Methylation analysis of complex carbohydrates: general procedure and application for sequence analysis. Methods Enzymol. 50, 3-33.
7. Cockin, G. H., Huckerby, T. N., and Nieduszynski, I. A. (1986) High-field NMR studies of keratan sulphates; 1H and 13C assignments of keratan sulphate from shark cartilage. Biochem. J. 236, 921-924.
8. Bock, K. and Pedersen, C. (1974) A study of 13CH coupling constants in hexopyranoses. J. Chem. Soc. Perkin Trans. II, 293-297.
9. Van Halbeek, H., Dorland, L., Veldink, G. A., Vliegenthart, J. F. G., Garegg, P. J., Norberg, T., and Lindberg, B. (1982) A 500 MHz proton-magnetic-resonance study of several fragments of the carbohydrate-protein linkage region commonly occurring in proteoglycans. Eur. J. Biochem. 127, 1-6.
of oligosaccharides from the non-reducing termini of keratan sulphate chains. Carbohydr. Res. in press.
11. Mulloy, B. and Johnson, E. A. (1987) Assignment of the 1H-NMR spectra of heparin and heparan sulphate. Carbohydr. Res. 170, 151-164.
12. Gatti, G., Casu, B., Hamer, G. K., and Perlin, A. S. (1979) Studies on the conformation of heparin by 1H and 13C NMR spectroscopy. Macromolecules 12, 1001-1007.
13. Huckerby, T. N. and Nieduszynski, I. A. (1982) Proton chemical shifts in the NMR spectra of heparan and heparin. Carbohydr. Res. 103, 141-145.
14. Sanderson, P. N., Huckerby, T. N., and Nieduszynski, I. A. (1989) Chondroitinase ABC digestion of dermatan sulphate; NMR spectroscopic characterization of the oligo- and poly-saccharides. Biochem. J. 257, 347-354.
15. Bossennec, V., Petitou, M., and Perly, B. (1990) 1H-NMR investigation of naturally occurring and chemically oversulphated dermatan sulphates. Biochem. J. 267, 625-630.
16. Hounsell, E., Feeney, J., Scudder, P., Tang, P. W., and Feizi, T. (1986) 1H-NMR studies at 500 MHz of a neutral disaccharide and sulphated di-, tetra-, hexa- and larger oligosaccharides obtained by endo-β-galactosidase treatment of keratan sulphate. Eur. J. Biochem. 157, 375-384.
17. Welti, D., Rees, D. A., and Welsh, E. J. (1979) Solution conformation of glycosaminoglycans: assignment of the 300 MHz 1H-magnetic resonance spectra of chondroitin 4-sulphate, chondroitin 6-sulphate and hyaluronate, and investigation of an alkali-induced conformational change. Eur. J. Biochem. 94, 505-514.
18. Altman, E., Brisson, J.-R., and Perry, M. B. (1988) Structure of the O-antigen polysaccharide of Haemophilus influenzae serotype (ATCC 27090) lipopolysaccharide. Carbohydr. Res. 179, 245-258.
19. Byrd, R. A., Egan, W., Summers, M. F., and Bax, A. (1987) New NMR spectroscopic approaches for structural studies of polysaccharides: application to the Haemophilus influenzae type a lipopolysaccharide. Carbohydr. Res. 166, 47-58.
20. Tsui, F. P., Egan, W., Summers, M. F., Byrd, R. A., Schneerson, R., and Robbins, J. B. Determination of the structure of the E. coli K100 capsular polysaccharide, cross-reactive with the capsule from Type B Haemophilus influenzae. Carbohydr. Res. 173, 65-74.
21. Bax, A., Summers, M. F., Egan, W., Guirgis, N., Schneerson, R., Robbins, J. B., Orskov, I., and Vann, V. F. (1988) Structural studies of the E. coli K93 and K53 capsular polysaccharides. Carbohydr. Res. 173, 53-64.
22. Moreau, M., Richards, J. C., Perry, M. B., and Kniskern, P. J. (1988) Structural analysis of the specific capsular polysaccharide of Streptococcus pneumoniae Type 45 (American Type 72). Biochemistry 27, 6820-6829.
23. Watson, M. J., Tyler, J. M., Buchanan, J. G., and Baddiley, J. G. (1972) The type-specific substance from Pneumococcus Type 13. Biochem. J. 130, 45-54.
24. Garegg, P. J., Jansson, P.-E., Lindberg, B., Lindh, F., Lonngren, J., Kvarnstrom, I., and Nimmich, W. (1980) Configuration of the acetal carbon of pyruvic acid
25. Gorin, P. A. J., Mazurek, M., Duarte, H. S., Iacomini, M., and Duarte, J. H. (1982) Properties of 13C-NMR spectra of O-(1-carboxyethylidene) derivatives of methyl β-galactopyranoside: models for the determination of pyruvic acetal structure in polysaccharides. Carbohydr. Res. 100, 1-15.
26. Jones, C. (1990) A novel method for the determination of the stereochemistry of pyruvate acetal substituents applied to the capsular polysaccharide from Streptococcus pneumoniae Type. Carbohydr. Res. 198, 353-357.
27. Moreau, M., Richards, J. C., Perry, M. B., and Kniskern, P. J. (1988) Application of high-resolution NMR spectroscopy to the elucidation of the structure of the specific capsular polysaccharide of Streptococcus pneumoniae type 7F. Carbohydr. Res. 182, 79-99.
28. Richards, J. C. and Leitch, R. A. (1989) Elucidation of the structure of the Pasteurella haemolytica serotype T10 lipopolysaccharide O-antigen by NMR spectroscopy. Carbohydr. Res. 186, 275-286.
29. Bock, K. and Thogersen, H. (1982) Nuclear magnetic resonance spectroscopy in the study of mono- and oligosaccharides. Ann. Rep. NMR Spectroscopy 13, 1-57.
30. Bock, K. and Pedersen, C. (1983) Carbon-13 nuclear magnetic resonance spectroscopy of monosaccharides. Adv. Carbohydr. Chem. Biochem. 43, 27-66.
31. Bock, K., Pedersen, C., and Pedersen, H. (1984) Carbon-13 nuclear magnetic resonance data for oligosaccharides. Adv. Carbohydr. Chem. Biochem. 42, 193-225.
32. Lipkind, G. M., Shashkov, A. S., Mamyan, S. S., and Kochetkov, N. K. (1988) The nuclear Overhauser effect and structural factors determining the conformations of disaccharide glycosides. Carbohydr. Res. 181, 1-12.
33. Jones, C. and Currie, F. (1989) Pneumococcal polysaccharide S4; a structural revision. Carbohydr. Res. 184, 279-84.
34. Forster, M., Jones, C., and Mulloy, B. (1989) NOEMOL: integrated molecular graphics and the simulation of nuclear Overhauser effects in NMR spectroscopy. J. Mol. Graphics 7, 196-217.
35. Szu, S. C., Zen, G., Schneerson, R., and Robbins, J. B. (1986) Ultrasonic irradiation of bacterial polysaccharides. Characterization of the depolymerised products and some applications of the process. Carbohydr. Res. 152, 7-20.
36. Saitô, H. and Ando, I. (1990) High-resolution solid-state NMR studies of synthetic and biological macromolecules. Ann. Rep. NMR Spectroscopy 21, 209-290.
37. Sinay, P. (1986) Active fragments of natural oligosaccharides. Pure and Appl. Chem. 61, 481-483.
38. Sanderson, P. N., Huckerby, T. H., and Nieduszynski, I. A. (1987) Conformational equilibria of α-L-iduronic residues in disaccharides derived from heparin. Biochem. J. 243, 175-181.
39. Paulsen, H., Pollex, A., Sinnwell, V., and van Boeckel, C. A. A. (1988) Konformationanalyse von heparin-analogen di- und trisacchariden mit α-L-
Dynamic and Exchange Processes in Macromolecules Studied by NMR Spectroscopy
Lu-Yun Lian
1 Introduction
It is normal for a biological system to be in a dynamic state. The function of many biological systems depends on their flexibility, and here NMR can provide the experimental basis for investigating the mechanical function of such systems. The types of motions that are thought to occur in proteins, their frequency ranges, and the methods for their detection are summarized in Table. It is important to distinguish ubiquitous thermal vibrations or group rotations from the more extensive motions that can be propagated through larger segments of a structure. Motions and dynamics are reflected by five different NMR parameters: chemical shift, spin-spin coupling constant, the area enclosed by a resonance, relaxation time, and nuclear Overhauser effect (1). The NMR data can be used to provide either qualitative evidence of flexibility or quantitative measurements of exchange rates.
This chapter describes several basic experimental analytical NMR techniques frequently used for the qualitative and quantitative analysis of dynamic and exchange processes, focusing on protein systems; the same approach can be applied to most biological macromolecules. The analysis of data for dynamic processes, such as the determination of rate constants and binding constants, can be rather complicated; wherever possible, attempts are made here to simplify the mathematics involved.

Table
Motions in Proteins: Approximate Frequency Ranges and Methods of Detection
Types of motion: bond and angle vibration; side chain and protein rotation; aromatic side chain rotation; conformational changes, protein unfolding
Frequency, Hz: 10^12 to 10^14; 10^9 to 10^13; 10^8 to 10^11; 10^0 to 10^5; 10-s to 16
Methods of detection: IR, Raman; X-ray, molecular dynamics simulation; NMR relaxation, fluorescence depolarization, ESR; NMR chemical shift averaging; isotope exchange, deuterium exchange
2 Sample Preparation and Experimental Conditions
This section deals with the considerations that should be borne in mind in order to acquire good data for further analysis, with an emphasis on experiments to determine exchange and dynamics. These considerations are: quality sample preparation, the effects of factors influencing exchange rates, and some experimental NMR parameters to be used.
2.1 Sample Preparation
The general guidelines for sample preparation in NMR studies should be adhered to (2). If the sample is to be prepared in D2O, the easiest method is repetitive lyophilization with a D2O buffer, although care must be taken to ensure that the stability of the protein (as monitored by turbidity on redissolving or by loss of biological activity) is not affected by lyophilization. The presence of salts and of some buffers can influence stability. Where proteins cannot be lyophilized, alternative methods, such as dialysis against a D2O buffer, membrane ultrafiltration using microconcentrators, centrifugal filtration, or, less often, size-exclusion gel filtration may be possible. If the sample is to be kept in D2O solution for some time (~3 d), approx 50 mM of sodium azide should be added to prevent microbial growth.
of nucleotides, there is a tendency to stack at high concentrations. Low solution concentrations mean long data acquisition time and, hence, greater usage of expensive spectrometer time. In addition, the stability of proteins (enzymes) or the possibility of the degradation of ligands (sometimes by the host protein itself) limits the duration over which an experiment can be performed. If exchange rates are to be determined when using enzyme-bound substrate complexes, the substrate concentrations should be sufficiently smaller than those of the enzyme, such that over 90% of the substrate is in the bound form. Therefore, the limiting factor in deciding the feasibility of an NMR experiment for the determination of exchange rate is the highest manageable protein concentration, the availability of substrates not normally being a problem.* For experiments carried out on most high-field spectrometers (>400 MHz), at which frequencies the signal-to-noise ratio has improved significantly over the last few years, protein concentrations of approx 0.5-1 mM in D2O and approx 2-3 mM in H2O (with the substrate concentrations therefore slightly less) are typically used.
Paramagnetic cations can produce dramatic changes in the NMR linewidth of nuclei in their vicinity, even if their concentrations are very low (approx 10^-4-10^-5 times those of the enzyme or substrate). Hence, the paramagnetic ions should be removed using chelating agents, such as EDTA (5-10 w).
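Whether a given pair of concentrations satisfies the 90% criterion depends on the dissociation constant of the complex, which is not discussed above. The following Python sketch assumes simple 1:1 binding with a known dissociation constant Kd; the function names and all concentrations are hypothetical. It solves the quadratic mass-balance equation for the complex concentration and returns the fraction of substrate that is bound.

```python
import math


def complex_concentration(enzyme_total: float, substrate_total: float, kd: float) -> float:
    """Concentration of the 1:1 complex ES from the quadratic mass balance.

    All three arguments must be in the same concentration units.
    """
    b = enzyme_total + substrate_total + kd
    return (b - math.sqrt(b * b - 4.0 * enzyme_total * substrate_total)) / 2.0


def fraction_substrate_bound(enzyme_total: float, substrate_total: float, kd: float) -> float:
    """Fraction of the total substrate present as the complex."""
    return complex_concentration(enzyme_total, substrate_total, kd) / substrate_total


# Hypothetical example in mM: 1.0 mM enzyme, 0.8 mM substrate.
tight = fraction_substrate_bound(1.0, 0.8, kd=0.01)  # Kd = 10 micromolar
weak = fraction_substrate_bound(1.0, 0.8, kd=1.0)    # Kd = 1 mM

print(f"Kd = 0.01 mM: {tight:.1%} of substrate bound")
print(f"Kd = 1 mM:    {weak:.1%} of substrate bound")
```

Under these assumed values, the tightly bound substrate meets the 90% criterion with only a modest excess of enzyme, whereas the weakly bound one does not, and a larger enzyme excess or a higher absolute concentration would be needed.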
2.2 Factors Influencing Exchange and Dynamic Processes
Many exchange rates are either acid- or base-catalyzed or both, and most increase with temperature Care must also be taken to remove the possibility of any extraneous exchanges, since these will invalidate certain assumptions made in the interpretation of the results
2.3 NMR Parameters
Implementation of many of the experiments and the interpretation of the results may be quite complex, and some specialist help is advisable at both stages. When attempting line-shape analysis, it is important to obtain the best possible data for meaningful information: by avoiding saturation (and hence partial relaxation) of a spin system, since this will result in a distorted line shape; by using the maximum number of points per hertz for good representation of the line shape; and by acquiring data with the best possible signal-to-noise ratio.
For 1D magnetization transfer experiments (Section 3.2.), it is important to ensure high-quality selective pulses both for selective inversion and for saturation. The pulse power should be carefully calibrated and adjusted to avoid spill-over effects; a shaped pulse can sometimes be helpful, since many spectrometers suffer from "unclean" selective square pulses.
All the general conditions required for the acquisition of good 2D data apply to the 2D experiments described here (see Chapter 2). Postacquisition data processing has now reached (but by no means "plateaued" at) such a state of sophistication that many alternatives to discrete Fourier transformation are now available. These alternative data-processing methods (3,4) are geared toward enhancing the resolution and signal-to-noise ratio of a data set. Such methods can be particularly beneficial, since many data sets are wasted as a result of a poor processing approach.
3 NMR Techniques for Qualitative and Quantitative Analysis of Dynamic and Exchange Processes
This section discusses the several NMR techniques that are used for determining exchange rates or for observing dynamic processes One very important dynamic process that has been studied extensively using a combination of all the techniques discussed in this section is that of protein unfolding; hence, a separate section will be devoted to discussing this process
3.1 Chemical Exchange
Table
NMR Time Scale

                                      Exchange rate
Time scale                            Slow                  Intermediate          Fast
Chemical shift, δᵃ                    k << δA - δB          k ≈ δA - δB           k >> δA - δB
Coupling constant, Jᵇ                 k << JA - JB          k ≈ JA - JB           k >> JA - JB
T2 relaxation (Δν1/2ᶜ = 1/πT2)        k << 1/T2A - 1/T2Bᵈ   k ≈ 1/T2A - 1/T2B     k >> 1/T2A - 1/T2B

ᵃ(δA - δB) is typically of the order of hundreds of Hz.
ᵇJ is typically of the order of 1-10 Hz.
ᶜΔν1/2 = linewidth at half height.
ᵈ(1/T2A - 1/T2B) is typically of the order of 1-20 Hz for proteins.
three parameters define quite different, although partially overlapping, time scales. The chemical shift is expressed as $\omega$ in rad s$^{-1}$; that is, if the shift is $\delta$ ppm and the spectrometer frequency is $\omega_0$ MHz, then $\omega = 2\pi \times \delta \times \omega_0$ rad s$^{-1}$. Thus, a process that is in fast exchange at a low frequency can sometimes be found to be in intermediate or even slow exchange on higher frequency spectrometers.
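The field dependence of the exchange regime can be illustrated numerically. The Python sketch below is an illustration under stated assumptions: the shift difference, the rate constant, and in particular the factor of 10 used to decide what counts as "much smaller" or "much larger" are arbitrary choices for the example, not criteria given in this chapter.

```python
import math


def shift_difference_rad_per_s(delta_ppm: float, spectrometer_mhz: float) -> float:
    """Convert a shift difference in ppm to rad/s at a given spectrometer frequency."""
    return 2.0 * math.pi * delta_ppm * spectrometer_mhz


def exchange_regime(k_per_s: float, delta_ppm: float, spectrometer_mhz: float,
                    factor: float = 10.0) -> str:
    """Classify exchange on the chemical shift time scale.

    Assumption: 'slow' means k is at least `factor` times smaller than the
    shift difference in rad/s, and 'fast' at least `factor` times larger.
    """
    dw = shift_difference_rad_per_s(delta_ppm, spectrometer_mhz)
    if k_per_s < dw / factor:
        return "slow"
    if k_per_s > dw * factor:
        return "fast"
    return "intermediate"


# Hypothetical process: k = 2000 per second, two sites 0.1 ppm apart.
low_field = exchange_regime(2000.0, 0.1, 100.0)
high_field = exchange_regime(2000.0, 0.1, 600.0)
print("100 MHz:", low_field)
print("600 MHz:", high_field)
```

For these assumed values the same process is classified as fast exchange at 100 MHz but intermediate exchange at 600 MHz, because the shift difference in rad s$^{-1}$ scales with the spectrometer frequency while the rate constant does not.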
Consider two very common biochemical situations, one where A and B are interconverting forms of a molecule, and the other where ligand A binds to a macromolecule B to give a complex AB:
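$$A \underset{k_b}{\overset{k_a}{\rightleftharpoons}} B \qquad (1)$$

$$A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} AB \qquad (2)$$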
Fig. Change in chemical shifts and linewidths in the presence of chemical exchange between two equally populated environments with resonance frequencies $\omega_A$ and $\omega_B$. (A) Slow exchange, $k \ll |\omega_A - \omega_B|$; (B), (C), and (D) progressively more rapid intermediate rates of exchange; (E) fast exchange, $k \gg |\omega_A - \omega_B|$.
What can be deduced from the line shape in each of the different exchange situations? Consider a situation where the signals are single nonoverlapping Lorentzian lines without a multiplet structure, as in Fig. If the slow exchange condition exists, the exchange rates are deduced simply in terms of the broadening of each resonance; if in fast exchange, the spectrum generally contains no measurable information about exchange rates other than the implicit fact that these rates must be much larger than the chemical shift differences in the absence of exchange. For intermediate exchange, a detailed comparison of the observed line shapes with those predicted by analytical expressions will be necessary to determine the exchange rate (5). In the present chapter, however, a simplified approach is taken where quantitative binding constants are to be determined. Because the two processes (1) and (2) just described are somewhat different in terms of their analysis by NMR, they will be treated separately.
3.1.1 Slow Exchange
3.1.1.1 PROCESS (1)
The condition for slow exchange is as shown in the table of NMR time scales. The equations describing the observed linewidths of A and B are governed by the following transverse relaxation times, bearing in mind that the linewidth at half height $\Delta\nu_{1/2} = 1/\pi T_2$:

$$1/T_{2A,obs} = 1/T_{2A} + k_a \quad\text{and}\quad 1/T_{2B,obs} = 1/T_{2B} + k_b \qquad (3)$$

The magnetization of A (or B) will decay as if in the absence of exchange ($1/T_{2A}$ or $1/T_{2B}$) with an additional relaxation process caused by the exchange of rate $k_a$ (or $k_b$).
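In the slow exchange limit, the relation between linewidth and transverse relaxation time means that an exchange rate follows directly from the excess broadening of a resonance. The sketch below assumes that the linewidth in the absence of exchange is known independently (for instance, from a spectrum recorded under nonexchanging conditions); the function name and the example values are illustrative only.

```python
import math


def exchange_rate_from_broadening(width_obs_hz: float, width_no_exchange_hz: float) -> float:
    """Return the exchange rate (s^-1) from slow-exchange line broadening.

    Uses 1/T2 = pi * linewidth, so that
    k = 1/T2,obs - 1/T2 = pi * (observed width - width without exchange).
    """
    excess = width_obs_hz - width_no_exchange_hz
    if excess < 0:
        raise ValueError("observed linewidth is narrower than the exchange-free linewidth")
    return math.pi * excess


# Illustrative values: a 2 Hz line broadened to 12 Hz by exchange
print(f"k = {exchange_rate_from_broadening(12.0, 2.0):.1f} s^-1")
```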
3.1.1.2 PROCESS (2)
When monitoring only the ligand resonances A and AB, the transverse relaxation times are given by:

$$1/T_{2A,obs} = 1/T_{2A} + k_1[B] \quad\text{and}\quad 1/T_{2AB,obs} = 1/T_{2AB} + k_{-1} \qquad (4)$$

[B] being the concentration of the macromolecule. The range of k

3.1.2 Fast Exchange
3.1.2.1 PROCESS (1)
A and B interconvert sufficiently fast to make resonances A and B indistinguishable, and a new resonance is observed, which exhibits the weighted average of the observable NMR parameter, P, in each of the two states:

$$P_{obs} = p_A P_A + p_B P_B \qquad (5)$$

where $p_A$ and $p_B$ are the mole fractions of the A and B species present in solution, with $p_A + p_B = 1$.
3.1.2.2 PROCESS (2)
When monitoring resonances of species A and AB (which can be free and bound forms of either ligand or of protein), the observed NMR parameter P ($\delta_A$, $\delta_{AB}$, $1/T_{2A}$, $1/T_{2AB}$, and so on) will be a weighted average of the A and AB parameters:

$$P_{obs} = p_A P_A + p_{AB} P_{AB} \qquad (6)$$

where $p_A = [A]/A_{tot}$, $p_{AB} = [AB]/A_{tot}$, $A_{tot} = [A] + [AB]$, $B_{tot} = [B] + [AB]$, and $p_A + p_{AB} = 1$.
It is possible to obtain the ligand-binding constant by analyzing the behavior of one of the measurable NMR parameters, e.g., chemical shift, as a function of the ligand concentration at constant macromolecular concentration. One way of doing this is to express the changes in chemical shift in a form from which $K_d$, the dissociation constant, can be readily obtained. Let:

$$\Delta_0 = P_{AB} - P_A \quad\text{and}\quad \Delta = P_{obs} - P_A \qquad (7)$$

where $P_A$ is the observable NMR parameter of the free species (ligand or protein) and $P_{AB}$ is the chemical shift of the bound form (given by the shift as $p_{AB} \to 1$, i.e., at zero free ligand or protein concentration). Since $p_A + p_{AB} = 1$, it is possible to write:

$$\Delta = \Delta_0\, p_{AB} \qquad (8)$$

At equilibrium, the dissociation constant $K_d$ can be written as:

$$K_d = [A][B]/[AB] = k_{-1}/k_1 \qquad (9)$$

$$p_{AB} = [AB]/A_{tot} = [A][B]/(A_{tot} K_d) = [B]/([B] + K_d) \qquad (10)$$

or

$$p_{AB} = [A]\,B_{tot}/\{A_{tot}([A] + K_d)\} \qquad (11)$$
At this point, it is important to make a distinction between two common situations:
(1) The species A, whose NMR parameter is observed, is held at constant concentration; for example, at constant protein concentration, $A_{tot}$, and variable ligand concentration, $B_{tot}$. The change in chemical shift $\Delta$ can be expressed as:

$$\Delta = \Delta_0\, p_{AB} = \Delta_0 [B]/([B] + K_d) \qquad (12)$$

Eq. (12) resembles a Michaelis-Menten equation, and the standard graphical methods of analyzing this type of data can be used. A plot of the dependence of the change in chemical shift, $\Delta$, on the ligand concentration, [B], is a hyperbola (Fig. 2A). The dissociation constant $K_d$ can be deduced directly from this plot; it is the concentration of the ligand that gives half-maximal binding, that is, the concentration at $\Delta_0/2$.

(2) The species A, whose parameter is observed, is varied in concentration; for example, A is the ligand that is added to a constant protein concentration $B_{tot}$. Eq. (11) is now applicable; this equation is less simple to analyze, but if [A] >> [AB], Eq. (11) simplifies to give:

$$\Delta = \Delta_0 B_{tot}/(A_{tot} + K_d) \qquad (13)$$

A plot of the change in chemical shift as a function of ligand concentration gives a rectangular hyperbola (Fig. 2B); the binding constant can now be deduced using a best-fit curve analysis.
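The hyperbolic dependence of the shift change on ligand concentration in situation (1) can be analyzed numerically. The following is a minimal sketch, assuming the free ligand concentration is known (or is well approximated by the total concentration) and using a double-reciprocal linearization, 1/Δ = 1/Δ0 + (Kd/Δ0)(1/[B]), fitted by ordinary least squares. The function names and the synthetic data are hypothetical; a nonlinear best-fit analysis would generally be preferable for real, noisy data because the reciprocal transformation distorts the error distribution.

```python
def shift_change(b_free, delta_0, kd):
    """Shift change for free ligand concentration b_free: Delta_0 * [B] / ([B] + Kd)."""
    return delta_0 * b_free / (b_free + kd)


def fit_double_reciprocal(concentrations, shift_changes):
    """Fit a straight line to (1/[B], 1/Delta) and return (delta_0, kd)."""
    x = [1.0 / c for c in concentrations]
    y = [1.0 / d for d in shift_changes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                    # equals Kd / Delta_0
    intercept = mean_y - slope * mean_x  # equals 1 / Delta_0
    delta_0 = 1.0 / intercept
    return delta_0, slope * delta_0


# Synthetic, noise-free titration (illustrative values only)
ligand_mM = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
observed_hz = [shift_change(b, delta_0=100.0, kd=0.5) for b in ligand_mM]
delta_0_fit, kd_fit = fit_double_reciprocal(ligand_mM, observed_hz)
print(f"Delta_0 = {delta_0_fit:.2f} Hz, Kd = {kd_fit:.3f} mM")
```

As a consistency check on the half-maximal rule, `shift_change(kd, delta_0, kd)` returns exactly half of `delta_0`.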
3.1.3 Moderately Fast Exchange
3.1.3.1 PROCESS (1)
Figure (B, C, and D) shows the situation where the exchange rate is in this regime; line broadening of the order of up to six times the linewidth when in the fast exchange regime is observed. No detailed analysis is possible.
3.1.3.2 PROCESS (2)
Fig. 2. (A) A theoretical plot of the change in chemical shift of a protein resonance as a function of ligand concentration (mM). These data are obtained for the experiment in which variable concentrations of a ligand are added to a constant concentration of the protein and a resonance associated with the protein is monitored. (B) A theoretical plot of the change in chemical shift of the ligand resonance as a function of ligand concentration (mM), obtained at constant protein concentration.
$$1/T_{2,obs} = p_A/T_{2A} + p_{AB}/T_{2AB} + 4\pi^2 p_A p_{AB} \Delta_0^2/(k_1[B] + k_{-1}) \qquad (14)$$

Although the first two terms represent the weighted average of the relaxation rates in the two states, a third term is included to account for the exchange contribution. The range of $k_1$ that can be detected depends on $\Delta_0$, this usually being in the range $10^2$-$10^5$ s$^{-1}$. To obtain an accurate binding constant from the observed linewidth, a full line-shape analysis is required (5).
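The size of the exchange contribution to the linewidth can be estimated numerically. The sketch below assumes that the exchange term takes the form $4\pi^2 p_A p_{AB} \Delta_0^2/(k_1[B] + k_{-1})$, with $\Delta_0$ in Hz, added to the population-weighted relaxation rates; the function names and all numerical values are illustrative assumptions, not data from the text.

```python
import math


def observed_r2(p_ab, r2_a, r2_ab, delta_0_hz, k1_b, k_minus1):
    """Observed transverse relaxation rate (s^-1) in moderately fast exchange.

    p_ab       : fraction of A in the bound form
    r2_a, r2_ab: 1/T2 of the free and bound forms (s^-1)
    delta_0_hz : shift difference between bound and free forms (Hz)
    k1_b       : pseudo-first-order association rate, k1*[B] (s^-1)
    k_minus1   : dissociation rate constant (s^-1)
    """
    p_a = 1.0 - p_ab
    weighted_average = p_a * r2_a + p_ab * r2_ab
    exchange_term = 4.0 * math.pi ** 2 * p_a * p_ab * delta_0_hz ** 2 / (k1_b + k_minus1)
    return weighted_average + exchange_term


def linewidth_hz(r2):
    """Linewidth at half height from the relaxation rate, 1/(pi*T2)."""
    return r2 / math.pi


# Illustrative values only
r2 = observed_r2(p_ab=0.2, r2_a=5.0, r2_ab=50.0, delta_0_hz=100.0, k1_b=2000.0, k_minus1=8000.0)
print(f"1/T2,obs = {r2:.1f} s^-1, linewidth = {linewidth_hz(r2):.1f} Hz")
```

The exchange term vanishes when only one species is populated and shrinks as the exchange becomes faster, which is the expected approach to the simple weighted average of the fast exchange limit.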
3.2 Magnetization Transfer
NMR spectroscopy provides the ability to measure rate constants by monitoring a system at equilibrium. Examples include reaction pathways, which can be deduced by following the transfer of nuclei between two positions (on the substrate and on the product molecule), and the estimation of the exchange rates of labile hydrogens in peptides and proteins with the hydrogens in the bulk water, which provides valuable information concerning the conformational and dynamic properties of these molecules. When rate constants are in the slow exchange regime, $10^{-2}$-$10^{2}$ s$^{-1}$, the magnetization transfer technique can be used for their determination.
Consider a slow exchange process A + B ⇌ AB. Perturbation of one resonance by selective irradiation, for example, the resonance of A, will cause changes in the intensity of the other observable resonance, in this case that of AB, owing to transfer of magnetization from one to the other as a result of exchange. The three magnetization transfer experiments commonly used are saturation transfer, inversion transfer, and 2D exchange. Only the two-site exchange case is discussed here, this being rather more straightforward than a multisite case, in which extra care must be taken in the analysis to account for as many of the processes involved as possible.
3.2.1 Saturation Transfer
Using the aforementioned slow-exchange process as an example, if resonance A is saturated, the fractional change in intensity of the AB resonance at steady state is given by the equation

$$I'_{AB}/I_{AB} = R_{1AB}/(R_{1AB} + k_{-1}) \qquad (15)$$

where $I_{AB}$ and $I'_{AB}$ are the intensities of the AB resonance before and after irradiation, and $R_{1AB}$ is the longitudinal relaxation rate of AB, which can be determined independently. The advantage of the saturation transfer method in the steady state over the inversion transfer approach (see following section) is that the time-course of the intensity of only one signal (AB) needs to be analyzed. The saturation transfer experiment can also be used qualitatively; for example, systematic irradiation throughout the relevant region of the spectrum permits location of the resonance(s) of bound ligand by observing selective decreases in intensity of the corresponding resonance(s) of the free ligand.
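The steady-state saturation transfer relation, in which the ratio of the AB intensities after and before saturation equals $R_{1AB}/(R_{1AB} + k_{-1})$, can be rearranged to give the rate constant once the relaxation rate of AB has been measured independently. A minimal sketch, with a hypothetical function name and illustrative values:

```python
def rate_from_saturation_transfer(intensity_before, intensity_after, r1_ab):
    """Return k_-1 (s^-1) from steady-state saturation transfer.

    Rearranges I'/I = R1 / (R1 + k) to k = R1 * (1 - I'/I) / (I'/I).
    """
    ratio = intensity_after / intensity_before
    if not 0.0 < ratio <= 1.0:
        raise ValueError("intensity ratio must lie in (0, 1]")
    return r1_ab * (1.0 - ratio) / ratio


# Illustrative values: the AB signal falls to 40% on saturating A, with R1 = 2 s^-1
print(f"k_-1 = {rate_from_saturation_transfer(100.0, 40.0, 2.0):.2f} s^-1")
```

The rearrangement also shows why the method loses sensitivity when the rate constant is much smaller than the relaxation rate: the intensity ratio then stays close to 1, and small measurement errors dominate the result.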
3.2.2 Inversion Transfer
This experiment is performed in the same manner as the saturation transfer experiment previously described, with the exception that a selective 180° pulse is used to completely invert a selected resonance. The pulse sequence is 180° (selective) - t - 90° (nonselective) - acquisition, t being a variable delay. The pulse sequence is repeated for different t values. Although the inversion transfer approach affords more experimental information concerning the rate constants involved, covering a larger range of rates than the saturation transfer approach, its major drawback is the multiexponential time dependence of the signal intensities (6). This disadvantage excludes simple data analyses based on semilogarithmic plots and initial slopes. A computerized nonlinear least-squares analysis using a complete theoretical model has to be used for correct estimation of the rate constants.
3.2.3 2D Exchange
From the experimental point of view, both the 1D saturation and inversion transfer methods described earlier have major selectivity and experimental-time disadvantages, particularly for macromolecules. In addition, in the case of the saturation transfer approach, the rate constant needs to be greater than the relaxation rate, $R_1$. Clearly, the 2D magnetization transfer (2D exchange) experiment, which uses the same pulse sequence as the 2D NOESY experiment (see Chapter 2), is more efficient; it allows the entire matrix of all the exchange processes in a system to be obtained from a single experiment.
To illustrate the analysis of data obtained from a 2D exchange experiment, the example used here is the determination of the rate of hydrogen exchange of the labile hydrogens of a peptide with water (7). The 2D exchange spectra using various mixing times are acquired by means of an observation pulse that does not excite the water signal, such as a Redfield pulse or a 1331 pulse. Analysis of the variation of intensities of the cross- and diagonal-peaks with mixing times can be simplified by making some assumptions:
1. This is a simple, two-site exchange, A(NH) ⇌ B(H2O);
2. The mole fraction, X, of NH is much smaller than the mole fraction of H2O, that is, $X_A \ll X_{H_2O}$ and $X_{H_2O} \approx 1$; and
3. The normalized rate constant, k, is given by $k = k_A X_A = k_B X_B$.

The original equations (8), which describe the dependence of the kinetics on mixing time, $t_m$, can be reduced to:

$$a_{AA} = X_A \exp[-(R_{1A} + k)t_m]$$

$$a_{BA} = [X_A k/(R_{1A} + k - R_{1B})]\{\exp(-R_{1B} t_m) - \exp[-(R_{1A} + k)t_m]\} \qquad (16)$$

where $a_{AA}$ and $a_{BA}$ are the mixing coefficients, $a_{AA}$ being proportional to the intensity (or volume) of the diagonal peak and $a_{BA}$ being proportional to the intensity (or volume) of the cross-peak. $R_{1A}$ is the spin-lattice relaxation rate of A. An example of a plot of the scaled intensities (to account for the variations in linewidths and peak heights in the 1D spectrum) against the mixing time $t_m$ is shown in Fig. Considering only the diagonal peak, the time dependence of the intensity gives a value of $(R_{1A} + k)$ for each resonance. This value can then be used in Eq. (16) to determine k, since $R_{1B}$, the spin-lattice relaxation rate of B, can be obtained using the inversion-recovery method.
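The mixing-time dependence of the diagonal and cross-peak intensities can be sketched numerically. The code below assumes the reduced two-site expressions, a single-exponential diagonal decay with rate $(R_{1A} + k)$ and a cross-peak that builds up and then decays with $R_{1B}$; the function names and parameter values are hypothetical, and the log-linear fit shown is only appropriate for clean, single-exponential diagonal data.

```python
import math


def diagonal_intensity(t_m, x_a, r1a, k):
    """Diagonal mixing coefficient: X_A * exp(-(R1A + k) * t_m)."""
    return x_a * math.exp(-(r1a + k) * t_m)


def cross_intensity(t_m, x_a, r1a, r1b, k):
    """Cross-peak mixing coefficient (requires R1A + k != R1B)."""
    prefactor = x_a * k / (r1a + k - r1b)
    return prefactor * (math.exp(-r1b * t_m) - math.exp(-(r1a + k) * t_m))


def decay_rate_from_diagonal(mixing_times, intensities):
    """Estimate (R1A + k) as minus the slope of ln(intensity) against mixing time."""
    y = [math.log(i) for i in intensities]
    n = len(y)
    mean_t = sum(mixing_times) / n
    mean_y = sum(y) / n
    sty = sum((t - mean_t) * (yi - mean_y) for t, yi in zip(mixing_times, y))
    stt = sum((t - mean_t) ** 2 for t in mixing_times)
    return -sty / stt


# Synthetic data (illustrative values): R1A = 1 s^-1, R1B = 0.3 s^-1, k = 4 s^-1
times = [0.05, 0.1, 0.2, 0.4, 0.6]
diag = [diagonal_intensity(t, 0.01, 1.0, 4.0) for t in times]
total_rate = decay_rate_from_diagonal(times, diag)
print(f"R1A + k = {total_rate:.3f} s^-1")
```

Given the fitted sum $(R_{1A} + k)$ and an independently measured $R_{1B}$, the cross-peak expression then contains k as the only unknown.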
3.3 Isotope Exchange Methods
Fig. Representative plot of the variation of intensity (or volume) of diagonal (*) and cross (H) peaks as a function of mixing time, $t_m$ (s), in a 2D magnetization transfer experiment.
needed to acquire a good NMR spectrum of the sample. The simplest way in which exchange of isotopes may be followed using NMR is to monitor the appearance or disappearance of a signal by direct detection of a magnetically active nucleus. For example, one can detect proton replacement by a deuterium atom from the disappearance of the relevant proton resonance (see Section 3.5.). It is also possible to follow the exchanges of isotopes of nuclei that are inaccessible to NMR detection by observing their influence on nuclei that are easily detected; for example, the replacement of $^{18}$O by $^{16}$O can be probed by means of $^{31}$P resonance multiplicity characteristics. Other indirect methods of isotope detection include the spin-echo technique, which allows the observation of the exchange of spin-1/2 nuclei of low sensitivity (such as $^{13}$C, $^{15}$N) via attached protons (10).
3.4 Relaxation Time Measurements
parameters, and an appreciation of how these parameters in turn affect the final NMR data acquired is important for the design and execution of experiments and for the interpretation of data. However, this approach is a complex and difficult one, since the problem is not merely one of determining rates of motion within the framework of a particular dynamic model, but one of formulating an actual description of the motion.
Nevertheless, qualitative analyses of relaxation effects can be carried out, and some of these are described here. For example, in the case of smaller proteins (mol wt ~20 kDa), having a typical rotational correlation time for overall motion of $10^{-9}$ to $10^{-8}$ s, observation of linewidths narrower than those expected from the mol wt can be explained by side chain motion, especially the rotation of methyl groups. Rotation about a single bond in the side chain takes place in the range $10^{-9}$ to $10^{-10}$ s. Any observed differences in the linewidths of, for example, the different methyl groups may reflect either a difference in their rates of rotation, the presence of additional motions, or a difference in the number of neighboring atoms contributing to relaxation. To distinguish between the different possibilities, a detailed analysis of the relaxation parameters, especially the nuclear Overhauser effects, is necessary.
In large molecules (mol wt >30 kDa), where the rotational correlation times are typically $10^{-8}$ to $10^{-6}$ s and NMR linewidths are >50 Hz, the appearance of linewidths of 5-20 Hz in the spectrum can be taken to indicate the presence of more extensive motions in addition to individual side chain rotations. This may be the case when, for example, a protein contains a random-coil segment, or when a multidomain protein contains structured subdomains with internal motions (11).
Experimentally, it is straightforward to measure the spin-lattice relaxation time ($T_1$) and the spin-spin relaxation time ($T_2$). The spin-lattice relaxation time is commonly measured by the inversion recovery method, using the pulse sequence 180° - t - 90° - acquire, where t is a variable delay. The amplitude of the signal after Fourier transformation is given by:

$$A(t) = A_\infty - 2A_0 e^{-t/T_1} \qquad (17)$$

where $A_\infty$ is the thermal equilibrium value and $A_0$ the value immediately after inversion ($A_0 \approx A_\infty$).
experiment. The pulse sequence is 90° - t - 180° - t - acquire, where t is a variable delay. Further details of these experiments are given in most standard NMR textbooks.
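For the inversion recovery experiment, the spin-lattice relaxation time can be estimated from amplitudes measured at several delays. The sketch below assumes a recovery of the form $A(t) = A_\infty - 2A_0 e^{-t/T_1}$, with $A_\infty$ and $A_0$ known, and linearizes it as $\ln[(A_\infty - A(t))/2A_0] = -t/T_1$, fitting a line through the origin. The function names and data are hypothetical; a nonlinear three-parameter fit is usually more robust when $A_\infty$ and $A_0$ are not well determined.

```python
import math


def inversion_recovery(t, a_inf, a_0, t1):
    """Signal amplitude after a delay t in an inversion recovery experiment."""
    return a_inf - 2.0 * a_0 * math.exp(-t / t1)


def estimate_t1(delays, amplitudes, a_inf, a_0):
    """Estimate T1 from the slope of ln[(A_inf - A)/(2*A_0)] against t (line through origin)."""
    y = [math.log((a_inf - a) / (2.0 * a_0)) for a in amplitudes]
    slope = sum(t * yi for t, yi in zip(delays, y)) / sum(t * t for t in delays)
    return -1.0 / slope


# Synthetic, noise-free data (illustrative values): T1 = 0.8 s, perfect inversion
delays_s = [0.1, 0.2, 0.4, 0.8, 1.6]
signal = [inversion_recovery(t, 1.0, 1.0, 0.8) for t in delays_s]
t1_fit = estimate_t1(delays_s, signal, a_inf=1.0, a_0=1.0)
print(f"T1 = {t1_fit:.3f} s")
```

With perfect inversion the signal passes through zero at $t = T_1 \ln 2$, which offers a quick single-point check on the fitted value.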
3.5 Protein Folding and Unfolding
One very important dynamic process that has been successfully studied using NMR is that of protein unfolding. The dynamics of structural changes can be investigated over a wide range of time scales using magnetization transfer, hydrogen exchange, line-shape analysis, and relaxation methods. Many of the methods used depend on the ability to monitor changes in the structural environment of individual protons and a knowledge of the specific assignment of the proton resonances. The three aspects of protein folding/unfolding most closely examined using NMR are the dynamics of protein folding/unfolding, protein stability via hydrogen exchange kinetics, and the structural characterization of folding/unfolding intermediates in addition to the unfolded form.
3.5.1 Dynamics of Protein Folding
The interconversion between the folded and unfolded forms of a protein is usually slow on the NMR time scale, giving rise to separate lines for the native and denatured states at equilibrium; the spectrum at the intermediate denaturant concentration or temperature is a superposition of the native and unfolded spectra. Generally, two types of spectrum can be obtained when studying folding/unfolding transitions at equilibrium: spectra where all the resonances can be attributed to either the folded or the unfolded state, or those that contain additional resonances from a discrete partially folded state, the latter case being illustrated in Fig for the refolding of β-lactamase. The absence of intermediate state resonances can simply imply that the intermediates are either short-lived, in low population, or in rapid exchange with either the folded or the unfolded conformation.
Fig. $^1$H spectrum showing the refolding of β-lactamase at 294 K. (A) Native state, (B) “stable” intermediate refolding state, with the intermediate state resonances marked, and (C) unfolded state.
two-state model, the intensity decays exponentially from the equilibrium value $M_0$ to a limiting value $M_\infty$, where $M_\infty/M_0 = R_f/(R_f + k_u)$ (see Eq. [15]), $R_f$ being the relaxation rate in the folded form, and $k_u$ the rate of unfolding. Conversely, if the resonance in the folded state is irradiated and that in the unfolded state observed, the rate of folding, $k_f$, can be measured. The saturation transfer experiment can, in addition, reveal the existence of multiple folded and unfolded conformations (12). The presence of multiple forms must be taken into consideration when quantitative analysis of the saturation transfer experiment is undertaken.
3.5.2 Hydrogen Exchange and Protein Stability
of hydrogen bonds to global transition approaching general unfolding. Where major structural unfolding is responsible for the exchange of internal labile hydrogen atoms with the solvent, as is generally the case when destabilizing conditions exist, a structural unfolding model is most frequently used to derive quantitative rate values:
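$$N(H) \underset{k_f}{\overset{k_u}{\rightleftharpoons}} U(H) \xrightarrow[\text{in } D_2O]{k_c} U(D) \underset{k_u}{\overset{k_f}{\rightleftharpoons}} N(D) \qquad (18)$$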
N and U are the native and unfolded states of a protein, respectively, and $k_u$ the unfolding rate. Peptide hydrogen exchange is either acid- or base-catalyzed, and $k_c$ can be written as:

$$k_c = k_H[H^+] + k_{OH}[OH^-] \qquad (19)$$

Base catalysis is much more efficient than acid catalysis, with values of $k_{OH} = 10^8$ s$^{-1}$ M$^{-1}$ and $k_H$ of the order of 1 s$^{-1}$ M$^{-1}$. It is important to understand the kinetic exchange mechanisms governing these exchange rates in order to interpret the hydrogen exchange data (13).
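The combined acid- and base-catalyzed rate implies a pH at which exchange from the unfolded state is slowest, which is why quenching by lowering the pH is effective. The sketch below evaluates $k_c = k_H[H^+] + k_{OH}[OH^-]$ as a function of pH. The default $k_H$ of 1 s$^{-1}$ M$^{-1}$ and the ion product exponent of 14 are assumed, illustrative values (the ion product differs in D2O and with temperature), and the function names are hypothetical.

```python
import math


def chemical_exchange_rate(ph, k_h=1.0, k_oh=1.0e8, pkw=14.0):
    """Return k_c (s^-1) at a given pH from acid and base catalysis."""
    h_conc = 10.0 ** (-ph)           # [H+] in M
    oh_conc = 10.0 ** (-(pkw - ph))  # [OH-] in M, from the assumed ion product
    return k_h * h_conc + k_oh * oh_conc


def ph_of_minimum_exchange(k_h=1.0, k_oh=1.0e8, pkw=14.0):
    """pH at which k_c is smallest: the two catalytic terms are equal there."""
    return 0.5 * (pkw - math.log10(k_oh / k_h))


ph_min = ph_of_minimum_exchange()
for ph in (2.0, ph_min, 5.0, 7.0):
    print(f"pH {ph:.1f}: k_c = {chemical_exchange_rate(ph):.3g} s^-1")
```

Under these assumptions the minimum falls near pH 3, several orders of magnitude slower than exchange at neutral pH.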
The correlation between the rate of hydrogen exchange and protein stability has been demonstrated in many proteins. In the $^1$H spectrum of medium-sized proteins, such as ribonuclease A and bovine phospholipase A2, subsets of NH protons are observed when the proteins are dissolved in D2O. These protons are attributed either to deeply buried protons distributed throughout the protein structure and involved in hydrogen bonding, or to a whole strand of buried secondary structure, such as an α-helix; these slowly exchanging protons become accessible to the solvent only after unfolding in drastic conditions.
3.5.3 Structural Characterization and Folding Pathways
It is possible to characterize, in some detail, the structure of folding intermediates by using a combination of advanced NMR and rapid solution mixing techniques. In this combined approach, proton or deuteron labels are trapped within the backbone and side chain protons in the refolded protein, and their location and quantity determined using homonuclear and heteronuclear 2D NMR (NOESY, HOHAHA, HMQC) experiments. The relative proton occupancy (that is, the number of proton
(20)
where I, is the measured resonance intensity or volume, Z, the signal intensity or volume of the fully protonated group, and f, the residual fraction of H2O present in the reaction mixture. The two main approaches for label trapping experiments are described in the following sections.
3.5.3.1 COMPETITION METHOD
This technique is suitable for examining early refolding events. It aims to balance the rate of amide exchange against the rate of refolding, so that exchangeable protons are trapped in parts of the protein that refold early (14). The following steps are involved:
1. Unfold the protein in H2O (using a denaturant or at an extreme pH).
2. Refold the protein, and at the same time induce H-D exchange, by rapid dilution of the denaturant with D2O buffer.
3. After a reaction time, t, quench the exchange process by rapid lowering of pH; refolding will continue to completion without further H-D exchange.
4. Recover protein (by freeze-drying or multiple concentration-wash procedure; see Section 2).
5. Prepare an NMR sample under conditions that minimize further H-D exchange, e.g., low temperature and moderately acidic pH.
3.5.3.2 PULSE LABELING METHOD
This method is designed for investigating the later stages of folding (15,16). The following steps are involved:
1. Refold the initially deuterated protein in D2O for a variable period.
2. Pulse label for approx 50 ms with an excess of H2O buffer; NH sites that are still exposed are selectively protonated. Change to basic pH briefly to ensure that all exposed sites are fully labeled and protected groups deuterated.
3. Quench the exchange by lowering pH, and allow the protein to refold to native form.
4. Prepare an NMR sample as in Section 3.5.3.1 (steps 4 and 5).
The backbone and side chain labile protons are ideal conformational probes, because they are distributed throughout the protein structure, amide proton exchange rates are determined predominantly by intramolecular hydrogen bonding, thereby reflecting important aspects of the protein structures, and hydrogen-deuterium exchange is usually
As far as the unfolded protein is concerned, its structural characterization requires the assignment of the $^1$H NMR spectrum. Two-dimensional, exchange-mediated magnetization transfer experiments (see Section 3.2.3.) can be used for the assignment of the spectrum of a reversibly unfolded protein provided that, first, specific assignment of the resonances in the folded protein is known, and second, the rate of exchange or structural interconversion between the folded and unfolded forms on the NMR time scale is slow (of the order of s).
The NMR spectrum of an unfolded protein contains amino acid resonances that deviate from those in the unstructured peptide. These deviations have been attributed to short-range interactions between hydrophobic side chains, rather than to residual secondary or tertiary structures that are of any significance.
References
1. Jardetzky, O. and Roberts, G. C. K. (1981) Protein dynamics, in NMR in Molecular Biology, Academic, New York, pp. 448-492.
2. Oppenheimer, N. J. (1989) Sample preparation, in Methods in Enzymology, vol. 167, Part A, Academic, New York, pp. 78-89.
3. Stephenson, D. S. (1988) Linear prediction and maximum entropy methods in NMR spectroscopy, in Progress in NMR Spectroscopy, vol. 13, Pergamon, Oxford, pp. 515-626.
4. Hoch, J. C. (1989) Modern spectrum analysis in nuclear magnetic resonance: alternatives to the Fourier transform, in Methods in Enzymology, vol. 167, Part A, Academic, New York, pp. 216-241.
5. Rao, B. D. N. (1989) Nuclear magnetic resonance line-shape analysis and determination of exchange rates, in Methods in Enzymology, vol. 167, Part A, Academic, New York, pp. 279-311.
6. Dahlquist, F. W., Longmuir, K. J., and Du Vernet, R. B. (1975) Direct observation of chemical exchange by a selective pulse NMR technique. J. Magn. Reson. 17, 406.
7. Dobson, C. M., Lian, L.-Y., Redfield, C., and Topping, K. D. (1986) Measurement of hydrogen exchange rates using 2D NMR spectroscopy. J. Magn. Reson. 69, 201-209.
8. Jeener, J., Meier, B. H., Bachmann, P., and Ernst, R. R. (1979) Investigation of exchange processes by two-dimensional NMR spectroscopy. J. Chem. Phys. 71, 4546-4553.
9. Brindle, K. M. and Campbell, I. D. (1987) NMR studies of kinetics in cells and tissues. Q. Rev. Biophys. 19(3/4), 159-182.
11. Oswald, R. E., Bogusky, M. J., Bamberger, M., Smith, R. A. G., and Dobson, C. M. (1989) Dynamics of the multidomain fibrinolytic protein urokinase from two-dimensional NMR. Nature (London) 337, 579-582.
12. Evans, P. A., Dobson, C. M., Kautz, R. A., Hatfull, G., and Fox, R. (1987) Proline isomerism in staphylococcal nuclease characterized by NMR and site-directed mutagenesis. Nature (London) 329, 266-268.
13. Creighton, T. E. (1984) Proteins in solution, in Proteins, W. H. Freeman, New York, pp. 265-328.
14. Schmid, F. X. and Baldwin, R. L. (1979) Detection of an early intermediate in the folding of ribonuclease A by protection of amide protons against exchange. J. Mol. Biol. 135, 199-215.
15. Udgaonkar, J. B. and Baldwin, R. L. (1988) NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature (London) 335, 694-699.
16. Roder, H., Elove, G. A., and Englander, W. (1988) Structural characterization of folding intermediates in cytochrome c by H-exchange labelling and proton NMR
Introduction to Mass Spectrometry
Robin Wait
1 Introduction
Mass spectrometry (MS) is a sensitive and powerful analytical technique, in which ionized sample molecules are separated according to their mass to charge ratios (m/z) by the application of electric and/or magnetic fields. If the ionization regime deposits sufficient excess energy, a proportion of the sample molecules will dissociate, the pattern of product ions formed being dependent on the structure of the intact compound (Fig. 1). A mass spectrum thus consists of the masses (strictly mass to charge ratios, m/z) of these ions plotted against abundance. Interpretation of the spectrum thus affords information about both the mol wt and the structure of the sample. By the standards of most other physical methods, mass spectrometry is fairly sensitive, requiring somewhere between low picomoles and nanomoles of material, depending on the ionization method employed, but against this must be set its destructive nature. The present introduction aims to provide a brief overview of the technique, to define some of the key terms, and to offer a short tour of some of the different instruments that are more or less legitimately called mass spectrometers. Readers wishing a more detailed account should consult refs. 1-9. A recent volume of Methods in Enzymology (5) devoted entirely to mass spectrometry is particularly recommended, since both instrumentation and applications are comprehensively covered.
All mass spectrometers consist of a means of ion generation, a mass analyzer for their separation, and an ion detector. In the following sections, these elements will be briefly considered in relation to the requirements for the analysis of biological molecules.

Fig. 1. Schematic: sample molecule, ionization, fragmentation, mass spectrum.

Fig. 2. The 10% valley definition of resolution. Two peaks of equal intensity are said to be resolved when the height of the valley between them is 10% of the maximum peak intensity.
2 Mass Analyzers
mass analyzers are described at greater length in refs and. A comprehensive account of the optics of charged particles is available (for the mathematically sophisticated) in ref. 10.
2.1 Magnetic Sector Mass Spectrometers
In a magnetic sector mass spectrometer, mass measurement is performed by deflecting the ions with a magnetic field; the extent of the deflection depends on the mass of the ion, more massive ions experiencing smaller deflection at a given field strength. Ions are accelerated out of the source region by the application of an accelerating voltage, V, acquiring thereby kinetic energy equivalent to $mv^2/2$, i.e.:

$$zV = mv^2/2 \qquad (1)$$

where z is the number of charges on the ion under consideration (in units of the charge on an electron), v is its velocity, and m is its mass. When the ion enters a magnetic field of strength B, it experiences a deflecting force of magnitude Bzv at right angles to its direction of travel, which forces it to describe a circular orbit of radius, r, such that:

$$Bzv = mv^2/r \qquad (2)$$

Combining Eqs. (1) and (2) and rearranging gives the fundamental mass spectrometer equation:

$$m/z = B^2r^2/2V \qquad (3)$$
Inspection of Eq. (2) shows that a magnetic sector is strictly a momentum analyzer that separates ions according to the product of their mass and velocity rather than mass alone. It is important therefore that ions enter the magnetic field with the same energy, since otherwise ions of the same mass, but differing in their velocity, will be brought into focus at a different point, which will degrade the resolution. The energy spread of the ion beam may be reduced by means of an electrostatic analyzer (ESA; the electric sector). The ESA consists of two curved plates, across which a fixed electric field, E, is applied. Ions entering the field are constrained into circular orbits of radius R, such that:

$$zE = mv^2/R \qquad (4)$$

By rearranging, it can be seen that:

$$R = mv^2/zE \qquad (5)$$

Fig. 3. Schematic representation of the ion optics of a double-focusing magnetic sector instrument of forward geometry (source slit, electric sector, magnet, directional and velocity focusing curves, point of double focus). If the ion detector is placed at the point of intersection of the direction and velocity focusing curves, the image will be independent of the velocity and angular spread of the ion beam.

By combination of Eq. (4) with Eq. (1), it follows that:

$$R = 2V/E \qquad (6)$$
Thus, ions of the same charge and kinetic energy will follow the same orbit, irrespective of their mass, and will be brought to a focus at the same point, whereas ions of varying energy will follow a slightly divergent path. Thus, the energy spread of the ion beam may be reduced by placing a slit at the exit of the ESA so that only ions of the selected range of energies pass into the magnetic analyzer. The ESA also acts as a directional focusing device, counteracting tendencies toward angular divergence of the ion beam. The combination of an electric and a magnetic sector is described as double focusing, because the two fields are so arranged that the direction and velocity dispersion produced in one is counteracted by that from the other (Fig. 3). The detector is placed at the point where the direction and velocity focusing curves intersect, so that the final image is independent of the velocity and directional spread of the ion beam. By varying (scanning) the magnetic field strength, ions of different mass are sequentially brought into focus at this point.
Instruments in which the ESA precedes the magnet are said to be of forward geometry; in instruments of reverse geometry, the ESA is placed after the magnet. Double-focusing instruments of either geometry are capable of extremely high resolution (up to 100,000), but this is achieved by reducing the widths of the various resolving slits, which results in considerable sacrifice of sensitivity. In practice, it is seldom necessary to operate above about 5000 resolution when analyzing biological molecules, which still enables mass measurement accuracy of better than 0.5 dalton across the entire mass range.
The current upper mass limit attainable by these instruments at full accelerating voltage (8 or 10 kV) is around 15 kDa. Inspection of Eq. (3) shows that the mass range can be extended by operating at higher magnetic field strength, by increasing the radius of the ion trajectory, or by reducing the accelerating voltage. The latter course is the simplest, but is achieved at the cost of a reduction in sensitivity, which usually becomes unacceptable once the voltage drops below keV. In practice, it is difficult to achieve magnetic field strengths much higher than about 2.3 T with electromagnets. It is not possible to take advantage of the much higher fields attainable with superconducting magnets because of the need for scanning operation. Increasing the radius causes a rapid increase in the overall ion optical path length, resulting in impracticably large and expensive instruments. This effect can be reduced to some extent by decreasing the focal length by introducing the ion beam into the magnetic field at nonnormal incidence angles and by the use of inhomogeneous fields.
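The trade-offs among field strength, radius, and accelerating voltage follow from the relation $m/z = B^2r^2/2V$ and can be explored numerically. The sketch below evaluates it in SI units and converts the result to daltons per elementary charge. The 0.65 m radius is a hypothetical value chosen only for illustration (the text does not give one), and the function name is likewise an assumption.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def mass_to_charge(b_tesla, radius_m, accel_volts):
    """Return m/z in daltons per elementary charge from m/z = B^2 r^2 / (2V)."""
    kg_per_coulomb = b_tesla ** 2 * radius_m ** 2 / (2.0 * accel_volts)
    return kg_per_coulomb * ELEMENTARY_CHARGE / DALTON


# Upper mass limit at maximum field for two accelerating voltages
for volts in (8000.0, 4000.0):
    limit = mass_to_charge(b_tesla=2.3, radius_m=0.65, accel_volts=volts)
    print(f"V = {volts:.0f} V: m/z up to about {limit:.0f}")
```

Halving the accelerating voltage doubles the accessible m/z, whereas the same gain from the magnet alone would require the field to increase by a factor of about 1.4.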
2.2 Quadrupole Mass Filters and Ion Traps
The quadrupole mass analyzer consists of four rods of circular or hyperbolic cross-section, arranged as in Fig. 4A. Voltages having a DC component U and a radiofrequency component of the form $V_0 \cos \omega t$ are applied across opposite rods as shown in Fig. 4B. The radiofrequency period is short compared to the transit time of ions through the device. At any given field, ions with a narrow range of m/z values are constrained into stable trajectories and pass down the main axis of the quadrupole; all other ions describe unstable paths until they collide with the rods or are lost between them. At fixed frequency, a mass
Fig. 4A,B. The principle of the quadrupole analyzer.
range. The transmission efficiency of quadrupoles is potentially high, since there is no necessity for resolving slits. Another advantage is that quadrupole analyzers are most effective at separating ions of low velocity, so their ion sources operate close to ground potential, the injected ions generally having energies ~100 eV; this greatly simplifies the task of interfacing to LC systems and to atmospheric pressure ionization techniques, such as electrospray. A triple quadrupole arrangement provides a relatively cheap route to tandem mass spectrometry. The first stage is used for precursor ion selection, and the intermediate (radiofrequency only) quadrupole functions as a collision cell. The product ions are recorded by scanning the third analyzer. The main limitation of these instruments is that they are restricted to the analysis of low-energy collisions (see Section 5.1 and Chapter 12).
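The mass selectivity of the quadrupole is usually discussed in terms of two dimensionless Mathieu parameters, a (set by the DC voltage U) and q (set by the radiofrequency amplitude V₀). These parameters are not defined in the text above; the sketch below uses the textbook definitions for rods at ±(U + V₀ cos ωt), and the function name, rod spacing, and voltages are hypothetical illustration values.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg

def mathieu_a_q(mass_da, u_dc, v_rf, r0_m, freq_hz, z=1):
    """Textbook Mathieu parameters (assumed convention, not from the text):
    a = 8zeU / (m r0^2 w^2),  q = 4zeV0 / (m r0^2 w^2)."""
    omega = 2.0 * math.pi * freq_hz
    denom = mass_da * DALTON * r0_m**2 * omega**2
    return 8.0 * z * E_CHARGE * u_dc / denom, 4.0 * z * E_CHARGE * v_rf / denom

# Hypothetical settings: 5 mm field radius, 1 MHz drive, U = 100 V, V0 = 600 V.
a1, q1 = mathieu_a_q(1000.0, 100.0, 600.0, 0.005, 1.0e6)
a2, q2 = mathieu_a_q(2000.0, 100.0, 600.0, 0.005, 1.0e6)

print(a1 / q1, a2 / q2)   # the ratio a/q = 2U/V0 does not depend on mass
print(q1 / q2)            # both parameters scale as 1/m
```

Because a/q depends only on U/V₀, ions of every mass lie on one line in the (a, q) plane, and only those falling within the stable region (whose apex is commonly quoted as near a ≈ 0.237, q ≈ 0.706) are transmitted.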
The ion trap (sometimes known as a QUISTOR, for quadrupole ion store) is a device that operates on similar principles to a conventional quadrupole (11). It consists of three electrodes, one toroidal and two end caps. The electrodes are machined so as to provide hyperbolic inner surfaces and thus resemble a quadrupole in cross-section. Ions of a given mass can be constrained into stable orbits by the application of suitable potentials and then sequentially ejected into an external electron multiplier detector. Ion traps have principally been used as components of low-cost GC-MS systems, but the devices have considerable potential for tandem MS, and have recently been shown to be capable of both high-mass (>40 kDa) and high-resolution operation (12).
2.3 Time-of-Flight Analyzers
In the time-of-flight (TOF) mass spectrometer, ions are accelerated through a potential (V) and then drift down a tube toward a detector. If all the ions arrive at the beginning of the drift tube with the same energy ($mv^2/2 = zeV$), then those differing in mass will have different velocities:
$$v = (2zeV/m)^{1/2} \qquad (7)$$
so for a tube of length L, the time of flight of an ion is given by:
$$t = L/v = (L^2 m/2zeV)^{1/2} \qquad (8)$$
from which its mass (m) may be easily calculated.
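As a rough numerical illustration of Eqs. (7) and (8), the sketch below converts between mass and flight time in SI units. The function names, the 20 kV accelerating potential, and the 1 m drift length are assumptions made for the example, not values from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg

def flight_time(mass_da, accel_volts, length_m, z=1):
    """Flight time in seconds: Eq. (7) for the velocity, then t = L/v (Eq. 8)."""
    velocity = math.sqrt(2.0 * z * E_CHARGE * accel_volts / (mass_da * DALTON))
    return length_m / velocity

def mass_from_time(t_s, accel_volts, length_m, z=1):
    """Eq. (8) rearranged for the mass: m = 2zeV t^2 / L^2, returned in Da."""
    return 2.0 * z * E_CHARGE * accel_volts * t_s**2 / length_m**2 / DALTON

# Hypothetical instrument: 20 kV acceleration, 1 m drift tube.
t_1k = flight_time(1000.0, 20000.0, 1.0)
t_4k = flight_time(4000.0, 20000.0, 1.0)
print(t_1k, t_4k)                              # times in seconds
print(mass_from_time(t_4k, 20000.0, 1.0))      # recovers the mass in Da
```

Since the flight time scales as the square root of the mass, a fourfold increase in mass only doubles the arrival time.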
in conjunction with pulsed ionization techniques, such as plasma desorption and laser desorption. Some variation in ion energy is difficult to avoid, and this is responsible for the relatively low mass resolution of the devices (often ~1000). The energy spread among ions of the same mass can be reduced by the use of various types of ion reflectors; more energetic ions penetrate further into the reflecting field than less energetic ones of the same mass, and thus, their flight time is slightly increased, resulting in tighter bunching among isomass ions. Resolving powers of several thousand have been achieved using reflector technology. The mass range of these instruments is virtually unlimited, and the absence of resolving slits ensures very high transmission efficiencies compared to magnetic mass spectrometers, resulting in excellent sensitivity.
2.4 Fourier Transform Mass Spectrometers
Ions (of charge z) contained within a strong magnetic field (B) will describe a circular motion in a direction perpendicular to the applied field, the angular frequency ($\omega_c$) of this motion being inversely proportional to the mass (m) of each ion:
$$\omega_c = zB/m \qquad (9)$$
the main practical restriction is imposed by the detection limits of the low-frequency signals characteristic of high-mass ions.
FT instruments are well suited to use with pulsed ionization methods, since all the products of a single ionization event can be trapped and analyzed. Powerful tandem MS experiments are also possible with the trapped ions. The principal technical difficulties are a consequence of the very stringent vacuum requirements of the method.
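The inverse dependence of cyclotron frequency on mass in Eq. (9) can be made concrete with a short calculation. This is a sketch in SI units, writing the ion charge as z times the elementary charge; the function name and the 7 T field are assumptions chosen for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg

def cyclotron_frequency_hz(mass_da, b_tesla, z=1):
    """Cyclotron frequency in Hz from Eq. (9), with the charge taken as z*e."""
    omega_c = z * E_CHARGE * b_tesla / (mass_da * DALTON)   # rad/s
    return omega_c / (2.0 * math.pi)

# Assumed 7 T magnet; singly charged ions of increasing mass.
for mass in (100.0, 1000.0, 10000.0):
    print(mass, cyclotron_frequency_hz(mass, b_tesla=7.0))
```

Each tenfold increase in mass lowers the frequency tenfold, which is why the signals from high-mass ions fall into the low-frequency range where detection becomes the limiting factor.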
3 Methods of Ion Production
3.1 Vapor-Phase Ionization Methods
The oldest method for generating ions for mass analysis that is still in routine use is electron impact (EI) ionization. Sample molecules in the vapor phase collide with electrons emitted from a heated metal filament, causing ejection of an electron from the sample molecule, which thus is left carrying a net positive charge:
$$M + e^- \rightarrow M^{+\bullet} + 2e^- \qquad (10)$$
The radical cations produced by electron impact are known as odd electron ions, since they possess an unpaired electron.
The energy of the bombarding electrons is about 70 eV, whereas the ionization potentials of most organic molecules are below 15 eV, so up to 50 eV of excess energy are imparted by the ionization process. Since the dissociation energies of most organic bonds fall within the range 3-10 eV (300-1000 kJ/mol), considerable fragmentation usually results. This fragmentation is reproducible and characteristic of the molecule, and therefore offers a powerful technique for the structure determination of unknowns.
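The energy comparison above can be checked with a unit conversion. The sketch below converts electron volts per molecule to kJ/mol (the function name is an assumption for this example) and compares the quoted excess energy with the quoted bond dissociation range.

```python
AVOGADRO = 6.02214076e23     # 1/mol
E_CHARGE = 1.602176634e-19   # J per eV

# 1 eV per molecule expressed in kJ/mol (about 96.5).
EV_TO_KJ_PER_MOL = AVOGADRO * E_CHARGE / 1000.0

def ev_to_kj_per_mol(energy_ev):
    """Convert an energy per molecule in eV to kJ/mol."""
    return energy_ev * EV_TO_KJ_PER_MOL

bond_low = ev_to_kj_per_mol(3.0)     # lower end of the quoted bond range
bond_high = ev_to_kj_per_mol(10.0)   # upper end of the quoted bond range
excess = ev_to_kj_per_mol(50.0)      # the "up to 50 eV" excess energy

print(round(bond_low), round(bond_high), round(excess))
print(excess / bond_high, excess / bond_low)   # excess in units of one bond energy
```

The 3-10 eV range converts to roughly 290-965 kJ/mol, in line with the rounded 300-1000 kJ/mol in the text, and the maximum excess energy is several times larger than any single bond dissociation energy.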
In some cases, the fragmentation is so extensive that no molecular ions are observed, and the spectra are dominated by relatively uninformative low-mass ions. This problem may be overcome by the use of various soft
ionization techniques that limit the excess energy deposited, and so con-
sample molecules, usually by proton attachment or electrophilic addition, but sometimes by charge exchange. The collision processes allow equilibration of the energy deposited by the primary ionization event, so the excess energy imparted to the sample molecules is small (generally approx eV), and molecular ion production predominates over fragmentation. The cationized molecules produced by CI (sometimes called quasi-molecular ions in the older literature) are even electron species, in contrast to the odd electron ions characteristic of EI. The probability of direct ionization of sample molecules by electron impact is low because of the much higher concentration of the reagent gas. Electron impact and chemical ionization sources can be fitted to most types of mass spectrometer, and are generally provided as part of the standard equipment of commercial instruments, particularly those equipped with a gas chromatographic inlet system.
The major drawback of EI and CI mass spectrometry from the point of view of the biochemist is the requirement for the sample to be presented for ionization in the vapor phase. This limits their application to compounds of low mass (generally <1200 dalton) that are either intrinsically thermally stable and volatile, or that can be made so by suitable chemical derivatization. Thus, most biopolymers are amenable to analysis only after chemical or enzymatic degradation and conversion to volatile derivatives. The requirement for sample volatility is less stringent in desorption chemical ionization (DCI) (also called direct chemical ionization), in which the solid sample is deposited on a wire and introduced into the plasma within a chemical ionization source. On electrical heating, cationized sample molecules are ejected. Although heating of the sample is still required, molecular ions can be obtained from materials of higher mass and polarity than are suitable for analysis by conventional CI. As the mass of the sample increases, however, sensitivity falls and pyrolytic processes become increasingly significant. The following sections describe methods of ionization that are less subject to this limitation.
3.2 Field Desorption
Desorption methods refer to ionization techniques in which ionized