CHAPTER 1. X-RAY CRYSTALLOGRAPHY

1.1 METHODS FOR STRUCTURAL STUDIES

This chapter briefly introduces X-ray crystallography, one of the most powerful techniques for structural studies. X-ray crystallography requires good crystals, which can reliably answer many structure-related questions, from global folds to atomic details of bonding. The other method used for structure determination is nuclear magnetic resonance (NMR) spectroscopy, which is limited to relatively low molecular weight proteins (up to 30 kDa) but is useful for dynamics studies and provides much other information about proteins in solution.

1.2 CRYSTALLIZATION

Crystallization is one of several means (including nonspecific aggregation/precipitation) by which a metastable supersaturated solution can reach a stable lower energy state by reduction of solute concentration (Weber, 1991). The three stages of crystallization common to all molecules are nucleation, growth, and cessation of growth.

Nucleation is the process by which molecules or noncrystalline aggregates (dimers, trimers, etc.) that are free in solution come together in such a way as to produce a thermodynamically stable aggregate with a repeating lattice. The aggregate must first exceed a specific size (the critical size), defined by the ratio of the surface area of the aggregate to its volume (Feher and Kam, 1985; Boistelle and Astier, 1988). Once the critical size is exceeded, the aggregate becomes a supercritical nucleus capable of further growth. The degree to which nucleation occurs is determined by the degree of supersaturation of the solutes in the solution. The extent of supersaturation is in turn related to the overall solubility of the potentially crystallizing molecule. The variables influencing crystal growth are too many to allow an exhaustive search.
This is the reason for using a sparse matrix of trial conditions, selected from known crystallization conditions for macromolecules. In this way wide ranges of pH, salts and precipitants can be tested in a short time using a very small sample of the macromolecule. Initial trials may produce crystals or at least solubility information. If no crystals are obtained, ligand complexes or alternative forms of the protein can be explored, or additives such as metal ions and detergents can be introduced (McPherson, 1990).

The most commonly used methods for initial crystallization trials are hanging drop and sitting drop vapor diffusion, dialysis, and the batch method. The hanging and sitting drop methods rely on the diffusion of either water or some volatile agent between a micro-drop of mother liquor and a much larger reservoir solution.

1.3 X-RAY CRYSTALLOGRAPHY

Knowledge of accurate molecular structures is a prerequisite for structure-based functional studies that aid the development of effective therapeutic agents and drugs. The first prerequisite for solving the three-dimensional structure of a protein by X-ray crystallography is a well-ordered crystal that diffracts X-rays strongly. X-ray crystallography is an experimental technique that exploits the fact that X-rays are diffracted by crystals. X-rays have the appropriate wavelength (in the Ångström range, about 10^-10 m) to be scattered by the electron cloud of an atom of comparable size.

1.3.1 Lattices, planes and indices

The simplest repeating unit in a crystal is called a unit-cell, with three basis vectors a, b and c, and interaxial angles α, β and γ between them (Fig. 1-1).

Figure 1-1. The unit-cell.

The unit-cells are classified into seven different systems based on their symmetry. A cell in which a ≠ b ≠ c and α ≠ β ≠ γ is called triclinic. If a ≠ b ≠ c, α = γ = 90° and β > 90°, the cell is monoclinic.
The other crystal systems are orthorhombic, tetragonal, trigonal, hexagonal and cubic. The trigonal system uses the hexagonal unit cell for some space groups and rhombohedral axes (denoted R) for others. It is important to realize that the conditions imposed on the cell geometry are not sufficient to distinguish between crystal systems. For instance, if a unit-cell is found to have three angles of 90°, it does not necessarily mean that the crystal belongs to the orthorhombic system; it could also be triclinic with three angles that happen to be 90°. To be orthorhombic, the crystal must have a minimum point group symmetry of 222 (vide infra).

In a crystal, unit-cells are imagined to be arranged contiguously to fill space. A point at the corners (vertices) of the unit-cell is taken to represent the whole unit-cell, and the array of these points is called a lattice. In the ‘primitive’ arrangement, designated P, only one lattice point is present per unit-cell. In some situations it is more convenient to enclose smaller primitive cells in a larger non-primitive cell, which then contains two or more lattice points. Such cells are designated A, B or C if one pair of faces carries a lattice point: A for centering of the bc face, B for the ac face and C for the ab face. If all the faces are centered the designation is F, and if the cell is body-centered it is I. The cubic crystal system can have P, F or I lattices; the hexagonal system has P; the trigonal system can have a primitive hexagonal lattice or a rhombohedral lattice; the tetragonal system can have P and I; the orthorhombic system can have P, F, I and A/B/C; the monoclinic system can have P or C; and the triclinic system can only have a primitive lattice. Together these 14 lattices are called the ‘Bravais lattices’.
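The enumeration of lattice types above can be checked with a small lookup table. The sketch below is only a bookkeeping aid and rests on two counting conventions that are assumptions of this example: the A, B and C centerings are counted once (as C), and the primitive hexagonal lattice shared by the trigonal and hexagonal systems is counted once (under hexagonal). The names `BRAVAIS` and `count_bravais_lattices` are made up for illustration.

```python
# Allowed lattice centerings per crystal system, following the text.
# Assumption: A/B/C are counted once as "C"; the primitive hexagonal
# lattice is listed only under "hexagonal", so "trigonal" keeps only R.
BRAVAIS = {
    "triclinic": ["P"],
    "monoclinic": ["P", "C"],
    "orthorhombic": ["P", "C", "I", "F"],
    "tetragonal": ["P", "I"],
    "trigonal": ["R"],
    "hexagonal": ["P"],
    "cubic": ["P", "I", "F"],
}


def count_bravais_lattices(table=BRAVAIS):
    """Return the total number of (crystal system, centering) pairs."""
    return sum(len(centerings) for centerings in table.values())


for system, centerings in BRAVAIS.items():
    print(f"{system:13s} {', '.join(centerings)}")
print("total:", count_bravais_lattices())
```

With these conventions the table adds up to the 14 Bravais lattices quoted in the text.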
For the purpose of X-ray diffraction, the unit-cell is imagined to be sliced into planes, which must be labeled. The standard way of doing this is with ‘Miller indices’. The directions of the lattice vectors a, b and c are first identified, where a, b and c are the lengths of the cell axes. A set of parallel planes is then labeled (h k l), where h, k and l are the integral numbers of parts into which the planes cut the a, b and c axes, respectively. Thus the (1 1 1) plane intercepts all three axes at 1. The (1 0 0) plane intercepts the a axis at 1 but never intercepts the b and c axes; it lies parallel to the b-c plane (and, when the axes are orthogonal, perpendicular to the a axis). Note that no set of planes has fractional indices: it is not possible to have a set of parallel planes that cuts the unit-cell axes fractionally and still obeys the law of rational indices (Fig. 1-2).

Figure 1-2. (A) Planes in two dimensions. (B) Planes with fractional indices (such as 2.5, 0.6) are not possible.

1.3.2 Symmetry, point groups and space groups

Symmetry operations make a duplicate image of an object, which may be identical or similar to the original object. The symmetry properties of objects (molecules in a crystal) may be described in terms of the presence of certain symmetry elements and their associated symmetry operations. The three basic symmetry operations that may be present in a crystal are rotation, reflection and inversion. A symmetry element is defined as the geometrical feature upon which the symmetry operator acts. The symmetry elements corresponding to these three operations are a line, a plane and a point, respectively. Note that each unit-cell is identically related to other unit-cells by translational symmetry. Rotational symmetry is designated by an integer (n) from 1 to infinity.
Because of geometrical constraints, rotation in a crystal is limited to 1-fold, 2-fold, 3-fold, 4-fold and 6-fold axes. The angle between two objects related by an n-fold rotation is 360°/n (e.g., 360°/4 = 90° for a 4-fold rotation). Inversion through a point (the symmetry element; the operation is sometimes called point reflection) takes any point in the molecule, moves it to the center, and then moves it out the same distance on the other side. Inversion can be combined with a rotation; the resulting inversion axes are denoted by a bar above the rotation number, $\bar{n}$.

Symmetry operations may not be combined arbitrarily with each other, but only in a limited number of ways called point groups. Point groups describe the assembly of symmetry elements without any translation in the unit-cell. Point groups involving inversion or reflection symmetry are not possible for protein molecules.

Translation allows for new types of symmetry operations. An n_m screw axis results from a rotation by 360°/n followed by a translation of m/n of the unit-cell dimension parallel to the screw axis. For a 2-fold screw axis (2_1) there is only one possible translation, by 1/2 along the unit-cell axis, and for a 4-fold axis there are three possibilities, 4_1, 4_2 and 4_3, with translations of 1/4, 2/4 and 3/4 along the axis. A glide plane, the combination of a mirror plane and a translation parallel to it, cannot appear in protein structures. Generally, a glide consists of a mirror operation followed by a translation of a/2, b/2 or c/2 along the a, b or c axis, respectively (denoted a, b or c), along a face diagonal, (a+b)/2, (a+c)/2 or (b+c)/2 (in all cases denoted n), or, in the case of a d glide, along (a+b)/4, (a+c)/4 or (b+c)/4.

A space group is the combination of a lattice type and a point group (or its class of glide planes and screw axes). The 230 space groups result from the systematic combination of the 14 Bravais lattices with the 32 crystallographic point groups.
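The restriction to 1-, 2-, 3-, 4- and 6-fold axes mentioned above can be illustrated numerically. The sketch below uses a standard argument that is not spelled out in this chapter: expressed in a lattice basis, a rotation that maps the lattice onto itself has integer matrix entries, so the trace of its in-plane part, 2cos(360°/n), must be an integer. The function names are hypothetical. The second helper simply lists the m/n translations of the n_m screw axes described in the text.

```python
import math


def is_lattice_compatible(n, tol=1e-9):
    """True if an n-fold rotation can map a lattice onto itself.

    Assumption (standard argument, not from the text): the trace of the
    in-plane rotation, 2*cos(2*pi/n), must be an integer.
    """
    trace = 2.0 * math.cos(2.0 * math.pi / n)
    return abs(trace - round(trace)) < tol


def screw_translations(n):
    """Labels and fractional translations m/n of the n_m screw axes."""
    return [(f"{n}_{m}", m / n) for m in range(1, n)]


allowed = [n for n in range(1, 13) if is_lattice_compatible(n)]
print("allowed rotation orders:", allowed)
print("4-fold screw axes:", screw_translations(4))
```

Testing n from 1 to 12 leaves exactly the five rotation orders listed in the text, and the 4-fold case reproduces the translations 1/4, 2/4 and 3/4.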
These 230 space groups describe the possible ways of packing an asymmetric object in the 14 Bravais lattices such that the surroundings of every copy of the object are the same (as related by the symmetry operations). However, only 65 space groups are possible for proteins: proteins are built from L-amino acids only, so symmetry operations that would generate the mirror image of the molecule (containing D-amino acids) cannot occur.

1.4 X-RAY DIFFRACTION

X-rays, which are electromagnetic waves, interact with matter, particularly with electrons. A wave can be described by its amplitude (the height of the peaks) and its frequency (how many times it repeats per unit time). If we call the amplitude A, the time t and the angular frequency ω, the wave is described by

$$E = A\cos\omega t \qquad (1.1)$$

X-rays interact with matter and are scattered (or re-emitted) in all directions by the electrons they encounter. X-rays scattered from different electrons travel different distances, so they differ in their relative phases and interfere as they add up. They can add up in phase, so that the resulting amplitude is the sum of the individual amplitudes, or out of phase, so that the resulting amplitude is the difference of the individual amplitudes, or anything in between (Fig. 1-3).

Figure 1-3. Wave addition.

The atoms in a crystal interact with X-ray waves in such a way as to produce interference. The interaction can be thought of as if the atoms in the crystal structure reflect the waves. Because a crystal structure consists of an orderly arrangement of atoms, the reflections appear to occur from planes of atoms.

1.4.1 X-ray diffraction and Bragg's law

Diffraction by a crystal can be regarded as the reflection of the primary beam by sets of parallel planes. Bragg’s law gives the relationship between the reflection angle, θ, the distance between the planes, d, and the wavelength, λ.
Consider two X-rays (Fig. 1-4). Ray 1 reflects off the upper atomic plane at an angle θ equal to its angle of incidence. Similarly, Ray 2 reflects off the lower atomic plane at the same angle θ. While Ray 2 is in the crystal, however, it travels a distance 2a farther than Ray 1. If this distance 2a is equal to an integral number of wavelengths (nλ), then Rays 1 and 2 will be in phase on their exit from the crystal and constructive interference will occur.

Figure 1-4. Bragg's law.

From trigonometry, a = d sin θ, so that 2a = 2d sin θ, or

$$n\lambda = 2d\sin\theta \qquad (1.2)$$

This is known as Bragg's law for X-ray diffraction. The condition for constructive interference is nλ = 2a. If the distance 2a is not an integral number of wavelengths, destructive interference occurs and the waves will not be as strong as when they entered the crystal. Thus, by orienting the crystal so that many different sets of atomic planes are exposed to the beam and measuring the diffraction from all of them, we can eventually determine the crystal structure.

1.4.2 Reciprocal space

The concept of reciprocal space arises from the observation that, in a diffraction experiment, the diffraction from a set of planes with finer interplanar spacing is recorded farther from the direct beam position than that from a set of planes with greater interplanar spacing.

Figure 1-5. The reciprocal space.

1.5 FOURIER TRANSFORM

The atomic arrangement (or, more precisely, the electron density) in a crystal is related to all the diffraction spots obtained from the crystal through the Fourier transformation. The electron density at any point can be calculated by Eq. 1.3.

$$\rho(x,y,z) = \frac{1}{V}\sum_{h}\sum_{k}\sum_{l} F_{hkl}\, e^{-2\pi i(hx+ky+lz)} \qquad (1.3)$$

In effect we are transforming the inverse space of lattice planes into the real (direct) space electron density at x, y, z. It is a general feature of the Fourier transform that it relates one space to its inverse space and vice versa.
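Returning to Bragg's law (Eq. 1.2), a short numerical sketch may help connect it to the reciprocal-space observation that finer spacings diffract to larger angles. The function names are hypothetical, and the wavelength of 1.5418 Å (Cu Kα) is only an example value, not one taken from this chapter.

```python
import math


def bragg_angle_deg(d, wavelength, n=1):
    """Bragg angle theta (degrees) for plane spacing d, from n*lambda = 2*d*sin(theta).

    d and wavelength must be in the same units (for example Angstrom).
    Raises ValueError when n*lambda > 2*d, where no reflection exists.
    """
    s = n * wavelength / (2.0 * d)
    if s > 1.0:
        raise ValueError("n*lambda exceeds 2d: no diffraction for this order")
    return math.degrees(math.asin(s))


def d_spacing(theta_deg, wavelength, n=1):
    """Plane spacing d recovered from a measured Bragg angle."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


wavelength = 1.5418  # Angstrom; example value (Cu K-alpha)
for d in (10.0, 4.0, 2.0, 1.0):
    theta = bragg_angle_deg(d, wavelength)
    print(f"d = {d:5.1f} A  ->  theta = {theta:6.2f} deg")
```

The printed angles grow as d shrinks, which is the inverse relationship that motivates the reciprocal lattice.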
So our diffraction pattern (an image of the reciprocal space) is transformed back into the real space of electron density. This transformation is accurate and in principle complete. In the above equation, if we know the structure factors F_hkl (inverse space, from diffraction by the electrons), we can calculate the actual real structure (the density of the electrons in real space).

1.5.1 The structure factor F_hkl

The structure factor F_hkl for a reflection h, k, l is a complex number, derived quite straightforwardly as follows:

$$F_{hkl} = \sum_{j=1}^{n} f_j\, e^{2\pi i(hx_j+ky_j+lz_j)} \qquad (1.4)$$

This is a simple summation that extends over all atoms j, with x_j, y_j and z_j their fractional coordinates. f_j is the scattering factor of atom j and depends on its atomic number and on the diffraction angle of the corresponding reflection (h, k, l). Eq. 1.4 shows that if we know the structure, we can easily calculate the structure factors.

The X-ray scattering power of an atom, f, decreases with increasing scattering angle and is higher for heavier atoms. A plot of the scattering factor f, in units of electrons, against sinθ/λ shows this behavior. Note that at zero scattering angle the value of f equals the number of electrons of the atom. Under anomalous dispersion conditions (essentially resonance absorption), which are substantial in the vicinity of the X-ray absorption edge of the scattering atom, f is given by

$$f = f_0 + f' + i f'' \qquad (1.5)$$

These anomalous contributions can be calculated, and their presence can be exploited in the MAD phasing technique. Remember that the i preceding f″ implies a phase shift of +90° between f″ and the real components of f.

In real cases the scattering power of the atoms is further weakened by the temperature factor:

$$f_B = f\, e^{-B(\sin\theta/\lambda)^2} \qquad (1.6)$$

where B is related to the mean-square displacement ⟨u²⟩ of a vibrating atom by the Debye-Waller equation

$$B = 8\pi^2 \langle u^2 \rangle \qquad (1.7)$$

1.5.2 Phase problem

In order to compute Eq.
1.3, we need not only the amplitudes of the diffracted waves, which are obtained from the intensities of the diffraction spots, but also their relative phase shifts with respect to the origin of the unit-cell, which cannot be measured directly. The fact that the phase angle cannot be measured is known as the “phase problem”. The structure factor equation can be rearranged as shown in Eq. 1.8 to include the phase angle.

$$f(x) = \sum_{h=0}^{n} |F_h|\,[\cos 2\pi(hx) + i \sin 2\pi(hx)] \qquad (1.8)$$

The methods by which the phase angle α can be estimated are discussed later.

1.6 GEOMETRIC DATA COLLECTION

For a crystal structure determination the intensities of all diffracted beams must be measured. To do so, all corresponding reciprocal lattice points must be brought into the diffracting position by rotating the crystal, and with it the reciprocal lattice. Two things are achieved in crystallographic diffraction data collection. The first is the determination of the geometry of diffraction, from which the shape, size and symmetry of the reciprocal and direct lattices can be calculated. The second is the assignment to every point in the reciprocal lattice of an observed intensity, which may ultimately be related to the distribution of diffracting electrons in the unit-cell.

Figure 1-7. The four-circle diffractometer.

An X-ray diffraction instrument consists of two parts: the mechanical part that rotates the crystal and the detecting device that measures the intensities of the diffracted beams (Fig. 1-7). The crystal can be rotated about three independent axes (ω, χ and φ) so that any reciprocal lattice point can be brought into the diffraction position.
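To make the role of the phases in Eqs. 1.3 and 1.4 more concrete, the following sketch works through a one-dimensional toy case. Everything in it is an assumption made for illustration: the three point "atoms", their positions and scattering factors, the constant (angle-independent) f_j, and the function names. It computes structure factors from the atoms, then rebuilds the density once with the correct phases and once with all phases set to zero.

```python
import numpy as np


def structure_factors(x, f, h):
    """1D analogue of Eq. 1.4: F_h = sum_j f_j * exp(2*pi*i*h*x_j)."""
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    h = np.asarray(h)
    return (f[None, :] * np.exp(2j * np.pi * h[:, None] * x[None, :])).sum(axis=1)


def density(F, h, grid, cell_length=1.0):
    """1D analogue of Eq. 1.3: rho(x) = (1/L) * sum_h F_h * exp(-2*pi*i*h*x)."""
    phase = np.exp(-2j * np.pi * h[:, None] * grid[None, :])
    return (F[:, None] * phase).sum(axis=0).real / cell_length


# Hypothetical 1D "crystal": three point atoms at fractional positions x_j.
x_atoms = np.array([0.10, 0.35, 0.70])
f_atoms = np.array([6.0, 7.0, 8.0])  # constant scattering factors (simplification)
h = np.arange(-20, 21)
grid = np.linspace(0.0, 1.0, 200, endpoint=False)

F = structure_factors(x_atoms, f_atoms, h)
rho_true = density(F, h, grid)             # amplitudes and phases
rho_nophase = density(np.abs(F), h, grid)  # amplitudes only, phases set to 0

peak_true = grid[np.argmax(rho_true)]
peak_nophase = grid[np.argmax(rho_nophase)]
print("highest peak with phases:   ", peak_true)
print("highest peak without phases:", peak_nophase)
```

With the phases included, the highest peak falls on the heaviest atom; with amplitudes alone, the strongest feature sits at the origin and the atomic positions are no longer read off directly, which is the phase problem in miniature.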
Different detectors use different physical mechanisms to record the X-ray reflections. The most popular systems include photographic film, the multiwire proportional chamber (MWPC), solid-state TV and charge-coupled device (CCD) detectors, and the image plate (IP, which records the signal via color centers that are read out by a laser scan).

1.6.1 Data reduction

In a diffraction experiment we measure the intensities and the positions of the reflections. From the position of a reflection we can determine its index triple (h, k, l) and assign the appropriate intensity to it. This intensity is proportional to the square of |F_hkl|. Correct Miller indices are assigned and intensities are calculated for all the observed reflections. The reflections are scaled to remove systematic errors introduced by effects such as absorption (arising from irregular crystal shape), non-linearity in the monitoring of the incident beam intensity by the detector, and changes in the average diffracted intensities due to variation in the total diffracting volume when part of the crystal moves in or out of the incident beam. The data must also be corrected for the thermal motion of atoms (which causes the fall-off of intensity with increasing scattering angle) and for radiation damage (which contributes to a reduction in intensity as a function of resolution).

The usefulness of a dataset depends on its highest resolution limit. It has been shown that the correctness of a solved structure increases with the resolution of the data used [Hubbard and Blundell, 1987]. Before sophisticated crystal cooling devices became available to limit radiation damage, diffraction data had to be collected from several crystals.
In practice, multiple observations of symmetry-related reflections (either from different crystals or from the same crystal) are merged to give only the unique reflections for that particular crystal system. The quality of the merged data is assessed by R_sym (Eq. 1.9), with typical values of less than 3% for low resolution data and up to 20% for data near the high resolution diffraction limit [Ealick, 1995].

$$R_{sym} = \frac{\sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle|}{\sum_{hkl}\sum_{i} I_i(hkl)} \qquad (1.9)$$

where ⟨I(hkl)⟩ is the mean of the individual observations I_i(hkl) of a reflection.

Several authors have demonstrated the value of using as much data as possible. Cowtan (1996) has shown how missing reflections affect the reproduction of an image from diffraction data. The completeness of low resolution data is important in the placement of missing parts of the structure, while refinement may benefit from the inclusion of high resolution data even if the merging R is up to 40% [Dodson, Kleywegt and Wilson, 1996]. Depending on the structure determination method, the Bijvoet pairs of a reflection can either be treated separately to give h,k,l,F+,F- (acentric) or averaged to give h,k,l,F (centric).

1.7 STRUCTURE DETERMINATION

Most new structures are solved by experimental phasing methods based either on isomorphous replacement or on anomalous dispersion. Both approaches give phase information indirectly by perturbing the diffraction pattern through an effect involving a subset of atoms. In isomorphous replacement, “heavy” atoms are added to the native crystal, changing the diffraction pattern. In anomalous dispersion, certain atoms (those with absorption maxima near the energy of the X-rays) introduce a phase shift into their contribution to the diffraction pattern, which perturbs the equality that would otherwise be found between the intensities of “Friedel pairs” of diffraction spots. If we know the contribution of the subset of atoms to the diffraction pattern, then the change in amplitude gives information about the original phase angle.
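As a brief aside on the merging statistic of Eq. 1.9, the sketch below computes R_sym in its usual form, the summed absolute deviations from each reflection's mean intensity divided by the summed intensities. The function name, the dictionary layout and the intensities are all invented for illustration; real data processing programs also apply scaling and outlier rejection, which are omitted here.

```python
import numpy as np


def r_sym(observations):
    """R_sym = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    `observations` maps a unique reflection (h, k, l) to the list of
    intensities measured for it and its symmetry mates.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        values = np.asarray(intensities, dtype=float)
        numerator += np.abs(values - values.mean()).sum()
        denominator += values.sum()
    return numerator / denominator


# Made-up intensities, for illustration only.
data = {
    (1, 0, 0): [100.0, 104.0, 96.0],
    (1, 1, 0): [50.0, 50.0],
    (2, 0, 1): [10.0, 14.0],
}
print(f"R_sym = {100.0 * r_sym(data):.1f} %")
```

Weak reflections with large relative scatter, such as the last entry, are what push R_sym up near the high resolution limit.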
Instead of solving the entire structure to get phase information, it is then only necessary to solve the substructure of the special (heavy or anomalous) atoms, which is generally a much simpler problem. For an isomorphous replacement or anomalous dispersion experiment to give any phase information, the predicted change in amplitude must be at least comparable to the experimental uncertainties in the amplitudes. The size of the change in amplitude depends on the scattering power of the atoms in the substructure (hence the requirement for “heavy” atoms or for atoms with a large anomalous dispersion contribution) and on the number of such atoms in the substructure. The expected fractional change in amplitude can be estimated with simple diffraction ratio formulas, which assume that the special atoms are distributed randomly through the crystal.

Some structures, under favorable conditions, may be solved by ‘direct methods’. Direct methods rest on two principles. The first, the inequality method, was developed by Harker and Kasper, Sayre and others. Jerome Karle and Herbert Hauptman developed the second, called the probability method. This method exploits the relationships between certain classes of reflections and their phases, supported by probability. The phase relationship among three reflections is first developed by the multitangent formula, and a high probability score for that relationship to be true validates the combination. Subsequently, more combinations are generated and validated. When enough reflections have been phased, the electron density map is calculated. Direct methods are likely to succeed for a molecule of about 100 residues when the data are of high resolution, better than 1.5 Å.
1.7.1 Molecular replacement

When a structure homologous to the structure of interest has already been determined and is available, the crystal structure can be determined using the molecular replacement method. As the structures reported in this thesis were determined by molecular replacement, the method is explained in some detail.

When identical or similar structures exist, similarities between their diffraction patterns, which are directly related to their Fourier transforms, are to be expected. Molecular replacement exploits this similarity to determine the phases. The method was first developed by Rossmann and Blow (1962) to exploit the presence of non-crystallographic symmetry to obtain phase information and reduce phase uncertainties. The ideas were extended by Bricogne (1976) to real-space symmetry averaging to improve the electron density. The method is based on a simple idea: locating a molecule at different places in a unit-cell.

The basic principle of molecular replacement is rooted in the Patterson function. A Patterson map contains information on two types of vectors: (i) intramolecular or self-Patterson vectors between pairs of atoms in the same molecule, which are relatively short and thus clustered around the origin; and (ii) intermolecular or cross-Patterson vectors, which are generally longer than self-vectors. The self-vector cluster is identical for non-crystallographically related molecules in the same unit-cell, and very similar for similar molecules in different crystals, apart from a rotational difference. The cross vectors provide information on the translation required to bring molecules to their positions relative to the symmetry elements. Thus the process of molecular replacement can be divided into two steps: orientation and positioning. In certain cases, when needed, a direct six-dimensional search is carried out where this is computationally feasible.
1.7.1.1 The rotation function

The Patterson function, or its reciprocal space analog, is compared between the search model and the observed diffraction data. The construction of a Patterson function can be regarded as consecutively placing each atom at the origin and drawing vectors from that atom to all other atoms in the unit-cell. The value assigned to a vector is proportional to the product of the atomic numbers at either end of the vector. Thus the height of the origin peak in a Patterson map is proportional to the sum of the squared atomic numbers, due to the null vectors from each atom to itself. As stated before, the cross vectors in a protein Patterson are generally longer than the self-vectors, unless the protein molecules are unusually closely packed or non-globular in shape.

In the self-rotation function, the Patterson of the unknown structure is rotated with respect to itself. Maximum overlap of these Patterson maps occurs at zero rotation (obviously) and at rotations representing (non-crystallographic) symmetry. Similarly, two Patterson maps of related molecules (one known, one unknown) can be superposed when performing a cross-rotation function. The origin peak may be excluded by using an inner limit for the integration. The integration radius should reflect the dimensions of the molecule and packing considerations, in order to exclude cross vectors. In the cross-rotation function, cross vectors of the known molecule can be prevented altogether by placing the molecule in an artificially large P1 cell, so that the cross vectors lie well outside the integration radius.

The general notation for the rotation function, seen as an overlap function of a Patterson function P(u) and a rotated Patterson function P_r(u_r), dependent on the Eulerian angles α, β, γ, is

$$R(\alpha,\beta,\gamma) = \int_U P(\mathbf{u})\, P_r(\mathbf{u}_r)\, d\mathbf{u} \qquad (1.10)$$

where U is the appropriate volume of integration and u is a Patterson vector (u, v, w).
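The construction of the Patterson function described above (each atom placed at the origin in turn, vectors weighted by the product of atomic numbers) can be sketched for a handful of point atoms. This is a simplified illustration under stated assumptions: the three-atom fragment and its coordinates are invented, the atoms are treated as points, and no unit-cell periodicity or symmetry is applied. The function names are hypothetical.

```python
import numpy as np


def patterson_vectors(coords, Z):
    """All interatomic vectors u = r_j - r_i with weights Z_i * Z_j.

    Every atom is placed at the origin in turn and vectors are drawn to
    all atoms, including the null vector to itself.
    """
    coords = np.asarray(coords, dtype=float)
    Z = np.asarray(Z, dtype=float)
    u = coords[None, :, :] - coords[:, None, :]  # u[i, j] = r_j - r_i
    w = Z[:, None] * Z[None, :]                  # w[i, j] = Z_i * Z_j
    return u.reshape(-1, coords.shape[1]), w.reshape(-1)


def origin_peak_weight(coords, Z, tol=1e-9):
    """Summed weight of the null vectors, i.e. the Patterson origin peak."""
    u, w = patterson_vectors(coords, Z)
    at_origin = np.linalg.norm(u, axis=1) < tol
    return w[at_origin].sum()


# Hypothetical three-atom fragment (C, N, O), Cartesian coordinates in Angstrom.
coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0]]
Z = [6, 7, 8]

u, w = patterson_vectors(coords, Z)
print("number of Patterson vectors:", len(u))
print("origin peak weight:", origin_peak_weight(coords, Z))
print("sum of Z squared:  ", sum(z * z for z in Z))
```

The origin weight equals the sum of squared atomic numbers, and every non-null vector appears together with its negative, which is why a Patterson map is always centrosymmetric.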
The rotation function [Navaza, 1994; Crowther and Blow, 1967; Vagin and Teplyakov, 1997] has been formulated in real space as well as in reciprocal space. Although the two formulations are in principle equivalent, they differ in the numerical approximations employed.

1.7.1.2 The translation function

The rotation function is based on the rotation of a Patterson function around an axis through its origin and involves no translation. For the final solution of the molecular replacement problem, however, the translation required to overlap one molecule onto the other in real space must also be determined. The first fast translation function expression was devised by Crowther and Blow (1967):

$$T_{jk}(\mathbf{v}) = \int_V P_{jk}(\mathbf{u},\mathbf{v})\, P_{obs}(\mathbf{u})\, d\mathbf{u} \qquad (1.11)$$

P_jk(u,v) is the cross-Patterson function of the model structure in which two molecules (j and k) are related by crystallographic symmetry, v is the intermolecular vector between the local origins of these two molecules, and P_obs(u) is the Patterson function of the unknown structure. This function effectively searches one pair of asymmetric units, i.e. one crystallographic symmetry element, at a time. Translation functions are usually computed in reciprocal space, sometimes using normalized structure factor amplitudes. Several methods have been proposed to improve the calculations:
• combining the translation search with a limited systematic variation of the parameters from the rotation function, assessing improvement by R-factor and correlation coefficient calculations [Fujinaga and Read, 1987];
• including existing phase information from isomorphous replacement studies, calculating the so-called 'phased translation function' [Read and Schierbeek, 1988];
• including information from a partial model by either adding or subtracting partial Patterson functions appropriately during rotation or translation function calculations [Zhang and Matthews, 1994];
• improving the search model and its orientation before the translation search. Brünger (1990) has suggested the procedure of Patterson Correlation refinement, minimizing a target function based on the correlation between observed and calculated Patterson functions and including an empirical energy term to restrain the geometry of the model. The model can be divided into domains treated as rigid bodies, while the domains are allowed to move with respect to each other;
• fixing one molecule in the asymmetric unit while searching for non-crystallographically related molecules (as in AMoRe [Navaza, 1994]).

The translation function may be bypassed altogether by a trial and error search, moving the correctly oriented model through the asymmetric unit and calculating the conventional R-factor as a function of the molecular position.

The molecular replacement technique requires a known structure, which serves as a model for the unknown structure. Homology in the amino acid sequence and the solution obtained will indicate whether a model is suitable. The solutions of the rotation and translation functions are not always found in a straightforward way. It may be necessary to modify the model, for instance by ignoring side chains and deletions/additions in the model, and to adjust the resolution range of the X-ray data. With the rapid increase in the number of successful protein structure determinations, molecular replacement has become an extremely useful technique in protein crystallography.
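The trial and error translation search mentioned above (moving the oriented model through the asymmetric unit and following the R-factor) can be illustrated with a deliberately small toy problem. The setup is entirely hypothetical: a one-dimensional cell with an inversion centre at the origin, a three-atom point model with constant scattering factors, and noise-free "observed" amplitudes generated from a known position. It is a sketch of the idea, not of how programs such as AMoRe implement it.

```python
import numpy as np


def amplitudes(model_x, model_f, t, h):
    """|F_h| for a 1D cell with an inversion centre at the origin.

    The model (fractional offsets model_x) is placed at position t and
    its symmetry mate sits at -(t + x_j).
    """
    x = t + np.asarray(model_x, dtype=float)
    f = np.asarray(model_f, dtype=float)
    x_all = np.concatenate([x, -x])
    f_all = np.concatenate([f, f])
    F = (f_all[None, :] * np.exp(2j * np.pi * h[:, None] * x_all[None, :])).sum(axis=1)
    return np.abs(F)


def r_factor(f_obs, f_calc):
    """Conventional R-factor between observed and calculated amplitudes."""
    return np.abs(f_obs - f_calc).sum() / f_obs.sum()


def translation_search(model_x, model_f, f_obs, h, steps=100):
    """Scan trial positions and return the one with the lowest R-factor."""
    # Positions t and t + 1/2 are equivalent origins here, so scan [0, 0.5).
    trial_t = np.arange(steps) * (0.5 / steps)
    r = np.array([r_factor(f_obs, amplitudes(model_x, model_f, t, h)) for t in trial_t])
    return trial_t[np.argmin(r)], r.min()


# Hypothetical search model and "observed" data generated at a known position.
model_x = np.array([0.00, 0.04, 0.11])
model_f = np.array([6.0, 7.0, 8.0])
h = np.arange(1, 16)
true_t = 0.185
f_obs = amplitudes(model_x, model_f, true_t, h)

best_t, best_r = translation_search(model_x, model_f, f_obs, h)
print(f"best position t = {best_t:.3f}, R = {best_r:.3g}")
```

The symmetry mate is essential: in a cell without symmetry the amplitudes do not change when the model is shifted, so the R-factor could not locate it. With real data the minimum is shallower because the model is imperfect and the amplitudes carry errors.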
1.8 REFINEMENT OF THE INITIAL MODEL

After the determination of an initial model by one of the four methods, the structure needs to be refined in order to obtain a set of atomic coordinates that corresponds best to the observed data. Various systematic and random errors affect the accuracy of the initial model. Refinement is the process of adjusting the model to find closer agreement between the calculated and observed structure factors, by least-squares methods or molecular dynamics. In least-squares refinement the function to be minimized is:

$$Q = \sum_{hkl} w(hkl)\,\big(|F_{obs}(hkl)| - |F_{calc}(hkl)|\big)^2 \qquad (1.12)$$

In macromolecular refinement the number of observed structure factors (N_obs) is often limited. The number of crystallographic parameters (N_c) to be refined (x_i, y_i, z_i, B_i, i.e. 4 × the number of atoms) is substantial, and the ratio N_obs/N_c may approach or even fall below 1. In these cases the conventional least-squares method fails. It has therefore become customary to formally increase the number of "observations" by adding a stereochemical energy function (U), which contains information on the ideal stereochemical values of bond lengths, bond angles, non-bonded contacts and torsion angles.

In some cases the restrained refinement of a protein structure can become locked in a local minimum. To overcome this one can use extensive analysis of electron density maps in conjunction with molecular graphics, or apply molecular dynamics (MD). In the latter case, one takes advantage of the kinetic energy associated with each atom in the structure (at a given temperature) to overcome local potential energy barriers. In practice, MD cycles are coupled to crystallographic refinement through an "effective potential", the sum of the molecular potential energy and the crystallographic residual.
The basic idea of molecular dynamics is to raise the temperature high enough for the atoms to overcome the energy barriers and then to cool slowly to approach the energy minimum.

The agreement between calculated and observed structure factors is expressed by the R-factor:

$$R = \frac{\sum_{hkl} \left| |F_{obs}(hkl)| - |F_{calc}(hkl)| \right|}{\sum_{hkl} |F_{obs}(hkl)|} \qquad (1.13)$$

The adjustment of the model consists of changing the positional parameters and the temperature factors of all atoms except the hydrogen atoms, which are generally not included. The stereochemical information on bond lengths, bond angles, etc. can be applied in two different ways: constrained refinement, in which the stereochemical information is taken as rigid and only dihedral angles can be varied, and restrained refinement, in which the stereochemical parameters are allowed to vary harmonically around a standard value.

The conventional R-factor is the most widely accepted indicator of the general quality of a crystal structure. It is not a good independent validator, however, since it can be manipulated by excluding data or adding parameters inappropriately. A better indication of the fit between observed and calculated structure factors is the 'free R-factor' [Brünger, 1992], which is calculated from reflections excluded from refinement and therefore reveals overfitting. Commonly between 5 and 10 % of the diffraction data are excluded from refinement, and a test set of between 500 and 1000 reflections is considered sufficient to produce reliable statistics. Noncrystallographic symmetry will affect the discrepancy between conventional and free R.

1.9 PROTEIN STRUCTURE VALIDATION

The correctness and precision of the atomic parameters in the structure need to be assessed thoroughly, both during and after refinement. Keep in mind that validation by computer programs is only as good as the ideal values prespecified inside these programs.
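The R-factor of Eq. 1.13 and its free counterpart from section 1.8 can be sketched in a few lines. The function names, the random selection of the test set and the amplitudes below are assumptions made for illustration; real programs keep the test-set flags with the reflection file and apply scaling that is omitted here.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs|  (Eq. 1.13)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def choose_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly flag a fraction of the reflections as the test (free) set."""
    rng = random.Random(seed)
    n_free = max(1, round(free_fraction * n_reflections))
    return set(rng.sample(range(n_reflections), n_free))


def r_work_and_r_free(f_obs, f_calc, free_set):
    """R over the reflections used in refinement and over the excluded ones."""
    work = [i for i in range(len(f_obs)) if i not in free_set]
    free = sorted(free_set)
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free


# Hypothetical amplitudes: every calculated value is 5 % too large.
f_obs = [float(10 + i % 7) for i in range(1000)]
f_calc = [1.05 * f for f in f_obs]
free_set = choose_free_set(len(f_obs))
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_set)
```

Because the calculated amplitudes here were not fitted to the working set, both values come out equal. In an actual refinement the model is adjusted against the working reflections only, so a free R that is much higher than the working R points to overfitting.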
Protein structure refinement is inherently difficult because the data are weak, not highly overdetermined, not at atomic resolution and prone to error. Data quality has improved for the following reasons [Dodson, Kleywegt and Wilson, 1996]:
• better detectors, which reduce the problem of systematic errors;
• better cryosystems, which extend the resolution limit and keep crystal quality constant during the course of data collection;
• synchrotron radiation, which allows the use of smaller crystals, yields higher resolution data and reduces absorption effects because of the shorter wavelengths;
• advancing data processing techniques, which incorporate information on experimental standard deviations.

Accurate bond length and angle parameters for X-ray protein structure refinement have been extracted from the Cambridge Structural Database and compiled by Engh and Huber (1991), providing reliable ideal values for bond lengths and angles. Ever increasing computing power allows for more advanced software, both for refinement calculations and for graphics.

1.9.1 Ramachandran plot

After experiment duplication, the best validation is by criteria that cannot be, or have not been, used during the course of refinement. One validation tool that is very hard to use during automatic refinement is the Ramachandran plot [Ramachandran et al., 1963], which checks the stereochemistry of the structure through the backbone torsion angles φ and ψ of each residue. It is a good check of the structure, and residues at disallowed positions in the plot need further investigation.

1.9.2 Folding profile methods

The potential fold of a protein sequence is most often assigned by searching databases for structures of known proteins that are similar to the protein under study. This only works if the sequence identity is reasonably high.
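Returning briefly to the Ramachandran plot of section 1.9.1: each point in the plot is a pair of backbone torsion angles (φ, ψ) computed from four consecutive backbone atoms. The sketch below shows one common way to compute a torsion angle from Cartesian coordinates. The helper names are hypothetical and the coordinates in the example are made up to give easily checked angles. It is a minimal illustration, not the routine of any particular validation program.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """phi of residue i: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """psi of residue i: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)


# Made-up coordinates giving trans, cis and right-angle arrangements.
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Plotting ψ against φ for every residue and comparing the points with the allowed regions gives the check described in section 1.9.1.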
Eisenberg and co-workers have devised a method that allows more flexibility, both in the backbone of the protein and in the spacing between particular residues [Bowie et al., 1991]. Each residue of a protein is assigned to one of 18 environment classes, based on the total area of the side chain buried by other protein atoms, the fraction of the side chain area covered by polar atoms or water, and the local secondary structure, i.e. α-helix, β-sheet and 'other'. In this way the three-dimensional structure of a protein is converted to a one-dimensional string. It is well established that each of the 20 natural amino acid types has a clear preference for a certain environment. On the basis of well-refined three-dimensional structures, a score is assigned to the occurrence of every amino acid in each of the 18 environment classes. These scores can then be used to find the best alignment of amino acid sequences to the one-dimensional string of the protein under study, producing a compatibility score of the sequence with its three-dimensional structure.

Instead of stereochemical parameters, Sippl (1993) uses knowledge of the forces that stabilize proteins in solution to analyse the energy distribution of a protein profile. These energies are obtained from well-refined structures in the form of potentials as a function of the spatial separation of two atoms. Cα-Cα pairs and Cβ-Cβ pairs seem to work equally well, indicating that the method can be applied even if only a Cα trace is available. The atom pair interaction energy is a function of the sequence, and the structure with the correct conformation will have a lower energy than any alternative conformation. Since no stereochemical parameters are used in the method, the occurrence of high energies in a particular structure is not a consequence of violations of basic steric requirements.
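The scoring step of the environment-class method of Bowie et al. described earlier in this section can be illustrated with a toy example. Everything specific in the sketch is hypothetical: it uses three made-up environment labels instead of the 18 classes, four amino acids instead of 20, and invented score values. It also sums scores over an ungapped, position-by-position alignment, whereas the published method searches for the best alignment.

```python
# Hypothetical scores: SCORES[environment class][amino acid].
# A positive value means the residue type is favoured in that environment.
SCORES = {
    "buried_helix": {"L": 1.2, "A": 0.5, "K": -1.5, "D": -1.8},
    "exposed_helix": {"L": -0.6, "A": 0.3, "K": 0.7, "D": 0.4},
    "exposed_other": {"L": -1.0, "A": -0.1, "K": 0.5, "D": 0.9},
}


def compatibility_score(environments, sequence, scores=SCORES):
    """Sum the score of each residue in the environment at its position."""
    if len(environments) != len(sequence):
        raise ValueError("environment string and sequence differ in length")
    return sum(scores[env][aa] for env, aa in zip(environments, sequence))


# One-dimensional environment string derived from a (hypothetical) structure.
environments = ["buried_helix", "exposed_helix", "exposed_other", "buried_helix"]

native_like = compatibility_score(environments, "LKDA")   # 1.2 + 0.7 + 0.9 + 0.5
mismatched = compatibility_score(environments, "KLAD")    # -1.5 - 0.6 - 0.1 - 1.8
```

A sequence whose residues sit in environments they prefer scores higher than the same residues placed in unfavourable environments, which is the basis for using the profile as a check on a model.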
1.9.3 Presentation

The refined coordinates (positions of the atoms) are orthogonalized (expressed with respect to a system of three orthogonal axes), even if the unit cell has non-orthogonal axes. The temperature factor is a good indicator of the static and dynamic disorder in protein structures and of the thermal vibration of the atoms. The solved structure is deposited in the Protein Data Bank (PDB).
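To give a feeling for what a temperature factor means physically, the Debye-Waller relations given earlier in the chapter (Eqs. 1.6 and 1.7) can be evaluated directly. The sketch below assumes an isotropic B in Å² and uses Bragg's law to express sinθ/λ as 1/(2d). The function names and the example values are illustrative assumptions.

```python
import math


def rms_displacement(b_factor):
    """Root-mean-square displacement (Angstrom) from B = 8 * pi**2 * <u**2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


def debye_waller_attenuation(b_factor, d_spacing):
    """Factor exp(-B * (sin(theta)/lambda)**2) applied to the scattering factor.

    Bragg's law gives sin(theta)/lambda = 1 / (2 * d).
    """
    s = 1.0 / (2.0 * d_spacing)
    return math.exp(-b_factor * s * s)


# Illustrative values: B = 20 A^2, reflection at 2 A resolution.
u_rms = rms_displacement(20.0)
attenuation = debye_waller_attenuation(20.0, 2.0)
```

For B = 20 Å² the root-mean-square displacement is about 0.5 Å, and the scattering factor at 2 Å resolution is reduced to roughly 29 % of its value for an atom at rest.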