BIOCHEMICAL AND SYNTHETIC POLYMER SEPARATIONS

Figure 13.4 Schematic diagram of the helical structure of DNA, showing hydrogen bonding between complementary base pairs. Labels in the figure: sugar, base, Chain-1, Chain-2; X = -O-PO2-O- (phosphate).

of a double-stranded nucleic acid depends on the contribution of many individual interactions. Under physiological conditions, hydrogen bonding and stacking interactions between bases stabilize the helical structure. Exposure of the helix to elevated temperature or extremes of pH can induce separation of the two strands. Single-stranded nucleic acids may form intramolecular, base-paired segments if they have regions of internal complementarity; these structures can be dissociated by heat or chemically denaturing conditions.

13.2.3 Carbohydrates

Polysaccharides play a variety of roles throughout the biosphere. Glucose homopolymers provide nutritional storage in both animals and plants. Glycogen, a large branched polymer of glucose residues linked by main-chain α-1,4 glycosidic bonds, is the sugar-storage entity in animals. Starch, the sugar-storage polymer in plants, consists of unbranched (amylose) or branched (amylopectin) chains of glucose with α-1,4 linkages. Both glycogen and starch exist as helical structures. Cellulose, the structural polysaccharide in plants, is a linear polysaccharide with β-1,4 glycosidic linkages.

Oligosaccharides play important functional roles as components of glycoproteins, including integral membrane proteins and many secreted proteins such as antibodies and clotting factors. Oligosaccharides participate in immune cell recognition and cell–cell communication, and also contribute to protein stability and the maintenance of cell structure.
Oligosaccharides are added as co-translational or post-translational modifications to proteins, and are covalently linked to the side chains of serine or threonine (O-linked oligosaccharides) or the side chain of asparagine (N-linked oligosaccharides). The predominant sugars in glycoproteins (Fig. 13.5) are glucose, galactose, mannose, fucose, N-acetylgalactosamine (GalNAc), and N-acetylglucosamine (GlcNAc). In O-linked glycoproteins, the carbohydrate is attached to the protein by a GalNAc residue, while the attachment point in N-linked oligosaccharides is via GlcNAc. In N-linked glycoproteins (the predominant form in mammals), the oligosaccharide contains a common core structure consisting of two GlcNAc residues and three mannose residues. Additional sugars attached to this core form a diverse family of oligosaccharide structures, including high-mannose oligosaccharides and complex oligosaccharides containing GlcNAc, galactose, sialic acid, and fucose residues (Fig. 13.6). Glycoproteins vary in the number of glycosylation sites, the occupancy of these sites, and the oligosaccharide structure at each occupied site.

Figure 13.5 Sugar residues commonly found in glycoproteins. Structures shown: β-L-fucose (Fuc), β-D-mannose (Man), sialic acid (N-acetylneuraminate) (Sia), β-D-N-acetylgalactosamine (GalNAc), β-D-N-acetylglucosamine (GlcNAc), and β-D-galactose (Gal).

Figure 13.6 N-linked oligosaccharides of (a) high-mannose type, and (b) complex type. The common core structure consisting of three mannose residues and two GlcNAc residues is indicated (dashed enclosure).
13.2.4 Viruses

In the 1990s the advent of gene therapy generated a need for the purification of recombinant (laboratory-created) viruses; chromatography emerged as the preferred technique for the purification of large quantities of material for gene-therapy trials. Virus purification is also required for the production of some vaccines. Viral purification has traditionally been done by cesium-chloride density-gradient ultracentrifugation [4], but scale-up is not practical. This method results in variable purity and poor yields, and can require removal of CsCl [5].

Recombinant adenoviruses (rAd) are common vectors for gene therapy, and they have served as models for the development of chromatographic methods for the purification and analysis of viruses. The adenovirus particle contains 85% protein and 13% DNA and has a molecular weight of 167 × 10^6 Da. It consists of an icosahedral protein shell or capsid (70 to 100 nm in diameter) surrounding a protein core that contains the linear, double-stranded DNA genome [11]. The capsid also contains some additional, minor polypeptide elements. The adenovirus genome consists of a linear double-stranded DNA molecule of 35 to 36 kilobase pairs. One distinguishing feature of viruses, as far as their chromatographic separation is concerned, is their enormous relative size, which restricts the penetration of the virus particle into a porous column packing.

13.3 SPECIAL CONSIDERATIONS FOR BIOMOLECULE HPLC

The size and shape of biopolymer molecules, as well as the need in preparative applications for maintaining biological activity, require special consideration for the choice of column, mobile phase, and temperature, any of which can affect the recovery of biological activity. Preferred conditions vary for each chromatographic mode and sample type, as discussed in Sections 13.4 through 13.7.
However, some general comments can be made with regard to column characteristics and stability as a function of conditions (Section 13.3.1). General principles of method development are provided in Chapters 6 through 9 for individual chromatographic modes, while other aspects of method development are dealt with in Chapters 11 and 12. The latter material is largely applicable for all solute molecules, both large and small. Additional considerations that are important for biomolecules are addressed in this chapter; for additional information, see [9].

13.3.1 Column Characteristics

The large size of biomacromolecules requires particular attention to the selection of the pore size and particle diameter of the column packing. Analyte stability and mass recovery, a possible need for nondenaturing conditions, and column stability also affect the final choice of column.

13.3.1.1 Pore Size

Column capacity (Section 15.3.2.1) and retention are a function of the amount of stationary phase available for sample interaction, which is in turn proportional to the accessible surface area of the packing (Section 5.2.1). For smaller molecules (<1000 Da), particles with pore diameters of 8 to 12 nm permit free access of solutes into the pore system, such that the solute can sample the entire surface area (typically ≈250 m²/g for a 10-nm-pore particle) and diffuse freely within the pores (so as not to compromise column efficiency N). In contrast, large biopolymers can be excluded partly or entirely from pores of this size so that they interact only with the external surface of the particle (which represents <1% of the total surface and column capacity within the pores). In order to achieve adequate column capacity, retention, and column efficiency, particles should be used which have pore diameters large enough to permit an easy entry and exit of the biomolecule.
The relationship between molecular weight M and molecular size for globular and random coil proteins is shown in Table 13.1. To avoid peak broadening due to restricted diffusion of the protein within the pore, the pore diameter should exceed the solute diameter by a factor of 3 or more [2, 6]. However, surface area decreases approximately in proportion to increasing pore size (Table 13.2), so that an optimum pore size allows access of the protein to the pores without unduly compromising surface area and column capacity.

Table 13.1 Protein Diameter and Molecular Weight Compared

Molecular Weight    Hydrodynamic Diameter (nm)
(kDa)               Random Coil (a)    Globular (b)
1                   2.6                1.6
10                  8.2                3.5
100                 25.8               7.6
1000                81.6               16.3

Source: Data from [9].
(a) Applies to separation under denaturing conditions (including RPC).
(b) Usually the native (nondenatured) protein.

Table 13.2 Effect of Pore-Diameter on Surface Area

Pore-Diameter (nm)    Surface Area (m²/g) (a)
10                    250
30                    100
100                   20
400                   5–10

(a) Approximate values that vary with pore-volume.

The combined effects of solute and pore size on retention are illustrated in Figure 13.7. In Figure 13.7a, maximum retention of the small peptides angiotensin I and II (approximately 1000 Da each) is observed with pores of 10-nm diameter. For smaller pores, exclusion of the two peptides occurs with a decrease in retention; for larger pores, surface area is reduced with a corresponding decrease in both column capacity and retention. For proteins (Fig. 13.7b), significant pore penetration is achieved only with the 30-nm-pore packing, and all three proteins are largely excluded from particles with smaller pore diameters. In practice, columns with pore diameters of about 30 nm are satisfactory for proteins of ≤50 kDa, while columns with pore diameters of 100 to 400 nm are preferred for large globular and/or denatured proteins.
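The pore-size guidance above can be sketched numerically. The following Python sketch interpolates the hydrodynamic diameters of Table 13.1 on a log-log scale, applies the factor-of-3 rule, and picks the smallest pore diameter listed in Table 13.2 that satisfies it. The function names, the log-log interpolation scheme, and the restriction to the four tabulated pore diameters are assumptions made for illustration only; they are not taken from the cited sources.

```python
import math
from bisect import bisect_left

# Table 13.1: molecular weight (kDa) vs. hydrodynamic diameter (nm)
MW_KDA = [1.0, 10.0, 100.0, 1000.0]
D_RANDOM_COIL_NM = [2.6, 8.2, 25.8, 81.6]   # denaturing conditions
D_GLOBULAR_NM = [1.6, 3.5, 7.6, 16.3]       # native protein

# Table 13.2: pore diameters (nm) of the packings considered
PORE_DIAMETERS_NM = [10, 30, 100, 400]


def hydrodynamic_diameter_nm(mw_kda, denatured=False):
    """Log-log interpolation of Table 13.1 (assumed interpolation scheme)."""
    if not MW_KDA[0] <= mw_kda <= MW_KDA[-1]:
        raise ValueError("molecular weight outside the range of Table 13.1")
    diameters = D_RANDOM_COIL_NM if denatured else D_GLOBULAR_NM
    # Index of the upper bracketing table row, clamped to a valid interval.
    hi = min(max(bisect_left(MW_KDA, mw_kda), 1), len(MW_KDA) - 1)
    lo = hi - 1
    x0, x1 = math.log10(MW_KDA[lo]), math.log10(MW_KDA[hi])
    y0, y1 = math.log10(diameters[lo]), math.log10(diameters[hi])
    frac = (math.log10(mw_kda) - x0) / (x1 - x0)
    return 10 ** (y0 + frac * (y1 - y0))


def smallest_adequate_pore_nm(mw_kda, denatured=False, factor=3.0):
    """Smallest tabulated pore diameter >= factor x solute diameter, or None."""
    required = factor * hydrodynamic_diameter_nm(mw_kda, denatured)
    for pore in PORE_DIAMETERS_NM:
        if pore >= required:
            return pore
    return None


if __name__ == "__main__":
    for mw in (1, 50, 500):
        for denatured in (False, True):
            print(mw, denatured, smallest_adequate_pore_nm(mw, denatured))
```

Under these assumptions, a 50-kDa globular protein maps to the 30-nm packing and the same protein under denaturing conditions to the 100-nm packing, in line with the practical recommendations in the text. The sketch ignores the accompanying loss of surface area (Table 13.2), which in practice argues against choosing a pore larger than necessary.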
Note that particles with pore diameters ≥100 nm will have reduced surface area, and often exhibit poor mechanical strength.

It should be kept in mind that the hydrodynamic diameter of a protein increases approximately 2- to 3-fold upon denaturation (Table 13.1). Therefore, if proteins are to be separated under denaturing conditions, a larger column-pore size is required (particularly important for size-exclusion chromatography [SEC]).

Figure 13.7 Effect of pore and solute-molecule size on retention in RPC with C18 columns. (a) Small peptides (≈1000 Da; angiotensin I and II); 35% acetonitrile/pH-2.3 phosphate buffer; (b) proteins (>10,000 Da; myoglobin, lysozyme, BSA); 49% acetonitrile/pH-2.3 buffer. Retention factor k (y-axis) is plotted against ligand coverage in μmoles/g (x-axis) for packings with pore diameters of 6, 10, 15, and 30 nm. Note the column-packing pore-diameter at top of each figure; "ligand coverage" (x-axis) is proportional to surface area. Adapted from [10].

13.3.1.2 Particle Size

Columns packed with fully porous, 3.5- to 5-μm particles are currently preferred for small-molecule analytical separations, and these columns are also widely used for biomolecule applications. However, the slow diffusion of biopolymers results in reduced column efficiency and increased peak widths compared to small molecules (Section 2.4.1). This can be counteracted by the use of much lower mobile-phase flow rates, but with correspondingly longer separation times. Several approaches have been pursued in order to improve column efficiency for large biomolecules. As discussed in Section 2.4.1.1, the effects of slow diffusion on column efficiency can always be mitigated by the use of smaller particles, with particles as small as 1.5 μm finding increasing use for large-molecule separations.
A further improvement in the plate number N for large biomolecules can be achieved with small-diameter, nonporous ("pellicular") particles (Section 5.2.1.1). The absence of pores in these packings eliminates slow diffusion within the pores, while the external surface area (although quite small) may provide sufficient column capacity for the analysis of major sample components. So-called superficially porous particles (Section 5.2.1.1) have a solid core, which speeds up the movement of molecules into and out of the particle pores, with an increase in column plate number N but only a small decrease in column capacity. So-called perfusion chromatography (Section 5.2.1.1 [12]) uses packings that contain very large through-pores that allow flow of mobile phase through the particle. In principle, this can also minimize the effects of slow diffusion into and out of the particle, although these columns are used mainly for preparative separations of large biomolecules.

Finally, the replacement of packed beds with a monolith (Section 5.2.4) is still another option. Monolithic columns consist of a continuous, interconnected skeleton with through-pores for transport of mobile phase and solutes through the column. As a result, monoliths can be operated at high flow rates with modest pressures, and with little decrease in column efficiency. Both polymer-based and silica monoliths are commercially available. Polymeric monoliths include polymethacrylate and polystyrene-divinylbenzene materials, and they are available in both column and disk formats for analytical and preparative chromatography. These materials also contain a bimodal pore structure of large and small pores.

13.3.1.3 Support Characteristics and Stability

Porous silica has several properties that favor its use as a chromatographic support (Section 5.2.2). Unfortunately, silica has other properties that limit its use for the separation of biomolecules.
The separation of peptides by RPC is generally performed under acidic conditions (pH < 3), and ion-exchange separations are often carried out under neutral or alkaline conditions (pH > 7); some silica-based columns experience reduced stability outside the limits of 2.5 ≤ pH ≤ 7.5 (Sections 5.3.1, 5.3.2.1). Separations are sometimes carried out at elevated temperature in order to improve peak shapes or optimize selectivity, but bonded-phase silicas exhibit decreased stability at temperatures above 40 °C, especially at extremes of pH. However, the use of suitable columns and other conditions can reduce the adverse effects of mobile-phase pH and temperature on column stability.

Another major potential problem with the use of silica-based columns for biomolecule separations can arise from strong interactions between the silica surface and the solute, resulting in wide, tailing peaks and loss of sample due to irreversible adsorption. Problems of this kind are much more pronounced for older, "type-A" columns; higher-purity, "type-B" columns are therefore strongly recommended (Section 5.2.2.2). End-capped RPC columns (Section 5.3.1) are also more effective in minimizing undesirable sample-column interactions. Several strategies have been pursued to improve the performance of bonded-phase silicas for biopolymer chromatography, and present-day silica-based RPC columns are the columns of choice for most peptide separations.

Because of the limitations of silica-based packings for other modes of chromatography (e.g., ion exchange and hydrophobic interaction), however, polymeric column packings are used mainly for these applications (Section 5.2.3). Polymers such as polystyrene-divinylbenzene (PS-DVB) and polymethacrylate can be formed into porous particles that can be used directly for chromatography (e.g., PS-DVB for RPC). Alternatively, polymeric columns can be functionalized so as to introduce specific groups (e.g., ionic moieties for ion exchange; Section 7.5.4).
These packings are stable over a broad pH range (including pH < 2, pH > 10) and can be used at higher temperatures that would destroy a silica-based column. They are also stable to pressures of 4000 to 5000 psi. A major advantage of polymeric materials for preparative and process-scale applications is the ability to clean them with strong bases, in order to remove contaminants such as endotoxins. Endotoxins (typically lipopolysaccharides from host cells used in the production of biopharmaceuticals) cause inflammatory responses, and their introduction into drug products destined for human use must be avoided.

13.3.1.4 Recovery of Mass and Biological Activity

In applications where HPLC is used to isolate material for further characterization or other uses, the analyte must be recovered with good yield. If bioactivity is to be preserved, the biopolymer must also maintain its native conformation. These requirements call for careful selection of the chromatographic mode and separation conditions. RPC can be denaturing for proteins, and generally it is not used for the recovery of larger proteins. However, denatured peptides or small proteins from an RPC separation can usually be restored to full bioactivity by exposure to organic-free buffer with appropriate ionic conditions. The use of other chromatographic modes with harsh elution conditions (extremes of pH, elevated temperature) can also compromise the recovery of intact, active species.

The mass recovery of a polypeptide can be reduced by its adsorption to active sites on the packing, or by entrapment within the pore system. Sample loss due to adsorption can be minimized by pretreatment of the column with a surrogate biopolymer (e.g., bovine serum albumin for proteinaceous samples), in order to deactivate the column prior to use.
Sample loss due to protein unfolding within the pore system can also be minimized by using large-pore supports or less denaturing conditions. The tendency of HPLC conditions to denature a protein can be ordered as follows: RPC > HIC ≈ IEC > SEC. For further details on polypeptide recovery, see [13].

13.3.2 Role of Protein Structure in Chromatographic Behavior

Protein retention in HPLC can be understood as an interplay between protein structure and the chromatographic process. In the case of small solute molecules, most parts of the molecule are in contact with the stationary phase. For large peptides and especially proteins, this may not be possible because only part of the surface of these three-dimensional molecules (their contact area) can be in contact with the stationary phase. Several resulting retention relationships were described in a seminal publication by Regnier [14], and these can be summarized by a series of postulates:

• The weak chemical forces that govern protein conformation and surface recognition (ionic, hydrophobic, hydrogen bonding) are the same as those involved in chromatographic interactions.
• It is not possible for all the amino acids in a protein to simultaneously contact the stationary-phase surface (even more so for the native molecule).
• Only residues located at the protein surface have an impact on chromatographic behavior, and only a fraction of those residues (those within the contact area) are involved in stationary-phase interactions.
• The heterogeneous distribution of residues on the protein surface allows some portions of the surface to dominate chromatographic behavior; these interactive regions may not be the same for different chromatographic modes.
• Structural changes that alter the protein surface can change chromatographic behavior if they occur within the contact area or alter the surface of the contact area.
• Interaction with the stationary or mobile phase can alter protein secondary, tertiary, and quaternary structure.

We will return to these concepts in our following discussion of different chromatographic modes.

13.4 SEPARATION OF PEPTIDES AND PROTEINS

The selection of the appropriate mode for peptide and protein chromatography is dictated by the goals of the separation. If high recovery of protein mass and biological activity is required, relatively gentle chromatographic techniques, such as ion exchange, hydrophobic interaction, and size exclusion, are preferred. If the aim is to resolve proteins based on size, or carry out a class separation between large and small molecules, size-exclusion chromatography is the method of choice. RPC is used less often for the purification of larger proteins, because of its tendency to denature them, which can degrade separation and compromise both mass recovery and biological activity. For the purification of peptides and smaller proteins, however, RPC has been remarkably successful; it is the universal first choice for separating peptide mixtures. In the area of proteomics, which seeks to characterize the entire protein composition of a cell or tissue, extraordinarily complex mixtures of peptides must be separated prior to MS-MS analysis (Section 4.14); two-dimensional (2-D) HPLC based on ion exchange followed by RPC is often used (Section 13.4.5).

13.4.1 Reversed-Phase Chromatography (RPC)

Several features of reversed-phase chromatography (RPC) are responsible for its wide use in the analysis and purification of peptides and proteins. The high efficiency of RPC columns packed with small particles provides increased resolution and high peak capacities (Section 9.3.9.1) for the separation of complex mixtures. The selectivity of RPC also favors the resolution of peptides with very similar structures. The solvents used in RPC are compatible with UV detection, and their volatility permits solvent removal from recovered fractions.
Most important, aqueous-organic mobile phases are compatible with electrospray ionization, so RPC is almost always the technique used with detection by mass spectrometry (Section 4.14.1.1). A large selection of mobile phases and columns is available for solving a given separation problem, although in practice a few generic methods are used for most samples.

An important feature of peptide, and especially protein, behavior in RPC is a strong dependence of solute retention on small changes in solvent strength (Section 13.4.1.4). For example, a 29-kDa protein has been shown to exhibit a 20% change in retention time for a variation of only ±0.1% organic solvent concentration [15]. This usually makes the isocratic separation of even simple polypeptide mixtures impractical; gradient elution is almost always required.

13.4.1.1 Column Selection

Silica-based RPC packings exhibit acceptable stability for commonly used separation conditions, and they are often preferred over polymer-based materials because of their higher column efficiency. Small-pore silicas (≈10 nm diameter) are satisfactory for peptides, but large-pore silicas (pore diameter ≥ 30 nm) are preferred for protein separations. Proteolytic digests can contain peptides of widely varying sizes (200 to ≈2500 Da for tryptic digests, larger for LysC and AspN digests), so the use of packings with 30-nm pores may be necessary for some peptide samples.

Other considerations in column selection are the stationary-phase ligand and its bonding density. The average selectivity of columns with different ligands is summarized in Table 5.8a and the related text, including values of the column-hydrophobicity parameter H. The most popular columns are based on straight-chain alkyl groups (C4, C8, C18), which exhibit increased retention and stability with increasing alkyl length.
Trimethyl (C1) columns are the least hydrophobic (H = 0.41) and have been used for the separation of proteins that are too strongly retained on longer-ligand (more hydrophobic) columns. However, under conditions usually employed for protein separations, C1 columns are easily hydrolyzed and quite unstable; octyl (C8, H = 0.84) and octadecyl (C18, H = 0.99) columns are generally the first choice for separating peptides, because of their greater stability and suitable retention characteristics. For gradient separations of peptides and proteins, the concentration of the B-solvent rarely exceeds 60–80%B. Under these conditions the selectivities of butyl, octyl, and octadecyl columns for proteins and peptides appear comparable [16]. Phenyl and cyano columns are likely to exhibit different selectivity than straight-chain alkyl columns, but cyano columns are generally less stable when used at either low or high pH. Butyl columns are much less hydrophobic and less retentive; these columns are therefore preferred for very hydrophobic species, such as membrane proteins and the more hydrophobic polypeptides generated by cyanogen bromide cleavage.

Monomeric columns (Section 5.3.1) exhibit higher efficiencies, and they are usually a first choice. Polymeric columns are more stable but may exhibit lot-to-lot variability due to poor reproducibility of the polymerization reaction. The concentration of the ligand (μmoles/m²) also affects column stability and the effects of residual surface silanol groups.

13.4.1.2 Mobile-Phase Selection

In RPC the mobile phase consists of aqueous buffer (A-solvent) and an organic solvent (B-solvent). For biochemical separations, the "buffer" is often a dilute acid such as phosphoric, trifluoroacetic, formic, or acetic, with a pH of 2 to 3.5. Sample retention in isocratic elution (values of k) can be controlled by varying %B, as in the case of small-molecule samples (Section 2.3.2).
For gradient separations, an increase in gradient steepness leads to reduced values of k* and usually poorer separation (see Section 13.4.1.4 below). The mobile-phase composition can affect separation selectivity, detector compatibility, and (to a lesser extent) column efficiency.
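The dependence of retention on %B and on gradient steepness described in this section can be illustrated with the widely used linear-solvent-strength (LSS) approximation: log k = log kw − Sφ for isocratic elution and k* = 0.87 tG F/(Vm Δφ S) for gradient elution, where φ is the volume fraction of B-solvent. The Python sketch below is illustrative only. The function names and every numerical value (S, column dead volume Vm, flow rate F, gradient time tG, gradient range Δφ) are assumptions chosen for the example and are not data from this chapter or from [15].

```python
def relative_change_in_k(S, d_phi):
    """Fractional increase in isocratic k when phi is lowered by d_phi,
    assuming log10(k) = log10(kw) - S * phi (LSS approximation)."""
    return 10 ** (S * d_phi) - 1


def gradient_k_star(t_g_min, flow_ml_min, v_m_ml, delta_phi, S):
    """Gradient retention factor k* = 0.87 * tG * F / (Vm * delta_phi * S)."""
    return 0.87 * t_g_min * flow_ml_min / (v_m_ml * delta_phi * S)


if __name__ == "__main__":
    # Assumed S values: about 4 for a small molecule, 40 for a protein.
    for label, S in (("small molecule", 4.0), ("protein", 40.0)):
        change = relative_change_in_k(S, d_phi=0.001)  # 0.1 %B change
        print(f"{label}: S = {S:g}, change in k = {100 * change:.1f}%")

    # Assumed column and gradient: Vm = 1.5 mL, F = 1 mL/min, 5-55 %B.
    for t_g in (15.0, 30.0, 60.0):
        k_star = gradient_k_star(t_g, 1.0, 1.5, 0.5, S=40.0)
        print(f"tG = {t_g:g} min -> k* = {k_star:.2f}")
```

In this model a larger S (as assumed here for the protein) magnifies the effect of a small change in %B on isocratic retention, which is why gradient elution is normally needed for polypeptides. A steeper gradient (shorter tG for the same Δφ) lowers k* in direct proportion.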