(Figure 28.30). Less often, 5-BU is inserted into DNA at cytosine sites, not T sites. Then, if it base-pairs in its keto form, mimicking T, a C–G to T–A transition ensues.

The adenine analog 2-aminopurine (2-AP; recall that adenine is 6-aminopurine) normally behaves like A and base-pairs with T. However, 2-AP can form a single H bond of sufficient stability with cytosine (Figure 28.31) that occasionally C replaces T in DNA replicating in the presence of 2-AP. Hypoxanthine (Figure 28.32) is an adenine analog that arises in situ in DNA through oxidative deamination of A. Hypoxanthine base-pairs with cytosine, creating an A–T to G–C transition.

Chemical Mutagens React with the Bases in DNA

Chemical mutagens are agents that chemically modify bases so that their base-pairing characteristics are altered. For instance, nitrous acid (HNO₂) causes the oxidative deamination of primary amine groups in adenine and cytosine. Oxidative deamination of cytosine yields uracil, which base-pairs the way T does and gives a C–G to T–A transition (Figure 28.33a). Hydroxylamine specifically causes C–G to T–A transitions because it reacts specifically with cytosine, converting it to a derivative that base-pairs with adenine instead of guanine (Figure 28.33c).

Alkylating agents (Figure 28.33e) are also chemical mutagens. Alkylation of reactive sites on the bases to add methyl or ethyl groups alters their H bonding and hence base pairing. For example, methylation of O⁶ on guanine (giving O⁶-methylguanine) causes this G to mispair with thymine, resulting in a G–C to A–T transition (Figure 28.33d). Alkylating agents can also induce point mutations of the transversion type. Alkylation of N⁷ of guanine labilizes its N-glycosidic bond, which leads to elimination of the purine ring, creating a gap in the base sequence.
An enzyme, AP endonuclease, then cleaves the sugar–phosphate backbone of the DNA on the 5′-side, and the gap can be repaired by enzymatic removal of the 5′-sugar–phosphate and insertion of a new nucleotide (see Figure 28.28). A transversion results if a pyrimidine nucleotide is inserted in place of the purine during enzymatic repair of this gap.

Insertions and Deletions

The addition or removal of one or more base pairs leads to insertion or deletion mutations, respectively. Either shifts the triplet reading frame of codons, causing frameshift mutations (misincorporation of all subsequent amino acids) in the protein encoded by the gene. Such mutations can arise if flat aromatic molecules such as acridine orange insert themselves between successive bases in one or both strands of the double helix. This insertion or, more aptly, intercalation, doubles the distance between the bases as measured along the helix axis (see Figure 11.12). This distortion of the DNA results in inappropriate insertion or deletion of bases when the DNA is replicated. Disruptions that arise from the insertion of a transposon within a gene also fall in this category of mutation (see Figure 28.23).

28.10 Do Proteins Ever Behave as Genetic Agents?

Prions Are Proteins That Can Act as Genetic Agents

DNA is the genetic material in organisms, although some viruses have RNA genomes. The idea that proteins could carry genetic information was considered early in the history of molecular biology and dismissed for lack of evidence. Prions may be an exception to this rule. Prion is an acronym derived from the words proteinaceous infectious particle. The term prion was coined to distinguish such particles, which are pathogenic and thus capable of causing disease, from nucleic acid–containing infectious particles such as viruses and virions. Prions are transmissible agents (genetic material?) that are apparently composed only of a protein that has adopted an abnormal conformation.
They produce fatal degenerative diseases of the central nervous system in mammals and are believed to be the agents responsible for the human diseases kuru, Creutzfeldt-Jakob disease, Gerstmann-Sträussler-Scheinker syndrome, and fatal familial insomnia. Prions also cause diseases in animals, including scrapie (in sheep), “mad cow disease” (bovine spongiform encephalopathy), and chronic wasting disease (in elk and mule deer). All attempts to show that the infectivity of these diseases is due to a nucleic acid–carrying agent have been unsuccessful. Prion diseases are novel in that they are genetic and infectious; their occurrence may be sporadic, dominantly inherited, or acquired by infection.

Chapter 28 DNA Metabolism: Replication, Recombination, and Repair

FIGURE 28.31 (a) 2-Aminopurine normally base-pairs with T but (b) may also pair with cytosine through a single hydrogen bond.

FIGURE 28.32 Oxidative deamination of adenine in DNA yields hypoxanthine, which base-pairs with cytosine, resulting in an A–T to G–C transition. (Hypoxanthine is shown in its keto tautomeric form.)

FIGURE 28.33 Chemical mutagens. (a) HNO₂ (nitrous acid) converts cytosine to uracil and adenine to hypoxanthine. (b) Nitrosoamines, organic compounds that react to form nitrous acid, also lead to the oxidative deamination of A and C. (c) Hydroxylamine (NH₂OH) reacts with cytosine, converting it to a derivative that base-pairs with adenine instead of guanine. The result is a C–G to T–A transition. (d) Alkylation of G residues to give O⁶-methylguanine, which base-pairs with T. (e) Alkylating agents include nitrosoamines, nitrosoguanidines, nitrosoureas, alkyl sulfates, and nitrogen mustards. Panel (e) shows dimethylnitrosoamine, diethylnitrosoamine, N-methyl-N′-nitro-N-nitrosoguanidine, ethyl nitrosourea, a nitrogen mustard, ethylmethane sulfonate, and dimethyl sulfate. Note that nitrosoamines are mutagenic in two ways: They can react to yield HNO₂, or they can act as alkylating agents. The nitrosoguanidine, N-methyl-N′-nitro-N-nitrosoguanidine, is a very potent mutagen used in laboratories to induce mutations in experimental organisms such as Drosophila melanogaster. Ethylmethane sulfonate (EMS) and dimethyl sulfate are also favorite mutagens among geneticists.

PrP, the prion protein, comes in various forms, such as PrPc, the normal cellular prion protein, and PrPsc, the scrapie form of PrP, a conformational variant of PrPc that is protease resistant, sometimes written as PrPres. These two forms are thought to differ only in terms of their secondary and tertiary structure. One model suggests that PrPc is dominated by α-helical elements (Figure 28.34a), whereas PrPsc has both α-helices and β-strands (Figure 28.34b). It has been hypothesized that the presence of PrPsc can cause PrPc to adopt the PrPsc conformation.
The various diseases are a consequence of the abnormal PrPsc form, which accumulates as amyloid plaques (amyloid = starchlike) that destroy tissues in the central nervous system. Ironically, recent evidence suggests that PrPc may function as a nucleic acid-binding protein. The 1997 Nobel Prize in Physiology or Medicine was awarded to Stanley B. Prusiner for his discovery of prions.

FIGURE 28.34 Speculative models suggest that (a) PrPc is mostly α-helical, whereas (b) PrPsc has both α-helices and β-strands. (Adapted from Figure 1 in Prusiner, S. B., 1996. Molecular biology and the pathogenesis of prion diseases. Trends in Biochemical Sciences 21:482–487.)

A DEEPER LOOK

Inteins—Bizarre Parasitic Genetic Elements Encoding a Protein-Splicing Activity

Inteins are parasitic genetic elements found within protein-coding regions of genes. These selfish DNA elements are transcribed and translated along with the flanking host gene sequences. The typical intein protein consists of two domains: One domain is capable of self-catalyzed protein splicing; the other is an endonuclease that mediates the insertion of the intein nucleotide sequence into host genes. After the full protein is synthesized, the intein catalyzes excision of itself from the host protein and ligation of adjacent host polypeptide regions to form the functional protein that the host gene encodes. These adjacent polypeptides are termed exteins (“external proteins”) to distinguish them from the intein (“internal protein”). Inteins have been found across all domains of life—archaea, bacteria, and eukaryotes—although thus far only in unicellular organisms. Inteins vary in size from about 130 to 600 amino acid residues. The protein-splicing function of an intein is found in its N-terminal and C-terminal regions; the endonuclease function that carries out parasitic insertion of the intein sequence into host genes is found in the central part of the intein. Splicing of the protein is an intramolecular process that liberates the intein sequence and ligates the host protein sequences (see accompanying figure).

Inteins are usually found as inserts in highly conserved host genes that have essential functions, such as genes encoding DNA or RNA polymerases, proton-translocating ATPases, or other vital metabolic enzymes. Their location in such genes means that removal of the intein via deletion or genetic rearrangement is more difficult. The endonuclease activity of the intein recognizes a 14- to 40-base-pair sequence in a potential host gene and cleaves the DNA there. During repair of the double-stranded DNA break, the intein gene is copied into the cleavage site, thereby establishing the parasitic genetic element in the host gene.

Transcription and translation of the combined intein-host gene flanking sequences leads to synthesis of a fused intein–extein protein. The intein splices itself out when (1) the C-terminal residue of the N-extein is shifted to the O (or S) atom of a neighboring intein Ser (or Cys) residue, (2) the N-extein C-terminal carbonyl undergoes nucleophilic attack by the O (or S) atom of a Ser (or Cys) residue at the end of the C-extein in a transesterification reaction that creates a branched protein intermediate, (3) cyclization of the intein C-terminal asparagine residue excises the intein, and (4) the two exteins are properly united via a peptide bond when the N-extein C-terminus spontaneously shifts to the C-extein N-terminus to form an intact host protein. Adapted from Paulus, H., 2000. Protein splicing and related forms of protein autoprocessing. Annual Review of Biochemistry 69:447–496; and Gogarten, J. P., et al., 2002. Inteins: Structure, function, and evolution. Annual Review of Microbiology 56:263–287.

SPECIAL FOCUS

Gene Rearrangements and Immunology—Is It Possible to Generate Protein Diversity Using Genetic Recombination?

Animals have evolved a way to exploit genetic recombination in order to generate protein diversity. This development was crucial to the evolution of the immune system. For example, the immunoglobulin genes are a highly evolved system for maximizing protein diversity from a finite amount of genetic information. This diversity is essential for gaining immunity to the great variety of infectious organisms and foreign substances that cause disease.

Cells Active in the Immune Response Are Capable of Gene Rearrangement

Only vertebrates show an immune response. If a foreign substance, called an antigen, gains entry to the bloodstream of a vertebrate, the animal responds via a protective system called the immune response. The immune response involves production of proteins capable of recognizing and destroying the antigen.
This response is mounted by certain white blood cells—the B- and T-cell lymphocytes and the macrophages. B cells are so named because they mature in the bone marrow; T cells mature in the thymus gland. Each of these cell types is capable of gene rearrangement as a mechanism for producing proteins essential to the immune response. Antibodies, which can recognize and bind antigens, are immunoglobulin proteins secreted from B cells. Because antigens can be almost anything, the immune response must have an incredible repertoire of structural recognition. Thus, vertebrates must have the potential to produce immunoglobulins of great diversity in order to recognize virtually any antigen.

Immunoglobulin G Molecules Contain Regions of Variable Amino Acid Sequence

Immunoglobulin G (IgG or γ-globulin) is the major class of antibody molecules found circulating in the bloodstream. IgG is a very abundant protein, amounting to 12 mg per mL of serum. It is a 150-kD α₂β₂-type tetramer. The α or H (for heavy) chain is 50 kD; the β or L (for light) chain is 25 kD. A preparation of IgG from serum is heterogeneous in terms of the amino acid sequences represented in its L and H chains. However, the IgG L and H chains produced from any given B lymphocyte are homogeneous in amino acid sequence. L chains consist of 214 amino acid residues and are organized into two roughly equal segments: the VL and CL regions. The VL designation reflects the fact that L chains isolated from serum IgG show variations in amino acid sequence over the first 108 residues, VL symbolizing this “variable” region of the L polypeptide. The amino acid sequence for residues 109 to 214 of the L polypeptide is constant, as represented by its designation as the “constant light,” or CL, region. The heavy, or H, chains consist of 446 amino acid residues.
As in L chains, the amino acid sequence for the first 108 residues of H polypeptides is variable, hence its designation as the VH region, while residues 109 to 446 are constant in amino acid sequence. This “constant heavy” region consists of three quite equivalent domains of homology designated CH1, CH2, and CH3. Each L chain has two intrachain disulfide bonds: one in the VL region and the other in the CL region. The C-terminal amino acid in L chains is cysteine, and it forms an interchain disulfide bond to a neighboring H chain. Each H chain has four intrachain disulfide bonds, one in each of the four regions. Figure 28.35 presents a diagram of IgG organization. Within the variable regions of the L and H chains, certain positions are hypervariable with regard to amino acid composition. These hypervariable residues occur at positions 24 to 34, 50 to 55, and 89 to 96 in the L chains and at positions 31 to 35, 50 to 65, 81 to 85, and 91 to 102 in the H chains. The hypervariable regions are also called complementarity-determining regions, or CDRs, because it is these regions that form the structural site that is complementary to some part of an antigen’s structure, providing the basis for antibody:antigen recognition.

In the immunoglobulin genes, the arrangement of exons correlates with protein structure. In terms of its tertiary structure, the IgG molecule is composed of 12 discrete collapsed β-barrel domains. Within each domain, alternating β-strands are antiparallel to one another, a pattern known as the Greek key motif. The characteristic structure of this domain is referred to as the immunoglobulin fold (Figure 28.36).
FIGURE 28.35 Diagram of the organization of the IgG molecule. Two identical L chains are joined with two identical H chains. Each L chain is held to an H chain via an interchain disulfide bond. The variable regions (purple) of the four polypeptides lie at the ends of the arms of the Y-shaped molecule. These regions are responsible for the antigen recognition function of the antibody molecules. The actual antigen-binding site is constituted from hypervariable residues within the VL and VH regions. For purposes of illustration, some features are shown on only one or the other L chain or H chain, but all features are common to both chains.

FIGURE 28.36 The characteristic “collapsed β-barrel domain” known as the immunoglobulin fold. The β-barrel structures for both (a) variable and (b) constant regions are shown. (c) A schematic diagram of the 12 collapsed β-barrel domains that make up an IgG molecule. CHO indicates the carbohydrate addition site; Fab denotes one of the two antigen-binding fragments of IgG, and Fc, the proteolytic fragment consisting of the pairs of CH2 and CH3 domains.

Each of IgG’s two heavy chains contributes four of these domains and each of its light chains contributes two.
The four variable-region domains (one on each chain) are encoded by multiple exons, but the eight constant-region domains are each the product of a single exon. All of these constant-region exons are derived from a single ancestral exon encoding an immunoglobulin fold. The major variable-region exon probably derives from this ancestral exon also. Contemporary immunoglobulin genes are a consequence of multiple duplications of the ancestral exon.

The discovery of variability in amino acid sequence in otherwise identical polypeptide chains was surprising and almost heretical to protein chemists. For geneticists, it presented a genuine enigma. They noted that mammals, which can make millions of different antibodies, don’t have millions of different antibody genes. How can the mammalian genome encode the diversity seen in L and H chains?

The Immunoglobulin Genes Undergo Gene Rearrangement

The answer to the enigma of immunoglobulin sequence diversity is found in the organization of the immunoglobulin genes. The genetic information for an immunoglobulin polypeptide chain is scattered among multiple gene segments along a chromosome in germline cells (sperm and eggs). During vertebrate development and the formation of B lymphocytes, these segments are brought together and assembled by DNA rearrangement (that is, genetic recombination) into complete genes. DNA rearrangement, or gene reorganization, provides a mechanism for generating a variety of protein isoforms from a limited number of genes. DNA rearrangement occurs in only a few genes, namely, those encoding the antigen-binding proteins of the immune response—the immunoglobulins and the T-cell receptors. The gene segments encoding the amino-terminal portion of the immunoglobulin polypeptides are also unusually susceptible to mutation events.
The result is a population of B cells whose antibody-encoding genes collectively show great sequence diversity even though a given cell can make only a limited set of immunoglobulin chains. Hence, at least one cell among the B-cell population will likely be capable of producing an antibody that will specifically recognize a particular antigen.

DNA Rearrangements Assemble an L-Chain Gene by Combining Three Separate Genes

The organization of various immunoglobulin gene segments in the human genome is shown in Figure 28.37. L-chain variable-region genes are assembled from two kinds of germline genes: VL and JL (J stands for joining). In mammals, there are two different families of L-chain genes: the κ, or kappa, gene family and the λ, or lambda, gene family; each family has V and J members. These families are on different chromosomes. Humans have 40 functional Vκ genes and 5 functional Jκ genes for the κ light chains and 31 Vλ genes and 4 Jλ genes for the λ light chain. The Vκ and Jκ genes lie upstream from the single Cκ gene that encodes the κ L-chain constant region. Each Vκ gene has its own Lκ segment for encoding the L-chain leader peptide that targets the L chain to the endoplasmic reticulum for IgG assembly and secretion. (This leader peptide is cleaved once the L chain reaches the ER lumen.) The λ family of L-chain genes is organized similarly (Figure 28.37).

In different mature B-lymphocyte cells, Vκ and Jκ genes have joined in different combinations and, along with the Cκ gene, form complete κ L chains with a variety of Vκ regions. However, any given B lymphocyte expresses only one Vκ–Jκ combination. Construction of the mature B-lymphocyte L-chain gene has occurred by DNA rearrangements that combine three genes (L–Vκ,λ, Jκ,λ, Cκ,λ) to make one polypeptide!

DNA Rearrangements Assemble an H-Chain Gene by Combining Four Separate Genes

The first 98 amino acids of the 108-residue, H-chain variable region are encoded by a VH gene.
Each VH gene has an accompanying LH gene that encodes its essential leader peptide. It is estimated that there are 200 to 1000 VH genes and that they can be subdivided into eight distinct families based on nucleotide sequence homology. The members of a particular VH family are grouped together on the chromosome, separated from one another by 10 to 20 kb. In assembling a mature H-chain gene, a VH gene is joined to a D gene (D for diversity), which encodes amino acids 99 to 113 of the H chain. These amino acids make up the core of the third CDR in the variable region of H chains. The VH–D gene assemblage is linked in turn to a JH gene, which encodes the remaining part of the variable region of the H chain.

The VH, D, and JH genes are grouped in three separate clusters on the same chromosome. The four JH genes lie 7 kb upstream of the eight C genes, the closest of which is Cμ. Any of four C genes may encode the constant region of IgG H chains: Cγ1, Cγ2a, Cγ2b, and Cγ3. Each C gene is composed of multiple exons (only Cμ is shown in Figure 28.37, none of the other C genes). Ten to twenty D genes are found 1 to 80 kb farther upstream. The VH genes lie even farther upstream. In B lymphocytes, the variable region of an H-chain gene is composed of one each of the LH–VH genes, a D gene, and a JH gene joined head to tail. Because the H-chain variable region is encoded in three genes and the joinings can occur in various combinations, the H chains have a greater potential for diversity than the L-chain variable regions that are assembled from just two genes (for example, Lκ–Vκ and Jκ). In making H-chain genes, four genes have been brought together and reorganized by DNA rearrangement to produce a single polypeptide!
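The counting argument behind combinatorial joining can be made concrete with a short calculation. The sketch below is illustrative only. It takes the segment counts quoted above (40 V and 5 J genes for κ light chains, 31 V and 4 J genes for λ light chains, and 200 to 1000 VH, 10 to 20 D, and 4 JH genes for heavy chains), assumes that every combination is equally possible, and ignores imprecise joining and somatic mutation. The function name `combinations` and the variable names are assumptions made for this example.

```python
from math import prod


def combinations(*segment_counts: int) -> int:
    """Number of distinct joinings when one segment is drawn from each pool."""
    return prod(segment_counts)


# Segment counts quoted in the text (illustrative inputs, not measured here).
kappa = combinations(40, 5)             # V-kappa x J-kappa
lam = combinations(31, 4)               # V-lambda x J-lambda
heavy_low = combinations(200, 10, 4)    # lower estimates: VH x D x JH
heavy_high = combinations(1000, 20, 4)  # upper estimates: VH x D x JH

print(f"kappa light-chain V-J joinings:  {kappa}")
print(f"lambda light-chain V-J joinings: {lam}")
print(f"heavy-chain V-D-J joinings:      {heavy_low} to {heavy_high}")
```

Under these simplifying assumptions, even the lower heavy-chain estimate exceeds either light-chain total by more than an order of magnitude, which is consistent with the point that the extra D segment gives H chains a greater potential for diversity.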
V–J and V–D–J Joining in Light- and Heavy-Chain Gene Assembly Is Mediated by the RAG Proteins

Specific nucleotide sequences adjacent to the various variable-region genes suggest a mechanism in which these sequences act as joining signals. All germline V and D genes are followed by a consensus CACAGTG heptamer separated from a consensus ACAAAAACC nonamer by a short, nonconserved 23-bp spacer. Likewise, all germline D and J genes are immediately preceded by a consensus GGTTTTTGT nonamer separated from a consensus CACTGTG heptamer by a short nonconserved 12-bp spacer (Figure 28.38). Note that the consensus elements downstream of a gene are complementary to those upstream from the gene with which it recombines. Indeed, it is these complementary consensus sequences that serve as recombination signal sequences (RSSs) and determine the site of recombination between variable-region genes. Functionally meaningful recombination happens only between a gene whose RSS has a 12-bp spacer and a gene whose RSS has a 23-bp spacer (Figure 28.38).

Lymphoid cell-specific recombination-activating gene proteins 1 and 2 (RAG1 and RAG2) recognize and bind at these RSSs, presumably through looping out of the 12- and 23-bp spacers and alignment of the homologous heptamer and nonamer regions (Figure 28.39). RAG1 and RAG2 together function as the V(D)J recombinase. RAG1/RAG2 action cleaves and processes the ends of the V and J DNA, producing what is effectively a double-stranded break (DSB). Proteins involved in NHEJ-type repair of DSBs (Ku70/80, DNA-PK, and DNA ligase) bind at the DSB and religate the DNA to create a recombinant immunoglobulin gene (Figure 28.39).

FIGURE 28.37 Organization of human immunoglobulin gene segments. Green, orange, blue, or purple colors indicate the exons of a particular VL or VH gene. (a) L-chain gene assembly: During B-lymphocyte maturation in the bone marrow, one of the 40 Vκ genes combines with one of the 5 Jκ genes and is joined with a Cκ gene. During the recombination process, the intervening DNA between the gene segments is deleted (see Figure 28.39). These rearrangements occur by a mostly random process, giving rise to many possible light-chain sequences from each gene family. (b) H-chain gene assembly: H chains are encoded by V, D, J, and C genes. In H-chain gene rearrangements, a D gene joins with a J gene and then one of the V genes adds to the DJ assembly. (Adapted from Figure 2b and c in Nossal, G. J. V., 2003. The double helix and immunology. Nature 421:440–444.)

FIGURE 28.38 Consensus elements are located above and below germline variable-region genes that recombine to form genes encoding immunoglobulin chains. These consensus elements are complementary and are arranged in a heptamer-nonamer, 12- to 23-bp spacer pattern. (Adapted from Tonegawa, S., 1983. Somatic generation of antibody diversity. Nature 302:575.)

FIGURE 28.39 Model for V(D)J recombination. A RAG1:RAG2 complex is assembled on DNA in the region of recombination signal sequences (a), and this complex introduces double-stranded breaks in the DNA at the borders of protein-coding sequences and the recombination signal sequences (b). The products of RAG1:RAG2 DNA cleavage are novel: The DNA bearing the recombination signal sequences has blunt ends, whereas the coding DNA has hairpin ends. That is, the two strands of the V and J coding DNA segments are covalently joined as a result of transesterification reactions catalyzed by RAG1:RAG2. To complete the recombination process, the two RSS ends are precisely joined to make a covalently closed circular dsDNA, but the V and J coding ends undergo further processing (c). Coding-end processing involves opening of the V and J hairpins and the addition or removal of nucleotides. This processing means that joining of the V and J coding ends is imprecise, providing an additional means for introducing antibody diversity. Finally, the V and J coding segments are joined to create a recombinant immunoglobulin-encoding gene (d). The processing and joining reactions require RAG1:RAG2, DNA-dependent protein kinase (DNA-PK)—Ku70, Ku80, and DNA ligase. (Adapted from Figure 1 in Weaver, D. T., and Alt, F. W., 1997. From RAGs to stitches. Nature 388:428–429.)

Imprecise Joining of Immunoglobulin Genes Creates New Coding Arrangements

Joining of the ends of the immunoglobulin-coding regions during gene reorganization is somewhat imprecise. This imprecision actually leads to even greater antibody diversity because new coding arrangements result. Position 96 in κ chains is typically encoded by the first triplet in the Jκ element. Most κ chains have one of four amino acids here, depending on which Jκ gene was recruited in gene assembly.
However, occasionally only the second and third bases or just the third base of the codon for position 96 is contributed by the Jκ gene, with the other one or two nucleotides supplied by the Vκ segment (Figure 28.40). So, the precise point where recombination occurs during gene reorganization can vary over several nucleotides, creating even more diversity.

FIGURE 28.40 Recombination between the Vκ and Jκ genes can vary by several nucleotides, giving rise to variations in amino acid sequence and hence diversity in immunoglobulin L chains. The panels show the V sequence GTTCATCTTCGA and the J sequence ATGGCAAGCTTG joined at different points, giving residues 94 to 97 as Val-Gln-Ser-Leu, Val-His-Ser-Leu, Val-His-Arg-Leu, or Val-His-Leu-Leu.

Antibody Diversity Is Due to Immunoglobulin Gene Rearrangements

Taking as an example the mouse with perhaps 300 Vκ genes, 4 Jκ genes, 200 VH genes, 12 D genes, and 4 JH genes, the number of possible combinations is given by 300 × 4 × 200 × 12 × 4 = 11,520,000. Thus, more than 10⁷ different antibody molecules can be created from roughly 500 different mouse variable-region genes. Including the possibility for Vκ–Jκ joinings occurring within codons adds to this diversity, as does the high rate of somatic mutation associated with the variable-region genes. (Somatic mutations are mutations that arise in diploid cells and are transmitted to the progeny of these cells within the organism, but not to the offspring of the organism.) Clearly, gene rearrangement is a powerful mechanism for dramatically enhancing the protein-coding potential of genetic information.

SUMMARY

28.1 How Is DNA Replicated? DNA replication is accomplished through strand separation and the copying of each strand. Strand separation is achieved by untwisting the double helix. Each separated strand acts as a template for the synthesis of a new complementary strand whose nucleotide sequence is fixed by Watson–Crick base-pairing rules.
Base pairing then dictates an accurate replication of the original DNA double helix. DNA replication follows a semiconservative mechanism where each original strand is copied to yield a complete complementary strand and these paired strands, one old and one new, remain together as a duplex DNA molecule. Replication begins at specific regions called origins of replication and proceeds in both directions. Bidirectional replication involves two replication forks, which move in opposite directions. Helicases unwind the double helix, and DNA gyrases act to overcome torsional stress by introducing negative supercoils at the expense of ATP hydrolysis. Because DNA polymerases synthesize DNA only in a 5′→3′ direction, replication is semidiscontinuous: The 3′→5′ strand can be copied continuously by DNA polymerase proceeding in the 5′→3′ direction. The other parental strand is copied only when a sufficient stretch of its sequence has been exposed for DNA polymerase to move along it in the 5′→3′ mode. Thus, one parental strand is copied continuously to form the leading strand, while the other parental strand is copied in an intermittent, or discontinuous, mode to yield a set of Okazaki fragments that are joined later to give the lagging strand.

28.2 What Are the Properties of DNA Polymerases? All DNA polymerases share the following properties: (1) The incoming base is selected within the DNA polymerase active site through base-pairing with the corresponding base in the template strand, (2) chain growth is in the 5′→3′ direction antiparallel to the template strand, and (3) DNA polymerases cannot initiate DNA synthesis de novo—all require a primer with a free 3′-OH to build upon. DNA polymerase III holoenzyme, the enzyme that replicates the E. coli chromosome, is composed of ten different kinds of subunits. DNA polymerases are immobilized in replication factories.

28.3 Why Are There So Many DNA Polymerases?
Both prokaryotic and eukaryotic cells have a number of DNA polymerases. These different enzymes can be assigned to families based on sequence similarities. The various families of DNA polymerases fill different biological roles; the prominent roles include DNA replication, DNA repair, and telomere maintenance. All DNA polymerases share a common architecture resembling a right hand, composed of distinct finger, thumb, and palm structural domains, each serving a specific role in the polymerase reaction.

28.4 How Is DNA Replicated in Eukaryotic Cells? Eukaryotic DNA is organized into chromosomes within the nucleus. These chromosomes must be replicated once (and only once!) each cell cycle. Progression through the cell cycle is regulated through checkpoints that control whether the cell continues into the next phase. Cyclins and CDKs maintain these checkpoints. Replication licensing factors (MCM proteins) interact with origins of replication and render chromosomes competent for replication. Three DNA polymerases—α, δ, and ε—carry out genome replication. DNA polymerase α initiates replication through synthesis of an RNA primer. DNA polymerase δ is the principal DNA polymerase in eukaryotic DNA replication.

28.5 How Are the Ends of Chromosomes Replicated? Telomeres are short, tandemly repeated, G-rich nucleotide sequences that form pro-