tective caps on the chromosome ends. DNA polymerases cannot replicate the extreme 5′-ends of chromosomes, but a special polymerase called telomerase maintains telomere length. Telomerase is a ribonucleoprotein, and its RNA component serves as template for telomere synthesis.

28.6 How Are RNA Genomes Replicated? Many viruses have genomes composed of RNA, not DNA. DNA may be an intermediate in the replication of such viruses; that is, viral RNA serves as the template for DNA synthesis. This reaction is catalyzed by reverse transcriptase, an RNA-dependent DNA polymerase.

28.7 How Is the Genetic Information Shuffled by Genetic Recombination? Genetic recombination is the exchange (or incorporation) of one DNA sequence with (or into) another. Recombination between very similar DNA sequences is called homologous recombination. Homologous recombination proceeds according to the Holliday model. The RecBCD enzyme complex unwinds dsDNA and cleaves its single strands. RecA protein acts in recombination to catalyze the ATP-dependent DNA strand exchange reaction. Procession of strand separation and re-pairing into hybrid strands along the DNA duplex initiates branch migration, displacing the homologous DNA strand from the DNA duplex and replacing it with the ssDNA strand. RuvA, RuvB, and RuvC resolve the Holliday junction to form the recombination products. DNA replication is an essential component of both DNA recombination and DNA repair processes. Furthermore, recombination mechanisms can restart replication forks that have halted at breaks or other lesions in the DNA strands. Transposons are mobile DNA segments ranging in size from several hundred base pairs to more than 8 kbp that move enzymatically from place to place in the genome.

28.8 Can DNA Be Repaired? Repair systems correct damage to DNA in order to maintain its information content. The most common forms of damage are (1) replication errors, (2) deletions or insertions, (3) UV-induced alterations, (4) DNA strand breaks, and (5) covalent crosslinking of strands. Cells have extraordinarily diverse and effective DNA repair systems to deal with these problems, some of which are also involved in DNA replication and recombination. When repair systems fail, the genome may still be preserved if an “error-prone” mode of replication allows the lesion to be bypassed.

28.9 What Is the Molecular Basis of Mutation? Mutations change the sequence of bases in DNA, either by the substitution of one base pair for another (point mutations) or by the insertion or deletion of one or more base pairs. Point mutations arise by the pairing of bases with inappropriate partners during DNA replication, by the introduction of base analogs into DNA, or by chemical mutagens. Chemical mutagens are agents that chemically modify bases so that their base-pairing characteristics are altered.

28.10 Do Proteins Ever Behave as Genetic Agents? DNA is the genetic material in organisms, although some viruses have RNA genomes. The possibility that proteins carry genetic information was a point of speculation early in the history of molecular biology but was ultimately discounted for lack of evidence. Prions (an acronym for proteinaceous infectious particle) may be an exception. Prion particles are devoid of nucleic acid, yet they can transmit disease. Prion diseases are novel because they may be either inherited (like a genetic agent) or acquired by infection. PrP, the prion protein, comes in several forms: PrPᶜ, the normal cellular protein, and a conformational variant of PrPᶜ known as PrPˢᶜ or PrPʳᵉˢ found in association with prion diseases. The propensity of the PrPˢᶜ conformational form to polymerize into cell-destructive aggregates is thought to be the basis of prion diseases.
Special Focus: Gene Rearrangements and Immunology. Is It Possible to Generate Protein Diversity Using Genetic Recombination? Animals have evolved a way to exploit genetic recombination in order to generate protein diversity. The immunoglobulin genes are a highly evolved system for maximizing protein diversity from a finite amount of genetic information. Cells active in the immune response are capable of gene rearrangements. The antibody diversity found in IgG molecules is a prime example of proteins produced via gene rearrangements. IgG L-chain genes are created by combining three separate genes, and H-chain genes by combining four. V–J and V–D–J joining in L- and H-chain gene assembly is mediated by RAG proteins.

Chapter 28 DNA Metabolism: Replication, Recombination, and Repair

PROBLEMS

1. If ¹⁵N-labeled E. coli DNA has a density of 1.724 g/mL, ¹⁴N-labeled DNA has a density of 1.710 g/mL, and E. coli cells grown for many generations on ¹⁴NH₄⁺ as a nitrogen source are transferred to media containing ¹⁵NH₄⁺ as the sole N-source, (a) what will be the density of the DNA after one generation, assuming replication is semiconservative? (b) Supposing replication took place by a dispersive mechanism, what would be the density of DNA after one generation? (c) Design an experiment to distinguish between semiconservative and dispersive modes of replication.
2. (a) What are the respective roles of the 5′-exonuclease and 3′-exonuclease activities of DNA polymerase I? (b) What might be a feature of an E. coli strain that lacked DNA polymerase I 3′-exonuclease activity?
3. Assuming DNA replication proceeds at a rate of 750 base pairs per second, calculate how long it will take to replicate the entire E. coli genome. Under optimal conditions, E. coli cells divide every 20 minutes. What is the minimal number of replication forks per E. coli chromosome in order to sustain such a rate of cell division?
4. On the basis of Figure 28.2, draw a simple diagram illustrating replication of the circular E. coli chromosome (a) at an early stage, (b) when one-third completed, (c) when two-thirds completed, and (d) when almost finished, assuming the initiation of replication at oriC has occurred only once. Then, draw a diagram showing the E. coli chromosome in problem 3 where the E. coli cell is dividing every 20 minutes.
5. It is estimated that there are forty molecules of DNA polymerase III per E. coli cell. Is it likely that the growth rate of E. coli is limited by DNA polymerase III availability?
6. Approximately how many Okazaki fragments are synthesized in the course of replicating an E. coli chromosome? How many in replicating an “average” human chromosome?
7. How do DNA gyrases and helicases differ in their respective functions and modes of action?
8. Assuming DNA replication proceeds at a rate of 100 base pairs per second in human cells and origins of replication occur every 300 kbp, how long would it take to replicate the entire diploid human genome? How many molecules of DNA polymerase does each cell need to carry out this task?
9. From the information in Figure 28.17, diagram the recombinational event leading to the formation of a heteroduplex DNA region within a bacteriophage chromosome.
10. Homologous recombination in E. coli leads to the formation of regions of heteroduplex DNA. By definition, such regions contain mismatched bases. Why doesn’t the mismatch repair system of E. coli eliminate these mismatches?
11. If RecA protein unwinds duplex DNA so that there are about 18.6 bp per turn, what is the change in Δ, the helical twist of DNA, compared to its value in B-DNA?
12. Diagram a Holliday junction between two duplex DNA molecules and show how the action of resolvase might give rise to either patch or splice recombinant DNA molecules.
13. Show the nucleotide sequence changes that might arise in a dsDNA (coding strand segment GCTA) upon mutagenesis with (a) HNO₂, (b) bromouracil, and (c) 2-aminopurine.
14. Transposons are mutagenic agents. Why?
15. Give a plausible explanation for the genetic and infectious properties of PrPˢᶜ.
16. Hexameric helicases, such as DnaB, the MCM proteins, and papilloma virus E1 helicase (illustrated in Figures 16.23–16.25), unwind DNA by passing one strand of the DNA duplex through the central pore, using a mechanism based on ATP-dependent binding interactions with the bases of that strand. The genome of E. coli K12 consists of 4,686,137 nucleotides. Assuming that DnaB functions like papilloma virus E1 helicase, from the information given in Chapter 16 on ATP-coupled DNA unwinding, calculate how many molecules of ATP would be needed to completely unwind the E. coli K12 chromosome.
17. Asako Furukohri, Myron F. Goodman, and Hisaji Maki wanted to discover how the translesion DNA polymerase IV takes over from DNA polymerase III at a stalled replication fork (see Journal of Biological Chemistry 283:11260–11269, 2008). They showed that DNA polymerase IV could displace DNA polymerase III from a stalled replication fork formed in an in vitro system containing DNA, DNA polymerase III, the β-clamp, and SSB. Devise your own experiment to show how such displacement might be demonstrated. (Hint: Assume that you have protein identification tools that allow you to distinguish easily between DNA polymerase III and DNA polymerase IV.)
18. The eukaryotic translesion DNA polymerases fall into the Y family of DNA polymerases. Structural studies reveal that their fingers and thumb domains are small and stubby (see Figure 28.10). In addition, Y-family polymerase active sites are more open and less constrained where base pairing leads to selection of a dNTP substrate for the polymerase reaction. Discuss the relevance of these structural differences.
Would you expect Y-family polymerases to have 3′-exonuclease activity? Explain your answer.

Preparing for the MCAT Exam

19. Figure 28.11 depicts the eukaryotic cell cycle. Many cell types “exit” the cell cycle and don’t divide for prolonged periods, a state termed G₀; some, for example neurons, never divide again.
a. In what stage of the cell cycle do you suppose a cell might be when it exits the cell cycle and enters G₀?
b. The cell cycle is controlled by checkpoints, cyclins, and CDKs. Describe how biochemical events involving cyclins and CDKs might control passage of a dividing cell through the cell cycle.
20. Figure 28.40 gives some examples of recombination in IgG codons 95 and 96, as specified by the V and J genes. List the codon possibilities and the amino acids encoded if recombination occurred in codon 97. Which of these possibilities is less desirable?

British Museum, UK/Bridgeman Art Library

29 Transcription and the Regulation of Gene Expression

All cells contain three major classes of RNA: mRNA, ribosomal RNA (rRNA), and transfer RNA (tRNA), and all participate in protein synthesis (see Chapters 10 and 30). Further, all RNAs are synthesized from DNA templates by DNA-dependent RNA polymerases in the process known as transcription. However, only mRNAs direct the synthesis of proteins. Protein synthesis occurs via the process of translation, wherein the instructions encoded in the sequence of bases in mRNA are translated into a specific amino acid sequence by ribosomes, the “workbenches” of polypeptide synthesis (see Chapter 30).
Transcription is tightly regulated in all cells. In prokaryotes, only 3% or so of the genes are undergoing transcription at any given time. The metabolic conditions and the growth status of the cell dictate which gene products are needed at any moment. Similarly, differentiated eukaryotic cells express only a small percentage of their genes in fulfilling their biological functions, not the full genetic potential encoded in their chromosomes.

The Rosetta stone, inscribed in 196 B.C. The writing on the Rosetta stone is in three forms: hieroglyphs, Demotic (the conventional Egyptian script of the time), and Greek (the Greeks ruled Egypt in 196 B.C.). The Rosetta stone represents the transcription of hieroglyphic symbols into two living languages. Shown here is part of the interface where hieroglyphs and Demotic meet.

Now that we have all this useful information, it would be nice to do something with it.
From the Unix Programmer’s Manual

KEY QUESTIONS
29.1 How Are Genes Transcribed in Prokaryotes?
29.2 How Is Transcription Regulated in Prokaryotes?
29.3 How Are Genes Transcribed in Eukaryotes?
29.4 How Do Gene Regulatory Proteins Recognize Specific DNA Sequences?
29.5 How Are Eukaryotic Transcripts Processed and Delivered to the Ribosomes for Translation?
29.6 Can We Propose a Unified Theory of Gene Expression?

ESSENTIAL QUESTION
Expression of the information encoded in DNA depends on transcription of that information into RNA. How are the genes of prokaryotes and eukaryotes transcribed to form RNA products that can be translated into proteins?

29.1 How Are Genes Transcribed in Prokaryotes?

In prokaryotes, virtually all RNA is synthesized by a single species of DNA-dependent RNA polymerase. The only exception is the short RNA primers formed by primase during DNA replication. Like DNA polymerases, RNA polymerase links ribonucleoside 5′-triphosphates (ATP, GTP, CTP, and UTP, represented generically as NTPs) in an order specified by base pairing with a DNA template:

\[ n\,\mathrm{NTP} \longrightarrow (\mathrm{NMP})_n + n\,\mathrm{PP_i} \]

The enzyme moves along a DNA strand in the 3′→5′ direction, joining the 5′-phosphate of an incoming ribonucleotide to the 3′-OH of the previous residue. Thus, the RNA chain grows 5′→3′ during transcription, just as DNA chains do during replication. Subsequent hydrolysis of PPᵢ to inorganic phosphate by the pyrophosphatases present in all cells removes the product PPᵢ, thus making the polymerase reaction thermodynamically favorable.

Prokaryotic RNA Polymerases Use Their Sigma Subunits to Identify Sites Where Transcription Begins

Transcription is initiated in prokaryotes by RNA polymerase holoenzyme, a complex multimeric protein (about 400 kD) large enough to be visible in the electron microscope. Its subunit composition is α₂ββ′σ. After two α-subunits (35 kD each) dimerize, one recruits β and the other β′ to form the clawlike core polymerase structure (Figure 29.1). The two largest subunits, β′ (171 kD) and β (124 kD), perform most of the enzymatic functions. The β-subunit forms most of the upper jaw of the claw and contains the catalytic Mg²⁺-binding site; β′ forms the lower jaw. DNA passes through a 2.7-nm channel between the jaws of the claw. Nucleotide substrates reach the catalytic center through a secondary channel entering on the back side of the enzyme. Binding of the σ-subunit to β′ allows the RNA polymerase to recognize different DNA sequences that act as promoters. A number of related proteins, the sigma (σ) factors, can serve as the σ-subunit. Promoters are nucleotide sequences that identify the location of transcription start sites, where transcription begins.
Both β and β′ contribute to formation of the catalytic site for RNA synthesis. Dissociation of the σ-subunit from the holoenzyme leaves the core polymerase (α₂ββ′), which can transcribe DNA into RNA but is unable to recognize promoters and initiate transcription. Bacteriophage T7 expresses a simpler (monomeric) RNA polymerase (Figure 29.2) that shares the functional characteristics of prokaryotic RNA polymerases.

A DEEPER LOOK
Conventions Used in Expressing the Sequences of Nucleic Acids and Proteins

Certain conventions are useful in tracing the course of information transfer from DNA to protein. The strand of duplex DNA that is read by RNA polymerase is termed the template strand. Thus, the strand that is not read is the nontemplate strand. Because the template strand is read by the RNA polymerase moving 3′→5′ along it, the RNA product, called the transcript, grows in the 5′→3′ direction (see accompanying figure). Note that the nontemplate strand has a nucleotide sequence and direction identical to those of the RNA transcript, except that the transcript has U residues in place of T. Portions of the RNA transcript will eventually be translated into the amino acid sequence of a protein (see Chapter 30) by a process in which successive triplets of bases (termed codons), read 5′→3′, specify a particular amino acid. Polypeptide chains are synthesized in the N→C direction, and the 5′-end of mRNA encodes the N-terminus of the protein. By convention, when the order of nucleotides in DNA is shown as a single strand, it is the 5′→3′ sequence of nucleotides in the nontemplate strand that is presented. Consequently, if convention is followed, DNA sequences are written in terms that correspond directly to mRNA sequences, which correspond in turn to the amino acid sequences of proteins as read beginning with the N-terminus.
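The conventions in this box can be checked with a short Python sketch. It is an illustration only: the function names transcribe and translate are hypothetical, the codon table is the standard genetic code, and the 21-nt duplex is the one drawn in the accompanying figure. The template strand is entered as written 3′→5′, so complementing it base by base from left to right yields the transcript in the 5′→3′ direction.

```python
# Base-pairing rules for transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(template_3_to_5: str) -> str:
    """Return the transcript (5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna_5_to_3: str) -> str:
    """Read codons 5'->3' and return one-letter residues, N-terminus first."""
    protein = []
    for i in range(0, len(mrna_5_to_3) - 2, 3):
        residue = CODON_TABLE[mrna_5_to_3[i:i + 3]]
        if residue == "*":  # stop codon ends the polypeptide
            break
        protein.append(residue)
    return "".join(protein)


# Duplex from the accompanying figure (assumed read left to right as labeled).
nontemplate = "ATGGCATGCAATAGCTCATCG"  # written 5'->3'
template = "TACCGTACGTTATCGAGTAGC"     # written 3'->5'

transcript = transcribe(template)
print(transcript)             # AUGGCAUGCAAUAGCUCAUCG
print(translate(transcript))  # MACNSSS

# The transcript matches the nontemplate strand with U in place of T.
print(transcript == nontemplate.replace("T", "U"))  # True
```

Writing the template 3′→5′ keeps the sketch free of any string reversal; if a template were supplied 5′→3′ instead, it would have to be reversed before complementing.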
[Figure for A Deeper Look: a DNA duplex, with nontemplate strand 5′-ATGGCATGCAATAGCTCATCG-3′ and template strand 3′-TACCGTACGTTATCGAGTAGC-5′, is transcribed by RNA polymerase into an RNA transcript that grows 5′→3′; translation of the transcript gives a protein running from the N-terminus (aa₁, aa₂, aa₃, aa₄, ...) to the C-terminus.]

FIGURE 29.1 Structure of the Thermus thermophilus core RNA polymerase α₂ββ′ (pdb id = 2O5I). The template DNA strand is shown in green, the nontemplate DNA strand in blue, and the RNA transcript in hot pink. The two α chains are orange, the β chain is cyan, and the β′ chain is yellow. The active-site Mg²⁺ is shown as a red sphere.

The Process of Transcription Has Four Stages

Transcription can be divided into four stages: (1) binding of RNA polymerase holoenzyme at promoter sites, (2) initiation of polymerization, (3) chain elongation, and (4) chain termination.

Binding of RNA Polymerase to Template DNA The process of transcription begins when the σ-subunit of RNA polymerase recognizes a promoter sequence (Figure 29.3), and RNA polymerase holoenzyme and the promoter form a closed promoter complex (Figure 29.3, Step 2). This stage in RNA polymerase:DNA interaction is referred to as the closed promoter complex because the dsDNA has not yet been “opened” (unwound) so that the RNA polymerase can read the base sequence of the DNA template strand and transcribe it into a complementary RNA sequence. Once the closed promoter complex is established, the RNA polymerase holoenzyme unwinds about 14 base pairs of DNA (base pairs located at positions −12 to +2, relative to the transcription start site; see later discussion), forming the very stable open promoter complex (Figure 29.3, Step 3). Promoter sequences can be identified in vitro by DNA footprinting: RNA polymerase holoenzyme is bound to a putative promoter sequence in a DNA duplex, and the DNA:protein complex is treated with DNase I.
DNase I cleaves the DNA at sites not protected by bound protein, and the set of DNA fragments left after DNase I digestion reveals the promoter (by definition, the promoter is the RNA polymerase holoenzyme binding site).

RNA polymerase binding typically protects a nucleotide sequence spanning the region from −70 to +20, where the +1 position is defined as the transcription start site: that base in the nontemplate DNA strand that is identical with the first base in the RNA transcript. The next base, +2, specifies the second base in the transcript. Nontemplate strand bases in the 5′, or “minus,” direction from the transcript start site are numbered −1, −2, and so on. (Note that there is no zero.) Nontemplate nucleotides in the “minus” direction are said to lie upstream of the transcription start site, whereas nucleotides in the 3′, or “plus,” direction are downstream of the transcription start site. The transcript start site on the template strand is usually a pyrimidine, so most transcripts begin with a purine. RNA polymerase binding protects 90 bp of DNA (70 upstream positions plus 20 downstream positions, with no zero), equivalent to a distance of 30 nm along B-DNA (90 bp × 0.34 nm per bp ≈ 30 nm). Because RNA polymerase is only 16 nm in its longest dimension, the DNA must be wrapped around the enzyme.

Properties of Prokaryotic Promoters Promoters recognized by the principal σ factor, σ⁷⁰, serve as the paradigm for prokaryotic promoters. These promoters vary in size from 20 to 200 bp but typically consist of a 40-bp region located on the 5′-side of the transcription start site. Within the promoter are two consensus sequence elements. These two elements are the Pribnow box¹ near −10, whose consensus sequence is the hexameric TATAAT, and a sequence in the −35 region containing the hexameric consensus TTGACA (Figure 29.4). The Pribnow box and the −35 region are separated by about 17 bp of nonconserved sequence.
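The two consensus elements lend themselves to a simple sequence search. The following Python sketch is only an illustration, not a validated promoter-prediction method: it counts matches to TTGACA and TATAAT in a nontemplate-strand sequence and accepts spacers of 15 to 19 bp, a window chosen here as an assumption around the typical 17 bp. The function names and the 38-nt test sequence are hypothetical.

```python
MINUS_35 = "TTGACA"  # consensus hexamer of the -35 region
MINUS_10 = "TATAAT"  # consensus hexamer of the Pribnow box


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def best_promoter(seq: str, spacers=range(15, 20)):
    """Return (score, pos35, pos10, spacer) for the best-scoring pair.

    Positions are 0-based indexes into the nontemplate strand (5'->3').
    The score is the total number of consensus matches (maximum 12).
    The spacer window is an assumption, not a value from the text.
    """
    best = None
    for i in range(len(seq) - 5):
        score35 = matches(seq[i:i + 6], MINUS_35)
        for gap in spacers:
            j = i + 6 + gap
            if j + 6 > len(seq):
                break  # larger spacers would run past the end of the sequence
            candidate = (score35 + matches(seq[j:j + 6], MINUS_10), i, j, gap)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best


# Hypothetical sequence: perfect -35 and -10 hexamers, 17-bp GC spacer.
example = "CC" + "TTGACA" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCGCGCA"
print(best_promoter(example))  # (12, 2, 25, 17)
```

A plain match count treats every position as equally important. The percent-occurrence values in Figure 29.4 show that some positions are far more conserved than others, so a position-weighted score would be the natural refinement.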
FIGURE 29.2 Bacteriophage T7 RNA polymerase (pdb id = 1MSW) in the act of transcription. T7 RNA polymerase is a 99-kD monomeric protein. The DNA is shown entering the enzyme from the upper right. The template strand is green, the nontemplate strand is blue, the RNA transcript is hot pink.

¹ Named for David Pribnow, who, along with David Hogness, first recognized the importance of this sequence element in transcription. A consensus sequence can be defined as the bases that appear with highest frequency at each position when a series of sequences believed to have common function is compared.

ACTIVE FIGURE 29.3 Sequence of events in the initiation and elongation phases of transcription as it occurs in prokaryotes. Nucleotides in this region are numbered with reference to the base at the transcription start site, which is designated +1.
Step 1: Recognition of promoter by σ; binding of polymerase holoenzyme to DNA; migration to promoter.
Step 2: Formation of an RNA polymerase:closed promoter complex.
Step 3: Unwinding of DNA at promoter and formation of open promoter complex.
Step 4: RNA polymerase initiates mRNA synthesis, usually with a purine.
Step 5: RNA polymerase holoenzyme-catalyzed elongation of mRNA by about 4 more nucleotides.
Step 6: Release of σ-subunit as core RNA polymerase proceeds down the template, elongating RNA transcript.

FIGURE 29.4 The nucleotide sequences of representative E. coli promoters (genes araBAD, araC, bioA, bioB, galP2, lac, lacI, rrnA1, rrnD1, rrnE1, tRNA Tyr, and trp). (In accordance with convention, these sequences are those of the nontemplate strand where RNA polymerase binds.) Consensus sequences for the −35 region, the Pribnow box, and the initiation site are shown at the bottom. The numbers represent the percent occurrence of the indicated base. (Note: The −35 region is only roughly 35 nucleotides from the transcription start site; the Pribnow box [the −10 region] likewise is located at approximately position −10.) In this figure, sequences are aligned relative to the Pribnow box.
Consensus, −35 region: T (42), C (38), T (82), T (84), G (79), A (64), C (53), A (45), T (41), followed by 11–15 bp.
Consensus, Pribnow box (−10 region): T (79), A (95), T (44), A (59), A (51), T (96), followed by 5–8 bp before the initiation site (+1).

RNA polymerase holoenzyme uses its σ-subunit to bind to the conserved sequences, and the more closely the −35 region sequence corresponds to its consensus sequence, the greater is the efficiency of transcription of the gene. The highly expressed rrn genes in E.
coli that encode ribosomal RNA (rRNA) have a third sequence element in their promoters, the upstream element (UP element), located about 20 bp immediately upstream of the −35 region. (Transcription from the rrn genes accounts for more than 60% of total RNA synthesis in rapidly growing E. coli cells.) Whereas the σ-subunit recognizes the −10 and −35 elements, the C-terminal domains (CTD) of the α-subunits of RNA polymerase recognize and bind the UP element.

In order for transcription to begin, the DNA duplex must be “opened” so that RNA polymerase has access to single-stranded template. The efficiency of initiation is inversely proportional to the melting temperature, Tm, in the Pribnow box, suggesting that the A:T-rich nature of this region is aptly suited for easy “melting” of the DNA duplex and creation of the open promoter complex (see Figure 29.3). Negative supercoiling facilitates transcription initiation by favoring DNA unwinding.

The RNA polymerase σ-subunit is directly involved in melting the dsDNA. Interaction of the σ-subunit with the nontemplate strand maintains the open complex formed between RNA polymerase and promoter DNA, with the σ-subunit acting as a sequence-specific single-stranded DNA-binding protein. Association of the σ-subunit with the nontemplate strand stabilizes the open promoter complex and leaves the bases along the template strand available to the catalytic site of the RNA polymerase.

Initiation of Polymerization
RNA polymerase has two binding sites for NTPs: the initiation site and the elongation site. The first nucleotide binds at the initiation site, base-pairing with the +1 base exposed within the open promoter complex (see Figure 29.3, Step 4). The second incoming nucleotide binds at the elongation site, base-pairing with the +2 base. The ribonucleotides are then united when the 3′-O of the first nucleotide makes a phosphoester bond with the α-phosphorus atom of the second nucleotide, and PPi is eliminated. Note that the 5′-end of the transcript starts out with a triphosphate attached to it.

A DEEPER LOOK
DNA Footprinting: Identifying the Nucleotide Sequence in DNA Where a Protein Binds
DNA footprinting is a widely used technique to identify the nucleotide sequence within DNA where a specific protein binds, such as the promoter sequence(s) bound by RNA polymerase holoenzyme. In this technique, the protein is incubated with a labeled (*) DNA fragment containing the nucleotide sequence where the protein is believed to bind. (The DNA fragment is labeled at only one end.) Then, a DNA cleaving agent, such as DNase I, is added to the solution containing the DNA:protein complex. DNase I cleaves the DNA backbone in exposed regions, that is, wherever the presence of the DNA-binding protein does not prevent DNase I from binding. A control solution containing naked DNA (a sample of the same labeled DNA fragment with no DNA-binding protein added) is also treated with DNase I. When these DNase I digests are analyzed by gel electrophoresis, a difference is found between the set of labeled fragments from the DNA:protein complex and the set from naked DNA. The absence of certain fragments in the digest of the DNA:protein complex reveals the location of the protein-binding site on the DNA (see accompanying figure).

[Figure: The end-labeled (*) DNA:protein complex and naked DNA are each treated with DNase I (arrows indicate DNase I cleavage sites). The dsDNA is denatured, and the two sets of labeled fragments are compared by gel electrophoresis to locate the protein binding site.]

Adapted from Rhodes, D., and Fairall, L., 1997. Analysis of sequence-specific DNA-binding proteins. In Protein Function: A Practical Approach, Creighton, T. E., ed., Oxford: IRL Press at Oxford University Press.
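The comparison at the heart of DNA footprinting (labeled fragments present in the naked-DNA digest but absent from the DNA:protein digest) can be sketched in Python. This is an idealized illustration, not a description of a real analysis pipeline: the function names are hypothetical, and the sketch assumes that every unprotected position is cut in at least one molecule and that protected positions are never cut.

```python
def labeled_fragment_lengths(dna_length: int, protected: range = range(0)) -> set[int]:
    """Lengths of end-labeled fragments after idealized DNase I digestion.

    Cleaving after base k (counted from the labeled end, 1-based) yields a
    labeled fragment of length k. Assumption: every unprotected position is
    cut in some molecule, and no position covered by protein is ever cut.
    """
    return {k for k in range(1, dna_length) if k not in protected}


def footprint(naked: set[int], with_protein: set[int]) -> list[int]:
    """Fragment lengths seen in the naked-DNA lane but missing from the
    DNA:protein lane; these mark the protein-binding site."""
    return sorted(naked - with_protein)


# A 20-base fragment with a hypothetical protein covering positions 8-13.
naked_lane = labeled_fragment_lengths(20)
protein_lane = labeled_fragment_lengths(20, protected=range(8, 14))
print(footprint(naked_lane, protein_lane))  # [8, 9, 10, 11, 12, 13]
```

Because the fragment is labeled at only one end, each labeled fragment length maps to a single cleavage position, which is why the missing lengths read out the binding site directly.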
Movement of RNA polymerase along the template strand (translocation) to the next base prepares the RNA polymerase to add the next nucleotide (see Figure 29.3, Step 5). Once an oligonucleotide 9 to 12 residues long has been formed, the σ-subunit dissociates from RNA polymerase, signaling the completion of initiation (see Figure 29.3, Step 6). The core RNA polymerase is highly processive and goes on to synthesize the remainder of the mRNA. As the core RNA polymerase progresses, advancing the 3′-end of the RNA chain, the DNA duplex is unwound just ahead of it. About 12 base pairs of the growing RNA remain base-paired to the DNA template at any time, with the RNA strand becoming displaced as the DNA duplex rewinds behind the advancing RNA polymerase.

Chain Elongation
Elongation of the RNA transcript is catalyzed by the core polymerase, because once a short oligonucleotide chain has been synthesized, the σ-subunit dissociates. The accuracy of transcription is such that about once every 10⁴ nucleotides, an error is made and the wrong base is inserted. Because many transcripts are made per gene and most transcripts are smaller than 10 kb, this error rate is acceptable.

Two possibilities can be envisioned for the course of the new RNA chain. In one, the RNA chain is wrapped around the DNA as the RNA polymerase follows the template strand around the axis of the DNA duplex, but this possibility seems unlikely due to its potential for tangling the nucleic acid strands (Figure 29.5a). In reality, transcription involves supercoiling of the DNA, so positive supercoils are created ahead of the transcription bubble and negative supercoils are created behind it (Figure 29.5b). To prevent torsional stress from inhibiting transcription, gyrases introduce negative supercoils (and thereby remove positive supercoils) ahead of RNA polymerase, and topoisomerases remove negative supercoils behind the DNA segment undergoing transcription (Figure 29.5b).

FIGURE 29.5 Supercoiling versus transcription. (a) If the RNA polymerase followed the template strand around the axis of the DNA duplex, no supercoiling of the DNA would occur but the RNA chain would be wrapped around the double helix once every 10 bp. This possibility seems unlikely because it would be difficult to untangle the transcript from the DNA duplex. (b) Alternatively, gyrases and topoisomerases could remove the torsional stresses induced by transcription, with gyrase introducing negative supercoils and topoisomerase removing negative supercoils.

Chain Termination
Two types of transcription termination mechanisms operate in bacteria: one that is dependent on a specific protein called rho termination factor (for the Greek symbol, ρ) and another, intrinsic termination, that is not. In intrinsic termination, termination is determined by specific sequences in the DNA called termination sites. These sites are not indicated by particular bases showing where transcription halts. Instead, these sites consist of three structural features whose base-pairing possibilities lead to termination:

1. Inverted repeats, which are typically G:C-rich, so a stable stem-loop structure can form in the transcript via intrachain base-pairing (Figure 29.6)
2. A nonrepeating segment that punctuates the inverted repeats
3. A run of 6 to 8 As in the DNA template, coding for Us in the transcript

Termination then occurs as follows: A G:C-rich, stem-loop structure, or “hairpin,” forms in the transcript.
The hairpin apparently causes the RNA polymerase to pause, whereupon the A:U base pairs between the transcript and the DNA template strand are displaced through formation of somewhat more stable A:T base pairs between the template and nontemplate strands of the DNA. The result is spontaneous dissociation of the nascent transcript from DNA.

The alternative mechanism of termination, factor-dependent termination, is less common and mechanistically more complex. Rho factor is an ATP-dependent helicase (hexamer of 50-kD subunits) that catalyzes the unwinding of RNA:DNA hybrid duplexes (or RNA:RNA duplexes). The rho factor recognizes and binds to C-rich regions in the RNA transcript. These regions must be unoccupied by translating ribosomes for rho factor to bind. Once bound, rho factor advances in the 5′→3′ direction until it reaches the transcription bubble (Figure 29.7). There it catalyzes the unwinding of the transcript and template, releasing the nascent RNA chain. It is likely that the RNA polymerase stalls in a G:C-rich termination region, allowing rho factor to overtake it.

FIGURE 29.6 The termination site for the E. coli trp operon (the trp operon encodes the enzymes of tryptophan biosynthesis). The inverted repeats give rise to a stem-loop, or “hairpin,” structure ending in a series of U residues. (Labels in the figure: the two G–C rich inverted repeats, the A–T rich region, the direction of transcription, the last base transcribed, the sense strand, and the 3′ terminus of the mRNA.)

29.2 How Is Transcription Regulated in Prokaryotes?
In bacteria, genes encoding the enzymes of a particular metabolic pathway are often grouped adjacent to one another in a cluster on the chromosome. This pattern of organization allows all of the genes in the group to be expressed in a coordinated fashion through transcription into a single polycistronic mRNA encoding all the enzymes of the metabolic pathway.²

² A polycistronic mRNA is a single RNA transcript that encodes more than one polypeptide. Cistron is a genetic term for a DNA region representing a protein: Cistron and gene are essentially equivalent terms.

A regulatory sequence lying adjacent to the DNA being