MINIREVIEW

The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression

Christophe Maris*, Cyril Dominguez* and Frédéric H.-T. Allain

Institute for Molecular Biology and Biophysics, Swiss Federal Institute of Technology Zurich, ETH-Hönggerberg, Zürich, Switzerland

Keywords: RNA recognition motif; protein–RNA complex; structure–function relationship; RNA-binding specificity

Correspondence: F. H.-T. Allain, Institute for Molecular Biology and Biophysics, Swiss Federal Institute of Technology Zurich, ETH-Hönggerberg, CH-8093 Zürich, Switzerland. Fax: +41 6331294. Tel: +41 6333940. E-mail: allain@mol.biol.ethz.ch. Website: http://www.mol.biol.ethz.ch/groups/allain_group

*These authors contributed equally to the work

(Received 16 December 2004, accepted March 2005)

doi:10.1111/j.1742-4658.2005.04653.x

FEBS Journal 272 (2005) 2118–2131 © 2005 FEBS

Abbreviations: ACF, APOBEC-1 complementary factor; CBP, cap binding protein; CstF, cleavage stimulation factor; hnRNP, heterogeneous nuclear ribonucleoprotein; HuD, Hu protein D; LRR, leucine rich repeat; MIF4G, middle domain of the translation initiation factor 4G; PABP, polyadenylate binding protein; PIE, polyadenylation inhibition element; PTB, polypyrimidine tract binding protein; RBD, RNA-binding domain; RNP, ribonucleoprotein; RRM, RNA recognition motif; SR, serine/arginine rich proteins; TLS, translocated in liposarcoma; U1A, U1 snRNP protein A; U2A′ and U2B″, U2 snRNP proteins A′ and B″; U2AF, U2 snRNP auxiliary factor; UHM, U2AF homology motif; UPF, up-frameshift protein

The RNA recognition motif (RRM), also known as RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), is one of the most abundant protein domains in eukaryotes. Based on the comparison of more than 40 structures including 15 complexes (RRM–RNA or RRM–protein), we reviewed the structure–function relationships of this domain. We identified and classified the different structural elements of the RRM that are important for binding a multitude of RNA sequences and proteins. Common structural aspects were extracted that allowed us to define a structural leitmotif of the RRM–nucleic acid interface with its variations. Outside of the two conserved RNP motifs that lie in the center of the RRM β-sheet, the two external β-strands, the loops, the C- and N-termini, or even a second RRM domain allow high RNA-binding affinity and specific recognition. Protein–RRM interactions that have been found in several structures reinforce the notion of an extreme structural versatility of this domain supporting the numerous biological functions of the RRM-containing proteins.

History – what defines an RRM?

The RNA recognition motif (RRM), also known as the RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), was first identified in the late 1980s when it was demonstrated that mRNA precursors (pre-mRNA) and heterogeneous nuclear RNAs (hnRNAs) are always found in complex with proteins (reviewed in [1]). Biochemical characterizations of the mRNA polyadenylate binding protein (PABP) and the hnRNP protein C revealed a consensus RNA-binding domain of approximately 90 amino acids containing a central sequence of eight conserved residues that are mainly aromatic and positively charged [2,3]. This sequence, termed the RNP consensus sequence, was thought to be involved in RNA interaction and was defined as Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr, where X can be any amino acid. Later, a second consensus sequence, less conserved than the previously characterized one [1], was identified. This six-residue sequence, located at the N-terminus of the domain, was defined as Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-Asn-Leu. The first consensus sequence was therefore referred to as RNP 1 and the second as RNP 2 (Fig. 1). It was then shown that this protein domain was necessary and sufficient for binding RNA molecules with a wide range of specificities and affinities (reviewed in [4–6]).

Here we review the structural properties of the RRM domain in its isolated form and in complex with RNAs and/or proteins. This review shows how such a simple domain can modulate its fold to recognize many RNAs and proteins in order to achieve a multitude of biological functions often associated with post-transcriptional gene regulation.

Fig. 1. Sequence alignment of a selection of RRM domains for which the structure has been solved (PDB codes are indicated in brackets). The alignment was generated by the program CLUSTALW (http://www.ebi.ac.uk/clustalw/) [55] and manually optimized. The conserved RNP 1 and RNP 2 sequences are displayed in yellow. The amino acids highlighted in boxes refer to the aromatic residues important for primary RNA binding. Aligned domains, in order: PTB (1SJQ), PTB (1SRJ), PTB (1QM9), PTB (1QM9), CstF-64 (1P1T), La (1OWX), TAP (1FO1), ALY (1NO8), hnRNP A1 (1UP1), hnRNP A1 (1HA1), HuD (1FXL), HuD (1FXL), SXL (2SXL), SXL (1SXL), PABP (1CVJ), PABP (1CVJ), nucleolin (1FJE), nucleolin (1FJE), U1A (1DZ5), U2B″ (1A9N), CBP20 (1H2T), Y14 (1P27), UPF3 (1UW4), U2AF65 (1U2F), U2AF65 (2U2F) and U2AF35 (1JMT). Secondary structure elements along the alignment: β1, loop 1, α1, loop 2, β2, loop 3, β3, loop 4, α2, loop 5, β4.

An abundant and ancient fold with multiple biological functions

Genome sequencing projects recently showed that the RRM is found in all kingdoms of life, including prokaryotes and viruses, although at lower abundance than in eukaryotes. To date, only 85 proteins containing an RRM domain have been identified in bacteria (mostly cyanobacteria [7]), and six such proteins in viruses. Prokaryotic RRM proteins are rather small
(about 100 amino acids) and have a single copy of the RRM domain.

In eukaryotes, the RNA recognition motif is one of the most abundant protein domains. To date, a total of 6056 RRM motifs have been identified in 3541 different proteins (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00076) [8]. In humans, 497 proteins containing at least one RRM have been identified. Assuming about 20 000–25 000 human genes, the RRM would therefore be present in about 2% of gene products. In eukaryotic proteins, RRMs are often found as multiple copies within a protein (44%, two to six RRMs) and/or together with other domains (21%). Among the latter, the most abundant are the zinc fingers of the CCCH and CCHC type (21% of those with an additional domain), the polyadenylate binding protein C-terminal domain (PABP or PABC, 10%), and the WW domain (9%). Interestingly, in contrast to the well-known CCHH zinc fingers that bind double-stranded DNA or RNA, the CCCH and CCHC zinc fingers bind single-stranded RNA [9,10]. The PABP and the WW domains [11] are protein–protein interaction domains involved in translation [12,13] and pre-spliceosome formation, respectively [14]. By association with different types of protein domains, the RRM domain can modulate its RNA-binding affinity and specificity and diversify its biological functions.

A protein domain in such abundance is necessarily biologically important and associated with many functions in the cell. Indeed, eukaryotic RRM proteins are present in all post-transcriptional events: pre-mRNA processing (for example CstF-64, La, or UPF3 proteins), splicing (U2B″, U2AF35, U2AF65, hnRNPA1 or Y14 proteins), alternative splicing (hnRNPA1, PTB, sex-lethal, SR proteins), mRNA stability (CBP20, PABP or HuD), RNA editing (ACF), mRNA export (TLS), pre-rRNA complex formation (nucleolin), translation regulation (PABP) and degradation [6]. In plants, RRM proteins are present in chloroplasts and are involved in 3′ end processing of
chloroplast mRNA [15]. They have also been discovered in plant mitochondria; their functions, however, remain unclear [16]. Similarly, their roles in bacteria and viruses are still unknown. The numerous three-dimensional structures of the RRM in isolation, and in complex with RNA or other proteins, shed light on the function of RRM proteins, as shown below.

The structure of the RRM, a βαββαβ fold with some variations and extensions

The RRM folds into an αβ sandwich structure with a β1α1β2β3α2β4 topology (Figs 1 and 2), as demonstrated by the first structure of an RNA recognition motif, the N-terminal RRM of U1A [17]. The fold is composed of one four-stranded antiparallel β-sheet spatially arranged in the order β4β1β3β2 from left to right when facing the sheet (Fig. 2, hnRNP A1-RRM 2, front view) and two α-helices (α1 and α2) packed against the β-sheet. Most of the conserved residues of the RRM are in the hydrophobic core of the domain [17], except four conserved residues that contribute to RNA binding, namely RNP 1 positions 1, 3 and 5 and RNP 2 position 2 (see the following section and Fig. 1). The RNP 1 and RNP 2 motifs are located in the central strands of the β-sheet, namely β3 and β1, respectively, and are highly conserved apart from a few RRM domains such as ALY and TAP (Fig. 1) [18,19].

To date, more than 30 RRM structures have been determined either by NMR or X-ray crystallography and reveal unexpected variations, as shown in Fig. 2. The loops between the secondary structure elements (loops 1–5 as indicated in Figs 1 and 2) can have different lengths and are often disordered in the free form. An exception is one loop that often forms a small two-stranded β-sheet (β3′ and β3″) (Fig. 2). The N- and C-terminal regions, outside the RRM, are usually poorly ordered in the isolated domains, with a few exceptions where they can adopt a secondary structure (Fig. 2, PTB-RRM 3, La C-terminal RRM and CstF-64). In the structures of La C-terminal RRM [20], U1A N-terminal RRM [21] and CstF-64 RRM
[22], the C-terminus forms an α-helix that lies on the β-sheet surface, while in PTB RRM 2 and 3 it extends the size of the β-sheet by forming an extra β-strand (β5) antiparallel to β2 [23,24]. CstF-64 RRM also has an additional short α-helix in its N-terminal region (Fig. 2) [22]. Finally, secondary structure elements of the domain can be modified; for example, one α-helix of the U2AF35 RRM is three times longer than in a canonical RRM (Fig. 2). This unusual helix is involved in protein–protein interactions [25] (see the RRM–protein complexes section).

Fig. 2. hnRNPA1 RRM 2, a typical RRM fold and its structural variations as illustrated by these different protein structures (hnRNPA1 RRM 2 [52], PTB RRM 3 [23], La C-terminal [20], CstF-64 RRM [22] and U2AF35 [51]). This figure was generated with the program MOLMOL [56].

A true single-stranded nucleic acid binding domain

Since the first structure of an RRM in complex with RNA (the N-terminal domain of U1A in complex with U1 snRNA stem-loop II [26]), which founded our understanding of RRM–RNA recognition, 10 structures of RRMs in complex with RNA or DNA (for hnRNPA1) have been determined either by NMR [27–30] or X-ray crystallography [31–36]. All of the structures present intrinsic common features and differences in RNA recognition, reflecting the remarkable adaptability of this domain in order to achieve high affinity and specificity. Systematic visual analysis of the conserved residues at the RRM–RNA interface for all 11 published complexes led us to define a common structural archetype of the RRM–nucleic acid interaction exemplified by hnRNPA1, an RRM protein binding both DNA and RNA with high affinity. In the structure of hnRNPA1 RRM 2 in complex with DNA [34] (Fig. 3A), two deoxynucleotides, A209 and G210, stack on two aromatic rings located on the β1 (Phe108, RNP 2 position 2) and β3 (Phe150, RNP 1 position 5) strands, respectively (Fig. 3A). The
contacts with these two RNP positions result in a characteristic arrangement of the nucleic acid strand on the β-sheet surface in which the 5′ end is located on the first half of the β-sheet (β4β1) and the 3′ end on the second half (β3β2) (Fig. 3B). A third aromatic residue located on β3 (Phe148, RNP 1 position 3) interacts hydrophobically with the sugar rings of A209 and G210. Finally, a positively charged side chain (Arg146, RNP 1 position 1) forms a salt bridge with the phosphate between A209 and G210.

Fig. 3. hnRNPA1 RRM 2 as a model of single-stranded nucleic acid binding [25]. (A) Structure of hnRNPA1 RRM 2 in complex with single-stranded telomeric DNA and scheme of the β-sheet annotated with the conserved RNP 1 and RNP 2 aromatic residue positions numbered according to each RNP sequence numbering. The conserved aromatic residues are highlighted by green circles [34]. (B) Structural arrangement of the DNA strand on the β-sheet of the hnRNPA1 RRM. (C) Hydrogen bond and van der Waals interaction network conferring base-binding specificity (hnRNPA1 RRM complex). This figure was generated with the program MOLMOL [56].

This small set of RRM–nucleic acid interactions, in the center of the domain, involving four conserved protein side chains of the RRM consensus sequence and two nucleotides, illustrates the perfect adaptation of the RRM for effectively binding single-stranded nucleic acids of any sequence. Indeed, the essential chemical elements of this dinucleotide, namely the two bases, the two sugar rings and the phosphates in between, are recognized. The two bases are stacked on conserved aromatic rings, and correspondingly, the two RNP stacking positions are occupied by planar residues (Phe, Tyr, His or Trp) in 78% and 72% of the 70 RRMs studied by Birney et al. [6]. The two sugar rings are in contact with a hydrophobic side chain (RNP 1 position 3) that is present in 81% (67% of Phe or Tyr) of the RRMs, and finally the negatively charged phosphodiester group is neutralized by a positively charged side chain (RNP 1 position 1) present in 68% of the RRMs [6].

Fig. 4. The RRM domain, a highly plastic platform for nucleic acid binding. (A) Nucleolin RRM 2–sNRE complex [28]. (B) Sex-lethal RRM 1–polyU–Tra mRNA [31]. (C) Sex-lethal RRM 2–Tra mRNA precursor complex [31]. (D) hnRNPA1 RRM 1–telomeric DNA complex [34]. (E) Poly(A)-binding protein RRM 1–polyadenylate RNA complex [33]. (F) Heterodimeric nuclear cap-binding complex with a 5′-capped polymerase II transcript [36]. In all figures, the RNA is shown in yellow and the protein side chain in green. The ribbon of the RRM is shown in grey. The N- and C-terminal extensions of the RRM are shown in green and red, respectively. This figure was generated with the program MOLMOL [56].

Although the residue conservation at these four positions is strong, these four characteristic contacts are not always found all together [34]. Among the RRM–RNA/DNA complexes, the two RRMs of hnRNPA1 in complex with DNA have all four characteristic contacts, whereas only one to three of those are found in the other structures (Fig. 4). The most frequent ones are the two stacking interactions involving RNP 2 position 2 (always present except in nucleolin RRM 2 [37]) and RNP 1 position 5 (always present except in CBP20 [36]). The contacts between the sugars and RNP 1 position 3 are present in five RRM–RNA complexes (CBP20, PABP RRM 1, nucleolin RRM 1 and RRM 2 and sex-lethal RRM 1). The RNP 1 position 1 residue does not necessarily interact with the phosphate between the dinucleotide, because in all structures apart from hnRNPA1 it contacts an RNA base or a phosphate oxygen of other nucleotides. Also, the RRM interactions with the sugar–phosphate backbone are fairly limited compared to other types of RNA-binding proteins, such as ribosomal proteins, suggesting a less important role for this type
of interaction [38].

This basic binding platform common to all RRMs is not in essence sequence-specific, as eight of the 16 dinucleotide combinations have already been found: AA [33], AG [34], CG [28], CA [26], GU [31], UC [28], UG (S. D. Auweter and F. H.-T. Allain, unpublished data) and UU [31], with any type of nucleotide either at the 5′ or the 3′ position. The nucleotides at these two positions always adopt an anti conformation, except for the G at the 3′ position, which is always found in a syn conformation. Specificity of this central dinucleotide recognition is provided by other non-conserved elements of the RRMs. The two most frequently observed elements are the protein side chains at the surface of the β-sheet (an RNP position and the two adjacent positions in β1) (Fig. 3A) and the backbone and side chains of the few amino acids just C-terminal to β4. These residues are base-specifically hydrogen-bonded to the RNA or DNA functional groups, as illustrated by the multiple base–amino acid contacts in the hnRNPA1 RRM (Fig. 3C).

A highly plastic domain to achieve high RNA-binding affinity and specificity

Many RRMs bind RNA with high affinity (in the nM range) and high sequence-specificity, in particular all those whose structures have been determined to date. Nevertheless, sequence-specificity does not necessarily imply high affinity: PTB, for example, specifically recognizes pyrimidine tracts but does not provide sufficient binding enthalpy to reach nM affinity (F. C. Oberstrass, S. D. Auweter and F. H.-T. Allain, unpublished data). To achieve higher affinity, some RRM proteins use the two external β4 and β2 strands, while others use the loops 1, 3 or 5, or the C- and N-termini [39]. In many proteins, multiple RRMs associate to bind longer nucleotide stretches. In these cases, the interdomain linker is an essential component of RNA recognition. In addition, the RNA secondary structure can be an important determinant of the protein binding affinity. All of these aspects are presented in detail below.

Role of the two external β-strands and the loops

The β-sheet surface of an RRM can be modulated by using only one or up to four β-strands for RNA binding. Figure 4 clearly illustrates that the β-sheet surface is not used to the same extent in each RRM–nucleic acid complex. Exceptionally, in hnRNPA1 RRM 1, each β-strand binds one nucleotide, the DNA being spread on the β-sheet from β4 to β2 in the 5′→3′ direction. More often, the nucleotide at the 5′ end of the central dinucleotide contacts the loops at the bottom of the β-sheet (Fig. 4C) and the one at the 3′ end stacks over the previous nucleotide (Fig. 4A). In PABP RRM 1, it is different again: while A6 and A8 stack on the protein side chains at the canonical positions on β1 and β3, respectively, the nucleotide in between, A7, interacts with a loop (Fig. 4E).

Role of the N- and C-terminal regions

The N- and C-terminal regions of the RRM are often crucial: they can dramatically enhance the RNA-binding affinity by extending the protein–RNA interaction network. In most RRM–RNA complexes, the base stacking on the aromatic residue at RNP 2 position 2 is sandwiched either by a protein side chain from the N-terminal region (CBP20) or by one from the C-terminal region of the RRM (Fig. 4D–F) [36]. This side chain can be one residue after the end of β4, as in U1A [26,27], or 16 residues afterwards, as in hnRNPA1 RRM 1 [34] (Fig. 4D). The C-terminus of hnRNPA1 RRM 1 is particularly interesting because it is unstructured in the free form and becomes ordered upon DNA binding, forming a 3₁₀ helix. This structural rearrangement reinforces the concept of binding by induced fit, initially proposed with the structure of the U1A–RNA complex [27]. Side chain residues of this helix, His101 and Arg92, stack over A203 and G204, respectively (Fig. 4D) [34]. The C-terminus can also contribute to differentiating RNA from DNA by interacting with the 2′OH
group of the sugar ring, as shown in Fig. 4B,E. The hydroxyl group can act as a hydrogen bond acceptor interacting with protein side chains (Fig. 4E, Arg94; Fig. 4B, Arg202) as well as with the backbone amide (Fig. 4B, Gly205) and/or as a hydrogen bond donor interacting with the carbonyl oxygen of the protein backbone [38]. Other parts of the RRM domain, such as the β2-strand and the loops, also interact with the 2′OHs and help to discriminate RNA from DNA [26,31,33,35].

The C-terminal region does not always enhance RNA binding; it can also inhibit it, as shown in the structure of CBP20 [36] (Fig. 4F). Two residues (Asn116 and Arg123) of the C-terminus form a salt bridge located above the RNP 1 residue at position 5 (Phe85), preventing any RNA binding at this key position. Similarly in PTB, the C-terminal region of all the RRMs hydrophobically interacts with RNP 1 position 5, thereby masking this binding site (F. C. Oberstrass, S. D. Auweter and F. H.-T. Allain, unpublished data).

Role of the RNA secondary structure in RRM binding

Some proteins, such as the N-terminal RRM of U1A, bind single-stranded RNA with high affinity only if the RNA is embedded within a secondary structure: a stem loop (hairpin loop II of U1 snRNA [26]) or an internal loop (the regulatory element of the U1A 3′ untranslated region [27]). For example, the U1A protein that recognizes a stem loop has a much weaker affinity (10⁴-fold) for a single-stranded 23-mer RNA with no base pairs, even though the proper single-stranded recognition sequence is present [26]. U1A RRM specifically recognizes the secondary structure of the target RNA through its loops and by binding to a specific base pair. In the case of U1A bound to a fragment of U1 snRNA hairpin II, Arg52 (loop 3) makes crucial interactions with the loop-closing GC base pair, and its substitution to Glu completely abolishes RNA binding [26] (Fig. 5A). U1A not only binds a stem loop but also an internal loop [27,29]. This ability to
bind RNA in different environments shows the adaptability of the proteins to recognize different secondary structures as long as the key protein–RNA interactions are conserved. The closely related U2B″ RRM binds the same hexanucleotide sequence, AUUGCA, as U1A but within a different stem loop (U2 snRNA hairpin IV) and only when in complex with U2A′ (Fig. 5B). The adaptability of the RRM domain is further illustrated here, as the key residue Arg52 still interacts with the RNA stem although the closing base pair is a UU base pair in U2 snRNA SLIV instead of a GC in U1 snRNA SLII.

While both U1A and U2B″ recognize the bases at the top of the stem through numerous hydrogen bonds, nucleolin contacts the nucleolin recognition element (sNRE) RNA stem essentially by van der Waals interactions [28] (Fig. 5C). The two RRMs of nucleolin sandwich the seven-nucleotide loop, and one RRM together with its C-terminal part recognizes the unusual loop E structure [28]. The substitution of the loop E by two GC base pairs separated by a bulge increases the dissociation constant more than 100-fold (to 0.8 μM) [30] and, as shown in Fig. 5D, this substitution eliminates all van der Waals interactions (only one hydrogen bond from Lys95 is retained). The double-stranded stem is important for two reasons: first, it restricts the conformation of the RNA loop and reduces the entropy loss accompanying protein binding; and second, some structural features of the RNA, such as the base pair (U1A and U2B″) or loop E (nucleolin) that closes the RNA loop, are crucial for positioning the RRM onto the RNA. It was postulated that the RNA structure is essential because it induces conformational changes in order to reach the bound state [27,40].

Role of additional RRMs

The combination of two or more RRM domains allows the continuous recognition of a long nucleotide sequence (8–10 nucleotides), often drastically increasing the affinity (Kd in the nM range or lower). As shown previously, the β-sheet surface can bind up to four
nucleotides and up to six if the loops contribute extensively to binding (S. D. Auweter and F. H.-T. Allain, unpublished data). Thus, recognition of a longer single-stranded DNA or RNA requires more than one RRM to form a larger binding platform.

Fig. 5. Role of the RNA secondary structure in RRM binding. (A) U1A spliceosomal protein–U1 snRNA hairpin II complex [26]. (B) U2B″–U2A′ protein complex bound to U2 snRNA hairpin IV [32]. (C) Nucleolin–sNRE complex [28]. The loop E motif is composed of a sheared G5-A18 base pair, an A6-U17-G16 triple and a symmetric (trans-Hoogsteen) locally parallel A7-A15 base pair. (D) Nucleolin–b2NRE complex with the loop E motif substituted by a bulge (U15 between two GC base pairs) [30]. The color schemes are the same as in Fig. 4, except that the protein loops and the C-terminus are shown in blue. This figure was generated with the program MOLMOL [56].

Four structures of two consecutive RRMs in complex with RNA (sex-lethal [31], HuD [35], PABP [33] and nucleolin [28,30]) and one with DNA (hnRNPA1 [34]) have been determined. In all five cases, the two RRMs and the interdomain linker cooperatively bind the nucleic acid, providing high affinity and specificity. In the free forms of sex-lethal and nucleolin, the linkers are disordered and the two RRM domains tumble independently [37,41]. In some cases (PABP, nucleolin), the interdomain linker (that is, the C-terminal region of the N-terminal RRM as described above) acts as a bridge, mediating the cooperative binding of two RRM domains with the RNA. More interesting is the range of new possible conformations provided by the association of two RRMs (Fig. 6). In PABP, a large binding platform is created for the RNA; in sex-lethal and HuD, the two RRMs form a cleft in which the RNA lies; and in nucleolin the RNA is sandwiched between the RRMs. As a consequence of the relative arrangement of the two domains in sex-lethal, HuD
and nucleolin, several intra-RNA interactions are created upon RNA binding that contribute to the overall enthalpy of the complex, while in PABP almost no intra-RNA interactions are present. On the contrary, hnRNPA1 RRMs 1–2 and PTB RRMs 3–4 (F. C. Oberstrass, S. D. Auweter and F. H.-T. Allain, unpublished results) are arranged in such a way that only distantly located RNA sequences of the same RNA can bind simultaneously to both RRMs. These totally opposite topologies might reflect the opposite function of the various RRM proteins, as both sex-lethal and HuD are splicing activators, while hnRNPA1 and PTB are splicing repressors [42].

Fig. 6. The RRM–RRM interactions. Several protein structures, either free or in a complex, in which two RRM domains interact are shown. Structures of (A) UP1 in the free form [53] (pdb:1 lp1), (B) nucleolin in complex with RNA [28] (pdb:1fje), (C) sex-lethal in complex with RNA [31] (pdb:1b7f), (D) PABP in complex with RNA [33] (pdb:1cvj), and (E) U1A homodimer in complex with RNA [29] (pdb:1dz5). The RNA backbone is shown in yellow (A–E), the N-terminal RRM domain is displayed green, C-terminal domain blue, and linker region red. (F) One monomer of U1A is displayed green and the other blue. In all cases, important residues for the protein–protein interaction are displayed as balls and sticks. This figure was generated using the programs MOLSCRIPT and RASTER3D [57,58].

The RRM, also a protein–protein interaction domain

Over the last few years, biochemical and structural studies have shown that the RRM is not only involved in RNA recognition but also in protein–protein interaction. In addition to structures of multiple RRM-containing proteins as described in the previous section, structures of RRM domains in complex with various proteins or domains have been
solved [32,43–51]. Analysis of these structures shows that protein recognition by RRM domains is very diverse, with no general mechanism emerging. For clarity, we distinguish three main classes of RRM–protein interactions: between two RRMs, between an RNA-binding RRM and a non-RRM protein, and finally between RRMs that do not bind RNA and another protein.

Protein interaction involving two RRM domains

The first structure showing an interaction between two RRMs is the N-terminal region of hnRNPA1 (UP1) in its free form, which contains two RRM domains separated by a short linker [52,53]. The two RRMs form a compact fold and interact with each other via their α2-helices. The interaction is stabilized by two salt bridges connecting two arginines of the first RRM and two aspartic acids of the second (Fig. 6A). This arrangement positions the β-sheets of both domains adjacently, forming an extended surface of eight β-strands. Similarly, PTB RRMs 3 and 4, separated by a 24-residue linker region, do not tumble independently in the free form (F. C. Oberstrass, S. D. Auweter and F. H.-T. Allain, unpublished data).

These RRM–RRM interactions are not a general feature of all RRM proteins. In free sex-lethal and nucleolin, the linker is flexible and the two RRM domains are independent [28,41]. However, upon RNA binding, the two RRM domains adopt a fixed orientation and contact each other. In the nucleolin structure, the RRMs interact via two salt bridges located in the loops (Fig. 6B), and in the structure of hnRNPA1, the RRMs interact by salt bridges located in the α2-helix. Other examples of RNA inducing RRM–RRM interactions have also been described in the case of sex-lethal [31], PABP [33], and HuD [35]. In sex-lethal and HuD, the interdomain interaction is mainly governed by two hydrogen bonds between residues located in β1 and β4 of one RRM and in β2 of the other (Fig. 6C). Furthermore, additional contacts between an RRM and the linker region are observed. In the case of PABP, the interdomain
interactions are mediated through many salt bridges and van der Waals contacts between α2 and β4 of RRM 1 and β2 and α1 of RRM 2, respectively (Fig. 6D).

Another interesting example of RRM–RRM interaction is found in the structure of the N-terminal RRM domain of the U1A protein in complex with the polyadenylation inhibition element (PIE) RNA [29]. In this case, two U1A proteins bind cooperatively to the PIE RNA [54]. The structure shows that, when bound to RNA, the U1A RRM forms a homodimer stabilized by interactions between the two α-helical C-termini (Fig. 6E). On one side, the C-terminal α-helix contains charged residues that interact with the RNA, and on the opposite side it contains hydrophobic residues that constitute the dimer interface.

All of these structures clearly show that RRM domains can be involved in RRM–RRM interactions in addition to RNA binding. In most of these complexes, these additional interactions contribute to the formation of a larger RNA-binding interface and are therefore critical to reach high RNA-binding affinity and specificity. This feature is likely to be frequently found in multiple RRM-containing proteins, especially if the interdomain linker is short.

Protein interaction involving one RRM domain and another domain

In some cases, it has been demonstrated that RRM-containing proteins can associate with RNA only in the presence of another protein that acts as a cofactor. Both U2B″ and CBP20 need a cofactor, U2A′ and CBP80, respectively, to recognize RNA. Ternary structures of these complexes have been solved that partially explain the importance of a cofactor in RNA–RRM binding [32,43–45]. U2A′ consists of five consecutive leucine-rich repeats, and CBP80 of three helical hairpin repeats very similar to the fold of the middle domain of translation initiation factor 4G (MIF4G). In both cases, the RRM domains of U2B″ and CBP20 interact with the leucine-rich repeat (LRR) motif or the MIF4G domain through their α-helices and loop 4,
keeping the β-sheet accessible for RNA binding (Fig. 7). The interactions, however, are different, as they are governed mainly by hydrophobic contacts in the U2B″–U2A′ complex, and by salt bridges and hydrogen bonds in the CBP20–CBP80 complex. Furthermore, in the case of CBP20, the N- and C-terminal extensions flanking the RRM domain become structured only when in complex with both RNA and CBP80. As for RRM–RRM interactions, these RRM–protein interactions contribute to RNA-binding specificity, with U2A′ contacting the RNA and CBP80 stabilizing both the N- and C-termini of the CBP20 RRM, two key components of CBP20–RNA recognition (Fig. 4) [44].

Fig. 7. The RRM–protein–RNA trimolecular complexes. (A) The U2B″–U2A′–RNA ternary complex [32]. (B) The CBP20–CBP80–RNA complex [36]. The RNA is shown in yellow, the RRM domain in green, and leucine-rich repeats or MIF4G domains in blue. Residues important for the interaction are displayed as balls and sticks. This figure was generated using the programs MOLSCRIPT and RASTER3D [57,58].

Fig. 8. The Y14–Magoh complex [48]. Y14 is shown in green, and Magoh is shown in blue. The RNP 1 and RNP 2 of Y14 are shown in red. This figure was generated using the programs MOLSCRIPT and RASTER3D [57,58].

RRM domains involved only in protein recognition

Some proteins containing RRM domains are involved in protein–protein but not in protein–RNA interactions. Recently, three-dimensional structures of such proteins in complex partially explained this unexpected behavior of the RRM domain. Two different situations, however, have been reported. In one case, the protein interaction involves the β-sheet of the RRM domain, thus preventing RNA binding, as in the Y14–Magoh complex [46–49] or the UPF2–UPF3 complex [50]. In a second case, the interaction is mediated through the α-helices, leaving the β-sheet solvent-exposed and therefore theoretically able to
bind RNA, as with the U2AF35–U2AF65 [51] and the U2AF65–SF1 complexes [46]. In this latter case, it was postulated that the particular behavior of these RRM domains is due mainly to the identity of the amino acids on the surface of the β-sheet (see below [25]).

Y14 and Magoh proteins are part of the exon junction complex, which comprises several proteins. Y14 and Magoh form a highly stable complex with nanomolar binding affinity [48]. The C-terminal domain of Y14 has a typical RRM fold, and the RNP 1 and RNP 2 amino acid sequences of Y14 are very similar to those of other RRM domains (Fig. 1). However, Y14 does not bind RNA. Structures of the Y14–Magoh heterodimer show that Y14 binds Magoh through its entire β-sheet [46–48] (Fig. 8). This particular mode of complex formation neatly explains why some RRM domains do not have RNA-binding activity. Similarly, in the structure of the UPF2–UPF3 complex involved in nonsense-mediated mRNA decay, the β-sheet of the N-terminal RRM domain of UPF3 binds UPF2 [50]. Although the two RRM proteins both interact through their β-sheet, their interacting proteins, Magoh and UPF2, adopt completely different folds. UPF2 has an entirely α-helical MIF4G fold very similar to CBP80, while Magoh has an αβ fold (Fig. 8). Also striking is the fact that both UPF2 and CBP80 adopt a MIF4G fold but recognize the RRM in a totally different manner, UPF2 recognizing the RRM β-sheet and CBP80 the RRM α-helices.

The structures of the splicing factors U2AF35–U2AF65 and U2AF65–SF1 are another example of the diversity encountered in protein–RRM recognition. U2AF65 contains three RRM domains, the two N-terminal domains binding RNA while the C-terminal domain mediates SF1 interaction. U2AF35 contains a central RRM domain flanked by two zinc finger domains. The structures of the U2AF35 RRM in complex with the N-terminal domain of U2AF65 and of the RRM of U2AF65 in complex with the N-terminal
domain of SF1 have been solved [46,51]. Surprisingly, in this case, the β-sheet of the RRM domain is not implicated in protein interaction as it is for other non-RNA-binding RRM domains; instead, the interaction involves the two α-helices. Analysis of the RRM fold in these two structures shows striking differences from the canonical RRM domains, mainly a longer helix α1 (Fig. 2) and the absence of aromatic residues in the RNP 1 and RNP 2 motifs. The authors therefore proposed a novel class of protein recognition motif that they named the U2AF homology motif (UHM) [25].

The examples described above define a novel class of RRM domains that are involved in protein but not RNA interactions, suggesting that RRM domains might have evolved from RNA to protein recognition. Although these RRM proteins do not bind RNA, they are all implicated in RNA-related functions such as recognition of the exon junction (Y14), mRNA decay (UPF3) or pre-mRNA splicing (U2AF35 and U2AF65). This evolutionary process can be accompanied by amino acid substitutions in the RNA-binding regions, namely RNP 1 and 2, as proposed for the UHM domain. However, in the case of Y14 and UPF3, it is not entirely clear why these RRM domains, which are very similar to the classical ones, favor interaction with proteins rather than RNA.

Conclusion and perspectives

The RNA recognition motif is an abundant and very diverse protein motif found mainly in eukaryotes. Analysis of the structures of this domain in the free form as well as in complex with both RNA and proteins shows that this small domain is extremely diverse in terms of both structure and function. We are only beginning to understand the structural, functional and evolutionary aspects of this domain. It is now clear that the original perception of the RRM as a simple rigid RNA-binding domain must evolve and that further biochemical and structural studies are needed to obtain a full picture of its role in the cell. Structures of RRM domains in complex with different RNAs show that this
small compact domain is a central component of RNA recognition but not the only determinant. N- and C-terminal extensions, multiplication of RRM domains or protein cofactors can play an important role in RNA-binding specificity.

This review also raises many questions concerning this domain. First, concerning RNA binding, analysis of the different structures shows that although some conserved aromatic residues are always found at the interface, the topology of the bound RNA is quite different in each complex and the sequence specificity cannot easily be predicted. Thus, more structures of RRM–RNA complexes are needed to fully understand the determinants of this specificity. Second, RRM domains are able to bind RNA with affinities ranging from very high to weak, and the structural and thermodynamic determinants of the RNA-binding affinity still need to be elucidated. Third, as it is now demonstrated that some RRM domains are specific to protein recognition rather than RNA binding, which of the identified RRM domains are true RNA-binding domains and which ones are not?
In some cases, the primary sequence can differentiate between these behaviors, as for the novel UHM domain, but in other cases, such as Y14 and UPF3, structural determinants other than the amino acid sequence must be present but are still unknown and need to be identified. Fourth, it is established that a large number of proteins contain both RRM and auxiliary domains, such as zinc fingers, also involved in nucleic acid binding. No structural studies, however, indicate whether these two RNA-binding domains within the same protein influence each other for RNA binding. Finally, it has recently been discovered that the RRM domain, for a long time thought to belong exclusively to the eukaryotic world, is also present in bacteria, viruses and mitochondria. From an evolutionary point of view, it would be very interesting to investigate the function of this domain in such organisms and maybe discover their common ancestor.

In conclusion, further structural investigations on RRM domains, possibly coupled with thermodynamic and kinetic studies, are still needed to confirm present hypotheses and possibly to reveal more surprises.

Acknowledgements

The authors would like to acknowledge the financial support of the Fondation Schlumberger pour l’Education et la Recherche (postdoctoral fellowship), the Swiss National Science Foundation (Nr 31–67098.01), the Roche Research Fund for Biology at the ETH Zurich and the SNF NCCR structural biology to FHTA.

References

1 Dreyfuss G, Swanson MS & Pinol-Roma S (1988) Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation. Trends Biochem Sci 13, 86–91.
2 Adam SA, Nakagawa T, Swanson MS, Woodruff TK & Dreyfuss G (1986) mRNA polyadenylate-binding protein: gene isolation and sequencing and identification of a ribonucleoprotein consensus sequence. Mol Cell Biol 6, 2932–2943.
3 Swanson MS, Nakagawa TY, LeVan K & Dreyfuss G (1987) Primary structure of human nuclear ribonucleoprotein
particle C proteins: conservation of sequence and domain structures in heterogeneous nuclear RNA, mRNA, and pre-rRNA-binding proteins. Mol Cell Biol 7, 1731–1739.
4 Bandziulis RJ, Swanson MS & Dreyfuss G (1989) RNA-binding proteins as developmental regulators. Genes Dev 3, 431–437.
5 Kenan DJ, Query CC & Keene JD (1991) RNA recognition: towards identifying determinants of specificity. Trends Biochem Sci 16, 214–220.
6 Birney E, Kumar S & Krainer AR (1993) Analysis of the RNA-recognition motif and RS and RGG domains: conservation in metazoan pre-mRNA splicing factors. Nucleic Acids Res 21, 5803–5816.
7 Maruyama K, Sato N & Ohta N (1999) Conservation of structure and cold-regulation of RNA-binding proteins in cyanobacteria: probable convergent evolution with eukaryotic glycine-rich RNA-binding proteins. Nucleic Acids Res 27, 2029–2036.
8 Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M & Sonnhammer EL (2002) The Pfam protein families database. Nucleic Acids Res 30, 276–280.
9 Hudson BP, Martinez-Yamout MA, Dyson HJ & Wright PE (2004) Recognition of the mRNA AU-rich element by the zinc finger domain of TIS11d. Nat Struct Mol Biol 11, 257–264.
10 De Guzman RN, Wu ZR, Stalling CC, Pappalardo L, Borer PN & Summers MF (1998) Structure of the HIV-1 nucleocapsid protein bound to the SL3 psi-RNA recognition element. Science 279, 384–388.
11 Sudol M, Sliwa K & Russo T (2001) Functions of WW domains in the nucleus. FEBS Lett 490, 190–195.
12 Roy G, De Crescenzo G, Khaleghpour K, Kahvejian A, O’Connor-McCourt M & Sonenberg N (2002) Paip1 interacts with poly(A) binding protein through two independent binding motifs. Mol Cell Biol 22, 3769–3782.
13 Kozlov G, Trempe JF, Khaleghpour K, Kahvejian A, Ekiel I & Gehring K (2001) Structure and function of the C-terminal PABC domain of human poly(A)-binding protein. Proc Natl Acad Sci USA 98, 4409–4413.
14 Lin KT, Lu RM &
Tarn WY (2004) The WW domain-containing proteins interact with the early spliceosome and participate in pre-mRNA splicing in vivo. Mol Cell Biol 24, 9176–9185.
15 Schuster G & Gruissem W (1991) Chloroplast mRNA 3′ end processing requires a nuclear-encoded RNA-binding protein. EMBO J 10, 1493–1502.
16 Vermel M, Guermann B, Delage L, Grienenberger JM, Marechal-Drouard L & Gualberto JM (2002) A family of RRM-type RNA-binding proteins specific to plant mitochondria. Proc Natl Acad Sci USA 99, 5866–5871.
17 Nagai K, Oubridge C, Jessen TH, Li J & Evans PR (1990) Crystal structure of the RNA-binding domain of the U1 small nuclear ribonucleoprotein A. Nature 348, 515–520.
18 Liker E, Fernandez E, Izaurralde E & Conti E (2000) The structure of the mRNA export factor TAP reveals a cis arrangement of a non-canonical RNP domain and an LRR domain. EMBO J 19, 5587–5598.
19 Perez-Alvarado GC, Martinez-Yamout M, Allen MM, Grosschedl R, Dyson HJ & Wright PE (2003) Structure of the nuclear factor ALY: insights into post-transcriptional regulatory and mRNA nuclear export processes. Biochemistry 42, 7348–7357.
20 Jacks A, Babon J, Kelly G, Manolaridis I, Cary PD, Curry S & Conte MR (2003) Structure of the C-terminal domain of human La protein reveals a novel RNA recognition motif coupled to a helical nuclear retention element. Structure (Camb) 11, 833–843.
21 Avis JM, Allain FH, Howe PW, Varani G, Nagai K & Neuhaus D (1996) Solution structure of the N-terminal RNP domain of U1A protein: the role of C-terminal residues in structure stability and RNA binding. J Mol Biol 257, 398–411.
22 Perez Canadillas JM & Varani G (2003) Recognition of GU-rich polyadenylation regulatory elements by human CstF-64 protein. EMBO J 22, 2821–2830.
23 Conte MR, Grune T, Ghuman J, Kelly G, Ladas A, Matthews S & Curry S (2000) Structure of tandem RNA recognition motifs from polypyrimidine tract binding protein reveals novel features of the RRM fold. EMBO J 19, 3132–3141.
24 Simpson PJ, Monie TP, Szendroi A, Davydova N, Tyzack JK,
Conte MR, Read CM, Cary PD, Svergun DI, Konarev PV, Curry S & Matthews S (2004) Structure and RNA interactions of the N-terminal RRM domains of PTB. Structure (Camb) 12, 1631–1643.
25 Kielkopf CL, Lucke S & Green MR (2004) U2AF homology motifs: protein recognition in the RRM world. Genes Dev 18, 1513–1526.
26 Oubridge C, Ito N, Evans PR, Teo CH & Nagai K (1994) Crystal structure at 1.92 Å resolution of the RNA-binding domain of the U1A spliceosomal protein complexed with an RNA hairpin. Nature 372, 432–438.
27 Allain FH, Gubser CC, Howe PW, Nagai K, Neuhaus D & Varani G (1996) Specificity of ribonucleoprotein interaction determined by RNA folding during complex formulation. Nature 380, 646–650.
28 Allain FH, Bouvet P, Dieckmann T & Feigon J (2000) Molecular basis of sequence-specific recognition of pre-ribosomal RNA by nucleolin. EMBO J 19, 6870–6881.
29 Varani L, Gunderson SI, Mattaj IW, Kay LE, Neuhaus D & Varani G (2000) The NMR structure of the 38 kDa U1A protein – PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nat Struct Biol 7, 329–335.
30 Johansson C, Finger LD, Trantirek L, Mueller TD, Kim S, Laird-Offringa IA & Feigon J (2004) Solution structure of the complex formed by the two N-terminal RNA-binding domains of nucleolin and a pre-rRNA target. J Mol Biol 337, 799–816.
31 Handa N, Nureki O, Kurimoto K, Kim I, Sakamoto H, Shimura Y, Muto Y & Yokoyama S (1999) Structural basis for recognition of the tra mRNA precursor by the Sex-lethal protein. Nature 398, 579–585.
32 Price SR, Evans PR & Nagai K (1998) Crystal structure of the spliceosomal U2B″-U2A′ protein complex bound to a fragment of U2 small nuclear RNA. Nature 394, 645–650.
33 Deo RC, Bonanno JB, Sonenberg N & Burley SK (1999) Recognition of polyadenylate RNA by the poly(A)-binding protein. Cell 98, 835–845.
34 Ding J, Hayashi MK, Zhang Y, Manche L, Krainer AR & Xu RM (1999) Crystal structure of the two-RRM domain
of hnRNP A1 (UP1) complexed with single-stranded telomeric DNA. Genes Dev 13, 1102–1115.
35 Wang X & Tanaka Hall TM (2001) Structural basis for recognition of AU-rich element RNA by the HuD protein. Nat Struct Biol 8, 141–145.
36 Mazza C, Segref A, Mattaj IW & Cusack S (2002) Large-scale induced fit recognition of an m(7)GpppG cap analogue by the human nuclear cap-binding complex. EMBO J 21, 5548–5557.
37 Allain FH, Gilbert DE, Bouvet P & Feigon J (2000) Solution structure of the two N-terminal RNA-binding domains of nucleolin and NMR study of the interaction with its RNA target. J Mol Biol 303, 227–241.
38 Allers J & Shamoo Y (2001) Structure-based analysis of protein–RNA interactions using the program ENTANGLE. J Mol Biol 311, 75–86.
39 Varani G & Nagai K (1998) RNA recognition by RNP proteins during RNA processing. Annu Rev Biophys Biomol Struct 27, 407–445.
40 Showalter SA & Hall KB (2004) Altering the RNA-binding mode of the U1A RBD1 protein. J Mol Biol 335, 465–480.
41 Crowder SM, Kanaar R, Rio DC & Alber T (1999) Absence of interdomain contacts in the crystal structure of the RNA recognition motifs of Sex-lethal. Proc Natl Acad Sci USA 96, 4892–4897.
42 Grabowski PJ & Black DL (2001) Alternative RNA splicing in the nervous system. Prog Neurobiol 65, 289–308.
43 Mazza C, Ohno M, Segref A, Mattaj IW & Cusack S (2001) Crystal structure of the human nuclear cap binding complex. Mol Cell 8, 383–396.
44 Mazza C, Segref A, Mattaj IW & Cusack S (2002) Co-crystallization of the human nuclear cap-binding complex with a m7GpppG cap analogue using protein engineering. Acta Crystallogr D Biol Crystallogr 58, 2194–2197.
45 Calero G, Wilson KF, Ly T, Rios-Steiner JL, Clardy JC & Cerione RA (2002) Structural basis of m7GpppG binding to the nuclear cap-binding protein complex. Nat Struct Biol 9, 912–917.
46 Selenko P, Gregorovic G, Sprangers R, Stier G, Rhani Z, Kramer A & Sattler M (2003) Structural basis for the molecular recognition between human
splicing factors U2AF65 and SF1/mBBP. Mol Cell 11, 965–976.
47 Fribourg S, Gatfield D, Izaurralde E & Conti E (2003) A novel mode of RBD-protein recognition in the Y14-Mago complex. Nat Struct Biol 10, 433–439.
48 Lau CK, Diem MD, Dreyfuss G & Van Duyne GD (2003) Structure of the Y14-Magoh core of the exon junction complex. Curr Biol 13, 933–941.
49 Bono F, Ebert J, Unterholzner L, Guttler T, Izaurralde E & Conti E (2004) Molecular insights into the interaction of PYM with the Mago-Y14 core of the exon junction complex. EMBO Rep 5, 304–310.
50 Kadlec J, Izaurralde E & Cusack S (2004) The structural basis for the interaction between nonsense-mediated mRNA decay factors UPF2 and UPF3. Nat Struct Mol Biol 11, 330–337.
51 Kielkopf CL, Rodionova NA, Green MR & Burley SK (2001) A novel peptide recognition mode revealed by the X-ray structure of a core U2AF35/U2AF65 heterodimer. Cell 106, 595–605.
52 Xu RM, Jokhan L, Cheng X, Mayeda A & Krainer AR (1997) Crystal structure of human UP1, the domain of hnRNP A1 that contains two RNA-recognition motifs. Structure 5, 559–570.
53 Shamoo Y, Krueger U, Rice LM, Williams KR & Steitz TA (1997) Crystal structure of the two RNA binding domains of human hnRNP A1 at 1.75 Å resolution. Nat Struct Biol 4, 215–222.
54 van Gelder CW, Gunderson SI, Jansen EJ, Boelens WC, Polycarpou-Schwarz M, Mattaj IW & van Venrooij WJ (1993) A complex secondary structure in U1A pre-mRNA that binds two molecules of U1A protein is required for regulation of polyadenylation. EMBO J 12, 5191–5200.
55 Thompson JD, Higgins DG & Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
56 Koradi R, Billeter M & Wuthrich K (1996) MOLMOL: a program for display and analysis of macromolecular structures. J Mol Graph 51–5, 29–32.
57 Kraulis PJ (1991) MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr 24, 946–950.
58 Merritt EA & Murphy MEP (1994) Raster3D, Version 2.0: a program for photorealistic molecular graphics. Acta Crystallogr D Biol Crystallogr 50, 869–873.