Scientific report: Looking for the ancestry of the heavy-chain subunits of heteromeric amino acid transporters rBAT and 4F2hc within the GH13 α-amylase family



Looking for the ancestry of the heavy-chain subunits of heteromeric amino acid transporters rBAT and 4F2hc within the GH13 α-amylase family

Marek Gabriško (1) and Štefan Janeček (1,2)

1 Institute of Molecular Biology, Slovak Academy of Sciences, Bratislava, Slovakia
2 Department of Biotechnology, Faculty of Natural Sciences, University of SS Cyril and Methodius, Trnava, Slovakia

Keywords: 4F2hc; evolutionary relatedness; oligo-1,6-glucosidase subfamily; rBAT; α-amylase family

Correspondence: Š. Janeček, Institute of Molecular Biology, Slovak Academy of Sciences, Dúbravská cesta 21, SK-84551 Bratislava, Slovakia. Fax: +421 59307416. Tel: +421 59307420. E-mail: Stefan.Janecek@savba.sk

(Received 15 July 2009, revised 18 September 2009, accepted 12 October 2009)

doi:10.1111/j.1742-4658.2009.07434.x

In an effort to shed more light on the early evolutionary history of the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) rBAT and 4F2hc within the α-amylase family GH13, a bioinformatics study was undertaken. The focus of the study was on a detailed sequence comparison of rBAT and 4F2hc proteins, from as wide a taxonomic spectrum as possible, with enzyme specificities from the α-amylase family. The GH13 enzymes were selected from the so-called GH13 oligo-1,6-glucosidase and neopullulanase subfamilies, which represent the α-amylase family enzyme groups most closely related to hcHATs. Within this study, more than 30 hcHAT-like proteins, designated here as hcHAT1 and hcHAT2 groups, were identified in basal Metazoa. Of the GH13 catalytic triad, only the catalytic nucleophile (aspartic acid 199 of the oligo-1,6-glucosidase) could have its counterpart in some 4F2hc proteins, whereas most rBATs contain the correspondences for the entire GH13 catalytic triad. Moreover, the 4F2hc proteins lack not only domain B typical for GH13 enzymes, but also a stretch of approximately 40 amino acid residues succeeding the β4-strand of the catalytic TIM barrel. rBATs have the entire domain B as well as a
longer loop 4. The higher sequence–structural similarity between rBATs and GH13 enzymes was reflected in the evolutionary tree. At present it is necessary to consider two different scenarios on how the chordate rBAT and 4F2hc proteins might have evolved. The GH13-like protein from the cnidarian Nematostella vectensis might nowadays represent a protein close to the eventual ancestor of the hcHAT proteins within the GH13 family.

Abbreviations: ATG, amino acid transporter glycoprotein; CSR, conserved sequence regions; GH, glycoside hydrolase; HAT, heteromeric amino acid transporter; hcHAT, heavy-chain subunits of heteromeric amino acid transporter; SLC, solute carrier.

M. Gabriško and Š. Janeček. Origin of rBAT and 4F2hc within the GH13 α-amylase family. FEBS Journal 276 (2009) 7265–7278 © 2009 The Authors. Journal compilation © 2009 FEBS

Introduction

To fulfil its metabolic needs, a cell uses specialized transport proteins to perform and control the uptake and efflux of crucial compounds (e.g. sugars, amino acids, nucleotides, inorganic ions and drugs) across the plasma membrane. These proteins have been classified into the phylogenetically derived solute carrier (SLC) families; the current classification counts almost 50 SLC families [1,2]. The sequence similarity between the heavy-chain subunits of heteromeric amino acid transporters (hcHATs) and the α-glucosidases from the α-amylase family [3] was first recognized more than 15 years ago [4]. HATs are proteins composed of a light subunit (SLC7 members) and a heavy subunit (known as rBAT or 4F2hc; SLC3 members), connected by a disulfide bridge [2]. Because of their significance in human pathology (their defects lead to primary inherited aminoacidurias, e.g. failed renal reabsorption of amino acids), HATs have attracted much attention in medical studies (e.g. [2,5–7]).

The light subunit is a nonglycosylated hydrophobic 12-helix transmembrane protein, whereas the heavy subunit is a type II membrane N-glycoprotein with an
intracellular N-terminal end, a single transmembrane region and a large extracellular C-terminal domain [2]. It is the light subunit that possesses the amino acid transport activity, although without interacting with the heavy subunit it is unable to reach the plasma membrane. Thus, the role of the heavy subunit is to recognize the light subunit and to chaperone it to a proper position in the plasma membrane, i.e. this subunit is not absolutely necessary for the transport activity [8], but interestingly its C-terminal extracellular domain exhibits sequence similarities to the α-amylase family enzymes [4,9].

The α-amylase family [3] forms, in the sequence-based classification of glycoside hydrolases (GHs), the GH-H clan [10], consisting of three GH families: GH13, GH70 and GH77. These enzymes (approximately 30 different EC numbers) should satisfy the following requirements: (a) the catalytic domain is formed by the (β/α)8 barrel fold (i.e. TIM barrel) with a small distinct domain B protruding out from the barrel between the β3-strand and the α3-helix; (b) the catalytic machinery consists of the β4-strand aspartate (nucleophile), β5-strand glutamate (proton donor) and β7-strand aspartate (transition-state stabilizer); (c) the enzymes employ a retaining reaction mechanism; and (d) sequences contain between four and seven conserved sequence regions (CSRs) covering mainly the β-strands of the catalytic TIM barrel [3,11–13].

Of the three GH families of the clan GH-H, it was the family GH13 that was originally established as the α-amylase family [14–17]. At present it belongs to the largest families in the entire classification of GHs [10]. Although the overall sequence identity within GH13 is extremely low [13], it contains several groups of enzymes exhibiting a higher degree of mutual sequence similarity, so that the family has recently been divided into subfamilies [18]. Of these, the best resemblance to hcHATs was revealed for the members of the so-called oligo-1,6-glucosidase
subfamily [9,19,20]. This was recently confirmed by solving the three-dimensional structure of the C-terminal domain of 4F2hc [21], which most resembles the oligo-1,6-glucosidase from Bacillus cereus [22] and α-glucosidase from Geobacillus sp. HTA-462 [23].

In hcHATs, the regions of similarity cover the sequence segments within the C-terminal extracellular domain. The segments correspond, in fact, with some of the α-amylase family CSRs, namely the β-strands β2, β3, β4 and β8 of the (β/α)8 barrel domain, and for rBAT also with the short stretch near the C-terminus of domain B [9,19]. From the sequence–structure point of view, the basic difference between rBAT and 4F2hc is that rBAT possesses the segment that corresponds with domain B of GH13 enzymes, whereas 4F2hc does not have it [9,19,24,25].

The main goal of the present study was to investigate further the resemblance between hcHAT proteins and the enzymes from the α-amylase family. We therefore carried out a bioinformatics study focused on a detailed comparison of all available rBAT and 4F2hc sequences with GH13 enzyme representatives covering mainly the oligo-1,6-glucosidase subfamily. This could help to elucidate the origin of the hcHAT proteins within the GH13 α-amylase enzyme family, as well as shed some light on the possible evolutionary events leading to separation of the heavy-chain subunit of these amino acid transporters from the enzymes involved in the metabolism of starch and related saccharides.

Results and Discussion

Evolutionary relationships and sequence–structural comparison

This study delivers the in silico analysis of 134 sequences consisting of 92 hcHAT proteins (representing known rBATs and 4F2hc proteins as well as their newly identified putative homologues) and 42 GH13 enzymes (including four GH13-like sequences) (Table 1). Their global multiple sequence alignment (not shown) covers: (a) the N-terminal region, transmembrane segment, central TIM barrel domain, including domain B and the
C-terminal domain C for rBAT proteins (669 residues on average); (b) the catalytic TIM barrel domain, including domain B and the C-terminal domain C for GH13 enzymes (572 residues on average); and (c) the N-terminal region, transmembrane segment, central TIM barrel domain and the C-terminal domain C for 4F2hc proteins (542 residues on average). The length of the entire amino acid sequence alignment was 1099 positions, but it should be taken into account that, if the gaps are excluded, the overall number of comparable positions would be fewer than 100.

Figure 1 illustrates the evolutionary relationships between the studied hcHAT proteins and GH13 enzymes from the α-amylase family. The tree was calculated using the neighbour-joining method [26]. Other approaches, such as maximum likelihood [27], maximum parsimony [28], minimum evolution [29] and UPGMA [30], were also used, but they delivered basically similar topologies (not shown).

The two main groups of hcHATs (Fig. 1), i.e. those of rBAT and 4F2hc, form their own clusters within which taxonomy is respected: (a) for the rBATs from human via representatives of mammals, birds, lizard, frogs and fishes to Urochordata (sea squirts) and Cephalochordata (lancelet); and (b) for the 4F2hc proteins from human via mammals, perhaps omitting birds (as it is not found in chicken and zebra finch), lizard, frogs, fishes and platypus to Petromyzon (sea lamprey), Urochordata (sea squirts) and even Ixodes (tick). What is also clear is the grouping of the GH13 enzymes, which cover: (a) the representatives of the
individual enzyme specificities from both the oligo-1,6-glucosidase and neopullulanase GH13 subfamilies [20]; and (b) the additional GH13 α-glucosidases from fungi (yeast), insects and thermophilic (and soil) bacteria. It is worth mentioning that the fungal (yeast) α-glucosidases are clustered with their counterparts from bacilli and the closely related specificities, such as oligo-1,6-glucosidase, dextran glucosidase, trehalose-6-phosphate hydrolase and isomaltulose synthase, whereas the representatives of trehalose synthase, amylosucrase and sucrose phosphorylase share the branch leading also to members of the neopullulanase GH13 subfamily together with the intermediary enzymes (Fig. 1). The overall arrangement of the tree is that the clusters of true rBAT and 4F2hc proteins are separated from each other by the GH13 enzymes.

All remaining sequences (except those from nematodes) that could not be classified as true rBAT or true 4F2hc proteins were first designated as hcHAT-like proteins. Then, based on an approximate alignment, which served to construct a preliminary evolutionary tree, these hcHAT-like proteins were divided into hcHAT1 and hcHAT2 groups (Table 1). It is worth mentioning that most of them are hypothetical proteins that in some cases were retrieved from recent complete genome sequencing projects containing raw sequence data still without appropriate annotation. Most hcHAT1 proteins cover the insects and, in a wider sense, the Arthropoda (Daphnia), which are completed by Cephalochordata and Echinodermata (both Deuterostomia) and one representative from Cnidaria (Nematostella). The group of hcHAT2 proteins also consists of Arthropoda, i.e. insects accompanied by Daphnia and Ixodes, and two representatives of schistosomes. Interestingly, although present in the subgenus Drosophila, hcHAT2 proteins seem to be lacking in the melanogaster group (subgenus Sophophora).

With regard to hcHAT1 from Aedes aegypti [31] and Drosophila
melanogaster [32], these two proteins have already been experimentally confirmed as heavy-chain subunits (CD98hc, i.e. 4F2hc) in the amino acid transporter system analogous to that known in mammals [2,21]. A similar observation was reported for the SPRM1hc from Schistosoma mansoni [33], which in the present study is classified in the hcHAT2 group (Table 1). Obviously, although the hcHAT1 and hcHAT2 groups remain independent of each other, both seem to be more closely related to typical 4F2hc proteins than to rBATs (Fig. 1).

Concerning the above-mentioned hcHAT sequences from nematodes, these proteins from Caenorhabditis elegans [34] have been named amino acid transporter glycoproteins (ATG). Of the two groups, ATG1 and ATG2 (Table 1), the relevant light chains exhibited the transporter function only when combined with ATG2 [34]. From the evolutionary tree (Fig. 1), both ATG clusters (ATG1 and ATG2) from all studied nematodes could represent a counterpart group to hcHAT2 proteins.

As far as the sequence similarities and differences between the hcHAT proteins and GH13 enzymes are concerned, the basic feature discriminating the 4F2hc proteins from both rBATs and GH13 enzymes is the lack of domain B protruding out of the TIM barrel in the place of the loop connecting the β3-strand to the α3-helix [9,21]. The sharing of domain B by rBATs and GH13 enzymes, and especially the sequence of the fifth CSR (QPDLN for both human rBAT and Bacillus cereus oligo-1,6-glucosidase) [20] (Fig. 2), may indicate a shorter evolutionary distance for rBATs from the GH13 ancestor common to both rBAT and 4F2hc proteins. Complete domain B with well-conserved β-strands is also present in hcHAT1 proteins. In all other groups, this domain is more or less distorted, culminating in complete loss in 4F2hc proteins. The presence of full GH13 domain B in hcHAT1 and the absence of its parts in hcHAT2 indicate the eventual intermediary or primordial character of both
hcHAT1 and hcHAT2 with regard to the appearance of typical rBAT and typical 4F2hc proteins in animals. This seems to be obvious, according to our present knowledge, from Urochordata (Fig. 1).

The second sequence feature clearly visible from the alignment is whether the individual catalytic residues, or even the entire catalytic triad of the GH13 α-amylase family, could be found in the hcHAT representatives. Fort et al. [21] reported that the human 4F2hc does not exhibit any α-glucosidase activity. This is consistent with an almost complete lack of the catalytic triad in all 4F2hc proteins (Fig. 2). It is worth mentioning that, especially in higher animals (mammals and also in frogs and fishes), an aspartate (aspartic acid 248 in human 4F2hc; aspartic acid 380 in Fig. 2, as both the N-terminal and transmembrane segments are included) could be a relic of the GH13 β4-strand catalytic nucleophile [3,11–13], although shifted one position towards the C-terminus (Fig. 2). On the other hand, most rBAT representatives contain all three catalytic residues (Fig. 2), with the exception of those from birds, lizards and frogs (lacking both essential aspartates at the β4- and β7-strands) and also from some fishes (lacking the β4-strand aspartate). This may mean that the eventuality of α-glucosidase activity of true rBATs cannot be unambiguously eliminated.

The selected CSRs (Fig. 2) characteristic of the α-amylase enzyme family GH13 [13] illustrate the additional sequence features conserved mutually between the hcHAT and hcHAT-like proteins and GH13 enzymes, as well as within the individual groups of hcHAT representatives, i.e. rBAT, 4F2hc, hcHAT1, hcHAT2 and ATG groups (Table 1). Overall, and interestingly, the residues that have not yet been revealed to be essential for the GH13 enzymes seem to be well conserved, e.g. (a) a stretch of three hydrophobic aliphatic residues (207_LII in human rBAT) preceding the important aspartate (aspartic acid 98 in oligo-1,6-glucosidase) in region I covering the β3-strand; (b) a segment of up to five residues (307_GVDGF in human rBAT) preceding the functional arginine (arginine 197 in oligo-1,6-glucosidase) in region II of the β4-strand; and (c) more or less the entire region VII, i.e. the β8-strand.

Fig. 1. Evolutionary tree of the hcHAT proteins and the GH13 α-amylase family members. The tree is based on the alignment of complete sequences and calculated including gaps. The numbers represent the bootstrap values. The individual proteins and enzymes are abbreviated as follows (see also Table 1): rBAT, true rBAT proteins; 4F2, true 4F2hc proteins; ATG1 and ATG2, ATGs from nematodes; hcHAT1 and hcHAT2, hcHAT-like proteins covering basal metazoans and arthropods; GH13, GH13-like proteins or enzymes; OGLU, oligo-1,6-glucosidase; AGLU, α-glucosidase; DGLU, dextran glucosidase; T6PH, trehalose-6-phosphate hydrolase; ASU, amylosucrase; SPH, sucrose phosphorylase; IMSY, isomaltulose synthase; TSY, trehalose synthase; CMD, cyclomaltodextrinase; MGA, maltogenic amylase; NPU, neopullulanase; INT, intermediary group between oligo-1,6-glucosidase and neopullulanase subfamilies.

The fact that rBATs exhibit more sequence similarities with the GH13 enzymes than the 4F2hc proteins do is also clearly visible in selected CSRs (Fig. 2). It concerns mainly: (a) tryptophan (tryptophan 161 in human rBAT) in region VI (β2-strand); (b) histidine (histidine 215) at the end of region I (β3-strand); (c) the entire region V in loop 3 (i.e. domain B), being 282_QPDLN in human rBAT; and (d) conservation of the catalytic residues (often the entire catalytic triad). Some of these features can be traced in the sequences of hcHAT1 and hcHAT2 groups as well as of the ATG proteins (Fig.
2), indicating evolutionary relationships of all these enzymes and proteins and hinting at their eventual evolutionary histories. It is worth mentioning that to understand the common evolutionary history of hcHAT proteins and GH13 enzymes it is necessary to re-evaluate the CSR VII covering the β8-strand [13,20], as this segment, obviously without the GH13 functionally important residues, belongs to their best conserved shared sequence parts (Fig. 2). It is also of importance to note that if the CSRs (Fig. 2) serve to calculate the evolutionary tree (not shown), all hcHAT1 proteins (covering basal metazoans and arthropods) and both ATG groups (ATG1 and ATG2 from nematodes; Table 1) cluster together with rBAT proteins and GH13 enzymes (although with low bootstrap values), whereas the entire hcHAT2 group shares the branch with the 4F2hc proteins.

As no α-glucosidase activity was detected for the human 4F2hc [21], reflecting that only the catalytic nucleophile (aspartic acid 380; Fig. 2) may be preserved, it was of interest to identify the CSRs covering the GH13 functionally important residues in hcHATs. Of all of them (Fig. 2), CSR III (β5-strand with the glutamate acting as a proton donor) is not easily identifiable, even for the enzymatically active GH13 members [13]. Therefore, one of the goals was to align correctly the β5-strands of the hcHAT sequences, which was especially problematic for the 4F2hc proteins completely lacking the catalytic glutamate (Fig. 2). In this regard, the putative GH13-like sequence from the cnidarian Nematostella vectensis containing the β5-strand segment 273_RLLIGE (Fig. 2) should be of special importance from an evolutionary point of view, as it contains the features of both the GH13 enzymes (i.e. the glutamic acid residue in a corresponding position) and typical 4F2hc proteins (i.e. arginine or lysine followed by the stretch of three aliphatic hydrophobic residues, e.g. 405_RLLIAG in
human 4F2hc; Fig. 2). This segment preceding the catalytic β5-strand glutamate is also conserved in most insect α-glucosidases, supporting the possibility that the ancestry of the hcHAT proteins within the GH13 α-amylase enzyme family could be rooted in basal metazoans, currently represented by Nematostella vectensis.

A comparison of the three-dimensional structures of representatives of hcHATs (human 4F2hc, 417 residues [21], and a model of the human rBAT with 535 residues) and GH13 enzymes (Geobacillus sp. HTA-462 α-glucosidase; 531 residues [23]) confirmed the expected higher similarity between rBAT proteins and GH13 enzymes (root-mean-square deviation 1.62 Å between the Cα atoms of 436 corresponding residues) than between 4F2hc proteins and GH13 enzymes (1.67 Å for 293 Cα atoms) as well as between rBATs and 4F2hc proteins mutually (1.80 Å for 271 Cα atoms). However, what could be more interesting is the observation that human 4F2hc lacks not only domain B, but also a stretch of approximately 40 amino acid residues succeeding the β4-strand (not shown). The human 4F2hc thus possesses a very short loop connecting the β4-strand to the α4-helix, in contrast to what is seen in both the Geobacillus α-glucosidase and human rBAT protein (having the entire domain B). Regardless of whether domain B in the GH13 oligo-1,6-glucosidase subfamily members (and also in rBATs) operates in conjunction with the prolonged loop 4, it seems that the consecutive loss of domain B in 4F2hc proteins is connected with adequate shortening of loop 4, as the observation can be generalized to all 4F2hc proteins. Note that the GH13 neopullulanase subfamily members [20], possessing shorter domain B [9,35–37], also lack the longer excursion of the loop segment.

Fig. 2. The CSRs of the hcHAT proteins and the GH13 α-amylase family members. A list of the abbreviations of proteins and enzymes can be found in Fig. 1. The segments covering the strands β2, β3, loop 3 (near the C-terminus of domain B connecting the β3-strand and helix 3), β4, β5, β7 and β8 represent the individual CSRs of the α-amylase family [13]. The positions corresponding with the GH13 catalytic triad are boxed. The individual selected residues are highlighted as follows: aspartate and glutamate, red; glycine and proline, black; valine, leucine and isoleucine, grey; phenylalanine and tyrosine, blue; tryptophan, magenta; histidine, cyan; arginine and lysine, green; cysteine, yellow.

Selection pressure

With regard to the close sequence similarity between the GH13 enzymes and the hcHAT proteins (especially rBATs), it is interesting to compare the selection pressure acting on corresponding stretches of amino acid sequences. For this purpose, the Selecton tool [38] was chosen. Figure 3 illustrates the similarities and differences in selection pressure acting on the three studied protein groups: mammalian 4F2hc proteins, vertebrate rBATs and insect α-glucosidases. In agreement with the higher degree of sequence similarity between rBAT proteins and GH13 enzymes, the selection pressure was also found to be more similar for these two groups than that observed for 4F2hc and rBAT proteins, as well as for 4F2hc proteins and GH13 enzymes (Fig. 3). Remarkably, there are a few segments, namely those at or around the β2-, β3- and β8-strands (CSRs VI, I and VII, respectively), that exhibit similar selection pressure for all three groups, i.e. rBAT, 4F2hc and α-glucosidases. This indicates that the residues from the above-mentioned segments of both rBAT and 4F2hc proteins, sharing the value of selection pressure with their counterparts from α-glucosidases, may also share their functions.
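The comparison of selection pressure between groups can be illustrated with a minimal sketch. It assumes that each group has already been reduced to one colour category per aligned column on the 1–7 scale used in Fig. 3 (1 and 2 positive, 4–7 purifying). The function names, the treatment of category 3 as neutral, and all scores below are hypothetical illustrations, not the authors' procedure or data.

```python
from typing import Dict, List

def classify(category: int) -> str:
    """Map a 1-7 colour category to a selection class.

    1-2 = positive and 4-7 = purifying follow the Fig. 3 legend;
    labelling category 3 as 'neutral' is an assumption of this sketch.
    """
    if category in (1, 2):
        return "positive"
    if category == 3:
        return "neutral"
    if 4 <= category <= 7:
        return "purifying"
    raise ValueError(f"category out of range: {category}")

def shared_purifying_columns(groups: Dict[str, List[int]]) -> List[int]:
    """Return indices of aligned columns under purifying selection in every group."""
    lengths = {len(scores) for scores in groups.values()}
    if len(lengths) != 1:
        raise ValueError("all groups must cover the same aligned columns")
    n_columns = lengths.pop()
    return [
        i for i in range(n_columns)
        if all(classify(scores[i]) == "purifying" for scores in groups.values())
    ]

def agreement(a: List[int], b: List[int]) -> float:
    """Fraction of aligned columns where two groups fall into the same class."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty score lists of equal length")
    same = sum(classify(x) == classify(y) for x, y in zip(a, b))
    return same / len(a)

# Hypothetical per-column categories for six aligned positions.
scores = {
    "rBAT": [6, 7, 3, 5, 2, 6],
    "4F2hc": [5, 7, 2, 3, 1, 6],
    "AGLU": [7, 7, 6, 6, 3, 7],
}
print(shared_purifying_columns(scores))
print(agreement(scores["rBAT"], scores["AGLU"]))
print(agreement(scores["4F2hc"], scores["AGLU"]))
```

Comparing classes rather than raw categories keeps the sketch tied to the positive/purifying distinction of the figure legend; a finer comparison would need the underlying per-site values.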
Although for the β3-strand at least the histidine (histidine 103 in Bacillus cereus oligo-1,6-glucosidase) is known to be involved in the active site of GH13 enzymes [3,11,22], no functional role has been assigned to any residue from either the β2- or the β8-strand. The results shown here (Fig. 3) could therefore mean that they contribute to the overall structural integrity of the TIM barrel domain. Concerning the GH13 catalytic triad, it is worth mentioning that, in spite of the presence of the catalytic residues in rBATs, their positions (especially for the β4 catalytic nucleophile and β7 transition-state stabilizer) are selection neutral, in contrast to the strict purifying selection observed here for α-glucosidases (Fig. 3).

Eventual evolutionary scenarios

This study has delivered not only evolutionary relationships (Fig. 1) based on a detailed sequence comparison of all currently available sequences of rBAT, 4F2hc and hcHAT-like proteins with their GH13 enzymatic counterparts (Table 1), but it has also tried to trace the ancestry of hcHAT proteins within the GH13 α-amylase family. In fact, two different evolutionary scenarios could be taken into account: (a) one single event in basal Metazoa and a subsequent split into rBAT and 4F2hc (probably via the hcHAT1 group) in chordates; and (b) two independent branching events, i.e. 4F2hc in the basal Metazoa via HAT-like proteins and rBAT directly from enzymes in deuterostomes. It is worth mentioning here that both scenarios reflect the ancestry of both rBATs and 4F2hc proteins anchored within the GH13 α-amylase family. The difference is only in the way leading from the GH13 enzymes either to rBAT and 4F2hc together or to rBAT and 4F2hc separately. At present it is not possible to draw the evolutionary picture unambiguously.

The first evolutionary scenario, basically consistent with the one proposed originally [9], means that in basal Metazoa an ancestor of both the present-day 4F2hc and rBAT proteins was separated from the GH13 enzymes. The ancestor acquired
the N-terminal and transmembrane segments and, eventually (in most taxa), duplicated and evolved to give in chordates: (a) rBATs that have kept most of the GH13 sequence–structural features, including domain B as well as catalytic residues (often the entire catalytic triad); and (b) 4F2hc that has consecutively lost almost all of the GH13 characteristic sequence–structural features, including domain B as well as functional residues (mainly the catalytic triad). The weak points of this scenario are: (a) the striking similarity between rBATs and GH13 enzymes; (b) the higher similarity between 4F2hc and hcHAT-like proteins than between 4F2hc and rBATs; and (c) the seeming absence of rBAT ancestors in nematodes and arthropods (Fig. 1).

The other, completely different scenario, which would seemingly obey the observation of a generally higher degree of sequence–structural similarity between rBATs and GH13 enzymes than between 4F2hc proteins and GH13 enzymes, would assume the independent evolution of rBATs and 4F2hc proteins. This eventuality would leave both hcHAT1 and hcHAT2 groups in the history leading to the 4F2hc proteins. The problems in this scenario would be: (a) the independent acquisition of both the N-terminal segment and the transmembrane region in rBAT and 4F2hc proteins, which should appear more parsimoniously only once; and (b) the gain of the analogous function.

Fig. 3. Selection pressure acting on rBAT and 4F2hc proteins and GH13 insect α-glucosidases (AGLU). Yellow highlighting (1 and 2) indicates a positive selection, whereas red highlighting (4–7) indicates a purifying selection. The sequences used for the Selecton analysis [38] are marked by an asterisk in Table 1. The individual CSRs of the GH13 α-amylase family [13] are boxed; the GH13 catalytic residues are indicated by small yellow boxes. The individual structural parts of the proteins, i.e. the N-terminal and the transmembrane segments, domain A (TIM barrel), domain B and domain C, are indicated by green, yellow, blue and grey shadowing, respectively.

Because the family GH13 enzymes are spread throughout the whole taxonomy spectrum from prokaryotes to eukaryotes and are therefore more ancient than the hcHATs (present only in Metazoa), there is only one possible place for rooting the tree, namely on the branch leading to the enzymes originating from non-Metazoa (the eventual outgroup). It is worth mentioning, however, that if the evolutionary tree of all proteins studied here is based on the alignment of CSRs (Fig. 2), the ATG proteins from nematodes [34] and all hcHAT-like proteins designated here as the hcHAT1 group (Table 1), i.e. hcHAT-like proteins covering the basal Metazoa and Arthropoda, cluster together with both the rBAT proteins and GH13 enzymes, leaving the 4F2hc proteins with the hcHAT2 group at a different branch (tree not shown). It should be pointed out that, despite the fact that the GH13 CSRs could be considered to be something like sequence fingerprints of the GH13 α-amylase family members [13], the tree based on the CSRs is supported by low bootstrap values. It is thus not possible to say which one, hcHAT1 or hcHAT2, is orthologous to rBAT or 4F2hc, if any. Although both hcHAT1 (insect) and hcHAT2 (Schistosoma) representatives have already been shown to function rather as 4F2hc than as rBAT [31–33], their rBAT-like role has not as yet been investigated.

However, as seen in Fig. 1 (the tree based on the complete alignment), the hcHAT2 group (Arthropoda) clusters with both ATG1 and ATG2 (Nematoda), indicating that the hcHAT2 and ATG proteins are orthologues. Because hcHAT2/ATG are present only in Arthropoda and Nematoda, they probably came from one hcHAT protein (i.e. hcHAT1; cf. Fig. 1) originating from a common ancestor of Ecdysozoa. However, it should be stressed that hcHAT2 proteins (except for
those from Schistosoma [33]) were first identified in this study, so further research on their function, and to identify a light subunit to which they bind, could throw more light on the relationships between various hcHAT proteins. Finally, it should be taken into account that the α-amylase family GH13 belongs to the largest GH families, covering several tens of specificities and several thousand sequences [3,13,18], where, for example, it is still complicated to trace clearly the evolutionary history, even just for the animal α-amylase [39].

Conclusions

Examples of a close evolutionary relatedness between TIM barrel enzymes and their counterparts without the catalytic function are not so exceptional. For example, in the family GH18 chitinases, several plant proteins, such as narbonin [40] and concanavalin B [41], have been recognized to be former chitinases that have lost their catalytic residues. Even in the GH13 α-amylase family, an enzymatically inactive remote paralogous Amyrel (amylase-related) gene in fruit flies (Diptera) was revealed [42,43]. Moreover, the events of horizontal gene transfer have also been discussed, as animal-like and plant-like α-amylase genes were found in actinomycetes and other bacteria [44,45].

However, the main goal of the present work was to shed more light on the early history of extant rBAT and 4F2hc proteins. Many sequences of hypothetical hcHAT-like proteins, covering the basal metazoans and arthropods, that are very probably homologues of heavy-chain partners of the light-chain subunits of the HAT system of chordates have been identified and analysed, which should enable the experimentalists in the field to direct their research more appropriately. The significance of this study could be in the field of: (a) hcHAT protein research (expanding the taxonomy spectrum and eventually focusing on more simple model
organisms, e.g. Nematostella vectensis); (b) the α-amylase GH13 family enzymes (identifying sequence features that are often omitted in analyses of GH13 enzymes but are also clearly conserved in rBAT and 4F2hc proteins, e.g. the β8-strand segment of the TIM barrel domain); and (c) protein evolution in general. On the basis of our analysis, it seems that the hcHATs are present in animals starting from basal Metazoa, as we were unable to find any of their homologues in fungi, plants and single-cell eukaryotes (protists). On the other hand, it cannot be excluded that new sequences of hcHAT-like proteins of non-metazoan origin may become available in the future that, together with the results delivered here, can move our knowledge further.

Materials and methods

As a first step, all available sequences of hcHATs and hcHAT-like proteins were collected (Table 1) using the amino acid sequences of human 4F2hc (GenBank accession number M21904) [24] and rBAT (M95548) [25] proteins as queries for protein BLAST [46] against the default non-redundant database. Most sequences were retrieved from the GenBank/RefSeq [47,48], EnsEMBL [49] and Silkworm [50] databases. Some hcHAT analogues (Table 1) were obtained by corrections and/or predictions from recent complete genome sequencing projects (containing raw and unannotated sequence data) using the GeneWise program [51], based on sequence comparison with known hcHAT proteins from related organisms. Overall, 92 hcHAT and hcHAT-like proteins were studied, which were divided as follows (Table 1): true rBAT, 21 (subfamily GH13_35); true 4F2hc, 27 (subfamily GH13_34); ATG [34], 8; plus two groups of hcHAT-like proteins, hcHAT1 (22) and hcHAT2 (14).

Eight key enzyme specificities from the oligo-1,6-glucosidase subfamily (the GH13 subfamilies 4, 16–18, 29–31 and also some GH13 enzymes unassigned until now) [18,20] were selected (Table 1) and retrieved from GenBank [47]; the proteins with
known three-dimensional structure being preferred. In addition, three representatives of the neopullulanase subfamily (GH13_20) [20] were added together with three neopullulanase-like (intermediary character; GH13_36) enzymes. Because α-glucosidases are produced by a wide spectrum of diverse taxonomic groups, these GH13 enzymes from bacteria (6), fungi (2) and insects (16) were also included. The set of studied proteins was completed by four hypothetical proteins (marked as GH13-like in Table 1), which could not be distinguished as hcHAT proteins or GH13 enzymes. Despite possessing the catalytic triad and domain B (typical for GH13 enzymes), they may also possess the transmembrane region (typical for hcHAT proteins) because at present their sequences are apparently (sea urchin) and potentially (Nematostella) incomplete. Concerning the sequences from Trichoplax, they contain the N-terminal part preceding the TIM barrel, but it is dissimilar in sequence to that found in hcHATs.

All sequence alignments were performed using the program ClustalX2 [52] and then manually tuned with regard to CSRs, domain borders and tertiary structures known from the literature [3,9,12–14,20–22]. Several preliminary evolutionary trees were calculated using the neighbour-joining [26], maximum likelihood [27], maximum parsimony [28], minimum evolution [29] and UPGMA [30] methods. The final evolutionary tree was calculated on the European Bioinformatics Institute's server (http://www.ebi.ac.uk/) for ClustalW2 [53] as a Phylip-type tree with neighbour-joining clustering [26], using the alignment of complete sequences, including the gaps. The tree was displayed using the program TreeView [54].

Three-dimensional structures were retrieved from the Protein Data Bank [55] for the human 4F2hc (Protein Data Bank code: 2DH2; [21]), oligo-1,6-glucosidase from Bacillus cereus (1UOK; [22]) and α-glucosidase from Geobacillus sp. strain HTA-46 (2ZE0; [23]). Due to very close sequence-structural similarity with
both these GH13 glucosidases, the structure of dextran glucosidase [55a] was not included in the present comparison. The structure of human rBAT was modelled by the automated homology modelling program ESyPred3D [56] using the coordinates of the oligo-1,6-glucosidase (1UOK) as a template. The structures were superimposed using the MultiProt server at http://bioinfo3d.cs.tau.ac.il/MultiProt/ [57].

In order to compare the selection pressure acting on hcHAT proteins and α-amylase family enzymes, three different groups (mammalian 4F2hc, vertebrate rBAT and insect α-glucosidases; Table 1) were analysed using the Selecton tool [38] at http://selecton.tau.ac.il/. The ratio between nonsynonymous (Ka) and synonymous (Ks) substitutions was calculated for every codon using the default M8 codon-substitution model [58] on aligned nucleotide sequences. In the analysis, the default neighbour-joining tree built by Selecton itself was used, the selection pressure was not mapped onto a three-dimensional structure, and the precision level was set to the highest.

Acknowledgement

This work was supported in part by VEGA grant no. /0114/08 from the Slovak Grant Agency for Science.

References

1 Hediger MA, Romero MF, Peng JB, Rolfs A, Takanaga H & Bruford EA (2004) The ABCs of solute carriers: physiological, pathological and therapeutic implications of human membrane transport proteins. Introduction. Pflugers Arch 447, 465–468.
2 Palacin M, Nunes V, Font-Llitjos M, Jimenez-Vidal M, Fort J, Gasol E, Pineda M, Feliubadalo L, Chillaron J & Zorzano A (2005) The genetics of heteromeric amino acid transporters. Physiology 20, 112–124.
3 MacGregor EA, Janecek S & Svensson B (2001) Relationship of sequence and structure to specificity in the α-amylase family of enzymes. Biochim Biophys Acta 1546, 1–20.
4 Wells RG & Hediger MA (1992) Cloning of a rat kidney
cDNA that stimulates dibasic and neutral amino acid transport and has sequence similarity to glucosidases. Proc Natl Acad Sci USA 89, 5596–5600.
5 Chillaron J, Roca R, Valencia A, Zorzano A & Palacin M (2001) Heteromeric amino acid transporters: biochemistry, genetics, and physiology. Am J Physiol Renal Physiol 281, 995–1018.
6 Broer S & Wagner CA (2002) Structure–function relationships of heterodimeric amino acid transporters. Cell Biochem Biophys 36, 155–168.
7 Peters T, Thaete C, Wolf S, Popp A, Sedlmeier R, Grosse J, Nehls MC, Russ A & Schlueter V (2003) A mouse model for cystinuria type I. Hum Mol Genet 12, 2109–2120.
8 Reig N, Chillaron J, Bartoccioni P, Fernandez E, Bendahan A, Zorzano A, Kanner B, Palacin M & Bertran J (2002) The light subunit of system b0,+ is fully functional in the absence of the heavy subunit. EMBO J 21, 4906–4914.
9 Janecek S, Svensson B & Henrissat B (1997) Domain evolution in the alpha-amylase family. J Mol Evol 45, 322–331.
10 Cantarel BL, Coutinho PM, Rancurel C, Bernard T, Lombard V & Henrissat B (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res 37(Database Issue), D233–D238.
11 Matsuura Y, Kusunoki M, Harada W & Kakudo M (1984) Structure and possible catalytic residues of Taka-amylase A. J Biochem 95, 697–702.
12 Kuriki T & Imanaka T (1999) The concept of the α-amylase family: structural similarity and common catalytic mechanism. J Biosci Bioeng 87, 557–565.
13 Janecek S (2002) How many conserved sequence regions are there in the α-amylase family?
Biologia 57(Suppl 11), 29–41.
14 MacGregor EA & Svensson B (1989) A supersecondary structure predicted to be common to several α-1,4-D-glucan-cleaving enzymes. Biochem J 259, 145–152.
15 Henrissat B (1991) A classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem J 280, 309–316.
16 Takata H, Kuriki T, Okada S, Takesada Y, Iizuka M, Minamiura N & Imanaka T (1992) Action of neopullulanase. Neopullulanase catalyzes both hydrolysis and transglycosylation at α-(1→4)- and α-(1→6)-glucosidic linkages. J Biol Chem 267, 18447–18452.
17 Jespersen HM, MacGregor EA, Henrissat B, Sierks MR & Svensson B (1993) Starch- and glycogen-debranching and branching enzymes: prediction of structural features of the catalytic (β/α)8-barrel domain and evolutionary relationship to other amylolytic enzymes. J Protein Chem 12, 791–805.
18 Stam MR, Danchin EG, Rancurel C, Coutinho PM & Henrissat B (2006) Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins. Protein Eng Des Sel 19, 555–562.
19 Janecek S (2000) Proteins without enzymatic function with sequence relatedness to the α-amylase family. Trends Glycosci Glycotechnol 12, 363–371.
20 Oslancova A & Janecek S (2002) Oligo-1,6-glucosidase and neopullulanase enzyme subfamilies from the α-amylase family defined by the fifth conserved sequence region. Cell Mol Life Sci 59, 1945–1959.
21 Fort J, de la Ballina LR, Burghardt HE, Ferrer-Costa C, Turnay J, Ferrer-Orta C, Uson I, Zorzano A, Fernandez-Recio J, Orozco M et al. (2007) The structure of human 4F2hc ectodomain provides a model for homodimerization and electrostatic interaction with plasma membrane. J Biol Chem 282, 31444–31452.
22 Watanabe K, Hata Y, Kizaki H, Katsube Y & Suzuki Y (1997) The refined crystal structure of Bacillus cereus oligo-1,6-glucosidase at 2.0 Å resolution: structural characterization of proline-substitution sites for protein thermostabilization. J Mol Biol 269, 142–153.
23 Shirai T, Hung VS, Morinaka K, Kobayashi T & Ito S (2008) Crystal structure of GH13 α-glucosidase GSJ from one of the deepest sea bacteria. Proteins 73, 126–133.
24 Gottesdiener KM, Karpinski BA, Lindsten T, Strominger JL, Jones NH, Thompson CB & Leiden JM (1988) Isolation and structural characterization of the human 4F2 heavy-chain gene, an inducible gene involved in T-lymphocyte activation. Mol Cell Biol 8, 3809–3819.
25 Lee WS, Wells RG, Sabbag RV, Mohandas TK & Hediger MA (1993) Cloning and chromosomal localization of a human kidney cDNA involved in cystine, dibasic, and neutral amino acid transport. J Clin Invest 91, 1959–1963.
26 Saitou N & Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4, 406–425.
27 Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17, 368–376.
28 Eck RV & Dayhoff MO (1966) Atlas of Protein Sequence and Structure. National Biomedical Research Foundation, Silver Springs, MD.
29 Rzhetsky A & Nei M (1993) Theoretical foundation of the minimum-evolution method of phylogenetic inference. Mol Biol Evol 10, 1073–1095.
30 Sneath PHA & Sokal RR (1973) Numerical Taxonomy. Freeman, San Francisco, CA.
31 Jin X, Aimanova K, Ross LS & Gill SS (2003) Identification, functional characterization and expression of a LAT type amino acid transporter from the mosquito Aedes aegypti. Insect Biochem Mol Biol 33, 815–827.
32 Reynolds B, Roversi P, Laynes R, Kazi S, Boyd CA & Goberdhan DC (2009) Drosophila expresses a CD98 transporter with an evolutionarily conserved structure and amino acid-transport properties. Biochem J 420, 363–372.
33 Krautz-Peterson G, Camargo S, Huggel K, Verrey F, Shoemaker CB & Skelly PJ (2007) Amino acid transport in schistosomes: characterization of the permease heavy chain SPRM1hc. J Biol Chem 282, 21767–21775.
34 Veljkovic E, Stasiuk S, Skelly PJ, Shoemaker CB & Verrey F (2004) Functional characterization of Caenorhabditis elegans heteromeric amino acid transporters. J Biol Chem 279, 7655–7662.
35 Kim JS, Cha SS, Kim HJ, Kim TJ, Ha NC, Oh ST, Cho HS, Cho MJ, Kim MJ, Lee HS et al. (1999) Crystal structure of a maltogenic amylase provides insights into a catalytic versatility. J Biol Chem 274, 26279–26286.
36 Lee HS, Kim MS, Cho HS, Kim JI, Kim TJ, Choi JH, Park C, Lee HS, Oh BH & Park KH (2002) Cyclomaltodextrinase, neopullulanase, and maltogenic amylase are nearly indistinguishable from each other. J Biol Chem 277, 21891–21897.
37 Hondoh H, Kuriki T & Matsuura Y (2003) Three-dimensional structure and substrate binding of Bacillus stearothermophilus neopullulanase. J Mol Biol 326, 177–188.
38 Doron-Faigenboim A, Stern A, Mayrose I, Bacharach E & Pupko T (2005) Selecton: a server for detecting evolutionary forces at a single amino-acid site. Bioinformatics 21, 2101–2103.
39 Da Lage JL, Danchin EG & Casane D (2007) Where animal α-amylases come from? An interkingdom trip. FEBS Lett 581, 3927–3935.
40 Hennig M, Schlesier B, Dauter Z, Pfeffer S, Betzel C, Hohne WE & Wilson KS (1992) A TIM barrel protein without enzymatic activity?
Crystal-structure of narbonin at 1.8 Å resolution. FEBS Lett 306, 80–84.
41 Hennig M, Jansonius JN, Terwisscha van Scheltinga AC, Dijkstra BW & Schlesier B (1995) Crystal structure of concanavalin B at 1.65 Å resolution. An "inactivated" chitinase from seeds of Canavalia ensiformis. J Mol Biol 254, 237–246.
42 Da Lage JL, Renard E, Chartois F, Lemeunier F & Cariou ML (1998) Amyrel, a paralogous gene of the amylase gene family in Drosophila melanogaster and the Sophophora subgenus. Proc Natl Acad Sci USA 95, 6848–6853.
43 Maczkowiak F & Da Lage JL (2006) Origin and evolution of the Amyrel gene in the α-amylase multigene family of Diptera. Genetica 128, 145–158.
44 Janecek S (1994) Sequence similarities and evolutionary relationships of microbial, plant and animal α-amylases. Eur J Biochem 224, 519–524.
45 Da Lage JL, Feller G & Janecek S (2004) Horizontal gene transfer from Eukarya to bacteria and domain shuffling: the α-amylase model. Cell Mol Life Sci 61, 97–109.
46 Altschul SF, Gish W, Miller W, Myers EW & Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215, 403–410.
47 Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J & Sayers EW (2009) GenBank. Nucleic Acids Res 37(Database Issue), D26–D31.
48 Pruitt KD, Tatusova T, Klimke W & Maglott DR (2009) NCBI reference sequences: current status, policy and new initiatives. Nucleic Acids Res 37(Database Issue), D32–D36.
49 Hubbard TJ, Aken BL, Ayling S, Ballester B, Beal K, Bragin E, Brent S, Chen Y, Clapham P, Clarke L et al. (2007) Ensembl 2009. Nucleic Acids Res 37(Database Issue), D690–D697.
50 Wang J, Xia Q, He X, Dai M, Ruan J, Chen J, Yu G, Yuan H, Hu Y, Li R et al. (2005) SilkDB: a knowledgebase for silkworm biology and genomics. Nucleic Acids Res 33(Database Issue), D399–D402.
51 Birney E, Clamp M & Durbin R (2004) GeneWise and Genomewise. Genome Res 14,
988–995.
52 Jeanmougin F, Thompson JD, Gouy M, Higgins DG & Gibson TJ (1998) Multiple sequence alignment with Clustal X. Trends Biochem Sci 23, 403–405.
53 Thompson JD, Higgins DG & Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22, 4673–4680.
54 Page RD (1996) TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12, 357–358.
55 Berman H, Henrick K, Nakamura H & Markley JL (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res 35(Database Issue), D301–D303.
55a Hondoh H, Saburi W, Mori H, Okuyama M, Nakada T, Matsuura Y & Kimura A (2008) Substrate recognition mechanism of alpha-1,6-glucosidic linkage hydrolyzing enzyme, dextran glucosidase from Streptococcus mutans. J Mol Biol 378, 913–922.
56 Lambert C, Leonard N, De Bolle X & Depiereux E (2002) ESyPred3D: prediction of proteins 3D structures. Bioinformatics 18, 1250–1256.
57 Shatsky M, Nussinov R & Wolfson HJ (2004) A method for simultaneous alignment of multiple protein structures. Proteins 56, 143–156.
58 Yang Z, Nielsen R, Goldman N & Pedersen AM (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.
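
A note on the neighbour-joining clustering [26] named in Materials and methods: the following is a minimal Python sketch of the generic Saitou and Nei procedure, run on a small made-up distance matrix. It is not the ClustalW2 implementation used for the final tree, and it does not start from an alignment. The function name `neighbor_joining`, the internal labels `node0`, `node1`, ..., and the toy taxa `a` to `e` with their distances are assumptions introduced only for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the edges (child, parent, branch_length) of an unrooted NJ tree.

    names: list of taxon labels (assumed not to clash with 'node0', 'node1', ...).
    dist:  dict mapping frozenset({x, y}) to the distance between x and y.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each active node: sum of its distances to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to the new internal node u.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        edges.append((i, u, li))
        edges.append((j, u, lj))
        # Distances from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    # The last two active nodes are connected by a single branch.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Toy data only (hypothetical taxa and distances, not taken from the study).
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {
    frozenset((toy_names[i], toy_names[j])): toy_matrix[i][j]
    for i, j in combinations(range(len(toy_names)), 2)
}

edges = neighbor_joining(toy_names, toy_dist)
for child, parent, length in edges:
    print(f"{child} -- {parent}: {length:.2f}")
```

With real data, the distance matrix would first have to be derived from the sequence alignment (for example, from pairwise fractions of differing positions), a step this sketch leaves out; bootstrap support, as discussed for the CSR-based tree, would likewise need to be computed separately.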

Posted: 18/02/2014, 13:20
