Chen et al. BMC Genomics (2021) 22:264
https://doi.org/10.1186/s12864-021-07566-6

RESEARCH ARTICLE   Open Access

Imprints of independent allopolyploid formations on patterns of gene expression in two sibling yarrow species (Achillea, Asteraceae)

Duo Chen1†, Peng-Cheng Yan2† and Yan-Ping Guo1*

Abstract

Background: Polyploid species often originate recurrently. While this is well known, there is little information on the extent to which distinct allotetraploid species formed from the same parent species differ in gene expression. The tetraploid yarrow species Achillea alpina and A. wilsoniana arose independently from allopolyploidization between diploid A. acuminata and A. asiatica. The genetics and geography of these origins are clear from previous studies, providing a solid basis for comparing gene expression patterns of sibling allopolyploid species that arose independently.

Results: We conducted comparative RNA-sequencing analyses on the two Achillea tetraploid species and their diploid progenitors to evaluate: 1) species-specific gene expression and coexpression across the four species; 2) patterns of inheritance of parental gene expression; 3) parental contributions to gene expression in the allotetraploid species, and homeolog expression bias. Diploid A. asiatica showed a higher contribution than diploid A. acuminata to the transcriptomes of both tetraploids and also greater homeolog bias in these transcriptomes, possibly reflecting a maternal effect. Comparing expressed genes in the two allotetraploids, we found that ca. 30% of the expressed genes in each were species-specific, and these were most enriched for GO terms pertaining to “defense response”. Despite the species-specific and differentially expressed genes found between the two allotetraploids, both display similar transcriptome changes in comparison to their diploid progenitors.

Conclusion: Two independently originated Achillea allotetraploid species exhibited differences in gene expression, some of which must be related to differential
adaptation during their post-speciation evolution. On the other hand, they showed similar expression profiles when compared to their progenitors. This similarity might be expected when pairs of merged diploid genomes in tetraploids are similar, as is the case in these two particular allotetraploids.

Keywords: Allopolyploid speciation, RNA-sequencing, Inheritance of gene expression, Homeolog expression bias, Achillea

* Correspondence: guoyanping@bnu.edu.cn
† Duo Chen and Peng-Cheng Yan contributed equally to this work.
Key Laboratory of Biodiversity Science and Ecological Engineering of the Ministry of Education, and College of Life Sciences, Beijing Normal University, Beijing, China
Full list of author information is available at the end of the article.

© The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Background

Polyploidy is an important mechanism of plant speciation. In an allopolyploid species, the combined effects of two or
more diverged subgenomes and their regulatory interactions can lead to a myriad of genetic and epigenetic modifications described as genomic and transcriptomic shock [1–6]. The resulting changes in gene expression may often generate phenotypic variation affecting individual fitness and evolution of allopolyploids [7–13]. Analyses of synthetic polyploid plants have demonstrated that genomic and transcriptomic shock usually occurs immediately after polyploidization [1, 14–17], though changes may also take place during later stages of the evolutionary history of a polyploid species [3, 6, 18, 19].

Polyploid species often consist of lineages that originated independently and recurrently from the same parental species [20, 21]. Such recurrent formation can result in karyotypic, genomic, transcriptomic and phenotypic variation across lineages, as demonstrated in recently originated allotetraploid species of Tragopogon (Asteraceae) [22–27]. However, whereas different lineages of the same allopolyploid species have been studied in detail, divergent species derived by independent origins from the same parental species have been reported less frequently and studied less [28–30]. Only in the orchid genus Dactylorhiza has research been conducted on gene expression and epigenetic differences among sibling allotetraploids derived from the same parental species pair. This work showed that both kinds of differences occurred and were stable among these allotetraploid species, raising the possibility that they reflect divergent adaptation to the different environmental conditions experienced by the species [28, 29].

To shed further light on how gene expression might differ between allopolyploid species that originated independently from the same progenitor species, we focus here on two allotetraploid yarrow species, Achillea alpina L. and A. wilsoniana Heimerl ex Hand.-Mazz., and their parental species, A. acuminata (Ledeb.)
Sch.-Bip. and A. asiatica Serg. (Asteraceae). In China, these tetraploid species have different distributions, with A. alpina occurring in the northeast and A. wilsoniana in the southwest of the country [30, 31]. Our previous research indicated that the two tetraploids originated independently 35–80 kya following hybridization between their diploid parents during the megainterstadial before the Last Glacial Maximum. Two independent contacts between the parental species were involved, possibly in deglaciated habitats located near refugia in the mountains of northeast China and, further southwest, in the Qinling Mountains, respectively [30]. According to plastid sequencing data, A. asiatica most likely acted as the maternal parent of both tetraploids [32, 33].

To investigate transcriptome changes that occurred during allopolyploidization and the subsequent long-term evolution, it is necessary not only to check specific and coexpressed genes among progeny and progenitor species, but also to examine total and relative expression levels of homeologous genes in allotetraploids. Relative expression levels of homeologs may reflect pre-existing parental relative levels (parental legacy) or originate following allopolyploidy, with one homeolog preferentially expressed relative to the other (expression bias) [34–37].

Achillea alpina and A. wilsoniana are ideal for such analysis for the following reasons. First, their parent-offspring relationships are clear and simple (no complicated reticulate relationships are involved according to previous studies). Second, the parental species are extant, making it feasible to compare data from the allopolyploids with those of their progenitors. Third, the progenitor species, A. acuminata and A. asiatica, show high levels of genomic sequence divergence [32, 38], while each allopolyploid species maintains both parental genomes intact, having experienced only low levels of homeologous recombination [30, 32, 33]. For these reasons, it is easy to
distinguish homeologous genes from each other in the allopolyploid transcriptome, and to measure parental contributions and homeolog expression bias.

In this study, we screened the transcriptome profiles of the two Achillea allotetraploid species and their diploid progenitor species by means of whole transcriptome sequencing. By a comparative analysis of these transcriptomes, we examined first the inheritance of parental gene expression, and second the relative parental contributions and homeolog expression bias. Based on our results, we ask whether parental effects, which are frequently found in plant hybrid/allopolyploid transcriptomes, are apparent in the present polyploid system. Furthermore, and most importantly, we ask to what extent inherited patterns of gene expression are similar in different allopolyploids derived from the same parental species, and how significantly evolutionary factors, e.g., natural selection and/or genetic drift, have influenced the divergent gene expression profiles of the two independently evolved tetraploid species.

Results

Transcriptome profiles

Approximately 34–49 million 100 bp paired-end raw reads were generated for a library of each of the studied Achillea species. After removing adapter sequences and filtering out low-quality reads, 93.2–96.4% of the reads were retained as clean reads (Table S1). The initial transcripts were assembled and filtered to 51,414–88,150 unigenes across the studied species, with the N50 length of unigenes always longer than the average length of unigenes in each sample (Table 1). The proportion of unigenes with complete or partial ORFs was 63–71%. These unigenes were used for subsequent gene expression analysis (Table 1). The FPKM values of unigenes showed that data correlation among biological replicates of the same tissue/organ of a species/population was higher than among different tissues/organs, indicating that experimental sampling was repeatable and reliable (Fig. S1).

Table 1 Information of unigenes in the present RNA-Seq data

                                             acuARX    acuQL     asi       alp       wil
Number of assembled transcripts by Trinity   177,816   180,194   272,030   282,619   300,158
Number of unigenes                           51,414    55,391    59,600    81,143    88,150
Average length of unigenes (bp)              1230.20   1243.32   1074.86   976.40    1011.16
N50 length of unigenes (bp)                  1678      1687      1544      1386      1432
Number of lncRNAs                            5794      5794      7604      11,409    13,748
Number of unigenes with no ORF               9229      11,094    10,342    15,078    18,603
Number of unigenes with complete ORF         21,801    24,123    20,997    24,412    27,984
Number of unigenes with partial ORF          14,590    14,380    20,657    30,244    27,815

Abbreviations of accession names: acuARX for Arxan population of A. acuminata; acuQL for Qinling population of A. acuminata; alp for A. alpina; asi for A. asiatica; wil for A. wilsoniana

Specifically expressed and coexpressed genes among each allotetraploid species and its diploid progenitors

As shown in the Venn diagrams (Fig. 1), there were 23,614 (29.1%) and 27,535 (31.2%) genes showing species-specific expression in the allotetraploids A. alpina and A. wilsoniana, respectively. These proportions are higher than in the diploid parental species (20–25%) and indicate rather high amounts of novel gene expression in both allotetraploids. The numbers of genes expressed in both parents, but not detected in the allotetraploid transcriptome, were 2150/2137 and 2320/2217 in A. alpina and A. wilsoniana, respectively, suggesting a relatively low level of gene silencing or loss. With regard to coexpression of genes, 35,286 unigenes (about 43.5% of all unigenes) were coexpressed between A. alpina and both diploid species, and 36,385 (about 41.3% of all unigenes) were coexpressed by A. wilsoniana and the two diploids (Fig. 1). Particularly interesting are the genes of each tetraploid specifically coexpressed with each parental species, as this indicates the relative contribution of each parent to the transcriptome of each tetraploid. We found that A. alpina specifically coexpressed 9922 unigenes with
diploid A. acuminata, and 12,321 unigenes with A. asiatica, while A. wilsoniana coexpressed 11,348 and 12,882 unigenes with A. acuminata and A. asiatica, respectively (Fig. 1). Thus, both tetraploids coexpressed more genes with A. asiatica than with A. acuminata. Gene Ontology (GO) analysis indicated significant enrichment of these coexpressed genes mostly in the terms “response to stress” and “defense response”, suggesting that the tetraploid species inherited environmental response genes separately from both progenitors (Fig. 2: A, B; Additional file 6).

Fig. 1 Venn diagrams showing amounts of coexpressed and specifically expressed genes of the studied allotetraploid species and their diploid progenitors. As the two allopolyploid species originated independently in different regions, and as the diploid A. acuminata shows population genetic differentiation, the analysis was conducted separately for each tetraploid species. In the coexpressed gene category, the gene number in each species is given (copy number at some loci may differ among species). Abbreviations: ARX, Arxan Mt.; QL, Qinling Mts.

Species-specific and coexpressed genes in the two allotetraploid species

Table 2 shows that, when comparing the expressed genes in the two tetraploids, 29.4% of the genes expressed in A. alpina and 33.9% of the genes expressed in A. wilsoniana were species-specific. Among the coexpressed genes, 78%–83% were expressed equally in both species, while only about 10% showed up- or down-regulation in one or the other (Table 2). The most enriched GO terms related to biological process (BP) of genes exhibiting species-specific expression pertained to “defense response” in both tetraploids (Fig. 2: C, D; Additional file 7). In parallel, we found approximately 30% of genes showing population-specific expression in diploid A. acuminata; these were most enriched for GO terms pertaining to “defense response” and/or “response to stress” (Fig. 2: E, F; Additional
file 7). Moreover, genes coexpressed by each tetraploid with its sympatric A. acuminata population were also most enriched for GO terms related to “response to stress” and “defense response” (Fig. 2: A, B; Additional file 6). These results imply that the two geographically separated tetraploids may have inherited genes and expression patterns from their sympatric diploid parental species, which could be important in local adaptation.

Fig. 2 The top ten most enriched GO terms related to biological process (BP) of specifically coexpressed genes of each allotetraploid species with its sympatric population of A. acuminata (a & b), species-specific expressed genes in a comparison between the two allotetraploid species (c & d), and population-specific expressed genes in a comparison between two populations of diploid A. acuminata (e & f) (P-value < 0.05). These data suggested that the specifically expressed genes were mostly enriched in gene classes pertaining to biological response to environment. The full information on enriched GO terms is listed in Additional files 6 and 7. Abbreviations: ARX, Arxan population of A. acuminata; QL, Qinling population of A. acuminata

Table 2 Number of specifically and differentially expressed genes in the two studied allotetraploid species

Specific in A. alpina        23845 (29.4%)
Specific in A. wilsoniana    29881 (33.9%)

Expressed in both tetraploids     Stem apex   Leaf
Up-regulated in A. alpina         4049        2879
Up-regulated in A. wilsoniana     8328        6727
Equal expression in both          44524       47267

Inheritance patterns of gene expression

Figure 3 shows the numbers and proportions of differentially expressed genes (DEGs) among all expressed genes in the allotetraploids. Most of these genes (71.49% in A. alpina and 67.30% in A. wilsoniana) were ‘conserved’, meaning that the total expression of homeologs for a given gene in the
allotetraploids was statistically similar to the expression levels of that gene in both parental species. Altered gene expression in the tetraploids was evidenced by expression inheritance patterns classified into 12 categories. Thus, 5.8 and 5.0% of expressed genes in A. alpina and A. wilsoniana, respectively, had expression levels intermediate to the parental species (categories I and XII in Fig. 3). Approximately 15% of genes showed “expression-level dominance” (categories II, XI, IV and IX), with both tetraploids exhibiting greater A. asiatica expression-level dominance (S-dominance) than A. acuminata dominance (C-dominance) (categories IV and IX vs. II and XI). Finally, both tetraploids possessed more transgressively downregulated genes (categories III, VII and X) than transgressively upregulated genes (categories V, VI and VIII).

Relative homeolog contribution and homeolog expression bias

The two allotetraploids displayed a relatively small proportion of silent/lost parental genes. Moreover, they exhibited imbalanced silencing/loss of homeologs between the two parental subgenomes. Silencing/loss of genes was more evident for A. acuminata-homeologs than for A. asiatica-homeologs, implying preferential expression of the A. asiatica-subgenome in both tetraploids (Table 3). The relative homeolog contribution to total expression levels of allotetraploid genes was quantified by Rh [Rh = log2(acu-homeolog/asi-homeolog)] (Fig. 4).

Fig. 3 Inheritance categories of gene expression of the studied allotetraploid species. The categorization involving 12 states of differential expression (labeled with Roman numerals I–XII) is modified from Rapp et al. (2009) [39]. A cartoon depiction is provided for each of the 12 states, where parental states (S for A. asiatica; C for A. acuminata) are on the outer edges and the allotetraploid is in the middle. Dots on the same horizontal line indicate statistically equal expression level, whereas dots on higher or lower horizontal lines refer to significantly higher
or lower expression level. The ‘Intermediate’ states, I and XII, indicate gene expression levels in the allopolyploid being significantly different from, but intermediate between, the parental levels. ‘Conserved’ refers to genes with essentially equal expression levels among the allotetraploid and both parental species. The number of genes of each category is given, and the percentage of each category group in all expressed genes is provided.

Approximately two-thirds of homeolog pairs displayed equal expression of parental copies, and the remaining one-third exhibited different expression levels of parental homeologs. Among the differentially expressed homeologous pairs, more exhibited higher expression of the A. asiatica copy than the A. acuminata copy. To determine whether the detected differential expression of homeologs is derived from pre-existing differences in parental gene expression levels, or is due to homeolog expression bias, we compared Rh with the relative expression of orthologs between the parental species, Rp [Rp = log2(A. acuminata/A. asiatica)] (Fig. 5). Approximately 79% of homeolog pairs in the tetraploids displayed vertical inheritance of pre-existing parental expression levels, that is, without expression bias. Among the remaining 21% of homeolog pairs, which displayed expression bias, S-bias (bias toward the A. asiatica copy) was more common than C-bias (bias toward the A. acuminata copy).

To understand the possible influence of expression bias on the relative contribution of the parental homeologs, we integrated the data sets of relative homeolog expression level and homeolog bias (Table S2). Of the homeolog pairs showing equal expression of parental copies, 35% showed expression bias, while the rest simply maintained pre-existing parental expression levels. Of the homeolog pairs with unequal expression levels, most might have resulted from homeolog expression bias. For instance, out of 1396 homeolog pairs showing higher expression
level of the A. asiatica copy, 1037 (74.3%) displayed expression bias toward the A. asiatica copy (Table S2).

Validation of RNA-Seq analysis by RT-qPCR

To validate the analysis and data obtained by RNA-sequencing, differential expression of genes was checked using RT-qPCR assays. Unigenes exhibiting different inheritance patterns of gene expression (intermediate expression, A. acuminata/A. asiatica expression-level dominance, transgressive expression) were randomly chosen for RT-qPCR verification. For all 10 unigenes tested, expression patterns revealed by RT-qPCR assays were consistent with those evident in the RNA-Seq data (Fig. S2), demonstrating the reliability of the data produced by RNA-sequencing.

Discussion

To understand the influence of hybridization and polyploidy on the inheritance of gene expression from parental to allopolyploid species, we conducted a transcriptome analysis on two allotetraploid Achillea species that originated independently from the same two parental species. We evaluated RNA-sequencing data to determine: (i) species-specific gene expression and coexpression among both tetraploid and progenitor diploid species; (ii) inheritance patterns of parental gene expression; and (iii) parental contribution to gene expression level in the tetraploids, and occurrence of homeolog expression bias.

Gene expression profiles in the allotetraploid species with influence of maternal effect

Both hybridization and polyploidization can alter gene expression between progenitors and allopolyploid offspring by affecting the number of expressed genes and their expression levels. In the present analysis, only 3.6%–4.7% of the genes expressed in the diploids (Fig. 1) were not detected in the tetraploid species, suggesting a low level of gene silencing (or loss). On the other hand, each of the tetraploid species possessed a high proportion (approximately 30%) of species-specific expressed genes (23,614 out of 81,143 genes in A. alpina and 27,535 out of 88,150 genes in A. wilsoniana,
Fig. 1), suggesting that hybridization and polyploidy activate some genes not expressed in the diploids.

In hybrid plants, maternal effects may have a strong influence on morphological, life-history and physiological traits, which can be beneficial if the maternal phenotype is linked to increased fitness [40–43]. The present study showed that global gene expression of both Achillea tetraploids was frequently more similar to A. asiatica than to A. acuminata, as reflected by the number of coexpressed genes between species, expression-level dominance, relative homeolog contribution, homeolog-specific expression and homeolog expression bias. This similarity to A. asiatica suggests a maternal effect on gene expression, as both tetraploids were previously shown to have had an A. asiatica-like ancestor as their maternal parent [32, 33]. It has been suggested that parental expression-level dominance in allopolyploids mainly results from up- or down-regulation of one of the homeologous copies, usually of the ‘less dominant’ parent [44, 45]. Homeolog expression bias may lead to higher expression of one of the parental gene copies, possibly due to a difference between parental subgenomes in the number and distribution of transposable elements (usually repressing nearby genes), mismatches between parental copies of trans-elements and their target genes, and persistent epigenetic resetting [6, 36, 37, 46, 47]. Maternal effects resulting from one or more of these causes have been reported previously in a number of allopolyploids, e.g., Gossypium hirsutum [18, 48], Spartina anglica [49], Triticum aestivum [50] and Tragopogon miscellus [51].

Table 3 Number of silent/lost homeologs in the studied allotetraploids

Samples                     Number of silent/lost            Number of silent/lost
                            A. asiatica-homeologs (%)        A. acuminata-homeologs (%)
A. alpina (Stem apex)       362 (3.60%)                      539 (5.36%)
A. alpina (Leaf)            311 (3.67%)                      479 (5.65%)
A. wilsoniana (Stem apex)   370 (3.50%)                      517 (4.89%)
A. wilsoniana (Leaf)        331 (3.99%)                      417 (5.02%)

Fig. 4 Histograms showing relative expression levels of homeologous genes in the studied allotetraploid transcriptomes. a, b Data from stem apex. c, d Data from leaf tissue. a, c for A. alpina; b, d for A. wilsoniana. The abscissa is Rh [log2(acu-homeolog/asi-homeolog)], and the ordinate is the number of homeolog pairs. Gray columns correspond to homeolog pairs with equal expression levels of the two parental copies; light blue columns correspond to homeolog pairs with higher expression of the A. asiatica-copy, and vice versa, dark blue columns correspond to homeolog pairs with higher expression of the A. acuminata-copy (P-value < 0.05, FDR < 0.05). Numbers at the upper right corner indicate the number of homeolog pairs of each category.

Comparative global gene expression patterns of allopolyploids independently derived from the same parent species

Previous research on Dactylorhiza showed that three sibling allotetraploid species derived from the same two parental species were divergent epigenetically and in gene expression, and it was suggested that these ...