Plant Physiology Preview. Published on June 3, 2014, as DOI:10.1104/pp.114.241521

Running Head: GWAS in tomato for metabolic traits

Corresponding Author:
Christopher Sauvage (Ph.D)
INRA, UR1052, Génétique et Amélioration des Fruits et Légumes (GAFL)
Domaine St Maurice - 67 Allée des Chênes, CS 60094, F-84143 MONTFAVET Cedex, France
Tél : +33 (0)4 32 72 27 54
Email: christopher.sauvage@avignon.inra.fr

Research Area: Genes, Development and Evolution

Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org
Copyright © 2014 American Society of Plant Biologists. All rights reserved.
Copyright 2014 by the American Society of Plant Biologists

Title: Genome Wide Association in tomato reveals 44 candidate loci for fruit metabolic traits

Authors: Christopher Sauvage 1, Vincent Segura 2, Guillaume Bauchet 1,3, Rebecca Stevens, Phuc Thi Do 4,5, Zoran Nikoloski, Alisdair R Fernie and Mathilde Causse

1- INRA, UR1052, GAFL, 67 allée des Chênes, Domaine Saint Maurice - CS60094, 84143 Montfavet Cedex (France)
2- INRA, UR0588, 2163 avenue de la Pomme de Pin, 45075 Orléans Cedex (France)
3- Syngenta Seeds, 12 chemin de l’Hobit, 31790 Saint Sauveur (France)
4- Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm (Germany)
5- Faculty of Biology, VNU University of Science, Vietnam National University, Hanoi, 334 Nguyen Trai, Thanh Xuan, Hanoi, Vietnam

Summary: Genome wide association has shed light on the genetic architecture of metabolic traits underlying fruit quality in tomato and allowed identification of candidate genes.

Financial Source: The EU EUSOL project PL016214-2 funded this work.

ABSTRACT

Genome-wide association studies have been successful in identifying genes involved in polygenic traits and are valuable for crop improvement. Tomato (Solanum lycopersicum) is a major crop and is highly appreciated worldwide for its health value. We used a core collection of 163 tomato accessions composed of S. lycopersicum, S. lycopersicum cv. cerasiforme and S. pimpinellifolium to map loci controlling variation in fruit metabolites. Fruits were phenotyped for a broad range of metabolites including amino acids, sugars and ascorbate. In parallel, the accessions were genotyped with 5995 SNP markers spread over the whole genome. Genome Wide Association analysis was conducted on a large set of metabolic traits that were stable over two years, using a multi-locus mixed model as a general method for mapping complex traits in structured populations (Segura et al., 2012) and applied to tomato. We detected a total of 44 loci that were significantly associated with a total of 19 traits including sucrose, ascorbate, malate and citrate levels. These results not only provide a list of candidate loci to be functionally validated but also a powerful analytical approach for finding genetic variants that can be directly used for crop improvement and deciphering the genetic architecture of complex traits.

INTRODUCTION

In crops, linkage mapping has proved invaluable for detecting quantitative trait loci (QTL) for traits of interest
and to unravel their underlying genetic architecture. This approach is based on the analysis of the segregation of polymorphism between the parental lines and their progeny. However, one of the limitations of this approach is the reduced number of recombination events that occur per generation (for review, see Korte & Farlow, 2013). This leads to extended linkage blocks that reduce the accuracy of the linkage mapping. An alternative to linkage-based mapping studies is to perform linkage disequilibrium (LD) mapping in a population of theoretically unrelated individuals. The ancestral polymorphism segregating through this population (or panel) is far more informative than the polymorphism of the parental lines of the linkage mapping population (Mauricio, 2001). LD mapping, also known as genome-wide association (GWA), relies on the natural patterns of LD in the population investigated. The aim of GWA is to reveal trait-associated loci by taking advantage of the level of LD. Depending on the decay of LD, the mapping resolution can be narrowed from a large genomic portion where the level of LD is relatively high to a single marker when the LD level is very low.

Following domestication, crops are prone to (1) increased levels of LD, (2) population structure (remote common ancestry of large groups of individuals) and (3) cryptic relatedness (the presence of close relatives in a sample of unrelated individuals) (Riedelsheimer et al., 2012). Population structure and cryptic relatedness may lead to false-positive associations in GWA studies (Astle & Balding, 2009) but their effect is now relatively well accounted for in mixed linear models (MLM) (for reviews see Sillanpää, 2010; Listgarten et al., 2012). The problem of high LD in GWA scans must also be taken into account: Segura et al. (2012) addressed this difficulty by proposing a multi-locus mixed-model (MLMM) that handles the confounding effect of
background loci that may be present throughout the genome due to linkage disequilibrium. This approach revealed multiple loci in LD and associated with sodium concentration in leaves in A. thaliana while previous methods failed to identify this complex pattern (Segura et al., 2012).

In parallel, the development of cost-effective high-throughput sequencing technologies has identified the increasingly dense variant loci necessary to conduct GWA scans, especially in model species such as rice for agronomic traits (Huang et al., 2010) or maize for drought tolerance (Lu et al., 2010; but see Soto-Cerda & Cloutier, 2012, for a complete review). However, GWA is not restricted to model species and is becoming increasingly widespread in non-model ones such as sunflower (Mandel et al., 2013) and tomato (Xu et al., 2013), where numerous associations have been successfully identified for traits related to plant architecture (branching in the case of sunflower) and fruit quality (e.g. fresh weight in tomato).

Tomato is a crop of particular interest as its fruits are an important source of fiber and nutrients in the human diet and a model for the study of fruit development (Giovannoni, 2001). Over the last two decades, numerous QTL have been identified for traits such as fresh weight (fw) using linkage approaches (Frary et al., 2000; Zhang et al., 2012; Chakrabarti et al., 2013) but also for other fruit-related traits such as fruit ascorbic acid levels (Stevens et al., 2007), sensory and instrumental quality traits (Causse et al., 2002), sugar and organic acids (Fulton et al., 2002) or metabolic components (Schauer et al., 2008). Large tomato germplasm collections have been characterized at the molecular level using SSR (Ranc et al., 2008) and SNP markers
(Blanca et al., 2012; Shirasawa et al., 2013), giving insights into population structure, tomato evolutionary history and the genetic architecture of traits of agronomical interest. These screens of nucleotide diversity were made possible (for review, see Bauchet & Causse, 2012) in the last couple of years by the release of the tomato genome sequence (The Tomato Genome Consortium, 2012) and derived genomic tools such as a high-density SNP genotyping array (Sim et al., 2012). The combination of large germplasm collections, high-throughput genomic tools and traits of economic interest provides a framework for applying GWA studies in this species. In tomato, previous association studies have been limited to a targeted region (e.g. chromosome 2, Ranc et al., 2012), used low-density genome-wide distributed SNP markers (Xu et al., 2013) or investigated a limited number of agronomical traits with low precision on the association panel (Shirasawa et al., 2013).

Using tomato (Solanum lycopersicum) as a model, we aimed to investigate the genetic architecture of traits related to fruit metabolic composition at high resolution. To reach this objective, we carried out an investigation into LD patterns at the genome-wide scale and a GWA scan using the MLMM approach. We present results on the genetic architecture of fruit metabolic composition for metabolites such as organic acids, amino acids, sugars and ascorbate in tomato.

RESULTS

PHENOTYPING

We phenotyped a panel composed of 163 accessions for a total of 76 metabolic traits including amino acids, organic acids and sugars. The tomato diversity panel was composed of 28 S. lycopersicum (S.L), 119 S. lycopersicum cv. cerasiforme (S.C) and 16 S. pimpinellifolium (S.P) derived from the previously
published core collection described in Xu et al. (2013). From the set of 76 phenotypes, 36 (47.3%) were highly correlated over the two years of sampling. Of these 36 phenotypes, significant differences between the three groups of tomato accessions were identified for 26 phenotypes (70.3% - see Figure 1). The post-hoc Tukey HSD test provided a more thorough investigation of the significant differences among the three groups for each trait. Comparisons including S.P (S.P-S.L and S.P-S.C) were more significantly different than the comparison S.L-S.C (see Figure 2).

The correlation pattern revealed clusters of highly correlated compounds in the metabolic profile that largely corresponded to a functional classification of the metabolites (Figure 1). For example, the concentrations of fructose, sucrose, maltitol, erythritol and maltose clustered together with SSC (Soluble Solid Content) while amino acids (e.g. serine, threonine, methionine, asparagine) also clustered together. We conducted GWA using the MLMM approach on this set of 36 phenotypes, which were stable (correlated) over the two experimentation years (see Supplementary Table S1 for the complete phenotype dataset).

GENOTYPING

From the initial 8,784 SNPs of the SOLCAP genotyping array, 7,720 (87.8%) passed the manufacturing quality control and constituted our raw data set (see Materials & Methods). From this raw data set, the quality filtering gave a total of 5,995 reliable SNPs (77.6%), thus constituting the analysed data set for GWA. The overall average percentage of missing data per locus was estimated at 3.84% in the whole population, ranging from 2.25% in S.L to 4.07% in S.P. The missing data were imputed by the most common allele of the SNP as no missing data are allowed in the MLMM.

The MAF (Minor Allele Frequency) values were evenly distributed from 0.001% to 0.5% and showed differences in their
distribution between groups. The S.L accessions showed an excess of rare variants with a skewed distribution of the MAF values (median MAF = 0.107), while S.C and S.P accessions showed a broader distribution of the MAF values (median MAF = 0.161 and 0.214, respectively). Such a low median MAF in the S.L accessions may be attributed to (1) a higher proportion of nearly monomorphic SNPs and (2) the shared ancestry within this group. This observation is supported by a previous study that investigated the MAF pattern in sub-populations of tomato (Sim et al., 2012b).

POPULATION STRUCTURE

The pairwise-population FST differentiation index was estimated to be ~1% (0.0102) between S.L and S.C, while a stronger population differentiation was detected between S.L and S.P and between S.C and S.P, estimated to be 0.2132 and 0.1583, respectively. These results are supported by the estimation of the population structure using the STRUCTURE software. Following the ad hoc statistic ΔK, population structure was apparent with the number of ancestral populations estimated to be two (K=2). Whereas the first group was composed of a cluster of the S.L accessions and the S.C accessions (N=147), the second group was composed of a cluster of the S.P accessions only (N=16).

ESTIMATES OF KINSHIP AND LINKAGE DISEQUILIBRIUM IN THE COLLECTION

Within the 163 accessions, the pairwise kinship estimates revealed a low degree of relatedness between individuals with a mean overall estimate of 0.0738. Pairwise linkage disequilibrium estimates (rs2) within each group revealed different levels of LD along chromosomes. On average, LD was highest in S.L (rs2=0.57), intermediate in S.C (rs2=0.54) and lowest in S.P (rs2=0.34). Within each group and for the 12 chromosomes, rs2
estimates ranged from 0.29 (K3) to 0.39 (K12) in S.P, from 0.5117 (K12) to 0.5619 (K11) in S.C and from 0.52 (K9) to 0.62 (K6) in S.L. More details on LD estimates for each chromosome in the three groups are given in Table I.

GENOME-WIDE ASSOCIATION

GWA was conducted trait by trait in order to dissect the optimal model obtained from the MLMM. After correcting for multiple testing, the GWA scan identified a total of 44 loci that were significantly associated with 19 out of the 36 traits (52.7%). These 44 loci were spread unevenly over the genome as all chromosomes carried at least one association (chromosomes and 12) but up to 10 associated loci were located on chromosome. Moreover, the number of associated loci per trait ranged from one (for traits in total) to nine (for SSC). Table II reports the detailed statistics of GWA (i.e. p-value, genomic location) for the loci associated with these 19 traits. For each trait, the heritability (estimated at the step of the model, based on the variance component σ², computed for all markers and representing the estimated genetic variance of the trait) ranged from 0.168 (threonate level) to 0.773 (proline level) with a median value of 0.553 (over all traits), while the missing heritability (not explained by the markers included in the model) ranged from 0.007 (threonine level) to 0.458 (nicotinate level) with a median value of 0.250. The percentage of variation explained for each trait was estimated from the optimal model obtained from the MLMM. The percentage of variation explained ranged from 16.2% to 74.3% for the aspartate level and the dehydroascorbate level traits, respectively, while for the SSC, the percentage of variation was estimated as 0.611 (61%, see Table II for details). For
each trait, the Manhattan plots displaying p-values for each locus in relation to its genomic location are shown in Supplementary Figure.

Finally, the peak SNP associated with SSC (Solcap_snp_sl_26678) belongs to a candidate gene (Solyc09g010080.2 (lin5), a fruit-specific beta-fructofuranosidase or invertase) that plays a role in sugar metabolism (Fridman et al., 2004); its mapping in our panel validates the methodological approach we employed. We identified putative candidate genes in the present study, especially in close vicinity to peak SNPs. For example, the peak SNP Solcap_snp_sl_26678 (chromosome 9, position: 2,411,368 bp) is associated with fruit ascorbate content, and is located ≈423 kbp upstream of a monodehydroascorbate reductase (NADH)-like protein (MDHAR, Solyc09g009390.2, position: 2,835,367 bp) previously shown to be linked to fruit ascorbate levels under stress conditions (Stevens et al., 2008). Similarly, the peak SNPs associated with nicotinate (Solcap_snp_sl_29349), malate (Solcap_snp_sl_19899) and sucrose levels (Solcap_snp_sl_17956) are located at 680 kbp, 7.9 kbp and 68 kbp, respectively, from three putative candidate genes that play a role in the genetic architecture of the variation of these traits. These genes are described as a nicotinate phosphoribosyl transferase protein (Solyc02g093290.2, position: 48,771,224 bp), an aluminum-activated malate transporter-like (Solyc06g072910.2, position: 41,337,629 bp) and a sugar transporter (galactosylgalactosylxylosyl protein 3-beta-glucuronosyltransferase, Solyc04g076920.2, position: 59,461,803 bp), respectively.

As a case study, we focused on the results associated with fruit malate content by compiling all the results obtained for this trait. Figures 3 and 4 illustrate these results. Malate levels were stable over the two years of the experiment (R2=0.621, Figure 3A), differences in malate levels were significant between groups (Figure 2 and 3B) and the trait was normally distributed within the panel of accessions (Figure 3C). GWA identified two significant SNPs associated with malate levels (see Figure 3D) without inflation in the distribution of p-values at the optimal step of the model (see Figure 3E), suggesting that population structure was well controlled. These two SNPs explained a proportion of the trait variation estimated at 39% (see Figure 3F). For each peak SNP, located on chromosome and 6, the allelic effects of each genotypic class (homozygotes and heterozygote) were estimated (see Figure 3G and H). Finally, we used the pairwise LD estimates (rs2) for each genomic location to (1) narrow down the genomic interval and (2) seek putative candidate genes in the vicinity of the two peak SNPs (Figure 4A and B), providing a local overview of the extent of LD and revealing an aluminum-activated malate transporter-like (Solyc06g072910.2, position: 41,337,629 bp) as a good candidate gene (see above).

DISCUSSION

The aims of the present study were to (1) investigate LD patterns in a panel of 163 tomato accessions including wild, admixed and cultivated accessions and (2) implement a stepwise GWA approach to reveal associations between SNP markers and traits related to fruit metabolites. We achieved these objectives with (1) the investigation of the LD patterns revealing different levels of LD along chromosomes and between the three groups constituting the panel and (2) the detection of genome-wide associations for 19 fruit metabolic traits. Finally, we demonstrated that GWA is powerful enough to link metabolic composition of fruits in tomato with genetic variation at a high resolution, despite a high level of LD and
population structure.

METABOLITE PROFILING AND PHENOTYPING OF TRAITS

The phenotypic traits focused on in the present study were measured for two years in a row (2007 and 2008) under similar growth conditions on an identical set of 163 tomato accessions. Only 36 traits (47.3%) were stable over the two years, suggesting that metabolite profiling is highly sensitive to the environmental conditions. Previous studies have reported developmental stage × genotype or environmental × genotype interactions for metabolite profiles, supporting the present results. For example, in tomato, metabolite profiling of 26 compounds revealed significant genotype by

Frary A, Nesbitt TC, Grandillo S, Knaap E, Cong B, Liu J, Meller J, Elber R, Alpert KB, Tanksley SD (2000) fw2.2: a quantitative trait locus key to the evolution of tomato fruit size. Science 289(5476): 85-88
Fridman E, Carrari F, Liu YS, Fernie AR, Zamir D (2004) Zooming in on a quantitative trait for tomato yield using interspecific introgressions. Science 305(5691): 1786-1789
Friendly M (2002) Corrgrams: Exploratory displays for correlation matrices. Am Stat 56: 316-324
Fulton TM, Bucheli P, Voirol E, Lopez J, Petiard V, Tanksley SD (2002) Quantitative trait loci (QTL) affecting sugars, organic acids and other biochemical properties possibly contributing to flavor, identified in four advanced backcross populations of tomato. Euphytica 127(2): 163-177
Gilbert KJ, Andrew RL, Bock DG, Franklin MT, Kane NC, Moore J-S, Moyers BT, Renaut S, Rennison DJ, Veen T, Vines TH (2012) Recommendations for utilizing and reporting population genetic
analyses: the reproducibility of genetic clustering using the program structure. Mol Ecol 21(20): 4925-4930
Giovannoni J (2001) Molecular biology of fruit maturation and ripening. Annu Rev Plant Physiol Plant Mol Biol 52: 725-749
Gomez L, Bancel D, Rubio E, Vercambre G (2007) The microplate reader: an efficient tool for the separate enzymatic analysis of sugars in plant tissues-validation of a micro-method. J Sci Food Agric 87(10): 1893-1905
Hamblin MT, Buckler ES, Jannink JL (2011) Population genetics of genomics-based crop improvement methods. Trends Genet 27(3): 98-106
Hamilton JP, Sim S-C, Stoffel K, Van Deynze A, Buell CR, Francis DM (2012) Single nucleotide polymorphism discovery in cultivated tomato via sequencing by synthesis. Plant Gen 5(1): 17-29
Hardy OJ, Vekemans X (2002) SPAGeDi: a versatile computer program to analyse spatial genetic structure at the individual or population levels. Mol Ecol Notes 2: 612-620
Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR (2012) Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet 44(8): 955-959
Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, Li C, Zhu C, Lu T, Zhang Z, Li M, Fan D, Guo Y, Wang A, Wang L, Deng L, Li W, Lu Y, Weng Q, Liu K, Huang T, Zhou T, Jing Y, Lin Z, Buckler ES, Qian Q, Zhang QF, Li J, Han B (2010) Genome-wide association studies of 14 agronomic traits in rice landraces. Nat Genet 42(11): 961-967
Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, Eskin E (2008) Efficient control of population structure in model organism association mapping. Genetics 178(3): 1709-1723
Kopka J, Schauer N, Krueger S, Birkemeyer C, Usadel B, Bergmuller E, Dormann P, Weckwerth W, Gibon Y, Stitt M, Willmitzer L, Fernie AR, Steinhauser D (2005) GMD@CSB.DB: the Golm Metabolome Database. Bioinformatics 21(8): 1635-1638
Korte A, Farlow A (2013) The advantages and limitations of trait analysis with GWAS: a review. Plant Methods 9(1): 29
Kumar S, Skjaeveland A, Orr R, Enger P, Ruden T, Mevik B-H, Burki F, Botnen A, Shalchian-Tabrizi K (2009) AIR: A batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10(1): 357
Listgarten J, Lippert C, Kadie CM, Davidson RI, Eskin E, Heckerman D (2012) Improved linear mixed models for genome-wide association studies. Nat Methods 9(6): 525-526
Loiselle BA, Sork VL, Nason J, Graham C (1995) Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). Am J Bot 82: 1420-1425
Lu Y, Zhang S, Shah T, Xie C, Hao Z, Li X, Farkhari M, Ribaut J-M, Cao M, Rong T, Xu Y (2010) Joint linkage-linkage disequilibrium mapping is a powerful approach to detecting quantitative trait loci underlying drought tolerance in maize. Proc Natl Acad Sci USA 107(45): 19585-19590
Mandel JR, Nambeesan S, Bowers JE, Marek LF, Ebert D, Rieseberg LH, Knapp SJ, Burke JM (2013) Association mapping and the genomic consequences of selection in sunflower. PLoS Genet 9(3): e1003378
Mangin B, Siberchicot A, Nicolas S, Doligez A, This P, Cierco-Ayrolles C (2011) Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity 108: 285-291
Marchini J, Howie B (2010) Genotype imputation for genome-wide association studies. Nat Rev Genet 11(7): 499-511
Mauricio R (2001) Mapping quantitative trait loci in plants: uses and caveats for evolutionary biology. Nat Rev Genet 2(5): 370-381
Mounet F, Moing A, Garcia V, Petit J, Maucourt M, Deborde C, Bernillon S, Le Gall G, Colquhoun I, Defernez M, Giraudel JL, Rolin D, Rothan C, Lemaire-Chamley M (2009) Gene and metabolite regulatory network analysis of early developing fruit tissues highlights new candidate genes for the control of tomato fruit composition and development. Plant Physiol 149(3): 1505-1528
Muños S, Ranc N, Botton E, Bérard A, Rolland S, Duffé P, Carretero Y, Le Paslier M-C, Delalande C, Bouzayen M, Brunel D, Causse M (2011) Increases in tomato fruit size and locule number is controlled by two key SNP located near Wuschel. Plant Physiol 156(4)
Nesbitt TC, Tanksley SD (2002) Comparative sequencing in the genus Lycopersicon: implications for the evolution of fruit size in the domestication of cultivated tomatoes. Genetics 162(1): 365-379
Porcu E, Sanna S, Fuchsberger C, Fritsche LG (2013) Genotype imputation in genome-wide association studies. Curr Protoc Hum Genet Chapter 1: Unit 25
Powell JE, Kranis A, Floyd J, Dekkers JCM, Knott S, Haley CS (2012) Optimal use of regression models in genome-wide association studies. Anim Genet 43(2): 133-143
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155(2): 945-959
Prudent M, Causse M, Génard M, Tripodi P, Grandillo S, Bertin N (2009) Genetic and physiological analysis of tomato fruit weight and composition: influence of carbon availability on QTL detection. J Exp Bot 60(3): 923-937
Purcell S, Neale BD, Todd-Brown K, Thomas L, Ferreira ME, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) Plink: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Gen 81(3): 559-75
Ranc N, Munos S, Santoni S, Causse M (2008) A clarified position for Solanum lycopersicum var cerasiforme in the evolutionary history of tomatoes (Solanaceae). BMC Plant Biol 8: 130
Ranc N, Munos S, Xu J, Le Paslier M-C, Chauveau A, Bounon R, Rolland S, Bouchet JP, Brunel D, Causse M (2012) Genome-wide association mapping in tomato (Solanum lycopersicum) is possible using genome admixture of Solanum lycopersicum var cerasiforme. G3 2(8): 853-864
Riedelsheimer C, Lisec J, Czedik-Eysenberg A, Sulpice R, Flis A, Grieder C, Altmann T, Stitt M, Willmitzer L, Melchinger AE (2012) Genome-wide association mapping of leaf metabolic profiles for dissecting complex traits in maize. Proc Natl Acad Sci U S A 109(23): 8872-8877
Robbins C, Sim S-C, Yang W, Deynze AE, Knaap E, Joobeur T, Francis D (2011) Mapping and linkage disequilibrium analysis with a genome-wide collection of SNPs that detect polymorphism in cultivated tomato. J Exp Bot 62(6): 1831-1845
Schauer N, Semel Y, Balbo I, Steinfath M, Repsilber D, Selbig J, Pleban T, Zamir D, Fernie AR (2008) Mode of inheritance of primary metabolic traits in tomato. Plant Cell 20(3): 509-523
Schauer N, Semel Y, Roessner U, Gur A, Balbo I, Carrari F, Pleban T, Perez-Melis A, Bruedigam C, Kopka J, Willmitzer L, Zamir D, Fernie AR (2006) Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat Biotechnol 24(4): 447-454
Schauer N, Steinhauser D, Strelkov S, Schomburg D, Allison G, Moritz T, Lundgren K, Roessner-Tunali U, Forbes MG, Willmitzer L, Fernie AR, Kopka J (2005) GC-MS libraries for the rapid identification of metabolites in complex biological samples. FEBS Lett 579(6): 1332-1337
Segura V, Vilhjalmsson BJ, Platt A, Korte A, Seren U, Long Q, Nordborg M (2012) An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat Genet 44(7): 825-830
Shirasawa K, Fukuoka H, Matsunaga H, Kobayashi Y, Kobayashi I, Hirakawa H, Isobe S, Tabata S (2013) Genome-wide association studies using single nucleotide polymorphism markers developed by re-sequencing of the genomes of cultivated tomato. DNA Res 20(6): 593-603
Sillanpää MJ (2010) Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses. Heredity 106(4): 511-519
Sim S-C, Durstewitz G, Plieske Jr, Wieseke R, Ganal MW, Van Deynze A, Hamilton JP, Buell CR, Causse M, Wijeratne S, Francis DM (2012) Development of a large SNP genotyping array and generation of high-density genetic maps in tomato. PLoS ONE 7(7): e40563
Soto-Cerda BJ, Cloutier S (2012) Association mapping in plant genomes. In: (Ed.) PMC ed Genetic Diversity in Plants. ISBN: 978-953-51-0185-7, InTech, DOI: 10.5772/33005. Available from: http://www.intechopen.com/books/genetic-diversity-in-plants/association-mapping-in-plant-genomes
Steinbach D, Alaux M, Amselem J, Choisne N, Durand S, Flores Rl, Keliet A-O, Kimmel E, Lapalu N, Luyten I, Michotey Cl, Mohellibi N, Pommier C, Reboux Sb, Valdenaire De, Verdelet D, Quesneville H (2013) GnpIS: an information system to integrate genetic and genomic data from plants and fungi. Database 2013
Stevens R, Buret M, Duffe P, Garchery C, Baldet P, Rothan C, Causse M (2007) Candidate genes and quantitative trait loci affecting fruit ascorbic acid content in three tomato populations. Plant Physiol 143(4): 1943-1953
Stevens R, Buret M, Garchery C, Carretero Y, Causse M (2006) Technique for rapid, small-scale analysis of vitamin C levels in fruit and application to a tomato mutant collection. J Agric Food Chem 54(17): 6159-6165
Stevens R, Page D, Gouble B, Garchery C, Zamir D, Causse M (2008) Tomato fruit ascorbic acid content is linked with monodehydroascorbate reductase activity and tolerance to
chilling stress. Plant Cell Environ 31(8): 1086-1096
Tabangin ME, Woo JG, Martin LJ (2009) The effect of minor allele frequency on the likelihood of obtaining false positives. BMC Proc Suppl 7: S41
The Tomato Genome Consortium (2012) The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485(7400): 635-641
Visscher PM, Brown MA, McCarthy MI, Yang J (2012) Five years of GWAS discovery. Am J Hum Genet 90(1): 7-24
Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38: 1358-1370
Xu J, Ranc N, Munos S, Rolland S, Bouchet J-P, Desplat N, Paslier M-C, Liang Y, Brunel D, Causse M (2013) Phenotypic diversity and association mapping for fruit quality traits in cultivated tomato and related species. Theor Appl Genet 126(3): 567-581
Zhang N, Brewer M, Knaap E (2012) Fine mapping of fw3.2 controlling fruit weight in tomato. Theor Appl Genet 125(2): 273-284
Zhao K, Tung CW, Eizenga GC, Wright MH, Ali ML, Price AH, Norton GJ, Islam MR, Reynolds A, Mezey J, McClung AM, Bustamante CD, McCouch SR (2011) Genome-wide association mapping reveals a rich genetic architecture of complex traits in Oryza sativa. Nat Commun 2: 467

FIGURE LEGENDS

Figure 1 | Lower matrix displaying the correlations between each analyzed phenotype adjusted for the year effect. The correlation coefficient (Spearman) ranges from -1 (red colour) to +1 (blue colour).

Figure 2 | Boxplot graphical representations of the distribution for the 19 traits that showed a significant association. In all graphs, mean values labeled with different letters are significantly different, whereas those with the same letters are not (Tukey’s test, P < 0.05). Abbreviations: S.C - Solanum cerasiforme; S.L - Solanum lycopersicum; S.P - Solanum pimpinellifolium.

Figure 3 |
A focus on the malate level results (A) Correlation for the malate level over 865 the two years of sampling in the collection of 163 accessions; (B) Variation of malate 866 level adjusted for the year effect within the three groups; (C) Distribution of the 867 adjusted malate level in the collection; (D) Manhattan plot for the 12 tomato 868 chromosomes (X-axis) and associated p-values for each marker (Y-axis); (E) QQplots 869 of the observed p-value distribution; (F) Evolution of genetic variance at each step of 870 the MLMM model (Blue: Genetic Variance Explained; Green: Total Genetic Variance; 871 Red: Error) for the optimal model (step=2-BIC criterion); (G and H) Allelic effect for 872 the two associated markers on chromosomes and Abbreviation : S.C - Solanum 873 cerasiforme; S.L - Solanum lycopersicum; S.P - Solanum pimpinellifolium 874 875 Figure | Manhattan plots displaying the –log10(p-values) (Y-axis) over genomic 876 positions (X-axis) in a window of 2.5Mbp up and downstream of the two loci 877 associated with the malate level trait that are located on chromosome (Fig 5a) and 878 chromosome (Fig 5b) Different colour schemes are used to represent the pairwise 879 LD estimates (rs2) for each genomic location 880 Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved 27 881 Table I | Intra-chromosomal linkage disequilibrium (rs2) in each tomato group This estimate takes into account the effect of population structure 882 (see Mangin et al., 2011) Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved 883 884 Mean pairwise rs2 K1 K2 K3 K4 K5 K6 K7 K8 K9 K10 K11 K12 all K S lycopersicum 0.5508 0.5988 0.5895 0.5570 0.6029 0.6235 0.5416 0.5397 0.5231 0.5539 0.5938 0.5389 0.5678 S cerasiforme 0.5391 0.5318 0.5394 0.5191 0.5500 0.5530 0.5320 0.5337 0.5204 0.5315 
0.5619 0.5117 0.5353 S pimpinellifolium 0.3323 0.3239 0.2884 0.3872 0.3557 0.2604 0.3478 0.3338 0.3431 0.3923 0.2917 0.3968 0.3378 885 886 28 887 Table II | Detailed information for the 44 significant associations detected within the 36 traits analysed using the MLMM model Locus name Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved Phenotype SNP1 Chromosome SNP Position (bp)2 p-value3 ASA solcap_snp_sl_12749 36931366 1.42e-05 ASA solcap_snp_sl_37057 63886939 2.94e-10 ASA solcap_snp_sl_26678 2411418 1.09e-07 Repressor of silencing Solyc09g009080.2 ASA solcap_snp_sl_46662 61773785 1.07e-05 Gene of unknown function Solyc09g074480.1 ASA solcap_snp_sl_62616 11 3393838 4.66e-08 ATP dependent RNA helicase Solyc11g010310.1 Asparagine solcap_snp_sl_32389 48943496 1.93e-07 Copine-like protein Solyc02g093520.2 Aspartate solcap_snp_sl_11456 58318210 1.67e-07 BHLH transcription factor Solyc04g074810.2 SSC solcap_snp_sl_26136 29851816 7.79e-26 Mannose-6-phosphate isomerase Solyc02g063220.2 UniRef Annotation Peptide transporter, TGF-beta receptor, type I/II extracellular region Conserved gene of unknown function (ITAG 2.3)4 Solyc06g065020.2 Solyc07g064580.2 SSC CT232_snp229 43207682 7.73e-10 UV excision repair protein RAD23 protein Solyc02g085840.2 SSC solcap_snp_sl_63048 71026 0.0006 CXE carboxylesterase Solyc03g005100.2 SSC solcap_snp_sl_35206 1748271 2.92e-21 Auxin signaling F-box1 family protein Solyc06g007830.1 SSC solcap_snp_sl_53288 60078938 1.22e-12 SSC solcap_snp_sl_65072 59477446 5.57e-08 SSC solcap_snp_sl_39725 3477979 1.34e-33 Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase Solyc07g054440.2 Agenet domain containing protein Solyc08g078530.2 Beta-fructofuranosidase (aka lin5) Solyc09g010080.2 Single-stranded nucleic acid binding R3H SSC solcap_snp_sl_10594 11 2481288 1.89e-13 SSC solcap_snp_sl_659 12 45751611 2.41e-06 Gene of 
unknown function NA Citrate solcap_snp_sl_19899 41345468 1.48e-07 Conserved gene of unknown function Solyc06g072930.2 DHA solcap_snp_sl_69445 64606433 3.16e-39 domain protein Ubiquitin carboxyl-terminal hydrolase family protein Solyc11g008250.1 Solyc09g089560.2 29 Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved DHA solcap_snp_sl_21770 11 3063738 8.49e-07 Pentatricopeptide repeat-containing protein SGN-U564017 Erythritol solcap_snp_sl_13558 36559326 1.24e-07 Pollen allergen Che a Solyc02g076860.2 Erythritol solcap_snp_sl_60698 10 64445598 5.98e-16 Flavin oxidoreductase/NADH oxidase Solyc10g086220.1 Fructose solcap_snp_sl_16136 59787171 9.31e-07 Conserved gene of unknown function Solyc05g050500.1 Fructose solcap_snp_sl_27215 38384375 9.05e-07 Katanin p60 ATPase-containing subunit Solyc06g066810.2 Fucose solcap_snp_sl_20802 60860146 2.70e-07 UV excision repair protein RAD23 protein Solyc03g117780.2 Fucose solcap_snp_sl_53149 53628534 1.63e-06 Structural constituent of ribosome Solyc04g056530.1 GABA solcap_snp_sl_35255 1330594 5.53e-08 Malate solcap_snp_sl_6196 13905175 1.28e-06 Gene of unknown function SGN-U565892 Malate solcap_snp_sl_19899 41345468 2.48e-08 Conserved gene of unknown function Solyc06g072930.2 Nicotinate solcap_snp_sl_29349 49451582 3.83e-06 Uridylyltransferase PII Solyc02g094300.2 Proline solcap_snp_sl_100675 28798838 3.71e-06 Conserved gene of unknown function NA Proline solcap_snp_sl_32499 21807134 3.91e-07 Rhamnose solcap_snp_sl_40309 84253735 2.61e-08 Embryo-specific SGN-U565850 Rhamnose solcap_snp_sl_34196 59102190 2.32e-09 Conserved gene of unknown function Solyc03g115250.2 TatD DNase domain-containing deoxyribonuclease Membrane-associated progesterone receptor component Solyc06g007310.2 Solyc06g035870.2 Rhamnose solcap_snp_sl_56631 1403227 9.41e-06 Patatin-1-Kuras Solyc08g006860.2 Rhamnose solcap_snp_sl_39722 3484890 2.10e-10 Gene of 
unknown function SGN-U565153 Sucrose solcap_snp_sl_13549 36490995 2.57e-06 Conserved gene of unknown function Solyc02g076800.1 Sucrose solcap_snp_sl_17956 59392982 6.01e-05 Glutamyl-tRNA reductase Solyc04g076870.2 Sucrose solcap_snp_sl_29483 4037126 9.51e-09 Glycosyltransferase family GT8 protein Solyc05g009820.2 Threonate solcap_snp_sl_11456 58318160 5.73e-06 BHLH transcription factor Solyc04g074810.2 Threonine solcap_snp_sl_32389 48943446 3.75e-07 Copine-like protein Solyc02g093520.2 Tocopherol solcap_snp_sl_46445 10 2199297 4.35e-07 Conserved gene of unknown function Solyc10g008030.2 30 Tyramine solcap_snp_sl_14531 2587919 1.12e-05 Conserved gene of unknown function Solyc08g008120.2 Tyramine solcap_snp_sl_64706 57571484 1.18e-07 Lysine-specific demethylase 5A Solyc08g076390.2 Tyramine solcap_snp_sl_36166 11 762353 1.54e-06 Transcription regulator SGN-U275742 Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved 888 889 890 891 892 –SNP names as given in the SOLCAP SNP array (see http://solcap.msu.edu) – SNP genomic position onto the tomato reference genome (v2.40) – SNP p-values – Name of the locus the peak SNP belongs to (according to the tomato genome annotation v2.30) 893 31 894 Table III | Summary of trait associations showing the heritability of the trait (h2, step 895 in the MLMM model), the missing heritability (h2 at the optimal model), the 896 percentage of associated variation of the trait (PVE), the number of significant loci 897 associated with the trait variation Phenotype Trait h2 Missing h PVE # associations ASA 0.553 0.333 0.561 Asparagine 0.417 0.208 0.220 Aspartate 0.284 0.301 0.162 Brix 0.600 0.185 0.611 Citrate 0.423 0.299 0.181 DHA 0.595 0.192 0.743 Erythritol 0.534 0.286 0.358 Fructose 0.565 0.250 0.386 Fucose 0.415 0.365 0.481 GABA 0.415 0.184 0.237 Malate 0.642 0.182 0.390 Nicotinate 0.595 0.458 0.279 Proline 0.773 0.282 0.461 Rhamnose 
0.579 0.195 0.504 Sucrose 0.585 0.220 0.439 Threonate 0.168 0.174 0.170 Threonine 0.348 0.007 0.187 Tocopherol 0.306 0.261 0.224 Tyramine 0.612 0.347 0.472 Min 0.168 0.007 0.162 Max 0.773 0.458 0.743 Median 0.553 0.250 0.386 898 899 Downloaded from www.plantphysiol.org on July 8, 2014 - Published by www.plant.org Copyright © 2014 American Society of Plant Biologists All rights reserved 32 Citrate Putrescine Saccharate Phenylalanine Lysine Methionine Asparagine Glutamine betaAlanine GABA Serine Threonine Malate Galacturonate Glucuronate Xylose Glutarate2oxo Tocopherol Inositol1P Erythritol Brix Fructose Sucrose Maltitol Proline ASA DHA Nicotinate Tyramine Glycerol3P Glutamate Aspartate Threonate Rhamnose Fucose Maltose Citrate Putrescine Saccharate Phenylalanine Lysine Methionine Asparagine Glutamine betaAlanine GABA Serine Threonine Malate Galacturonate Glucuronate Xylose Glutarate2oxo Tocopherol Inositol1P Erythritol Brix Fructose Sucrose Maltitol Proline ASA DHA Nicotinate Tyramine Glycerol3P Glutamate Aspartate Threonate Rhamnose Fucose Maltose Figure | Lower matrix displaying the correlations between each analyzed phenotype adjusted for the year effect The correlation coefficient (Spearman) ranges from -1 (red colour) to +1 (blue colour) -0.6 on July-0.4 -0.2 by www.plant.org -1 from -0.8 Downloaded www.plantphysiol.org 8, 2014 - Published Copyright © 2014 American Society of Plant Biologists All rights reserved 0.2 0.4 0.6 0.8 Threonine -0.4 0.0 Proline 0.4 -1.0 b a S.L c S.P b a a a S.P c -0.2 Aspartate 0.2 -0.4 a a a S.P b S.P a b 0.4 0.4 S.L Groups a S.L Groups b S.P c S.P c S.P 0.0 0.0 S.C b S.C a S.L Groups a S.L Groups a S.L Fructose 0.2 0.5 S.L Groups a -0.6 0.0 S.C b 0.4 S.C b S.C b S.C Groups -0.2 Groups a Erythritol Nicotinate 0.4 S.C a S.P a 0.0 -1.0 S.L Groups c S.P b S.P a b S.P -0.6 Asparagine 0.0 S.C b S.L Groups a S.L Groups a S.L 0.0 0.4 0.0 -0.4 Malate Threonate -0.4 DHA 0.4 S.C b S.C b S.C -0.6 S.P -0.2 Sucrose 0.4 S.L -0.8 1.1 1.3 1.5 1.7 
[Figure: boxplots of the distribution of the 19 traits that showed a significant association, one panel per trait (X-axis "Groups": S.C, S.L, S.P); letters above the boxes mark the Tukey's test groupings (P < 0.05).]

[Figure: focus on the malate level. (A) Year effect on the measurement of malate content (X-axis: malate content measured in 2007; Y-axis: malate content measured in 2008); (B) Malate content in each species (S.P, S.C, S.L); (C) Distribution of malate content in the collection (X-axis: malate content; Y-axis: density); (D) Manhattan plot (X-axis: chromosome; Y-axis: -log10(pval)); (E) QQ plot for the optimal model according to mbonf (X-axis: expected -log10(p); Y-axis: observed -log10(p)); (F) Partition of variance at each step of MLMM (X-axis: step; Y-axis: %var); (G and H) Allelic effect for the associated SNP (Y-axis: malate content; genotype classes CC, CG, GG and TT, TC, CC).]

[Figure: regional Manhattan plots (Y-axis: -log(p-value)) with pairwise LD maps (LD map type: r-square) for the two loci associated with the malate level. (A) positions 11041792 to 15817593, physical distance 4775.8 kb; (B) positions 39116919 to 43961486, physical distance 4844.6 kb.]
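The –log10 transformation used on the Y-axis of the Manhattan plots can be illustrated with the two malate p-values listed in Table II. The sketch below is only an illustration: the significance line it draws is a plain Bonferroni correction with an assumed alpha of 0.05 over the 5995 SNPs mentioned in the abstract, which is not necessarily the criterion applied by the MLMM procedure in the paper, and the names `neg_log10`, `ALPHA`, and `bonferroni_line` are hypothetical.

```python
import math

# p-values of the two malate-associated SNPs, copied from Table II.
malate_hits = {
    "solcap_snp_sl_6196": 1.28e-06,
    "solcap_snp_sl_19899": 2.48e-08,
}

N_MARKERS = 5995  # number of SNP markers stated in the abstract
ALPHA = 0.05      # ASSUMPTION: illustrative family-wise error rate, not from the paper


def neg_log10(p: float) -> float:
    """Return -log10(p), the quantity plotted on a Manhattan plot Y-axis."""
    return -math.log10(p)


# ASSUMPTION: a plain Bonferroni line, used here only as a visual reference;
# the paper selects its optimal model with MLMM-specific criteria.
bonferroni_line = neg_log10(ALPHA / N_MARKERS)

for snp, p in malate_hits.items():
    print(f"{snp}: -log10(p) = {neg_log10(p):.2f}")
print(f"illustrative Bonferroni line: {bonferroni_line:.2f}")
```

Smaller p-values map to taller points, so the strongest association in a region appears as the peak of the plot; under the assumed threshold, both malate SNPs sit above the reference line.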