Research and development of molecular markers for breed selection of otter clam (Lutraria rhynchaena Jonas, 1844) in the direction of growth.
MINISTRY OF EDUCATION AND TRAINING
HA NOI NATIONAL UNIVERSITY OF EDUCATION

TRIEU ANH TUAN

RESEARCH AND DEVELOPMENT OF MOLECULAR MARKERS FOR BREED SELECTION OF OTTER CLAM (Lutraria rhynchaena Jonas, 1844) IN THE DIRECTION OF GROWTH

Subject: Genetics
Code: 9.42.01.21

SUMMARY OF THE DISSERTATION FOR THE DEGREE OF DOCTOR OF PHILOSOPHY (Ph.D.) IN BIOLOGY

Ha Noi, 2023

This dissertation is submitted to the Committee of the HA NOI NATIONAL UNIVERSITY OF EDUCATION

Supervisors: Assoc. Prof. Dr. Nguyen Xuan Viet; Assoc. Prof. Dr. Thai Thanh Binh
Referee 1: Assoc. Prof. Dr. Nguyen Dang Ton, Institute of Genome Research, Vietnam Academy of Science and Technology
Referee 2: Assoc. Prof. Dr. Dang Thi Lua, Research Institute for Aquaculture No. 1
Referee 3: Assoc. Prof. Dr. Nguyen Quang Huy, Hanoi University of Science, Vietnam National University

This research was performed at the Ha Noi National University of Education.
Date of oral presentation: ……,… /…… /2023
Copies of this dissertation are available at:
- The National Library of Viet Nam
- Library of Ha Noi National University of Education

PREAMBLE

Urgency of the dissertation topic

Otter clam (Lutraria rhynchaena Jonas, 1844) is a bivalve mollusk that prefers warm waters of 18-30°C with salinity ranging from 25 to 30‰ (Ha Duc Thang, 2010). It feeds mainly on organic detritus, algae and plankton (Ha Duc Thang, 2010). In Vietnam, otter clam is naturally distributed and widely farmed in the waters of Quang Ninh, Hai Phong and Khanh Hoa provinces, with a total water surface area of about 1,000 ha, an output of about 2,621.6 tons and an estimated income of over 200 billion VND/year. Otter clam has high nutritional value: it is rich in protein, minerals and amino acids (Do Dang Khoa et al., 2014).

Recently, demand for otter clam has been increasing and the economic returns from farming it are large, so the commercial farming area has expanded significantly. As a result, there is a persistent demand for certified otter
clam for breeding in large quantities. According to the Directorate of Fisheries, the commercial otter clam market needs an estimated 100 million grade I seed clams each year (Directorate of Fisheries, 2020). Meanwhile, advanced breeding programs have not yet been applied to commercial otter clam in Vietnam, and seed production by the traditional method neither meets the demand for qualified seed nor avoids breed degradation caused by inbreeding. Cultivated otter clams therefore suffer from slow, uneven growth and are vulnerable to disease (Van Don Department of Agriculture and Rural Development, 2019).

According to Decision No. 1664/QĐ-TTg dated October 4, 2021 on the development strategy for marine aquaculture to 2030 with a vision to 2045, marine aquaculture in general, and mollusk culture including otter clam in particular, is oriented towards sustainable production that minimizes negative impacts on the environment and marine ecosystems (Government, 2021). To implement this strategy, it is necessary to focus on exploiting and developing indigenous breed sources and to apply scientific breeding programs that improve survival rate, growth rate and tolerance to environmental conditions.

Marker-assisted breeding is receiving increasing attention, and the application of molecular markers in breeding in general, and in aquaculture in particular, has achieved notable results (Gjedrem et al., 2012). With the advent of next-generation sequencing (NGS), the genomes of many organisms, including aquatic species, have been sequenced. By exploiting genome and transcriptome sequences, many molecular markers have been published that are reliable and effective tools for breeders, both for rapidly creating new varieties and for improving the genetics of existing ones (Hetzel et al., 2000). Among them, the single nucleotide polymorphism (SNP)
marker is considered one of the most accurate, effective and widely used markers currently applied in breeding for yield and quality (Liu, 2007). Owing to these strengths, SNP markers are used as an effective selection tool for traits of interest in breeding programs for many livestock species, including aquatic species (Liu, 2007). Importantly, SNPs can occur in coding regions, directly affecting the site of interest, and are very effective for determining the correlation between SNPs and particular traits (Beuzen et al., 2000).

Therefore, applying molecular biology techniques to accurately identify the species of interest, and using next-generation sequencing to provide the genome sequence data needed to develop a set of molecular markers, in particular markers related to the growth trait of otter clam, has both theoretical and practical significance for otter clam farming. For these theoretical and practical reasons, we selected and carried out the topic "Research and development of molecular markers for breed selection of otter clam (Lutraria rhynchaena Jonas, 1844) in the direction of growth".

Research objective

To develop a set of molecular markers related to growth traits in otter clam, contributing to the development of otter clam farming in particular and Vietnam's marine aquaculture in general.

Research content

- Investigate and evaluate the species composition of otter clam and the current status and potential of otter clam farming in Van Don district, Quang Ninh province.
- Build DNA barcodes for the white-siphon otter clam (Lutraria rhynchaena).
- Sequence the otter clam genome and develop a set of SNP markers, providing a genome database for Lutraria rhynchaena collected in Van Don district, Quang Ninh province.
- Develop a number of SNP markers related to the growth trait in otter clam.

Scientific and practical significance of the dissertation

Scientific significance:
- The thesis provides important data on the composition of natural otter clam populations in Van Don and on the current status and potential of otter clam farming in Van Don district, Quang Ninh province.
- The DNA barcodes constructed for otter clam are an important scientific basis for accurately identifying the species when sampling for genome sequencing and research, and can be useful for the traceability of products from Vietnamese otter clam.
- The otter clam genome sequence data produced in the thesis are valuable for genomics research and form the basis for exploiting and developing a set of molecular markers for selecting and breeding otter clam.
- The SNP markers developed for the growth trait of otter clam provide a molecular tool that may be valuable for growth-oriented breeding in Vietnam.

Practical significance:
- The DNA barcodes provide a molecular tool that can accurately identify otter clam for research and traceability.
- With the genomic dataset, a reference genome of otter clam is provided for the first time.
- A number of SNP markers related to the growth trait have been developed that are useful for growth-oriented, marker-assisted selection and breeding in otter clam.

New contributions of the dissertation

- The dissertation evaluated the species composition of otter clam distributed naturally in Van Don and the potential of otter clam farming in Van Don, Quang Ninh.
- DNA barcodes for otter clam were constructed based on 16S rRNA and COI gene sequences, providing additional tools that can accurately identify and classify otter clam species.
- For the first time, the otter clam genome was published on GenBank, providing reference genome data for research on and exploitation of otter clam genomic data.
- Detected and
selected 11 growth-related SNP markers for growth-oriented breeding in otter clam.

CHAPTER 1. RESEARCH OVERVIEW

1.1 Classification position and distribution area of otter clam

1.1.1 Classification position

Phylum: Mollusca Linnaeus, 1758
Class: Bivalvia Linnaeus, 1758
Order: Veneroida Bieler R., 2010
Family: Mactridae Lamarck, 1799
Genus: Lutraria Lamarck, 1799
Species: Lutraria rhynchaena Jonas, 1844 (synonyms: L. philippinarum Reeve, L. philippinarum Deshayes)
Vietnamese name: tu hài; English names: snout otter clam, otter clam

Figure 1.1 Otter clam (Lutraria rhynchaena) collected in Van Don, Quang Ninh

1.1.2 Natural distribution of otter clam

Worldwide, otter clam is distributed mainly in the waters west and south of Australia, in some Asian countries such as China, Thailand and the Philippines, and in North America (Beuzen et al., 2000). In Vietnam, otter clam is distributed mainly in northern coastal waters where salinity is stable at 25 to 30‰ and the bottom consists of sand, small gravel or mollusk shells, from Lan Ha Bay of Cat Ba island (Hai Phong City) to Van Don island district (Quang Ninh) (Pham Thuoc, 2006).

1.1.3 Economic value and the situation of farming and exploiting otter clam

According to statistics, global fishery production reached nearly 80 million tons in 2018 and 94.6 million tons in 2020, of which mollusk production accounts for one third; otter clam production is estimated at million tons (FAO, 2020).

1.2 DNA barcoding and its application in classification and species identification

1.2.1 DNA barcoding

DNA barcoding has been a widely used and important concept worldwide for more than a decade. The concept was first introduced in 2003 by Professor Paul Hebert to help identify samples using short DNA fragments from specific gene segments or a combination of several genes (Hebert et al., 2003). Using DNA barcoding techniques in research will help taxonomists to accurately
identify species and contribute to the assessment of the genetic diversity of organisms (Zalapa et al., 2012).

1.2.2 Application of DNA barcoding in the classification of organisms and mollusks

DNA barcoding has been applied to classify many groups of organisms, including fish (Lara et al., 2010), (Murgarella et al., 2016), (Weigt et al., 2012). In crustaceans, lobster species were classified by sequencing the COI gene region, including one Vietnamese lobster and three other species collected in Australia and Sri Lanka (Hieu et al., 2019). In molluscs, DNA barcoding has been successfully applied to identify 14 bivalve species, and the COI sequences have contributed new findings on the high biodiversity of mollusks (Moira et al., 2021). DNA barcoding has also proved very effective in classifying oysters (Trivedi et al., 2012), echinoderms (Ward et al., 2019) and sweet snails (Binh et al., 2017). In a barcoding analysis of clams, COI sequences of 56 species and 16S gene sequences of 19 species were used to determine the phylogenetic relationships of clams (Liu and Zhang, 2018).

1.3 Gene sequencing technology and SNP molecular markers

1.3.1 Gene sequencing technology and applications

Next-generation sequencing (NGS) technology arose from the simultaneous development of techniques for template preparation, sequencing and imaging, and genome alignment and assembly. NGS generates a huge amount of data (up to 600 Gb) at low cost, with the outstanding advantages of fast and accurate sequencing. Commonly used next-generation sequencing systems include Roche 454 (454 Life Sciences), HiSeq 2000/4000 (Illumina) and AB SOLiD (Life Technologies).

RAD sequencing (Restriction site Associated DNA Sequencing) is a genome sequencing technique that uses restriction enzymes to cut genomic DNA into short segments and attach
adapters to both ends; a large number of genetic variants such as SNPs can then be identified from the next-generation sequencing data (Baird et al., 2008).

Gene sequencing technology has been applied to successfully decode the genomes of 13 mollusc species: Argopecten purpuratus (Li et al., 2018), Bathymodiolus platifrons (Sun et al., 2017), Chlamys farreri (Li et al., 2017), C. gigas (Zhang et al., 2012), Limnoperna fortunei (Uliano-Silva et al., 2018), Modiolus philippinarum (Sun et al., 2017), Mytilus galloprovincialis (Nguyen et al., 2014), Patinopecten yessoensis (Wang et al., 2017), Pinctada fucata (Takeuchi et al., 2012), (Du et al., 2017), Ruditapes philippinarum (Mun et al., 2017), Saccostrea glomerata (Powell et al., 2018), Scapharca broughtonii (Bai et al., 2019) and Venustaconcha ellipsiformis (Renaut et al., 2018).

1.3.2 SNP molecular markers and applications

Single nucleotide polymorphisms (SNPs) are DNA sequence variations that occur when a single nucleotide (A, T, C or G) in the genome sequence differs among individuals in a population. The distribution of SNPs in the genome is heterogeneous: SNPs occur at different frequencies in different chromosomal regions, and are often more frequent in non-coding than in coding regions (Liu and Cordes, 2004). Most SNPs are biallelic and involve the substitution of cytosine (C) with thymine (T), which is the most abundant variation in DNA sequences between individuals in a population.

SNP markers have been applied in research on some important livestock species, with SNP densities ranging from 6.5K to 930K (Wall et al., 2014). In the Pacific oyster, there is one SNP every 60 bp in protein-coding regions and one SNP every 40 bp in non-coding regions (Sauvage et al., 2007). In European flat oysters, the SNP density was also shown to be relatively high, with one SNP every 76 bp in protein-coding regions and one SNP every 47 bp in non-protein-coding regions (Harrang et al.,
2013).

1.4 Some genes related to the growth trait in bivalve molluscs

Several studies have shown polymorphisms in growth-related genes in bivalve molluscs. Liying et al. characterized the IGFBP gene (PyIGFBP) related to the growth trait of the Yesso scallop (Liying et al., 2014). Two SNPs, PyE2F3-1 and PyE2F3-2, were associated with growth in scallops, of which PyE2F3-1 was correlated with shell length, shell height, body weight and muscle weight (Xianhui et al., 2019). In oysters, association and haplotype analysis of single nucleotide polymorphisms showed that four SNPs (positions -904, -522, -272 and -262 bp) and two haplotypes (CCCC and TCTC) are directly related to the growth rate of C. gigas (Liting et al., 2021).

CHAPTER 2. RESEARCH MATERIALS AND METHODS

2.1 Research Materials

Samples were collected at the Breeding Center of Brackish and Salty Aquaculture, College of Economics, Technology and Fisheries. The collection includes one otter clam muscle tissue sample for genome sequencing, and muscle tissue samples of 30 fast-growing and 30 slow-growing otter clam individuals for SNP screening sequencing. Otter clam digestive gland samples were collected for genomic sequencing.

Research location: DNA extraction and purification were carried out at the Biotechnology Department, College of Economics, Technology and Fisheries, Tu Son city, Bac Ninh province. Sequencing was conducted at the Genomic Technology Center, Deakin University, Victoria, Australia.

2.2 Research Methods

2.2.1 Survey and assessment of otter clam species composition, current status and economic efficiency

2.2.1.1 Survey on the species composition of otter clam in Van Don, Quang Ninh province

- The survey method for tidal areas was used, with a quantitative frame (quadrat) of 10 m2 at each sampling point. The collected samples were preserved according
to the documentation of English (1994) (English et al., 1997), the Regulation on marine integrated survey of the State Committee of Science and Technology (State Science and Technology Commission, 1981) and the procedure for investigating marine resources and environment (Institute of Marine Resources and Environment, 1994).
- Identification of otter clam species by the morphological method: collected individuals were classified based on external morphological characteristics (shape, shell color, siphon color and other morphological classification criteria) following the taxonomy of bivalve molluscs by Dang Ngoc Thanh (Thanh Dang Ngoc, 2007).

2.2.1.2 Evaluation of the current status and economic efficiency of otter clam farming in Van Don

The total number of survey samples was 400. The surveyed information included farming area, farming method, stocking density, grow-out time and harvest.

RNA extraction was performed using the Zymo Quick-RNA Miniprep kit, and an RNA library was prepared using the Nugen Universal Plus mRNA-Seq Kit (Tecan Genomics, San Carlos, CA) according to the manufacturer's instructions. Otter clam transcriptome sequencing was performed on an Illumina NovaSeq6000 sequencer.

2.2.4 ezRAD sequencing to screen Vietnamese otter clam for growth trait-associated SNPs

Genomic DNA libraries of the fast- and slow-growing otter clam groups were prepared using the ezRAD technique (Toonen et al., 2013) with the TruSeq Nano HT Library Preparation kit (Illumina). The DNA library obtained by the ezRAD technique consists of fragments of 350-550 bp. ezRAD sequencing was carried out on the HiSeq6000 next-generation sequencing system (Illumina) at the Center for Genomic Technology, Deakin University, Australia.

2.2.5 Data analysis and processing methods

2.2.5.1 Processing of survey data on otter clam species composition and assessment of the current status and potential of otter clam farming

Survey data
were processed and calculated according to (Institute of Marine Resources and Environment, 1994); surveys, interviews, rapid rural appraisal (RRA) and questionnaire surveys (QS) were carried out according to Groves' method (Groves et al., 2004). One-way ANOVA was performed using Minitab 16.4 and Microsoft Office Excel 2016.

2.2.5.2 Data processing to build DNA barcodes to identify L. rhynchaena

The 16S rRNA and COI gene sequences of Lutraria species were aligned using BioEdit and MAFFT (Katoh and Standley, 2013), then analyzed using BioEdit and MEGA X.

2.2.5.3 Genome analysis of otter clam L. rhynchaena

*Genome size estimation and assembly: Raw datasets were preprocessed using fastp (v0.19.4) (Chen et al., 2018) and Trimmomatic (v0.36) (Bolger et al., 2014). The assembly quality of the otter clam sequences was evaluated using bowtie2 v2.3.3.1 (Langmead and Salzberg, 2012). K-mers were counted with Jellyfish v2.2.6 (Marçais and Kingsford, 2011), and GenomeScope was used to construct k-mer profiles and estimate the genome size (Vurture et al., 2017). The genome was assembled de novo using MaSuRCA (Zimin et al., 2013), and the completeness of the assembly was evaluated using BUSCO v3.0.2 (Simão et al., 2015). Long reads were aligned to the assembly using minimap2 (Li et al., 2018), and the assembly was corrected using Purge Haplotigs (Renaut et al., 2018).

*Heterozygosity estimation, repeat sequences and polymorphism search: Single nucleotide polymorphisms (SNPs) were detected using bowtie2 (Langmead and Salzberg, 2012), and repeat sequences were identified using RECON and RepeatScout (Langmead and Salzberg, 2012), (Chen et al., 2018), (Price et al., 2005).

*Some features of the bivalve genome: Assembly
size, N50 length and other characteristics of the L. rhynchaena genome were compared with the previously published genomes of 13 molluscs.

*Assembly of the transcriptome: The RNA-seq library was sequenced, the raw data were preprocessed, and a transcriptome was assembled to aid gene prediction. De novo assembly was performed using Trinity (Grabherr, 2011). Short reads were processed using Bowtie2 (Langmead and Salzberg, 2012). The transcriptome was aligned to the genome using GMAP (Zalapa et al., 2012).

*Gene prediction and annotation: Gene prediction was performed using MAKER (Holt and Yandell, 2011), major protein groups were aligned using MAFFT v7.394 (Katoh and Standley, 2013), and the proteins were analyzed using IQ-TREE v1.5.5 (Nguyen et al., 2014). For SNP marker analysis, reads were mapped using BWA (Li and Durbin, 2009) and SNPs were detected using SAMtools (Li, 2009). SNPs were aligned with contigs, and a nucleotide difference was accepted when detected on at least four reads (Gao et al., 2012). Data were processed following the Stacks pipeline (Catchen et al., 2011), (Rochette et al., 2019).

2.2.5.4 SNP screening analysis for otter clam L. rhynchaena

Datasets for the fast-growing and slow-growing otter clam groups were sequenced and preprocessed with Trimmomatic. Adapters and sequences of low quality (Q … 30.0; Qual by Depth value: QD < 2.0; … or more SNPs within a 35 bp window (SnpCluster); SNP located 20-25 bp from the ends of the read sequence).

Separate screening: The VCF file produced by GATK was passed to the vcf-isec tool to filter for positions on the same transcript where mutations were detected in only one of the two groups (fast- or slow-growing otter clam).

Primer design and amplification of sequences containing SNP markers in the fast-growing and slow-growing otter clam groups: Based on the results of screening potential SNP markers by
bioinformatics method, primer pairs were designed for the respective SNPs in both the fast-growing and slow-growing otter clam groups from the extracted FASTA files, using primer design software with preset parameters: primer length, primer binding site and melting temperature (Tm).

CHAPTER 3. RESEARCH RESULTS AND DISCUSSION

3.1 Species composition of otter clams distributed naturally in Van Don district, and the current status and potential for development of otter clam culture in Van Don district, Quang Ninh province

3.1.1 Survey on the composition of otter clam species naturally distributed in Van Don district

The survey recorded that the species naturally distributed at most survey sites in Van Don was the white-tailed otter clam L. rhynchaena; the red-tailed otter clam L. arcuata was found only in Dong Xa commune, Southeast, Co To district, Quang Ninh province.

Table 3.1 Species composition of otter clam in Van Don, Quang Ninh

Evaluation criteria                          Lutraria rhynchaena Jonas, 1844    Lutraria arcuata Deshayes in Reeve, 1854
Number of collected samples (individuals)    61                                 …
Sample weight (kg)                           3.317                              0.112
Density (individuals/m2)                     0.15 ± 0.03                        0.1 ± 0.02

3.1.2 Evaluation of the current status and potential of developing Lutraria rhynchaena farming in Van Don district

The survey of 400 of the 1,250 aquaculture households in Van Don in 2019 showed that 152 households raised otter clam; the species raised is Lutraria rhynchaena Jonas, 1844. The total area is estimated at 209 hectares, the average aquaculture area is 1.38 hectares/household, and the average scale of otter clam (Lutraria rhynchaena) farming is 0.18 hectare/household. The largest area for raising otter clam is 9.25 hectares in Ban Sen commune and the smallest is 1.5 hectares in Ngoc Vung commune. The average farming scale ranges from 0.08 to 0.3 hectare/household.

The salinity of the seawater in Van Don is always stable, ranging from 28.2
to 29.3‰, which is within the suitable range for the otter clam. Our survey shows that in summer, when rainfall is high, salinity decreases but still fluctuates in the range of 25.65-28.75‰; in autumn and winter, with low rainfall, salinity is stable, ranging from 28.5 to 29.5‰ (Appendix 9). The assessment of the influence of salinity on the growth and survival rate of the otter clam (Appendix 10) also shows that the salinity of the seawater at Van Don is well suited to the growth of the otter clam (Lutraria rhynchaena). In addition, otter clam farming households have an average of 12.68 ± 15.54 years of farming experience, accumulated through many years of practice and through the sharing of information and experience among local people raising Lutraria rhynchaena. The price of commercial otter clam is stable in the range of 100,000-120,000 VND/kg, higher than that of other mollusks, and demand is high.

3.2 Building DNA barcodes for otter clam L. rhynchaena

3.2.1 Comparison of the 16S and COI gene sequences of Vietnamese otter clam with the DNA sequences of other otter clam species published on GenBank

To develop a DNA marker that identifies and distinguishes the Vietnamese otter clam from other species of the genus Lutraria, a separate database for the genus was built by screening the sequences of Lutraria species available in GenBank. In addition, sequences of the 16S and COI gene regions of Spisula solida (MG934910.1) were used as the outgroup when building the phylogenetic trees.

Comparison of DNA sequences of the 16S gene region of species of the genus Lutraria: Comparing the 16S sequence of otter clam L. rhynchaena with the four DNA sequences of three otter clam species published on GenBank shows that Vietnamese otter clam has a high degree of similarity
with L. australis, reaching 100%, while its similarity with the other two species ranged from 92.24 to 92.43%.

Comparison of DNA sequences of the COI gene region of species of the genus Lutraria: Comparing the COI gene sequence of the Vietnamese otter clam L. rhynchaena with those of otter clam species published on GenBank showed a high degree of similarity with L. australis (99.84%) but low similarity with L. maxima and L. arcuata (86.73-87.18%). Sequences with a similarity greater than or equal to 97% are considered to belong to the same species, and sequences with a similarity below 97% to different species (Stackebrandt and Ebers, 2006). This confirms that the otter clam from Vietnam (L. rhynchaena) and L. australis are the same species and have the synonym L. philippinarum.

3.2.2 Phylogenetic tree construction for the 16S and COI gene regions among otter clam species

The phylogenetic tree was built from 16S gene sequences using the ML (Figure 3.1A) and BI (Figure 3.1B) algorithms. Figures 3.1A and 3.1B show that the tree consists of two branches: one is the genus Spisula and the other comprises the Lutraria species, within which the closely related species L. maxima and L. arcuata form one clade and the closely related species L. rhynchaena and L. australis form the other.

Figure 3.1 The evolutionary relationship between Lutraria species based on analysis of 16S gene sequences using the ML algorithm of the MEGA X program (A) and the BI algorithm of the BEAST program (B)

Therefore, the 16S gene sequence can be used as a barcode to distinguish otter clam L. rhynchaena from other species.

The phylogenetic trees built from COI gene region sequences using the ML (Figure 3.2A) and BI (Figure 3.2B) algorithms show that
species of the genus Lutraria fall in the same major group of the phylogenetic tree, while the other branch is the control species S. solida. The Lutraria branch consists of two subgroups: the first comprises two closely related groups, one containing L. rhynchaena and L. australis and the other L. arcuata and L. maxima; the second is L. lutraria. The results show that the COI gene sequence can be used as a barcode to distinguish otter clam L. rhynchaena from other species.

Figure 3.2 The evolutionary relationship between Lutraria species based on COI gene sequence analysis using the ML algorithm of the MEGA X program (A) and the BI algorithm of the BEAST program (B)

Thus, for the COI gene sequence, the MEGA X and BEAST programs give similar results, in which L. rhynchaena and L. australis are in the same group and closely related to each other, and the remaining subgroup includes L. maxima and L. lutraria. The results show that the COI sequence can be used as a marker to distinguish L. rhynchaena from other species in the genus Lutraria.

3.2.3 Development of DNA barcoding markers for otter clam (L. rhynchaena)

Analysis of the 16S gene region of Lutraria species, with a common length of 438 nucleotides, identified a sequence region specific to L. rhynchaena based on the occurrence of specific nucleotides. Across the whole sequence there are 35 positions of nucleotide difference between L. rhynchaena and the other species (L. australis A1, L. maxima, L. arcuata G1 and L. arcuata G3), of which the Vietnamese otter clam L. rhynchaena and L. australis A1 differ at nucleotides 66 and 67. Analysis of the COI gene sequences of Lutraria species, aligned over a length of 615 nucleotides, identified 143 differing positions within the genus Lutraria. Between the L. rhynchaena
species and L. australis, one nucleotide difference was identified at position 253, whereas 78 and 79 differing positions were identified relative to L. maxima and L. arcuata, respectively. Between L. rhynchaena and L. lutraria, 117 differing nucleotide positions were identified. The results show significant differences in both the 16S and COI gene regions. Therefore, the 16S and COI genes can be used as DNA barcodes to distinguish L. rhynchaena from other species in the genus Lutraria.

3.3 Genome and transcriptome sequencing of the otter clam L. rhynchaena

3.3.1 Library preparation and sequencing of the otter clam genome

The two sequencing libraries were prepared according to the manufacturer's instructions and met the requirements for genome sequencing. Sequencing of the otter clam genome produced 122 Gbp (55.16 Gbp + 66.77 Gbp) of short reads with a read length of 150 bp and 14 Gbp of long reads, with an average length of 4,960 bp, a maximum of 58,804 bp and a minimum of 200 bp. In addition, 34 Gbp of transcript data were generated for gene prediction.

3.3.2 Library preparation and sequencing of the otter clam transcriptome

The library was quality tested and met the standards required by Illumina for sequencing. The otter clam transcriptome was sequenced on the Illumina NovaSeq6000 system with a read length of 150 bp. A total of 34 Gbp of transcript data were generated for gene prediction.

3.3.3 Estimation of genome size

K-mer analysis of the dataset estimated the otter clam genome size at between 545 and 547 Mbp, based on short-read and long-read data with sequencing coverage of 200X and 25X, respectively.

3.3.4 Assembling the otter clam genome

The assembly of the otter clam genome consists of 1,502 contigs, the length of the N50
is 1.84 Mbp and the assembly size is 586.5 Mbp. Assembly completeness reached 95.9%. The final result is a genome assembly of 544 Mbp in 622 contigs with an N50 length of 2.14 Mbp, to which 95.6% of the short reads (Illumina) aligned. BUSCO completeness reached 95.8%, with 1.5% of BUSCOs detected as duplicates. The otter clam genome sequence has been submitted to and published on GenBank.

3.3.5 Transcriptome assembly, gene prediction and annotation of the otter clam genome

Sequencing generated strand-specific short RNA reads totalling 34.6 Gbp. The assembled transcriptome comprises 295,234 contigs with a completeness of 83.9%. A total of 96.4% of RNA reads mapped to the assembled transcripts, and 79% aligned to the genome with splice-aware alignment. Genome annotation identified 26,380 protein-coding genes (AED ≤ 0.5), 89.8% of which were functionally annotated.

3.3.6 Heterozygosity estimation, repeat sequences and polymorphism search

Screening of the contigs detected a total of 4,903,576 single nucleotide polymorphisms (SNPs), corresponding to an estimated heterozygosity of 0.90%, while GenomeScope, based on k-mer values, estimated the heterozygosity rate at 1.60%. The heterozygosity rate is thus within the range reported for previously studied bivalves (0.51 to 2.02%).

3.3.7 Assembly of the transcriptome

The set of assembled transcripts comprises 295,234 contigs from 34.6 Gbp of reads, with a BUSCO completeness of 83.9%. A total of 96.4% of RNA reads mapped to the assembled transcripts, and 79% aligned to the genome with splice-aware alignment. Both mapping rates indicate that the quality of the transcriptome is sufficient to perform gene prediction.
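
The assembly figures reported in sections 3.3.3 to 3.3.6 can be sanity-checked with simple arithmetic. The Python sketch below is illustrative only and is not the pipeline used in the thesis. The helper names `n50`, `percent_rate` and `fold_coverage` are hypothetical, and the contig lengths are made-up toy values. Two assumptions are made that the source does not state: that the 0.90% heterozygosity corresponds to the SNP count divided by the 544 Mbp assembly size, and that coverage can be approximated from the midpoint of the 545-547 Mbp genome size estimate.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_rate(count, total_bp):
    """Express a count of variant sites as a percentage of total bases."""
    return 100.0 * count / total_bp


def fold_coverage(sequenced_bp, genome_bp):
    """Approximate sequencing depth as sequenced bases over genome size."""
    return sequenced_bp / genome_bp


# Toy contig lengths (hypothetical values, not data from the thesis).
toy_contigs = [2, 3, 4, 5, 6, 10]
print("toy N50:", n50(toy_contigs))

# Figures taken from the text above.
snp_count = 4_903_576        # SNPs detected (section 3.3.6)
assembly_bp = 544_000_000    # final assembly size (section 3.3.4)
genome_bp = 546_000_000      # assumed midpoint of the 545-547 Mbp estimate
short_reads_bp = 122e9       # short-read data (section 3.3.1)
long_reads_bp = 14e9         # long-read data (section 3.3.1)

print("SNP rate (%):", round(percent_rate(snp_count, assembly_bp), 2))
print("short-read coverage (X):", round(fold_coverage(short_reads_bp, genome_bp)))
print("long-read coverage (X):", round(fold_coverage(long_reads_bp, genome_bp), 1))
```

Under these assumptions the SNP count divided by the assembly size comes out at about 0.90%, and the coverage values land close to the reported figures (somewhat above 200X for short reads and about 25.6X for long reads).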