Whole Genome Sequencing

DOCUMENT INFORMATION

Basic information

Format
Pages	6
File size	465.01 KB

Content


A phylogenomic study of the genus Alphavirus employing whole genome comparison

Phylogenetic analyses of the Alphavirus group have long been carried out using a single gene fragment, a whole gene, or part of the proteomic data. In Comparative and Functional Genomics 6: 217-227 (see the attached paper), Luers and colleagues used cDNA and protein sequences from the GenBank database to analyze the evolution of this group of viruses. Most of the data used in the analysis were fully sequenced, full-length genes; only a very small number were incomplete gene fragments. The phylogeny of the alphaviruses was inferred from three separate analyses: (1) analysis of the structural (capsid) coding region, (2) analysis of the non-structural coding region, and (3) analysis of the whole genome, using the distance/neighbour-joining method. The results confirm that the Western Equine Encephalitis (WEE) clade arose through recombination, and they agree with many earlier reports that geographic radiation of species and the divergence of the different groups are driven by different mechanisms. The data of Luers and colleagues show that Salmon Pancreatic Disease Virus and Sleeping Disease Virus form a single clade with a common ancestor, and that this ancestor was an alphavirus species that successfully diverged from the rest of the alphavirus group in the past. The results also differ from some earlier reports in several respects; for example, Barmah Forest Virus and Middelburg Virus appear to be members of the Semliki Forest clade. Notably, Southern Elephant Seal Virus is also a member of the Semliki Forest clade, even though its genetic distance from the other species in the Alphavirus group is rather large.
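The distance/neighbour-joining approach mentioned above can be illustrated with a minimal sketch. It assumes uncorrected p-distances on toy, pre-aligned sequences; the taxon names `A` to `D`, the sequences, and the function names are hypothetical, and the distance model actually used by Luers and colleagues is not stated in this summary. The sketch computes the pairwise distance matrix and then the neighbour-joining Q criterion to pick the first pair of taxa to join.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Symmetric dictionary of pairwise p-distances keyed by (name, name)."""
    d = {}
    for a, b in combinations(seqs, 2):
        d[(a, b)] = d[(b, a)] = p_distance(seqs[a], seqs[b])
    return d


def first_nj_pair(seqs):
    """Return the pair of taxa that neighbour-joining would join first.

    Uses the standard criterion Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of distances from taxon i to all other taxa.
    """
    names = list(seqs)
    n = len(names)
    d = distance_matrix(seqs)
    r = {i: sum(d[(i, j)] for j in names if j != i) for i in names}
    q = {(i, j): (n - 2) * d[(i, j)] - r[i] - r[j]
         for i, j in combinations(names, 2)}
    return min(q, key=q.get)


# Hypothetical toy alignment (not real alphavirus data).
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TCGAACGGAC",
    "D": "TCGAACGGTT",
}
print(first_nj_pair(toy))
```

With only four taxa the two closest pairs always tie on the Q criterion, so either `("A", "B")` or `("C", "D")` may be reported; a full implementation would repeat the step on a reduced matrix until the tree is resolved.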
A final point of interest in the paper is the authors' observation that the complete Rubella genome can serve as an ideal outgroup (control group) for phylogenetic studies of the Alphavirus group.

Whole-Genome Sequencing
By: OpenStaxCollege

Although there have been significant advances in the medical sciences in recent years, doctors are still confounded by some diseases, and they are using whole-genome sequencing to get to the bottom of the problem. Whole-genome sequencing is a process that determines the DNA sequence of an entire genome. Whole-genome sequencing is a brute-force approach to problem solving when there is a genetic basis at the core of a disease. Several laboratories now provide services to sequence, analyze, and interpret entire genomes.

For example, whole-exome sequencing is a lower-cost alternative to whole-genome sequencing. In exome sequencing, only the coding, exon-producing regions of the DNA are sequenced. In 2010, whole-exome sequencing was used to save a young boy whose intestines had multiple mysterious abscesses. The child had several colon operations with no relief. Finally, whole-exome sequencing was performed, which revealed a defect in a pathway that controls apoptosis (programmed cell death). A bone-marrow transplant was used to overcome this genetic disorder, leading to a cure for the boy. He was the first person to be successfully treated based on a diagnosis made by whole-exome sequencing. Today, human genome sequencing is more readily available and can be completed in a day or two for about $1000.

Strategies Used in Sequencing Projects

The basic sequencing technique used in all modern-day sequencing projects is the chain termination method (also known as the dideoxy method), which was developed by Fred Sanger in the 1970s. The chain termination method involves DNA replication of a single-stranded template with the use of a primer and a regular deoxynucleotide (dNTP), which is a monomer,
or a single unit, of DNA. The primer and dNTP are mixed with a small proportion of fluorescently labeled dideoxynucleotides (ddNTPs). The ddNTPs are monomers that are missing a hydroxyl group (–OH) at the site at which another nucleotide usually attaches to form a chain. Each ddNTP is labeled with a different color of fluorophore. Every time a ddNTP is incorporated in the growing complementary strand, it terminates the process of DNA replication, which results in multiple short strands of replicated DNA that are each terminated at a different point during replication. When the reaction mixture is processed by gel electrophoresis after being separated into single strands, the multiple newly replicated DNA strands form a ladder because of the differing sizes. Because the ddNTPs are fluorescently labeled, each band on the gel reflects the size of the DNA strand and the ddNTP that terminated the reaction. The different colors of the fluorophore-labeled ddNTPs help identify the ddNTP incorporated at that position. Reading the gel on the basis of the color of each band on the ladder produces the sequence of the template strand.

Figure: A dideoxynucleotide is similar in structure to a deoxynucleotide, but is missing the 3' hydroxyl group (indicated by the box). When a dideoxynucleotide is incorporated into a DNA strand, DNA synthesis stops.

Figure: Frederick Sanger's dideoxy chain termination method is illustrated. Using dideoxynucleotides, the DNA fragment can be terminated at different points. The DNA is separated on the basis of size, and these bands, based on the size of the fragments, can be read.

Early Strategies: Shotgun Sequencing and Pair-Wise End Sequencing

In the shotgun sequencing method, several copies of a DNA fragment are cut randomly into many smaller pieces (somewhat like what happens to a round shot cartridge when fired from a shotgun). All of the segments are then sequenced using the chain termination method. Then, with the help of a computer,
the fragments are analyzed to see where their sequences overlap. By matching up overlapping sequences at the end of each fragment, the entire DNA sequence can be reformed. A larger sequence that is assembled from overlapping shorter sequences is called a contig.

As an analogy, consider that someone has four copies of a landscape photograph that you have never seen before and know nothing about how it should appear. The person then rips up each photograph with their hands, so that different size pieces are present from each copy. The person then mixes all of the pieces together and asks you to reconstruct the photograph. In one of the smaller pieces you see a mountain. In a larger piece, you see that the same mountain is behind a lake. A third fragment shows only the lake, but it reveals that there is a cabin on the shore of the lake. Therefore, from looking at the overlapping information in these three fragments, you know that the picture contains a mountain behind a lake that has a cabin on its shore. This is the principle behind reconstructing entire DNA sequences using shotgun sequencing.

Originally, shotgun sequencing only analyzed one end of each fragment for overlaps. This was sufficient ...

Cirulli et al. Genome Biology 2010, 11:R57
http://genomebiology.com/2010/11/5/R57
Open Access
RESEARCH
© 2010 Cirulli et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
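Returning briefly to the shotgun sequencing principle described earlier (matching overlapping fragment ends to rebuild a contig), the idea can be sketched as a greedy merge. This is an illustrative toy under simplifying assumptions, not how production assemblers work: overlaps must match exactly, reads contain no errors, and the sequence contains no repeats. The function names and the example fragments are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Returns a list of contigs; a single element means everything merged.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining pieces are separate contigs
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


# Hypothetical fragments cut from the toy sequence ATGCGTACGTTAGC.
reads = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
print(greedy_assemble(reads))
```

Like the mountain, lake, and cabin in the photograph analogy, each merge relies on a shared region between two pieces; when a genome contains repeated sequence, exact greedy merging can join the wrong pieces, which is one reason additional information about fragment ends is useful.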
Screening the human exome: a comparison of whole genome and whole transcriptome sequencing

Elizabeth T Cirulli†1, Abanish Singh†1, Kevin V Shianna1, Dongliang Ge1, Jason P Smith1, Jessica M Maia1, Erin L Heinzen1, James J Goedert2, David B Goldstein*1 for the Center for HIV/AIDS Vaccine Immunology (CHAVI)

Abstract

Background: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. There are three approaches possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost effective alternatives are important.

Results: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high coverage whole-genome sequencing to those identified by high coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage.

Conclusions: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
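The kind of comparison summarized in the abstract can be sketched with simple set arithmetic. This is a minimal illustration, assuming the whole-genome calls are treated as the reference set and that each variant is keyed by a (chromosome, position, ref, alt) tuple; the function name, the keys of the returned dictionary, and the toy variants are hypothetical, and the paper's exact definitions of sensitivity and false positives may differ.

```python
def compare_calls(wgs_variants, rnaseq_variants):
    """Compare RNA-Seq variant calls against whole-genome calls.

    sensitivity: fraction of whole-genome variants also found by RNA-Seq.
    rnaseq_only_fraction: fraction of RNA-Seq calls absent from the
    whole-genome set (candidate false positives under this assumption).
    """
    wgs = set(wgs_variants)
    rna = set(rnaseq_variants)
    shared = wgs & rna
    return {
        "sensitivity": len(shared) / len(wgs) if wgs else float("nan"),
        "rnaseq_only_fraction": len(rna - wgs) / len(rna) if rna else float("nan"),
    }


# Hypothetical toy call sets, not data from the study.
wgs_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2500, "C", "T"),
    ("chr2", 300, "G", "A"),
    ("chr2", 900, "T", "C"),
    ("chr3", 42, "G", "C"),
]
rnaseq_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 300, "G", "A"),
    ("chr7", 77, "A", "T"),  # seen only in RNA-Seq
]
print(compare_calls(wgs_calls, rnaseq_calls))
```

Restricting both sets to variants in well-expressed genes before calling the function would correspond to the second comparison the abstract describes.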
Background

The study of common human diseases is rapidly moving away from an exclusive focus on common variants using genome-wide association studies and toward sequencing approaches that represent most variants, including those that are rare in the general population. Although rapidly falling, the per base costs of next generation sequencing platforms still preclude the generation of large sample sizes of entirely sequenced genomes at high coverage. In addition to this economic constraint, it is widely appreciated that the very large number of variants identified in such studies will make it difficult to use association evidence alone to identify causal sites. For these reasons, there has been considerable interest in focusing attention on coding variants as a first step at complete representation of human variation. Part of the motivation for this approach stems from the experience with Mendelian diseases, in which 59% of the causal variants are either missense or nonsense mutations [1]. Although there has been considerable speculation on the topic, there are in fact no solid data showing that the picture is any different for common diseases, which may also be influenced by variants that are in or near protein coding sequence [1]. The most comprehensive approach for focusing on exons alone is clearly exome capture, where regions ...

* Correspondence: d.goldstein@duke.edu
1 Center for Human Genome Variation, Duke University School of Medicine, Box 91009, Durham, NC 27708, USA
† Contributed equally
Full list of author information is available at the end of the article
Genome Biology 2007, 8:201
Minireview

Analysis of genetic systems using experimental evolution and whole-genome sequencing

Matthew Hegreness*† and Roy Kishony*‡

Addresses: *Department of Systems Biology, Harvard Medical School, Longwood Avenue, Boston, MA 02115, USA, †Department of Organismic and Evolutionary Biology and ‡School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA.
Correspondence: Roy Kishony. Email: roy_kishony@hms.harvard.edu

Abstract

The application of whole-genome sequencing to the study of microbial evolution promises to reveal the complex functional networks of mutations that underlie adaptation. A recent study of parallel evolution in populations of Escherichia coli shows how adaptation involves both functional changes to specific proteins as well as global changes in regulation.

Published: 1 February 2007
Genome Biology 2007, 8:201 (doi:10.1186/gb-2007-8-1-201)
The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2007/8/1/201
© 2007 BioMed Central Ltd

The comparative study of extant genomes has revolutionized biology, shedding light not only on evolution but also on physiology, genetics and medicine. But the utility of comparisons among naturally evolved isolates is lessened by incomplete knowledge of the environment to which the organisms adapted. Precise knowledge of conditions is attainable only in comparative genomic studies of organisms that have diverged under the controlled conditions of the laboratory, where it is possible to run replicate experiments that distinguish which outcomes are inevitable and which the result of mere chance. Advanced sequencing and mutation-detection technologies now make it possible to reveal the complete genetic basis for an adaptive trait that separates an evolved clone from a reference strain [1-4].
The first whole-genome sequencing of cellular organisms adapted to controlled laboratory conditions has already revealed mutations that contribute to symbiosis [1] and cooperative behavior [5-7]. A new study by Herring et al. [8] takes whole-genome sequencing a significant step further by exploring parallel evolution and its dynamics in replicate populations of Escherichia coli. They also provide direct characterizations of the effects of the detected mutations using site-directed mutagenesis. Their results offer clues to how complex biological systems function and evolve, suggesting that adaptive regulation can occur not only at the loci of genes that are directly involved in the adaptive trait but also in distant areas of the network. Whole-genome sequencing of parallel evolved strains promises to reveal novel functional links among genes and genetic modules. Future studies may be able to use genome-sequencing technologies to answer a range of pressing questions in biology and evolution: how biological networks are constructed, constrained, and modified; how clonal interference shapes the outcomes of evolution; and what is the complete spectrum of genetic mutations available to selection.

The advantages of bacteria for experimental evolution

In 1893, HL Russell, a bacteriologist at the University of Wisconsin, enumerated some of the “evident advantages that bacteria possess for experimental research in evolutionary biology” [9]. These included how the “physical and chemical environment [in which bacteria grow] can be so rigidly controlled that the variability of conditions … is practically excluded”, as well as how, by virtue of short generation times, a “rapid successive transference of cultures to fresh media can secure the effect of an experiment covering an immense number of generations within a limited space of time” [9].
Russell’s ideas appear to have remained unrealized for nearly a century, but the field of experimental evolution finally emerged as a vibrant and independent discipline towards the end of the twentieth century [10]. With advances in the culture and ...

Genome Biology 2009, 10:R53
Open Access
2009 Bontell et al. Volume 10, Issue 5, Article R53
Research

Whole genome sequencing of a natural recombinant Toxoplasma gondii strain reveals chromosome sorting and local allelic variants

Irene Lindström Bontell*¥, Neil Hall†, Kevin E Ashelford†, JP Dubey‡, Jon P Boyle§, Johan Lindh¶ and Judith E Smith*

Addresses: *Institute of Integrative and Comparative Biology, Clarendon Way, University of Leeds, Leeds, LS2 9JT, UK. †School of Biological Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK. ‡United States Department of Agriculture, Agricultural Research Service, Animal and Natural Resources Institute, Animal Parasitic Diseases Laboratory, Baltimore Avenue, Beltsville, MD 20705, USA. §Department of Biological Sciences, University of Pittsburgh, Fifth Avenue, Pittsburgh, PA 15260, USA. ¶Department of Parasitology, Mycology and Environmental Microbiology, Swedish Institute for Infectious Disease Control (SMI), Nobels väg, 171 82 Solna, Sweden. ¥Current address: Division of Clinical Microbiology, Department of Medicine, Karolinska Institutet, Alfred Nobels Allé, 141 86 Stockholm, Sweden.
Correspondence: Judith E Smith. Email: j.e.smith@leeds.ac.uk

© 2009 Lindström Bontell et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Toxoplasma genome evolution
Extensive sequence analysis of eight Toxoplasma gondii isolates from Uganda has revealed chromosome sorting and local allelic variants.

Abstract

Background: Toxoplasma gondii is a zoonotic parasite of global importance. In common with many protozoan parasites it has the capacity for sexual recombination, but current evidence suggests this is rarely employed. The global population structure is dominated by a small number of clonal genotypes, which exhibit biallelic variation and limited intralineage divergence. Little is known of the genotypes present in Africa despite the importance of AIDS-associated toxoplasmosis.

Results: We here present extensive sequence analysis of eight isolates from Uganda, including the whole genome sequencing of a type II/III recombinant isolate, TgCkUg2. 454 sequencing gave 84% coverage across the approximately 61 Mb genome and over 70,000 single nucleotide polymorphisms (SNPs) were mapped against reference strains. TgCkUg2 was shown to contain entire chromosomes of either type II or type III origin, demonstrating chromosome sorting rather than intrachromosomal recombination. We mapped 1,252 novel polymorphisms and clusters of new SNPs within coding sequence implied selective pressure on a number of genes, including surface antigens and rhoptry proteins. Further sequencing of the remaining isolates, six type II and one type III strain, confirmed the presence of novel SNPs, suggesting these are local allelic variants within Ugandan type II strains. In mice, the type III isolate had parasite burdens at least 30-fold higher than type II isolates, while the recombinant strain had an intermediate burden.

Conclusions: Our data demonstrate that recombination between clonal lineages does occur in nature but there is nevertheless close homology between African and North American isolates.
The quantity of high confidence SNP data generated in this study and the availability of the putative parental strains to this natural recombinant provide an excellent basis for future studies of the genetic divergence and of genotype-phenotype relationships.

Published: 20 May 2009
Genome Biology 2009, 10:R53 (doi:10.1186/gb-2009-10-5-r53)
Received: 27 February 2009 Revised: 1 May 2009 Accepted: 20 May 2009
The electronic version of this article is the complete one and can be found online at http://genomebiology.com/20

Genome Biology 2009, 10:R82
Open Access
2009 Eck et al. Volume 10, Issue 8, Article R82
Research

Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery

Sebastian H Eck¤*, Anna Benet-Pagès¤*, Krzysztof Flisikowski†, Thomas Meitinger*‡, Ruedi Fries† and Tim M Strom*‡

Addresses: *Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstr., 85764 Neuherberg, Germany. †Lehrstuhl für Tierzucht, Technische Universität München, Hochfeldweg, 85354 Freising-Weihenstephan, Germany. ‡Institute of Human Genetics, Klinikum rechts der Isar, Technische Universität München, Trogerstr., 81675 München, Germany. ¤These authors contributed equally to this work.
Correspondence: Tim M Strom. Email: TimStrom@helmholtz-muenchen.de

© 2009 Eck et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
SNP detection in cattle
The next generation sequencing of a single cow genome with low-to-medium coverage has revealed 2.44 million new SNPs.

Abstract

Background: The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variance of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull.

Results: We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. A comparison with the genotypes of the same animal, generated on a 50 k oligonucleotide chip, revealed a detection rate of 74% and 30% for homozygous and heterozygous SNPs, respectively. The false positive rate, as determined by comparison with genotypes determined for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic with an average minor allele frequency of 24.5% and with 83% of the SNPs having a minor allele frequency larger than 5%.

Conclusions: This work provides the first single cattle genome by next-generation sequencing. The chosen approach - low to medium coverage re-sequencing - added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable resource for the construction of high density oligonucleotide arrays in the context of genome-wide association studies.
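The headline figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded numbers quoted above (2.44 million SNPs, 82% previously unknown), so the results are approximate; the variable names are illustrative.

```python
# Rounded figures quoted in the abstract.
TOTAL_SNPS = 2_440_000   # SNPs identified in the Fleckvieh bull
NOVEL_PERCENT = 82       # percentage that were previously unknown

# Integer arithmetic avoids floating-point rounding noise.
novel_snps = TOTAL_SNPS * NOVEL_PERCENT // 100
known_snps = TOTAL_SNPS - novel_snps

print(f"novel SNPs: about {novel_snps:,}")
print(f"already known SNPs: about {known_snps:,}")
```

The result, roughly two million novel SNPs, is consistent with the statement in the conclusions that the approach added more than 2 million novel SNPs to the public resource.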
Published: 6 August 2009
Genome Biology 2009, 10:R82 (doi:10.1186/gb-2009-10-8-r82)
Received: 21 April 2009 Revised: 22 June 2009 Accepted: 6 August 2009
The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2009/10/8/R82

Background

The bovine reference genome sequence assembly resulted from the combination of shotgun and bacterial artificial chromosome sequencing of an inbred Hereford cow and her sire using capillary sequencing. Most of the more than 2 million bovine SNPs deposited in dbSNP represent polymorphisms detected in these two Hereford animals [1]. Recently, Van Tassell et al. [2] contributed more than 23,000 SNPs to the bovine SNP collection by next-generation sequencing of reduced representation libraries. The study involved 66 cattle representing different lines of a dairy breed (Holstein) and the 7 most common beef breeds (Angus, Red Angus, Charolais, Gelbvieh, ...

... abnormalities can be discovered using microarrays, whereas whole-genome sequencing can provide information about all six billion base pairs in the human genome. Although the study...

... uses only deoxynucleotides
uses labeled dNTPs
A
Whole-genome sequencing can be used for advances in:
the medical field
agriculture
biofuels
all of the above
D
Sequencing... regarding health and privacy

Section Summary
Whole-genome sequencing is the latest available resource to treat genetic diseases. Some doctors are using whole-genome sequencing to save lives. Genomics has ...

Date posted: 31/10/2017, 01:10
