Detecting bad SNPs from Illumina BeadChips using Jeffreys distance

Nguyễn Hồng Sơn
University of Engineering and Technology
Master's thesis, major: Computer Science; Code: 60 48 01
Supervisor: Dr. Lê Sỹ Vinh
Year of defense: 2011

Keywords: Information technology; Computer science

Content

Table of Contents

Overview
1 Introduction
    1.1 Biological Background
    1.2 SNP Genotyping
    1.3 Quality Control and Quality Assurance
2 Related Work
    2.1 Naive method for SNP genotyping
    2.2 GenCall
    2.3 Illuminus
    2.4 GenoSNP
    2.5 Discussion
3 Method
    3.1 Kullback-Leibler divergence
    3.2 Approximate relative entropy between two Student distributions
        3.2.1 Approximate Student distribution
        3.2.2 The matched bound approximation
    3.3 Estimate conflict degree between three callers
4 Experimental Results
    4.1 Data description
    4.2 Parameter estimation
    4.3 Evaluation
Conclusion

References

Anderson Carl A, Pettersson Fredrik H, C G M C L R M A P Z K T

Collins, F. S., & McKusick, V. A. (2001). Implications of the human genome project for medical science. JAMA: The Journal of the American Medical Association, 285, 540-544.

El Attar, A., Pigeau, A., & Gelgon, M. (2009). Fast aggregation of Student mixture models. European Signal Processing Conference (Eusipco'2009) (pp. 312-216). Glasgow, United Kingdom. Miles (Pays-de-la-Loire).

Giannoulatou, E., Yau, C., Colella, S., Ragoussis, J., & Holmes, C. C. (2008). GenoSNP: a variational Bayes within-sample SNP genotyping algorithm that does not require a reference population. Bioinformatics, 24, 2209-2214.

Goldberger, J., Gordon, S., & Greenspan, H. (2003). An efficient image similarity measure based on approximations of KL-divergence between two Gaussian mixtures. In Proc. ICCV (pp. 487-493).

Group, G. C. R. (2007). New models of collaboration in genome-wide association studies: the genetic association information network. Nat Genet, 39, 1045-1051.

Hershey, J., & Olsen, P. (2007). Approximating the Kullback Leibler divergence between Gaussian mixture models. Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on (pp. IV-317 - IV-320).

Illumina Inc. (2005). Spotlight, Illumina GenCall Data Analysis Software.

Laurie, C. C., Doheny, K. F., Mirel, D. B., Pugh, E. W., Bierut, L. J., Bhangale, T., Boehm, F., Caporaso, N. E., Cornelis, M. C., Edenberg, H. J., Gabriel, S. B., Harris, E. L., Hu, F. B., Jacobs, K. B., Kraft, P., Landi, M. T., Lumley, T., Manolio, T. A., McHugh, C., Painter, I., Paschall, J., Rice, J. P., Rice, K. M., Zheng, X., Weir, B. S., & for the GENEVA Investigators (2010). Quality control and quality assurance in genotypic data for genome-wide association studies. Genetic Epidemiology, 34, 591-602.

Peiffer, D. A., Le, J. M., Steemers, F. J., Chang, W., Jenniges, T., Garcia, F., Haden, K., Li, J., Shaw, C. A., Belmont, J., Cheung, S. W., Shen, R. M., Barker, D. L., & Gunderson, K. L. (2006). High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Research, 16, 1136-1148.

Ritchie, M. E., Liu, R., Carvalho, B. S. A., & New Zealand Multiple Sclerosis Genetics Consortium (ANZgene), Irizarry, R. A. (2011). Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips. BMC Bioinformatics, 12, 68.

Steemers, F. J., Chang, W., Lee, G., Barker, D. L., Shen, R., & Gunderson, K. L. (2006). Whole-genome genotyping with the single-base extension assay. Nature Methods, 3, 31-33.

Teo, Y. Y., Inouye, M., Small, K. S., Gwilliam, R., Deloukas, P., Kwiatkowski, D. P., & Clark, T. G. (2007). A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics, 23, 2741-2746.

Wigginton, J. E., Cutler, D. J., & Abecasis, G. R. (2005). A note on exact tests of Hardy-Weinberg equilibrium. The American Journal of Human Genetics, 76, 887-893.