An ant colony optimization approach for phylogenetic tree reconstruction problem

Document information

Pages: 66
File size: 23.44 MB

Content

Vietnam National University, Hanoi
College of Technology

Huy Quang Dinh

An Ant Colony Optimization Approach for Phylogenetic Tree Reconstruction Problem

Major: Information Technology
Code: 1.01.10

MASTER THESIS

Advisor: Prof. Arndt von Haeseler
Co-Advisor: Dr. Hoang Xuan Huan

Hanoi - December, 2006

Contents

Abstract
Declaration
Acknowledgements
1 Introduction
  1.1 Motivation
    1.1.1 Computational Biology
    1.1.2 Phylogeny Reconstruction
  1.2 Thesis Works and Structure
2 Phylogenetic Tree Reconstruction
  2.1 Phylogenetic Trees
  2.2 Sequence Alignment
    2.2.1 Biological Data
    2.2.2 Pairwise and Multiple Sequence Alignment
  2.3 Approaches for Phylogeny Reconstruction
  2.4 Maximum Parsimony Principle
    2.4.1 Parsimony Concept
    2.4.2 Counting Evolutionary Changes
    2.4.3 Remarks on Maximum Parsimony Approaches
  2.5 Finding the Best Tree by Heuristic Searches
    2.5.1 Sequential Addition Methods
    2.5.2 Tree Arrangement Methods
    2.5.3 Other Heuristic Search Methods
3 Ant Colony Optimization
  3.1 The Ant Algorithms
    3.1.1 Double Bridge Experiments
    3.1.2 Ant System
    3.1.3 Ant Colony System
    3.1.4 Max-Min Ant System
  3.2 Ant Colony Optimization Meta-heuristic
    3.2.1 Problem Representation
    3.2.2 Artificial Ants
    3.2.3 Meta-heuristic Scheme
  3.3 Remarks on ACO Application
  3.4 ACO Approaches in Phylogenetics
4 Phylogenetic Inference with Ant Colony Optimization
  4.1 Related Works
  4.2 Tree Graph Description
    4.2.1 BD Tree Code
    4.2.2 State Graph Description
  4.3 Our ACO-applied Approach
    4.3.1 Pheromone Trail and Heuristic Information
    4.3.2 Solution Construction Procedure
    4.3.3 Pheromone Update Chosen Procedure
  4.4 Simulation Results
    4.4.1 Simulated Data
    4.4.2 Real Data
  4.5 Discussion
5 Conclusion and Outlook
Bibliography
Appendix
  Probabilistic Decision Rule
  Tree Encoding from a BD Tree Code
  BD Tree Code Decoding Algorithm
  ACO Solution Construction Procedure
  Pheromone Trails Update Procedure
  Algorithm for Calculating Evolutionary Changes

List of Figures

1.1 The exponential growth of nucleotide databases
2.1 A rooted tree of life
2.2 Unrooted tree representation of annelid relationships
2.3 Three possible topologies of unrooted tree for four taxa
2.4 An example of four types of nucleotide mutations (Nei and Kumar, 2000)
2.5 Multiple Sequence Alignment Example
2.6 An example for Fitch algorithm
2.7 An example of sequential addition method
2.8 An example of Nearest-Neighbor Interchange Operation
2.9 An example of Subtree Pruning and Regrafting Operation
2.10 An example of Tree Bisection and Reconnection Operation
3.1 Experimental setup for the double bridge experiment
3.2 Results gained in the double bridge experiment
4.1 Example of encoding a tree from a given BD tree code
4.2 Graph structure description with a, N plane
4.3 An example of tree building on the a, N plane
4.4 A found tree with 17 real species

List of Tables

2.1 Twenty different types of amino acids with corresponding code
4.1 The number of instances for which the reconstructed tree and the generated true tree are identical in the simulated data instances
4.2 Simulation results with real data of our proposed approach

Chapter 1
Introduction

1.1 Motivation

1.1.1 Computational Biology

Nowadays, thanks to modern computer technologies and the development of efficient sequencing technologies, a huge amount of genetic data is collected in many genome projects, including GenBank (USA), EMBL-Bank (Europe), and the DNA Database of Japan (DDBJ) (see Figure 1.1). The size of the GenBank
database is extremely large: over 65 billion DNA base pairs in 61 million molecular sequences [1]. This drastic growth of biological data requires computational tools for biological data (so-called bioinformatics tools) that are capable of handling large-scale analysis.

The terms bioinformatics and computational biology are often used interchangeably. It is further emphasized that there is a tight coupling of developments and knowledge between the more hypothesis-driven research in computational biology and the technique-driven research in bioinformatics. Many approaches from computer science have been applied to solve more and more complex problems in computational biology (Baldi and Brunak, 2000); unfortunately, almost all such problems are NP-hard or NP-complete. Therefore, heuristic search methods play an important role in tackling these combinatorial optimization problems.

[1] http://www.ncbi.nlm.nih.gov/Genbank/
[2] http://www.bisti.nih.gov/CompuBioDef.pdf

Figure 1.1: The exponential growth of nucleotide databases (growth of the International Nucleotide Sequence Database Collaboration, in base pairs). Source: http://www.ncbi.nlm.nih.gov/Genbank/

Recently, Ant Colony Optimization (Dorigo, 1992) has been proposed and was shortly afterwards recognized as an efficient method for finding approximate solutions to NP-hard problems. Its first application was the traveling salesman problem, inspired by the behavior of real ants when traveling from the colony to a food source and transporting the food back. The ACO technique is widely used for various types of combinatorial optimization problems, including problems in bioinformatics (Dorigo and Stutzle, 2004).

1.1.2 Phylogeny Reconstruction

Since the time of Charles Darwin, a main focus of evolutionary biology has been to understand the evolutionary history of all organisms, where the relationship of the structure of the
organisms is often expressed as a phylogenetic tree (Haeckel, 1866). Since the middle of the twentieth century, the emergence of molecular biology has given rise to a new branch of study based on molecular sequences (e.g. DNA or protein). Moreover, phylogenetic analysis helps not only to elucidate the evolutionary pattern but also to understand the process of adaptive evolution at the molecular level (Nei and Kumar, 2000).

In molecular phylogenetics, the sequences of the contemporary species are given and one asks for the tree topology (including the branch lengths) which explains the data. It is commonly accepted that phylogenies are rooted bifurcating trees, where the root is the most recent common ancestor of the contemporary species. The leaves represent contemporary species, and the internal nodes stand for speciation events. Among the many approaches to reconstructing phylogenetic trees, the statistics-based methods have been recognized as sound and accurate. Determining the best phylogenies under optimality criteria such as maximum parsimony, minimum evolution and maximum likelihood has been proved to be NP-hard or NP-complete (Graham and Foulds, 1982; Day and Sankoff, 1986; Chor and Tuller, 2005).

1.2 Thesis Works and Structure

In this thesis, we build a general framework for applying the ACO principle to phylogenetics, dealing mainly with maximum parsimony. However, the approach can be easily adapted to any objective function. Our contribution is the formal description of a framework that applies the ACO metaheuristic to the phylogeny reconstruction problem. Previous attempts to solve the phylogenetic reconstruction problem using ACO obtained only poor results, partly because of a poor construction graph (Ando and Iba, 2002; Kumnorkaew et al., 2004; Perretto and Lopes, 2005). We propose a more general graph representation to overcome this problem.

Apart from the introduction and conclusion, the thesis is organized into three chapters. The first
chapter sketches the major problem of reconstructing phylogenetic trees from given biological sequences. The second chapter presents the general building blocks of the ACO technique and its application to combinatorial optimization problems. The third chapter describes the main outcome of the thesis: our approach and some initial experience in employing ACO in phylogenetics.

Chapter 2
Phylogenetic Tree Reconstruction

The goal of the phylogenetic tree reconstruction problem is to assemble a tree representing a hypothesis about the evolutionary relationship among a set of genes, species, or other taxa. In this chapter, we briefly introduce the main concepts of phylogenetics and the state-of-the-art methods. In particular, we concentrate on the maximum parsimony principle, which is used as the objective function for our optimization approach discussed in chapter 4.

2.1 Phylogenetic Trees

According to Charles Darwin's theory of evolution, all species have evolved from ancestors under the pressure of natural selection (Darwin, 1872). Evolutionary trees, or phylogenetic trees in phylogenetics terminology, are one way to display the evolutionary relationships among species. A phylogenetic tree, also called an evolutionary tree or a phylogeny, is a graph-theoretic tree representing the evolutionary relationships among a number of species having a common ancestor. Figure 2.1 depicts the phylogenetic tree of life consisting of the three domains of all existing species: Bacteria, Archaea, and Eukarya. In a phylogenetic tree, each internal node represents an unknown common ancestor that split into two or more species, its descendants. Each external node, or leaf, represents a living species, and each branch has a length corresponding to the time between two splitting events or to the amount of change that accumulated between two splits.

[...]

is very meaningful: we can see that some pairs that have a close relationship
in nature, such as (chicken, duck), (mouse, rat), (sheep, cow), and (rabbit, hare), have the same evolutionary father, while hamster is closer to mouse than to pig and marsupial in evolutionary relationship.

The experimental results with some real data sets showed that our method can be considered useful and effective in phylogenetics under the maximum parsimony criterion. It can easily be applied with other optimality criteria, such as maximum likelihood, for obtaining the "true" tree.

Adjusting the parameters in an ACO application is very difficult because we have to observe the increase and decrease of both the best result and the pheromone trails in each iteration for possible changes. Therefore, we ran each instance 25 times to obtain the average result. In ACO, the average result is a very important factor besides the best result obtained (Dorigo and Stutzle, 2004). In any case, we overcame the limitations of the previous ACO approaches to the phylogenetic tree reconstruction problem. We can deal with large data sets and any objective function with acceptable results. Due to limited time, the thesis could not show better experimental results. However, we believe that these results are good enough to show the efficiency of our ACO approach.

4.5 Discussion

This chapter presented the main contribution of the thesis, the general ACO framework for phylogenetic reconstruction. We proposed a construction graph based on the BD tree code (Bandelt and Dress, 1986). By constructing the tree through stepwise addition, we found a successful way to map the path of an ant on the graph to a solution, i.e. a phylogenetic tree. Based on that, we can search and move in tree space quickly and efficiently without large memory requirements. This offers many interesting suggestions for ACO applications in phylogenetics. Our simulations and the analysis of both the simulated data and the real data showed that our method works. However, more investigations are necessary to fully exploit the properties of ACO in
phylogenetics. Besides that, our approach has the disadvantage of poor heuristic information. Better heuristic information could yield better experimental results.

Choosing suitable and efficient heuristic information that can express the phylogenetic information for each so-called state node, or partial tree, is really difficult. When the heuristic information is not good, the search process of the ACO approach is biased towards random search. This was not the focus of the thesis; the main goal was to build the general state graph and to demonstrate the efficiency of that graph by computer simulations on some typical data, in both simulated cases and real instances.

Chapter 5
Conclusion and Outlook

In conclusion, our work provides a new ACO framework for solving the phylogenetic tree reconstruction problem, thanks to an efficient construction graph. The initial experimental results showed the efficiency of the proposed framework in terms of both accuracy and openness. We believe that it deserves to be considered by the phylogenetic community.

• Construction Graph. We have successfully built the construction graph described in chapter 4. Thanks to the BD tree code, we represented the phylogenetic tree as the path of ants according to the sequential addition strategy. Our graph, with only n points where n is the number of given species, is more general and efficient than the existing graphs reviewed in the last section of chapter 3. With the proposed graph, we can deal with any objective function in phylogenetics, such as maximum parsimony (as in our work), distance-based criteria, and maximum likelihood. It can therefore be considered by both the ACO and the phylogenetics communities.

• Simulation Performance. In this thesis, we built a computer simulation with many experimental data sets and reported both best and average results. Although it is a general approach, the experimental results are still comparable with one of the most well-known
parsimony methods, PHYLIP version 3.5c (Felsenstein, 1993), on small data sets. Therefore, our preliminary results motivate further improvements.

Outlook

• In the future, we will try further adjustment of the experimental parameters. As is well known, the experimental parameters (initial and boundary pheromone values, the coefficient α, the number of ants, the pheromone evaporation parameter, the number of iterations) play a very important role in ACO applications (Dorigo and Stutzle, 2004). They are totally different for each type of problem. Due to limited time, we stopped at the initial experimental results shown in chapter 4. We believe that, with a detailed mechanism for observing the parameters, we can adjust them more efficiently. Besides that, other ACO approaches such as ACS and the multi-level ant system, together with other efficient local search strategies such as SPR and TBR (reviewed in chapter 2), will be considered in further work. In addition, finding more suitable heuristic information will guide the search process so that it does not rely on random search alone but also exploits the reinforcement learning information to obtain better solutions.

• After that, we will try the maximum likelihood objective function because of its efficiency. This needs some related changes, such as where the pheromone trail is put, the type of heuristic information, and also the experimental results. The efficient ACO skeleton may yield promising results.

Bibliography

Ando, S. and Iba, H. (2002). Ant algorithm for construction of evolutionary tree. In Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002 Congress on, pages 1552-1557, IEEE Press.
Baldi, P. and Brunak, S. (2000). Bioinformatics: the machine learning approach. MIT Press, Cambridge, Massachusetts, London, England, Fourth edn.
Bandelt, H.-J. and Dress, A. (1986). Reconstructing the shape of a tree from observed dissimilarity data. Adv. Appl. Math., 7, 309-343.
Barker, D. (2004).
LVB: parsimony and simulated annealing in the search for phylogenetic trees. Bioinformatics, 20, 274-275.
Brauer, M., Holder, M. T., Dries, L. A., Zwickl, D. J., Lewis, P. O. and Hillis, D. M. (2002). Genetic algorithms and parallel processing in maximum-likelihood phylogeny inference. Mol. Biol. Evol., 19, 1717-1726.
Brown, E. W., Kotewicz, M. L. and Cebula, T. A. (2002). Detection of recombination among Salmonella enterica strains using the incongruence length difference test. Mol. Phylogenet. Evol., 24, 102-120.
Cavalli-Sforza, L. L. and Edwards, A. W. F. (1967). Phylogenetic analysis: Models and estimation procedures. Amer. J. Human Genet., 19, 233-257.
Chor, B. and Tuller, T. (2005). Maximum likelihood of evolutionary trees is hard. In Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2005), vol. 3500 of Lecture Notes in Computer Science, pages 296-310, New York, USA, ACM Press.
Congdon (2001). A genetic approach to cladistics. In Principles of Data Mining and Knowledge Discovery, ed. L. De Raedt and A. Siebes, Lecture Notes in Computer Science No. 2168, pages 67-78, Berlin, Germany, Springer-Verlag.
Cook, W. J.,
Cunningham, W. H., Pulleyblank, W. R. and Schrijver, A. (1997). Combinatorial Optimization. John Wiley and Sons Press, First edn.
Cormen, T. H., Leiserson, C. E., Rivest, R. L. and Stein, C. (2001). Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, Second edn.
Darwin, C. (1872). On the Origin of Species. John Murray, London, 6th edn.
Day, W. H. E. and Sankoff, D. (1986). Computational complexity of inferring phylogenies by compatibility. Syst. Zool., 35, 224-229.
Deneubourg, J.-L., Aron, S., Goss, S. and Pasteels, J.-M. (1990). The self-organizing exploratory pattern of the argentine ant. Journal of Insect Behavior, 3, 159-168.
Dinh, H. Q., Do, D. D. and Hoang, H. X. (2006). Multi-level ant system - a new approach through the new pheromone update for ant colony optimization. In Proceedings of RIVF'06, the 4th IEEE International Conference in Computer Sciences, Research, Innovation and Vision for the Future, pages 55-58, IEEE Press.
Dorigo, M. (1992). Optimization, Learning and Natural Algorithms. Ph.D. thesis, Milan Polytechnique, Milano, Italy.
Dorigo, M. and Gambardella, L. M. (1997). Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation.
Dorigo, M., Maniezzo, V. and Colorni, A. (1996). Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics - Part B, 26.
Dorigo, M. and Stutzle, T. (2004). Ant Colony Optimization. The MIT Press, Cambridge, Massachusetts, First edn.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res., 32, 1792-1797.
Farris, J. (1970). Methods for computing Wagner trees. Syst. Zool., 19, 83-92.
Felsenstein, J. (1978). The number of evolutionary trees. Syst. Zool., 27, 27-33.
Felsenstein, J. (1993). PHYLIP (Phylogeny Inference Package) version 3.5c. Department of Genetics, University of Washington, Seattle. Distributed by the author.
Felsenstein, J. (2004). Inferring Phylogenies. Sinauer Associates,
Sunderland, Massachusetts.
Fitch, W. M. (1971). Toward defining the course of evolution: Minimum change for a specific tree topology. Syst. Zool., 20, 406-416.
Fitch, W. M. and Margoliash, E. (1967). Construction of phylogenetic trees. Science, 155, 279-284.
Foulds, L. R. and Graham, R. L. (1982). The Steiner problem in phylogeny is NP-complete. Adv. Appl. Math., 3, 43-49.
Goloboff, P. A. (1999). Analyzing large datasets in reasonable times: Solutions for composite optima. Cladistics, 15, 415-428.
Graham, R. L. and Foulds, L. R. (1982). Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time. Math. Biosci., 60, 133-142.
Haeckel, E. (1866). Generelle Morphologie der Organismen: Allgemeine Grundzüge der organischen Formen-Wissenschaft, mechanisch begründet durch die von Charles Darwin reformierte Descendenz-Theorie. Georg Reimer, Berlin.
von Haeseler, A. (1988). Rekonstruktion phylogenetischer Bäume mit Hilfe von Varianten der Vier-Punkt-Bedingung. Materialien LVI, Universität Bielefeld, Schwerpunkt Mathematisierung, Bielefeld, Germany.
Harding, E. F. (1971). The probabilities of rooted tree-shapes generated by random bifurcation. Adv. Appl. Prob., 44-77.
Hoang, H. X. and Dinh, H. T. (2002). On the ant colony system for postman problem. Vietnam National University Journal of Science, 1, 29-38.
Holland, J. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, First edn.
Katoh, K., Kuma, K.-i. and Miyata, T. (2001). Genetic algorithm-based maximum-likelihood analysis for molecular phylogeny. J. Mol. Evol., 53, 477-484.
Kirkpatrick, S., Gelatt, C. D. and Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220, 671-680.
Kluge, A. and Farris, J. S. (1969). Quantitative phyletics and the evolution of anurans. Syst. Biol., 18, 1-32.
Kumnorkaew, P., Ruenglertpanyakul, W. and Ku, H.-M. (2004). Application of ant colony optimization to evolutionary tree construction. In Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002
Congress on, vol. 6, pages 321-330.
Lemmon, A. R. and Milinkovitch, M. C. (2002). The metapopulation genetic algorithm: An efficient solution for the problem of large phylogeny estimation. Proc. Natl. Acad. Sci. USA, 99, 10516-10521.
Lewis, P. O. (1998). A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data. Mol. Biol. Evol., 15, 277-283.
Maddison, D. R. (1991). The discovery and importance of multiple islands of most parsimonious trees. Syst. Biol., 42, 200-210.
Matsuda, H. (1996). Protein phylogenetic inference using maximum likelihood with a genetic algorithm. In Proceedings of the 1st Pacific Symposium on Biocomputing (PSB 1996), pages 512-523, Hawaii.
Moilanen, A. (1999). Searching for most parsimonious trees with simulated evolutionary optimization. Cladistics, 15, 39-50.
Morgenstern, B. (1999). DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics, 15, 211-218.
Nei, M. and Kumar, S. (2000). Molecular Evolution and Phylogenetics. Oxford University Press, Oxford, UK.
Nixon, K. C. (1999). The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics, 15, 407-414.
Notredame, C., Higgins, D. and Heringa, J. (2000). T-COFFEE: A novel method for multiple sequence alignments. Journal of Molecular Biology, 302, 205-217.
Perretto, M. and Lopes, H. S. (2005). Reconstruction of phylogenetic trees using the ant colony optimization paradigm. Genetics and Molecular Research.
Quicke, D. L. J., Taylor, J. and Purvis, A. (2001). Changing the landscape: A new strategy for estimating large phylogenies. Syst. Biol., 50, 60-66.
Rambaut, A. and Grassly, N. (1998). Seq-Gen manual, version 1.1. Department of Zoology, University of Oxford, UK.
Roshan, U., Moret, B. M. E., Williams, T. L. and Warnow, T. (2004). Performance of supertree methods on various dataset decompositions. In Bininda-Emonds, O. R. P. (ed.),
Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, pages 301-328. Kluwer Academic, Dordrecht, The Netherlands.
Sankoff, D. (1975). Minimal mutation trees of sequences. SIAM Journal of Applied Mathematics, 35-42.
Semple, C. and Steel, M. (2003). Phylogenetics, vol. 24 of Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press, Oxford, UK.
Socha, K., Knowles, J. and Sampels, M. (2002). A max-min ant system for the university course timetabling problem. In Proceedings of ANTS 2002 - Third International Workshop on Ant Algorithms, pages 1-13, Berlin, Germany, Springer-Verlag.
Spencer, M., Susko, E. and Roger, A. J. (2005). Likelihood, parsimony, and heterogeneous evolution. Mol. Biol. Evol., 22, 1161-1164.
Stamatakis, A. P. (2005). An efficient program for phylogenetic inference using simulated annealing. In Online Proceedings of the 4th IEEE International Workshop on High Performance Computational Biology (HICOMB 2005), page 8, Denver.
Stutzle, T. and Hoos, H. (1997). The max-min ant system and local search for the travelling salesman problem. In Proceedings of ICEC'97 - 1997 IEEE 4th International Conference on Evolutionary Computation, pages 308-313, IEEE Press.
Swofford, D. L. (2002). PAUP*: Phylogenetic analysis using parsimony (and other methods). Sinauer Associates, Sunderland, MA.
Swofford, D. L., Olsen, G. J., Waddell, P. J. and Hillis, D. M. (1996). Phylogeny reconstruction. In Hillis, D. M., Moritz, C. and Mable, B. K. (eds.), Molecular Systematics, pages 407-514. Sinauer Associates, Sunderland, Massachusetts, Second edn.
Tateno, Y., Takezaki, N. and Nei, M. (1994). Relative efficiencies of the maximum-likelihood, neighbor-joining, and maximum-parsimony methods when substitution rate varies with site. Mol. Biol. Evol., 11, 261-277.
Thompson, J. D., Higgins, D. G. and Gibson, T. J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties
and weight matrix choice. Nucleic Acids Res., 22, 4673-4680.
Vandamme, A.-M. (2003). Basic concepts of molecular evolution. In Salemi, M. and Vandamme, A.-M. (eds.), The Phylogenetic Handbook, pages 1-23, Cambridge University Press, Cambridge, UK.
Waterman, M. and Smith, T. (1978). On the similarity of dendrograms. Journal of Theoretical Biology, 73, 789-800.
Waterman, M. S. (1995). Introduction to Computational Biology. Chapman and Hall, London.
Waterman, M. S. (2000). Introduction to Computational Biology. Chapman and Hall, London, UK, first CRC Press edn.
Yang, Z., Nielsen, R., Goldman, N. and Pedersen, A.-M. K. (2000). Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics, 155, 431-449.

Appendix

Probabilistic Decision Rule

The pseudo-code of the algorithm that selects a suitable state among n given states with the given probabilities, based on the probabilistic decision rule:

Algorithm 1: Probabilistic decision rule algorithm
input: n states with n corresponding probabilities P_1, P_2, ..., P_n
output: selected state k
sum ← 0;
for i ← 1 to n do
    sum ← sum + P_i;
[...]

[...]
Convert the input path to a corresponding Bandelt-Dress code;
Build the tree structure from the above code;
length [...]
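A probabilistic decision rule of the kind named in the appendix is commonly implemented as roulette-wheel selection: sum the probabilities, draw a uniform random threshold in that range, and return the first state whose running sum exceeds the threshold. The Python sketch below illustrates that idea only; it is not the thesis's own listing, and the function name `select_state` and the optional `rng` argument are assumptions introduced for illustration.

```python
import random


def select_state(probabilities, rng=random):
    """Return an index k, chosen with probability proportional to probabilities[k].

    Hypothetical helper: a roulette-wheel sketch of a probabilistic decision
    rule. The weights do not have to be normalized to sum to 1.
    """
    if not probabilities:
        raise ValueError("at least one state is required")
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")

    # First pass: accumulate the total weight, as in "sum <- sum + P_i".
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("at least one probability must be positive")

    # Draw a threshold uniformly from [0, total).
    threshold = rng.random() * total

    # Second pass: return the first state whose running sum exceeds the threshold.
    running = 0.0
    for k, p in enumerate(probabilities):
        running += p
        if threshold < running:
            return k

    # Only reachable through floating-point round-off: fall back to the
    # last state that has a positive probability.
    return max(k for k, p in enumerate(probabilities) if p > 0)
```

In an ACO setting, an ant at a state node would pass the transition probabilities of its feasible next states to such a function; supplying a seeded `random.Random` instance as `rng` makes a run reproducible.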

Date posted: 23/09/2020, 23:08
