

Genetic Algorithms in Molecular Modeling
by James Devillers
ISBN: 0122138104
Publisher: Elsevier Science & Technology Books
Pub Date: June 1996

Contributors

J.M. Caruthers, Laboratory for Intelligent Process Systems, School of Chemical Engineering, Purdue University, West Lafayette, IN 47907, USA
K. Chan, Laboratory for Intelligent Process Systems, School of Chemical Engineering, Purdue University, West Lafayette, IN 47907, USA
J. Devillers, CTIS, 21 rue de la Banniere, 69003 Lyon, France
D. Domine, CTIS, 21 rue de la Banniere, 69003 Lyon, France
W.J. Dunn, College of Pharmacy, University of Illinois at Chicago, 833 S. Wood Street, Chicago, IL 60612, USA
R.C. Glen, Tripos Inc., St. Louis, MO 63144, USA
H. Hamersma, Department of Computational Medicinal Chemistry, N.V. Organon, P.O. Box 20, 5340 BH Oss, The Netherlands
A.J. Hopfinger, Laboratory of Molecular Modeling and Design, M/C 781, The University of Illinois at Chicago, College of Pharmacy, 833 S. Wood Street, Chicago, IL 60612-7231, USA
G. Jones, Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Western Bank, Sheffield S1 2TN, UK
R. Leardi, Istituto di Analisi e Tecnologie Farmaceutiche ed Alimentari, Università di Genova, via Brigata Salerno (ponte), I-16147 Genova, Italy
B.T. Luke, International Business Machines Corporation, 522 South Road, Poughkeepsie, NY 12601, USA
T.D. Muhammad, Department of Biological Chemistry, Finch University of Health Sciences/The Chicago Medical School, 3333 Green Bay Road, North Chicago, IL 60064, USA
H.C. Patel, Laboratory of Molecular Modeling and Design, M/C 781, The University of Illinois at Chicago, College of Pharmacy, 833 S. Wood Street, Chicago, IL 60612-7231, USA
C. Putavy, CTIS, 21 rue de la Banniere, 69003 Lyon, France
D. Rogers, Molecular Simulations Incorporated, 9685 Scranton Road, San Diego, CA 92121, USA
A. Sundaram, Laboratory for Intelligent Process Systems, School of Chemical Engineering, Purdue University, West
Lafayette, IN 47907, USA
V.J. van Geerestein, Department of Computational Medicinal Chemistry, N.V. Organon, P.O. Box 20, 5340 BH Oss, The Netherlands
S.P. van Helden, Department of Computational Medicinal Chemistry, N.V. Organon, P.O. Box 20, 5340 BH Oss, The Netherlands
V. Venkatasubramanian, Laboratory for Intelligent Process Systems, School of Chemical Engineering, Purdue University, West Lafayette, IN 47907, USA
D.E. Walters, Department of Biological Chemistry, Finch University of Health Sciences/The Chicago Medical School, 3333 Green Bay Road, North Chicago, IL 60064, USA
P. Willett, Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Western Bank, Sheffield S1 2TN, UK

Preface

Genetic algorithms are rooted in Darwin's theory of natural selection and evolution. They provide an alternative to traditional optimization methods by using powerful search techniques to locate optimal solutions in complex landscapes. The popularity of genetic algorithms is reflected in the ever-increasing mass of literature devoted to theoretical works and real-world applications on various subjects such as financial portfolio management, strategy planning, design of equipment, and so on. Genetic algorithms and related approaches are also beginning to infiltrate the field of QSAR and drug design.

Genetic Algorithms in Molecular Modeling is the first book on the use of genetic algorithms in QSAR and drug design. Comprehensive chapters report the latest advances in the field. The book provides an introduction to the theoretical basis of genetic algorithms and gives examples of applications in medicinal chemistry, agrochemistry, and toxicology. The book is suited for uninitiated readers willing to apply genetic algorithms for modeling the biological activities and properties of chemicals. It also provides trained scientists with the most up-to-date information on the topic. To ensure the scientific quality and clarity of
the book, all the contributions have been presented and discussed in the frame of the Second International Workshop on Neural Networks and Genetic Algorithms Applied to QSAR and Drug Design held in Lyon, France (June 12-14, 1995). In addition, they have been reviewed by two referees, one involved in molecular modeling and another in chemometrics.

Genetic Algorithms in Molecular Modeling is the first volume in the series Principles of QSAR and Drug Design. Although the examples presented in the book are drawn from molecular modeling, it is suitable for a more general audience. The extensive bibliography and information on software availability enhance the usefulness of the book for beginners and experienced scientists.

James Devillers

Table of Contents

Contributors
Preface
1 Genetic Algorithms in Computer-Aided Molecular Design
2 An Overview of Genetic Methods
3 Genetic Algorithms in Feature Selection
4 Some Theory and Examples of Genetic Function Approximation with Comparison to Evolutionary Techniques
5 Genetic Partial Least Squares in QSAR
6 Application of Genetic Algorithms to the General QSAR Problem and to Guiding Molecular Diversity Experiments
7 Prediction of the Progesterone Receptor Binding of Steroids Using a Combination of Genetic Algorithms and Neural Networks
8 Genetically Evolved Receptor Models (GERM): A Procedure for Construction of Atomic-Level Receptor Site Models in the Absence of a Receptor Crystal Structure
9 Genetic Algorithms for Chemical Structure Handling and Molecular Recognition
10 Genetic Selection of Aromatic Substituents for Designing Test Series
11 Computer-Aided Molecular Design Using Neural Networks and Genetic Algorithms
12 Designing Biodegradable Molecules from the Combined Use of a Backpropagation Neural Network and a Genetic Algorithm

Genetic Algorithms in Computer-Aided Molecular Design

J. DEVILLERS
CTIS, 21 rue de la Banniere, 69003 Lyon, France

Genetic algorithms, which are based on the principles of
Darwinian evolution, are widely used for combinatorial optimizations. We introduce the art and science of genetic algorithms and review different applications in computer-aided molecular design. Information on software availability is also given. We conclude by underlining some advantages and drawbacks of genetic algorithms.

KEYWORDS: computer-aided molecular design; genetic algorithms; QSAR; software.

In: Genetic Algorithms in Molecular Modeling (J. Devillers, Ed.), Academic Press, London, 1996, pp. 1-34. Copyright © 1996 Academic Press Limited. ISBN 0-12-213810-4. All rights of reproduction in any form reserved.

INTRODUCTION

The design of molecules with desired properties and activities is an important industrial challenge. The traditional approach to this problem often requires a trial-and-error procedure involving a combinatorially large number of potential candidate molecules. This is a laborious, time-consuming and expensive process. Although the creation of a new chemical is a difficult task, in many ways it is rule-based and many of the fundamental operations can be embedded in expert system procedures. Therefore, there is considerable incentive to develop computer-aided molecular design (CAMD) methods with a view to the automation of molecular design (Blaney, 1990; Bugg et al., 1993).

In the last few years, genetic algorithms (Holland, 1992) have emerged as robust optimization and search methods (Lucasius and Kateman, 1993, 1994). Diverse areas such as digital image processing (Andrey and Tarroux, 1994), scheduling problems and strategy planning (Cleveland and Smith, 1989; Gabbert et al., 1991; Syswerda, 1991; Syswerda and Palmucci, 1991; Easton and Mansour, 1993; Kidwell, 1993; Kobayashi et al., 1995), engineering (Bramlette and Bouchard, 1991; Davidor, 1991; Karr, 1991; Nordvik and Renders, 1991; Perrin et al., 1993; Fogarty et al., 1995), music composition (Horner and Goldberg, 1991), criminology (Caldwell and Johnston, 1991) and
biology (Hightower et al., 1995; Jaeger et al., 1995) have benefited from these methods. Genetic algorithms have also largely infiltrated chemistry, and numerous interesting applications are now being described in the literature (e.g. Lucasius and Kateman, 1991; Leardi et al., 1992; Li et al., 1992; Hartke, 1993; Hibbert, 1993a; Wehrens et al., 1993; Xiao and Williams, 1993, 1994; Chang and Lewis, 1994; Lucasius et al., 1994; Mestres and Scuseria, 1995; Rossi and Truhlar, 1995; Zeiri et al., 1995). Among them, those dedicated to molecular modeling appear promising as a means of solving some CAMD problems (Tuffery et al., 1991; Blommers et al., 1992; Dandekar and Argos, 1992, 1994; Fontain, 1992a,b; Judson, 1992; Judson et al., 1992, 1993; Hibbert, 1993b; Jones et al., 1993; McGarrah and Judson, 1993; Unger and Moult, 1993a,b; Brown et al., 1994; May and Johnson, 1994; Ring and Cohen, 1994; Sheridan and Kearsley, 1995).

Against this background, the chapter is organized as follows. First, a survey of the different classes of search techniques is presented. Secondly, a brief description of how genetic algorithms work is provided. Thirdly, a review of the different applications of genetic algorithms in quantitative structure-activity relationship (QSAR) and drug design is presented. Fourthly, information on software availability for genetic algorithms and related techniques is given. Finally, the chapter concludes by underlining some advantages and drawbacks of genetic algorithms.

CLASSES OF SEARCH TECHNIQUES

Analysis of the literature allows the identification of three main types of search methods (Figure 1). Calculus-based techniques are local in scope and depend upon the existence of derivatives (Ribeiro Filho et al., 1994). According to these authors, such methods can be subdivided into two classes: indirect and direct. The former looks for local extrema by solving the equations resulting from setting the gradient of the objective
function equal to zero. The search for possible solutions starts by restricting itself to points with slopes of zero in all directions. The latter seeks local optima by working around the search space and assessing the gradient of the new point, which drives the search. This is simply the notion of 'hill climbing', where the search is started at a random point, at least two points located at a certain distance from the current point are tested, and the search continues from the best of the tested nearby points (Koza, 1992; Ribeiro Filho et al., 1994). Due to their lack of robustness, calculus-based techniques can only be used on well-defined problems (Goldberg, 1989a; Ribeiro Filho et al., 1994).

Figure 1 Different classes of search methods. Search methods divide into calculus-based methods (indirect and direct), enumerative methods, and guided random search methods (simulated annealing and evolutionary computation, the latter comprising evolution strategies, evolutionary programming, genetic algorithms, and genetic programming).

Enumerative methods (Figure 1) search every point related to an objective function's domain space, one point at a time. They are very simple to implement, but may require significant computation and therefore suffer from a lack of efficiency (Goldberg, 1989a).

Guided random search techniques (Figure 1) are based on enumerative approaches, but use supplementary information to guide the search. Two major subclasses are simulated annealing and evolutionary computation. Simulated annealing is based on thermodynamic considerations, with annealing interpreted as an optimization procedure. The method probabilistically generates a sequence of states based on a cooling schedule to converge ultimately to the global optimum (Metropolis et al., 1953; Kirkpatrick et al., 1983). The main goal of evolutionary computation (de Jong and Spears, 1993) is the application of the concepts of natural selection to a population of structures in the memory of a computer
(Kinnear, 1994). Evolutionary computation can be subdivided into evolution strategies, evolutionary programming, genetic algorithms, and genetic programming (Kinnear, 1994; Angeline, 1995).

Evolution strategies were proposed in the early 1970s by Rechenberg (1973). They require a real-valued encoding of the problem parameters. Evolution strategies are frequently associated with engineering optimization problems (Kinnear, 1994). They promote mutations rather than recombinations. Evolutionary programming is likewise sceptical about the usefulness of recombinations, but allows any type of encoding (Fogel, 1995). With genetic algorithms, a population of individuals is created and the population is then evolved by means of the principles of variation, selection, and inheritance. Indeed, genetic algorithms differ from evolution strategies and evolutionary programming in that this approach emphasizes the use of specific operators, in particular crossover, that mimic the form of genetic transfer in biota (Porto et al., 1995). Genetic programming (Koza, 1992; Kinnear, 1994) is an extension of genetic algorithms in which members of the population are parse trees of computer programs. Genetic programming is most easily implemented where the computer language is tree structured, and therefore LISP is often used (Kinnear, 1994).

MECHANICS OF SIMPLE GENETIC ALGORITHMS

An overview of natural selection

In nature, the organisms that are best suited to competition for scanty resources (e.g. food, space) survive and mate. They generate offspring, allowing the transmission of their heredity by means of genes contained in their chromosomes. Adaptation to a changing environment is essential for the persistence of individuals of each species. Therefore, natural selection leads to the survival of the fittest individuals, but it also implicitly leads to the survival of the fittest genes. The reproduction process allows diversification of the gene pool of a species. Evolution
is initiated when chromosomes from two parents recombine during reproduction. New combinations of genes are generated from previous ones and therefore a new gene pool is created. Segments of two parent chromosomes are exchanged during crossovers, creating the possibility of the 'right' combination of genes for better individuals. Mutations introduce sporadic and random changes in the chromosomes. Repeated selection, crossovers and mutations cause the continuous evolution of the gene pool of a species and the generation of individuals that survive better in a competitive environment. Pioneered by Holland (Holland, 1992), genetic algorithms are based on the above Darwinian principles of natural selection and evolution. They manipulate a population of potential solutions to an optimization (or search) problem (Srinivas and Patnaik, 1994). Specifically, they operate on encoded representations of the

Annexe

Table I Structures of compounds 1-31

No 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
R1 3-NHCHO 3-NHCHO 5-NO 5-SCH3 5-SOCH 3-NO 5-CN 5-NO 3-SCH3 5-SO CH 5-NO 5-NO 5-NO 5-NO 3-SO CH 5-NO 3-NHCHO 3-NHCHO 3-NHCOCH 5-NO 3-NO 3-NO -5-C1 5-NO 5-NO 3-NO 5-NO 5-NO 5-NO 5-NO 5-NO 5-NO
R2 NHC 14H NH-3-Cl-4-(4-C1C6H4 O)C H NH-3-Cl-4-(4-C1C6H4 O)C H NH-3-C1-4-(4-C1C6H4 O)C H NH-3-Cl-4-(4-C1C6H4O)C H NH-3-C1-4-(4-C1C6H4 O)C H NH-3-C1-4-(4-C1C6H4 O)C H NH-4-(4-CF3C H4O)C6H4 NH-3-C1-4-(4-C1C6H4O)C H NH-3-C1-4-(4-C1C6H4O)C H NH-4-(C6H O)C6H4 NH-3-Cl-4-(4-C1C6H4 CO)C6H3 NH-4-(2-C1-4-NO 2C6H O)C6H4 NH-3-C1-4-(4-CH OC6H4 O)C6 H3 NH-3-C1-4-(4-C1C 6H4O)C H NH-3-C1-4-(4-C1C6H4S)C6H3 NHC6 H NHC H NHC 14H NHC NHC 14H 14H NHC 14H NH-4-C( CH NHC NHC 3) C6H4 12H 16 H 3 3 NH-3-C1-4-(4-C1C6H4NH)C6H NH-4-(3-CF3C H4O)C6H4 NH-3-C1-4-(4-SCF C6H4O)C6H NH-3-C1-4-(3-CF C6H4O)C H NH-4-(C H CHOH)C6H 4-C10 H

chemicals. For the derivation of QSAR, Selwood and coworkers retrieved from the literature and calculated a set of 53 physicochemical descriptors which are listed in Table III
and whose values are given in Table IV.

Thanks are due to Dr D.J. Livingstone (ChemQuest, UK) for kindly providing and validating the values of the 53 physicochemical descriptors reported in Table IV and the activity categories given in Table II.

Table II Biological activity of compounds 1-31

No 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
In vitro activity: EC50* (µM) 04 48 15 0.0145 0.095 0.38 0 074 12 17 044 >10 0 128 152 0435 59 039 1 37 094 028 085 >10 33
-log in vitro activity -0 85 -0 38 40 32 -0 88 82 84 02 42 00 10 13 92 77 30 36 -1 -0 41 -0.9 89 0.82 36 0.23 41 -0 04 43 03 55 07 -1 0 48
Category† 1 1 2 2 2 2 1 2 3 3
In vivo activity: % worm reduction * 89 toxic 80 58 84 toxic 17 72 52 28 89 14 70 61 42 70 NT NT NT NT NT NT 44 NT 48 80 85; some toxicity 74 39 61

* EC50 = effective concentration at which 50% of the adenine taken up was released into the medium.
† Activity categories: inactive, intermediate, active.
NT = not tested.

Table III Physicochemical descriptors

1-10    Partial atomic charges (ATCH) for atoms 1-10
11-13   Vectors (X, Y, and Z) of the dipole moment (DIPV)
14      Dipole moment (DIPMOM)
15-24   Electrophilic superdelocalizabilities (ESDL) for atoms 1-10
25-34   Nucleophilic superdelocalizabilities (NSDL) for atoms 1-10
35      van der Waals volume (VDWVOL)
36      Surface area (SURF_A)
37-39   Moments of inertia (X, Y, and Z) (MOFI)
40-42   Principal ellipsoid axes (X, Y, and Z) (PEAX)
43      Molecular weight (MOL_WT)
44-49   Parameters describing substituent dimensions in the X, Y, and Z directions (S8_1D) and the coordinates of the center of the substituent (S8_1C)
50      Calculated log P (LOGP)
51      Melting point (M_PNT)
52-53   Sums of the F and R substituent constants (SUM)

Table IV Values of the 53 physicochemical parameters for compounds 1-31

No ATCH1 ATCH2 ATCH3 ATCH4 ATCH5 ATCH6 ATCH7 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 0391 0408 -0 1463 -0 1373 -0 1403 -0 1642 -0 1369 -0 1466 -0
2545 -0 1435 -0 1462 -0 1453 -0 1461 -0 1469 -0 5230 -0 1458 0386 0391 0395 -0 1471 -0 1652 -0 1612 -0 1469 -0 1471 -0 1651 -0 1452 -0 1468 -0 1459 -0 1460 -0 1477 -0 1462 -0.0092 -0.0055 0.0923 0.0525 0.0876 0.1018 0.0619 0.0920 0650 1007 0916 0.0925 0928 0.0917 0.1274 0.0923 -0 0084 -0.0089 -0.0095 0.0895 0970 1127 0906 0895 0.0969 0911 0918 0924 0926 0910 0.0918 -0 1005 -0 0980 -0 1616 -0 2566 -0 4688 -0 1469 -0 0760 -0 1618 -0 1308 -0 5146 -0 1611 -0 1603 -0 1610 -0 1623 -0 1442 -0 1610 -0 1010 -0 1006 -0 1004 -0 1628 -0 1480 -0 0938 -0 1619 -0 1628 -0 1478 -0 1600 -0 1619 -0 1611 -0 1612 -0 1629 -0 1634 0027 0009 1116 0753 1117 0833 0738 1112 0401 1454 1105 1104 1113 1116 0806 1103 0035 0028 0020 1108 0840 0995 1104 1108 0837 1084 1113 1113 1111 1116 1174 -0.2403 -0.2499 -0.2897 -0.2834 -0 2873 -0 2946 -0.2813 -0 2894 -0 2751 -0 2904 -0 2882 -0.2901 -0 2898 -0 2889 -0 2874 -0 2888 -0 2410 -0 2401 -0 2405 -0 2808 -0 2854 -0 2808 -0 2877 -0 2808 -0 2854 -0 2861 -0 2886 -0 2902 -0 2899 -0 2877 -0 2886 -0 252 -0 252 -0 213 -0 231 -0 2236 -0 1885 -0 2237 -0 213 -0 231 -0 2170 -0 2125 -0 2105 -0 213 -0 2132 -0 2260 -0 211 -0 2524 -0 252 -0 2520 -0 2125 -0 187 -0 181 -0 213 -0 2125 -0 1874 -0 2097 -0 214 -0 2126 -0 2126 -0 2148 -0 2084 1687 1750 2609 2325 2473 2834 2373 2601 2323 2621 2593 2605 2602 2604 2913 2600 1685 1686 1683 2562 2791 2838 2589 2562 2793 2578 2601 2609 2612 2600 2483 Annexe 31 Table IV continued No ATCH8 ATCH9 ATCH10 DIPV_X DIPV_Y DIPV_Z DIPMOM ESDL 1 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 4177 0.4260 0.4274 0.4267 0.4263 4267 4260 4275 0.4257 4272 4262 4268 4283 4264 4261 4268 4188 0.4181 4176 4181 4178 4178 4257 4181 4173 4245 4274 4274 4273 4257 3493 -0 4132 -0.3994 -0.3942 -0 3976 -0 3961 -0.3894 -0.3958 -0.3983 -0.3981 -0.3976 -0.3924 -0.3877 -0.3970 -0.3954 -0.3931 -0.3906 -0.4125 -0.4129 -0.4135 -0.4063 -0.4012 -0.3985 -0.3971 -0.4063 -0.4015 -0.3937 -0.3986 -0.3938 -0.3942 -0.3958 -0.3284 -0.4033 -0.3270 -0.3242 
-0.3281 -0.3269 -0 3248 -0 3267 -0.3238 -0.3279 -0 3243 -0 3268 -0 3272 -0.3249 -0 3229 -0 3264 -0 3264 -0 4043 -0 4039 -0 4033 -0 3998 -0 4015 -0 3993 -0 3243 -0 3998 -0 4013 -0 3248 -0 3237 -0 3243 -0 3243 -0 3246 -0 1500 1993 -1 2508 0.1077 -3 1042 0.0267 0.3801 -1 5169 -0.0356 -2 3652 -3 1640 1464 -0.7323 -2.3472 8977 0.9609 0.2799 0017 0.9936 0.9507 4223 -1 1027 0.4214 4332 4223 -1 1208 2.2681 4356 -0.2955 -1 0840 1309 5821 -1 7847 -1 4858 4632 -2.0453 -2.2836 -8 0741 -0.5424 7144 -3 7578 5843 0687 7748 1618 -0 2785 -6 1255 0840 -2 6660 -2 6456 -2 3962 -0 3826 -8 9307 -8 0769 1762 -0 3826 -8 9224 3472 6107 5579 9937 -1 3363 7919 -0 0244 3883 4468 -1 0010 6789 3869 4442 2436 -0 8217 -2 7014 7662 2510 3489 -0 0015 -3 3030 -0 2142 -0 0328 -0 0186 0587 -0 0241 -0 0316 -0 0305 2453 -0 0241 -0 0355 6614 -1 5316 4620 -1 5566 1914 0.6987 8324 9806 5337 8498 3301 0923 6711 9172 5155 4518 3806 0952 8286 9111 0253 1399 8481 8261 5786 4437 9986 0880 4464 4437 9926 7197 0580 6161 7519 4095 0658 -0 5466 -7 2024 -5 8408 -2 0706 -1 2090 -1 0867 -1 0820 -8 4444 -5 2785 -0 8667 -18 5242 -3 627 -3 625 -51 017 -1 3643 -5 5805 -0 545 -0 547 -0 5643 -0 512 -1 2692 -0 9495 -0 7424 -0 5128 -1 2662 -34 2046 -9 8302 -4 1539 -4 111 -0 6642 -41 8125 Table IV continued No ESDL2 ESDL3 10 11 12 -1 7022 -1 6017 -0 8313 -1 3785 -1 1233 -0 7912 -1 0338 -0 8606 -1 6241 -0 9508 -0 9389 -0 7832 -0.8054 -3 7462 -3 5819 -1 3477 -0.8732 -0 4146 -0.8195 -4.9054 -2 0048 -0.7624 -10.2312 -2 4015 ESDL4 -0 2892 -5 5173 -2 7471 -1 8399 -1 0692 -0 6555 -0 9286 -3 9635 -3 5297 -1 0341 -8 6129 -1 7172 ESDL5 ESDL6 ESDL7 ESDL8 ESDL -1 2932 -4 0391 -1 2534 -2 3639 -1 5280 -0 5825 -1 3942 -1 4281 -2 5197 -1 2133 -2 1310 -1 0703 -1 4297 -1 3249 -1 0284 -0 9981 -0 8362 -0 5955 -0 7768 -1 3369 -1 1995 -0 7052 -2 3426 -0 7893 -0 4022 -1 3459 -1 1591 -0 6437 -0 5079 -0 4950 -0 4861 -1 5322 -1 1224 -0 4496 -2 9728 -0 8350 -0.4658 -0.5084 -0.7395 -0.4550 -0 4483 -0.3911 -0.4077 -0.8218 -0 4962 -0.5518 -1 2358 -0.5919 
-0 6069 -0 6027 -0 6892 -0 5623 -0 5486 -0 506 -0 527 -0 7378 -0 5924 -0 5905 -0 9782 -0 6121 320 Annexe Table IV continued No ESDL2 ESDL3 ESDL4 ESDL5 ESDL6 ESDL7 ESDL8 ESDL9 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 -0 7935 -1 5157 -0 8610 -0 8240 -1 6953 -1 7001 -1 7833 -0 8149 -0 8419 -1 0875 -1 0148 -0 8149 -0 8415 -1 0883 -0 8699 -0 7949 -0 7949 -0 9866 -2 0914 -1 0545 -4 8865 -0 8798 -1 2303 -1 2909 -1 2900 -1 3583 -0 9536 -0 6042 -3 0831 -1 0363 -0 9536 -0 6029 -2 4267 -1 5162 -1 1159 -1 1127 -1 0243 -4 0222 -0 8342 -7 7191 -0 5146 -1 1163 -0 4025 -0 4022 -0 4054 -0 3904 -0 5282 -0 4709 -0 4243 -0 3904 -0 5273 -0 6293 -4 5312 -0 3688 -0 6970 -0 4663 -0 4646 -0 4852 -0 2376 -0 2530 -0 7914 -1 0779 -0 2376 -0 2528 -5 1572 -1 5257 -1 7318 -0 9134 -0 9078 -0 4143 -6 0766 -0 8566 -0 6187 -0 6215 -0 9300 -5 7077 -0 623 -2 7104 -0 4993 -0 673 -0 6060 -0 6062 -0 6206 -0 438 -0 448 -0 738 -0 853 -0 438 -0 448 -1 023 -0.757 -0.624 -0.625 -0.784 -3 4349 -2 3137 -29 8732 -0 9999 -3 4701 -0 8034 -0 8030 -0 8395 -0 5490 -0 4326 -0 7380 -1 0745 -0 5490 -0 4321 -14 0039 -5 5302 -2 6584 -2 6324 -0 9460 -27 2335 -1 7701 -22 4600 -0 8580 -2 6015 -0 2885 -0 2896 -0 2900 -0 3995 -0 6660 -2 4801 -0 4334 -0 3995 -0 6647 -18 5399 -4 6660 -1 9782 -1 9609 -0 4211 -16 5747 -0.8627 -4.2065 -0.6758 -0.9914 -1 4224 -1 4288 -1 4943 -0.6531 -0.6639 -1 1205 -0 6888 -0 6531 -0 6637 -7 0824 -1 5364 -0 8589 -0 8565 -0 7004 -3 3515 Table IV continued No ESDL10 NSDL1 NSDL2 NSDL3 NSDL4 NSDL5 NSDL6 NSDL7 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 4.2388 0.9939 0780 0.9646 1509 4842 0965 0630 0564 2053 0505 1065 1073 2010 1112 0785 4.1605 2790 6538 8047 9648 9410 5615 8047 1007 9135 1817 9486 0352 7658 0528 1540 9536 1735 1364 2310 2209 1353 2251 1797 0990 1020 0581 1287 2609 2260 0783 1287 0696 8790 7285 0544 1675 17 2603 9101 6896 8935 1778 6642 8079 8030 9021 1739 7299 9683 1255 1802 7149 5870 6949 4.3860 7149 4.0671 0.8228 0289 0.8049 0.9393 18 2373 0.8766 0187 8389 9469 0084 
0507 0518 1461 0567 0307 0048 1088 4714 3436 7579 0373 3004 3436 0302 9414 1907 9603 0662 5904 0322 1572 9515 0980 1380 2366 2257 4756 1252 1852 0283 0315 0043 4154 2959 2497 8484 4154 275 0.311 0.370 0.331 0.373 0.375 0.354 0.368 317 3793 368 3772 3766 728 3657 3725 259 2854 1158 061 357 3897 1676 061 -0 3896 -0 4706 -0 4688 -0 4129 -0 3762 -0 3280 -0 3649 -0 5404 -0 4499 -0 3473 -0 7942 -0 4057 -0 4094 -1 4855 -0.3427 -0.4597 -0.3895 -0.3893 -0.3945 -0.3277 -0.3296 -0.4324 -0.3360 -0 3277 6719 7982 0031 9121 1601 1351 0388 9950 8166 1427 9909 0216 0239 5093 9949 0046 5333 7428 5450 8768 9932 1258 6018 8768 Annexe 32 Table IV continued No ESDL10 NSDL1 NSDL2 NSDL3 NSDL4 NSDL5 NSDL6 NSDL7 25 26 27 28 29 30 31 9733 0579 0596 0966 0973 4281 0278 2608 1390 1476 2085 2101 0804 1688 5997 6811 6826 7759 7782 0356 6686 7751 0183 0159 0430 0431 2915 9694 2974 1370 1507 2138 2176 6839 1517 357 372 367 3740 3740 0902 3783 -0 3296 -1 5813 -0 5798 -0 4219 -0 4211 -0 3444 -2 7701 0.9945 0.9953 9933 0152 0158 0402 9738 Table IV continued No NSDL8 NSDL9 NSDL10 VDWVOL SURF_A MOFLX 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 766.6057 37378 3047 36572.980 1686.4363 21047 6348 19971 365 2651 7319 19371 6113 17267 402 2776.7034 19569 5352 17354 976 3099 5256 21450 8535 18866 433 1726 7526 20579 1602 19403 095 2345 0713 17645 0371 15850 812 2267 2917 22328 7871 20831 539 1822 8052 20390 8359 19440 972 3400 4380 22357 9707 19442 1465 1894 8627 15368 8545 13607 305 3188 1543 20791 5469 17782 863 2622 2456 21989 5332 20498 271 2529 6760 19250 0039 17192 373 1993 3540 23754 7051 22894 0762 3615 8687 20164 7129 17081 3086 574 2045 9189 6504 8711 1963 623 5986 13796 0986 13431 9482 793 2323 40830 8164 39243 7344 1650 2062 25060 2734 23662 1348 765 7248 37097 8516 36119 4805 1628 6172 39861 3984 38461 703 1422 5823 10025 0400 8799 7803 1650 2062 25060 2734 23662 1348 808 4082 49091 1367 46927 3906 3064 9253 19326 9492 16851 7227 1846 9785 22716 9297 21426 619 3010 1609 
28906 9922 26566 662 2447 4656 23459 0742 21598 6250 1837 9072 15123 2236 13701 3262 1631 9397 5819 5015 4289 1333 4560 6498 2705 7084 8967 7846 9231 1169 6883 0456 0422 4834 4008 2.0959 2.2353 2.2456 4602 4571 4370 1902 1162 0261 8164 2.1902 2.1135 2.0056 0825 2.3728 2.3853 8805 2232 4936 5105 7144 5357 0.5984 2.4673 0.6101 0.6596 0.5269 0.6542 0.6446 0.8010 0.7563 0.6674 0.7190 0.7114 0.4934 0.4939 0.4840 0.8049 0.7561 0.6627 0.5803 0.8049 0.7559 0.6292 0.6477 0.7509 0.7541 0.5986 0.7991 6788 8485 0592 8624 9381 1601 9397 0010 8613 9685 9576 1053 0960 0505 0460 0432 6803 6791 6736 8552 8497 8511 0563 8552 8496 9671 9897 0921 0988 0391 0453 379 99982 327 99991 320 59991 328 79990 342 29987 320 59991 313 79993 321 59988 328 79990 353 59985 304 79996 333 79993 331 99991 328 49991 353 59985 332 29990 250 40009 282 80002 396 19980 340 19989 372 59979 387 09976 280.40002 340.19989 404 99973 328 29990 321 59988 353 99988 336 09991 307 79996 216 20004 488 07489 386.58533 379 91208 395 36938 404.42609 378.27496 373 30637 386.11441 396.11472 413 67645 355 80188 385 53317 392.10461 394.02155 411 79279 391 05966 314.22668 358.10831 506.67865 436.40833 477.42938 493.83658 330 78622 436 40833 520 96588 380 38226 385 27243 421 53546 401 82898 364 02640 259 55621 MOFI_Y MOFI_ Z 322 Annexe Table IV continued No PEAX_X PEAX_Y PEAX_Z MOL_WT S8_1DX S8_1DY S8_1DZ S8_1C X 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 22 0590 15 4115 14 2916 14 3081 14 6589 15 1625 14 0235 15 6953 15.0954 14.6221 13.6757 14.3769 15 2597 14.3277 15 7683 13 9506 12 8093 15 0923 22 5436 18 3366 21 8901 21 5989 11 7708 18 3366 24 2130 14 1238 15 9624 16 5302 15 4201 13 6143 7625 5074 9195 8139 8358 7245 8192 8648 2.1538 2.2877 6426 0.9595 0239 5703 6904 5114 7554 9522 4904 2558 3418 1859 1800 2534 3418 8889 8848 8312 8678 8099 6903 9592 2329 0845 3462 4702 7286 1767 1119 7621 0771 9303 0251 0164 9018 2696 3.9863 6.2265 1573 2.9081 9062 4.6663 3963 4.2919 4.5915 6663 2780 7771 3473 2789 
9030 7317 3520 376 54135 417 24887 419 22134 420 31082 436 31021 419 22134 399 23355 418.32956 420.31082 452.30960 362.34225 431 23227 429.77383 414.80270 452 30960 435 28189 264 32553 292 37949 390 56833 350 45987 378 51382 412 95883 314 34192 350 45987 406 56778 418 23660 418 32956 484.83456 452 77457 364 35831 277 66428 17 0896 11 8036 11 7737 11 7604 11 7727 11 7962 11 7825 12 7140 11 7836 11 7258 11 7909 11 8248 12 4982 12 9533 11 6899 10 1820 9012 9625 10 8214 10 0295 10 7617 10 7824 2551 10 0295 11 5158 8.6220 11 1225 10.5676 11 1316 9.4340 6061 13 6199 9590 10 0132 10 1024 10 6567 9618 9721 8967 9.9952 10.6374 9.5562 11 6162 9.2788 10.0760 9.6063 11 9139 7093 11 8851 18 7441 16 4804 18 8944 18 9532 3913 16 4804 21 1088 11 7029 11 9605 13 4492 11 6704 10 5818 7776 7.0784 6.2589 2309 2214 5288 2654 2570 2800 2577 6009 8335 7136 4438 4081 4384 9458 3613 9513 9792 1094 2670 2213 2754 1094 5424 8834 6.7951 6.8999 6.2340 6.8240 4.4484 11 1982 5330 5327 519 5385 533 5308 8229 528 5063 5199 535 6865 119 467 7.365 5.236 5 753 7.178 6.763 30.975 31 319 6.340 6.763 33 7905 9993 1463 857 1444 4758 871 Table IV continued No S8_1CY S8_1CZ LOGP M_PNT SUM_F SUM_ R 10 11 12 -1 5653 -1 2271 -1 2850 -1 0338 -1 1429 -1 2412 -1 2838 -1 5777 -1 2163 -1 4219 2512 2017 239 960 994 372 730 994 755 695 372 670 888 6.205 81 183 207 143 165 192 256 199 151 195 212 246 25 25 67 20 52 67 51 :67 20 54 67 67 -0 23 -0 23 -0 0 1 -0 22 -0 23 -0 23 -4 9646 -1 2691 -1 3552 -1 4426 -1 6179 -1 2725 -1 3163 -1 9160 -1 3386 -1 6703 -2.0179 -2.2636 Annexe 32 Table IV continued No S8_1 CY S8_1 CZ LO GP M_PNT SUM_F SUM_R 13 -1 5737 14 -1 7659 15 -1 2903 16 -2 4667 17 -5 0591 18 -6 1533 19 -9 5838 20 -8 4629 21 16 4613 22 16.5005 23 -4.3674 24 -8.4629 25 18.2964 26 -5 7537 27 -5 9329 28 -6.6848 29 -5 7897 30 -5 4216 31 -3 7972 -1 1706 -1 2340 -1 7341 2822 -0 5692 -0 9282 -1 5017 -1 4630 -1 6728 -1 6306 1964 -1 4630 0000 -1 6195 -1 2049 -1 2939 -1 2802 -0 9107 2244 113 180 681 838 007 065 230 466 470 
300 354 408 520 811 695 869 269 686 654 208 159 178 222 62 78 71 90 67 81 227 85 79 173 176 195 192 178 170 0.67 0.67 0.54 0.67 0.25 0.25 0.28 0.67 0.67 08 0.67 0.67 0.67 0.67 0.67 0.67 0.67 0.67 67 -0 23 -0 23 -0 23 -0 23 -0 23 -0.2 6 01 0.1 6 6 6 6 16

REFERENCE

Selwood, D.L., Livingstone, D.J., Comley, J.C.W., O'Dowd, A.B., Hudson, A.T., Jackson, P., Jandu, K.S., Rose, V.S., and Stables, J.N. (1990). Structure-activity relationships of antifilarial analogues: A multivariate pattern recognition study. J. Med. Chem. 33, 136-142.

Index

Adaptive least squares 161
Adaptive mutation 64
Adaptive parallel genetic algorithm
ADE-4 251
AERUD 303, 305
AGAC 14
Anesthetic activity 83
Angiotensin II receptor antagonist 235
Antimycin analogs 78, 118, 315
Aromatic substituents 11, 243, 245, 246
Automated ligand docking 219

Backpropagation neural network 11, 159, 271, 303
Benzodiazepine receptor 13, 234
Biased uniform crossover 47
Binary encoding 5, 42
Biodegradation 303
Boltzmann probability
Brookhaven protein data bank 220

Calculus-based techniques
β-Carboline 13
Cataclysmic mutation 62
CHARMm 195
CHC adaptive search algorithm 62
Chemical structure handling 211
Chem-X 167
Cluster analysis 11, 83, 161, 236
CoMFA 110, 160, 161, 176
Computer-aided molecular design 1, 27
CONCORD 215
Conformation 134, 213
Constraint 250, 307
CORINA 167
Correlation coefficient 11
Correspondence analysis 161
Cost 57
Criminology
Crossover 7, 89
Cross-validated feature selection 122
Crowding factor 60
Crowding genetic algorithm 60

3-D conformational search 21
3-D database 13, 23
2-D plot 24
Decision tree 244
Delta coding 38, 52,
Dendrogram 244
Density 28
Deterministic replacement 59
Digital image processing
Dihydrofolate reductase 224
Directed tweak 21
Distance geometry 21
Dopamine β-hydroxylase inhibitor
DPGA
3-D QSAR 110, 132, 137, 237
Dynamic QSAR analysis 149, 150

Elitist strategy 55, 287, 31
Encoding mechanism
Enumerative methods
Environmental fate 306
Equation oriented approach 276
Euclidean distance
11, 243 EvoFrame 14, Evolution strategies 3, 4, 35, Evolutionary computation Evolutionary programming 3, 4, 35 , 36, 43, 46, 50, 59, 87, 100 Evolver Feature reduction 11 Feature selection Fitness 5, 6, 35, 56, 57 Flexible docking 22 Flexible ligand Focusing 43 Forward problem 12, 273, 27 Free energy force field 132 Fuzzy decision tree(s) 20 Fuzzy logic system 20 326 Inde x Fuzzy network 20 GABA analog 13 GAC 16 GAGA 16 GAL 16 GAME 16 GANNET 17 GAUCSD 17 GA Workbench 17 Gaussian scaling Gene based coding 38, Gene invariant genetic algorithm 60 Generational approach 10, 58, 59 Genes 36, 38 GENESIS 17 Genetic alphabet 36, 38, 40 Genetic function approximation 12, 87, 88, 113, 131, 141, 159, 175 Genetic invariance 61 Genetic partial least-squares 104, 109, 115 Genetic programming 3, Genocop 18 GenocopllI 18 GERM 13, 193, 195 GFA see genetic function approximation Glass transition temperature 282 GOLPE 116 Gray coding 5, 6, 38, 51, 21 Group contribution approach 275 G/SPLINES algorithm 88 Guided random search methods Halogenated ethyl methyl ether 83 Hammett constant 243 Hamming cliffs Hamming distance 40 H-bonding acceptor (HBA) 11, 243 H-bonding donor (HBD) 11, 243 Heterogeneous recombination 62 Hill climbing HIV protease inhibitor 139, 140 5-HT, antagonist 235 Hybrid morphine molecule EH-NAL 232 Hybrid system(s) 12, 303 Inductive parameter 11, 244 Influenza virus replication 225 Initial population 36, 44 Integer representation 5, 40 Intercommunicating hybrid system(s ) 12 Interim solution Intermediate recombination Intermolecular 3-D QSAR 132 Interval crossover 49 Intraclass correlation coefficient 24 Intramolecular 3-D QSAR 132 Intrinsic molecular shape 13 Inverse problem 12, 273, 274, 27 Island model 228 JOIN 63 K nearest neighbors 20 K-point mutation Knowledge-based system 274 Kohonen network see Kohonen self organizing ma p Kohonen self organizing map 20, 244 Lack-of-fit 89, 113, 114, 127, 141, 175 , 18 Leave-one-out 96, 126, 127, 17 Leu-enkephalin 13, 232 
Ligand—receptor recognition 209 Linear scaling Local elitism LOF see lack-of-fi t LOO see leave-one-ou t MARS algorithm 11, 20, Massively distributed parallel geneti c algorithm 62 Mating population 36, Mating operator Maturation 36, 52, 53, Mean Euclidean coefficient 250, 25 Messy genetic algorithm 63 Meta-genetic algorithm 63 MLR model 78 Model-based approach 27 Model building 1 Molar refractivity 11, 24 Molecular alignment 134 Molecular docking Molecular field 136 Molecular modeling Molecular recognition 21 Molecular shape analysis 93, 110, 13 Monte Carlo simulation Morphine MPGA 18 Index MSA 3-D QSAR 133, 136, 139 Multi-niche crowding genetic algorithm 60 Multi-point crossover 9, 48 MUSEUM 100, 101, 103 Music composition Mutation 10, 50, 58 Niching 43, 51, 62 Node based coding 38, 49, 51 Nonlinear map 11, 243, 244, 251 NPOP 37, 46 N2M 244 N-methyl-D-aspartate 12 Offspring 35, 36, 52, 53 OMEGA 19 One-point crossover 7, 8, 47, 48, 215, 223, 231 Outlier(s) 11, 116 Outlier limiting 116, 121 Parallel genetic algorithm 62 Penalty 214, 307 Pharmacophore 12, 212, 226 it constant 11, 243 PLS 68, 83, 96, 98, 105, 109, 110, 111, 112, 121, 135, 142, 162, 178 PMX crossover operator 231 Polymer design 27 POMONA89 database 215 Power law scaling PRESS 112 Probabilistic replacement 59 Progesterone receptor 12, 159, 189 Pseudo-Hamming distance 292 QSAR 11, 21, 35, 40, 52, 68, 88, 105, 109, 131, 147, 159, 193, 24 (Q)SBR 303 QSPR 35, 40, 52 REALizer 14, 15 Receptor model 194 Recombination operator(s) Regression splines 90 Renin inhibitor 147, 148 Resonance parameter 11, 244 Root-mean-square error 281 327 Roulette wheel selection 6, 7, 197 , 212, 215, 221, 228, 232, 24 SAMUEL Scheduling problem(s) Schema Score plot 244 Search techniques Segmented crossover Sigma truncation Sigmoidal scaling Similarity 43 Simulated annealing 3, 20, 52, 60, 219 , 233, 236 Solvent design 274 Spectral map 244 Spline modeling SPLIT 63 Standard genetic algorithm 50, 58 STATQSAR 251, 306, 309 Steady-state 
approach Stepwise selection 75, 10 Steroids 12, 159, 189 Strategy planning Structural constraint(s) 31 Structural descriptors 306 Sweeteners 200 SYBYL 220, 222, 227 Systematic search 21 Test series 11, 24 Topological indices 274, 27 Tournament selection 7, 46, 5 Toxicity Traveling salesman problem Trimethoprim Two-point crossover 8, 9, 47, 223 , 23 Unbiased uniform crossover 47 Uniform crossover 9, 47, 48, 247 Uniform mutation UNIPALS algorithm 11 Updating the population 5 Variable selection 1 Variance coefficient 11, 251, 26 XpertRule 20 Plate Docking of methotrexate into dihydrofolate reductase Plate Docking of folate into dihydrofolate reductase Plate Docking of D-galactose into an L-arabinose binding protein Plate Docking of 4-guanidino-2-deoxy-2,3-di-dehydro D-N-acetylneuramic acid into influenza sialidase Plate Overlay of leu-enkephalin and hybrid morphine The hybrid morphine , EH-NAL is shown coloured by atom type and leu-enkephalin is shown coloure d purple The elucidated pharmacophore is indicated by the yellow circles Plate Overlay of benzodiazepine receptors ligands CGS-8216 is coloured by atom type, Rol15-1788 is in orange and methyl-beta-carboline-3-carboxylate is in purple Points of interest are indicated by the yellow circles (see the text for discussion) Plate Overlay of 5-HT antagonists Structure 37 is coloured by atom type, structure 44 is i n purple, structure 45 is cyan and structure 47 is orange The elucidated pharmacophore is indicated by the yellow circles Plate Overlay of six angiotensin II receptor antagonists The base molecule L-158809, is shown coloured by atom type, GLAXO is coloured magenta SEARLE is orange, SKB 108566 is cyan, TAK is green and DuP is yellow Points of interest are indicated by the yellow circle s (see the text for discussion)

Date posted: 13/04/2019, 01:28