Nonlinear system identification using genetic programming

143 420 0

Đang tải... (xem toàn văn)

Tài liệu hạn chế xem trước, để xem đầy đủ mời bạn chọn Tải xuống

THÔNG TIN TÀI LIỆU

ACKNOWLEDGEMENTS

I would like to express my appreciation to a number of people who have contributed, directly or indirectly, to this thesis. First and foremost, I would like to express my gratitude and appreciation to Assistant Professor Lakshminarayanan Samavedham for his guidance, encouragement and support during this study. His optimism, his nourishing words of encouragement and his understanding of one's limits and potential were the keys to the success of this research.

I am very much indebted to the National University of Singapore for providing the Research Scholarship that made my studies at the Department of Chemical and Environmental Engineering possible. Immeasurable thanks go to the members of my "Data Analysis and Control System Group" for providing a congenial environment. I would like to thank our group members, particularly Madhukar, Prabhat, Dharmesh, Mranal and May Su Tun, for proofreading this thesis and for the pleasurable times throughout the past one and a half years. I am very grateful to Mr. P. Vijaysai of the Department of Chemical Engineering, IIT (Bombay), India, who suggested the use of differential evolution optimization.

Finally, I would like to dedicate this work to my parents, brothers and sisters, who brought me to this level; special thanks are due to them. A special "thank you" goes to my beloved Chaw Su Thwin for her understanding, encouragement and support.

TABLE OF CONTENTS

Acknowledgements
Table of contents
Summary
Nomenclature
List of tables
List of figures

CHAPTER 1: INTRODUCTION
1.1 Background
1.2 Outline of the thesis

CHAPTER 2: THE BASICS OF GENETIC ALGORITHMS AND GENETIC PROGRAMMING
2.1 Introduction
2.2 Overview of Genetic Algorithms (GA)
2.3 Genetic Programming
2.3.1 Initializing a GP population
2.3.2 Genetic operators
2.3.2.1 Reproduction
2.3.2.2 Crossover
2.3.2.3 Mutation
2.3.3 Selection methods
2.3.3.1 Fitness-proportional selection
2.3.3.2 Tournament selection
2.4 Genetic programming theory
2.5 Shortcomings and limitations of genetic programming
2.6 Summary

CHAPTER 3: APPLICATION OF GENETIC PROGRAMMING TO SYSTEM IDENTIFICATION
3.1 Introduction
3.2 Representation scheme
3.3 Parameterization
3.4 Improved genetic operators
3.4.1 Superposition crossover
3.4.2 Adaptation
3.5 Fitness measure
3.6 Expert knowledge integration
3.7 Algebraic simplification
3.8 Summary

CHAPTER 4: IMPLEMENTATION DETAILS OF THE GP-BASED SYSTEM IDENTIFICATION PROCEDURE
4.1 Introduction
4.2 Choice of programming language
4.3 Data structure for gene and chromosome
4.4 Simulation engine
4.5 Parameter estimation
4.5.1 Global optimization
4.5.1.1 Genetic Algorithm
4.5.1.2 Differential Evolution
4.5.1.3 Simulated annealing
4.5.2 Local optimization
4.5.2.1 Gauss-Newton optimization
4.5.2.2 Gauss-Newton optimization algorithm
4.6 Fitness function
4.7 DACS-GP algorithm
4.8 Running DACS-GP
4.9 DACS-GP configuration script
4.10 Computational resources
4.11 Guidelines for configuring GP parameters
4.12 Summary

CHAPTER 5: APPLICATIONS AND CASE STUDIES
5.1 Introduction
5.2 Identification of algebraic systems
Case study 1: Enzyme kinetics data
Case study 2: Biological oxygen demand
5.3 Identification of implicit algebraic equation systems
Case study 3: Ellipse equation
5.4 Identification of discrete dynamic algebraic systems
Case study 4: Simulated nonlinear dynamical system
Case study 5: Experimental heat exchanger system
Case study 6: Modeling of an acid-base neutralization system
Case study 7: Modeling of rainfall-runoff data
5.5 Identification of ordinary differential equation systems
Case study 8: Cracking of gas oil
Case study 9: Reversible first order series reaction
Case study 10: Lotka-Volterra system
5.6 Determination of number of states
Case study 11: Linear series reaction system
5.7 Integration of nonparametric regression techniques into DACS-GP
Case study 12: Simulated data
5.8 Summary

CHAPTER 6: CONCLUSIONS AND RECOMMENDATIONS
6.1 Conclusions
6.2 Recommendations for further research

REFERENCES
SUMMARY

The objective of the present study is to develop system identification tools using the genetic programming paradigm. The developed software is able to identify several types of models, ranging from algebraic to differential equation systems, using process data. Identification of state space models using genetic programming is a relatively new area of application that has been attempted here.

Genetic programming (GP) is a powerful tool for system identification when little is known about the underlying model structure in the data. The technique is attractive because of its ability to search discontinuous, complex nonlinear spaces. GP works on a population of individuals, each of which represents a potential solution. A population of model structures evolves over many generations towards a solution using evolutionary operators and a 'survival-of-the-fittest' selection scheme. The GP operators are reproduction, crossover and mutation.

The developed program employs a unique approach to model representation that helps produce faster and more robust computer programs. The representation is flexible, so it can be applied to different application domains without changing the main GP algorithm. Extensive literature on algebraic modeling using the GP approach is available; however, there have been few or no reported applications of GP in the context of state space models. Many chemical processes can be more conveniently represented by a set of nonlinear differential-algebraic equations, and there is no standard method for identifying differential equation models from experimental data alone. We have successfully identified nonlinear differential equation models for several batch reaction data sets. We introduce a new concept that takes advantage of process knowledge through a user-defined 'evolution policy'. A new fitness measure that accounts for the functional complexity of the model is also proposed. We further propose several enhancements to improve the efficiency of GP, such as modified genetic operators, a new block model representation using the Simulink process simulator, distributed computing, integration of nonparametric techniques and implicit algebraic equation modeling. The results of these are shown and promising improvements are recommended.

The developed program was applied, and its performance examined, on a wide range of system identification tasks. System identification studies were carried out on systems modeled by algebraic, dynamic algebraic and nonlinear state space equations. The workability of the approach is illustrated by twelve case studies using both simulated and experimental data sets. The developed program satisfactorily identified nonlinear dynamic models for all the systems studied. The executable version of the program, together with example files and data, is attached at the back of the thesis. It has an easy, intuitive and interactive graphical user interface with online and offline analysis tools. The program was written in the MATLAB programming language.
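To make the evolutionary cycle just described concrete, the MATLAB sketch below runs a minimal fitness-evaluation, tournament-selection, crossover and mutation loop. It is only an illustration of the generational workflow, not the DACS-GP implementation: for simplicity the individuals are coefficient vectors of a fixed quadratic structure, and the data and parameter values are assumptions made for the example.

```matlab
% Minimal generational loop: fitness evaluation, tournament selection,
% crossover and mutation. Illustrative GA-style sketch over fixed-structure
% coefficient vectors, NOT the DACS-GP implementation (which evolves the
% model structure itself). Data and settings are assumed for the example.
rng(0);
u = linspace(0, 1, 50)';                    % assumed input data
y = 2*u.^2 + 0.5*u;                         % assumed "true" process output

popSize = 30; nGen = 40; pMut = 0.2;
pop = randn(popSize, 3);                    % row = coefficients [a b c] of a*u^2 + b*u + c
rmse = @(c) sqrt(mean((y - (c(1)*u.^2 + c(2)*u + c(3))).^2));

for gen = 1:nGen
    f = arrayfun(@(i) rmse(pop(i,:)), (1:popSize)');
    newPop = zeros(size(pop));
    for i = 1:popSize
        % tournament selection of two parents (lower RMSE wins)
        c1 = randi(popSize, 1, 2); [~, k] = min(f(c1)); p1 = pop(c1(k), :);
        c2 = randi(popSize, 1, 2); [~, k] = min(f(c2)); p2 = pop(c2(k), :);
        mask = rand(1, 3) < 0.5;            % crossover: mix parent coefficients
        child = p1;  child(mask) = p2(mask);
        if rand < pMut                      % mutation: small random perturbation
            child = child + 0.1*randn(1, 3);
        end
        newPop(i, :) = child;
    end
    pop = newPop;
end
f = arrayfun(@(i) rmse(pop(i,:)), (1:popSize)');
[bestRmse, idx] = min(f);
fprintf('Best RMSE after %d generations: %.4g (coefficients: %s)\n', ...
        nGen, bestRmse, mat2str(pop(idx,:), 3));
```

In the full GP setting the same cycle applies, but crossover and mutation act on the model structure (sub-expressions) rather than on fixed coefficient slots, and the fitness additionally penalizes functional complexity as proposed in this work.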
NOMENCLATURE

Abbreviation / Symbol: Explanation
ACE: Alternating Conditional Expectation analysis
ASA: Adaptive Simulated Annealing
B: 'B' combined with a variable name indicates a sample lag, e.g. yB1 is the output variable lagged by one sample
DACS: Data Analysis and Control System group
DE: Differential Evolution
f: fitness score
F: functional set
G: sensitivity matrix
GP: Genetic Programming
GA: Genetic Algorithm
k1, k2, ...: optimal parameters
MARS: Multivariate Adaptive Regression Splines
PCA: Principal Components Analysis
RMSE: root mean square error
SA: simulated annealing
SSE: sum of squared errors
t: time
T: terminal set
u1, u2, ...: input variables
y1, y2, ...: output variables
z1, z2, ...: state variables
µ: step factor in the Gauss-Newton iteration
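For reference, the error measures abbreviated above are used throughout the case studies. The standard definitions below are consistent with that usage, with N the number of data samples, y_i the measured output and \hat{y}_i the model prediction; the thesis's actual fitness formulation (Table 3.2) additionally incorporates model complexity and is not reproduced in this extract.

```latex
\mathrm{SSE} \;=\; \sum_{i=1}^{N}\bigl(y_i-\hat{y}_i\bigr)^{2},
\qquad
\mathrm{RMSE} \;=\; \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(y_i-\hat{y}_i\bigr)^{2}}
```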
LIST OF TABLES

Table 3.1 Complexity values used in DACS-GP
Table 3.2 Comparison of fitness functions
Table 3.3 Evolution policy
Table 4.1 Gene library
Table 4.2 DACS-GP configuration script file
Table 4.3 Choosing parameters for the GP run
Table 5.1 Configuration for Case Study 1
Table 5.2 Results of GP runs for Case Study 1
Table 5.3 Configuration for Case Study 2
Table 5.4 Results of GP runs for Case Study 2
Table 5.5 GP configuration for Case Study 3
Table 5.6 Results of GP runs for Case Study 3
Table 5.7 GP configuration script file for Case Study 4
Table 5.8 Results of GP runs for Case Study 4
Table 5.9 GP configuration for Case Study 5
Table 5.10 Results of GP runs for Case Study 5
Table 5.11 Nominal operating conditions for the acid-base neutralization system
Table 5.12 Configuration for Case Study 6 (acid-base system)
Table 5.13 Results of GP runs on Case Study 6
Table 5.14 Configuration for rainfall-runoff modeling (Case Study 7)
Table 5.15 Results of GP runs for Case Study 7
Table 5.16 Configuration details for Case Study 8
Table 5.17 Results of GP runs for Case Study 8
Table 5.18 Configuration details for Case Study 9
Table 5.19 Configuration details for Case Study 10
Table 5.20 Results of GP runs for the Lotka-Volterra system data (Case Study 10)
Table 5.21 Results of GP runs using only the first state output as measurement
Table 5.22 Results of GP runs using only the second state output as measurement
Table 5.23 Results of GP runs using both state outputs as measurement
Table 5.24 GP configuration for runs without "ACE advice"
Table 5.25 Results of GP models without "ACE advice"
Table 5.26 Results of GP models with "ACE advice"
Table 5.27 Summary of case studies

Table 5.26 Results of GP models with "ACE advice"

Fit        VRMSE       Model
-132.55    0.247917    ((2.9422*u3)+(0.27497*((exp((1.3184*(u2*exp(((u20)*u1)))))-1)-u1)))
-305.21    0.0441034   ((3.0332*u3)+(0.07077*((u2+exp((u2+u2+exp(u2))))*(u2*u1)))+(0.94659*(u2*u1)))
-228.85    0.0968511   ((0.18493*(u2*((u1*(u2*exp(((u2*exp(u2))+u2))))+u2)))+(2.9553*u3))
-328.62    0.0348978   ((-1.0041*u2)+u2+(7.0952*((u2*(u2*u2))*((u2*(u1*u2))*u2)))+(2.9977*u3))
-376.97    0.0215183   ((7.7647*(u1*(u2*(u2*((u2*u2)*u2)))))+(3.0192*u3)+(0.95916*(u1*(u2*u2))))
-219.58    0.101471    (-3.0239*(((u1/((-0.68539*u2)+1))-u3)-(u1/((0.90937*(1-u2))+(0.1741*1)))))
-127.53    0.248948    ((2.0998*u3)+u3+(-0.15247*(exp((-3.3883*(u1-u2)))-u2))+(0.0035212*((exp((7.3807*u2))-u1)-(u3-u2))))
-399.13    0.0164655   ((0.50405*(u1*u2))+(16.416*((u2+(-0.61743*1))*((u1*(u2*u2))*u2)))+(0.0076678*u2)+(2.9942*u3))
-328.62    0.0348978   ((-1.0041*u2)+(7.0952*((((u2*(u2*u1))*(u2*u2))*u2)*u2))+u2+(2.9977*u3))
-244.2     0.07933     (0.23192*(((1.9955*(((((exp(exp(u2))*u1)*u2)+(0.32828*u3))*u2)*u2))+(12.942*u3))-u2))
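The expressions in the table above are ordinary algebraic models in the input variables and can be evaluated element-wise in MATLAB. The sketch below evaluates the best-fit model from the table on a set of inputs; u1, u2 and u3 here are hypothetical vectors, since the actual Case Study 12 data are not reproduced in this extract.

```matlab
% Evaluating the best-fit Table 5.26 model on a set of inputs (element-wise).
% u1, u2, u3 are hypothetical input vectors, not the Case Study 12 data.
u1 = rand(100,1);  u2 = rand(100,1);  u3 = rand(100,1);

yhat = 0.50405*(u1.*u2) ...
     + 16.416*((u2 - 0.61743) .* (u1.*u2.^3)) ...
     + 0.0076678*u2 + 2.9942*u3;

% With a measured output vector y of the same length, the fit could be
% scored as rmse = sqrt(mean((y - yhat).^2));
```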
The GP optimization is now run in two configurations for 60 generations with a population size of 25: one with "ACE advice" (i.e., with u1, u2 and u3 only as input variables) and another without "ACE advice" (i.e., using all five input variables to model y). The configuration for the GP runs without "ACE advice" is shown in Table 5.24. A similar configuration file was used for the GP runs with "ACE advice", with the only change made to the terminal set (u4 and u5 are removed in these runs). The success rate of the GP program is then examined. A successful run is defined as one with a root mean square error (RMSE) value below 0.04.

The success rate with "ACE advice" (out of 10 runs) was found to be higher than the success rate without "ACE advice" (also out of 10 runs); it is therefore concluded that ACE is able to improve the success rate of GP modeling for algebraic systems. The best-fit model obtained with "ACE advice" had an RMSE of 0.0164655, compared with an RMSE of 0.0334999 for the best-fit model obtained without "ACE advice". The "quality" of the model is also better, indicating that the removal of spurious variables using ACE, MARS or other means is a required step in the modeling effort. The modeling results from these runs are summarized in Tables 5.25 and 5.26, where the best models are shown in bold. The predictions from the best model (obtained with "ACE advice") are compared with the "true" data in Figure 5.24.

Figure 5.24 Data vs. model prediction for Case Study 12 (model: (0.50405*(u1*u2))+(16.416*((u2+(-0.61743*1))*((u1*(u2*u2))*u2)))+(0.0076678*u2)+(2.9942*u3); x-axis: sample)

5.8 Summary

In this chapter, we demonstrated several case studies covering a spectrum of models handled by DACS-GP: algebraic, implicit algebraic, dynamical systems reducible to static systems (using lagged variables) and differential equation systems (state space modeling). The case studies included tutorial problems to understand the working of GP, benchmark problems to compare DACS-GP with other GP implementations, simulated data obtained from first-principles models, and laboratory data. The results of the case studies are summarized in Table 5.27. According to these results, GP was found to be suitable for solving a wide range of nonlinear system identification problems. A novel integration of GP with nonparametric data analysis tools was also illustrated in Case Study 12.

Table 5.27 Summary of case studies (for each case study: the best GP model expression and its RMSE, the true model expression and its RMSE, and the CPU time in minutes).
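Several of the differential equation case studies above (for example, Case Study 10) rely on data simulated from a known state space model. As a hedged illustration of how such data can be generated in MATLAB, the sketch below integrates the classical Lotka-Volterra predator-prey equations with ode45; the parameter values, initial state and sampling grid are assumptions for the example, not necessarily the values used in the thesis.

```matlab
% Generating simulated state data from the classical Lotka-Volterra form
%   dz1/dt = a*z1*(1 - z2),   dz2/dt = b*z2*(z1 - 1)
% using ode45. Parameters, initial state and sampling grid are illustrative
% assumptions, not necessarily those used for Case Study 10.
a = 3.0;  b = 1.0;
lv = @(t, z) [a*z(1)*(1 - z(2));
              b*z(2)*(z(1) - 1)];

z0    = [1.2; 1.1];                 % assumed initial condition
tGrid = 0:0.1:10;                   % assumed sampling instants
[t, z] = ode45(lv, tGrid, z0);

zMeas = z + 0.01*randn(size(z));    % small additive measurement noise
plot(t, zMeas);  xlabel('time');  ylabel('z_1, z_2');
```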
up to a maximum of “slave” computers There are several research publications in parallelization of GP Andre and Koza (1998) show the benefits of parallel implementation of genetic programming Salhi et al., (1998) describes parallel implementation of genetic programming based tool for symbolic regression This means that we may have to migrate from MATLAB platform to Fortran / C / C++ This migration is also needed to make the software more “user friendly” by providing an easy to use user interface The availability of domain specific knowledge can narrow the search space and may lead to more meaningful models Physical insights and results of nonparametric methods should be used whenever possible Also, when data is highly correlated, multivariate statistical tools such as principal components analysis (PCA) should be employed as preprocessing tools The PCA model must then be integrated into the GP system Checks for asymptotic behavior of models should be included and made use of in parameter estimation (to specify the constraints on the search space) In the calculation of fitness, we could also include a penalty if the expected asymptote nature is not met As a means of improving the efficiency of GP, one can even consider using the “ACE transformed” variables as a terminal gene in GP 123 There can be other interesting applications of the GP methodology in the control area Given a first principles model of an interacting multivariable system, we can use GP to find transformations of the manipulated variables and/or controlled variables so as to make the system linear (i.e transform from a nonlinear system in original coordinates to a linear or “flat” system in altered coordinates) and/or non-interacting (i.e transform from a interacting system in the original coordinates to a less interacting or noninteracting system in the “new” coordinates) Measures of linearity/nonlinearity and interaction need to be incorporated for such applications These efforts can be useful in nonlinear and multivariable controller design 124 REFERENCES [Andre and Koza 1998] Andre D and Koza J R., “A parallel implementation of genetic programming that achieves super-linear performance”, Journal of Information Science 106, pp 201-218, 1998 [Ashour et al., 2003] Ashour, A F., L F Alvarez and V V Toropov, “Empirical Modelling of Shear Strength of RC Deep Beams by Genetic Programming”, Computers and Structures, 81, pp 331-338, 2003 [Banzhaf et al., 1998] Banzhaf, W., Frank D Francone, R E Keller and P Nordin, “Genetic Programming – an introduction: on the automatic evolution of computer programs and its application”, Morgan Kaufmann Publishers, Inc USA, 1998 [Bard 1974] Bard, Y., “Nonlinear Parameter Estimation”, Aceademic Press, New York, NY, 1974 [Bates and Watts 1988] D.M Bates and D.G Watts, “Nonlinear Regression Analysis and its Applications”, John Wiley and Sons, New York, 1988 [Box and Jenkins 1994] G E P Box and G M Jenkins “Time series analysis : forecasting and control”, San Francisco, Holden-Day , 1994 [Breiman and Friedman 1985] Breiman L and J H Friedman, “Estimating Optimal Transformations for Multiple Regression and Correlation (with discussion)”, Journal of the American Statistical Association 80, 580, 1985 [Bunday and Garside 1987] Bunday, B.D and Garside G.R “Optimisation Methods in Pascal”, Edward Arnold Publ., 1987 125 [Cao et al., 1999] Hongqing Cao, Jingxian Yu, Lishan Kang, Yuping Chen and Yongyan Chen “The kinetic evolutionary modeling of complex systems of chemical reactions” Computers & Chemistry, 
23(2), pp 143-152, 1999 [Davidson et al., 2003] Davidson, J W., Savic, D A and Walters G A., “Symbolic and numerical regression: experiments and applications”, Information Sciences, 150 (1-2), pp 95-117, March 2003 [DeVeaux et al., 1993] DeVeaux R.D., Psichogios D.C., and L.H Ungar, “A comparison of Two Nonparametric Estimation Schemes: MARS and Neural Networks”, Comp & Chem Engg., 17(8), pp 819-837, 1993 [Englezos and Kalogerakis 2001] Englezos, P., and Kalogerakis, N., “Applied parameter estimation for chemical engineers”, Marcel Dekker, Inc New York, 2001 [Eskinat et al., 1991] Eskinat, E., J., S H and Luyben, W L “Use of Hammerstein models in identification of nonlinear system”, A.I.Ch.E J 37(2), pp 255-268 1991 [Esposito et al., 2000] Esposito, W R and Floudas, C A., “Global Optimization for the Parameter Estimation of Differential-Algebraic Systems”, Ind Eng Chem Res., 39, pp 1291-1310, 2000 [Friedman 1991] Friedman, J H “Multivariate Adaptive Regression Splines (with discussion), Annals of Statistics”, 19, pp 1-141, 1991 [Gao and Loney 2001] Gao, L and N W Loney, “Evolutionary Polymorphic Neural Network in Chemical Process Modelling”, Comp & Chem Engg., 25, pp 14031410, 2001 [Goldberg 1989] Goldberg, D.E., “Genetic Algorithms in Search, Optimization and Machine Learning”, Addison-Wesley, 1989 126 [Gray et al., 1996] Gray, G J., D J Murray-Smith, Y Li and K C Sharman, “Nonlinear Model Structure Identification using Genetic Programming and a Block Diagram Oriented Simulation Tool”, Electronic Letters, 32, pp 1422-1424, 1996 [Greeff and Aldrich 1998] Greeff, D J and C Aldrich, “Empirical Modelling of Chemical Process Systems with Evolutionary Programming”, Comp & Chem Engg., 22(7-8), pp 995-1005, 1998 [Grosman and Lewin, 2002] Grosman, B and D R Lewin “Automated nonlinear model predictive control using genetic programming”, Comp & Chem Engg, 26, pp.631–640 2002 [Hall and Seborg 1989] Hall R.C and Seborg D.E “Modeling and multi-loop control of a multivariable pH neutralization process: Part I: Modeling and multiloop control”, In Proceedings of the American Control Conference, Pittsburgh, 1989 [Henson and Seborg 1994] Henson, A M and Seborg, E D., “Adaptive nonlinear control of a pH neutralization process”, IEEE Trans Control Sys Tech (3), 1994 [Holland 1975] Holland, J H., “Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence”, MIT Press, Cambridge, MA 1975 [Hong and Rao 2003] Hong, Y-S and B Rao, “Evolutionary self-organising modeling of a municipal wastewater treatment plant”, Water Research, 37, pp 1199-1212, 2003 [Horst and Pardalos 1995] Reiner Horst and Panos M Pardalos (editors) “Handbook of global optimization”, Dordrecht; Boston, Kluwer Academic Publishers, 1995 127 [Iba 1993] Iba, H., T Kurita and T Sato, “System Identification using Structured Genetic Algorithms”, Proceedings of the 5th International Joint Conference on Genetic Algorithms, pp 41-53, 1993 [Iba 1994] Iba, H., T Sato and Hugo de Garis, “System Identification Approach using Genetic Programming”, Proceedings of the First IEEE Conference on Evolutionary Computation, 1, pp 401-406, Orlando, Florida, June 27 – June 29, 1994 [Ingber 1989] Ingber, L “Very fast simulated re-annealing”, Journal of Mathematical Computer Modelling, 12, 967-973, 1989 [Jeong 1996] Jeong, W and Lee, J “Adaptive Simulated Annealing Genetic Algorithm for system identification”, Engg Applic Artif Intell 9(5), pp 523-532, 1996 [Kaboudan 2003] Kaboudan, M A., 
‘Forecasting with Computer-Evolved Model Specifications: a Genetic Programming Application”, Computers and Operations Research, 30, 1661-1681, 2003 [Kirkpatrick 1983] Kirkpatrick S Jr., C.D Gelatt, M Vecchi, “Optimization by Simulated Annealing”, Science, 220 (4598), pp 671–680, 1983 [Koza 1992] John R Koza “Genetic programming : on the programming of computers by means of natural selection”, Cambridge, Mass., MIT Press, 1992 [Koza 1999a] Koza, J R., F H Bennet III, M A Kleane, “Genetic Programming III: Darwinian Invention and Problem Solving”, Morgan Kaufmann, San Francisco, CA, 1999 [Koza 1999b] Koza, J R., F H Bennet III, W Mydlowec, M A Kleane, J Yu, O Stiffelman “Searching for the impossible using Genetic Programming”, GECCO99, Morgan Kaufmann, San Francisco, CA, pp 1083-1091, 1999 128 [Lang 1995] Lang, K J “Hill climbing beats genetic search on a Boolean circuit synthesis of Koza’s”, In Proceeding of the twelfth international conference on machine learning, Tahoe, CA Morgan Kaufmann, San Francisco, CA [Langdon and Poli 2002] Langdon, William B., Riccardo, Poli “Foundations of genetic programming”, New York, Springer, 2002 [Lakshminarayanan et al., 1995] Lakshminarayanan S., Shah L Sirish and Nandakumar K “Identification of Hammerstein models using multivariate statistical tools”, Chem Engg Sci., 50 (22), pp 3599-3613, 1995 [Lakshminarayanan 2000] Lakshminarayanan, S., H Fujii, B Grosman, E D and D.R Lewin., “New product design via analysis of historical databases”, Comp & Chem Engg., 24, pp 671-676, 2000 [Ljung 1989] Lennart Ljung “System identification : theory for the user”, Englewood Cliffs, NJ : Prentice-Hall , 1987 [Ljung 1999] Lennart Ljung “System identification : theory for the user”, Upper Saddle River, N.J : Prentice Hall, 1999 2nd ed [Luus 1998] Luus, R “Parameter Estimation of Lotka-Volterra Problem by Direct Search Optimization”, Hung J Ind Chem 26, 287 1998 [Marenbach 1998] Marenbach P “Using prior knowledge and obtaining process insight in data based modeling of bioprocesses”, SAMS, 31 pp 39-59, 1998 [McKay 1997] McKay, B., Willis, M and Barton, G., “Steady-state modeling of chemical processing system using genetic programming”, Computer chem Engng 21 (9), pp 981-996, 1997 129 [McPhee and Miller 1995] McPhee, N F and Miller, J D “Accurate replication in genetic programming”, In Eshelman, L., Editor, Genetic Algorithms: Proceedings of the Sixth International Conference (ICGA95), pp 303-309, 1995 [More 1997] More´, J “The Levenberg–Marquardt Algorithm: Implementation and Theory” Numerical Analysis, in: G.A Watson (Ed.), Lecture Notes in Math, vol 630, Springer Verlag, Berlin, 1977 [Salhi et al., 1998] Salhi A., H Glaser and D De Roure 1998 “Parallel implementation of a genetic-programming based tool for symbolic regression”, Information Processing Letters, 66, pp 299-307, 1998 [Sen and Stoffa1995] Sen, M K and Stoffa, P L “Global Optimization Methods in Geophysical Inversion”, Elsevier, Amsterdam, pp 294, 1995 [Srinivas and Deb1995] Srinivas, N., and Deb, K “Multiobjective function optimization using nondominated sorting genetic algorithms”, Evolutionary Computation, 2, pp 221–248, 1995 [Storn and Price 1997] Storn R and Price K “Differential Evolution - A simple and efficient adaptive scheme for global optimization over continuous space” Journal of Global Optimization, 11, pp 341–359, 1997 [Swain and Morris 2003] A K Swain and A S Morris “An evolutionary approach to the automatic generation of mathematical models”, Applied Soft Computing, (1), pp 1-21, 2003 [Tan et al 
2003] K C Tan, Q Yu, C M Heng and T H Lee “Evolutionary computing for knowledge discovery in medical diagnosis”, Artificial Intelligence in Medicine, 27 (2), pp 129-154, 2003 130 [Tang and Li 2002] K Tang and T Li “Combining PLS with GA-GP for QSAR”, Chemometrics and Intelligent Laboratory Systems, 64 (1), pp 55-64, 2002 [Tjoa and Biegler 1991] Tjoa, T B.; Biegler, L T “Simultaneous Solution and Optimization Strategies for Parameter Estimation of Differential-Algebraic Equation Systems”, Ind Eng Chem Res., 30, 376, 1991 [Torni et al 1999] A Törni, M.M Ali and S Viitanen “Stochastic Global Optimization: Problem Classes and Solution Techniques”, Journal of Global Optimization, 14, pp 437–447, 1999 [Watson and Parmee 1996] Watson, A and Parmee, I “System identification using genetic programming” in Proceeding of ACEDC’96 PEDC, University of Plymouth, UK 1996 [Willis 1997] Willis, M., Hiden, H., Hinchliffe, M., McKay, B., and Barton, G W “Systems modeling using genetic programming”, Comp & Chem Engg., 21, S1161-1166 Supplemental, 1997 [Yee et al., 2003] Yee K Y A., A K Ray, G P Rangaiah “Multiobjective optimization of an industrial styrene reactor”, Computers and Chemical Engineering 27, pp 111 - 130, 2003 131 [...]... states of the system and limits what the system can learn” Figure 2.1 is a flowchart for genetic programming paradigm The flowchart contains a loop executing multiple independent runs of genetic programming The important parameters to be initialized are: number of runs, number of generations, population size, probability of genetic operations and specification for termination criterion Genetic programming. .. of genetic programming and hopefully to be solved as computational power of computers increase Success of genetic programming requires good fitness function that guides the evolution In some cases, formulating a fitness function is as difficult as solving the problem With lack of strong theory how genetic programming work, there is no way of definitive improvement Current improvements on genetic programming. .. concept of genetic programming and its basic genetic algorithm is explained Genetic programming works on a population of individuals, each of which represents a potential solution A population of model structures represented as a tree evolves through many generations towards a solution using certain evolutionary operators and a ‘survival-of-the-fittest’ selection scheme Each essential step in genetic programming: ... ‘survival-of-the-fittest’ selection scheme Each essential step in genetic programming: genetic operators, fitness functions and selection methods are explained in detail Finally, genetic programming theory was presented and limitations of genetic programming were pinpointed 20 Chapter 3 APPLICATION OF GENETIC PROGRAMING TO SYSTEM IDENTIFICATION Never think of knowledge and wisdom as little Seek it and store... example files is also attached to this thesis 5 Chapter 2 THE BASICS OF GENETIC ALGORITHMS AND GENETIC PROGRAMMING Listening and noting well enriches knowledge Knowledge enhances progress – Loka Niti 2.1 Introduction The general task of a system identification problem is to approximate the input-output behavior of an unknown process system using an appropriate model Appropriate modeling components must be... 
topic of genetic algorithm (GA) since GP is based on it This review is followed by an introduction to genetic programming wherein a common form of GP algorithm is explained in detail The chapter ends by pointing out the limitations or shortcomings of the genetic programming methodology 2.2 Overview of Genetic Algorithms (GA) John Holland's pioneering book “Adaptation in Natural and Artificial Systems”... be very hard for a GP system to generate models without any idea of what a given argument or function could mean to the output These shortcomings could be at least partially addressed by employing efficient genetic operators as we demonstrate in the next chapter In this chapter, our primary focus will be on developing an efficient genetic programming system tailored for system identification tasks 23... while maintaining the diversity of the gene pool 2.3 Genetic Programming Genetic programming is an extension of the conventional genetic algorithm in which each individual in the population is a computer program The most important characteristic of the GP is the use of tree structure representation scheme Koza (1992) criticized the limitation of genetic algorithm representation by noting that the “representation... rules etc.) A relatively recent addition to this family is the genetic programming (GP) that has its origins in genetic algorithms (GA) (Holland, 1975) GP is a biologically inspired, domain-independent method that genetically breeds populations (of computer programs, mathematical formulae etc.) to solve problems The first runs of genetic programming were made in 1987 but the first detailed description... has been used for in chemical engineering and other domains Iba et al., (1993, 1994) use GP for system identification based on data from static and dynamical simulated systems (interestingly, in their 1993 paper, they do not use the term genetic programming They refer to their method as “a variant of genetic algorithm which permits a GA to use structured representations”) Their method (STROGANOFF) ... data Identification for state space model using genetic programming is a relatively new area of application which has been attempted here Genetic programming (GP) is a powerful tool for system identification. .. selection 18 2.4 Genetic programming theory 18 2.5 Shortcomings and limitations of genetic programming 19 ii 2.6 Summary 20 CHAPTER 3: APPLICATION OF GENETIC PROGRAMING TO SYSTEM IDENTIFICATION. .. THE BASICS OF GENETIC ALGORITHMS AND GENETIC PROGRAMMING 2.1 Introduction 2.2 Overview of Genetic Algorithms (GA) 2.3 Genetic Programming 2.3.1 Initializing a GP population 13 2.3.2 Genetic operator

Ngày đăng: 26/11/2015, 23:07

Xem thêm: Nonlinear system identification using genetic programming

TỪ KHÓA LIÊN QUAN

Mục lục

    1.2 Outline of the Thesis

    2.2 Overview of Genetic Algorithms (GA)

    2.3.1 Initializing a GP population

    2.5 Shortcomings and limitations of genetic programming

    4.2 Choice of programming language

    4.3 Data structure for gene and chromosome

    4.11 Guidelines for configuring GP parameters

    5.2 Identification of algebraic systems

    Case study 1: Enzyme kinetics data (Puromycin data from Bate

    Results of the runs

TÀI LIỆU CÙNG NGƯỜI DÙNG

TÀI LIỆU LIÊN QUAN