PARTICLE SWARM OPTIMIZATION IN MULTI-AGENTS COOPERATION APPLICATIONS

XU LIANG
(B.Eng., Nanjing University of Aeronautics and Astronautics)

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF ENGINEERING
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE
2003

Acknowledgements

To my supervisors, Dr Tan Kay Chen and Dr Vadakkepat Prahlad: their patient and instructive guidance has shown me that for every challenge on one side, there is a solution on the other side.

To my friends and fellows in the Control & Simulation Lab: I have benefited so much from their valuable assistance and discussions. I am really lucky to have so many sincere friends here.

Special thanks to the National University of Singapore for the research scholarship, library facilities, research equipment, and an enthusiastic research atmosphere. This is an ideal campus for study, research and life.

Finally, my gratitude goes to my family for their firm support and unreserved love, which have made my life abroad such an enjoyable experience.

Summary

In the past decades, rapid progress has been made in the development of individual intelligence. This progress has made group intelligence, which is built on individual intelligence, applicable and therefore more attractive. Most current research on group intelligence concentrates on externally driven group intelligence, whereas inner-motivated group intelligence is still more a research direction than a research topic. However, in many circumstances, especially in an isolated environment, externally driven cooperation is not applicable and inner-motivated group intelligence is necessary. FAMAC (Fully Automatic Multi-Agents Cooperation), presented in this thesis, is designed to explore inner-motivated group intelligence so as to offer multi-agents the ability
to perform autonomous cooperation independently of external instructions.

In the first part of this thesis, the origin, principles, and structure of FAMAC are described in detail. Human cooperation in the game of soccer is studied and the principles of human cooperation are transplanted into FAMAC. For this reason, the FAMAC strategy adopts a structure which combines distributed control with global coordination and comprises three functional units: the Intelligent Learning and Reasoning Unit (ILRU), the Intelligent Analyzing Unit (IAU) and the Central Controlling Unit (CCU). Equipped with the ILRU and IAU, intelligent individuals are supposed to be capable of thinking, analyzing and reasoning. The CCU, in turn, helps to coordinate the group behavior.

In the second part, the two main components of FAMAC, the ILRU and IAU, are detailed. Additional background on neural networks and fuzzy logic, as well as their functions and applications in the IAU and ILRU, is covered in this part.

A series of simulations is conducted and analyzed in the third part. These simulations are designed to validate the feasibility of FAMAC and to compare the effectiveness of the M2PSO-Network with other computational algorithms in the training of FAMAC. In the simulations, a significant improvement is achieved by the multi-agents system that adopts the FAMAC strategy. A further improvement is achieved after the introduction of the M2PSO-Network into FAMAC. These experimental results show that inner-motivated group intelligence, whether or not in the form of FAMAC, is realizable and is effective in promoting the capacity of multi-agents as a united team.

Contents

Acknowledgements
Summary
Contents
List of Figures
List of Tables
List of Abbreviations
1 Introduction
  1.1 Overview: the Main Task
  1.2 Outline of Thesis
2 Background Knowledge
  2.1 Agents, Multi-Agents System, and Multi-Agents Cooperation
  2.2 A Review of MAC
  2.3 Intelligent Computation Algorithms in this Thesis
    2.3.1 Fuzzy Logic
    2.3.2 Neural Network
    2.3.3 Genetic Algorithm
    2.3.4 Particle Swarm Optimization
3 Fully Automatic Multi-Agents Cooperation (FAMAC)
  3.1 The Proposed FAMAC
    3.1.1 Origination of Idea of FAMAC
    3.1.2 System Structure of FAMAC
  3.2 The Intelligent Analyzing Unit (IAU)
    3.2.1 Functions of IAU
    3.2.2 Fuzzification
    3.2.3 Fuzzy Rules
    3.2.4 Aggregation of Outputs and Defuzzification
  3.3 Intelligent Learning and Reasoning Unit (ILRU)
    3.3.1 Functions of ILRU
    3.3.2 Optimization for Neural Network
    3.3.3 Structure of M2PSO-Network
    3.3.4 Training Process of M2PSO-Network
4 Simulations
  4.1 Simulation Facilities
  4.2 The Simulation Platform for FAMAC
    4.2.1 General Description of Platform
    4.2.2 Agents' Actions and Cooperation
5 Results and Discussions
  5.1 Test of PSO in Global Optimization for NN
  5.2 Performance of FAMAC in Static Cooperation
  5.3 Comparison between M2PSO-Network and Neural Network in FAMAC
  5.4 Dynamic Cooperation of FAMAC with M2PSO-Network
6 Conclusions
References
Author's Publications

List of Figures

Fig.1 First rank of MAC: Passive cooperation
Fig.2 Second rank of MAC: Semi-autonomous cooperation
Fig.3 Application of Fuzzy Logic into Tipping problems
Fig.4 Particle Swarm Optimization
Fig.5 Illustration of a typical training cooperation strategy learning through daily training in real soccer sports
Fig.6 Idea representation of FAMAC and its structure
Fig.7 An example of Fuzzification
Fig.8 IAU: Membership functions
Fig.9 Illustration of function of ILRU
Fig.10 Structure of neural network
Fig.11 One of the (3!) subspaces in a 3-dimension solution space
Fig.12 Multi-level Particle Swarm Optimization
Fig.13 M2PSO-Network
Fig.14 Functional decomposition of M2PSO-Network
Fig.15 Simulation platform
Fig.16 Box plot of training results
Fig.17 Outputs of trained Neural Networks and the tracking error
Fig.18 Weights of trained Neural Networks and the error against benchmark weights
Fig.19 Tracking error of Neural Network in the solution space
Fig.20 The membership function adjusting itself to the environment during simulation
Fig.21 Performance of FAMAC with respect to training
Fig.22 Comparison of learning performance between NN (BP/GA/PSO) and M2PSO
Fig.23 Step 1: Roles assignment according to initial status
Fig.24 Step 2: Roles reassignment according to new situation
Fig.25 Final result - Team A wins this round

Chapter 5 Results and Discussions

A number of bouts have been simulated. Here, the neural networks in the ILRU are trained by the BP algorithm. At the beginning, 100 bouts of simulation were carried out. The results were saved into the database of the IAU and analyzed. The ILRU was then trained using this database. Once the training of the ILRU was complete, FAMAC was upgraded with the new IAU and ILRU and another 100 bouts were simulated. This cycle was repeated, and the performance of FAMAC in every 100 bouts was compared. A full record of this training over a total of 2000 bouts is shown in Fig.21.

[Fig.21: Performance of FAMAC with respect to training. Vertical axis in %, from 40 to 90; horizontal axis from 200 to 2000.]

At the beginning, as both teams choose to cooperate randomly, the two teams are tied; each has a 50% chance of winning a round. However, as training goes on, significant progress is observed in the performance of the team equipped with FAMAC. The highest success rate, 86.75%, appeared at the end of the 2000 bouts.

5.3 Comparison of M2PSO-Network and Neural Network in FAMAC

The simulations in this section aim to improve FAMAC further, beyond its current basis of BP-trained neural networks.

[Fig.22: Comparison of learning performance between NN (BP/GA/PSO) and M2PSO. Learning error plotted against time; curves: NN(PSO), M2PSO-Network, NN(GA), NN(BP).]

As shown in Fig.22, because of the property of gradient descent, the tracking error of BP training initially drops much faster than that of any other method. As training goes on, the decrease of the BP tracking error slows down quickly, and no further decrease can be observed once it has reached a local optimum. GA, owing to its large number of individuals, shows the smallest tracking error of all methods at the beginning. After that, however, the decrease of its tracking error is neither rapid nor lasting. In PSO training of the neural network, the tracking error drops much more slowly than with BP, but the drop lasts much longer. So although each step makes little improvement in tracking performance, the reduction accumulated over a long time leads to a lower tracking error than BP. In terms of speed, the M2PSO-Network is faster than GA and PSO and slower than BP. The decrease in its tracking error lasts much longer than with any other method.

Five simulations, each comprising 1000 bouts of the game, are carried out to evaluate the agents' ability to think while working and their adaptability to the dynamic environment that changes continuously
all the time. In the first simulation there is no intelligent cooperation in either team. In the following simulations, FAMAC realized by BP-trained neural networks, GA-trained neural networks, PSO-trained neural networks and the M2PSO-Network is implemented respectively.

Table: Results of 1000 matches before and after training

Training method        Goals of our team   Goals of opponent   Rate of Win/Lose
Untrained              487                 513                 0.95
Neural Network (BP)    874                 126                 6.94
Neural Network (GA)    891                 109                 8.17
Neural Network (PSO)   907                 93                  9.75
M2PSO-Network          924                 76                  12.16

This comparison is not a direct one, since each method only competes against the same third-party random cooperation strategy. The next table shows the result of a direct comparison between FAMAC using the M2PSO-Network and FAMAC using a neural network:

Table: Direct comparisons between M2PSO and PSO/BP

Team                     Bouts won (of 600)   Win ratio   Value of Win/Lose
Team A (M2PSO-Network)   386                  64.33%      1.459
Team B (BP)              214                  35.67%
Team A (M2PSO-Network)   352                  58.67%      1.419
Team B (GA)              238                  41.33%
Team A (M2PSO-Network)   332                  55.33%      1.239
Team B (PSO)             268                  44.67%

5.4 Dynamic Cooperation of FAMAC with M2PSO-Network

Further simulations were carried out on the dynamic cooperation of the multi-agents system. In dynamic cooperation, agents are required to cooperate continuously from the beginning to the end of one bout of the game. Matches were simulated between two teams. In these simulations, at each step, the agents can obtain and analyze their new situations in the environment and exchange their roles for better performance. To limit the size of the database so as not to slow down the learning process, a First-In-First-Out (FIFO) rule is applied: old data is regarded as obsolete and is deleted from the database once the database is full. The figures below illustrate an example of the agents' step-by-step cooperation in a round of a match:

(1) In Fig.23, at the beginning, roles are intelligently assigned to the agents of our team according to the agents' initial states.
(2) In Fig.24, since a step of actions has been carried out, the states of the agents have changed and thus roles may need to be reassigned.
(3) In Fig.25, finally, an agent of team A reaches the target flag ahead of its opponent and wins a goal in a round of the match.

[Fig.23: Step 1: Roles assignment according to initial status. Legend: agent of team A, agent of team B, target flag; roles: Offender, Warder, Defender.]

[Fig.24: Step 2: Roles reassignment according to new situation. Legend: agent of team A, agent of team B, target flag; roles: Offender, Warder, Defender.]

[Fig.25: Final result: Team A reached the flag in the first place. Legend: agent of team A, agent of team B, target flag; roles: Offender, Warder, Defender.]

The results of the overall tests of dynamic cooperation are presented in the table. With more steps of cooperation, the success rate of team A has increased to 98.33%. The reason is that, as the number of cooperation steps increases, the chance that a team will run into proper cooperation strategies in all of these steps by luck is greatly reduced. No one can flip a coin to heads 100 consecutive times. Neither can agents choose right by chance all the way.

Table: Results of six rounds of matches after training

Chapter 6 Conclusions

We have proposed a new cooperation strategy, namely Fully Automatic Multi-Agents Cooperation (FAMAC), for multi-agents systems. FAMAC is made up of three units: the IAU, ILRU and CCU. These units correspond respectively to functional units of human intelligence: human analysis, human reasoning and a global coordinator. Three different training methods, BP, GA and PSO, were applied to train the neural network in the ILRU. A Multi-level Multi-step network (M2PSO-Network) is also
put forward to further improve the performance of the ILRU as well as that of FAMAC.

A number of important contributions have resulted from this work. First of all, the combination of the ILRU and IAU has enabled agents to think about, remember and analyze what has happened, what is happening and what is going to happen. These abilities have made MAC achievable. Secondly, the M2PSO-Network is introduced to take the place of the traditional neural network for the sake of better tracking performance. The comparison between them has confirmed this improvement.

Through the research in this thesis, some conclusions can be drawn:

(1) It is effective and applicable to decompose and reproduce team intelligence using three intelligent units: the ILRU, IAU and CCU. The ILRU is intrinsically a learning machine. With neural networks, the ILRU does well in tracking objects whose information is explicit and rational. However, in the material world, not all information is direct and explicit enough to be easily quantified. Such information needs to be fuzzified and then transformed into a numerical format. The IAU excels at this. The CCU, as an irreplaceable unit, resolves conflicts among agents and harmonizes the agents' behavior.

(2) More training leads to better performance until it reaches its peak. In a real soccer game, a team is more likely to succeed with more extensive training. The same holds in FAMAC. As the simulation results show, the success rate rises continuously as the training time increases. However, there is a threshold for this success rate. After this threshold the success rate increases very slowly with further training time. This is caused by many factors. Future work is expected to raise this threshold.

In general, a progressive method is developed in this thesis to generate suitable cooperation for a team pursuing a common goal. Since this method is not critical about the abundance of information, it will have a wide range of uses. As further research, both the ILRU with the M2PSO-Network and the IAU with fuzzy logic need improvement to suit a much more complicated environment. If this method is to be put into practice, the algorithm needs to be sped up to handle the fast changes in the positions and velocities of the robots and the ball.

References

[1] Dan L Grec, David C Brown, 'Learning by design agents during negotiation', 3rd International Conference on Artificial Intelligence in Design, Workshop on Machine Learning in Design, Lausanne, Switzerland, 1994.
[2] Ferreira, J.R. da S., Cavalcanti, J.H.F. and Alsina, P.J., 'Intelligent Tasks Scheduler', Proceedings of the International Joint Conference on Neural Networks (IJCNN-99), Washington, DC, USA, July 1999.
[3] Young D Kwon, Dong Min Shin, Jin M Won, et al., 'Multi agents cooperation strategy for soccer robots', FIRA'98, pp. 1-6, July 1998.
[4] J.E. Doran, S. Franklin, N.R. Jennings and T.J. Norman, 'On Cooperation in Multi-Agents Systems', The Knowledge Engineering Review, 12(3), pp. 309-314, 1997.
[5] Hamid R Berenji, David Vengerov, 'Learning, cooperation, and coordination in multi-agent systems', Intelligent Inference Systems Corp. technical report IIS-00-10, 2000.
[6] Paolo Pirjanian and Maja Mataric, 'A decision-theoretic approach to fuzzy behavior coordination', IEEE International Symposium on Computational Intelligence in Robotics and Automation, November 1999.
[7] Li Shi, Chen Jiang, Ye Zhen et al., 'Learning Competition in Robot Soccer Game based on an adapted Neuro-Fuzzy Inference System', Proceedings of the 2001 IEEE International Symposium on Intelligent Control, pp. 195-199, 2001.
[8] Il-Kwon Jeong and Ju-Jang Lee, 'Evolving fuzzy logic controllers for multiple mobile robots solving a continuous pursuit problem', Fuzzy Systems Conference Proceedings, vol. 2, pp. 685-690, 1999.
[9] Xu, L., Huang, J.Q., 'Dynamic identification with neural networks for aircraft engines in the full envelope', Journal of NUAA,
Vol.33, No.4, pp. 334-338.
[10] Kennedy, J., Eberhart, R., 'Particle swarm optimization', Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942-1948, 1995.
[11] Yoshida, H., Kawata, K., Fukuyama, Y., Takayama, S. and Nakanishi, Y., 'A particle swarm optimization for reactive power and voltage control considering voltage security assessment', IEEE Transactions on Power Systems, vol. 15, no. 4, pp. 1232-1239, Nov 2000.
[12] Salerno, J., 'Using the particle swarm optimization technique to train a recurrent neural model', Proceedings of the Ninth IEEE International Conference on Tools with Artificial Intelligence, pp. 45-49, 1997.
[13] Suzuki, K., Kitagawa, S. and Ohutchi, A., 'Incremental evolution of weight modifiers for Neural Networks in cooperative behavior generation', IEEE SMC '99 Conference Proceedings (Systems, Man, and Cybernetics), vol. 6, pp. 121-126, 1999.
[14] Chen, L.H., 'A Global Optimization Algorithm For Neural Network Training', Proceedings of 1993 International Joint Conference on Neural Networks (IJCNN '93-Nagoya), pp. 443-446, 1993.
[15] Berenji, H.R. and Vengerov, D., 'Cooperation and coordination between fuzzy reinforcement learning agents in continuous state partially observable Markov decision processes', Fuzzy Systems Conference Proceedings, vol. 2, pp. 621-627, 1999.
[16] João Sequeira, Pedro Lima, M. Isabel Ribeiro, et al., 'Behavior-based cooperation with application to space robots', Proceedings of 6th ESA Workshop on ASTRA 2000, Noordjwikerhout, The Netherlands, 2000.
[17] Shang, Y., Wah, B.W., 'Global optimization for neural network training', Computer, vol. 29, pp. 45-54, March 1996.
[18] Fukuda, 'Coordinative Behavior by Genetic Algorithm and Fuzzy in Evolutionary Multi-Agent System', Proceedings of the 1993 IEEE International Conference on Robotics and Automation, vol. 1, pp. 760-765, 1993.
[19] Xu, L., Tan, K.C., Vadakkepat, P., Lee, T.H., 'Multi-Agents Competition and Cooperation Using Fuzzy Neural Systems', ASCC2002, pp. 1326-1331.
[20] Man-Wook Han and Kopacek, P., 'Neural networks for the control of soccer robots', Proceedings of the 2000 IEEE International Symposium on Industrial Electronics (ISIE 2000), vol. 2, pp. 571-575.
[21] Narendra, K.S., Parthasarathy, K., 'Identification and Control of Dynamical Systems Using Neural Networks', IEEE Trans. on Neural Networks, vol. 1, no. 1, pp. 4-27, March 1990.
[22] Huang, J.Q., Xu, L., Lewis, F.L., 'Neural Network Smith Predictive Control for Telerobots with Time Delay', Transactions of NUAA, Vol.18, No.1, pp. 35-40, 2001.

Author's Publications

The author has contributed to the following publications:

Xu, L., Tan, K.C., Vadakkepat, P., Lee, T.H., 'Multi-Agents Competition and Cooperation Using Fuzzy Neural Systems', ASCC2002, pp. 1326-1331.

Xu, L., Tan, K.C., and Vadakkepat, P., 'A Fully Automatic Multi-agents Cooperation Strategy using M2PSO-Network', submitted.

List of Abbreviations

... Logic
Genetic Algorithm
Intelligent Analyzing Unit
Intelligent Learning and Reasoning Unit
Multi-Agents
Multi-Agents Cooperation
Multi-Agents System
Multi-level Particle Swarm Optimization
Multi-level Multi-step Particle Swarm Optimization
Neural Network
Particle Swarm Optimization

Chapter 1 Introduction

1.1 Overview: The main tasks

Intelligent individuals, such as robots and flying vehicles, have become ...

... sharing individual knowledge, as well as temporary information, among all agents to overcome the inherent limitation of individual agents in identifying and solving complicated problems. In a word, agents in this system are required to communicate, negotiate, and coordinate with each other. In this manner, agents may be expected to work both independently and interactively. A typical example can be found in ...
... training process is shown in the figure below:

[Fig.5: Illustration of a typical training cooperation strategy learning through daily training in real soccer sports. Diagram labels: Soccer Field, step 1, step 2, step 3, Individual Perception, Individual Analysis, Individual Reasoning, Global Coordinating.]

At the first step of this process, each individual player tries to explore the working environment by itself. Here, the working ...

... enabling agents to learn to cooperate independently of human instruction and to be capable of adapting to a dynamic environment. A fully autonomous multi-agents cooperation strategy, namely FAMAC, is proposed in this thesis. Agents adopting the FAMAC strategy are expected to behave like social beings as a result of the introduction of three intelligent components: the Intelligent Learning and Reasoning Unit (ILRU), Intelligent ...

... rank of MAC: Passive cooperation. In this kind of cooperation, agents are individuals that are capable of doing something rather than thinking about something, and they have no idea about cooperation. Therefore, to design cooperation for such agents, the human designer needs to arrange everything about the cooperation by telling them what they should and should not do. For this reason, this cooperation is critical ...

... detail about the cooperation, their workload has been significantly cut down. According to this classification, research on semi-autonomous cooperation includes multiple-objective decision making based on behavior coordination and conflict resolution using fuzzy logic in [4]. In [5], the authors report a fuzzy reinforcement learning and experience sharing method for multi-agent learning in dynamic, complex ...

... defects of FAMAC are mentioned in this chapter. Following that, a retrospective of the research work done in this thesis is given.

Chapter 2 Background Knowledge

2.1 Agents, Multi-Agents System, and Multi-Agents Cooperation

An agent, a kind of intelligent individual, is a widely quoted concept in both academic research and technical applications. Since different definitions may be given when ...

... discussion, MAS has led agents to evolve from the initial natural individual into a social cell and has therefore made Multi-Agents Cooperation (MAC) possible. Multi-Agents Cooperation (MAC) is targeted at letting agents work together to achieve a common goal, minimizing their counterwork while maximizing their mutual support. The cooperation ranges from competitive cooperation, to antagonistic ...

... presented in [1]. A task-oriented approach and a motion-oriented approach are used for multi-robot cooperation in space [2]. On the other hand, in other kinds of fixed cooperation strategies, the roles of agents are not absolutely fixed; instead, they can show some variability while the agents are working in the environment. As in [3], a fixed role assignment is introduced for agents ...

... agent is an intelligent individual capable of perceiving, thinking, interacting and working. It can either have a real material body, such as a biological agent or a robot agent, or have an imaginary dummy body, such as a software agent. A Multi-Agents System (MAS) is a systematic integration of agents. The purpose of this integration is to make each agent informatively accessible ...
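As a small worked check of the "Rate of Win/Lose" column in the table of results for the 1000-match simulations, the sketch below recomputes each ratio as goals scored divided by goals conceded. The goal counts are copied from that table; the dictionary and function names are illustrative only, and reading the column as a simple quotient is an assumption that the reported figures happen to match.

```python
# Goal counts (our team, opponent) copied from the table
# "Results of 1000 matches before and after training".
results = {
    "Untrained": (487, 513),
    "Neural Network (BP)": (874, 126),
    "Neural Network (GA)": (891, 109),
    "Neural Network (PSO)": (907, 93),
    "M2PSO-Network": (924, 76),
}


def win_lose_ratio(own_goals, opponent_goals):
    """Assumed definition: goals scored divided by goals conceded."""
    return own_goals / opponent_goals


# Round to two decimals, the precision used in the table.
ratios = {
    name: round(win_lose_ratio(own, opp), 2)
    for name, (own, opp) in results.items()
}

for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:6.2f}")
```

Under this reading, every training method raises the ratio well above the untrained baseline, and the ordering BP < GA < PSO < M2PSO-Network follows directly from the goal counts.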
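The results chapter compares BP, GA and PSO as training methods, but this preview does not show the PSO update itself. For orientation only, here is a minimal sketch of the commonly used inertia-weight form of global-best PSO, in the spirit of [10]. It is not the thesis's M2PSO-Network: the function name `pso_minimize`, the parameter values (`w`, `c1`, `c2`, swarm size, bounds) and the sphere test function are all assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Global-best PSO sketch; all default parameters are illustrative."""
    rng = random.Random(seed)
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective standing in for a network's tracking error."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, dim=3)
print(best, best_val)
```

To train a neural network in this style, the position vector would hold the network weights and the objective would be the tracking error on the training data; how the thesis arranges this across levels and steps is not visible in the preview.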