Case-Based Learning Behavior in a Real Time Multi-Agent System

By Juan Luo

Presented to the Faculty of The Graduate College at the University of Nebraska
In Partial Fulfillment of Requirements for the Degree of Master of Science
Major: Computer Science
Under the Supervision of Professor Leen-Kiat Soh
Lincoln, Nebraska, April 2003

Case-Based Learning Behavior in a Real-Time Multi-Agent System
Juan Luo, M.S.
Department of Computer Science and Engineering, University of Nebraska-Lincoln, 2003
Advisor: Dr. Leen-Kiat Soh

ABSTRACT: A distributed multi-strategy learning methodology based on case-based reasoning, in which an agent conducts both individual learning by observing its environment and cooperative learning by interacting with its neighbors, is proposed in this Master's project. Cooperative learning is generally more expensive than individual learning due to the communication and processing overhead. Thus, our methodology employs a cautious utility-based adaptive mechanism to combine the two (cooperative learning and individual learning), an interaction protocol for soliciting and exchanging information, and the idea of a chronological casebase. Here we report on experimental results on the roles and effects of the methodology in a real-time, distributed and resource-constrained multi-agent environment.

Table of Contents
Chapter 1 Introduction
Chapter 2 Background
Chapter 3 Methodology
Chapter 4 Implementation
Chapter 5 Discussion of Results
Chapter 6 Future Work and Conclusions
References
Appendix A Tables
Appendix B Programmer's Manual
Appendix C User's Manual

List of Figures
Figure 1 Machine learning topology
Figure 2 CBR Life Cycle
Figure 3 The CBR module, the negotiation task, and the learning modules in our methodology for an agent
Figure 4 The incremental and refinement learning features of the individual learning strategy
Figure 5 The expected learning curve of an agent's casebase (four phases)
Figure 6 The cooperative learning strategy: usage history profiling, utility-based trigger, neighbor selection, interaction protocol, and case adaptation before learning
Figure 7 The class hierarchy that had been implemented in CBR for the original ANTS software
Figure 8 The new class hierarchy of CBR in our project
Figure 9 The case message passing process
Figure 10 Interface with other modules in the system
Figure 11 The experiment setup of CEA
Figure 12 The percentages of different initiating negotiation outcomes in ES1
Figure 13 The percentages of different responding negotiation outcomes in ES1
Figure 14 The percentages of different initiating negotiation outcomes in ES2
Figure 15 The percentages of different responding negotiation outcomes in ES2
Figure 16 The percentages of different initiating negotiation outcomes in ES3
Figure 17 The percentages of different responding negotiation outcomes in ES3
Figure 18 The percentages of different initiating negotiation outcomes in ES4
Figure 19 The percentages of different responding negotiation outcomes in ES4

List of Tables
Table 1 The usage history that an agent profiles of each case
Table 2 Utility of each outcome for a case
Table 3 Experiment sets. For example, in ES1, every agent has 16 cases in its casebase; and so on
Table 4 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, initiating casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 5 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, responding casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 6 The utility and difference gains of the agents' learning steps for Experiment 1, in which both individual and cooperative learning mechanisms are active
Table 7 The utility and difference gains of the agents' learning steps for Experiment 2, in which only individual learning is active
Table 8 Utility and difference gains for both Experiments 1 and 2, after the second stage, for initiating casebases
Table 9 Utility and difference gains for both Experiments 1 and 2, after the second stage, for responding casebases
Table 10 The number of deleted/replaced unused, pre-existing cases
Table 11 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, initiating casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 12 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, responding casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 13 The utility and difference gains of the agents' learning steps for Experiment 1, in which both individual and cooperative learning mechanisms are active
Table 14 The utility and difference gains of the agents' learning steps for Experiment 2, in which only individual learning is active
Table 15 The utility and difference gains of A1's learning steps for Experiment 1, in which both individual and cooperative learning mechanisms are active
Table 16 Utility and difference gains for both Experiments 1 and 2, after the second stage, for initiating casebases
Table 17 Utility and difference gains for both Experiments 1 and 2, after the second stage, for responding casebases
Table 18 The number of deleted/replaced unused, pre-existing cases
Table 19 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, initiating casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 20 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, responding casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 21 The utility and difference gains of the agents' learning steps for Experiment 1, in which both individual and cooperative learning mechanisms are active
Table 22 The utility and difference gains of the agents' learning steps for Experiment 2, in which only individual learning is active
Table 23 Utility and difference gains for both Experiments 1 and 2, after the second stage, for the initiating casebase
Table 24 Utility and difference gains for both Experiments 1 and 2, after the second stage, for responding casebases
Table 25 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, initiating casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 26 The results of the initial and final casebases after going through nonselective learning for two sets of experiments, responding casebases. Exp1 uses both the cooperative and individual learning mechanisms; Exp2 uses only the individual learning mechanism
Table 27 The utility and difference gains of the agents' learning steps for Experiment 1, in which both individual and cooperative learning mechanisms are active
Table 28 The utility and difference gains of the agents' learning steps for Experiment 2, in which only individual learning is active
Table 29 Utility and difference gains for both Experiments 1 and 2, after the second stage
Table 30 Utility and difference gains for both Experiments 1 and 2, after the second stage, for responding casebases
Table 31 Actual numbers of negotiations in ES1, ES2, ES3 and ES4
Table 32 Utility and difference gains for ES1
Table 33 Utility and difference gains for ES2
Table 34 Utility and difference gains for ES3
Table 35 Utility and difference gains for ES4

Chapter 1 Introduction

1.1 Introduction

A distributed multi-strategy learning methodology based on case-based reasoning, in which an agent conducts both individual learning by observing its environment and cooperative learning by interacting with its neighbors, is proposed in this Master's project. Cooperative learning is generally more expensive than individual learning due to the communication and processing overhead. Thus, our methodology employs a cautious utility-based adaptive mechanism to combine cooperative learning and individual learning. An interaction protocol for soliciting and exchanging information, a strategy for neighbor selection, and the idea of a chronological casebase are also implemented in our Master's project. Another important observation in our project is that agents show different learning behavior when situated in different environments. In particular, our research is built on a real-time and distributed multi-agent system (MAS).

Even though Machine Learning (ML) has been studied in the past, the research has been mostly independent of agent research and only recently received more attention in connection with agents and multi-agent systems. The ability to learn and adapt is one of the most important features of intelligence. It implies a certain degree of autonomy, which in turn requires the ability to make independent decisions, so the agents have to be provided with appropriate tools to make such decisions. In most dynamic domains a designer cannot possibly foresee all situations that an agent might encounter, and therefore the agent needs the ability to adapt to its environment. This is especially true for multi-agent systems. Consequently, learning is a crucial part of autonomy and thus a major focus of agent and multi-agent research.

Multi-agent systems also pose the problem of distributed learning, i.e., many agents learning separately to achieve a common goal. Existing learning algorithms have been developed for a single agent learning separately and independently; once the learning process is distributed among several learning agents, such algorithms require extensive modification. In distributed learning, agents need to cooperate and communicate in order to learn effectively. These issues are being investigated extensively by MAS researchers, but to date they have only started to receive attention in the area of learning. Overall, collaboration between MAS and ML research would be highly beneficial, and the two fields can definitely benefit from each other.
In our project, the learning behavior is embodied in case-based learning. At the same time, our system is implemented as a distributed multi-agent system. Each agent is capable of both individual and cooperative learning, rather than individual learning alone. Individual learning refers to learning based on an agent's perceptions and actions, without communicating directly with other agents in the environment; this mechanism allows an agent to build its casebase from its own experience, eventually forming its own area of specialization. Cooperative learning refers to learning through interaction among agents.
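To make the distinction concrete, the following is a minimal Java sketch of how an agent might combine the two learning modes. The class and method names (LearningAgent, learnIndividually, learnCooperatively, shouldTriggerCooperative) and the trigger condition are illustrative assumptions, not the actual ANTS implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a case learned from a negotiation episode.
class NegotiationCase {
    final String description;   // problem features of the episode
    final boolean successful;   // outcome of the negotiation
    NegotiationCase(String description, boolean successful) {
        this.description = description;
        this.successful = successful;
    }
}

// Hypothetical agent that prefers cheap individual learning and falls back
// to costly cooperative learning only when a utility-based trigger fires.
class LearningAgent {
    private final List<NegotiationCase> casebase = new ArrayList<>();
    private int episodes = 0;
    private int successes = 0;

    // Individual learning: observe the agent's own outcome and store it as a new case.
    void learnIndividually(NegotiationCase episode) {
        episodes++;
        if (episode.successful) successes++;
        casebase.add(episode);
    }

    // Cautious trigger (assumption): solicit neighbors only when the observed
    // success rate of the casebase falls below a threshold.
    boolean shouldTriggerCooperative(double threshold) {
        if (episodes == 0) return false;
        double successRate = (double) successes / episodes;
        return successRate < threshold;
    }

    // Cooperative learning: adopt cases offered by a neighbor (case adaptation
    // is omitted here) -- this is where the communication cost arises.
    void learnCooperatively(List<NegotiationCase> neighborCases) {
        casebase.addAll(neighborCases);
    }

    int casebaseSize() { return casebase.size(); }
}

public class DualLearningDemo {
    public static void main(String[] args) {
        LearningAgent agent = new LearningAgent();
        agent.learnIndividually(new NegotiationCase("track target T1", false));
        agent.learnIndividually(new NegotiationCase("form CPU coalition", false));
        if (agent.shouldTriggerCooperative(0.5)) {
            agent.learnCooperatively(List.of(
                new NegotiationCase("neighbor's tracking case", true)));
        }
        System.out.println("Casebase size: " + agent.casebaseSize());
    }
}
```

The point of the sketch is only the control flow: individual learning happens on every episode, while cooperative learning is gated by an explicit, tunable condition.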
1.2 Motivation

We propose a distributed multi-strategy (both individual and cooperative) learning methodology based on case-based reasoning (CBR) in a multi-agent environment. The motivation behind this is that agents try to learn to better solve a problem that an agent has failed to solve, or failed to solve satisfactorily, by itself. However, cooperative learning may be too costly or risky. The additional communication and coordination overhead may be too expensive or too slow for cooperative learning to be cost-effective or timely. Moreover, since an agent learns from its experience and its view of the world, its solution to a problem may not be applicable to another agent facing the same problem. This injection of foreign knowledge may also be risky, as it may add to the processing cost without improving the solution quality of an agent. This concern was evident in [2], in which the authors warned that some multi-agent environments could lead to significant role specialization of individuals, and that sharing experiences of individuals in different roles, or equivalently training individuals by letting them execute different roles, could sometimes be significantly detrimental to team performance. To prevent the multi-agent system from degrading team performance, we employ a cautious utility-based adaptive mechanism to combine cooperative learning and individual learning in our project. What we want to find out in this project is whether the combined learning (both cooperative and individual) can bring higher performance to the multi-agent system than individual learning alone. The research objectives to study are:

(1) the effects of cooperative learning on subsequent individual learning,
(2) the roles of cooperative learning in agents of different initial knowledge,
(3) the feasibility of our multi-strategy learning methodology, and
(4) agents' adaptability to different problem domains/environments.

1.3 Problem Domain

Our problem domain is a multi-agent system with multiple agents that perform multi-sensor target tracking and adaptive CPU reallocation in a noisy environment (simulated by a Java-based program called RADSIM [21]). Each agent has the same capabilities: it controls a sensor located at a unique position and can activate the sensor to search and detect the environment. When an agent detects a moving target, it tries to form a tracking coalition by cooperating with at least two neighbors, and this is why a CPU shortage may arise: the activity may consume more CPU resources. When an agent detects a CPU shortage, it needs to form a CPU coalition to address the crisis. Agents are situated in a dynamic and unpredictable environment. Their knowledge of the state of the world is likely to be incomplete and at least partially inaccurate. They must notice and respond rapidly enough to important changes in the environment; otherwise, they will miss their chances or fail to avoid undesirable outcomes. Moreover, the bounded resources of the agents (both CPUs with limited speed and memory, and physically limited sensors) add to the difficulty of building the multi-agent system and limit the intelligence the agents can have in the system.

1.4 Brief Description of Our Approach

Our methodology employs a cautious utility-based adaptive mechanism to combine cooperative learning and individual learning, an interaction protocol for soliciting and exchanging information, and the idea of a chronological casebase. In our multi-agent environment, agents negotiate to collaborate on real-time tasks such as multi-sensor target tracking.
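As a rough illustration of what a chronological casebase with per-case usage history might look like, here is a short Java sketch. The class and field names (CaseUsageHistory, timesRetrieved, timesSucceeded, lastUsedStep) and the simple FIFO eviction are assumptions made for this sketch, not the thesis implementation, which replaces unused, pre-existing cases.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the usage history an agent might profile for each case
// (the actual fields tracked in the thesis may differ).
class CaseUsageHistory {
    int timesRetrieved = 0;   // how often the case was retrieved
    int timesSucceeded = 0;   // how often it led to a successful negotiation
    long lastUsedStep = -1;   // logical time of last use

    void recordUse(boolean success, long step) {
        timesRetrieved++;
        if (success) timesSucceeded++;
        lastUsedStep = step;
    }

    double successRate() {
        return timesRetrieved == 0 ? 0.0 : (double) timesSucceeded / timesRetrieved;
    }
}

// A chronological casebase: cases are kept in the order they were learned,
// so older entries can be identified and replaced first.
class ChronologicalCasebase<C> {
    private static class Entry<C> {
        final C theCase;
        final CaseUsageHistory history = new CaseUsageHistory();
        Entry(C c) { theCase = c; }
    }

    private final Deque<Entry<C>> entries = new ArrayDeque<>();
    private final int capacity;

    ChronologicalCasebase(int capacity) { this.capacity = capacity; }

    // Learning a new case may evict the chronologically oldest entry when the
    // casebase is full (pure FIFO here only to keep the sketch short; the
    // thesis targets unused, pre-existing cases instead).
    void learn(C newCase) {
        if (entries.size() >= capacity) {
            entries.pollFirst();
        }
        entries.addLast(new Entry<>(newCase));
    }

    int size() { return entries.size(); }
}
```

A chronological ordering plus per-case usage statistics is what allows a cautious mechanism to judge whether learned cases are actually paying off.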
...

The tracking-related tasks are more demanding and also durational, so tasks like this lead to a lot of cooperative learning. The agent is thus encouraged to perform more cooperative learning, but its individual learning is weakened. We can see that the type of task does affect the learning behavior of agents. The environment impacts the initiating and responding roles differently, especially for negotiations associated with tough requirements (such as at least three members in a tracking coalition). Since an initiating agent has to shoulder the coalition management and decision making, it is able to learn more, and more diverse and useful, cases.

Chapter 6 Future Work and Conclusions

6.1 Future Work

In Section 3.1.2, we discuss the expected learning curve of an agent's casebase, which is divided into four phases. However, we have not performed any analysis on our experiments to show that the agents adhere to this expected learning curve. We will perform the related analysis in the future, because this learning curve is a very important characteristic of an agent's casebase and it can help us understand an agent's learning behavior more thoroughly.

In Section 4.1.3, COOPERATIVE_TRIGGER, with a value ranging from 0 to 1.0, is used as a threshold for the success rate and the rate of incurring new cases. Its current value is 0.5. We can change this value based on need; for example, if cooperative learning is costly, we can set this value lower to prevent cooperative learning from happening too often. Cooperative learning may be costly when the problem is difficult and the communication is noisy or delayed. In the future, we can investigate this value to obtain more satisfactory results based on need.
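A hedged sketch of how such a threshold check could be wired in follows. The exact way the thesis combines the success rate and the new-case rate against COOPERATIVE_TRIGGER is not shown in this excerpt, so the condition below, firing when either rate drops below the threshold, is only one plausible reading.

```java
// Illustrative only: how a COOPERATIVE_TRIGGER threshold might gate
// cooperative learning. The combination of the two rates is an assumption.
class CooperativeTrigger {
    static final double COOPERATIVE_TRIGGER = 0.5;  // value used in the thesis

    /**
     * @param successRate fraction of recent negotiations that succeeded (0..1)
     * @param newCaseRate fraction of recent episodes that produced a new case (0..1)
     * @return true if the agent should solicit cases from its neighbors
     */
    static boolean shouldSolicitCases(double successRate, double newCaseRate) {
        // Assumed reading: poor success or a drying-up supply of new cases
        // suggests individual learning alone is no longer enough.
        return successRate < COOPERATIVE_TRIGGER || newCaseRate < COOPERATIVE_TRIGGER;
    }

    public static void main(String[] args) {
        System.out.println(shouldSolicitCases(0.3, 0.7));  // true: success rate too low
        System.out.println(shouldSolicitCases(0.8, 0.6));  // false: both rates above threshold
    }
}
```

Lowering COOPERATIVE_TRIGGER under this reading makes the condition harder to satisfy, which matches the remark above that a lower value keeps costly cooperative learning from firing too often.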
In Section 5.3.1.1.2, we observe that the diversity of negotiation outcomes is generally greater on the initiating side of the negotiations than on the responding side. This is because the initiating agent has more responsibility for managing a negotiation than the responding agent. This implies that learning on the initiating side would be better in terms of diversity and coverage; correspondingly, learning on the responding side would be more focused and detailed. Future work on this topic will help us understand the effects of environments on the agents' learning more clearly. In Section 5.3.1.1.2, we also observe that, when negotiations fail not because of the strategies but because of the dynamic activities of the agents, the learning is not effective, and thus the usefulness of learning in ES1 of CEB is reduced. We plan to investigate the feasibility of predicting the activities of other agents in the future, because it would augment an agent's own learning.

6.2 Conclusions

In Chapter 3, a detailed description of our methodology is given. In summary, our methodology employs a cautious utility-based adaptive mechanism to combine cooperative and individual learning, an interaction protocol for soliciting and exchanging information, and the idea of a chronological casebase. In Chapter 4, we discuss the detailed implementation of our methodology. It includes the implementations of usage history, chronological casebases and the sharing of cases (cooperative trigger, neighbor selection and message passing). We also discuss the implementation of measurement collection and the interfaces of the CBR module with other modules in our agent system. Chapter 5 describes the experiments and the analyses of the experimental results. It includes discussions of the overall experiment strategy, the comprehensive experiment set A and the comprehensive experiment set B.

The motivation behind our project is that agents try to learn to better solve a problem that an agent has failed to solve, or failed to solve satisfactorily, by itself. So the distributed multi-strategy (both individual and cooperative) learning methodology based on case-based reasoning (CBR) is proposed in a multi-agent environment. However, the additional communication and coordination overhead may be too expensive or too slow for cooperative learning to be cost-effective or timely. Moreover, since an agent learns from its experience and its view of the world, its solution to a problem may not be applicable to another agent facing the same problem. This injection of foreign knowledge may also be risky, as it may add to the processing cost without improving the solution quality of an agent. To prevent the multi-agent system from degrading team performance, we employ the cautious utility-based adaptive mechanism to combine cooperative learning and individual learning in our project. What we want to find out in this project is whether the combined learning (both cooperative and individual) can bring higher performance to the multi-agent system than individual learning alone. Our research objectives to study are: (1) the effects of cooperative learning on subsequent individual learning, (2) the roles of cooperative learning in agents of different initial knowledge, (3) the feasibility of our multi-strategy learning methodology, and (4) agents' adaptability to different problem domains/environments. We conclude that cooperative learning brings more diversity and utility than individual learning, and that the environment does affect the learning behavior of agents.

Finally, the paper "Combining Individual and Cooperative Learning for Multiagent Negotiations", which is written based on our project, has been accepted by the Second International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'03).

References

[1] Aamodt, A. and Plaza, E. Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications, 7(1): 39-59, 1994.
[2] Marsella, S., Adibi, J., Al-Onaizan, Y., Kaminka, G. A., Muslea, I., Tallis, M., and Tambe, M. On being a teammate: experiences acquired in the design of RoboCup teams. Proc. 3rd Agents'99, Seattle, WA, 221-227, 1999.
[3] Prodromidis, A. L. and Stolfo, S. J. Agent-based distributed learning applied to fraud detection. Sixteenth National Conference on Artificial Intelligence (submitted for publication).
[4] Chan, P. K. and Stolfo, S. J. Experiments in Multistrategy Learning by Meta-Learning. Proceedings of the Second International Conference on Information and Knowledge Management, Washington, DC, 314-323, 1993.
[5] Davies, W. and Edwards, P. Agent-Based Knowledge Discovery. In Working Notes of the AAAI Spring Symposium on Information Gathering from Heterogeneous, Distributed Environments, Stanford University, Stanford, CA, March 1995.
[6] Goldman, C. and Rosenschein, J. Mutually supervised learning in multi-agent systems. In Proceedings of the IJCAI-95 Workshop on Adaptation and Learning in Multi-Agent Systems, Montreal, Canada, August 1995.
[7] Lazarevic, A., Pokrajac, D., and Obradovic, Z. Distributed Clustering and Local Regression for Knowledge Discovery in Multiple Spatial Databases. In Proceedings of the 8th European Symposium on Artificial Neural Networks, 129-134, 2000.
[8] Sugawara, T. and Lesser, V. Learning to improve coordinated actions in cooperative distributed problem-solving environments. Machine Learning, 33(2-3): 129-153, 1998.
[9] Chan, P. K. and Stolfo, S. J. Experiments on multistrategy learning by meta-learning. Proc. 2nd CIKM, Washington, DC, 314-323, 1993.
[10] Nagendra Prasad, M. V., Lander, S., and Lesser, V. Cooperative learning over composite search spaces: Experiences with a multi-agent design system. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 68-73, 1996.
[11] Garland, A. and Alterman, R. Learning Cooperative Procedures. In AAAI Symposium on Integrating Planning, Scheduling and Execution in Dynamic and Uncertain Environments, AAAI Technical Report WS-98-02, 54-61, 1998.
[12] Venkateswaran, R. and Obradovic, Z. Efficient learning through cooperation. In World Congress on Neural Networks, volume 3, 390-395, San Diego, June 1994.
[13] Ishida, T. Two is not always better than one: Experiences in real-time bidirectional search. In Proceedings of the International Conference on Multi-Agent Systems, 185-192, 1995.
[14] Nagendra Prasad, M. V. and Plaza, E. Corporate Memories as Distributed Case Libraries. In the Corporate Memory & Enterprise Modeling track of KAW'96, Tenth Knowledge Acquisition for Knowledge-Based Systems Workshop, 1-19, 1996.
[15] Martin, F. J., Plaza, E., and Arcos, J. L. Knowledge and Experience Reuse through Communication among Competent (Peer) Agents. International Journal of Software Engineering and Knowledge Engineering, 1-21, 1999.
[16] Martin, F. J. and Plaza, E. Auction-based Retrieval. Second Congrés Català d'Intel·ligència Artificial, 1-9, 1999.
[17] Plaza, E., Arcos, J. L., and Martin, F. Cooperative Case-Based Reasoning. In G. Weiss (ed.), Distributed Artificial Intelligence Meets Machine Learning, Lecture Notes in Artificial Intelligence, Springer Verlag, 1-21, 1997.
[18] Ram, A. and Santamaria, J. C. Continuous Case-Based Reasoning. Proceedings of the AAAI-93 Workshop on Case-Based Reasoning, 86-93, Washington, DC, July 1993.
[19] Francis, A. G. and Ram, A. The Utility Problem in Case-Based Reasoning. In Case-Based Reasoning: Papers from the 1993 Workshop, Technical Report WS-93-01, AAAI Press, Washington, DC, July 11-12, 1993.
[20] Alonso, E., d'Inverno, M., Kudenko, D., Luck, M., and Noble, J. Learning in Multi-Agent Systems. Third Workshop of the UK's Special Interest Group on Multi-Agent Systems, 2001.
[21] Soh, L.-K. and Tsatsoulis, C. Reflective negotiating agents for real-time multisensor target tracking. Proc. IJCAI'01, Seattle, WA, August 6-11, 2001, 1121-1127.
[22] Soh, L.-K. and Tsatsoulis, C. Learning to form negotiation coalitions in a multiagent system. AAAI Spring Symposium on Collaborative Learning Agents, 2002.
[23] Littman, M. and Boyan, J. A distributed reinforcement learning scheme for network routing. Proc. Int. Workshop on Applications of Neural Networks to Telecommunications, 45-51, 1993.
[24] Lazarevic, A. and Obradovic, Z. The distributed boosting algorithm. Knowledge Discovery and Data Mining, 311-316, 2001.
[25] Weiss, G. (ed.) Multi-agent Systems: A Modern Approach to Distributed Artificial Intelligence. The MIT Press, London, England, 1999.
[26] Jacobs, R., et al. Adaptive mixtures of local experts. Neural Computation, 3(1): 79-87, 1991.
[27] Sian, S. Adaptation based on cooperative learning in multi-agent systems. In Distributed AI 2, 257-272, 1991.
[28] Watkins, C. and Dayan, P. Technical Note: Q-learning. Machine Learning, 8: 279-292, 1992.
[29] Tan, M. Multi-agent reinforcement learning: Independent vs. cooperative agents. Machine Learning: Proceedings of the Tenth International Conference, 330-337, 1993.

Appendix A Tables

[Only fragments of Appendix A survive this preview. The first caption reads "Utility and difference gains for Agent A2, performing both learning mechanisms in ES2 of CEA", and the recurring column headers are #learnings, #new cases, avg util gain, max util gain, #util gain > 0.400000, #util gain < 0.400000, and the corresponding diff gain columns, reported separately for the initiating and responding casebases. The table bodies are not recoverable from this preview.]

Appendix C User's Manual

... "agent1.log".

How can we terminate the running of programs? After running the system for around a million milliseconds, there are two ways to terminate the running programs. Using the scripts, we terminate the programs in the following sequence: 1) use the script command "kall agent" to kill the agent processes, 2) use the script command "kall proxy" to kill the proxy processes, and 3) use the script command "kall radsim" to kill the RADSIM process. Alternatively, we can terminate the programs manually by pressing the "Ctrl" + "C" keys: first terminate the agent processes, then the proxy processes, and finally the RADSIM process.

How to obtain useful results? After we terminate the programs, we have four log output files ready: agent1.log, agent2.log, agent3.log and agent4.log. We can then search for the useful data in these log files and put them together.
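Since the excerpt does not show the log format, the following is only a hypothetical sketch of how the four agent logs could be scanned for result lines and combined; the marker string "util gain", the output file name, and all file-handling details are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical post-processing helper: collect result lines from the four
// agent log files. The "util gain" marker is a guess at the log format.
public class LogCollector {
    public static void main(String[] args) throws IOException {
        List<String> collected = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            Path log = Path.of("agent" + i + ".log");
            if (!Files.exists(log)) continue;             // skip missing logs
            for (String line : Files.readAllLines(log)) {
                if (line.contains("util gain")) {         // assumed marker
                    collected.add("agent" + i + ": " + line);
                }
            }
        }
        // "Put them together": write one combined file for later analysis.
        Files.write(Path.of("combined_results.log"), collected);
    }
}
```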
...

[Figure 1 Machine learning topology. The extracted label text indicates a tree that separates non-multi-agent learning from learning about multi-agent systems and, under distributed learning, cooperative versus individual learning, each further divided into case-based and rule-based learning.] The distributed learning can be divided into two categories again ...

...

... described in the following sections.

[Figure 8 The new class hierarchy of CBR in our project. Recoverable class names: Agent, CaseBaseManager (CBM), ParentInitiatingCaseBase, InitiatingCaseBase, RespondingCaseBase, ParentInitiatingCase, InitiatingCase, RespondingCase, InitiatingInput, ...]
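Based only on the class names visible in the Figure 8 fragment above, a skeletal reconstruction of the CBR class hierarchy might look as follows. The inheritance relationships, fields and methods are assumptions, since only the names survive in this preview.

```java
import java.util.ArrayList;
import java.util.List;

// Skeleton only: class names are taken from the Figure 8 fragment;
// everything else (inheritance, members) is an assumption.
abstract class ParentInitiatingCase { }

class InitiatingCase extends ParentInitiatingCase {
    InitiatingInput input;          // problem description for an initiated negotiation
}

class RespondingCase { }            // case used when responding to a negotiation

class InitiatingInput { }           // input features of an initiating case

abstract class ParentInitiatingCaseBase {
    protected final List<ParentInitiatingCase> cases = new ArrayList<>();
}

class InitiatingCaseBase extends ParentInitiatingCaseBase { }

class RespondingCaseBase {
    private final List<RespondingCase> cases = new ArrayList<>();
}

// The CaseBaseManager (CBM) owns one casebase per negotiation role.
class CaseBaseManager {
    InitiatingCaseBase initiating = new InitiatingCaseBase();
    RespondingCaseBase responding = new RespondingCaseBase();
}

// Each agent holds a CaseBaseManager through which the CBR module is accessed.
class Agent {
    final CaseBaseManager cbm = new CaseBaseManager();
}
```

The split into initiating and responding casebases mirrors the thesis's observation that the two negotiation roles learn differently.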

