At this point we hope it is clear to the reader that, with present techniques, finding the best policy for thousands of network elements in a dynamic environment in less than one hour is not possible. New ideas are therefore needed to address the open issues in multi-agent learning: how to coordinate, how to scale, and how to manage partial or incomplete information. Possible options to explore include defining additional information that lets us process the data faster, so that a solution can be found in limited time, or reducing the complexity and heterogeneity of the agents.
Another approach is to abandon the goal of finding the globally optimal policy and adopt a near-optimality concept, with an explicit trade-off between learning time and the error introduced into the best solution achieved. To attack a large problem this way, we need to identify and add meta-information that drives the transformation from the original problem into a less complex task; one possibility is to introduce explicit time restrictions into our algorithms. Another path to explore comes from the fact that existing MARL algorithms often require additional preconditions to theoretically guarantee convergence; relaxing these conditions and further improving the various algorithms in this respect is an active field of study.
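As a concrete illustration of such a time restriction, the sketch below shows an anytime learning loop (a minimal example of our own; run_episode and evaluate are hypothetical placeholders for a real learner, not part of any cited algorithm) that always returns the best policy found within a fixed wall-clock budget:

```python
import random
import time

def anytime_learn(budget_seconds, run_episode, evaluate):
    """Keep learning until a wall-clock budget expires, then return
    the best policy found so far and its estimated value. This trades
    solution quality for a hard bound on learning time."""
    deadline = time.monotonic() + budget_seconds
    best_policy, best_value = None, float("-inf")
    while time.monotonic() < deadline:
        policy = run_episode()       # one learning step/episode (placeholder)
        value = evaluate(policy)     # estimated quality of the result
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value

# Toy usage: "policies" are just random numbers scored by their own value.
best, score = anytime_learn(0.1, random.random, lambda p: p)
```

The design point is that the loop can be interrupted at any moment and still yields a usable (if suboptimal) answer, which is exactly the time-versus-error trade-off discussed above.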
Another issue is that any realistic network will have multiple, potentially competing goals.
The problem then becomes multi-objective, and we need to find members of the Pareto-optimal set, in which no single parameter can be improved without a corresponding decrease in the optimality of another (Lau & Wang, 2005). Multi-objective optimization is a well-studied topic, with good results for distributed multi-objective genetic algorithms (Cardon et al., 2000). Much work remains on improving the selection among points of the Pareto frontier and on developing learning algorithms that do not depend only on genetic approaches. To further complicate things, not only do we have multiple goals, but they can also live at different levels of granularity. As mentioned in (Kephart & Das, 2007), an important issue is how to translate from resource-level parameters to high-level goals such as performance metrics (response times and throughputs) or availability metrics (recovery time or downtime); managing the relationship between those metrics and the control parameters is key. Their work is interesting and relies on utility functions, although that approach also has drawbacks when expressing preferences over more than two parameters. This implies we need to improve the expressiveness of the goal function if we want to keep using utility-based approaches. One option is to reduce complexity: fewer parameters may be better than many if learning time must be reduced, so how to choose the most important parameters for that reduction is another question to research.
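To make the Pareto-dominance test concrete, here is a minimal sketch (our own illustration; the configuration names and objective values are invented, not drawn from the cited works) that filters a set of candidate network configurations down to its Pareto-optimal members:

```python
def pareto_front(candidates):
    """Return the candidates not dominated by any other.

    `candidates` maps a name to a tuple of objective scores
    (higher is better). A candidate is dominated if another is at
    least as good on every objective and strictly better on one.
    """
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for o_name, other in candidates.items() if o_name != name)
    }

# Illustrative objectives: (throughput, -response_time)
configs = {"A": (10, -2.0), "B": (8, -1.0), "C": (7, -3.0)}
print(pareto_front(configs))   # C is dominated by both A and B
```

Selecting a single operating point from the resulting front is exactly the open problem mentioned above, since it requires expressing preferences among the surviving candidates.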
Finally, because of the great number of tasks present in a network environment, we need a common benchmark for testing existing (and future) approaches. If every designer keeps defining a new, special scenario in which his algorithm performs perfectly, we will never be able to compare them. We acknowledge this is not an easy task: the benchmark requires a definition of a “typical” network, while real networks vary widely in supposedly typical characteristics such as load and user requests, making for a very heterogeneous scenario. If a single benchmark is not feasible, we could create a few illustrative scenarios, each with a clear list of assumptions and a justification for its existence.
In this chapter we have stated several tasks that are important for managing a network in an autonomous fashion; we have collected disparate approaches to multi-agent learning and
have linked them to different network tasks, showing current advances and open questions.
We believe significant progress can be achieved through more exchange between the fields of machine learning and network management. On one side, new features of the future Internet (and of networks in general) will introduce more demands and will drive the development of better algorithms, so we expect new techniques to appear; on the other side, new techniques from the machine learning field will enable new network functionalities that will improve networks' utility and users' experience in the near future.
6. References
Abdallah, S. & Lesser, V. (2006). Learning the task allocation game, Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS'06, pp. 850-857, 2006, Hakodate, Japan
Alba, E. (2005). Parallel metaheuristics: a new class of algorithms. John Wiley & Sons, Inc., ISBN: 978-0-471-67806-9, NJ, USA
Bennani, M. & Menascé, D. (2005). Resource allocation for autonomic data centers using analytic performance models. IEEE International conference on autonomic computing, Seattle, Washington, USA
Bernstein, D.; Zilberstein, S. & Immerman, N. (2000). The complexity of decentralized control of MDPs, Proceedings of UAI-2000: The Sixteenth Conference on Uncertainty in Artificial Intelligence, 2000, California, USA
Bianchi, R.; Ribeiro, C. & Costa, A. (2007). Heuristic selection of actions in multiagent reinforcement learning. IJCAI’07, 2007, India
Boutaba, R. & Xiao, J. (2002). Network management: State of the art. Communication Systems: The State of the Art (IFIP World Computer Congress), Kluwer, B.V., pp. 127-146, ISBN: 1-4020-7168-X, August 2002, Deventer, The Netherlands
Boutilier, C. & Dearden, R. (1994). Using abstractions for decision-theoretic planning with time constraints, Proceedings of AAAI-94, 1994, Washington, USA
Boyan, J. & Littman, M. (1994). Packet routing in dynamically changing networks: a reinforcement learning approach. In Advances in Neural Information Processing Systems, Vol. 7 (1994) pp. 671-678
Busoniu, L.; Babuska, R. & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning, IEEE Transactions on Systems, Man and Cybernetics – Part C: Applications and Reviews, Vol. 38, No. 2 (2008) pp. 156-172
Cardon, A.; Galinho, T. & Vacher, J. (2000). Genetic algorithms using multi-objectives in a multiagent system. Robotics and Autonomous Systems, Vol. 33, No. 3 (2000) pp. 179-190
Di Caro, G. & Dorigo, M. (1998). AntNet: distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, Vol. 9 (1998) pp. 317-365
Cosmin, G.; Dan Şerban, L. & Marius Litan, C. (2010). A framework for building intelligent SLA negotiation strategies under time constraints. Economics of Grids, Clouds, Systems, and Services 2010, Vol. 6296 (2010) pp. 48-61, DOI: 10.1007/978-3-642-15681-6_4
Everett, H. (1963). Generalized Lagrange multiplier method for solving problems of optimum allocation of resources. Operations Research, Vol. 11, No. 3 (1963) pp. 399-417
Guestrin, C.; Lagoudakis, M. & Parr, R. (2002). Coordinated reinforcement learning, Proceedings of the 19th International Conference on Machine Learning (ICML-02), pp. 227-234, 2002, Sydney, Australia
IBM (2001). Autonomic manifesto, www.research.ibm.com/autonomic/manifesto
Karp, B. & Kung, H. (2000), GPSR: Greedy perimeter stateless routing for wireless networks, MobiCom2000, August, 2000, Boston, USA
Kephart, J. & Das, R. (2007). Achieving self-management via utility functions, IEEE Internet Computing, Vol. 11, No. 1 (2007) pp. 40-48, ISSN: 1089-7801
Kok, J. & Vlassis, N. (2006), Collaborative multiagent reinforcement learning by payoff propagation, Journal of machine learning research 7, pp. 1789-1828, 2006
Lau, H. & Wang, H. (2005). A multi-agent approach for solving optimization problems involving expensive resources, ACM Symposium on applied computing, 2005, New Mexico, USA
Lavinal, E.; Desprats T. & Raynaud Y. (2006). A generic multi-agent conceptual framework towards self-management. IEEE/IFIP Network Operations and Management Symposium, 2006
Lee, M.; Marconett, D.; Ye, X. & Ben, S. (2007). Cognitive network management with reinforcement learning for wireless mesh networks, IP Operations and Management, Lecture Notes in Computer Science, Vol. 4786 (2007) pp. 168-179, DOI: 10.1007/978-3-540-75853-2_15
Legge, D. & Baxendale, P. (2002). An agent-based network management system, Proceedings of the AISB’02 symposium on Adaptive Agents and Multi-Agent Systems, 2002, London, England
Littman, M.; Ravi, N.; Fenson, E. et al. (2004). Reinforcement learning for autonomic network repair. IEEE 1st International Conference on Autonomic Computing, 2004
Makar, R.; Mahadevan, S. & Ghavamzadeh, M. (2001). Hierarchical multi-agent reinforcement learning. Agents'01, 2001, Montréal, Quebec, Canada
Papadimitriou, C. & Tsitsiklis, J. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, Vol. 12, No. 3 (1987) pp. 441-450, USA
Pynadath, D. & Tambe, M. (2002). The communicative multiagent team decision problem: Analyzing teamwork theories and models, Journal of Artificial Intelligence Research, Vol. 16 (2002) pp. 389-423
Rodrigues, E. & Kowalczyk, R. (2007). Learning in market-based resource allocation. 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), 2007, pp. 475-482, ISBN: 0-7695-2841-4
Ruiz, P.; Botía, J. & Gómez-Skarmeta, A. (2004). Providing QoS through machine-learning- driven adaptive multimedia applications, IEEE Transactions on systems, man, and cybernetics—Part B: Cybernetics, Vol. 34, No. 3 (2004)
Shah, K. & Kumar, M. (2008). Resource management in wireless sensor networks using collective intelligence, Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2008), pp. 423-428
Steinder, M. & Sethi, A. (2004) Probabilistic fault localization in communication systems using belief networks, IEEE/ACM Transactions on Networking, Vol. 12, No. 5 (2004) pp. 809-822
Strassner, J.; Agoulmine, N. & Lehtihet, E. (2006). FOCALE: A Novel Autonomic Networking Architecture. In Latin American Autonomic Computing Symposium (LAACS), 2006, Campo Grande, MS, Brazil.
Sutton, R. & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press, 1998
Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, pp. 330-337, 1993
Tianfield, H. (2003). Multi-agent based autonomic architecture for network management, IEEE, 2003
Vengerov, D.; Bambos, N. & Berenji, H. (2005). A fuzzy reinforcement learning approach to power control in wireless transmitters, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, pp. 768-778, ISSN: 1083-4419, August 2005
Wang, X. & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games, in Advances in neural information processing systems, 2002
Weiss, G. (Ed.) (1999). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, MA: MIT Press, 1999
Wolfson, O.; Xu, B. & Sistla, A. (2004). An economic model for resource exchange in mobile peer-to-peer networks, Proceedings of the 16th International Conference on Scientific and Statistical Database Management, January 2004
Yahaya, A. (2006). iREX: Efficient inter-domain QoS policy architecture, in Proceedings of IEEE Globecom, 2006
11
Autonomous Decentralized Voltage Profile Control Method in Future Distribution Network using Distributed Generators
Takao Tsuji1, Tsutomu Oyama1, Takuhei Hashiguchi2, Tadahiro Goda2, Kenji Horiuchi3, Seiji Tange3, Takao Shinji4 and Shinsuke Tsujita4
1Faculty of Engineering, Yokohama National University
2Graduate School of Information Science and Electrical Engineering, Kyushu University
3Transmission & Distribution Systems Center, Mitsubishi Electric Corp.
4Smart Energy Technology Center, Tokyo Gas Co., Ltd.
Japan
1. Introduction
To realize a sustainable energy society, it is of prime importance to introduce renewable energy sources, such as photovoltaics (PV) and wind turbine generators, or co-generation systems that can utilize exhaust heat energy. These generators are called “Distributed Generators (DGs)” and are introduced mainly into the distribution network, as shown in Fig. 1. However, in a distribution network with a large number of DGs, maintaining the voltage profile becomes an important issue due to the reverse power flow caused by the DGs.
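The underlying effect can be seen from the standard textbook approximation for the voltage change along a feeder segment with resistance $R$ and reactance $X$ carrying active power $P$ and reactive power $Q$ (this relation is our own illustration, not a formula taken from the works cited in this chapter):

$$\Delta V \approx \frac{R\,P + X\,Q}{V}$$

With demand-directional flow, both terms produce a voltage drop; when DGs reverse the direction of $P$, the sign of the first term flips and the voltage at the DG node rises instead of dropping, which is why having an inverter absorb reactive power (driving $Q$ negative) can counteract the rise.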
Conventionally, the voltage profile is kept within the allowable range by the use of the load-tap-changing substation transformer (LTC) or static capacitors (SC), which compensate the voltage drop caused by the demand-directional power flow. There are many studies in which the voltage profile is maintained through effective utilization of these facilities when a large number of DGs are introduced; for example, optimization based on global information about the distribution network is used to determine the control actions of the voltage control equipment. However, it is difficult to control high-speed voltage changes with the LTC or SC, because these devices work by switching.
Although the use of an SVC (Static Var Compensator) is one effective approach to realizing high-speed voltage control, it is undesirable from the viewpoint of capital cost.
To realize high-speed, flexible voltage maintenance while limiting capital investment, it is effective to utilize the reactive power control capability of the DGs themselves. Supposing a large number of DGs are introduced, it is difficult to manage all the information of the whole system and to control all DGs centrally. Hence, much attention is paid to autonomous decentralized control methods, for example by P. N. Vovos or P. M. S. Carvalho. However, cooperative behavior among multiple DGs is not considered in their papers, because only the information at the connection node is utilized. Although Mesut E. Baran et al. studied cooperation among DGs based on a multi-agent system, their
method is not an autonomous decentralized one, because the specific control signals are generated by optimization based on global information.
On the other hand, the authors have developed an autonomous decentralized voltage control method based on the reactive power control of DGs. Specifically, voltage profile maintenance is realized through reactive power control of each DG's inverter, organized as a multi-agent system. In those papers, it is assumed that an agent program is installed in each DG. The agents determine their proper control actions based on local information exchanged with neighboring agents, using feedback control based on integral logic. The proposed method is composed of three control schemes with different purposes, making it possible not only to maintain the voltage profile but also to decrease excessive reactive power.
Additionally, in another of our papers, the proposed method is enhanced to make effective use of the free inverter capacity.
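As a rough, self-contained illustration of this kind of control (a minimal sketch under our own assumptions, not the exact algorithm from our papers; the gains and limits are invented, and the neighbor information exchange is omitted for brevity), each agent could integrate its local voltage error and adjust its DG inverter's reactive power output within the free capacity:

```python
class DGAgent:
    """Simplified integral voltage controller for one DG inverter.

    Raises reactive power output when the local voltage is below its
    reference and lowers it when above, clipped to the free capacity.
    """
    def __init__(self, v_ref=1.0, gain=0.5, q_max=0.3):
        self.v_ref = v_ref    # target voltage [p.u.]
        self.gain = gain      # integral gain (illustrative value)
        self.q_max = q_max    # free inverter capacity [p.u.]
        self.q = 0.0          # current reactive power output [p.u.]

    def step(self, v_local, dt):
        error = self.v_ref - v_local      # positive if voltage is too low
        self.q += self.gain * error * dt  # integral control action
        self.q = max(-self.q_max, min(self.q_max, self.q))  # capacity limit
        return self.q

agent = DGAgent()
print(agent.step(v_local=1.03, dt=0.1))  # overvoltage -> agent absorbs vars
```

Because each agent needs only its local voltage measurement (plus, in the actual method, information exchanged with neighboring agents), no central controller or global system model is required.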
In this chapter, we outline the voltage profile control method proposed in our previous work. First, the concept of reactive power control of DGs and the voltage profile control method are described in Section 2. Next, the basic and enhanced autonomous decentralized control methods are presented in Sections 3 and 4, respectively. Finally, a conclusion is provided in Section 5.
Fig. 1. Distribution network with distributed generators.