At this point we hope it is clear to the reader that, with present techniques, finding the best policy for thousands of network elements in a dynamic environment in less than one hour is not possible. New ideas are therefore needed to address the open issues in multi-agent learning: how to coordinate, how to scale, and how to manage partial or incomplete information. Possible options to explore include defining additional information that lets us process the data faster, so that a solution can be found in limited time, or reducing the complexity and heterogeneity of the agents.
Another approach is to abandon the goal of finding the globally optimal policy and adopt a near-optimality concept, with an explicit trade-off between learning time and the error introduced into the best solution achieved. To attack a large problem this way, we need to identify and add meta-information that drives the transformation from the original problem into a less complex task; one possibility is to introduce explicit time restrictions into our algorithms. Another path to explore comes from the fact that existing MARL algorithms often require additional preconditions to theoretically guarantee convergence; relaxing these conditions and further improving the various algorithms in this respect is an active field of study.
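As a concrete illustration of such a time restriction, the sketch below shows an anytime learning loop (a minimal example of our own; run_episode and evaluate are hypothetical placeholders for a real learner, not part of any cited algorithm) that always returns the best policy found within a fixed wall-clock budget:

```python
import random
import time

def anytime_learn(budget_seconds, run_episode, evaluate):
    """Keep learning until a wall-clock budget expires, then return
    the best policy found so far and its estimated value. This trades
    solution quality for a hard bound on learning time."""
    deadline = time.monotonic() + budget_seconds
    best_policy, best_value = None, float("-inf")
    while time.monotonic() < deadline:
        policy = run_episode()       # one learning step/episode (placeholder)
        value = evaluate(policy)     # estimated quality of the result
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value

# Toy usage: "policies" are just random numbers scored by their own value.
best, score = anytime_learn(0.1, random.random, lambda p: p)
```

The design point is that the loop can be interrupted at any moment and still yields a usable (if suboptimal) answer, which is exactly the time-versus-error trade-off discussed above.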
Another issue is that any realistic network will have multiple, potentially competing goals.
The problem then becomes multi-objective, and we need to find members of the Pareto-optimal set, in which no single parameter can be improved without a corresponding decrease in the optimality of another (Lau & Wang, 2005). Multi-objective optimization is a well-studied topic, with good results for distributed multi-objective genetic algorithms (Cardon et al., 2000). Much work remains on improving the selection among points of the Pareto frontier and on developing learning algorithms that do not depend only on genetic approaches. To further complicate things, not only do we have multiple goals, but they can also live at different levels of granularity. As mentioned in (Kephart & Das, 2007), an important issue is how to translate from resource-level parameters to high-level goals such as performance metrics (response times and throughputs) or availability metrics (recovery time or downtime); managing the relationship between those metrics and the control parameters is key. Their work is interesting and relies on utility functions, although that approach also has drawbacks when expressing preferences over more than two parameters. This implies we need to improve the expressiveness of the goal function if we want to keep using utility-based approaches. One option is to reduce complexity: fewer parameters may be better than many if learning time must be reduced, so how to choose the most important parameters for that reduction is another question to research.
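To make the Pareto-dominance test concrete, here is a minimal sketch (our own illustration; the configuration names and objective values are invented, not drawn from the cited works) that filters a set of candidate network configurations down to its Pareto-optimal members:

```python
def pareto_front(candidates):
    """Return the candidates not dominated by any other.

    `candidates` maps a name to a tuple of objective scores
    (higher is better). A candidate is dominated if another is at
    least as good on every objective and strictly better on one.
    """
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for o_name, other in candidates.items() if o_name != name)
    }

# Illustrative objectives: (throughput, -response_time)
configs = {"A": (10, -2.0), "B": (8, -1.0), "C": (7, -3.0)}
print(pareto_front(configs))   # C is dominated by both A and B
```

Selecting a single operating point from the resulting front is exactly the open problem mentioned above, since it requires expressing preferences among the surviving candidates.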
Finally, because of the great number of tasks present in a network environment, we need a common benchmark for testing existing (and future) approaches. If every designer keeps defining a new, special scenario in which his algorithm performs perfectly, we will never be able to compare them. We acknowledge this is not an easy task: the benchmark requires a definition of a “typical” network, while real networks vary widely in supposedly typical characteristics such as load and user requests, making for a very heterogeneous scenario. If a single benchmark is not feasible, we could create a few illustrative scenarios, each with a clear list of assumptions and a justification for its existence.
In this chapter we have stated several tasks that are important for managing a network in an autonomous fashion; we have collected disparate approaches to multi-agent learning and
have linked them to different network tasks, showing current advances and open questions.
We believe significant progress can be achieved through more exchange between the fields of machine learning and network management. On one side, new features of the future Internet (and of networks in general) will introduce more demands and will drive the development of better algorithms, so we expect new techniques to appear; on the other side, new techniques from the machine learning field will enable new network functionalities that will improve networks' utility and users' experience in the near future.
6. References
Abdallah, S. & Lesser, V. (2006). Learning the task allocation game, Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS'06, pp. 850-857, 2006, Hakodate, Japan
Alba, E. (2005). Parallel metaheuristics: a new class of algorithms. John Wiley & Sons, Inc., ISBN: 978-0-471-67806-9, NJ, USA
Bennani, M. & Menascé, D. (2005). Resource allocation for autonomic data centers using analytic performance models. IEEE International conference on autonomic computing, Seattle, Washington, USA
Bernstein, D.; Zilberstein, S. & Immerman, N. (2000). The complexity of decentralized control of MDPs, Proceedings of UAI-2000: The Sixteenth Conference on Uncertainty in Artificial Intelligence, 2000, California, USA
Bianchi, R.; Ribeiro, C. & Costa, A. (2007). Heuristic selection of actions in multiagent reinforcement learning. IJCAI’07, 2007, India
Boutaba, R. & Xiao, J. (2002). Network management: State of the art. Communication Systems: The State of the Art (IFIP World Computer Congress), Kluwer, B.V., pp. 127-146, ISBN: 1-4020-7168-X, August 2002, Deventer, The Netherlands
Boutilier, C. & Dearden, R. (1994). Using abstractions for decision-theoretic planning with time constraints, Proceedings of AAAI-94, 1994, Washington, USA
Boyan, J. & Littman, M. (1994). Packet routing in dynamically changing networks: a reinforcement learning approach. In Advances in Neural Information Processing Systems, Vol. 7 (1994) pp. 671-678
Busoniu, L.; Babuska, R. & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning, IEEE Transactions on Systems, Man and Cybernetics – Part C: Applications and Reviews, Vol. 38, No. 2 (2008) pp. 156-172
Cardon, A.; Galinho, T. & Vacher, J. (2000). Genetic algorithms using multi-objectives in a multiagent system. Robotics and Autonomous Systems, Vol. 33, No. 3 (2000) pp. 179-190
Di Caro, G. & Dorigo, M. (1998). AntNet: distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, Vol. 9 (1998) pp. 317-365
Cosmin, G.; Dan Şerban, L. & Marius Litan, C. (2010). A framework for building intelligent SLA negotiation strategies under time constraints. Economics of Grids, Clouds, Systems, and Services 2010, Vol. 6296 (2010) pp. 48-61, DOI: 10.1007/978-3-642-15681-6_4
Everett, H. (1963). Generalized Lagrange multiplier method for solving problems of optimum allocation of resources. Operations Research, Vol. 11, No. 3 (1963) pp. 399-417
Guestrin, C.; Lagoudakis, M. & Parr, R. (2002). Coordinated reinforcement learning, Proceedings of the 19th International Conference on Machine Learning (ICML-02), pp. 227-234, 2002, Sydney, Australia
IBM (2001). Autonomic manifesto, www.research.ibm.com/autonomic/manifesto
Karp, B. & Kung, H. (2000), GPSR: Greedy perimeter stateless routing for wireless networks, MobiCom2000, August, 2000, Boston, USA
Kephart, J. & Das, R. (2007). Achieving self-management via utility functions, IEEE Internet Computing, Vol. 11, No. 1 (2007) pp. 40-48, ISSN: 1089-7801
Kok, J. & Vlassis, N. (2006), Collaborative multiagent reinforcement learning by payoff propagation, Journal of machine learning research 7, pp. 1789-1828, 2006
Lau, H. & Wang, H. (2005). A multi-agent approach for solving optimization problems involving expensive resources, ACM Symposium on applied computing, 2005, New Mexico, USA
Lavinal, E.; Desprats T. & Raynaud Y. (2006). A generic multi-agent conceptual framework towards self-management. IEEE/IFIP Network Operations and Management Symposium, 2006
Lee, M.; Marconett, D.; Ye, X. & Ben, S. (2007). Cognitive network management with reinforcement learning for wireless mesh networks, IP Operations and Management, Lecture Notes in Computer Science, Vol. 4786 (2007) pp. 168-179, DOI: 10.1007/978-3-540-75853-2_15
Legge, D. & Baxendale, P. (2002). An agent-based network management system, Proceedings of the AISB’02 symposium on Adaptive Agents and Multi-Agent Systems, 2002, London, England
Littman, M.; Ravi, N.; Fenson, E. et al. (2004). Reinforcement learning for autonomic network repair. IEEE 1st International Conference on Autonomic Computing, 2004
Makar, R.; Mahadevan, S. & Ghavamzadeh, M. (2001). Hierarchical multi-agent reinforcement learning. Agents'01, 2001, Montréal, Quebec, Canada
Papadimitriou, C. & Tsitsiklis, J. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, Vol. 12, No. 3 (1987) pp. 441-450, USA
Pynadath, D. & Tambe, M. (2002). The communicative multiagent team decision problem: Analyzing teamwork theories and models, Journal of Artificial Intelligence Research, Vol. 16 (2002) pp. 389-423
Rodrigues, E. & Kowalczyk, R. (2007). Learning in market-based resource allocation. 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007), 2007, pp. 475-482, ISBN: 0-7695-2841-4
Ruiz, P.; Botía, J. & Gómez-Skarmeta, A. (2004). Providing QoS through machine-learning- driven adaptive multimedia applications, IEEE Transactions on systems, man, and cybernetics—Part B: Cybernetics, Vol. 34, No. 3 (2004)
Shah, K. & Kumar, M. (2008). Resource management in wireless sensor networks using collective intelligence, Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP 2008), pp. 423-428
Steinder, M. & Sethi, A. (2004) Probabilistic fault localization in communication systems using belief networks, IEEE/ACM Transactions on Networking, Vol. 12, No. 5 (2004) pp. 809-822
Strassner, J.; Agoulmine, N. & Lehtihet, E. (2006). FOCALE: A Novel Autonomic Networking Architecture. In Latin American Autonomic Computing Symposium (LAACS), 2006, Campo Grande, MS, Brazil.
Sutton, R. & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press, 1998
Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, pp. 330-337, 1993
Tianfield, H. (2003). Multi-agent based autonomic architecture for network management, IEEE, 2003
Vengerov, D.; Bambos, N. & Berenji, H. (2005). A fuzzy reinforcement learning approach to power control in wireless transmitters, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, pp. 768-778, ISSN: 1083-4419, August 2005
Wang, X. & Sandholm, T. (2002). Reinforcement learning to play an optimal Nash equilibrium in team Markov games, in Advances in neural information processing systems, 2002
Weiss, G. (Ed.) (1999). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, MA: MIT Press, 1999
Wolfson, O.; Xu, B. & Sistla, A. (2004). An economic model for resource exchange in mobile peer-to-peer networks, Proceedings of the 16th International Conference on Scientific and Statistical Database Management, January 2004
Yahaya, A. (2006). iREX: Efficient inter-domain QoS policy architecture, in Proceedings of IEEE Globecom, 2006
11
Autonomous Decentralized Voltage Profile Control Method in Future Distribution Network using Distributed Generators
Takao Tsuji1, Tsutomu Oyama1, Takuhei Hashiguchi2, Tadahiro Goda2, Kenji Horiuchi3, Seiji Tange3, Takao Shinji4 and Shinsuke Tsujita4
1Faculty of Engineering, Yokohama National University
2Graduate School of Information Science and Electrical Engineering, Kyushu University
3Transmission & Distribution Systems Center, Mitsubishi Electric Corp.
4Smart Energy Technology Center, Tokyo Gas Co., Ltd.
Japan
1. Introduction
To realize a sustainable energy society, it is of prime importance to introduce renewable energy sources, such as photovoltaics (PV) and wind turbine generators, or co-generation systems that can utilize exhaust heat energy. These generators are called “Distributed Generators (DGs)” and are introduced mainly into the distribution network, as shown in Fig. 1. However, in a distribution network with a large number of DGs, maintaining the voltage profile becomes an important issue due to the reverse power flow caused by the DGs.
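The underlying effect can be seen from the standard textbook approximation for the voltage change along a feeder segment with resistance $R$ and reactance $X$ carrying active power $P$ and reactive power $Q$ (this relation is our own illustration, not a formula taken from the works cited in this chapter):

$$\Delta V \approx \frac{R\,P + X\,Q}{V}$$

With demand-directional flow, both terms produce a voltage drop; when DGs reverse the direction of $P$, the sign of the first term flips and the voltage at the DG node rises instead of dropping, which is why having an inverter absorb reactive power (driving $Q$ negative) can counteract the rise.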
Conventionally, the voltage profile is kept within the allowable range by the use of the load-tap-changing substation transformer (LTC) or static capacitors (SC), which compensate the voltage drop caused by the demand-directional power flow. There are many studies in which the voltage profile is maintained through effective utilization of these facilities when a large number of DGs are introduced; for example, optimization based on global information about the distribution network is used to determine the control actions of the voltage control equipment. However, it is difficult to control high-speed voltage changes with the LTC or SC, because these devices work by switching.
Although the use of an SVC (Static Var Compensator) is one effective approach to realizing high-speed voltage control, it is undesirable from the viewpoint of capital cost.
To realize high-speed, flexible voltage maintenance while limiting capital investment, it is effective to utilize the reactive power control capability of the DGs themselves. Supposing a large number of DGs are introduced, it is difficult to manage all the information of the whole system and to control all DGs centrally. Hence, much attention is paid to autonomous decentralized control methods, for example by P. N. Vovos or P. M. S. Carvalho. However, cooperative behavior among multiple DGs is not considered in their papers, because only the information at the connection node is utilized. Although Mesut E. Baran et al. studied cooperation among DGs based on a multi-agent system, their
method is not an autonomous decentralized one, because the specific control signals are generated by optimization based on global information.
On the other hand, the authors have developed an autonomous decentralized voltage control method based on the reactive power control of DGs. Specifically, voltage profile maintenance is realized through reactive power control of each DG's inverter, organized as a multi-agent system. In those papers, it is assumed that an agent program is installed in each DG. The agents determine their proper control actions based on local information exchanged with neighboring agents, using feedback control based on integral logic. The proposed method is composed of three control schemes with different purposes, making it possible not only to maintain the voltage profile but also to decrease excessive reactive power.
Additionally, in another of our papers, the proposed method is enhanced to make effective use of the free inverter capacity.
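As a rough, self-contained illustration of this kind of control (a minimal sketch under our own assumptions, not the exact algorithm from our papers; the gains and limits are invented, and the neighbor information exchange is omitted for brevity), each agent could integrate its local voltage error and adjust its DG inverter's reactive power output within the free capacity:

```python
class DGAgent:
    """Simplified integral voltage controller for one DG inverter.

    Raises reactive power output when the local voltage is below its
    reference and lowers it when above, clipped to the free capacity.
    """
    def __init__(self, v_ref=1.0, gain=0.5, q_max=0.3):
        self.v_ref = v_ref    # target voltage [p.u.]
        self.gain = gain      # integral gain (illustrative value)
        self.q_max = q_max    # free inverter capacity [p.u.]
        self.q = 0.0          # current reactive power output [p.u.]

    def step(self, v_local, dt):
        error = self.v_ref - v_local      # positive if voltage is too low
        self.q += self.gain * error * dt  # integral control action
        self.q = max(-self.q_max, min(self.q_max, self.q))  # capacity limit
        return self.q

agent = DGAgent()
print(agent.step(v_local=1.03, dt=0.1))  # overvoltage -> agent absorbs vars
```

Because each agent needs only its local voltage measurement (plus, in the actual method, information exchanged with neighboring agents), no central controller or global system model is required.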
In this chapter, we outline the voltage profile control method proposed in our previous work. First, the concept of reactive power control of DGs and the voltage profile control method are described in Section 2. Next, the basic and enhanced autonomous decentralized control methods are presented in Sections 3 and 4, respectively. Finally, a conclusion is provided in Section 5.
Fig. 1. Distribution network with distributed generators.