On the problem of synthesizing gestures for 3D virtual human

Lê Kim Thư
Trường Đại học Công nghệ
Master's thesis, major: Information Technology
Supervisor: Dr. Bùi Thế Duy
Year of defense: 2010
Keywords: 3D graphics; Virtual human; Gesture; Information technology

CONTENTS

List of figures
INTRODUCTION
CHAPTER I: BACKGROUND
  1.1 Gesture synthesis
  1.2 Reinforcement learning
    1.2.1 Model description
    1.2.2 Solutions
  1.3 Exploitation versus exploration
  1.4 Exploration strategies
    1.4.1 Predefined trajectory strategy
    Next-Best-View
CHAPTER II: A MODEL FOR LEARNING PICKING UP PROBLEM
  1.1 Problem representation
  1.2 Model identification
  1.3 Propose exploration strategy
CHAPTER III: IMPLEMENTATION, EXPERIMENT AND EVALUATION
  3.1 Overview
  3.2 Experiment and result
CONCLUSION
References