Mastering the game of Go with deep neural networks and tree search
Nature, Jan 2016

Roadmap

What is this paper about?
• Deep Learning
• Search problem
• How to explore a huge tree (graph)

AlphaGo Video
https://www.youtube.com/watch?v=53YLZBSS0cc
https://www.youtube.com/watch?v=g-dKXOlsf98

[Figure: rank]

AlphaGo vs European Champion (Fan Hui, 2-Dan)
• October 5-9, 2015
• Time limit: 1 hour
• AlphaGo wins (5:0)

AlphaGo vs World Champion (Lee Sedol, 9-Dan)
• March 9-15, 2016
• Time limit: 2 hours
• Venue: Seoul, Four Seasons Hotel
Source: Josun Times, Jan 28th 2015

Lee Sedol wiki
Photo source: Maeil Economics, 2013/04

[Figure labels: Lee Sedol; multiple machines; European champion]

The Game

MiniMax in Tic Tac Toe

Adversarial Search: MiniMax
[Figures: two minimax game trees with node values -1, 0, and 1]

What is the problem?
Generate the search tree, then use minimax search.

The Size of the Tree
Tic Tac Toe: b = 9, d = 9
Chess: b = 35, d = 80
Go: b = 250, d = 150
b: number of legal moves per position
d: depth of the tree (game length)

One Grain of Rice
https://www.youtube.com/watch?v=byk3pA1GPgU

The "Space" of the Game of Go

How about other games?
Tic Tac Toe: b = 9, d = 9
Chess: b = 35, d = 80
Go: b = 250, d = 150
• Flappy Bird?
• Angry Birds?
• StarCraft?
• Learning a language
• Writing a paper
• Getting an MS/PhD degree
• Finding a job
• Life

How to solve?

Chess (1996)

Monte Carlo

Las Vegas

Monte Carlo Tree Search
[Figure: tree search compared with Monte Carlo search]

Monte Carlo Tree Search
• Tree Search + Monte Carlo Method
– Selection
– Expansion
– Simulation
– Back-Propagation
[Figure: search tree whose nodes are labeled "white wins / total": 3/5, 2/3, 1/1, 1/2, 1/2, 1/1, 1/1, 0/1, 0/1]
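The four Monte Carlo Tree Search steps named above (selection, expansion, simulation, back-propagation) can be sketched on Tic Tac Toe, the small game the slides use for minimax. This is only an illustrative sketch under stated assumptions, not AlphaGo's method: the names `Node`, `rollout`, and `mcts` are hypothetical, selection uses the common UCB1 rule with an assumed exploration constant of 1.4, and simulations play uniformly random moves. Each node stores wins / visits, mirroring the "wins / total" labels in the slide figure.

```python
import math
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' for a completed line, 'draw' for a full board, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def play(board, move, player):
    return board[:move] + player + board[move + 1:]

def other(player):
    return 'O' if player == 'X' else 'X'

class Node:
    def __init__(self, board, to_move, parent=None, move=None):
        self.board = board
        self.to_move = to_move    # player who moves next from this position
        self.parent = parent
        self.move = move          # move that led here from the parent
        self.children = []
        self.untried = [] if winner(board) else legal_moves(board)
        self.wins = 0.0           # score for the player who just moved into this node
        self.visits = 0

    def uct_child(self, c=1.4):
        # UCB1: average score plus an exploration bonus (c is an assumed constant).
        return max(self.children,
                   key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))

def rollout(board, to_move, rng):
    """Play uniformly random moves until the game ends; return the result."""
    result = winner(board)
    while result is None:
        board = play(board, rng.choice(legal_moves(board)), to_move)
        to_move = other(to_move)
        result = winner(board)
    return result

def mcts(board, to_move, iterations=2000, seed=0):
    """Return (most-visited move, root node). Assumes the position is not terminal."""
    rng = random.Random(seed)
    root = Node(board, to_move)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCB1.
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. Expansion: add one untried child, if any remain.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(play(node.board, move, node.to_move),
                         other(node.to_move), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new (or terminal) node.
        result = rollout(node.board, node.to_move, rng)
        # 4. Back-propagation: update wins / visits up to the root.
        while node is not None:
            node.visits += 1
            if result == other(node.to_move):
                node.wins += 1.0
            elif result == 'draw':
                node.wins += 0.5
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.move, root

if __name__ == "__main__":
    # X to move with two in a row on the top line; square 2 completes it.
    move, root = mcts("XX OO    ", "X")
    print("chosen move:", move)
    for ch in root.children:
        print(ch.move, f"{ch.wins:.1f}/{ch.visits}")
```

The board is a 9-character string read row by row, so the example position has X on squares 0 and 1 and O on squares 3 and 4. Returning the most-visited child rather than the one with the highest win rate is one common choice; it is an assumption here, not something the slides specify.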