
Lecture slides: Neural Networks and Applications



DOCUMENT INFORMATION

Pages: 474
Size: 22 MB

Contents

UVA Deep Learning Course – Efstratios Gavves

Lecture 1: Introduction to Deep Learning

Prerequisites
o Machine Learning
o Calculus, Linear Algebra
◦ Derivatives, integrals
◦ Matrix operations
◦ Computing lower bounds, limits
o Probability Theory, Statistics
o Advanced programming
o Time, patience & drive

Learning Goals
o Design and program Deep Neural Networks
o Advanced optimizations (SGD, Nesterov's Momentum, RMSprop, Adam) and regularizations
o Convolutional and Recurrent Neural Networks (feature invariance and equivariance)
o Unsupervised Learning and Autoencoders
o Generative models (RBMs, Variational Autoencoders, Generative Adversarial Networks)
o Bayesian Neural Networks and their applications
o Advanced temporal modelling, credit assignment, neural network dynamics
o Biologically-inspired Neural Networks
o Deep Reinforcement Learning

Practicals
o Individual practicals (PyTorch)
◦ Practical 1: Convnets and Optimizations
◦ Practical 2: Recurrent Networks
◦ Practical 3: Generative Models
o Group presentation of an existing paper (1 group = 3 persons)
◦ We'll provide a list of papers, or you can choose another paper (your own?)
◦ By next Monday, make your team; we will prepare a Google Spreadsheet

Grading
o Total grade: 100%
o Final exam: 50%
o Practicals total: 50%
◦ Practical 1: 15%
◦ Practical 2: 15%
◦ Practical 3: 15%
◦ Poster: 5%
o +0.5 bonus for the top Piazza contributors

Overview
o Course: theory (4 hours per week) + labs (4 hours per week)
◦ All material on http://uvadlc.github.io
◦ Book: Deep Learning by I. Goodfellow, Y. Bengio, A. Courville (available online)
o Live interactions via Piazza. Please subscribe today!
◦ Link: https://piazza.com/university_of_amsterdam/fall2018/uvadlc/home
o Practicals are individual!
◦ You are more than encouraged to cooperate, but not to copy. The top Piazza contributors get a +0.5 grade bonus
◦ Plagiarism checks on reports and code. Do not cheat!
Who we are and how to reach us
o Efstratios Gavves (@egavves)
◦ Assistant Professor, QUVA Deep Vision Lab (C3.229)
◦ Temporal models, spatiotemporal deep learning, video analysis
o Teaching Assistants
◦ Kirill Gavrilyuk, Berkay Kicanaoglu, Tom Runia, Jorn Peters, Maurice Weiler

Lecture Overview
o Applications of Deep Learning in Vision, Robotics, Game AI, NLP
o A brief history of Neural Networks and Deep Learning
o Neural Networks as modular functions

Applications of Deep Learning
o Deep Learning in practice (video examples on YouTube and linked websites)

Recurrent Neural Networks

LSTMs
o RNNs: c_t = W · tanh(c_{t-1}) + U · x_t + b
o LSTMs:
◦ i = σ(x_t U^(i) + m_{t-1} W^(i))
◦ f = σ(x_t U^(f) + m_{t-1} W^(f))
◦ o = σ(x_t U^(o) + m_{t-1} W^(o))
◦ c̃_t = tanh(x_t U^(g) + m_{t-1} W^(g))
◦ c_t = c_{t-1} ⊙ f + c̃_t ⊙ i
◦ m_t = c_t ⊙ o

LSTMs: A marked difference
o Additivity leads to strong gradients; the gradient through the cell state is bounded only by the sigmoidal f
o The previous state
c_{t-1} and the next state c_t are connected by addition
o Nice tutorial: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Cell state
o (Diagram: the cell state line carries c_{t-1} to c_t across the top of the cell, through the forget and input gates)

LSTM nonlinearities
o σ ∈ (0, 1): control gate – something like a switch
o tanh ∈ (−1, 1): recurrent nonlinearity

LSTM Step by Step #1
o Decide what to forget from the previous cell state: the forget gate f_t

LSTM Step by Step #2
o Decide what new information is relevant from the new input and should be added to the new memory
◦ Modulate the input i_t
◦ Generate candidate memories c̃_t

LSTM Step by Step #3
o Compute and update the current cell state c_t
◦ Depends on the previous cell state
◦ What we decide to forget
◦ What inputs
we allow
◦ The candidate memories

LSTM Step by Step #4
o Modulate the output
◦ Is the new cell state relevant? If so, the output gate sigmoid o_t lets it through; if not, it suppresses it
o Generate the new memory m_t

Unrolling the LSTMs
o Just the same as for RNNs
o The engine is a bit different (more complicated)
◦ Because of their gates, LSTMs capture long- and short-term dependencies

LSTM variants
o LSTM with peephole connections
◦ The gates also have access to the previous cell state c_{t−1} (not only the memories)
o Bi-directional recurrent networks
o Gated Recurrent Units (GRU)
o Phased LSTMs
o Skip LSTMs
o And many more …

Encoder-Decoder Architectures

Machine translation
o The phrase in the source language is one sequence: "Today the weather is good"
o It is captured by an Encoder LSTM
o The phrase in the target language is also a sequence: "Погода сегодня хорошая"
o It is captured by a Decoder LSTM
o (Diagram: the encoder LSTM reads "Today the weather is good"; the decoder LSTM emits "Погода сегодня хорошая" one word at a time)

Image captioning
o Similar to machine translation
o The only difference is that the encoder is an image ConvNet (VGG, ResNet, …) instead of an LSTM
o Keep the decoder the same
o (Diagram: a ConvNet encodes the image; an LSTM decoder emits "Today the weather is good")

Image captioning demo
o (Video demo on YouTube)

Summary
o Sequential data
o Recurrent Neural Networks
o Backpropagation through time
o Exploding and vanishing gradients
o LSTMs and variants
o Encoder-Decoder Architectures
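The LSTM gating equations above can be sketched directly in code. Below is a minimal NumPy implementation of one LSTM step following the slides' notation, with biases omitted as in the equations; the names (lstm_step, the U and W dictionaries) are illustrative, and in the practicals you would use PyTorch's torch.nn.LSTM rather than a hand-rolled cell.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, m_prev, c_prev, U, W):
    """One LSTM step: U maps the input x_t, W maps the previous memory m_{t-1}."""
    i = sigmoid(x_t @ U["i"] + m_prev @ W["i"])        # input gate
    f = sigmoid(x_t @ U["f"] + m_prev @ W["f"])        # forget gate
    o = sigmoid(x_t @ U["o"] + m_prev @ W["o"])        # output gate
    c_tilde = np.tanh(x_t @ U["g"] + m_prev @ W["g"])  # candidate memories
    c_t = c_prev * f + c_tilde * i  # additive cell-state update (element-wise)
    m_t = c_t * o                   # modulate the output
    return m_t, c_t

# Unrolling over a sequence works just like for a plain RNN:
rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
U = {g: 0.1 * rng.normal(size=(d_in, d_hid)) for g in "ifog"}
W = {g: 0.1 * rng.normal(size=(d_hid, d_hid)) for g in "ifog"}
m, c = np.zeros(d_hid), np.zeros(d_hid)
for x_t in rng.normal(size=(5, d_in)):  # a toy sequence of 5 inputs
    m, c = lstm_step(x_t, m, c, U, W)
```

Note how c_t is built from c_{t-1} by addition rather than by squashing it through a nonlinearity: this is the "marked difference" from plain RNNs that keeps gradients strong over long sequences.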

Posted: 02/10/2023, 13:36
