UVA Deep Learning Course – Efstratios Gavves
Lecture 1: Introduction to Deep Learning
Prerequisites
o Machine Learning
o Calculus, Linear Algebra
◦ Derivatives, integrals
◦ Matrix operations
◦ Computing lower bounds, limits
o Probability Theory, Statistics
o Advanced programming
o Time, patience & drive

Learning Goals
o Design and program Deep Neural Networks
o Advanced optimizations (SGD, Nesterov's Momentum, RMSprop, Adam) and regularizations
o Convolutional and Recurrent Neural Networks (feature invariance and equivariance)
o Unsupervised Learning and Autoencoders
o Generative models (RBMs, Variational Autoencoders, Generative Adversarial Networks)
o Bayesian Neural Networks and their applications
o Advanced temporal modelling, credit assignment, neural network dynamics
o Biologically-inspired Neural Networks
o Deep Reinforcement Learning

Practicals
o Individual practicals (PyTorch)
◦ Practical 1: Convnets and Optimizations
◦ Practical 2: Recurrent Networks
◦ Practical 3: Generative Models
o Group presentation of an existing paper (1 group = 3 persons)
◦ We'll provide a list of papers, or you can choose another paper (your own?)
◦ By next Monday, form your team: we will prepare a Google Spreadsheet

Grading
o Total grade: 100%
◦ Final exam: 50%
◦ Practicals total: 50% (Practical 1: 15%, Practical 2: 15%, Practical 3: 15%, Poster: 5%)
o Bonus: the top Piazza contributors get +0.5 on the grade

Overview
o Course: Theory (4 hours per week) + Labs (4 hours per week)
◦ All material on http://uvadlc.github.io
◦ Book: Deep Learning by I. Goodfellow, Y. Bengio, A. Courville (available online)
o Live interactions via Piazza. Please subscribe today!
◦ Link: https://piazza.com/university_of_amsterdam/fall2018/uvadlc/home
o Practicals are individual!
◦ You are more than encouraged to cooperate, but not to copy
◦ Plagiarism checks are run on reports and code. Do not cheat!
Who we are and how to reach us
o Efstratios Gavves (@egavves)
◦ Assistant Professor, QUVA Deep Vision Lab (C3.229)
◦ Temporal Models, Spatiotemporal Deep Learning, Video Analysis
o Teaching Assistants
◦ Kirill Gavrilyuk, Berkay Kicanaoglu, Tom Runia, Jorn Peters, Maurice Weiler

Lecture Overview
o Applications of Deep Learning in Vision, Robotics, Game AI, NLP
o A brief history of Neural Networks and Deep Learning
o Neural Networks as modular functions

Applications of Deep Learning
o Deep Learning in practice
[Video examples on YouTube]

Recurrent Neural Networks

LSTMs
o RNNs:
◦ c_t = W · tanh(c_{t−1}) + U · x_t + b
o LSTMs:
◦ i = σ(x_t U^(i) + m_{t−1} W^(i))
◦ f = σ(x_t U^(f) + m_{t−1} W^(f))
◦ o = σ(x_t U^(o) + m_{t−1} W^(o))
◦ c̃_t = tanh(x_t U^(g) + m_{t−1} W^(g))
◦ c_t = c_{t−1} ⊙ f + c̃_t ⊙ i
◦ m_t = c_t ⊙ o
[Diagram: LSTM cell with gates i_t, f_t, o_t acting on the cell-state line from c_{t−1} to c_t; input x_t, output m_t]

LSTMs: A marked difference
o RNNs: c_t = W · tanh(c_{t−1}) + U · x_t + b
o LSTMs: c_t = c_{t−1} ⊙ f + c̃_t ⊙ i
o The previous state c_{t−1} and the next state c_t are connected by addition
◦ Additivity leads to strong gradients
◦ The gradient is bounded by the sigmoidal forget gate f
o Nice tutorial: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Cell state
[Diagram: the cell-state line runs straight from c_{t−1} to c_t, modified only by the forget and input gates]

LSTM nonlinearities
o σ ∈ (0, 1): control gate – something like a switch
o tanh ∈ (−1, 1): recurrent nonlinearity

LSTM Step by Step #1
o Decide what to forget from the previous cell state, via the forget gate f_t

LSTM Step by Step #2
o Decide what new information is relevant from the new input and should be added to the new memory
◦ Modulate the input: i_t
◦ Generate candidate memories: c̃_t

LSTM Step by Step #3
o Compute and update the current cell state c_t, which depends on:
◦ The previous cell state
◦ What we decide to forget
◦ What inputs we allow
◦ The candidate memories
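The gate equations above can be sketched as a minimal LSTM cell in plain Python. This is an illustrative scalar version (one-dimensional input and state, so matrix products become plain multiplications); the weight names `Ui`, `Wi`, etc. mirror U^(i), W^(i) from the slides, and biases are omitted as they are there:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell(x_t, m_prev, c_prev, p):
    """One LSTM step for scalar input and state, following the slide equations."""
    i = sigmoid(x_t * p["Ui"] + m_prev * p["Wi"])         # input gate
    f = sigmoid(x_t * p["Uf"] + m_prev * p["Wf"])         # forget gate
    o = sigmoid(x_t * p["Uo"] + m_prev * p["Wo"])         # output gate
    c_cand = math.tanh(x_t * p["Ug"] + m_prev * p["Wg"])  # candidate memory
    c_t = c_prev * f + c_cand * i                         # additive cell-state update
    m_t = c_t * o                                         # new memory (output)
    return m_t, c_t

# Toy run: with the forget gate saturated open (f ~ 1) and the input gate
# shut (i ~ 0), the cell state is carried through nearly unchanged.
p = dict(Ui=-10.0, Wi=0.0, Uf=10.0, Wf=0.0, Uo=0.0, Wo=0.0, Ug=1.0, Wg=0.0)
m, c = 0.0, 1.0
for x in [1.0, 1.0, 1.0]:
    m, c = lstm_cell(x, m, c, p)
print(round(c, 3))  # -> 1.0: the memory survived all three steps
```

The toy weights are hand-picked assumptions, not learned values; they only demonstrate how the gates act as switches on the cell-state line.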
LSTM Step by Step #4
o Modulate the output
◦ Is the new cell state relevant? If yes, the sigmoid output gate is ≈ 1
◦ If not, the sigmoid output gate is ≈ 0
o Generate the new memory: m_t = c_t ⊙ o

Unrolling the LSTMs
o Unrolling works just the same as for RNNs
o The engine is a bit different (more complicated)
◦ Because of their gates, LSTMs capture both long-term and short-term dependencies

LSTM variants
o LSTM with peephole connections
◦ The gates also have access to the previous cell state c_{t−1} (not only to the memories)
o Bi-directional recurrent networks
o Gated Recurrent Units (GRU)
o Phased LSTMs
o Skip LSTMs
o And many more …

Encoder-Decoder Architectures
[Diagram: an Encoder LSTM reads "Today the weather is good"; a Decoder LSTM emits "Погода сегодня хорошая"]

Machine translation
o The phrase in the source language is one sequence
◦ "Today the weather is good"
o It is captured by an Encoder LSTM
o The phrase in the target language is also a sequence
◦ "Погода сегодня хорошая" ("The weather is good today")
o It is captured by a Decoder LSTM

Image captioning
o Similar to machine translation
o The only difference is that the Encoder is an image ConvNet
◦ VGG, ResNet, …
o Keep the decoder the same
[Diagram: a ConvNet encodes the image; the Decoder LSTM emits "Today the weather is good"]

Image captioning demo
[Video on YouTube]

Summary
o Sequential data
o Recurrent Neural Networks
o Backpropagation through time
o Exploding and vanishing gradients
o LSTMs and variants
o Encoder-Decoder Architectures
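The claim that "additivity leads to strong gradients" can be checked numerically. Backpropagating through T steps of a vanilla RNN multiplies factors of the form W · tanh′(c), each below 1 in magnitude here, while along the LSTM cell-state line the corresponding factor is essentially the forget gate f, which the network can hold near 1. A small sketch (the constants W = 0.9 and f = 0.99 are illustrative assumptions, not learned values):

```python
import math

def tanh_prime(z):
    return 1.0 - math.tanh(z) ** 2

T = 50  # number of unrolled time steps

# Vanilla RNN, c_t = W * tanh(c_{t-1}) + ...: the gradient through T steps
# is a product of Jacobian factors W * tanh'(c), each at most |W| = 0.9,
# so it shrinks geometrically (vanishing gradient).
W, c, rnn_grad = 0.9, 0.5, 1.0
for _ in range(T):
    rnn_grad *= W * tanh_prime(c)
    c = W * math.tanh(c)  # also roll the state forward

# LSTM cell-state line, c_t = c_{t-1} * f + ...: the corresponding factor
# is just the forget gate f, which can be kept close to 1.
f, lstm_grad = 0.99, 1.0
for _ in range(T):
    lstm_grad *= f

print(rnn_grad < 0.01, lstm_grad > 0.5)  # -> True True
```

After 50 steps the RNN gradient is below 0.9^50 ≈ 0.005, while the LSTM's is still 0.99^50 ≈ 0.6, which is exactly the "marked difference" the slides point at.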