THESIS TASKS

Student name: Dao Duy Phuong, Student ID: 15151199
Student name: Phan Vo Thanh Lam, Student ID: 15151173
Major: Automation and Control Engineering Technology
Program: Full-time program
School year: 2015 - 2019
Class: 151511C

I. THESIS NAME: VISION-BASED NAVIGATION OF AUTONOMOUS CAR USING CONVOLUTIONAL NEURAL NETWORK

II. TASKS:
INITIAL FIGURES AND DOCUMENTS:
CONTENT OF THE THESIS:
- In this thesis we researched and built a car based on the theory of autonomous vehicles, and we provide a general survey of this field, including the datasets and computer-vision methods used for autonomous vehicles.
- A Convolutional Neural Network is applied directly to raw input images to output a predicted steering angle and to detect traffic signs (Left, Right, Stop) and a "Car" object.

III. RECEIVED DATE:
IV. THESIS COMPLETED DATE:
V. ADVISOR: My-Ha Le, PhD

Ho Chi Minh, July 02, 2019
Advisor (signature)
Head of Department (signature)

SCHEDULE

Student name: Dao Duy Phuong, Student ID: 15151199
Student name: Phan Vo Thanh Lam, Student ID: 15151173
Major: Automation and Control Engineering Technology
Program: Full-time program
School year: 2015 - 2019
Class: 151511C

THESIS NAME: VISION-BASED NAVIGATION OF AUTONOMOUS CAR USING CONVOLUTIONAL NEURAL NETWORK

- 01/02/2019 - 07/02/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Register the topic with the instructor.
- 08/03/2019 - 15/03/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): List the components to buy (RC car platform, driver, camera, battery, servo, Raspberry Pi, ...).
- 16/03/2019 - 01/04/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Research the theory of Convolutional Neural Networks (CNN) and Deep Learning (DL).
- 02/04/2019 - 20/04/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Design the PCB and connect all components on the RC car platform; study how to control and use the embedded computers (Raspberry Pi and Jetson Nano).
- 21/04/2019 - 02/05/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Collect data for training; study how to train the classification and object detection models.
- 03/05/2019 - 20/05/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Program the servo and motor control and the communication between the Raspberry Pi, Jetson Nano Kit and Arduino; train on the data and export the model.
- 21/05/2019 - 01/06/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Combine all components (hardware and software).
- 02/06/2019 - 20/06/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Operate in the outdoor environment and tune the system for its best operation.
- 21/06/2019 - 01/07/2019 (Dao Duy Phuong, Phan Vo Thanh Lam): Write the technical report.

ADVISOR (signature)

ASSURANCE STATEMENT

We hereby certify that this topic was implemented by ourselves, drawing on previously published documents. We have not reused or copied from any document without citing it.

Authors

ACKNOWLEDGEMENT

First and foremost, we would like to thank our thesis advisor, Dr. Le My Ha, for his help and advice throughout the implementation of this topic. His guidance was an important factor in the success of this project. Second, we would like to thank the Faculty of Electrical and Electronics and the teachers who have taught and mentored us throughout the past school years and passed their knowledge on to us; during the thesis we received a great deal of help, many suggestions, and enthusiastic advice from them. Third, we would like to thank the Intelligent System Laboratory (IS Lab) for its support with facilities and for enabling us to carry out the thesis. Finally, we would like to thank our families and friends, as well as our classmates, who always stood by and supported us while we implemented this topic.
Although we have tried our best during the past time to complete this topic, shortcomings surely remain because of our limited knowledge and the constraints on research content and time. We sincerely thank you.

Ho Chi Minh City, July 02, 2019
Dao Duy Phuong
Phan Vo Thanh Lam

ADVISOR'S COMMENT SHEET

Student name: Dao Duy Phuong, Student ID: 15151199
Student name: Phan Vo Thanh Lam, Student ID: 15151173
Major: Automation and Control Engineering Technology
Program: Full-time program
School year: 2015 - 2019
Class: 151511C

About the thesis contents: The students have implemented a final project that satisfies the requirements of the undergraduate program.
Advantages:
- The system operated stably in outdoor environments.
- Road signs are recognized with high accuracy.
Disadvantage: The system has not been tested in different environments.
Propose defending the thesis? Permit the students to present the thesis.
Rating: Excellent. Mark: 9.8 (in writing: nine point eight).
Ho Chi Minh City, July 02, 2019
Advisor

REVIEWER'S COMMENT SHEET

Student name: Dao Duy Phuong, Student ID: 15151199
Student name: Phan Vo Thanh Lam, Student ID: 15151173
Major: Automation and Control Engineering Technology
Program: Full-time program
School year: 2015 - 2019
Class: 151511C

About the thesis contents: The thesis meets the structural and presentation requirements for a graduation project. Its content is the generation of control signals for a self-driving car from camera images in real time. The authors implemented two parts: a classification model and an object detection model, using a Raspberry Pi for the classification model and a Jetson Nano for the object detection model. The algorithms presented in the thesis include an algorithm for building the dataset in real time and a training algorithm for a CNN-type neural network. The conclusions, evaluations, and remarks the authors draw from the study are not yet convincing, since there is no comparison with other research and no quantitative evaluation data; this is the weakness of the thesis.
Opinions - Conclusions:
- Rewrite the overview, paying particular attention to recent research.
- To obtain comparative figures that highlight the strengths and weaknesses of the proposed technique, the authors need to analyze related studies and the results they achieved, and to carry out experiments and measurements on the system. This is the basis for rewriting the conclusion.
- The citations are not yet appropriate and need adjustment, and the references do not follow the required format and must be corrected.
Rating: Mark: 7.1 (in writing: seven point one).
Ho Chi Minh City, July 10, 2019
Reviewer: Quách Thanh Hải

CONTENTS

THESIS TASKS i
SCHEDULE ii
ASSURANCE STATEMENT iv
ACKNOWLEDGEMENT v
ADVISOR'S COMMENT SHEET vi
REVIEWER'S COMMENT SHEET vii
CONTENTS viii
ABBREVIATIONS AND ACRONYMS x
LIST OF FIGURES xii
LIST OF TABLES xv
ABSTRACT xvi
CHAPTER 1: OVERVIEW 1
1.1 INTRODUCTION
1.2 BACKGROUND AND RELATED WORK
1.2.1 Overview about Autonomous Car
1.2.2 Literature Review and Other Study 3
1.3 OBJECTIVES OF THE THESIS
1.4 OBJECT AND RESEARCHING SCOPE
1.5 RESEARCHING METHOD
1.6 THE CONTENT OF THESIS
CHAPTER 2: THE PRINCIPLE OF SELF-DRIVING CARS 10
2.1 INTRODUCTION OF SELF-DRIVING CARS 10
2.2 DIFFERENT TECHNOLOGIES USED IN SELF-DRIVING CARS 11
2.2.1 Laser 11
2.2.2 Lidar 12
2.2.3 Radar 15
2.2.4 GPS 16
2.2.5 Camera 16
2.2.6 Ultrasonic Sensors 17
2.3 OVERVIEW ABOUT ARTIFICIAL INTELLIGENCE 18
2.3.1 Artificial Intelligence 18
2.3.2 Machine Learning 19
2.3.3 Deep Learning 21
CHAPTER 3: CONVOLUTIONAL NEURAL NETWORK 24
3.1 INTRODUCTION 24
3.2 STRUCTURE OF CONVOLUTIONAL NEURAL NETWORKS 24
3.2.1 Convolution Layer 25
3.2.2 Activation function 27
3.2.3 Stride and Padding 28
3.2.4 Pooling Layer 29
3.2.5 Fully-Connected layer 30
3.3 NETWORK ARCHITECTURE AND PARAMETER OPTIMIZATION 31
3.4 OBJECT DETECTION 32
3.4.1 Single Shot Detection framework 32
3.4.2 MobileNet Architecture 34
3.4.3 Non-Maximum Suppression 38
3.5 OPTIMIZE NEURAL NETWORKS 39
3.5.1 Types of Gradient Descent 39
3.5.2 Types of Optimizer 40
CHAPTER 4: HARDWARE DESIGN OF SELF-DRIVING CAR PROTOTYPE 43
4.1 HARDWARE COMPONENTS 43
4.1.1 1/10 Scale 4WD Off Road Remote Control Car Buggy Desert 43
4.1.2 Brushed Motor RC-540PH 44
4.1.3 Motor control module BTS7960 45
4.1.4 RC Servo MG996 47
4.1.5 Raspberry Pi Model B+ 47
4.1.6 NVIDIA Jetson Nano Developer Kit 50
4.1.7 Camera Logitech C270 53
4.1.8 Arduino Nano 54
4.1.9 Lipo Battery 2S-30C 2200mAh 56
4.1.10 Voltage reduction module 57
4.1.11 USB UART PL2303 59
4.2 HARDWARE WIRING DIAGRAM 59
4.2.1 Construct The Hardware Platform 60
4.2.2 PCB Of Hardware 61
CHAPTER 5: CONTROL ALGORITHMS OF SELF-DRIVING CAR PROTOTYPE 62
5.1 CONTROL THEORY 62
5.1.1 Servo Control Theory 62
5.1.2 UART Communication Theory 64
5.2 FLOWCHART OF COLLECTING TRAINING DATA 67
5.3 FLOWCHART OF NAVIGATING THE CAR USING TRAINED MODEL 68
CHAPTER 6: EXPERIMENTS 69
6.1 EXPERIMENTAL ENVIRONMENTS 69
6.2 COLLECT DATA 70
6.3 DATA AUGMENTATIONS 71
6.4 TRAINING PROCESS 72
6.5 OUTDOOR EXPERIMENTS RESULTS 75
CHAPTER 7: CONCLUSION AND FUTURE WORK 77
REFERENCES 79

ABBREVIATIONS AND ACRONYMS

ADAS: Advanced Driving Assistance System
CCD: Charge-Coupled Device
CMOS: Complementary Metal-Oxide Semiconductor
CNN: Convolutional Neural Network
DSP: Digital Signal Processing
FOV: Field of View
FPGA: Field-Programmable Gate Array
GPS: Global Positioning System
GPIO: General-Purpose Input-Output
GPU: Graphics Processing Unit
IMU: Inertial Measurement Unit
LIDAR: Light Detection And Ranging
PAS: Parking Assistance System
PCB: Printed Circuit Board
PWM: Pulse Width Modulation
RADAR: Radio Detection And Ranging
RC: Radio Controlled
RNN: Recurrent Neural Network
SOCs: Systems-On-a-Chip
UV: Ultraviolet
4WD: Four-Wheel Drive
WP: Waterproof
YAG: Yttrium Aluminum Garnet

The transmitting UART adds the start bit, parity bit, and stop bit(s) to the data frame.

Figure 5.5 Data frame of the transmitting UART

The entire packet is sent serially from the transmitting UART to the receiving UART, which samples the data line at the preconfigured baud rate.

Figure 5.6 Transmitting and receiving UART

The receiving UART discards the start bit, parity bit, and stop bit from the data frame.

Figure 5.7 Data frame of the receiving UART

The receiving UART converts the serial data back into parallel form and transfers it to the data bus on the receiving end.

Figure 5.8 Converting the serial data back into parallel

5.2 FLOWCHART OF COLLECTING TRAINING DATA

Figure 5.9 Flowchart of collecting training images

As shown in Figure 5.9, when we begin collecting training data, we use a PS2 gamepad to control the car. After running the script on the Raspberry Pi, we drive the car, and the Raspberry Pi captures images while the R2 button on the PS2 is not pressed. Each image is saved to the folder corresponding to the steering angle received from the PS2.

5.3 FLOWCHART OF NAVIGATING THE CAR USING TRAINED MODEL

Figure 5.10 Flowchart of navigating the car using the trained model

To begin this process, the predict script is run on the Raspberry Pi and the Jetson Nano. The car runs, captures an image, and feeds it to the CNN model, which predicts a steering angle; the car then navigates itself with the steering angle received from the model. A sketch of such a drive loop is given below.
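The thesis scripts themselves are provided in the attached files. As an illustration only, the following minimal Python sketch shows how such a loop could look; the model file name, the 64x64 network input size, the serial port and baud rate, and the newline-terminated ASCII command format to the Arduino are all assumptions rather than the thesis' actual protocol, while the seven-class angle mapping follows Figure 6.12.

```python
import cv2                      # pip install opencv-python
import numpy as np
import serial                   # pip install pyserial
from tensorflow.keras.models import load_model

ANGLES = [100, 110, 120, 130, 140, 150, 160]   # seven steering classes (Figure 6.12)

model = load_model("steer_model.h5")           # trained classification CNN (name assumed)
arduino = serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=0.1)
cap = cv2.VideoCapture(0)                      # Logitech C270

try:
    while True:
        ok, frame = cap.read()                 # grab one camera frame
        if not ok:
            continue
        x = cv2.resize(frame, (64, 64)).astype(np.float32) / 255.0
        probs = model.predict(x[np.newaxis], verbose=0)[0]   # softmax scores
        angle = ANGLES[int(np.argmax(probs))]  # class index -> servo angle
        # pyserial is handed only the payload; the UART hardware adds the
        # start, parity and stop bits described in Section 5.1.2.
        arduino.write(f"{angle}\n".encode("ascii"))
finally:
    cap.release()
    arduino.close()
```

Sending one short ASCII line per control period keeps the Arduino-side parser trivial and makes the link easy to debug with a serial monitor.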
CHAPTER 6: EXPERIMENTS

In this chapter, the data used to train the network architecture of Chapter 3 and the experimental results are presented. This includes (1) the experimental environments used to collect data, (2) data collection, (3) methods to augment the data, (4) the training process, and (5) the results of the training process and experiments.

6.1 EXPERIMENTAL ENVIRONMENTS

Figure 6.1 The overall oval-shaped lined track (50 cm wide)

Figure 6.2 Lined track and traffic-sign recognition

To collect the training data, we manually drove the RC car platform to record timestamped images and control commands on two outdoor tracks created on an asphalt plane. The experimental environment has a black background and white lines: a 50-centimeter-wide course with borders made of 10-centimeter-wide tape, shown in Figures 6.1 and 6.2. Three traffic signs are used for navigation in this environment: Left, Right, and Stop (shown in Figure 6.3).

Figure 6.3 Traffic signs: (a) Left, (b) Right, (c) Stop

6.2 COLLECT DATA

Data preparation is required when working with deep learning networks. The Raspberry Pi records the images and driving information while the user manually drives the car around the track at 4-5 km/h. The collected data contains over 10,000 images coupled with steering angles. The original resolution of the images is 480x640. The camera is configured to capture at 10 frames per second with an exposure time of 5000 us, to prevent the blur caused by elastic vibration when the car drives on the track. Sample images of this dataset are shown in Figure 6.4.

Figure 6.4 Some typical images of the dataset

6.3 DATA AUGMENTATIONS

Deep learning models tend to overfit small datasets: with too few examples to train on, the resulting model generalizes poorly. Data augmentation manipulates the incoming training data to generate more training instances, creating new samples via random transformations of existing ones. This boosts the size of the training set and reduces overfitting. Common transformations, such as horizontal flips and brightness adjustment, are illustrated in Figure 6.5. Data augmentation is performed only on the training data, never on the validation or test set.

Horizontal Flip

The model needs to learn to steer correctly whether the car is on the left or right side of the road. Therefore, we apply a horizontal flip to a proportion of the images and correspondingly invert the original steering angle.

Figure 6.5 Horizontal flip: (a) Original image, (b) Horizontally flipped image

Brightness Augmentation

Brightness is randomly changed to simulate different lighting conditions, since some parts of the tracks are much darker or lighter due to shadows and sunlight. We generate augmented images with different brightness by converting images to HSV, scaling the V channel up or down, and converting back to the RGB color space.

Figure 6.6 Brightness augmentation: (a) Original image, (b) Brighter image, (c) Darker image

Both transformations are sketched in code below.
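This is a minimal sketch, assuming OpenCV images in BGR channel order and steering angles in the 100-160 range of Figure 6.12 with 130 taken as straight ahead; the brightness scale range is illustrative, not a value from the thesis.

```python
import cv2
import numpy as np

def horizontal_flip(image, angle, center=130):
    # Mirror the frame left-right and reflect the steering angle about the
    # straight-ahead value, so a left turn becomes an equal right turn.
    return cv2.flip(image, 1), 2 * center - angle

def random_brightness(image, low=0.5, high=1.5):
    # Convert to HSV, scale only the V (value) channel to simulate sun and
    # shadow, then convert back (OpenCV works in BGR rather than RGB).
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    v = hsv[..., 2].astype(np.float32) * np.random.uniform(low, high)
    hsv[..., 2] = np.clip(v, 0, 255).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# Example: a frame labelled 110 (left of center) becomes 150 when flipped.
# flipped_img, flipped_angle = horizontal_flip(img, 110)
```

Because both functions are cheap, they can be applied on the fly to each training batch, so no augmented images need to be stored on disk.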
6.4 TRAINING PROCESS

The stored data is then copied to a desktop computer equipped with a Core i7-8750H (2.2 GHz) and 16 GB of RAM, on which we trained the network to accelerate training. For the classification model, we split the dataset into two subsets: a training set (80%) and a test set (20%); the training set contains 15,000 samples and the test set 3,000 samples. For the object detection model, we labelled 1,000 images, 250 per category, and split the data into a training set (85%) and a test set (15%). We used a Training App to load the dataset and make the training process easier to run. This app was built on the C# platform by Dao Duy Phuong, and its GUI is shown in Figure 6.7.

Figure 6.7 GUI of the Training App

We set the parameters as follows:
- Number of epochs: 100. The number of passes through the whole training dataset.
- Minibatch: 32. The number of samples used in one iteration.
- Optimizer: Adam. The optimization algorithm used to update the network weights.
- Loss: categorical cross-entropy. The loss function measures how wrong the predictions are; this one is used for multi-class classification.
- Momentum: 0.9. Helps reduce training time.
- Learning rate: 0.0001. A hyperparameter that controls how much the network weights are adjusted with respect to the loss gradient.
- Decay: 0.009. A hyperparameter from which the learning rate is recalculated each epoch as lrate = initial_lrate / (1 + decay * epoch).
- Nesterov: True. A hyperparameter that helps reduce training time.
- Epsilon: None. When None, the default fuzz factor used in numeric expressions is applied.
- Transfer Learning: if True, training continues from the weights of a previously trained model; if False, the weights are randomly initialized.

A sketch of how these settings map onto training code follows.
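As an illustration only, the following sketch shows roughly how these settings map onto Keras training code of the period (the thesis trained through the C# Training App instead). The stand-in CNN, the 64x64 input size, and the placeholder arrays are assumptions; note that Keras applies the decay schedule per update step rather than per epoch, and that momentum 0.9 with Nesterov=True would configure Keras' SGD optimizer rather than Adam.

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import Adam

# Toy stand-in network; the real architecture is the Chapter 3 CNN.
model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(7, activation="softmax"),        # seven steering classes
])

# `decay` reproduces the quoted schedule lrate = initial_lrate / (1 + decay * t).
model.compile(optimizer=Adam(learning_rate=1e-4, decay=0.009),
              loss="categorical_crossentropy",    # the multi-class loss above
              metrics=["accuracy"])

# Placeholder arrays standing in for the 80%/20% split described above.
x_train = np.zeros((32, 64, 64, 3), dtype=np.float32)
y_train = np.eye(7, dtype=np.float32)[np.random.randint(0, 7, size=32)]
model.fit(x_train, y_train, batch_size=32, epochs=1)  # epochs=100 in the thesis
```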
Figure 6.8 Model under training

Figure 6.9 The visualization output of convolutional layers: (a) an originally selected frame; (b), (c), (d), (e) and (f) the feature maps at the first five convolutional layers

Figure 6.10 Change in loss value throughout training

Figure 6.11 Change in accuracy value throughout training

6.5 OUTDOOR EXPERIMENTS RESULTS

Once trained on the desktop computer, the model was copied back to the Raspberry Pi. The network is then used by the car's main controller, which feeds it image frames from the camera as input. In each control period, 10 images per second can be processed, and the car's top speed around curves is about 5-6 km/h. Representative testing images and the steering angles predicted by the model can be seen in Figure 6.12. The trained model achieved a reasonable degree of accuracy: the car successfully navigates itself on both tracks under diverse driving conditions, regardless of whether lane markings are present.

Figure 6.12 Experimental results: the top row shows input images and the bottom row the model's outputs after the softmax function. (a) Steer is 100, (b) 110, (c) 120, (d) 130, (e) 140, (f) 150, (g) 160

Figure 6.13 The actual and predicted steering wheel angles of the models

Figure 6.14 The outputs of the object detection model

CHAPTER 7: CONCLUSION AND FUTURE WORK

The paper discussed in Chapter 1 used a single model (a classification model); it ran well in the lane at 5-6 km/h with a model accuracy of 92.38%, but it could not identify and localize traffic signs with bounding boxes, so we decided to add an object detection model. In this thesis, we presented an autonomous car platform based on state-of-the-art AI technology: end-to-end deep learning based real-time control. This work addresses a problem in computer vision that aims to drive a car autonomously, solely from its camera's visual observations. Two models were used in this thesis: a classification model (training accuracy 96.7%, an improvement over the paper discussed in Chapter 1) and an object detection model (training accuracy 76.3%). We ended up with a system able to power our car to drive itself on both tracks at 5-6 km/h. This encouraging result shows that it is possible to use a deep convolutional neural network to predict the vehicle's steering angles directly from camera input data in real time.

Despite the complexity of the neural network, embedded computing platforms like the Raspberry Pi and the NVIDIA Jetson Nano Kit are powerful enough to support vision-based, end-to-end deep learning real-time control applications. Because of the network's complexity, each embedded computer can handle only one model, so the Raspberry Pi processes the classification model and the Jetson Nano processes the object detection model.

We found data to be very important for a good model: a large number of images must be collected. Without all those images and steering angles, along with their potentially infinite augmentations, we would not have been able to build a robust model.

The issue we observed in training and testing the network is camera latency, defined as the period from the time the camera sensor observes the scene to the time the computer actually reads the digitized image data. Unfortunately, this time can be significantly long depending on the camera and the performance of the Pi and the Jetson: about 100-120 milliseconds in our case. This is remarkably higher than the latency of human perception, which is known to be on the order of tens of milliseconds. Higher camera latency could negatively affect control performance, especially in safety-critical applications, because the deep neural network would be analyzing stale scenes.

There are many areas we could explore to push this project further and obtain even more convincing results. In the future, we will continue to investigate ways to achieve better prediction accuracy when training the network, identify and use low-latency cameras, and improve the performance of the RC car platform, especially precise steering-angle control. Furthermore, we plan to take throttle into the model, with the ambition of achieving higher levels of vehicle autonomy. Most importantly, we will research how to use only one model, one computer, and one camera on our car.

REFERENCES

[1] LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541-551.
[2] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[3] Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., & Darrell, T. (2014). DeCAF: A deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning (pp. 647-655).
[4] Fei-Fei, L., Fergus, R., & Perona, P. (2007). Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. Computer Vision and Image Understanding, 106(1), 59-70.
[5] Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., & Zhang, X. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
[6] Karaman, S., Anders, A., Boulet, M., Connor, J., Gregson, K., Guerra, W., & Vivilecchia, J. (2017). Project-based, collaborative, algorithmic robotics for high school students: Programming self-driving race cars at MIT. In 2017 IEEE Integrated STEM Education Conference (ISEC) (pp. 195-203). IEEE.
[7] Otterness, N., Yang, M., Rust, S., Park, E., Anderson, J. H., Smith, F. D., Berg, & Wang, S. (2017). An evaluation of the NVIDIA TX1 for supporting real-time computer-vision workloads. In IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) (pp. 353-364). IEEE.
[8]
Do, T.-D., Duong, M.-T., Dang, Q.-V., & Le, M.-H. (2018). Real-time self-driving car navigation using deep neural network. In 4th International Conference on Green Technology and Sustainable Development (GTSD 2018).
[9] Tian, Z., Shen, C., Chen, H., & He, T. (2019). FCOS: Fully convolutional one-stage object detection. arXiv preprint arXiv:1904.01355.
[10] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. arXiv preprint arXiv:1512.02325.
[11] Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.
[12] Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
[13] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2015). Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567.
[14] Jin, J., Dundar, A., & Culurciello, E. (2014). Flattened convolutional neural networks for feedforward acceleration. arXiv preprint arXiv:1412.5474.
[15] Cireşan, D., Meier, U., & Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. arXiv preprint arXiv:1202.2745.
[16] O'Shea, K., & Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.
[17] Padmanabhan, T. R. Programming with Python. Springer.
[18] Vu Huu Tiep. Machine Learning Cơ Bản [Basic Machine Learning]. Science and Technics Publishing House.