Design of Advanced Driver Assistance System Based on Deep Learning (HCMUTE Graduation Project)


MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION

GRADUATION PROJECT
COMPUTER ENGINEERING TECHNOLOGY

DESIGN OF ADVANCED DRIVER ASSISTANCE SYSTEM BASED ON DEEP LEARNING

LECTURER: LE MINH THANH, M.Eng.
STUDENT: THAI HOANG MINH TAM
SKL010587

Ho Chi Minh City, December 2022

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT

DESIGN OF ADVANCED DRIVER ASSISTANCE SYSTEM BASED ON DEEP LEARNING

Student: THAI HOANG MINH TAM
ID: 18119040
Major: COMPUTER ENGINEERING TECHNOLOGY
Advisor: LE MINH THANH, M.Eng.

Ho Chi Minh City, December 2022

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, December 25, 2022

PRE-DEFENSE EVALUATION SHEET

Student name: Thái Hoàng Minh Tâm
Student ID: 18119040
Major: Computer Engineering Technology
Class: 18119CLA1
Project title: Design of Advanced Driver Assistance System Based on Deep Learning
Name of Reviewer:

EVALUATION
Content and workload of the project:
Strengths:
Weaknesses:
Approval for oral defense? (Approved or denied):
Overall evaluation (Excellent, Good, Fair, Poor):
Mark: ……………… (In words: ………………)

Ho Chi Minh City, December 25, 2022
REVIEWER
(Sign with full name)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, December 25, 2022

EVALUATION SHEET OF DEFENSE COMMITTEE MEMBER

Student name: Thái Hoàng Minh Tâm
Student ID: 18119040
Major: Computer Engineering Technology
Class: 18119CLA1
Project title: Design of Advanced Driver Assistance System Based on Deep Learning
Name of Defense Committee Member:

EVALUATION
Content and workload of the project:
Strengths:
Weaknesses:
Overall evaluation (Excellent, Good, Fair, Poor):
Mark: ……………… (In words: ………………)

Ho Chi Minh City, December 25, 2022
COMMITTEE MEMBER
(Sign with full name)

Acknowledgment

I would like to express my deepest gratitude to my advisor, Mr. Le Minh Thanh, M.Eng., for the many interesting weekly technical discussions that helped me find research directions, and for his advice on technical issues. His patience and devotion to teaching have helped me become a better student in terms of both knowledge and personality, not only during the development of this graduation project but throughout my academic years.

I would like to extend my sincere thanks to the teachers of the Faculty for High Quality Training and the Faculty of Electrical and Electronics Engineering for the advanced knowledge and experience they shared through every course. I am also grateful to my colleagues, who created the best conditions for me to complete the graduation project. Additionally, this endeavor would not have been possible without BOSCH Global Software Technologies Company, which financed my research.

Lastly, I would be remiss not to mention my family and friends. Their belief in me and their support have kept my spirits and motivation high throughout this process.

Disclaimer

This thesis is the result of my own study, evaluation, and implementation. All texts, whether quoted directly or paraphrased, are indicated by in-text citations. Full bibliographic details are given in the reference list; URLs are included for internet sources.

Thái Hoàng Minh Tâm

Table of Contents

List of Figures
List of Tables
Abstract
List of Abbreviations
CHAPTER 1: INTRODUCTION
1.1 OVERVIEW
1.2 GOALS
1.3 LIMITATIONS
1.4 OUTLINES
CHAPTER 2: LITERATURE REVIEW
2.1 DEEP LEARNING
2.1.1 Convolutional Neural Network
2.1.2 Convolutional Layer
2.1.3 Pooling Layer
2.1.4 Fully Connected Layer
2.2 OBJECT DETECTION
2.2.1 Two-Stage Object Detection
2.2.2 One-Stage Object Detection
2.3 YOLOv6 OBJECT DETECTION ARCHITECTURE
2.3.1 RepVGG Backbone
2.3.2 RepPAN Neck
2.3.3 Decoupled Head
CHAPTER 3: SYSTEM DESIGN
3.1 OVERALL SYSTEM
3.2 COMPARISON OF OBJECT DETECTION MODELS
3.3 TRAFFIC SIGN RECOGNITION
3.3.1 TSR Overview
3.3.2 Training Process
3.3.3 TSR Algorithm
3.4 FORWARD COLLISION WARNING

3.7 ELECTRICAL/ELECTRONIC ARCHITECTURE

3.7.1 Overview

The introduction of the ECU into the automobile industry has advanced vehicle electrification and mechatronics. The ECU's functions have evolved from managing engine operation to controlling the chassis, electronic components, and in-car entertainment and networking devices. Currently, one or more ECUs regulate each vehicle feature. Electronic controllers have increased dramatically in recent years with the growth of fuel-saving, safety, comfort, and entertainment demands; a premium car nowadays contains more than 100 ECUs.

The ECU is built around a microcontroller unit (MCU) and an embedded system. The embedded system is a microcomputer, while the MCU is mostly used for control rather than computation. As a result, a single ECU can only perform data-intensive computation and control functions such as engine control, battery management, and motor control. The toughest future challenge for vehicle development will be the increasing demand for data processing and computing speed, whether from intelligent connectivity or autonomous driving technology. Developing driver assistance technologies, in particular, generates difficult logical processes and unstructured data-processing scenarios. The computational demand of ADAS software has already reached 10 TOPS (tera operations per second), and that of autonomous driving software is predicted to approach 100 TOPS, which the existing computing capacity of microcomputers cannot handle.

This work therefore uses a high-performance vehicle computer, the NVIDIA Jetson AGX Orin, shown in Figure 3.35, to handle the essential ADAS applications and a graphical user interface. It contains a 2048-core NVIDIA Ampere architecture GPU, a 12-core Arm Cortex-A78AE 64-bit CPU, and 32 GB of LPDDR5 RAM, and delivers up to 275 TOPS at a maximum power of 50 W while keeping power efficiency in a small footprint, with high-speed interfaces to support deep learning research [31]. In addition, a low-cost camera captures 1280×720 images and transfers them to the vehicle computer through a USB port. The output of the ADAS software is displayed on a 1920×1080 screen through the DisplayPort interface.

Figure 3.35: Connection diagram of the proposed ADAS

3.7.2 Hardware Utilization

This project has to cope with several multithreading challenges for the ADAS system to achieve real-time performance on a low-power edge device in a vehicle while preserving GUI responsiveness. Thus, numerous evaluations based on sequential and concurrent execution were performed. According to the data presented in Table 3.9, concurrent execution outperforms sequential execution by 156.83% in begin-to-end throughput (measured from the application threads receiving the input frame to the last block rendering the output to the GUI). Currently, there is a lack of supported software for precise statistics of hardware resources on embedded devices; however, by observing the system's execution process and selecting peak parameters, the concurrent execution uses 61% of GPU resources. As a result, the system satisfies its real-time requirement with an FPS above 60, while the most power-consuming component in the vehicle computer, the GPU, draws only 12.8 W. A minimal sketch of this concurrent pattern is given after Table 3.9.

Table 3.9: Comparison of performance and hardware utilization between sequential and concurrent programming

Model                  Sequential          Concurrent
YOLOv6s FCW (FPS)      179.70              149.40 (-16.86%)
YOLOv6s TSR (FPS)      179.60              150.80 (-16.03%)
UFLD (FPS)             365.90              234.30 (-35.97%)
Begin to end (FPS)     27.8                71.4 (+156.83%)
Utilization / Power    56% / 10.68 W       61% / 12.8 W
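To make the concurrency design concrete, the fragment below is a minimal Python sketch of the pattern measured in Table 3.9, not the project's actual implementation: the model objects with an .infer(frame) method are hypothetical placeholders, and single-slot queues ensure every model always works on the newest camera frame.

```python
# Minimal sketch of the concurrent pipeline, assuming hypothetical model
# wrappers with an .infer(frame) method; not the project's actual code.
import queue
import threading

import cv2  # OpenCV, used here only to read frames from the camera

WORKERS = ("fcw", "tsr", "ufld")
in_qs = {w: queue.Queue(maxsize=1) for w in WORKERS}   # newest frame per model
out_qs = {w: queue.Queue(maxsize=1) for w in WORKERS}  # latest result per model


def put_latest(q, item):
    """Drop any stale item so consumers always see the newest data."""
    try:
        q.get_nowait()
    except queue.Empty:
        pass
    q.put(item)


def capture_loop(source=0):
    """Read frames and fan each one out to all three worker queues."""
    cap = cv2.VideoCapture(source)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        for w in WORKERS:
            put_latest(in_qs[w], frame)


def worker_loop(name, model):
    """Run one deep learning model in its own thread."""
    while True:
        frame = in_qs[name].get()                     # block until a fresh frame
        put_latest(out_qs[name], model.infer(frame))


# The GUI thread polls out_qs and overlays the three results, so a slow model
# never stalls the capture loop or the interface:
#   threading.Thread(target=capture_loop, daemon=True).start()
#   for name, model in models.items():
#       threading.Thread(target=worker_loop, args=(name, model), daemon=True).start()
```

Running the three networks in parallel rather than back to back is what lifts the begin-to-end rate from 27.8 FPS to 71.4 FPS, even though each individual model runs somewhat slower under contention.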
CHAPTER 4: RESULTS

4.1 SIMULATION RESULTS

The proposed ADAS chose state-of-the-art object detection and lane detection models by analyzing scientific research and building a benchmark among the models on a customized dataset. Furthermore, this study investigates how to deploy the selected models most efficiently, using special strategies during training and concurrent programming so that three heavy computer vision and deep learning tasks run swiftly alongside a responsive GUI. The simulation results obtained when the ADAS software runs on 1280×720 dashcam footage, as shown in Figure 4.1, are given in the following sections.

Figure 4.1: Example frames from dashcam footage to simulate the ADAS software

4.1.1 Traffic Sign Recognition Results

The YOLOv6s FCW and TSR models accurately detect and recognize vehicles and signs in most frames. The speed limit 60 sign in Figure 4.2 was detected with great precision (92%) but was not present in the proposed region (light-blue boxes); as a result, the indication is not displayed on the GUI. In Figure 4.3, the detected speed limit 60 sign is present in the proposed region, so the traffic sign is displayed to the driver through the GUI.

Figure 4.2: 1st sample result of the traffic sign recognition during inference

Figure 4.3: 2nd sample result of the traffic sign recognition during inference

4.1.2 Forward Collision Warning Results

The YOLOv6s FCW model detects the vehicle in Figure 4.4 with high precision (95%), and the vehicle's coordinate in the frame is inside the caution zone; therefore, the warning is shown in the inference result frame and the notification "collision warning" appears on the GUI. In Figure 4.5, the vehicle is precisely detected and its coordinate in the frame is within the danger zone; therefore, the danger alert is raised and shown in the inference result frame, with the notification "danger ahead" on the GUI.

Figure 4.4: 1st sample result of the forward collision warning during inference

Figure 4.5: 2nd sample result of the forward collision warning during inference

4.1.3 Lane Departure Warning Results

The UFLD model accurately detects the lane when the lane markers are visible. As shown in Figure 4.6, there is no lane departure warning because the vehicle is heading straight. In Figure 4.7, the upper boundary (yellow line) of the caution zone serves as an indicator for easily observing the center of the lane (the LDW itself is still based on the off-center value), and the lane departure notification "driving off lane" appears on the GUI because the car is leaning to the right. A simplified sketch of these zone and off-center tests is given below.

Figure 4.6: 1st sample result of the lane departure warning

Figure 4.7: 2nd sample result of the lane departure warning during inference
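The decision logic behind the FCW and LDW outputs can be illustrated with a short Python sketch. The polygon coordinates and the off-center threshold below are hypothetical illustration values for a 1280×720 frame, not the thesis's actual zone geometry, which is defined by the overlays shown in the figures.

```python
# Simplified sketch of the FCW zone test and the LDW off-center test.
# DANGER_ZONE, CAUTION_ZONE, and OFF_CENTER_LIMIT are hypothetical values.
import cv2
import numpy as np

DANGER_ZONE = np.array(
    [[560, 720], [720, 720], [680, 560], [600, 560]], np.float32
).reshape(-1, 1, 2)
CAUTION_ZONE = np.array(
    [[420, 720], [860, 720], [760, 480], [520, 480]], np.float32
).reshape(-1, 1, 2)
OFF_CENTER_LIMIT = 80  # pixels; hypothetical LDW threshold


def fcw_state(box_xyxy):
    """Classify a detected vehicle by its bounding box's bottom-center point."""
    x1, y1, x2, y2 = box_xyxy
    pt = ((x1 + x2) / 2.0, float(y2))
    if cv2.pointPolygonTest(DANGER_ZONE, pt, False) >= 0:
        return "danger ahead"
    if cv2.pointPolygonTest(CAUTION_ZONE, pt, False) >= 0:
        return "collision warning"
    return "safe"


def ldw_state(lane_xs, frame_width=1280):
    """Warn when the detected lane center drifts too far from the frame center."""
    off_center = float(np.mean(lane_xs)) - frame_width / 2.0
    return "driving off lane" if abs(off_center) > OFF_CENTER_LIMIT else "in lane"


# Example: a car whose box bottom sits at (640, 700) lies in the danger zone.
print(fcw_state((600, 520, 680, 700)))  # -> "danger ahead"
print(ldw_state([520, 560, 600, 640]))  # mean 580, 60 px off center -> "in lane"
```

The TSR proposed-region check described in Section 4.1.1 can reuse the same point-in-polygon test with the light-blue region as the polygon.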
4.2 EXPERIMENTAL RESULTS

This experiment changed the input to capture from a camera instead of taking direct input from dashcam footage, in order to examine environments close to real traffic scenarios. Figure 4.8 depicts the proposed ADAS setup, which includes (1) a TV used to display traffic scenarios in Vietnam, (2) a camera to capture images and transfer them to the Jetson Orin vehicle computer powered by an AC-DC adapter, (3) a monitor to display the GUI, and (4) a keyboard and mouse for user interaction. Items (3) and (4) may be replaced by a touchscreen in the future for greater convenience.

Figure 4.8: Two angles showing the experimental setup

The safe result of the system, displayed in Figure 4.9, demonstrates that practically all vehicles are detected with extremely high accuracy and that the lane detection system also detects and displays lanes correctly while the vehicle stays visible in the center of the lane. In contrast, in Figure 4.10 the vehicle tends to drift to the left, and a caution is displayed on the vehicle's GUI.

Figure 4.9: The safe result with multiple detected objects of the experiment

Figure 4.10: The lane departure warning result of the experiment

Figures 4.11 and 4.12 show that the FCW and TSR applications operate with the camera input in the same way as with dashcam footage as direct input. Thus, we may deduce that deploying the ADAS system to a real vehicle in a controlled environment would result in the proposed ADAS performing well.

Figure 4.11: The traffic sign recognition result of the experiment

Figure 4.12: The forward collision warning result of the experiment

CHAPTER 5: CONCLUSION AND FUTURE WORK

5.1 CONCLUSION

In conclusion, this work proposes three fundamental ADAS applications built on the cutting-edge object detector YOLOv6 and the lane detector UFLD, based on computer vision and deep learning, with a user-friendly graphical user interface. Furthermore, this study employs effective techniques for improving YOLOv6 performance by fine-tuning training and converting models to FP16 during inference for high speed while maintaining accuracy. In addition, this work provides a detailed benchmark among five YOLO models on a custom dataset targeted at traffic scenarios with over 18,832 labeled objects. The object detection models trained and fine-tuned on the custom TSR and FCW datasets achieve mAP50 scores of 88.6% and 82.1%, respectively. The ADAS system and GUI operate in real time at 71 frames per second while utilizing just 61% of the GPU's capacity. The simulation and experimental results show that the system has successfully satisfied the goals, with important driver assistance features such as warnings and instructions working intuitively, precisely, and elegantly through the GUI. The initial accomplishment of this ADAS software will open the path for future software based on vehicle E/E architecture to progressively advance toward autonomous cars.

5.2 FUTURE WORK

The following improvements will be made in the future to make this work more practical and to meet OEM expectations. Firstly, the FCW application can calculate and adapt the collision-warning distance and perform automatic braking when combined with dedicated sensors and ECUs on the vehicle. Secondly, the TSR and FCW applications can improve object detection performance on a wider variety of signs, vehicles, and pedestrians by preparing data for more traffic scenarios. Thirdly, UFLD can be enhanced further by replacing the ResNet18 backbone with a better model, such as RepVGG-A0 [13]. Finally, the models can be converted to INT8 precision with TensorRT for even faster inference [19]; a sketch of this conversion step is given below.
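The FP16 conversion used in this work and the INT8 conversion proposed above both go through TensorRT's engine builder [19]. The following Python sketch shows one plausible form of that step, assuming the detector has first been exported to ONNX; the file names are placeholders, not the project's actual artifacts, and INT8 additionally requires a calibration dataset.

```python
# Sketch of building an FP16 TensorRT engine from an exported ONNX model.
# "yolov6s.onnx" and "yolov6s_fp16.engine" are hypothetical file names.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("yolov6s.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):   # report why parsing failed
            print(parser.get_error(i))
        raise SystemExit("ONNX parsing failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)        # half precision, as in this work
# config.set_flag(trt.BuilderFlag.INT8)      # future work; needs a calibrator

engine = builder.build_serialized_network(network, config)
with open("yolov6s_fp16.engine", "wb") as f:
    f.write(engine)
```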
References

[1] S. Singh, "Critical reasons for crashes investigated in the National Motor Vehicle Crash Causation Survey," U.S. National Highway Traffic Safety Administration, DOT HS 812 506, Washington, DC, USA, Mar. 2018.
[2] L. Yue, M. Abdel-Aty, Y. Wu, and L. Wang, "Assessment of the safety benefits of vehicles' advanced driver assistance, connectivity and low level automation systems," Accident Analysis & Prevention, vol. 117, pp. 55-64, Aug. 2018.
[3] M. Hasenjäger, M. Heckmann, and H. Wersing, "A survey of personalization for advanced driver assistance systems," IEEE Transactions on Intelligent Vehicles, vol. 5, no. 2, pp. 335-344, June 2020, doi: 10.1109/TIV.2019.2955910.
[4] V. Dumoulin and F. Visin, "A guide to convolution arithmetic for deep learning," arXiv, 2016, https://doi.org/10.48550/arXiv.1603.07285.
[5] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv, 2018, https://doi.org/10.48550/arXiv.1804.02767.
[6] A. Bochkovskiy et al., "YOLOv4: Optimal speed and accuracy of object detection," arXiv, 2020, https://doi.org/10.48550/arXiv.2004.10934.
[7] G. Jocher, "YOLOv5 release v6.1," 2022. [Online]. Available: https://github.com/ultralytics/yolov5/releases/tag/v6
[8] C. Li et al., "YOLOv6: A single-stage object detection framework for industrial applications," arXiv, 2022, https://doi.org/10.48550/arXiv.2209.02976.
[9] C. Wang et al., "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," arXiv, 2022, https://doi.org/10.48550/arXiv.2207.02696.
[10] Z. Ge et al., "YOLOX: Exceeding YOLO series in 2021," arXiv, 2021, https://doi.org/10.48550/arXiv.2107.08430.
[11] X. Wu et al., "Recent advances in deep learning for object detection," arXiv, 2019, https://doi.org/10.48550/arXiv.1908.03673.
[12] M. Carranza-García, J. Torres-Mateo, P. Lara-Benítez, and J. García-Gutiérrez, "On the performance of one-stage and two-stage object detectors in autonomous vehicles using camera data," Remote Sensing, vol. 13, no. 1, p. 89, 2021, https://doi.org/10.3390/rs13010089.
[13] X. Ding et al., "RepVGG: Making VGG-style ConvNets great again," arXiv, 2021, https://doi.org/10.48550/arXiv.2101.03697.
[14] T. Lin et al., "Feature pyramid networks for object detection," arXiv, 2016, https://doi.org/10.48550/arXiv.1612.03144.
[15] C. Zhang et al., "PAN: Towards fast action recognition via learning persistence of appearance," arXiv, 2020, https://doi.org/10.48550/arXiv.2008.03462.
[16] Deloitte, "Autonomous driving," 2019. [Online]. Available: Deloitte_Autonomous-Driving.pdf
[17] National Highway Traffic Safety Administration, "Driver assistance technologies," 2022. [Online]. Available: https://www.nhtsa.gov/equipment/driver-assistance-technologies
[18] Z. Zhu, D. Liang, S. Zhang, X. Huang, B. Li, and S. Hu, "Traffic-sign detection and classification in the wild," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2110-2118, doi: 10.1109/CVPR.2016.232.
[19] NVIDIA, "TensorRT," 2022. [Online]. Available: https://developer.nvidia.com/tensorrt
[20] M. Takaki and H. Fujiyoshi, "Traffic sign recognition using SIFT features," IEEJ Transactions on Electronics, Information and Systems, vol. 129, pp. 824-831, 2009.
[21] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, June 20-25, 2005, vol. 1, pp. 886-893.
[22] Department of Transportation, "Circular No. 91/2015/TT-BGTVT," 2015. [Online]. Available: Thông tư 91/2015/TT-BGTVT tốc độ khoảng cách xe giới xe máy chuyên dùng giao thông đường (Circular 91/2015/TT-BGTVT on speed and following distance for motor vehicles and specialized vehicles in road traffic).
[23] P. Wei et al., "LiDAR and camera detection fusion in a real-time industrial multi-sensor collision avoidance system," arXiv, 2018, https://doi.org/10.48550/arXiv.1807.10573.
[24] A. Ziebinski, R. Cupek, H. Erdogan, and S. Waechter, "A survey of ADAS technologies for the future perspective of sensor fusion," in Computational Collective Intelligence, N. T. Nguyen, L. Iliadis, Y. Manolopoulos, and B. Trawiński, Eds. Cham, Switzerland: Springer International Publishing, 2016, pp. 135-146.
[25] S. A. Nur, M. Ibrahim, N. Ali, and F. I. Y. Nur, "Vehicle detection based on underneath vehicle shadow using edge features," in Proceedings of the 2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, Nov. 25-27, 2016, pp. 407-412.
[26] T. Lin et al., "Microsoft COCO: Common objects in context," arXiv, 2014, https://doi.org/10.48550/arXiv.1405.0312.
[27] S. A. Elsagheer Mohamed, K. A. Alshalfan, M. A. Al-Hagery, and M. T. Ben Othman, "Safe driving distance and speed for collision avoidance in connected vehicles," Sensors, vol. 22, no. 18, p. 7051, 2022, https://doi.org/10.3390/s22187051.
[28] M. Aly, "Real time detection of lane markers in urban streets," in Proceedings of the IEEE Intelligent Vehicles Symposium, 2008, https://doi.org/10.1109/IVS.2008.4621152.
[29] D. Neven, B. De Brabandere, S. Georgoulis, M. Proesmans, and L. Van Gool, "Towards end-to-end lane detection: An instance segmentation approach," in Proceedings of the IEEE Intelligent Vehicles Symposium, 2018, pp. 286-291.
[30] Z. Qin et al., "Ultra fast structure-aware deep lane detection," arXiv, 2020, https://doi.org/10.48550/arXiv.2004.11757.
[31] NVIDIA, "Jetson AGX Orin developer kit specification," 2022. [Online]. Available: Jetson AGX Orin for Advanced Robotics | NVIDIA.

Appendix

Figure 1: Plagiarism check result by Turnitin
