MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH-QUALITY TRAINING

GRADUATION PROJECT
AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY

RESEARCH, DESIGN, AND CONSTRUCT LANE TRACKING AND OBSTACLE AVOIDANCE SYSTEM FOR AUTONOMOUS GROUND VEHICLES BASED ON MONOCULAR VISION AND 2D-LIDAR

ADVISOR: Assoc. Prof. LÊ MỸ HÀ
STUDENT: LÊ TRUNG LĨNH
Student ID: 18151016
Major: AUTOMATION AND CONTROL ENGINEERING

Ho Chi Minh City, August 2022

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 8, 2022

GRADUATION PROJECT ASSIGNMENT

Student name: Lê Trung Lĩnh                Student ID: 18151016
Major: Automation and Control Engineering  Class: 18151CLA
Advisor: Assoc. Prof. Lê Mỹ Hà             Phone number: 0938811201
Date of assignment:                        Date of submission:

Project title: Research, Design, and Construct Lane Tracking and Obstacle Avoidance System for Autonomous Ground Vehicles Based on Monocular Vision and 2D-LiDAR

Initial materials provided by the advisor:
- Image processing and machine learning documents, such as papers and books
- Related theses of previous students
- The hardware specifications and reviews

Content of the project:
- Refer to documents, survey, read, and summarize to determine the project directions
- Collect and visualize sensor data
- Choose models and algorithms for the car's perception
- Write programs for the microcontrollers and processors
- Test and evaluate the completed system
- Write a report
- Prepare slides for the presentation

Final product: a model robot that can operate on the HCMUTE campus in two modes, autonomous and manual.

CHAIR OF THE PROGRAM                ADVISOR
(Sign with full name)               (Sign with full name)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 8, 2022

ADVISOR'S EVALUATION SHEET

Student name: Lê Trung Lĩnh        Student ID: 18151016
Major: Automation and Control Engineering
Project title: Research, Design, and Construct Lane Tracking and Obstacle Avoidance System for Autonomous Ground Vehicles Based on Monocular Vision and 2D-LiDAR
Advisor: Assoc. Prof. Lê Mỹ Hà

EVALUATION

Content of the project:
- Design and construct an autonomous model robot that can operate on the HCMUTE campus. The robot operates based on different techniques and sensors. The final product fulfills the objectives outlined in the proposal.

Strengths:
- The automobile model serves as a testing ground for future full-scale automotive improvements. A control system and deep learning algorithms were used in the creation of this project. The program's processing speed is suited for real-time applications. All of the equipment used in this project is inexpensive.

Weaknesses:
- The system is unable to achieve full autonomy. When operating in an outdoor setting, the system's accuracy and stability are acceptable.
Approval for oral defense? (Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ……………… (in words: ………………)

Ho Chi Minh City, August 8, 2022
ADVISOR
(Sign with full name)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 8, 2022

PRE-DEFENSE EVALUATION SHEET

Student name:                       Student ID:
Student name:                       Student ID:
Major:
Project title:
Name of Reviewer:

EVALUATION
Content and workload of the project:
Strengths:
Weaknesses:
Approval for oral defense? (Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ……………… (in words: ………………)

Ho Chi Minh City, August 8, 2022
REVIEWER
(Sign with full name)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

EVALUATION SHEET OF DEFENSE COMMITTEE MEMBER

Student name:                       Student ID:
Student name:                       Student ID:
Major:
Project title:
Name of Defense Committee Member:

EVALUATION
Content and workload of the project:
Strengths:
Weaknesses:
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ……………… (in words: ………………)

Ho Chi Minh City, August 8, 2022
COMMITTEE MEMBER
(Sign with full name)

ACKNOWLEDGEMENTS

We want to express our utmost thanks to Assoc. Prof. Lê Mỹ Hà for his thorough instruction, which provided us with the information needed to complete this thesis. Although we devoted much time and effort to this project, some mistakes are still expected to remain. With his input and advice, we hope to gain more experience and succeed with this project topic.

We would also like to thank the Faculty for High-Quality Training and the Faculty of Electrical and Electronics Engineering, where we obtained our basic knowledge and experience. Moreover, we would like to thank the members of ISLab for helping us gain a complete perspective on this project; they shared valuable knowledge and experience with us.

We want to express our gratitude to our families for their support throughout the work on this thesis. Sincere thanks for everything!

A GUARANTEE

This thesis is the result of our own study and implementation, which we formally declare. We did not plagiarize from any published article without the author's acceptance. We take full responsibility for any violations that may have occurred.

Authors
Lê Trung Lĩnh

ABSTRACT

This thesis presents a simple yet novel method for car overtaking based on a combination of a camera and a 2D LiDAR. For the camera, we utilized two models: lane-line detection and object detection.
The lane-line detection model plays the path-planning role, helping the car model determine the next destination in the sequence of image frames. For object detection, YOLOv4-tiny is used to detect cars and different types of traffic signs, and Mosaic augmentation was applied to enhance the performance of the YOLO model. To boost inference speed and deploy the deep learning models on a low-cost device such as the Jetson TX2, we converted both models to the TensorRT FP16 format. With these components, the car is aware of the lane, obstacles, and traffic signs, which helps the vehicle handle many situations on the road. Besides, a 2D LiDAR was utilized to monitor the right side when the obstacle is out of the camera's range. Adaptive breakpoint detection was applied to cluster the objects in the scanning plane; we then fit a line to the cluster with RANSAC and calculate its distance. The estimated distance serves as the safety condition that prevents the car from colliding with the obstacle. The whole pipeline was implemented with the multithreading technique, which keeps the system manageable and slightly boosts the overall throughput.

4.3 Design of the Steering Controller

The output of lane-line detection is a collection of coordinates L_{n,i} that represent the lane lines on the road. On our campus, there are at most three lane lines per road (L_1, L_2, L_3 from left to right). The left lane is disregarded in order to pursue the next destination and abide by traffic laws, while the two remaining lane lines are used to compute the steering angle. First, the mean of the two point sets L_{2,i} and L_{3,i} is used to construct the center point of the right lane, which is the predicted location in the next state:

M(x, y) = \left( \frac{1}{2R}\sum_{i=1}^{R}\left(x_{2,i} + x_{3,i}\right),\ \frac{1}{2R}\sum_{i=1}^{R}\left(y_{2,i} + y_{3,i}\right) \right)   (4.9)

where N and R are the number of lane lines and the number of predefined rows, respectively. The steering angle is then derived from the midway point computed above. Pure geometry has drawbacks, such as noise, when applied directly to the wheel angle; instead, a PID controller is used to track the steering reference.

4.4 Algorithm on 2D LiDAR

RANSAC is used after the clustering step to extract the right-side obstacle's boundary. RANSAC is applied to the longest cluster because the robot is close to the obstacle. Afterward, we can estimate a safety distance d from the LiDAR position M(x_M, y_M) to the fitted straight line T: Ax + By + C = 0 as follows:

d(M, T) = \frac{|A x_M + B y_M + C|}{\sqrt{A^2 + B^2}}   (4.10)

Figure 4.9 The estimated distance from the LiDAR to the fitted straight line

The LiDAR also identifies when the vehicle may return to the right lane, as illustrated in Figure 4.9. If the closest right cluster contains fewer than three points or disappears for seven consecutive frames, the RANSAC algorithm is not applied and the returning flag is set. At that point, the predicted destination is located back in the correct lane, as calculated from the output of the lane-line model.
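To make the two computations above concrete, the following is a minimal Python sketch of the steering-reference calculation of Eq. (4.9) and the safety-distance calculation of Eq. (4.10). It is an illustration only: the array shapes, gain values, and function names (lane_center, point_line_distance, PID) are assumptions for this sketch, not the exact code running on the vehicle.

```python
import numpy as np

def lane_center(lane2, lane3):
    """Eq. (4.9): center of the two right lane lines, given as R x 2 arrays of (x, y)."""
    midpoints = (np.asarray(lane2, float) + np.asarray(lane3, float)) / 2.0  # midpoint per predefined row
    return midpoints.mean(axis=0)                                            # (x, y) of the predicted destination

def point_line_distance(xm, ym, a, b, c):
    """Eq. (4.10): distance from the LiDAR origin M(xm, ym) to the line Ax + By + C = 0."""
    return abs(a * xm + b * ym + c) / np.hypot(a, b)

class PID:
    """Simple PID used to track the steering reference (gains are placeholders)."""
    def __init__(self, kp=0.5, ki=0.0, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def __call__(self, error, dt=1.0 / 15):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

if __name__ == "__main__":
    lane2 = [(200, 260), (205, 240), (210, 220)]   # hypothetical lane-line points
    lane3 = [(330, 260), (325, 240), (320, 220)]
    cx, cy = lane_center(lane2, lane3)
    steering = PID()(240 - cx)                     # 240 is the image mid-column, as in Algorithm 2
    print(f"center=({cx:.1f}, {cy:.1f}), steering={steering:.2f}")
    print(f"safety distance={point_line_distance(0.0, 0.0, 1.0, -1.0, 30.0):.1f}")
```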
Algorithm 2: Obstacle avoidance algorithm

Input: image frame F, point-cloud list P
Output: steering angle θ

θ ← steering angle
S ← {S1, …, Sn}            /* S is the list of obstacle areas */
B ← bounding box of the nearest obstacle
fa ← False                 /* initialize the avoiding flag */
d ← None                   /* d (cm) is the distance from the LiDAR to the fitted straight line */
Comment: the two while loops run in parallel threads and exchange these variables.

Begin
  /* Camera processing */
  a ← 0.2
  while true do:
    F ← new frame
    F ← G(F)                                          /* G is the semantic segmentation model */
    Find the nearest obstacle and its bounding box B  /* B is a list {x, y, w, h} */
    C ← center of bounding box B
    Calculate the drivable area on the road
    if fa = False then:
      if C > 260 and the obstacle is on the right side then:
        x ← x − 20
        fa ← True
        θ ← PID(240 − x)                              /* 240 is the middle column of the image */
      else:
        θ ← PID(240 − right middle point)
    else if d < 20 then:
      θ ← PID(240 − middle point) + a·d
    else:
      θ ← PID(240 − middle point)
  end

  /* 2D LiDAR processing */
  while true do:
    P ← new point-cloud list
    Cluster(P) and find the maximum-length cluster
    if length of the maximum-length cluster > 3 then:
      if fa = True then:
        straight line ← RANSAC(maximum-length cluster)
        d ← distance from the straight line to the LiDAR
    else:
      d ← None
    if d > 50 then:
      d ← None
    if d is None for 7 consecutive frames then:
      fa ← False
  end
End

CHAPTER 5
EXPERIMENTS AND RESULTS

5.1 EXPERIMENTAL ENVIRONMENT

5.1.1 Environment

Our self-driving vehicle is tested on the HCMUTE campus, on relatively flat roads, as illustrated in Figure 5.1. Due to time and technology constraints, the vehicle can currently operate only on small and medium-sized campus roads. The operational area is the whole campus road network, except the road to the right of the central block.

Figure 5.1 The testing roads on the HCMUTE campus

To fit the system and project specifications, we created custom traffic signs and scaled obstacles. Four traffic signs are used to navigate this environment: left, right, straight, and stop, as illustrated in Figure 5.2.

Figure 5.2 Custom traffic signs

Because our system is only a scaled robot, it cannot avoid full-size objects. A custom scale model car is therefore used as the obstacle, as illustrated in Figure 5.3.

Figure 5.3 The scale model car

5.1.2 Dataset

5.1.2.1 Dataset for lane-line detection

We captured more than 2,000 images under several lighting conditions on the HCMUTE campus, as illustrated in Figure 5.4.

Figure 5.4 Some views of the road in the collected data

There are 6,408 road images in the TuSimple dataset [27], each with a resolution of 1280 × 720. The TuSimple set includes 3,626 images for training, 358 images for validation, and 2,782 images for testing, all taken in various weather conditions, as illustrated in Figure 5.5.

Figure 5.5 The TuSimple dataset [27]

5.1.2.2 Dataset for object detection

We collected more than 1,000 images per class, as illustrated in Figure 5.6.

Figure 5.6 The custom data with different lighting conditions

In addition, we labeled all of the data gathered on the HCMUTE campus ourselves, using the graphical image annotation tool "labelImg" [28]. It is written in Python, and its graphical user interface is built with Qt, as illustrated in Figure 5.7.

Figure 5.7 The interface of the labeling tool

5.2 TRAINING PROCESS

5.2.1 Lane-line detection

Table 5.1 lists the parameters used during training. PyTorch was used as the underlying framework to implement the UFLD model, which was trained on a Tesla T4 GPU with 12 GB of memory. The Adam optimizer [29] was used to improve the training process's overall efficiency.

Table 5.1 Training parameters of the lane-line detection model

  Parameter Name      Parameter Value
  Image size          288 × 800
  Batch size          32
  Total epochs        200
  Learning rate       0.00038
  Momentum            0.9
  Weight decay        0.0001
  Evaluation metric   mIoU

Figure 5.8 depicts the training mIoU and training loss curves, so that the model can be validated against these parameters.

Figure 5.8 Training mIoU (left) and loss (right) of the lane-line detection model
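As a reading aid, the following minimal PyTorch sketch shows how the optimizer settings of Table 5.1 could be wired up. It assumes a small stand-in network rather than the real UFLD model, and it maps the listed momentum value to Adam's first beta; both are assumptions for illustration, not the project's actual training script.

```python
import torch

# Hypothetical stand-in for the UFLD network; the real model comes from the UFLD codebase.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU())

# Table 5.1 values: lr = 0.00038, momentum = 0.9 (used here as Adam's beta1), weight decay = 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=3.8e-4, betas=(0.9, 0.999), weight_decay=1e-4)

for epoch in range(3):                   # Table 5.1 trains for 200 epochs; 3 keeps this demo short
    x = torch.randn(2, 3, 288, 800)      # dummy mini-batch at the 288 x 800 input size (batch size 32 in Table 5.1)
    loss = model(x).mean()               # placeholder loss; UFLD uses its own structure-aware losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss = {loss.item():.4f}")
```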
5.2.2 Object detection

Table 5.2 lists the parameters monitored during the training process. The YOLOv4-tiny model was implemented with the Darknet framework and trained on Google Colab using a Tesla T4 GPU with 12 GB of memory.

Table 5.2 Training parameters of the object detection model

  Parameter Name      Parameter Value
  Image size          416 × 416
  Batch size          64
  Learning rate       0.00261
  Momentum            0.9
  Weight decay        0.0005
  Evaluation metric   mAP

The validation of the training performance of YOLOv4-tiny with the parameters in Table 5.2 is as follows:
- Precision: 0.9
- Recall: 0.95
- F1-score: 0.92
- Mean average precision (mAP@0.50): 0.975972
- class_id = 0, name = car,      AP = 93.93% (TP = 1452, FP = 166)
- class_id = 1, name = stop,     AP = 97.40% (TP = 191, FP = 36)
- class_id = 2, name = left,     AP = 98.69% (TP = 177, FP = 13)
- class_id = 3, name = right,    AP = 98.18% (TP = 500, FP = 32)
- class_id = 4, name = straight, AP = 99.79% (TP = 95, FP = 20)

Figure 5.9 The loss and mAP curves during training

5.3 RESULTS

Lane-line detection
The lane-line detection model performed well across different lanes, even with noise from illumination, leaf-covered roads, and tree shadows. The TensorRT platform enables the model to run with the float16 data type instead of the standard float32. TensorRT optimizes deep learning inference on NVIDIA hardware and uses quantization to increase execution speed without sacrificing much accuracy. The experiments show that the inference time was superior to that of the original model. The output of the lane-line detection model is demonstrated in Figure 5.10.

Figure 5.10 The result of the lane-line detection model on a test image

Object detection
Figure 5.11 shows the output of object detection on the five classes. The output of YOLOv4-tiny on the custom data is good and suitable for our system's operation.

Figure 5.11 The result of the object detection model on a test image

2D LiDAR performance
The adaptive threshold for clustering point clouds performed well on various vehicles. Nevertheless, because the sensor is a low-cost device, the point clouds are significantly disrupted by dark-colored vehicles. Figure 5.12 shows the output of the clustering technique in an outdoor environment on the HCMUTE campus.

Figure 5.12 The autonomous vehicle in the real environment (left) and the result of the clustering algorithm and RANSAC (right)

Several line-fitting techniques are available; however, we did not have enough time to implement all of them. Figure 5.13 therefore compares two methods, 2D RANSAC and linear regression, on some sample data points.

Figure 5.13 Comparison between 2D RANSAC (left) and linear regression (right) on 2D LiDAR data

5.4 COMPARISONS AND EVALUATION

Lane-line detection and object detection
We ported the models to the TensorRT platform, which allows them to execute using the float16 data type rather than the more conventional float32. As seen in Table 5.3, the frame rate nearly doubles, while the small loss of accuracy is satisfactory given the trade-off.

Table 5.3 Comparison between the original model and the TensorRT model on the custom dataset

  Type of model         Frames per second (FPS)   mIoU    Memory space (MB)
  Original (float32)    –                         0.812   739
  TensorRT (float16)    14                        0.802   125.0
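For reference, a minimal sketch of how an ONNX export of such a model can be built into an FP16 TensorRT engine is shown below. It assumes the TensorRT 8.x Python bindings and hypothetical file names (model.onnx, model_fp16.engine); the exact conversion scripts used in this project may differ.

```python
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

def build_fp16_engine(onnx_path: str, engine_path: str) -> None:
    """Parse an ONNX model and serialize an FP16 TensorRT engine (TensorRT 8.x API)."""
    builder = trt.Builder(LOGGER)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX file")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)          # enable float16 kernels

    serialized = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(serialized)

if __name__ == "__main__":
    build_fp16_engine("model.onnx", "model_fp16.engine")  # hypothetical file names
```

The serialized engine can then be loaded by the TensorRT runtime on the Jetson TX2 at inference time.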
The whole pipeline with threading
The whole pipeline runs as a single program, so threading can be used to execute its components concurrently. Table 5.4 shows that sharing the CPU does not noticeably affect the execution time of the GPS and LiDAR processes, because these processes are purely logical. The deep learning models, however, take longer to run when resources are insufficient. This per-process slowdown is the trade-off of running everything concurrently, compared with the sequential execution method, which is simpler to set up.

Table 5.4 Execution time before and after applying threading on the custom dataset

  Process                          FPS (single run)   FPS (integration)
  Lane-line detection (float16)    14                 12
  YOLOv4-tiny                      30                 25
  LiDAR process                    50                 45
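To illustrate this integration, the following is a minimal Python sketch of how the camera and LiDAR loops can run in parallel threads that share the avoiding flag and safety distance, in the spirit of Algorithm 2. The class and attribute names are illustrative assumptions, not the project's actual code, and the heavy perception steps are left as comments.

```python
import threading
import time

class SharedState:
    """Variables exchanged between the camera and LiDAR threads (Algorithm 2)."""
    def __init__(self):
        self.lock = threading.Lock()
        self.avoiding = False     # fa in Algorithm 2
        self.distance = None      # d (cm) from the LiDAR to the fitted line

def camera_loop(state: SharedState):
    while True:
        # ... grab a frame, run lane-line detection and YOLOv4-tiny here ...
        with state.lock:
            avoiding, d = state.avoiding, state.distance
        if avoiding and d is not None and d < 20:
            pass                  # bias the steering command away from the obstacle
        # steering = pid(240 - reference_column) would be computed here
        time.sleep(1 / 14)        # ~14 FPS for the float16 lane model (Table 5.4)

def lidar_loop(state: SharedState):
    missing = 0
    while True:
        # ... read a scan, cluster it, fit the largest cluster with RANSAC ...
        d = None                  # replace with the estimated distance when a line is found
        with state.lock:
            state.distance = d
            missing = missing + 1 if d is None else 0
            if missing >= 7:      # lost for 7 consecutive frames -> stop avoiding
                state.avoiding = False
        time.sleep(1 / 50)        # ~50 FPS LiDAR processing (Table 5.4)

if __name__ == "__main__":
    state = SharedState()
    for target in (camera_loop, lidar_loop):
        threading.Thread(target=target, args=(state,), daemon=True).start()
    time.sleep(2)                 # let the demo run briefly
```

Because the deep learning models spend most of their time inside GPU inference calls, the lightweight LiDAR logic can run concurrently at little extra cost, which is consistent with the modest per-process slowdown reported in Table 5.4.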
CHAPTER 6
CONCLUSION AND FUTURE WORK

In this thesis, the combination of a camera and a LiDAR helps our vehicle model navigate and avoid obstacles on the roads of the HCMUTE campus. Our lane detection and tracking algorithm can navigate the vehicle with a steering angle pointing to a specific lane on a two-way road. With the help of the YOLOv4-tiny object detection model and a 2D LiDAR, we can also detect and follow traffic-sign directions and avoid other vehicles. Our system ran in real time at 11–13 FPS on the NVIDIA Jetson TX2 board while processing the lane-line detection model for lane keeping and the object detection model in parallel threads. Furthermore, the fusion of LiDAR and camera ensures the vehicle can switch lanes mid-run to avoid significant obstacles, especially cars, and then navigate back to the old lane.

Although the final result is not yet fully satisfactory, for a student project it is a solid attempt to understand and experiment with a self-driving system in real life on small-scale hardware. To push the work further, we will study vehicle theory in more depth and improve the robot hardware. Several optimizations could improve this car in the future:
- Utilize more sensors integrated into the vehicle to obtain a better view of the surrounding environment.
- Boost the inference time of the system by using a decentralized structure.
- Deploy the pipeline on a life-size car to test the algorithm's effectiveness.

REFERENCES

[1] N. S. Aminuddin, M. M. Ibrahim, N. M. Ali, S. A. Radzi, W. H. M. Saad, and A. M. Darsono, "A new approach to highway lane detection by using Hough transform technique," J. Inf. Commun. Technol., vol. 16, no. 2, pp. 244–260, Dec. 2017, doi: 10.32890/JICT2017.16.2.8231.
[2] S. Lu, Z. Luo, F. Gao, M. Liu, K. Chang, and C. Piao, "A fast and robust lane detection method based on semantic segmentation and optical flow estimation," Sensors, vol. 21, no. 2, p. 400, Jan. 2021, doi: 10.3390/S21020400.
[3] J. Kocic, N. Jovicic, and V. Drndarevic, "Sensors and sensor fusion in autonomous vehicles," in Proc. 26th Telecommunications Forum (TELFOR), 2018, doi: 10.1109/TELFOR.2018.8612054.
[4] "MIT just built a camera system to help self-driving cars see around corners," AUTOJOSH. https://autojosh.com/mit-just-built-a-camera-to-help-self-drivingcars-see-around-corners/ (accessed Aug. 08, 2022).
[5] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 779–788, doi: 10.48550/arxiv.1506.02640.
[6] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2017, pp. 6517–6525, doi: 10.48550/arxiv.1612.08242.
[7] "The PASCAL Visual Object Classes homepage." http://host.robots.ox.ac.uk/pascal/VOC/ (accessed Aug. 07, 2022).
[8] "COCO – Common Objects in Context." https://cocodataset.org/#home (accessed Aug. 07, 2022).
[9] "ImageNet." https://www.image-net.org/ (accessed Aug. 07, 2022).
[10] "The PASCAL Visual Object Classes Challenge 2007 (VOC2007)." http://host.robots.ox.ac.uk/pascal/VOC/voc2007/ (accessed Aug. 07, 2022).
[11] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," Apr. 2018, doi: 10.48550/arxiv.1804.02767.
[12] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," Apr. 2020, doi: 10.48550/arxiv.2004.10934.
[13] "CS 230 – Convolutional Neural Networks cheatsheet." https://stanford.edu/~shervine/teaching/cs-230/cheatsheet-convolutional-neuralnetworks (accessed Aug. 06, 2022).
[14] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381–395, Jun. 1981, doi: 10.1145/358669.358692.
[15] "Sử dụng TensorRT để suy luận nhanh hơn và giảm độ trễ cho mô hình đào sâu" [Using TensorRT for faster inference and lower latency in deep models], ITZone. https://itzone.com.vn/vi/article/su-dung-tensorrt-de-suy-luan-nhanhhon-va-giam-do-tre-cho-mo-hinh-dao-sau/ (accessed Aug. 05, 2022).
[16] G. A. Borges and M. J. Aldon, "Line extraction in 2D range images for mobile robotics," J. Intell. Robot. Syst., vol. 40, no. 3, pp. 267–297, Jul. 2004, doi: 10.1023/B:JINT.0000038945.55712.65.
[17] "RPLIDAR A1 introduction and datasheet: Low cost 360 degree laser range scanner, rev. 2.1, model A1M8, Shanghai Slamtec Co." https://hobbydocbox.com/Radio/104091010-Rplidar-a1-introduction-anddatasheet-low-cost-360-degree-laser-range-scanner-rev-2-1-model-a1m8-shanghaislamtec-co.html (accessed Aug. 05, 2022).
[18] "Jetson module comparison," Connect Tech Inc. https://connecttech.com/jetson/jetson-module-comparison/ (accessed Aug. 05, 2022).
[19] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. 3rd Int. Conf. Learn. Represent. (ICLR), 2015, doi: 10.48550/arxiv.1409.1556.
[20] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 770–778, doi: 10.48550/arxiv.1512.03385.
[21] X. Du et al., "SpineNet: Learning scale-permuted backbone for recognition and localization," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2020, pp. 11589–11598, doi: 10.48550/arxiv.1912.05027.
[22] M. Tan and Q. V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," in Proc. 36th Int. Conf. Mach. Learn. (ICML), 2019, pp. 10691–10700, doi: 10.48550/arxiv.1905.11946.
[23] C.-Y. Wang, H.-Y. M. Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), 2020, pp. 1571–1580, doi: 10.48550/arxiv.1911.11929.
[24] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," in Proc. Eur. Conf. Comput. Vis. (ECCV), Lect. Notes Comput. Sci., vol. 8691, 2014, pp. 346–361, doi: 10.1007/978-3-319-10578-9_23.
[25] Z. Jiang, L. Zhao, S. Li, Y. Jia, and Z. Liquan, "Real-time object detection method based on improved YOLOv4-tiny," Nov. 2020, doi: 10.48550/arxiv.2011.04244.
[26] Z. Qin, H. Wang, and X. Li, "Ultra fast structure-aware deep lane detection," in Proc. Eur. Conf. Comput. Vis. (ECCV), Lect. Notes Comput. Sci., vol. 12369, 2020, pp. 276–291, doi: 10.48550/arxiv.2004.11757.
[27] "TuSimple/tusimple-benchmark: Download datasets and ground truths." https://github.com/TuSimple/tusimple-benchmark (accessed Aug. 05, 2022).
[28] "heartexlabs/labelImg: LabelImg is a graphical image annotation tool and label object bounding boxes in images." https://github.com/heartexlabs/labelImg (accessed Aug. 05, 2022).
[29] D. P. Kingma and J. L. Ba, "Adam: A method for stochastic optimization," in Proc. 3rd Int. Conf. Learn. Represent. (ICLR), 2015, doi: 10.48550/arxiv.1412.6980.
[30] "Understanding a real-time object detection network: You Only Look Once (YOLOv1)," PyImageSearch. https://pyimagesearch.com/2022/04/11/understanding-a-real-time-object-detectionnetwork-you-only-look-once-yolov1/ (accessed Aug. 07, 2022).
[31] V. H. Phung and E. J. Rhee, "A high-accuracy model average ensemble of convolutional neural networks for classification of cloud image patches on small datasets," Appl. Sci., vol. 9, no. 21, Nov. 2019, doi: 10.3390/APP9214500.