Understanding libraries and frameworks: OpenCV, Tensorflow Object Detection API… Practice: Building a Vehicle Detection and Counting System: o Collect dataset and pre-processing data..
INTRODUCTION
The reason for choosing the topic
Implementing a dependable vehicle detection and counting system is essential for addressing the growing concerns posed by urban traffic congestion. Apart from improving traffic safety and efficiency, such a system furnishes valuable insights into traffic patterns, which can be leveraged for traffic planning and incident forecasting. The ramifications of this system extend beyond traffic management, offering benefits in urban planning, environmental conservation, and overall economic growth.
Improving the overall efficiency of travel
Moreover, the system contributes to the development of public applications like online traffic maps, ensuring users are well-informed about road conditions.
Integrating a vehicle detection and counting system paves the way for building smart cities by creating a secure, efficient, and intelligent traffic environment. This system enhances the overall quality of life for communities by providing valuable data that empowers decision-makers to optimize traffic flow, improve safety, and make informed transportation planning choices.
TOPIC NAME: “Building a Vehicle Detection and Counting System”.
Objectives
In the modern world, vehicle detection systems are commonplace and silently monitor our roads and highways. These observant eyes are more than just sophisticated devices; they play a vital role in accomplishing several important goals that improve the safety, efficiency, and smoothness of our transportation networks.
Vehicle detection systems are essential tools for building safer, more seamless, and more effective transportation systems. They are not just impressive pieces of technology: from improving safety and security to streamlining traffic and collecting important data, they are reshaping the future of our roads and creating a more intelligent transportation environment for everybody. Our goal is therefore to build a system that can detect vehicles and, by extension, count them to help analyze traffic density.
Methodology
To build a Vehicle Detection and Counting System, we start by collecting a video for our dataset. We then apply Histogram Equalization and Gaussian Blur to improve the image quality of this data. Next, we use the SSD MobileNet model and train it on Google Colab.
After training, we test the system to see how well it works and find any mistakes. Once we identify errors, we evaluate the system's accuracy and make improvements to fix the issues. This continuous cycle of testing, evaluating, and improving ensures that our system becomes more reliable over time. The goal is to create a system that accurately detects and counts vehicles, making traffic management more efficient.
Scope
Improve traffic and reduce congestion: By providing accurate information on traffic density and road conditions, vehicle detection and counting systems help traffic authorities improve traffic prediction, management, and coordination. This helps to reduce congestion, improve information flow, and reduce travel time for people.
Improve traffic safety: The vehicle detection and counting system assists in detecting hazards and dangerous conditions on the road. Authorities can use this information to develop safety measures such as speed monitoring, hazard warnings, and safe routes. This helps to reduce the number of traffic accidents and protect the lives and property of road users.
Optimizing traffic planning and urban development: Traffic detection and counting data provides important information for city planning and development. Management and planning agencies can use this data to assess road system performance, predict road transport needs, and make smart decisions about infrastructure development and urban planning.
THEORETICAL FOUNDATIONS
Overview of Single Shot Detector (SSD)
The SSD is a purely convolutional neural network (CNN) that we can organize into three parts:
• Base convolutions derived from an existing image classification architecture that will provide lower-level feature maps
• Auxiliary convolutions added on top of the base network that will provide higher-level feature maps
• Prediction convolutions that will locate and identify objects in these feature maps
The original SSD research introduces two models designated as SSD300 and SSD512, where the suffixes indicate the size of the input image. These networks share fundamental similarities in their construction, with SSD512 being larger in scale and yielding slightly superior performance.
SSD MobileNet, which uses MobileNet as the base network of SSD, has the following advantages:
• Fast and Efficient: SSD MobileNet is designed to be fast and efficient, making it suitable for real-time object detection on mobile and embedded devices. It achieves a good balance between accuracy and speed, allowing for real-time processing even on devices with limited computational resources.
• High Accuracy: Despite its efficiency, SSD MobileNet maintains good accuracy for its size in object detection. It leverages the power of deep neural networks to identify and locate objects in images or videos.
• Small Model Size: SSD MobileNet has a small model size, which makes it suitable for deployment on resource-constrained devices with limited storage capacity. It reduces the memory footprint, making it well suited to mobile and embedded applications.
• Easy Integration: SSD MobileNet is readily available as a pre-trained model in popular deep learning frameworks such as TensorFlow and PyTorch. Its ease of integration allows developers to quickly incorporate object detection capabilities into their applications without having to build the model from scratch.
Its disadvantages are:
• Lower Accuracy compared to larger models: While SSD MobileNet achieves a good balance between accuracy and speed, it may not perform as well as larger and more complex models in terms of accuracy. This trade-off is necessary to maintain efficiency and real-time processing capabilities.
• Limited Flexibility: SSD MobileNet is a fixed architecture model, which means it has limitations in terms of architectural modifications or customizations. Developers may not have as much flexibility to fine-tune or modify the architecture compared to more flexible models.
TensorFlow Object Detection
The TensorFlow Object Detection API, constructed on the TensorFlow platform, is an open-source framework designed to simplify the creation, training, and deployment of object detection models. Within this framework, there exists a set of pre-trained models known as the Model Zoo. These models have already undergone training using diverse datasets, including:
❖ The COCO (Common Objects in Context) dataset
❖ The KITTI dataset
❖ The Open Images Dataset
TensorFlow Object Detection excels as a versatile framework, providing users with a comprehensive suite of pre-trained models and algorithms. This adaptability enables users to select and modify models that align precisely with their requirements and application scenarios.
Advantages:
• Community Support: TensorFlow benefits from a thriving and engaged community of developers, researchers, and enthusiasts. This active support system results in a wealth of valuable resources, including documentation, tutorials, and code examples, which makes it easier for users to kick-start their projects and overcome obstacles during development.
• Customizability: TensorFlow Object Detection provides room for customization and fine-tuning of pre-trained models, allowing developers to adapt them to particular use cases. This adaptability empowers developers to optimize models based on their application domains, achieving higher accuracy or speed as required.
• Scalability: TensorFlow Object Detection is designed for scalability, supporting distributed training and inference across multiple GPUs or even machines. This makes it well-suited for extensive projects and scenarios where real-time processing of a large number of objects is essential.
Disadvantages:
• High Memory Consumption: TensorFlow Object Detection models, particularly those that are larger and more intricate, demand a substantial amount of memory for both training and inference. This could pose a challenge when deploying models on devices with constrained resources and limited memory capacity.
• Training Complexity: Crafting custom object detection models using TensorFlow Object Detection can be intricate, especially for beginners or those unfamiliar with deep learning. It requires understanding the TensorFlow framework, data preparation, and hyperparameter tuning, making it a time-consuming and daunting process.
• Extended Training Duration: The training of deep learning models, including those for object detection, can be computationally intensive and time-consuming, particularly for elaborate architectures and extensive datasets. Training times vary significantly based on model complexity and available hardware resources.
• Hardware Compatibility: TensorFlow Object Detection relies on hardware acceleration, such as GPUs or TPUs, for efficient training and inference. Compatibility with different hardware architectures and configurations may vary, and specialized hardware might be essential for achieving optimal performance.
Histogram Equalization
Histogram equalization serves the purpose of contrast enhancement. However, its application does not universally guarantee an increase in contrast. There are scenarios where histogram equalization may, in fact, yield unfavorable outcomes, leading to a reduction in contrast.
Advantages:
• Contrast Enhancement: Employed to refine the contrast of an image, histogram equalization redistributes intensity values across the entire dynamic range. This technique proves particularly effective in revitalizing images with low contrast, enhancing the visibility of intricate details.
• Simplicity and Efficiency: Histogram equalization is a straightforward method that is easy to implement and computationally cheap. The absence of intricate calculations or extensive parameter adjustments makes it an efficient approach for achieving contrast enhancement.
• Retention of Image Structure: Unlike certain contrast enhancement techniques, histogram equalization minimally impacts the overall structure or configuration of the image. Its primary adjustment lies in the intensity distribution, so the spatial relationships and details of the original image are preserved.
Disadvantages:
• Over-Amplification of Noise: Histogram equalization stretches intensity values throughout the entire range, including any existing noise within the image. This can overemphasize noise, resulting in a diminished image quality marked by heightened visibility of noise artifacts.
• Loss of Local Contrast: The global approach employed by histogram equalization treats the entire image as a single entity, neglecting local characteristics and contrast variations. Consequently, this method may cause a loss of local contrast, flattening texture details and giving the image an unnatural or unrealistic appearance.
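The redistribution of intensity values described above can be sketched with NumPy. This is a minimal illustration of global histogram equalization on an 8-bit grayscale array, not the code used in this project; the function name equalize_histogram and the tiny sample image are assumptions made for the example.

```python
import numpy as np


def equalize_histogram(gray):
    """Globally equalize an 8-bit grayscale image (2-D uint8 array)."""
    # Count how many pixels take each of the 256 intensity levels.
    hist = np.bincount(gray.ravel(), minlength=256)
    # Cumulative distribution: number of pixels at or below each level.
    cdf = hist.cumsum()
    # Smallest non-zero CDF value, so the darkest level present maps to 0.
    cdf_min = cdf[cdf > 0][0]
    total = gray.size
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return gray.copy()
    # Lookup table that makes the output CDF roughly linear over 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


# Hypothetical low-contrast sample: all values squeezed into 100..103.
low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
```

In this sample the four nearly identical gray levels are spread over the full 0 to 255 range, which is the contrast gain the method aims for. The same stretching is applied to any noise present, which is the drawback noted in the list above.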
Gaussian Blur
Gaussian Blur: Gaussian blur involves the application of a Gaussian function to induce a blurring effect in an image. This technique is extensively employed in graphics software, primarily to diminish image noise and soften intricate details.
Visual Outcome: The perceptible result of this blurring method is a seamlessly blurred appearance, akin to viewing the image through a translucent screen. This effect is distinctly different from the bokeh effect generated by an out-of-focus lens or the shadow cast by an object under standard illumination.
Application in Computer Vision: Gaussian smoothing is also used as a pre-processing step in computer vision algorithms. This is particularly valuable for enhancing image structures across various scales, in line with the principles of scale-space representation and its implementation.
Advantages:
• Noise Reduction: Gaussian blur is a widely employed method for diminishing noise in images. Through the convolution of the image with a Gaussian kernel, high-frequency noise elements are effectively smoothed, resulting in a cleaner image.
• Limited Edge Degradation at Small Scales: With a small kernel and sigma, Gaussian blur reduces noise while leaving strong edges recognizable. It is not an edge-preserving filter, however: larger kernels soften edges, and filters such as the bilateral filter are designed for cases where edges must stay sharp.
• Softening and Smoothing Effects: Gaussian blur imparts a softening and smoothing effect to an image. This reduces the prominence of fine details or imperfections, giving the image a more balanced appearance.
Disadvantages:
• Loss of Fine Details: Gaussian blur, functioning through the weighted averaging of neighboring pixel values, has the inherent risk of diminishing fine details within an image. The blurring effect may result in a reduction of sharpness and clarity, particularly in areas with intricate or small-scale structures.
• Computational Complexity: The Gaussian blur operation, involving convolution with a kernel, can pose computational challenges, especially for sizable images, large kernels, or when applied iteratively. This may lead to increased processing time in real-time or time-sensitive applications.
• Edge Spreading: With a large blur radius, Gaussian blur spreads strong edges into neighboring regions, producing soft bright or dark halos around objects that can reduce the quality and accuracy of the image.
Priors
Priors are predetermined boxes strategically positioned on specific feature maps These boxes come with defined aspect ratios and scales, meticulously chosen to align with the characteristics of object bounding boxes (ground truths) in the dataset.
Multibox
Multibox is an approach that treats predicting an object's bounding box as a regression problem, where the coordinates of a detected object are regressed to match its ground truth coordinates:
• For each predicted box, scores are generated for different object types
• Priors act as starting points for predictions due to their alignment with ground truths, resulting in as many predicted boxes as there are priors, with many containing no objects.
Hard Negative Mining
Hard Negative Mining involves deliberately selecting the most challenging false positives predicted by a model for learning. This process focuses on mining the negatives that the model found most difficult to identify correctly, and it is particularly effective in addressing the negative-positive imbalance in object detection.
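As a rough illustration of this selection, the sketch below keeps all positive priors and only the highest-loss negatives. The function name, the sample losses, and the 3:1 negative-to-positive ratio are assumptions for the example; the ratio is a value commonly used with SSD and is not stated in this report.

```python
def hard_negative_mining(losses, is_positive, neg_pos_ratio=3):
    """Pick the priors that contribute to the confidence loss.

    losses       -- per-prior confidence loss values
    is_positive  -- per-prior flags, True if the prior matches an object
    Returns (positive_indices, hardest_negative_indices).
    """
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Rank negatives from highest to lowest loss: the hardest come first.
    negatives.sort(key=lambda i: losses[i], reverse=True)
    # Keep only a fixed multiple of the number of positives.
    n_keep = neg_pos_ratio * len(positives)
    return positives, negatives[:n_keep]


# Hypothetical losses for seven priors, only the first matches an object.
sample_losses = [0.2, 0.9, 0.1, 0.8, 0.05, 0.7, 0.3]
sample_flags = [True, False, False, False, False, False, False]
kept_pos, kept_neg = hard_negative_mining(sample_losses, sample_flags)
```

With one positive and a 3:1 ratio, only the three negatives with the largest loss are kept, so the easy background priors no longer dominate training.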
Non-Maximum Suppression
Non-Maximum Suppression (NMS) is utilized when multiple predicted boxes significantly overlap at a given location. It eliminates redundant predictions by retaining only the one with the maximum score, preventing duplicate predictions of the same object.
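A greedy version of this procedure can be sketched in plain Python. The function names, the sample boxes, and the 0.5 overlap threshold are assumptions for illustration; overlap is measured with the Jaccard index described later in this chapter.

```python
def iou(a, b):
    """Overlap of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two boxes on the same vehicle, one elsewhere.
sample_boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
sample_scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(sample_boxes, sample_scores)
```

The second box overlaps the first by more than the threshold and has a lower score, so it is dropped, while the distant third box is kept.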
Bounding Box
A bounding box encapsulates an object, outlining its limits and serving as a representation of its bounds.
Boundary Coordinates
Representing a box by its pixel coordinates (x_min, y_min, x_max, y_max) is a common approach, but its utility is limited without knowledge of the image's actual dimensions. Using fractional coordinates makes them size-invariant and comparable across all images.
Center-Size Coordinates
Center-Size coordinates provide a more explicit representation of a box's position and dimensions, denoted as (c_x, c_y, w, h).
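The two representations, and the switch from pixel to fractional coordinates, can be illustrated with a few small conversion helpers. The function names and the sample box are assumptions made for this sketch.

```python
def to_fractional(box, image_width, image_height):
    """Scale pixel (x_min, y_min, x_max, y_max) to the 0..1 range."""
    x_min, y_min, x_max, y_max = box
    return (x_min / image_width, y_min / image_height,
            x_max / image_width, y_max / image_height)


def boundary_to_center(box):
    """(x_min, y_min, x_max, y_max) -> (c_x, c_y, w, h)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min)


def center_to_boundary(box):
    """(c_x, c_y, w, h) -> (x_min, y_min, x_max, y_max)."""
    c_x, c_y, w, h = box
    return (c_x - w / 2, c_y - h / 2, c_x + w / 2, c_y + h / 2)


# Hypothetical box in a 400 x 200 pixel image.
fractional = to_fractional((100, 50, 300, 150), 400, 200)
center_size = boundary_to_center(fractional)
round_trip = center_to_boundary(center_size)
```

Here the pixel box becomes (0.25, 0.25, 0.75, 0.75) in fractional form and (0.5, 0.5, 0.5, 0.5) in center-size form, and converting back recovers the fractional boundary coordinates.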
Jaccard Index
The Jaccard Index, also known as Jaccard Overlap or Intersection-over-Union (IoU), quantifies the degree of overlap between two boxes in object detection.
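As a worked example, the index is the area of the intersection divided by the area of the union. The sketch below assumes boxes in boundary coordinates; the function name and the sample boxes are illustrative only.

```python
def jaccard_index(box_a, box_b):
    """Intersection-over-Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2 x 2 boxes sharing a 1 x 1 corner: intersection 1, union 7.
partial = jaccard_index((0, 0, 2, 2), (1, 1, 3, 3))
identical = jaccard_index((0, 0, 2, 2), (0, 0, 2, 2))
disjoint = jaccard_index((0, 0, 1, 1), (5, 5, 6, 6))
```

The index ranges from 0 for boxes that do not touch to 1 for identical boxes; the partially overlapping pair above gives 1/7.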
OS Module
The 'os' module facilitates interactions with the operating system. In this code, it is employed for tasks such as constructing file paths and fetching the current working directory.
Pandas
The 'pandas' library is a robust tool for data manipulation and analysis. In this code, it is utilized to construct a DataFrame and subsequently save it as a CSV file.
Glob Module
The 'glob' module serves the purpose of file and pattern matching within directories. In this code, it is used to locate all XML files within a designated directory.
Figure 5: Glob and Pandas Module
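The original code appears only as a figure, so the following is a sketch of how these modules typically cooperate to turn annotation files into a CSV table. It assumes the XML files follow the Pascal VOC layout written by LabelImg, and it swaps pandas for the standard csv module so that it runs with the standard library alone; the function names and column order are assumptions.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

COLUMNS = ["filename", "width", "height", "class",
           "xmin", "ymin", "xmax", "ymax"]


def xml_to_rows(xml_dir):
    """Collect one row per labeled object from every XML file in xml_dir."""
    rows = []
    # glob finds all annotation files; os.path.join builds the pattern.
    for xml_path in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        root = ET.parse(xml_path).getroot()
        filename = root.findtext("filename")
        width = int(root.findtext("size/width"))
        height = int(root.findtext("size/height"))
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            rows.append([
                filename, width, height, obj.findtext("name"),
                int(box.findtext("xmin")), int(box.findtext("ymin")),
                int(box.findtext("xmax")), int(box.findtext("ymax")),
            ])
    return rows


def write_csv(rows, csv_path):
    """Write the header and rows to csv_path."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
```

A call such as write_csv(xml_to_rows("annotations"), "labels.csv"), with hypothetical paths, would produce the table. With pandas, the same rows could instead be passed to pandas.DataFrame(rows, columns=COLUMNS) and saved with to_csv, which matches the approach described in the text.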
SYSTEM REQUIREMENTS
Situation survey
Vehicle detection holds significant importance in the realms of computer vision and autonomous driving systems. The process of designing and analyzing a vehicle detection system typically encompasses several key stages. The following provides a high-level overview of this process:
Data Collection
• Gather a dataset comprising images or videos containing vehicles. Obtain data from diverse sources, including public datasets or by capturing images/videos using cameras or sensors.
Data Preprocessing
• Prepare collected data for analysis by resizing images, converting formats, and eliminating noise or irrelevant information.
Feature Extraction
• Extract relevant features from preprocessed data to distinguish vehicles from the background or other objects. Common features include color information, edges, texture, and shape characteristics.
Selection of Detection Algorithm
• Choose an appropriate algorithm based on application-specific requirements. Popular algorithms include:
• Haar Cascade Classifiers: Utilize machine learning for object detection based on specific features
• Histogram of Oriented Gradients (HOG): Compute gradient orientations to detect objects
• Deep Learning-based Approaches: Employ Convolutional Neural Networks (CNNs) for automatic feature learning
Training
• If using machine learning or deep learning, train the algorithm on a labeled dataset. Provide annotated examples where vehicles are positive instances, and non-vehicle regions are negative instances.
Detection and Localization
• Employ trained algorithms and sliding window techniques to detect and locate vehicles in new data, classifying potential vehicle areas based on the learned model.
Post-processing
• Refine detected vehicle regions to enhance accuracy and reduce false positives. Techniques like non-maximum suppression, morphological operations, and bounding box filtering can eliminate overlapping and spurious detections.
Evaluation
• Assess system performance using metrics such as precision, recall, and accuracy. Fine-tune the design based on evaluation results.
Deployment
• Integrate the designed vehicle detection system into the desired application or system, incorporating the algorithm into an autonomous vehicle platform or relevant environment.
It is crucial to note that the specific implementation details and chosen algorithms for vehicle detection can vary based on application requirements, available resources, and constraints.
IMPLEMENTATION & EVALUATION
Choosing a video for data collection
The provided code converts a video into images to create a dataset. It uses the OpenCV library to read the video frames, resize each frame, save a specified number of frames as images in a designated directory, and display the resized frames in a window named 'Vehicle Video'. The implementation follows these steps:
1. Set Parameters and Open Video
• Define parameters such as the delay between frames (delay), the desired number of images (number_images), and the target dimensions for each frame (height, width)
• Open the video from the specified path
2. Create Images from Video Frames
• Iterate through each frame of the video
• Resize each frame according to the specified height and width
• If the frame index is less than the desired number of images, create a file name for the image and save it to the specified directory
• Display the resized frame in a window named 'Vehicle Video'
• Wait for the specified delay, and exit the loop if the 'q' key is pressed
3. Release Resources: Release the video capture resources
4. Close OpenCV Windows: Close all OpenCV windows
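The steps above can be sketched as follows. The parameter names delay, number_images, height, and width come from the description; the function names, the default values, and the image_<index>.jpg naming pattern are assumptions for illustration, and OpenCV (cv2) must be installed to run the extraction itself.

```python
import os


def frame_filename(output_dir, index):
    """Path of the index-th saved frame, e.g. <output_dir>/image_0.jpg."""
    return os.path.join(output_dir, f"image_{index}.jpg")


def video_to_images(video_path, output_dir, number_images=600,
                    width=640, height=360, delay=1):
    """Save up to number_images resized frames; return how many were saved."""
    # Imported here so frame_filename() stays usable without OpenCV.
    import cv2

    os.makedirs(output_dir, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of video or read error
        # cv2.resize expects the target size as (width, height).
        frame = cv2.resize(frame, (width, height))
        if index < number_images:
            cv2.imwrite(frame_filename(output_dir, index), frame)
        cv2.imshow("Vehicle Video", frame)
        index += 1
        # Wait `delay` ms between frames; pressing 'q' stops early.
        if cv2.waitKey(delay) & 0xFF == ord("q"):
            break
    capture.release()
    cv2.destroyAllWindows()
    return min(index, number_images)
```

Saving only the first number_images frames while still displaying the rest mirrors the behavior described in step 2. Sampling every n-th frame instead would give a more varied dataset, but that is a design alternative and not what the text describes.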
Image Processing
The initial code segment aims to enhance the contrast of an input image using Histogram Equalization and Contrast Limited Adaptive Histogram Equalization (CLAHE). The steps are outlined as follows:
• Read Image:
  o Read the input image ('D:\KhoaLuanTotNghiep\Image\image_1.jpg') using OpenCV
• Convert to LAB Color Space:
  o Convert the image from BGR (the channel order OpenCV uses when reading images) to the LAB color space using cv2.cvtColor()
• Split into Channels:
  o Split the LAB image into its three channels: L, a, and b
• Apply CLAHE to L Channel:
  o Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the L channel to enhance local contrast
• Merge Channels and Convert Back:
  o Merge the processed L channel with the original a and b channels
  o Convert the LAB image back to the BGR color space
• Display and Evaluate:
  o Display the original and enhanced images side by side
  o Optionally, save the enhanced image as 'enhanced_image01.jpg'
The subsequent code segment evaluates the quality of the image after Histogram Equalization by analyzing the mean, standard deviation, and entropy of each color channel (Red, Green, Blue). Additionally, histograms for each channel are plotted to visualize the distribution of pixel intensities.
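The three statistics can be computed per channel as sketched below with NumPy. The function name and the sample channel are assumptions; entropy here means the Shannon entropy of the intensity histogram in bits, which is one common reading of the term and may differ from the definition used in the original code.

```python
import numpy as np


def channel_statistics(channel):
    """Mean, standard deviation, and entropy (bits) of one 8-bit channel."""
    values = channel.ravel()
    # Histogram over the 256 possible intensity levels.
    hist = np.bincount(values, minlength=256)
    # Probabilities of the levels that actually occur.
    prob = hist[hist > 0] / values.size
    entropy = float(-(prob * np.log2(prob)).sum())
    return float(values.mean()), float(values.std()), entropy


# Hypothetical channel: half the pixels black, half white.
sample_channel = np.array([[0, 0], [255, 255]], dtype=np.uint8)
mean, std, entropy = channel_statistics(sample_channel)
```

For a color image read with OpenCV, the function would be applied to each of the three channels in turn. A higher standard deviation and entropy after equalization generally indicate a wider, more evenly used intensity range.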
The final part introduces Gaussian blur to the image using a custom Gaussian kernel. The steps are as follows:
• Define Gaussian Kernel:
  o Define a function gaussian_kernel() to generate a Gaussian kernel based on the desired size and sigma
• Apply Gaussian Blur:
  o Define a function apply_gaussian_blur() to apply the Gaussian blur to the input image using the generated kernel
• Blur the Image:
  o Read an image: ('D:\KhoaLuanTotNghiep\Vehicle Detection\enhanced_image01.jpg')
  o Apply Gaussian blur to the image
• Display the Blurred Image:
  o Display the original and blurred images for visual comparison
• Evaluate Image Quality After Gaussian Blur:
  o Output histograms for each color channel (Red, Green, Blue) after the Gaussian blur
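The two functions named above might look like the following sketch. The names gaussian_kernel() and apply_gaussian_blur() come from the text, while the bodies, the default size and sigma, and the edge-replication padding are assumptions. The sketch handles a single 2-D channel; a color image would be blurred channel by channel.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Normalized size x size Gaussian kernel (size should be odd)."""
    half = size // 2
    coords = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(coords, coords)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    # Normalize so the blur does not change overall brightness.
    return kernel / kernel.sum()


def apply_gaussian_blur(image, size=5, sigma=1.0):
    """Blur a 2-D array by convolving it with a Gaussian kernel."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    rows, cols = image.shape
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(image.astype(np.float64), half, mode="edge")
    blurred = np.zeros((rows, cols), dtype=np.float64)
    # Sum shifted copies of the image weighted by the kernel entries.
    # The kernel is symmetric, so this equals a true convolution.
    for dy in range(size):
        for dx in range(size):
            blurred += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return blurred
```

In practice cv2.GaussianBlur performs the same operation far faster; writing the kernel by hand mainly helps to see how size and sigma control the amount of smoothing.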
Figure 14: Image Processing (Before and After)
Color Histogram (before and after):
Figure 15: Color Histogram (Before and After)
Label Images
We use the tool available on GitHub: LabelImg [5]: https://github.com/HumanSignal/labelImg
The output after using the tool to label the objects in a photo is an XML file.
The labeled dataset consists of 450 images for training and 150 images for validation.
Convert XML file to Record file and Create pbtxt file
Because the TensorFlow 2 Object Detection API requires TFRecord (.record) files for training, we need to convert the annotations from XML files to Record files:
• First, we convert the XML files to a CSV file
• After that, we convert the CSV file to a Record file
Figure 19: Convert XML to CSV, CSV to Record
Use Google Colab to train
Download and apply model for system
After training successfully, we exported the trained model, which contains the file saved_model.pb, and downloaded it.
Model Evaluation
The model's performance is evaluated using the following metrics:
• Precision: The proportion of detected vehicles that are actually vehicles
• Recall: The proportion of actual positives that are correctly identified
• Loss function: A measure of the model's error during training, indicating how well it fits the training data
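Precision and recall follow directly from the counts of true positives (vehicles correctly detected), false positives (detections that are not vehicles), and false negatives (vehicles that were missed). The counts in the sketch below are hypothetical and are not results from this project.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts."""
    detected = true_positives + false_positives   # everything the model reported
    actual = true_positives + false_negatives     # everything that was really there
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 80 correct detections, 20 false alarms, 10 misses.
precision, recall = precision_recall(80, 20, 10)
```

With these counts, precision is 80 / 100 = 0.8 and recall is 80 / 90, roughly 0.89. A detection is usually counted as a true positive when its Jaccard index with a ground-truth box exceeds a chosen threshold.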
The graphs for our model were generated in Google Colab using functions included in the Scikit-learn library.
Here are some observations we drew from the loss function:
• Convergence:
  o Decreasing loss: If both the classification loss and localization loss graphs consistently decrease and eventually plateau, it indicates that the model is learning and converging towards a solution. This is a positive sign.
  o Persistently high loss: If the loss values remain high or even increase, it suggests that the model is struggling to learn effectively. This could be due to various factors such as insufficient training data, an overly complex model, or suboptimal hyperparameters.
• Stability:
  o Smooth decline: A smooth, gradual decline in loss values indicates a stable training process. This is desirable as it suggests the model is learning consistently without significant fluctuations.
  o Erratic fluctuations: Large or frequent spikes in the loss values can signal instability in training. This might be caused by issues such as a learning rate that is too high, exploding gradients, or other problems with the optimization algorithm.
Monitoring the trends of classification and localization losses during training is crucial for assessing the model's progress. Similar decreasing trends in both losses suggest a balanced learning of both object classification and localization. However, significant discrepancies between the losses indicate a prioritization of one task over the other, which may hinder optimal model performance.
Here are some observations we can make about the loss function based on the image:
• Decreasing loss: Both the regularization loss and total loss curves are generally decreasing over time, which is a positive sign. This suggests that the model is learning and improving its performance as the training progresses.
• Smoother regularization loss: The regularization loss curve appears to be smoother than the total loss curve, with fewer fluctuations. This indicates that the regularization term is helping to stabilize the training process and prevent overfitting.
• Starting loss values: The initial regularization loss is higher than the initial total loss. This could be due to the specific hyperparameters used for the regularization term, or it could be an indication that the model initially had a lot of complexity that needed to be penalized.
• Rate of decline: Both loss curves seem to be declining at a similar rate, which suggests that the model is making progress on both the classification and localization tasks simultaneously.
• Plateauing: Neither curve appears to be fully plateauing yet, although the rate of decline seems to be slowing down slightly. This suggests that the model is still learning and has not yet reached its optimal performance.
Overall, the loss function in the image suggests that the model is training well and making progress on the vehicle detection and counting task.
Some general insights based on the information provided by the image:
To train complex models, or to train on large datasets, effectively, a high learning rate, such as 0.08, may be utilized initially. This facilitates rapid progress through the parameter space during the early stages of training. However, as training progresses, it is often beneficial to decrease the learning rate to enable finer tuning and stabilize the training process.
• The image appears to show a plot of the learning rate over time. Ideally, the learning rate should decrease gradually over the course of training. This allows the model to refine its parameters more precisely as it approaches the optimal solution.
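One common way to obtain such a gradual decrease is a cosine decay schedule, sketched below. The base rate of 0.08 is the value mentioned above; the choice of cosine decay, the function name, and the total of 10000 steps are assumptions for illustration and are not taken from the training configuration of this project.

```python
import math


def cosine_decay(step, base_lr=0.08, total_steps=10000):
    """Learning rate at `step`, decaying smoothly from base_lr to 0."""
    # Clamp the step so the schedule is defined outside 0..total_steps.
    progress = min(max(step, 0), total_steps) / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, the midpoint, and the end of training.
lr_start = cosine_decay(0)
lr_middle = cosine_decay(5000)
lr_end = cosine_decay(10000)
```

The rate starts at 0.08, reaches half of that at the midpoint, and approaches zero at the end, so the large early updates give way to the finer adjustments described above.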
Although it still needs further optimization and a better dataset in the future, the SSD MobileNet model demonstrates effective vehicle detection and counting capabilities on the highway video dataset and is currently the most suitable model for our project goal.
Count Vehicles
Firstly, we import all the necessary packages for the project. After that, we create an instance of the EuclideanDistTracker() object from our previously developed tracking program and name the object "tracker".
The variables confThreshold and nmsThreshold represent the minimum confidence score required for detection and the Non-Maximum Suppression (NMS) threshold, respectively.
Post-process the output data from the detection process
Initially, we create an empty list named 'detected_classNames' to store all the detected classes in a frame. Using two nested for loops, we traverse each vector of every output, gathering the confidence score and class ID index.
Following that, we check whether the class confidence score exceeds our predefined 'confThreshold'. If it does, we gather information about the class, storing the box coordinate points, class ID, and confidence score in three distinct lists.
By using the NMSBoxes() method, we reduce the number of boxes, retaining only the best detection box for each detected object.
After obtaining all the detections, we follow those objects with the help of the tracker object. The tracker.update() function is responsible for tracking each detected object and updating its position.
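The tracking program itself is not reproduced in this report, so the following is only a sketch of how a tracker of this kind can work: each detection is matched to the nearest center seen in the previous frame, and a new ID is issued when no center is close enough. The class name and the update() method follow the text; the internals and the 25-pixel matching distance are assumptions.

```python
import math


class EuclideanDistTracker:
    """Assign stable IDs to boxes by matching centers between frames."""

    def __init__(self, max_distance=25):
        self.center_points = {}   # object ID -> last known center (cx, cy)
        self.next_id = 0
        self.max_distance = max_distance

    def update(self, boxes):
        """boxes: list of (x, y, w, h). Returns list of (x, y, w, h, id)."""
        tracked = []
        new_centers = {}
        for x, y, w, h in boxes:
            cx, cy = x + w // 2, y + h // 2
            # Look for the closest known center within max_distance
            # that has not already been matched in this frame.
            best_id, best_dist = None, self.max_distance
            for object_id, (px, py) in self.center_points.items():
                dist = math.hypot(cx - px, cy - py)
                if dist < best_dist and object_id not in new_centers:
                    best_id, best_dist = object_id, dist
            if best_id is None:
                # No match: treat the detection as a new object.
                best_id = self.next_id
                self.next_id += 1
            new_centers[best_id] = (cx, cy)
            tracked.append((x, y, w, h, best_id))
        # Forget objects that were not seen in this frame.
        self.center_points = new_centers
        return tracked
```

Calling update() once per frame with the boxes that survive NMS returns the same boxes with an ID appended, which is what a counting step needs to avoid counting one vehicle several times.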
The function count_vehicle is a custom function designed to tally the number of vehicles that have crossed the road [6]. The count_vehicle function:
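Since the original listing is not reproduced here, the sketch below shows one simple way such a function can work: a vehicle is counted the first time the center of its tracked box moves across a horizontal counting line. The signature, the single counting line, and the sample values are assumptions and may differ from the function used in the project.

```python
def count_vehicle(tracked, line_y, last_y, counted_ids):
    """Count tracked boxes whose center crossed line_y since the last frame.

    tracked      -- iterable of (x, y, w, h, object_id) from the tracker
    line_y       -- vertical position of the counting line in pixels
    last_y       -- dict of object_id -> center y in the previous frame (updated)
    counted_ids  -- set of IDs already counted (updated)
    Returns the number of new crossings in this frame.
    """
    new_crossings = 0
    for x, y, w, h, object_id in tracked:
        cy = y + h // 2
        previous = last_y.get(object_id)
        if previous is not None and object_id not in counted_ids:
            crossed_down = previous < line_y <= cy
            crossed_up = previous > line_y >= cy
            if crossed_down or crossed_up:
                counted_ids.add(object_id)
                new_crossings += 1
        last_y[object_id] = cy
    return new_crossings


# Hypothetical vehicle with ID 7 moving down past a line at y = 100.
last_positions, counted = {}, set()
total = 0
for box_y in (80, 95, 110):
    total += count_vehicle([(0, box_y, 20, 20, 7)], 100, last_positions, counted)
```

The vehicle is counted once, in the frame where its center moves from above the line to below it, and the set of counted IDs prevents it from being counted again in later frames.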
Results
1. Open a video: This button lets you pick and open a video file
2. Detect Vehicles: Clicking this makes the system detect the vehicles in the video
3. Reset: Resets the state of the program and allows the user to choose another video
Open a video: The video opens successfully.
Detect Vehicles: Detection is quite accurate.
CONCLUSION
Knowledge Acquisition
• Gained practical insights into traffic management challenges, including the intricacies of vehicle detection, counting, and traffic data analysis
• Developed a comprehensive understanding of the necessity and interconnectedness of the system in daily life, particularly in the context of modern urban traffic challenges.
Achievements
• Knowledge: Acquired proficiency in designing the UX/UI of the vehicle detection and counting system
• Skills:
  o Demonstrated the ability to work collaboratively within a group and contribute effectively to successful teamwork
  o Utilized tools to streamline the team's workflow
• Product: Developed the application “Building a Vehicle Detection and Counting System”
Advantages and Disadvantages
Advantages:
• High Accuracy: Leveraging machine learning algorithms and models, the system achieves good accuracy in classifying and counting vehicles when trained with quality input data.
• Real-Time Operation: A significant strength lies in its real-time capabilities. The system swiftly processes and identifies traffic, providing live information and continuous data on the number of vehicles on the road.
• Automation Benefits: By automating vehicle identification and counting tasks, the system reduces reliance on human intervention. This not only saves time and effort but also diminishes the likelihood of errors, enhancing the overall reliability of counting results.
• Scalability and Flexibility: The system demonstrates scalability and flexibility, adaptable to diverse locations and environments. Its deployment can extend to streets, parking lots, toll booths, and various other settings, effectively collecting traffic flow data.
Disadvantages:
• Stability and Reliability: The system's stability and reliability hinge on various factors such as lighting and weather conditions, along with challenges like vehicle overlap and occlusion. These variables can occasionally impede accurate recognition and counting.
• Vehicle Definition and Classification Complexity: Identifying and classifying vehicles becomes intricate, especially with a diverse range of vehicle types and shapes. Non-standard vehicles, obscured objects, and variations in size pose challenges to precise identification and classification.
• Configuration and Tuning Challenges: Achieving optimal performance requires careful configuration and tuning. Fine-tuning parameters, selecting suitable models, and adjusting classification thresholds can be labor-intensive.
• Cost and Implementation Complexity: The construction of a vehicle identification and counting system involves intricate hardware and technology. Deployment and maintenance demand financial and human investments to ensure sustained and effective functionality.
OpenCV provides a framework for developing a system that detects and counts vehicles using computer vision algorithms. This system involves capturing images or videos, applying image processing techniques to extract features, and using machine learning models to identify and count vehicles.
Object Detection API: TensorFlow offers a comprehensive API for building object detection models. This API enables developers to train and deploy models for detecting specific objects within images or videos. It utilizes a neural network architecture to process input data, extract features, and classify the presence of target objects. The API provides tools for data preprocessing, model training, and evaluation, making it a valuable resource for object detection tasks.
Pham Thi Hong Anh. Sử dụng Tensorflow API cho bài toán Object Detection (2019). https://viblo.asia/p/su-dung-tensorflow-api-cho-bai-toan-object-detection-aWj534YGK6m
[3] Nguyễn Tiến Đạt. Histogram - Histogram equalization (2020). https://viblo.asia/p/tuan-3-histogram-histogram-equalization-3P0lPnxmKox
[4] Dục Đoàn Trình. Gaussian Blur trong OpenCV (2021). https://websitehcm.com/opencv-gaussian-blur/
[5] Tzutalin. LabelImg, Git code (2015). https://github.com/tzutalin/labelImg
[6] Techvidvan. Vehicle Counting, Classification & Detection using OpenCV & Python (2021). https://techvidvan.com/tutorials/opencv-vehicle-detection-classification-counting/