With a database stored on Google Firebase, in addition to the website, this system makes managing employee information more convenient, and the manager can monitor the total attendance of all employees.
OVERVIEW
Introduction
Amidst the technological advancements and advent of Artificial Intelligence (AI), smart appliances have proliferated to cater to human needs. Manual tasks are increasingly replaced by AI-driven systems, enhancing convenience and security. The implementation of AI in facial recognition for smartphone unlocking and online payments, as well as smart door systems that utilize facial recognition in homes and vehicles, exemplifies its widespread use. Inspired by these advancements, the development of a system for attendance verification in a small office utilizing facial recognition was proposed. This project, aptly titled "Design and Implementation of a Smart Attendance System applying Facial Recognition," aimed to explore the practical application of facial recognition technology in addressing workplace attendance management.
The main goal of the system is to design and deploy a facial recognition-based attendance system for users. This involves collecting data from the user's facial features to train a detection model. Additionally, users can check in or out by raising their fingers in front of the camera. The system is fully automated, avoiding any physical buttons and relying solely on the camera to save costs. The system's data is stored in a database (Firebase), allowing administrators to add or manage data either directly on Firebase or through the web application that we have developed.
Objectives
The project "Design and Implementation of a Smart Attendance System applying Facial Recognition" was carried out to design a system with the following functions:
- The system allows employees to check attendance automatically when they look at the camera of the system. The employee uses their finger to check in when they arrive at the office in the morning or check out in the afternoon
- The system has a screen for showing the employee's information after a successful attendance check. The system shows the name, ID, major and address of the employee
- The system has a website that streamlines HR processes in the small office. Its user-friendly interface enables administrators to manage employee data effectively, from adding and deleting employees to searching for employee information based on their unique ID numbers. This centralized online platform provides a convenient and efficient way to manage employee data, enhancing the productivity of HR operations
- The data of the system is sent to Google Firebase, which makes it easy to store and manage.
Scope of study
The scope of this project is based on the following tasks:
- Design and implement an attendance system for employees in a decentralized office
- Utilize Raspberry Pi 4 as a signal and image processing unit within the system
- Implement the YOLOv8 model to process facial images of all employees based on a pre-customized dataset
- Deploy a database for the system on Google Firebase
- The system will operate based on the Internet, therefore, a stable internet connection is essential for seamless system functionality.
Research methods
The necessary research methods for this topic are as follows:
- Find and learn about the YOLOv8 model
- Conduct research using the OpenCV library and collect data for the dataset based on the Face Recognition module
- Write a program to gather a dataset and integrate the Hand Detection module into this product
- Build and test the functions of the system
Outline
The report is divided into 5 chapters:
- Chapter 1: Overview. Introduces the project, its purpose, the research methods and the scope of the project
- Chapter 2: Background. Introduces some related topics, as well as compares them with the current topic
- Chapter 3: System Design. From the requirements, this chapter presents the system block diagram, the steps to design and implement both the hardware system and the software interface, as well as an algorithm for processing data
- Chapter 4: Results. Presents the results of the above design, checks its functions and analyses the results
- Chapter 5: Conclusions and Future Work. Based on the results, we give conclusions about the advantages and disadvantages of the system, as well as some directions for future development
BACKGROUND
YOLO Network
The You Only Look Once (YOLO) method suggests employing an end-to-end neural network that makes simultaneous predictions of bounding boxes and class probabilities. This contrasts with the conventional approach of previous object detection algorithms, which repurposed classifiers to accomplish detection tasks.
Figure 2.1 Development time of YOLO network [1]
The YOLO algorithm begins by taking an image as its input, employing a straightforward deep convolutional neural network to identify objects within the image. The underlying architecture of the CNN model integral to YOLO is illustrated in the diagram below.
Figure 2.2 The structure of YOLO network [2]
The initial 20 convolution layers of the model undergo pre-training with ImageNet, incorporating temporary average pooling and fully connected layers. Subsequently, the pre-trained model is adapted for detection, as prior research has demonstrated performance improvement by adding convolution and connected layers to a pre-trained network. The final fully connected layer in YOLO serves the dual purpose of predicting class probabilities and bounding box coordinates.
YOLO divides the input image into an S × S grid, designating each grid cell as responsible for detecting an object if the object's center falls within it. Within each grid cell, YOLO predicts B bounding boxes and assigns confidence scores to these boxes. These confidence scores gauge the model's certainty about the presence of an object within the box and the accuracy of the predicted box.
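The grid-cell assignment described above can be sketched in a few lines of Python; the grid size S = 7 is only an illustrative value, not a parameter taken from this report:

```python
def responsible_cell(x_center, y_center, S=7):
    """Map a normalized object center (values in 0..1) to the grid cell
    responsible for detecting it, as in YOLO's S x S grid."""
    col = min(int(x_center * S), S - 1)  # clamp centers on the right/bottom edge
    row = min(int(y_center * S), S - 1)
    return row, col

# An object centered at (0.55, 0.54) on a 7 x 7 grid falls in cell (3, 3).
print(responsible_cell(0.55, 0.54))  # → (3, 3)
```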
During training, the objective is to assign one bounding box predictor to each object. YOLO accomplishes this by designating the predictor with the highest Intersection over Union (IoU) with the ground truth as the "responsible" predictor. This specialization fosters improved predictions for specific sizes, aspect ratios, or classes of objects, thereby enhancing the overall recall score.
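The IoU used to pick the "responsible" predictor is the ratio of the overlap area of two boxes to the area of their union. A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two unit-offset 2x2 boxes: overlap area 1, union area 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # → 0.14285714285714285
```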
A pivotal technique employed in YOLO models is non-maximum suppression (NMS). NMS is a post-processing step that enhances the accuracy and efficiency of object detection. Given that multiple bounding boxes may be generated for a single object, potentially overlapping or located at different positions, NMS identifies and removes redundant or incorrect boxes. The result is a streamlined output with a single bounding box representing each object in the image.
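Greedy NMS as described above can be sketched as follows; the boxes, scores and threshold are illustrative values, not outputs of the actual system:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression.
    boxes: list of (x1, y1, x2, y2); scores: matching confidences.
    Returns the indices of the boxes that are kept."""
    def iou(a, b):
        ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest-scoring remaining box wins
        keep.append(best)
        # drop every remaining box that overlaps the winner too much
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# Two heavily overlapping detections of one face collapse to the stronger one;
# the distant third box survives.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]
```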
Mediapipe
Mediapipe is a library researched and developed by Google [3]. MediaPipe combines a wide range of cross-platform machine learning solutions with several advantages: it is deployable on mobile, desktop, cloud, web and IoT appliances, and it is open source and completely free (users can use and customize it directly to fit their own problems).
Mediapipe supports almost all areas of Computer Vision. Some solutions include face detection, face mesh, hand detection, human pose estimation, object detection, and much more. Mediapipe also includes a hand detection solution to identify and locate hands. The hand detection solution consists of two models: the palm detection model and the hand landmarks detection model. The palm detection model detects the hand regions in the input image, while the hand landmarks detection model identifies and extracts specific points of the hand within those regions. The extracted key points comprise 21 landmark coordinates, each normalized to the width and height of the image according to the point's position.
Figure 2.3 Landmarks positions detected by Mediapipe [3]
Figure 2.3 gives information about the 21 hand landmarks that Mediapipe identifies and extracts. For each key point, Mediapipe returns the location of the point in the image, normalized to the width and height of the image. The reference origin is the top-left corner of the image. For each different hand gesture, Mediapipe returns a different set of values, which makes the output suitable for training a model to recognize hand gestures.
Figure 2.4 Example of hand landmarks used in the design
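As a rough illustration of how such landmark output can be consumed, the sketch below converts a normalized landmark to pixel coordinates and applies a simple "index finger raised" heuristic. It assumes Mediapipe's landmark ordering (index 8 is the index fingertip, index 6 its PIP joint) and uses synthetic values rather than real detector output:

```python
# Landmark indices follow Mediapipe's hand model; coordinates are normalized
# to the image size, with the origin at the top-left corner.
INDEX_TIP, INDEX_PIP = 8, 6

def to_pixels(landmark, img_w, img_h):
    """Convert a normalized (x, y) landmark to pixel coordinates."""
    x, y = landmark
    return int(x * img_w), int(y * img_h)

def index_finger_up(landmarks):
    """Simple heuristic: the index finger is raised when its tip is above
    (smaller y than) its PIP joint, since y grows downward in image space."""
    return landmarks[INDEX_TIP][1] < landmarks[INDEX_PIP][1]

# Toy hand: 21 synthetic landmarks with the index fingertip raised.
lms = [(0.5, 0.875)] * 21
lms[INDEX_PIP] = (0.5, 0.5)
lms[INDEX_TIP] = (0.5, 0.25)
print(to_pixels(lms[INDEX_TIP], 640, 480))  # → (320, 120)
print(index_finger_up(lms))                 # → True
```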
Google Firebase
Figure 2.5 Logo of Google Firebase
Firebase by Google comprises a collection of cloud-based development tools designed to assist mobile and web app developers in the creation, deployment, and expansion of their applications.
Firebase offers a range of functionalities, including:
- Authentication: Firebase provides developers with a robust and secure authentication system, enabling users to seamlessly log in to their applications. Leveraging Firebase Authentication, developers can integrate multiple sign-in options, including email and password, Google Sign-In, and Facebook Login, providing users with a convenient and secure authentication experience
- Realtime Database: The Firebase Realtime Database is a cloud-hosted NoSQL database that enables organizations to store and synchronize data in real time across all user devices. This feature streamlines the development of apps that remain consistently updated, even when users are offline
- Cloud Messaging: Firebase Cloud Messaging (FCM) is a service designed for businesses to send messages to users' devices, even if they are not actively using the app. Developers can utilize FCM to send push notifications, update app content, and perform other communication tasks
- Crashlytics: Firebase Crashlytics is a service dedicated to helping organizations track and address crashes in their applications. It provides comprehensive crash reports, enabling quick identification of the root cause and efficient problem resolution
- Performance Monitoring: Firebase Performance Monitoring offers insights into an app's performance. Organizations can use this feature to monitor metrics such as CPU usage, memory usage, and network traffic, ensuring optimal performance
- Test Lab: Firebase Test Lab is a cloud-based service that allows developers to test their apps on various devices and configurations. This capability assists in verifying that the app functions effectively across different devices and under diverse network conditions
Figure 2.6 Mobile app or Web app development process
SYSTEM DESIGN
System Requirements
In order to achieve the set target functions, our team decided to design a smart attendance system applying facial recognition, consisting of one microcontroller, one power supply, one camera and one screen.
The microcontroller is a Raspberry Pi 4, which is responsible for analyzing the image data collected by the camera; it sends the results to Google Firebase and shows the employee's information on the system's screen. The data of all employees in the system is shown on a website managed by the manager.
The system consists of the following functions:
- Employees can conveniently track their attendance using a system based on face recognition technology. Upon arriving at or departing from the office, employees simply look into the system's camera, allowing it to capture their facial features. This eliminates the need for manual check-in or check-out processes, streamlining attendance management and ensuring accuracy
- The system has a screen for showing the employee's information after a successful attendance check. The system shows the name, ID, major and address of the employee
- The system also has a website showing the information of all employees in the small office, with functions such as adding/deleting employees, or finding an employee's information based on their ID number
- The data of the system is sent to Google Firebase, which makes it easy to store and manage
The figure below shows the operation of the system. When the system starts, the employee stands in front of the camera of the system to check attendance. After a successful check, the microcontroller displays the employee's information on a screen based on the recognition result. It also sends the data to the database.
Block Diagram
For this project, the block diagram of the whole system is designed as in Figure 3.2. The system has three main blocks: a processing block, a camera block and a user interface block. The system also has a power supply to provide power for the whole system.
- Camera Block: This block is a peripheral camera that works continuously to receive facial images from the employee and send them to the processing block for identification and analysis, which in turn sends data to Google Firebase via Wi-Fi
- Central Processing Block: This is the block that handles the main functions of the system. In particular, this block contains two sub-blocks: hand detection and face detection. The hand detection block analyzes the user's choice (check in or check out), while the face detection block analyzes the user's face to assign the corresponding user ID
- Displaying Block: This block is a screen whose only function is to show all the information related to the employee, including name, age, major, address, ...
- Database Block: This block contains the data of the entire system: user data, attendance dates, and the number of attendances in the month
- Power Supply Block: provides voltage and current so that the whole system can perform its tasks properly.
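To make the database block concrete, the sketch below shows one plausible shape for an employee record and a check-in update. The field names are illustrative assumptions, not the exact schema used by the system:

```python
from datetime import datetime

# Hypothetical shape of one employee record as it might be stored in the
# Realtime Database; the field names here are assumptions for illustration.
employee = {
    "name": "Nguyen Van A",
    "major": "Electronics",
    "address": "Ho Chi Minh City",
    "total_attendance": 12,          # attendances this month
    "last_attendance": "2024-01-08 08:55:03",
}

def check_in(record, now=None):
    """Increment the monthly attendance counter and stamp the time."""
    now = now or datetime.now()
    record["total_attendance"] += 1
    record["last_attendance"] = now.strftime("%Y-%m-%d %H:%M:%S")
    return record

check_in(employee, datetime(2024, 1, 9, 8, 50, 0))
print(employee["total_attendance"])  # → 13
print(employee["last_attendance"])   # → 2024-01-09 08:50:00
```

In the real system, such a record would be written to and read from Firebase rather than kept in a local dictionary.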
Hardware design
At the request of the system, the project needs a camera that can collect clear, high-resolution images to improve image quality for processing. There are many camera modules suitable for this AI project; Table 3.1 below lists some suitable cameras.
Camera name Megapixels Price
Raspberry Pi Camera Module V3 12 790,000VND
Raspberry Pi Camera Module V2 8 750,000VND
Comparing the results of Table 3.1, the Logitech Webcam C270 HD is the most suitable camera, because 3 megapixels is enough for this project and this camera is also the cheapest of all the cameras in the table, which saves on the cost of the project. In addition, this webcam has a resolution of 720p at 30 fps; 30 frames per second also greatly helps with continuous image reading, minimizing frame loss.
- Focal Type: fixed focal length
- Mic Range: Up to 1 meter
For the project's facial recognition application, the team requires a powerful mini computer capable of efficiently processing images. This computer should meet the system's specifications and possess the requisite processing capabilities for image handling.
There are many mini computers for AI projects. Table 3.2 lists some suitable mini-computers.
Table 3.2 Mini computers comparison table
Mini computer name CPU RAM Wifi Price
Raspberry Pi 4 B Cortex-A72 1GB Yes 1,300,000VND
Raspberry Pi 4 B Cortex-A72 2GB Yes 1,500,000VND
Raspberry Pi 4 B Cortex-A72 4GB Yes 2,200,000VND
Raspberry Pi 4 B Cortex-A72 8GB Yes 2,800,000VND
NVIDIA Jetson Nano B01 ARM A57 4GB None 5,000,000VND
After comparing the results of Table 3.2, the Raspberry Pi 4 B model with 8 GB of RAM is the most suitable mini-computer, because image processing requires a lot of RAM.
- Micro-SD card: for storing the operating system (OS installation)
As required by the system, we need a screen for monitoring the employee's information. Since we do not interact with the screen, a normal screen is sufficient; a touch screen or an expensive screen is unnecessary. We can also use VNC Viewer, an application on a computer or laptop, to monitor the information through the Internet.
As shown above, the Raspberry Pi acts as a miniature computer, able to display the operating system interface on any screen.
For the system to operate, the power supply block is indispensable. For this system, we provide the main source for the central processing block (Raspberry Pi 4). Currently on the market there are many types of power supplies for the Raspberry Pi 4. Based on criteria such as quality, safety, and cost, we have chosen the adapter shown below.
- Power consumption (no load): Maximum 0.075W
- Inrush current: No damage will occur and the input fuse will not blow
Output:
- Rise time: maximum 100ms to specified limit for DC output
- Turn-on delay: maximum 3000ms at input AC voltage and full load
3.3.5 Connection diagram of the entire system
In order for the system to operate stably and in accordance with the target functions, our team selected blocks with appropriate functions. These blocks are interconnected as shown in the figure below:
As we can see, this is a rough image of how the blocks in the system connect together. For the central processing block, we use the Raspberry Pi 4 as the main hardware and the brain of the entire system. To power this Raspberry device, we use a genuine power adapter through the Type-C port. Besides, to provide the best and most realistic images for the central processing block, we use the Logitech C270 HD webcam. This is a quite good webcam in its price range, and to save money on the system, we connect the webcam to the Raspberry Pi 4 via the USB port. To display the entire system interface, as well as the main operations in the system, we provide a removable screen of good quality. Additionally, we can use any screen, or a laptop screen via VNC.
Like the system hardware diagram, the schematic diagram of the system also has 4 parts:
The figure "System Schematic" illustrates how we connect each part. The Raspberry Pi 4 is responsible for receiving image data from the camera and sending the employee's data to the screen.
Software design
Before working with the YOLOv8 model, we need to learn about the structure of the model and the blocks and classes present inside it.
With the above structure, the model is built from two parts: the backbone and the head. The backbone is the deep learning architecture that acts as a feature extractor. The head combines the features acquired from various layers of the backbone and predicts the classes and bounding box regions, which form the final output of the object detection model. The model structure has a total of 7 Conv blocks, 8 C2f blocks, 4 Concat blocks, 1 SPPF block, 2 Upsample layers and 3 Detect blocks.
- The Conv block: In YOLOv8, the Conv block serves as the foundational convolutional unit, combining a Conv2d layer, a BatchNorm2d layer, and a SiLU activation layer into a single, efficient component. This design streamlines the convolutional process, enhancing the overall performance of the network
- The C2f block: This block contains a convolutional block, after which the resulting feature maps are split: one part goes to the bottleneck block while the other goes directly into the concat block. A C2f block can contain many bottleneck blocks, and at the end there is another convolutional block
- The Bottleneck block: It is a sequence of convolutional blocks with a shortcut
- The SPPF block: SPPF stands for Spatial Pyramid Pooling Fast, a modification of spatial pyramid pooling with higher speed. Inside SPPF, there is a convolutional block at the beginning, followed by 3 MaxPool2d layers. Interestingly, every resulting feature map is concatenated right before the end of SPPF, which finishes with a convolutional block
- The Detect block: This is where the detection happens. The detect block contains two tracks: the first track is for bounding box prediction, whereas the other is for class prediction. Both tracks have the same block sequence: two convolutional blocks followed by a single Conv2d layer
- The Upsample layer: It is used to increase the resolution of a feature map so that it matches the feature map it is concatenated with
To train the model above, we need a separate dataset for this project. Based on [5], the dataset for this system includes facial images of employees from various angles and under different lighting conditions. With this dataset, we developed a specific piece of code to collect data instead of manually labeling each image (which is time-consuming and impractical for this project).
To collect this dataset, we continuously save the detected faces, along with the corresponding information for each image in a text file containing details such as the bounding box coordinates and the class.
Figure 3.10 Bounding box on Image User
Figure 3.11 Information about bounding boxes
- 0.550781: x center (followed by the width of the bounding box)
- 0.542708: y center (followed by the height of the bounding box)
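The sample values above follow the YOLO label convention: class index, then the box center and size, all normalized to the image dimensions. The sketch below generates such a label line from a pixel bounding box; the pixel coordinates and the 640 × 480 frame size are assumptions chosen to reproduce the sample center values:

```python
def yolo_label(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel bounding box (x1, y1, x2, y2) into a YOLO label line:
    class, x-center, y-center, width, height, all normalized to image size."""
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A face detected at pixels (273, 202)-(432, 319) in an assumed 640 x 480 frame
# prints a label line beginning "0 0.550781 0.542708".
print(yolo_label(0, 273, 202, 432, 319, 640, 480))
```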
After exploring several modules shared by users, we found the Face Detection module by Computer Vision Zone. This module automatically detects the faces of each person. Additionally, based on this module, we created new code and adjusted some parameters to enhance data collection for the dataset. We collected about 13,497 images for the custom dataset, of which about 70% were used for training (9,448 images), 23% for testing (3,150 images), and 7% for validation (899 images).
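A train/test/validation split of this kind can be sketched as below; the ratios are approximations of the report's split, so the resulting counts differ slightly from the exact 9,448 / 3,150 / 899 figures:

```python
import random

def split_dataset(items, train=0.70, test=0.23, seed=42):
    """Shuffle and split a list of image paths into train/test/validation
    subsets; whatever remains after train and test goes to validation."""
    items = list(items)
    random.Random(seed).shuffle(items)   # fixed seed for reproducibility
    n_train = int(len(items) * train)
    n_test = int(len(items) * test)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])

images = [f"img_{i:05d}.jpg" for i in range(13497)]
train_set, test_set, val_set = split_dataset(images)
print(len(train_set), len(test_set), len(val_set))  # → 9447 3104 946
```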
We chose Google Colab as the environment to train the model for this project, because it provides a virtual GPU for graphics processing and contains CUDA cores that accelerate image processing far beyond a CPU. After collecting the dataset, we divide the images and txt files into different folders to create a dataset in the standard format that the YOLOv8 model requires.
First, we upload the dataset to Google Drive, then connect Colab to Drive to extract the dataset from Drive. Before training, we need to download the Ultralytics library containing the YOLO model as follows:
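Part of that standard format is a data.yaml file describing where the train/val/test images live and what the classes are. A sketch that builds such a file's contents as a string, using hypothetical class names (one class per employee, as the report's ID-per-face approach suggests):

```python
def make_data_yaml(root, class_names):
    """Build the data.yaml text that Ultralytics YOLOv8 expects, pointing at
    images/train, images/val and images/test folders under the dataset root."""
    lines = [
        f"path: {root}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        f"nc: {len(class_names)}",
        "names: [" + ", ".join(class_names) + "]",
    ]
    return "\n".join(lines)

# Hypothetical dataset root and class names, for illustration only.
print(make_data_yaml("/content/dataset", ["employee_01", "employee_02"]))
```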
After that, train the YOLO model as shown below:
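The figures showing these commands are not reproduced here, but the installation and training steps are presumably along the following lines. The package name and CLI form follow the public Ultralytics distribution; the epoch count and image size are illustrative assumptions, not the report's exact settings:

```shell
# Install the Ultralytics package, which provides YOLOv8 and the `yolo` CLI.
pip install ultralytics

# Train the nano model on the custom dataset described by data.yaml, with a
# reduced input size so the Raspberry Pi 4 can later keep a usable FPS.
yolo detect train model=yolov8n.pt data=data.yaml epochs=50 imgsz=320
```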
As mentioned above, we use the YOLOv8 nano version based on [6], a lightweight variant suitable for our project, as it does not require handling complex images. Additionally, we adjust the 'imgsz' (image size) parameter. After some research, we found that for the Raspberry Pi to run the model smoothly at an adequate frame rate (FPS), it is necessary to reduce 'imgsz', so we need to find a suitable value. If 'imgsz' is too large, the FPS on the Raspberry Pi 4 when running the model is significantly low, below 1 FPS; on the other hand, if 'imgsz' is too small, the accuracy of detection is compromised, with increased chances of misclassifying objects into different classes.
After the training process is complete, we obtain two model files, namely 'best.pt' and 'last.pt'. We choose the 'best.pt' file for deployment on the Raspberry Pi.
3.4.3 Design UI interface for User
We first created a main interface for the project. Due to the complexity of the design, we found an attendance system interface online and brought it in for use, making some modifications to distinguish it and adapt it to our needs.
Figure 3.12 The project’s interface frame
This is the location where the main webcam feed is displayed
This is where the corresponding modes are displayed
The next step is the modes in this project:
Figure 3.13 The functions in mode
- Mode 1: Check-in and check-out selection. The system provides 2 options, so the user can choose to check in or check out
- Mode 2: Display employee information. After the selection, the system detects the user's face and, based on that, displays information about the user such as ID, name, date of birth, position, total attendance in the month, ...
- Mode 3: Notify employees of successful check-in. The system notifies the employee that attendance has been taken successfully
- Mode 4: Notify employees who have already checked in. The system notifies the user that attendance has already been taken, to prevent the user from taking attendance again, and to let the user check whether they have already attended
3.4.4 Web apps to manage data
In the database creation section, we mentioned the development of a web application to assist administrators in managing user data
Figure 3.14 Main interface of management software
RESULTS
Hardware implementation
The results we obtained when assembling the hardware are presented here. The hardware for this topic is quite simple and does not need too many components.
Figure 4.2 illustrates the hardware connections, with the Raspberry Pi connected to both the camera and the screen.
Figure 4.3 Display application of Project
Figure 4.3 above shows a general view of the smart attendance system applying facial recognition after hardware implementation. The system has one microcontroller, one camera, one screen and one power supply.
System operation
After completing the code and implementing the hardware, we can now see the results of the hand detection implementation. As mentioned before, hand detection lets the user choose between the check-in and check-out identifiers.
In the image above, we can see that when the user raises a finger as shown, it means the user is choosing check-in, and the system displays a rotating circle for about 4-7 seconds to confirm the user's choice.
Likewise, when the user raises a finger as in the next image, it means the user chooses check-out; the system also rotates the circle 360˚ for about 4-7 seconds to confirm the selection. This is the Mode 1 interface we designed.
After the user chooses check-in or check-out, the YOLO model we trained previously demonstrates its capabilities. It detects the user's face and assigns the user an ID; the system then relies on that ID to retrieve information about the user from Google Firebase. The results after retrieving information from the database are as follows:
Figure 4.6 Show Information of the first employee
After the model in the system detects the user's face, the system displays the user's information based on the detected face. The image above is the result for the first user.
Figure 4.7 Show Information of the second employee
On the other hand, figure 4.7 shows the information of the second user, clearly displaying the employee's details.
After the system displays information about the user, the system will notify that attendance has been successfully registered as follows:
Figure 4.8 Check IN and OUT successfully of first employee
After detecting and displaying the first user's information, the system notifies the user that attendance has been successful.
Figure 4.9 Check IN and OUT successfully of second employee
Figure 4.9 shows that the system has announced that the second user has attended successfully; this is the Mode 3 interface we designed. Next, we check whether the system notifies the user that attendance has already been taken, when the user is not sure whether he or she has attended or not.
The figure above shows that the system immediately reports the attendance status, based on the YOLO model's face detection, when the user selects a type of attendance that he or she has already performed within a certain period of time. This is the Mode 4 interface we designed.
Besides the attendance system, we also developed an attendance website for managers, so that they can manage user information. First, one of the most important functions of this management software is adding new employees.
Here the manager is required to enter the necessary fields, such as the ID, name and position of the new user, and click the INSERT button.
The second function of this management software is searching for employees; here we allow managers to search based on the user ID.
Example of the find-employee function
Figure 4.13 Show information of first employee
The result of the search function on the management website is shown above: the manager finds user information based on the ID. After searching, the website displays some basic information about the first user.
Figure 4.14 Show information of second employee
We also tested the results for the second user, as shown in figure 4.14. With this search function, managers can easily obtain the necessary information about users.
Another function of the management software is employee deletion, which helps managers remove a user's ID and information after the employee leaves work.
In the image above, the manager deletes the user based on the user's ID, which lets the manager conveniently delete that user's information without having to enter any other fields.
The last function helps managers see all the information about all users present in the system.
Figure 4.16 Show information of all employees in the office
The user management software provides comprehensive information about each user, including their profile details Additionally, it displays the number of attendance days accrued by the user during the current month.
CONCLUSIONS AND FUTURE WORK
Conclusions
The design and implementation of a smart attendance system leveraging facial recognition technology represent a significant leap forward in the realm of attendance management. This project aimed to streamline the traditional attendance-taking process by incorporating cutting-edge facial recognition algorithms and smart technology. Through meticulous planning, rigorous development, and extensive testing, we have successfully created a robust and efficient system with the potential to revolutionize attendance tracking in various settings, including educational institutions, corporate offices, and events. In addition, the project has met its proposed requirements: an automatic attendance system using only cameras, with which users can check in or out every time they arrive at or leave work. The system also helps users track information about themselves, including how many times they have taken attendance during the month. Finally, the topic has helped managers easily access employees in the company through the management software on the website, and all data is stored securely on Google Firebase.
The key findings and outcomes of this project can be summarized as follows:
- Accuracy and Efficiency: The facial recognition algorithms employed in our system have demonstrated commendable accuracy in identifying individuals, minimizing the margin for error traditionally associated with manual attendance recording. The system's efficiency is evident in its ability to process large volumes of data rapidly, contributing to a streamlined and time-saving attendance management process
- User-Friendly Interface: Users can effortlessly navigate and utilize the system thanks to its intuitive and user-friendly interface. Administrators, teachers, and students alike find the system easily accessible, fostering widespread acceptance and adoption. This seamless user experience plays a pivotal role in the successful implementation and utilization of the technology
- Real-Time Monitoring: One of the standout features of our smart attendance system is its capability for real-time monitoring. This ensures that attendance records are updated promptly, enabling administrators and educators to make informed decisions based on current attendance data. The system's ability to generate instant reports adds an extra layer of functionality for timely analysis
- Scalability: Our system is designed to be scalable, accommodating varying scales of usage without compromising on performance This scalability factor makes it adaptable to different organizational sizes and future expansions
- Cost-Effectiveness: By automating the attendance tracking process, the smart attendance system offers a cost-effective alternative to traditional methods The reduction in manual labor and the elimination of paper-based systems contribute to long-term cost savings for institutions
The drawbacks and limitations of this project can be summarized as follow:
- Accuracy and Reliability: Facial recognition systems may not be 100% accurate, leading to potential errors in attendance tracking Factors such as changes in lighting conditions, facial expressions, or even the quality of the camera can affect the system's performance
- Cost: Facial recognition systems can be costly to implement and maintain, posing financial challenges, especially for organizations with limited resources. These systems require a significant upfront investment in hardware, software, and infrastructure, followed by ongoing maintenance expenses that add to the overall cost of operation.
- Distance: The attendance system still has a weakness in terms of the distance from the user's face to the camera. Facial recognition accuracy tends to decrease as the distance between the camera and the subject increases. At greater distances, the system may struggle to capture detailed facial features, leading to potential errors in identification.
In conclusion, the successful design and implementation of this smart attendance system underscore the transformative potential of facial recognition technology in the domain of attendance management. As technology continues to evolve, our system serves as a testament to the possibilities that arise when innovation meets practical application. The positive outcomes observed in this project pave the way for broader implementations and inspire further exploration of smart technologies in various facets of organizational management.
Future work
After researching and implementing the topic, the group realized that it still needs improvement. To make the system better, adjustments such as the following can be implemented:
- Mobile application: To enhance accessibility and convenience, a mobile application should be developed so that employees can track attendance from their smartphones. Optimizing the platform for multiple devices, including both primary and secondary cameras, ensures seamless usage, flexibility, and ease of use.
- Handling special cases: Address challenging situations such as low-light conditions, different facial angles, and recognition when users are wearing masks. Integrate image enhancement algorithms to make faces more prominent in all lighting conditions.
- Integrate advanced security technology: Utilize classification and facial feature recognition to determine the level of similarity between images, reducing the risk of deception through impersonation. Integrate gesture recognition algorithms to verify the validity of user actions.
- Accuracy and Reliability: To increase accuracy and reliability, the detection model needs to be trained on a richer dataset, including images with low light quality and images of the user's face in special cases, such as when wearing masks or glasses.
- Cost and Maintenance: To optimize cost and maintenance, the system needs hardware in the price range of the Nvidia Jetson Nano Developer Kit B01. This hardware is powerful enough to process images for the AI model and is also suitable for small and medium-sized systems.
- Distance: The detection model needs training data captured at a variety of distances between the user's face and the webcam.
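As one concrete direction for the low-light handling proposed above, a global histogram equalization pass could be applied to each frame before detection runs. The sketch below is a minimal NumPy-only illustration under that assumption; a production system would more likely use an adaptive method such as OpenCV's CLAHE, and the function name and synthetic frame here are illustrative only:

```python
import numpy as np

def equalize_histogram(gray):
    """Global histogram equalization for an 8-bit grayscale frame.

    Spreads pixel intensities over the full 0-255 range so that faces
    captured in dim lighting regain contrast before detection runs.
    """
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-zero CDF value
    # Map each intensity through the normalized CDF (standard formula).
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[gray]

# A synthetic "dark" frame: all intensities squeezed into [0, 60).
rng = np.random.default_rng(0)
dark = rng.integers(0, 60, size=(120, 160), dtype=np.uint8)
bright = equalize_histogram(dark)
# The equalized frame spans the full 0-255 range.
```

Because equalization only remaps intensities, it adds negligible per-frame cost on hardware such as the Raspberry Pi or Jetson Nano mentioned above.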