Real-Time Attendance Management System Based on Face Recognition with Facial Features Techniques

Authors: Nguyen Dac Duong, Nguyen Quoc Hung
Supervisor: Dr. Pham Ngoc Son
University: Ho Chi Minh City University of Technology and Education
Major: Electronics and Communications Engineering Technology
Document type: Capstone project
Year: 2022
City: Ho Chi Minh City
Pages: 73

CONTENT

OVERVIEW

Introduction

As of April 24, 2022, the COVID-19 pandemic has resulted in over 500 million confirmed cases and more than 6.2 million deaths globally, highlighting the urgent need for effective prevention methods. With no specific treatment available, minimizing the virus's spread is crucial, achievable through practices such as physical distancing and wearing masks. Masks play a vital role in preventing the transmission of droplets from infected individuals during communication, sneezing, or coughing. The World Health Organization (WHO) emphasizes that these measures can significantly reduce COVID-19 transmission. As countries reopen, health authorities stress the importance of wearing masks in public spaces to safeguard against the virus. Therefore, developing a model to monitor mask usage and curb disease spread is a practical and essential step in addressing the ongoing health crisis.

In the digital era, identity verification using biological traits has become increasingly popular due to its ease and confidentiality. Traditional methods like access cards and ID cards are struggling to keep pace with today's fast-moving society, leading to congestion, wasted resources, and delays. To address these challenges, biometric technologies have been developed for personal identification and access rights. Among these, facial recognition has emerged as the leading application of biometric systems in computer vision over the past few decades.

Facial recognition is a physiological biometric technology that involves several key processes: image capture, face detection, feature extraction, and ultimately verification or identification. This technology utilizes various feature extraction techniques, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Local Binary Patterns (LBP), Elastic Bunch Graph Matching (EBGM), Gabor Wavelet, and Convolutional Neural Networks (CNN), to enhance accuracy and efficiency in recognizing faces.

Current face recognition systems perform effectively in controlled environments but struggle with challenges like variations in posture, occlusion, lighting conditions, low resolution, aging, and makeup. To enhance their performance, especially regarding posture variations, an upgrade is necessary. The difficulties in face identification primarily stem from an emphasis on frontal face images and the limitations of existing database images.

Face recognition technology, including its application for identifying individuals wearing masks, is increasingly common. One major challenge is the ability to accurately recognize a person in real time under these conditions. This technology relies on a dataset of pre-labeled images to train algorithms for automatic identification. The thesis explores various strategies employed to enhance face recognition accuracy in masked scenarios.

Literature review

Recent advancements in machine learning technology have significantly improved the detection and recognition of various objects, impacting nearly every aspect of daily life. A majority of projects in the realm of facial detection and recognition concentrate on reconstructing images and verifying identities through human face recognition.

The research employs a sophisticated multi-stage system that integrates deep convolutional networks with PCA for dimensionality reduction and SVM for classification. Y. Sun et al. utilize a deep network to transform faces into a standard frontal view, subsequently training a CNN to recognize each face's identity. Face verification is achieved through PCA applied to the network's output, along with an ensemble of SVM classifiers.

Sun et al. developed a compact network that is cost-effective to compute, utilizing 25 distinct networks, each targeting a specific facial patch. This approach achieved an impressive LFW performance of 99.47 percent, incorporating both regular and flipped images. They employed PCA and a Joint Bayesian model, which function as linear transformations in the embedding space, without requiring prior 2D/3D alignment. Their training methodology combines classification and verification loss to optimize the networks.

G. K. Jakir Hussain and his team created an innovative monitoring device that measures temperature and detects mask usage, leveraging convolutional neural networks for face mask identification in compliance with COVID-19 safety measures. Their system utilizes deep learning technology to continuously monitor individuals and store data on a server, with extensive testing conducted on various classifiers, including Support Vector Machine and Symbolic Classifier. The project employs campus cameras to gather data, aiding the training of the machine learning model's software. The team utilized TensorFlow and Python for transfer learning to enhance mask detection speed. However, the study's limitations include slow operation and the capability to indicate only whether a face is masked or unmasked.

This article discusses three image processing techniques for detecting face masks, utilizing a model that integrates deep learning and classical machine learning methods with OpenCV, TensorFlow, and Keras. The model demonstrates real-time detection capabilities for individuals wearing or not wearing masks and has been tested with both images and live video streams. Continuous optimization ensures high accuracy, making this model a viable example of edge analytics. Additionally, researchers Amrit Kumar Bhadani and Anurag Sinha compare various algorithms to identify the most effective program that balances accuracy with minimal training and detection time.

In recent years, facial recognition-based attendance management systems have been increasingly adopted to enhance student performance across various organizations. Jomon Joseph and K. P. Zacharia proposed a Matlab-based system utilizing image processing, PCA, and Eigenfaces, though it requires front-facing photos, highlighting the need for orientation-compatible solutions. Ajinkya Patil and his team introduced an attendance marking system using the Viola-Jones algorithm, employing Haar cascades for face detection and the Eigenface method for recognition. Additionally, a system leveraging artificial neural networks was developed, which effectively utilizes PCA for facial image extraction and operates well across different orientations. MuthuKalyani K and VeeraMuthu also presented a 3D facial recognition solution aimed at improving attendance management.

A. VeeraMuthu [10] recommended tracking attendance along with each student's monthly progress. There is a need for a different algorithm that can improve face identification on oriented faces. The Efficient Attendance Management system was built with the assistance of the PCA algorithm [11] and obtained an accuracy of up to 83 percent; however, its performance diminishes under small changes in lighting conditions. The authors developed an eigenface technique combined with a PCA algorithm for a face recognition attendance system, and they also compare several face recognition algorithms in their article. Overall, it was a solid strategy for keeping track of attendance.

The proposed solution aims to automate attendance tracking for organizations, addressing the limitations of current manual methods. This system records attendance for each subject by allowing administrators to manually input student and subject data. When a class begins, the system automatically captures images to detect human faces. For facial recognition, we employ the Haar Cascade Classifier alongside the "face recognition" module, utilizing deep learning algorithms to analyze and compare 128-dimensional facial features. Once faces are identified against the existing database, the system calculates attendance in real time for the recognized pupils with the relevant name ID, and it automatically produces and stores an Excel sheet.

Project Objectives and Scope

This thesis aims to develop a system utilizing OpenCV and machine learning for automated class attendance tracking through facial recognition, with the capability to export attendance data to an Excel file. The implementation of a real-time facial recognition system via OpenCV is designed to enhance efficiency, reduce errors in name spelling, and eliminate fraudulent attendance practices.

− Identifying faces in real time; all information of the users entered will appear in the system after face recognition

− Detecting faces presented from a photo or image (fake faces)

− Display the name and status of the users on the screen

− The administrator can add new users for attendance

− The administrator can print attendance reports

− Compute the total attendance and time engaged based on detected faces

− The GUI will inform users during attendance, with clear instructions on how to position their faces

− The system can detect the face from a live-stream video.

Methodology

This project implements an automated attendance system that utilizes face recognition and a medical mask detector for enhanced security. The system employs the K-nearest neighbor (KNN) classification algorithm to identify key facial features, such as eyes, eyebrows, hairline, and overall face shape. By leveraging neighborhood classification, the KNN method effectively predicts instance values. The study accommodates a substantial number of participants, allowing the admin to add new users to the database seamlessly.

The Face Mask Detection System utilizes a dataset of 50 images, comprising 25 of individuals without masks and 25 with masks, to train its algorithms. This system identifies individuals in front of the camera, issuing warnings if a mask is not worn or is worn improperly. Key facial features are detected and stored for future recognition. The implementation includes a tracking-image option to recognize and log attendance in a spreadsheet, capturing the date and time for each individual. Utilizing advanced techniques in image analysis, computer vision, and deep learning, the system processes both masked and unmasked images through segmentation, feature extraction, and classification. The project leverages OpenCV, Keras/TensorFlow, and principles of deep learning to enhance the accuracy of mask detection in static images.

The training dataset for the Face Mask (FM) and Liveness (LN) models consists of over 32,000 images featuring individuals with and without masks, as well as more than 67,000 images of both real and fake faces. This dataset is organized into small folders to facilitate effective training of the models, which are designed to analyze real-time video streams.

Thesis Outline

The capstone project report is arranged into 5 chapters:

− Chapter 1: Overview: Providing a comprehensive introduction to the topic, outlining the methodology employed, and presenting relevant facts that connect the subject to real-world contexts, along with a brief overview of the report's content and the significance of the thesis objective

− Chapter 2: Theoretical Framework: Presenting a general introduction to the theoretical framework about deep learning, convolutional neural network, face recognition, classifier, library, and implementation

− Chapter 3: System Design: Presenting the design of an attendance system that integrates face recognition and face mask detection, including a block diagram of the overall architecture, a flowchart of the process flow, a detailed explanation of each component, and the design of a user-friendly graphical user interface (GUI)

− Chapter 4: Result and Discussion: Presenting the construction results of the system model

− Chapter 5: Conclusion and Future Scope: Drawing conclusions, assessing strengths and weaknesses, and presenting plans for future work on the topic

THEORETICAL FRAMEWORK

Deep Learning

Deep learning (DL) is a specialized area within machine learning that employs advanced techniques and algorithms to enable computers to identify intricate patterns in extensive data sets. The rise of DL gained momentum around 2012 with the advent of artificial deep convolutional neural networks (DCNNs), notably exemplified by AlexNet, which surpassed existing models on key benchmarks. DCNNs are effective solutions for various challenges across fields such as computer vision, natural language processing, and robotics.

Deep learning is a powerful technique that enhances outcomes and reduces processing times across various computer tasks. It has been effectively applied in natural language processing for tasks such as generating image captions and handwriting recognition. Its software applications fall into categories such as medical imaging, biometrics, and digital image processing.

In supervised learning, input variables X are mapped to output variables Y using an algorithm to train the mapping function f, so that Y = f(X).

The primary objective of a learning algorithm is to approximate the mapping function to predict the output (Y) for new inputs (X). By utilizing the prediction error from training, the output can be refined. Learning can be concluded once all inputs have been successfully trained to yield the desired output. Regression techniques address regression problems, while Support Vector Machines are employed for data classification. Additionally, Random Forest can effectively handle both classification and regression challenges.

Unsupervised learning relies solely on input data without corresponding output, aiming to understand data dispersion through modeling. This approach allows algorithms to uncover intriguing structures within the data. It is commonly applied to clustering and association problems, with techniques such as the K-means algorithm for clustering and the Apriori algorithm for addressing association issues.

Reinforcement learning utilizes a reward and punishment system to train algorithms, where the agent gathers information from its environment. The agent receives rewards for successful actions and faces penalties for underperformance.

In reinforcement learning, an agent, such as a self-driving vehicle, is rewarded for safely reaching its destination while penalized for deviating from the path. Similarly, in chess software, winning the game serves as a reward, while getting checkmated represents a punishment. The agent's objective is to maximize rewards and minimize penalties, and the algorithm autonomously determines how to learn and improve its performance.

Hybrid learning architectures integrate both generative (unsupervised) and discriminative (supervised) elements, allowing hybrid deep neural networks to be created by combining various designs. These networks have been applied to human action recognition using action bank features and are expected to deliver significantly improved performance.

A deep learning framework facilitates rapid modeling without the need to build a network from scratch. Each framework is distinct and tailored for specific objectives. Table 2.1 provides a summary of the deep learning frameworks discussed.

• TensorFlow: Developed by Google Brain, it supports languages including Python, C++, and R, and allows our deep learning models to run on both CPUs and GPUs.

• Keras: A powerful Python API designed for rapid experimentation in deep learning. It effectively supports both Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), leveraging the computational capabilities of CPUs and GPUs for enhanced performance.

• PyTorch: A Python-based library designed for creating deep neural networks and performing tensor calculations. It enables users to build computational graphs efficiently, making it a powerful tool for machine learning and artificial intelligence applications.

• Caffe: Developed by Yangqing Jia, Caffe is a free and open-source deep learning framework known for its exceptional processing speed and visual learning capabilities. It offers the Caffe Model Zoo, which provides easy access to pre-trained models, enabling users to address various challenges efficiently.

• Deeplearning4j: A highly efficient deep learning framework developed in Java, utilizing the ND4J tensor library for handling multi-dimensional arrays and tensors. It supports both CPUs and GPUs, and it can process different data types, including images, CSV files, and text, catering to diverse machine learning needs.

Table 2.1 Comparison of deep learning frameworks

Convolutional Neural Network (CNN)

2.2.1 Overview of convolutional neural network CNN

Convolutional Neural Networks (CNNs) are specialized artificial neural networks designed for image recognition. They excel in identifying two-dimensional visuals and are robust against various transformations such as translation, scaling, and rotation. This resilience stems from their multi-layered architecture, where the initial layer focuses on pixel-level details, while subsequent layers extract increasingly complex features, including relational and structured types, closely aligned with the characteristics of the original object.

2.2.2 The structure of the CNN

Convolutional Neural Networks (CNNs) are a specialized type of deep learning algorithm designed to process data with a grid-like structure, effectively interpreting spatially or temporally related information. Unlike traditional neural networks, CNNs incorporate multiple convolutional layers, which enhance their complexity and capability. These convolutional layers are the essential components of CNNs, allowing for advanced data analysis and feature extraction.

Figure 2.1 The architecture of Convolutional Neural Network [18]

A CNN architecture is formed by combining various layers. The most common designs consist of multiple stacked convolutional layers, each followed by a pooling layer, with this pattern repeated throughout the network; fully connected layers are then incorporated to complete the architecture. Notable examples of CNN designs include LeNet-5, AlexNet, and VGGNet.

To create convolutional layers, filters (kernels) are applied to the input image, resulting in a feature map that represents the image with the applied filters. By combining multiple convolutional layers, more advanced models can be developed to extract intricate features from images. Pooling layers also play a crucial role in this process.

Pooling layers in deep learning play a crucial role in reducing the spatial dimensions of input data, which not only accelerates training but also conserves memory and decreases the number of parameters. The two main types of pooling are max pooling, which selects the maximum value from each feature-map region, and average pooling, which computes the average value. Typically, pooling layers are employed after convolutional layers to reduce the input size before it is processed by fully connected layers.

A fully connected layer is a crucial component of convolutional neural networks (CNNs), where each neuron is connected to every neuron in the preceding layer. Typically found toward the end of a CNN, fully connected layers leverage features identified by earlier layers to make predictions. For example, the final fully connected layer can classify an image by determining whether it contains an animal, such as a dog, cat, or bird.

The activation function in neural networks simulates the rate of a pulse traveling along a neuron's axon. Common activation functions include Sigmoid, Tanh, ReLU, Leaky ReLU, and Maxout, as illustrated in Figure 2.2. Among these, ReLU is particularly popular due to its significant advantages in training neural networks, such as fast computation.

Figure 2.2 List of the activation functions that are used the most frequently [19].

Facial Recognition

Facial recognition, a key area of computer vision, identifies and detects faces in images and videos. Various commercial applications utilize this technology; for instance, Facebook automatically tags users in photos using facial recognition. Mastercard is exploring facial recognition for payments through a system known as Selfie Pay, while schools have implemented it for automatic attendance tracking. The process consists of two main components: detection, which locates faces within images, and recognition, which identifies those faces as specific individuals.

Every face, even among identical twins, is as unique as a fingerprint, indicating that facial recognition systems should match the accuracy of fingerprint scanners. Striking a balance between the speed and precision of facial recognition technology remains a significant challenge that necessitates further research. It is essential for these systems to be both accurate and fast enough to ensure user convenience.

The face recognition method Sparse Representation Classification is designed to run on systems with minimal processing capabilities, such as a smartphone or a Raspberry Pi.

Securing a facility can be achieved through various methods, including RFID cards and biometric authentication systems. While traditional biometric systems like fingerprint and retina scanners require active human involvement and can be time-consuming, facial recognition technology offers a discreet and user-friendly alternative for efficient access control.

Facial technology systems can vary, but in general, they tend to operate as follows:

Face detection technology enables cameras to identify and locate faces, whether they appear alone or within a crowd. This capability allows for the recognition of subjects looking directly at the camera or in profile.

Facial recognition technology primarily relies on 2-D images for analysis, as these are easier to compare with publicly available photos or databases. The process involves photographing and examining the face to capture its unique geometry, including key measurements such as the distance between the eyes, the depth of the eye sockets, and the proportions of the forehead and chin. Additionally, the shapes of the cheekbones, lips, ears, and chin are analyzed to identify the distinctive features that make each face unique.

The face capture technique transforms facial traits into digital data by converting analog information into a mathematical formula. This process results in a unique numerical code known as a faceprint, which is distinct for each individual, much like a thumbprint.

Facial recognition technology is considered the most intuitive form of biometric assessment, as we primarily identify ourselves and others through facial features rather than thumbprints or irises. It is estimated that this technology engages with over half of the global population, highlighting its widespread impact and relevance in today's digital landscape.

Classifiers

2.4.1 K-Nearest Neighbors (KNN)

a. Introduction to KNN

K-Nearest Neighbors (KNN) is a supervised machine learning algorithm utilized for both classification and regression tasks. Renowned for its simplicity, KNN is a non-parametric method that makes no assumptions about the underlying data distribution. Often referred to as a lazy algorithm, KNN does not perform learning during the training phase; instead, it stores the data points and conducts learning during the testing phase. As a distance-based algorithm, KNN relies on calculating the proximity between data points to make predictions.

K-Nearest Neighbor is a simple method that maintains all available data and predicts the categorization of unlabeled data using a similarity metric. When two parameters are displayed on a 2-D Cartesian system, we calculate the distance between the points to determine the similarity measure. The same holds true here: the KNN algorithm operates on the notion that comparable objects exist in close proximity; simply expressed, similar things remain close to one another. This algorithm can be used for tasks like classification and regression.

Classification is a predictive modeling technique that assigns a class label to a given input sample; for instance, it determines whether an animal is a cat or a dog, or whether an email is spam. In classification tasks, the predicted outcomes are represented as discrete values, typically 0 or 1, which correspond to true or false. Multi-variate (more than one label) classifications are also possible.

Regression analysis is essential for predicting continuous data, such as estimating the future value of a stock market share. By applying regression techniques, we can make informed forecasts about financial trends and stock performance.

Steps followed in K-Nearest Neighbors:

• Step 1: Load the training and testing datasets

• Step 2: Specify or choose the value of K

• Step 3: For each point in the test data, perform the following:

− Calculate the distance between the point and each point of the training dataset; we can use the Euclidean distance or Manhattan distance

− Sort the values in ascending order based on distances

− Find the top K values from the sorted list

− Find the frequency (mode) of the labels of the top K values

− Assign the mode to the test data point

− The assigned value is the classified or predicted value for the particular test data point

To understand the algorithm mathematically, see Figure 2.3.

Figure 2.3 KNN algorithm is shown visually [21]
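As a minimal sketch of the steps above (illustrative names, Euclidean distance, and an unweighted majority vote; not the thesis code), the following classifies a test point against a toy training set:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Classify one test point by a majority vote of its k nearest neighbors."""
    # Euclidean distance between the test point and every training point
    distances = np.linalg.norm(X_train - x_test, axis=1)
    # Sort ascending and keep the indices of the k closest points
    nearest = np.argsort(distances)[:k]
    # The mode of the top-k labels is the predicted class
    labels = [y_train[i] for i in nearest]
    return Counter(labels).most_common(1)[0][0]

# Toy usage: two clusters on a 2-D plane
X_train = np.array([[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]])
y_train = ["cat", "cat", "dog", "dog"]
print(knn_predict(X_train, y_train, np.array([2.0, 1.0])))  # -> cat
```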

Advantages of the KNN algorithm:

− It is very easy to understand and implement

− It is an instance-based learning (lazy learning) algorithm

− KNN does not learn during the training phase, hence new data points can be added without affecting the performance of the algorithm

− It is well suited for small datasets

Disadvantages of the KNN algorithm:

− It fails when variables have different scales

− It is difficult to choose K-value

− It leads to ambiguous interpretations

− It is sensitive to outliers and missing values

− Does not work well with a large dataset

− It does not work well with high dimensions

2.4.2 Haar Cascade

a. Introduction to Haar Cascade

The Haar cascade technique enables the detection of various objects in photographs, regardless of their size or position. This real-time algorithm is relatively straightforward and can be trained to recognize a wide range of items, such as cars, bicycles, buildings, and fruits. Utilizing a sliding-window approach, the Haar cascade assesses features in each window to identify potential objects effectively.

Figure 2.4 Sample of Haar features

The algorithm can be explained in four stages:

Calculating Haar features involves computations on adjacent rectangular sections within a detection window. The pixel intensities in each region are first summed, and the differences between these sums are then taken. Illustrations of Haar features are shown in Figure 2.5.

Figure 2.5 Types of Haar features

Integral images significantly accelerate the computation of Haar features by using sub-rectangles and array references instead of computing each pixel individually. Most Haar features are individually weak for object recognition, so selecting the most relevant features from the vast array of options is essential. In this context, AdaBoost proves to be an effective method for identifying the Haar features that best represent an object.

Figure 2.6 Illustration of how an integral image works [22]
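A small NumPy sketch of the summed-area-table idea (illustrative, not the thesis code): the integral image is built with two cumulative sums, after which any rectangle sum needs only four array references:

```python
import numpy as np

def integral_image(img):
    """Cumulative sums along both axes give the summed-area table."""
    return img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, r1, c1, r2, c2):
    """Sum of pixels in the inclusive rectangle (r1, c1)..(r2, c2),
    using four array references instead of per-pixel addition."""
    total = ii[r2, c2]
    if r1 > 0:
        total -= ii[r1 - 1, c2]
    if c1 > 0:
        total -= ii[r2, c1 - 1]
    if r1 > 0 and c1 > 0:
        total += ii[r1 - 1, c1 - 1]
    return total

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
assert region_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()
```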

AdaBoost is a powerful machine learning technique that enhances classification by selecting the most important features and training classifiers to leverage them effectively. This method excels in object recognition by creating a "strong classifier" through the combination of multiple "weak classifiers," thereby improving accuracy and performance.

Cascading classifiers utilize a multi-stage approach, where each stage comprises a set of weak learners. By employing boosting techniques to train these weak learners, the cascade classifier achieves high accuracy through the aggregation of their individual predictions.

Libraries and Implementations

TensorFlow is a free and open-source machine learning library developed by Google, designed for efficient numerical computations and primarily focused on training and inference of deep neural networks. While it excels in deep learning applications, TensorFlow also supports traditional machine learning tasks. Users can leverage TensorFlow directly or through wrapper libraries that simplify the development process, making it accessible for a wide range of machine learning projects.

TensorFlow accepts inputs as multidimensional arrays known as Tensors, enabling the creation of dataflow graphs that define the movement of data through these structures. This framework allows users to design flowcharts that outline the operations performed on the inputs, ultimately producing the output at the end of the process.

For example, TensorFlow will perform the multiplication of X_1 and X_2, as illustrated in Figure 2.8, by creating a node for the operation, referred to as "multiply." Once the graph is constructed, TensorFlow's computational engines execute the multiplication of X_1 and X_2.
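A minimal TensorFlow 2 rendering of this dataflow idea (a sketch; variable names mirror the figure, and tf.function is used to trace the graph):

```python
import tensorflow as tf

x1 = tf.constant(5.0, name="X_1")
x2 = tf.constant(6.0, name="X_2")

@tf.function  # traces the Python function into a dataflow graph
def multiply(a, b):
    # A single "multiply" node consuming both inputs
    return tf.multiply(a, b, name="multiply")

print(multiply(x1, x2).numpy())  # 30.0
```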

Figure 2.7 TensorFlow example

c. TensorFlow Architecture

The architecture of TensorFlow is divided into three sections:

− Preprocessing the data for analysis

− Building the model

− Training and estimating the model

Keras is an open-source deep learning library built on top of powerful frameworks like TensorFlow, Theano, and the Cognitive Toolkit (CNTK). It enables rapid numerical computations through these backends and offers essential tools for designing and deploying deep learning models efficiently. Keras leverages TensorFlow's scalability and cross-platform capabilities, with its core data structures being layers and models. Additionally, it facilitates model compilation and the transformation of class vectors into binary class matrices during data processing.

Keras makes the high-level neural network API simpler and more performant by leveraging multiple optimization approaches. It has the following capabilities:

− Consistent, simple, and extensible API

− Minimal structure - easy to achieve the result without any frills

− It supports multiple platforms and backends

− It is a user-friendly framework that runs on both CPU and GPU

− Highly scalable computation

b. How Keras works

Keras is a highly powerful and dynamic framework that offers the following benefits:

− Keras neural networks are written in Python which makes things simpler

− Keras supports both convolution and recurrent networks

− Deep learning models are discrete components so you can combine them in many ways
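As a hedged sketch of these capabilities (toy data and layer sizes are illustrative), a small CNN can be defined, compiled, and trained in a few lines; to_categorical performs the class-vector-to-binary-matrix transformation mentioned above:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.utils import to_categorical

# Toy data: 100 random 32x32 grayscale images across 3 classes
x = np.random.rand(100, 32, 32, 1).astype("float32")
y = to_categorical(np.random.randint(0, 3, size=100), num_classes=3)

model = keras.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=1, batch_size=16, verbose=0)
```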

OpenCV (Open Source Computer Vision Library) is a versatile, free, and open-source software library designed for computer vision and machine learning applications. Developed in C and C++, it supports multiple operating systems, including Linux, Windows, and Mac OS X. The library also offers actively developed interfaces for Python, Ruby, Matlab, and other programming languages. OpenCV enables various functionalities such as face detection and recognition, object identification, motion tracking in videos, eye gesture recognition, red-eye removal from flash photos, image database comparison, landscape perception, and augmented reality marker setup.

SYSTEM DESIGN

Block Diagram

An effective attendance system utilizing face recognition and face mask detection necessitates a structured execution approach, akin to other software-integrated projects. This method focuses on a sequential development process, ensuring each step is completed before progressing to the next, ultimately leading to the final prototyping phase. The system comprises five essential components: image acquisition and preprocessing, face liveness detection, face mask detection, facial recognition, and time attendance management, as illustrated in the accompanying block diagram.

Figure 3.1 Block diagram of attendance system using face recognition and face mask detection.

Detailed System Design

3.2.1 Image Acquisition for Liveness and Face Mask Detection

This project aims to develop an automated data collection program that utilizes the laptop webcam to capture images of users. To create the "real face" dataset, users simply need to sit in front of the camera, align their faces within a designated frame, and follow the on-screen instructions. The system operates under three specific scenarios.

The data collection process involves capturing images of a subject looking straight, left, and right, with a total of 500 photos collected once the display reaches 100%; this number is adjustable based on project needs. For the "fake/spoofing face" dataset, a short video of the subject's face is recorded and replayed in front of the laptop camera to gather spoofed face data. The dataset for face mask detection includes two categories, "with mask" and "without mask," and similar methods are applied for liveness detection. Visual representations of these datasets are shown in Figure 3.2 (a: with mask, b: without mask) and Figure 3.3 (a: real face, b: fake face).

Figure 3.2 Face mask detection dataset: (a) with mask, (b) without mask

Figure 3.3 Liveness detection dataset: (a) real face, (b) fake face

In this use case, we aim to detect individual faces using the haarcascade_frontalface_default.xml classifier. Because the input image has large dimensions, it is necessary to scale it down for improved output results.
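A minimal OpenCV sketch of this step (the input path and scale factor are illustrative; the cascade file is the one named above):

```python
import cv2

# Load OpenCV's bundled frontal-face Haar cascade
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("input.jpg")                  # hypothetical input image
small = cv2.resize(img, None, fx=0.5, fy=0.5)  # scale down large frames
gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)

# Detect faces in the downscaled grayscale frame
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(small, (x, y), (x + w, y + h), (0, 255, 0), 2)
```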

Figure 3.4 Flowchart of image acquisition for liveness and face mask detection

3.2.2 Training Model for Face Mask and Liveness Detection

a. Training Model for Face Mask Detection

Our system utilizes TensorFlow and Keras algorithms to detect face mask usage. Initially, we train the system using the Kaggle dataset, and then load the face mask classifier from disk to recognize faces in real-time video streams. Additionally, MobileNet is employed to train on a vast array of images, ensuring high-quality classification. The algorithm's detailed steps are outlined below.

In our project, Face Mask Detection is implemented in two main phases. The first phase involves training the Face Mask Detector on a loaded dataset, where classification is performed with Keras and TensorFlow, followed by saving the classifier to disk. In the second phase, the trained classifier is loaded and used to identify the region of interest for each face in images or live video streams. The results are displayed after applying the classifier to determine whether individuals are wearing masks, highlighting the accuracy achieved in detecting the correct outcomes.

To develop the network, it is crucial to gather the essential datasets and elements from the various categories. Once the dataset is created, the next step involves preparing and testing it to evaluate the network's performance. The neural network should be trained to identify the different categories based on the provided labels. Additionally, the dataset must be analyzed and compared against the ground-truth labeling for accuracy.

Figure 3.5 Block diagram of face mask detection [25]

To develop a custom face mask detector, we divide the project into two main phases. The first phase involves loading the face mask detection dataset, training a model using Keras and TensorFlow, and serializing the trained model for future use. In the second phase, we deploy the face mask detector, enabling us to perform face detection and classify each face as either wearing a mask or not.

The face mask detection system thus consists of two key phases: training the Face Mask Detector and applying it. This section concentrates on the first phase, training the detector, while the application phase is discussed in section 3.3.3, accompanied by a comprehensive flowchart.

• Step 1: Prepare our training data for data augmentation by loading and preprocessing it, which includes:

− Obtaining the list of images in our dataset directory, initializing the data, and then classifying the images

− Converting the training data to NumPy array format

− Encoding the labels (one-hot encoding)

− Using the scikit-learn package, divide the dataset into training (80%) and testing (20%)

− Load MobileNet with pre-trained ImageNet weights, leaving the network head off

− Create a new Fully Connected (FC) head and attach it to the base instead of the previous one

− Freeze the network's base layers. The weights of these base layers will not be modified during the backpropagation process, whereas the weights of the head layers will be tuned

• Step 2: Compile and train a network for detecting face masks:

− Model compilation using Adam Optimizer, a learning rate decay schedule, and binary cross-entropy

− Train the network head using the model's fit method with its required parameters. Afterward, make predictions on the test set to identify the class-label indices with the highest probability, allowing an evaluation of the model's performance. Finally, generate and print a classification report for detailed inspection of the results

• Step 3: Serialize our face mask classification model to disk, and plot the accuracy and loss curves.
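A hedged Keras sketch of this training pipeline (MobileNetV2 variant, 224 x 224 inputs, and illustrative head sizes and hyperparameters; not the authors' exact code):

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import (AveragePooling2D, Dense, Dropout,
                                     Flatten, Input)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Base network: pre-trained ImageNet weights, head left off
base = MobileNetV2(weights="imagenet", include_top=False,
                   input_tensor=Input(shape=(224, 224, 3)))

# New fully connected head attached in place of the original one
head = AveragePooling2D(pool_size=(7, 7))(base.output)
head = Flatten()(head)
head = Dense(128, activation="relu")(head)
head = Dropout(0.5)(head)
head = Dense(2, activation="softmax")(head)  # mask / no-mask
model = Model(inputs=base.input, outputs=head)

# Freeze the base layers: only the head is updated during backprop
for layer in base.layers:
    layer.trainable = False

# Adam optimizer with binary cross-entropy, as described above
# (the described pipeline also applies a learning-rate decay schedule)
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(trainX, trainY, validation_data=(testX, testY), epochs=20)
```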

b. Training Model for Liveness Detection

Our team chose an anti-spoofing system as a core component of the project, as it plays a crucial role in detecting face spoofing. The block diagram of the face anti-spoofing system is illustrated in Figure 3.6. Each phase in this process is essential, interconnected, and should not be overlooked. The subsequent sections discuss the individual blocks of the diagram in detail.

Figure 3.6 Block diagram of face liveness detection.

The face anti-spoofing system consists of two essential procedures, training and identification, both vital for its effectiveness. A thorough training session significantly improves the system's performance, while the identification phase assesses its operational integrity. This testing process identifies vulnerabilities and issues, enabling us to enhance the system's overall reliability and efficiency.

In our project, we developed a CNN classifier network tailored for distinguishing between genuine and fake faces, utilizing a structure inspired by VGGNet. The network is intentionally shallow, featuring a limited number of filters, which allows for rapid and accurate processing, ideal for our classification needs. Despite its simplicity, our CNN retains the quality associated with VGGNet, as we found that a deep architecture is unnecessary for effective discrimination between real and counterfeit faces. To enhance performance, we incorporated batch normalization (BN) and dropout layers; the fully connected layers are activated with ReLU, and the output layer employs a softmax classifier.

− Part 1: Convolution_1 => Activation_1 (ReLU) => Convolution_2 => Activation_2 (ReLU) => Pool Layer_1

− Part 2: Convolution_3 => Activation_3 (ReLU) => Convolution_4 => Activation_4 (ReLU) => Pool Layer_2

− Finally: fully connected layers paired with activations

At its core, LivenessNet is just a simple convolutional neural network. The layers added to our CNN are shown in Figure 3.7 and explained in detail below:

− Convolution_1: uses a 3 x 3 filter (kernel), with 16 kernels in total, and "same" padding with a stride of 1, so the output dimensions of the layer match the input dimensions; zero-pixels are added to preserve edge pixels. The filter systematically traverses the input images, applying convolution to extract essential features from the images

− Activation_1 (ReLU): the ReLU activation function is used in this layer due to its superior processing speed compared to the Sigmoid and Tanh functions. Unlike those alternatives, ReLU does not involve calculations with base e, allowing faster performance; it also converts negative values to zero while retaining positive values after convolution, enhancing the model's efficiency

− Convolution_2: similar to Convolution_1 layer but without input_shape

− Activation_2 (ReLu): similar to Activation_1

− Pool Layer_1: employs a 2 x 2 kernel to halve the image size, effectively emphasizing key features while minimizing the parameters the CNN must learn

− Convolution_3: similar to Convolution_1 and 2 layers but the number of filters increases to 32

− Activation_3 (ReLu): similar to Activation_1 and 2

− Convolution_4: similar to Convolution_3

− Activation_4 (ReLU): similar to Activation_3

− Flatten layer: in this layer, we stretch the feature maps into a single dimension, the product of 1 x 1 x 128
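A hedged Keras sketch of the architecture listed above (the input size and dropout rates are assumptions; batch normalization and dropout are placed as described):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, Activation, BatchNormalization,
                                     MaxPooling2D, Dropout, Flatten, Dense)

model = Sequential([
    # Part 1: Conv_1 -> ReLU -> Conv_2 -> ReLU -> Pool_1
    Conv2D(16, (3, 3), padding="same", input_shape=(32, 32, 3)),
    Activation("relu"),
    BatchNormalization(),
    Conv2D(16, (3, 3), padding="same"),
    Activation("relu"),
    BatchNormalization(),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),

    # Part 2: Conv_3 -> ReLU -> Conv_4 -> ReLU -> Pool_2 (32 filters)
    Conv2D(32, (3, 3), padding="same"),
    Activation("relu"),
    BatchNormalization(),
    Conv2D(32, (3, 3), padding="same"),
    Activation("relu"),
    BatchNormalization(),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),

    # Finally: flatten, fully connected with ReLU, softmax over real/fake
    Flatten(),
    Dense(64),
    Activation("relu"),
    BatchNormalization(),
    Dropout(0.5),
    Dense(2),
    Activation("softmax"),
])
```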

Attendance System with Graphical User Interface (GUI)

A graphical user interface (GUI) is developed using the Tkinter library. The proposed attendance system using face recognition and face mask detection has the following capabilities, as shown in Figure 3.9:

• “Add a User” to enter the names and face images of the person into the database

• “Time Attendance” to compare face encoding in the image captured by the camera with the encodings available in the database and generate the attendance list

Figure 3.9 Main menu GUI window
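A rough Tkinter sketch of such a two-button main menu (widget names and callbacks are illustrative, not the authors' code):

```python
import tkinter as tk

def add_user():
    pass  # open the enrollment window and collect the user's face images

def time_attendance():
    pass  # start the webcam, run recognition, and log attendance

root = tk.Tk()
root.title("Attendance System")
tk.Button(root, text="Add a User", width=25,
          command=add_user).pack(pady=10)
tk.Button(root, text="Time Attendance", width=25,
          command=time_attendance).pack(pady=10)
root.mainloop()
```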

When the "Add a User" button is clicked, a pop-up window appears for the admin to input the new user's information After entering the details, the admin should click "next" to capture a dataset of the new user, during which the system collects 50 images for training purposes Once the training model process is finished, the admin can proceed to the time attendance feature.

When the "Time Attendance" feature is selected, the application utilizes the webcam to capture an image of the classroom Face recognition technology is then applied to the captured image, generating a list of identified students This attendance list can be conveniently provided to teachers in Excel format.

Figure 3.10 illustrates the flowchart of the GUI operations for the attendance system, which encompasses four primary functions. The following sections explain each function in detail, with individual flowcharts for clarity.

• Function 1: Return to the main menu

• Function 2: Create a dataset for face recognition

• Function 3: Training face recognition with the KNN algorithm

• Function 4: Face mask recognition and time attendance

Figure 3.10 Flowchart for the Graphical User Interface

3.3.1 Image Acquisition for Face Recognition

Collecting classification images traditionally involves manual photo editing to crop and resize photos, a process that is both time-consuming and labor-intensive. To streamline this task, we developed an automated application designed to gather 50 face images per individual, capturing various expressions both with and without masks. This application efficiently detects appropriate expressions, corrects any tilts, and saves the images automatically. Our team focused on creating an intuitive graphical user interface (GUI) for this face data collection program, which operates using a laptop webcam. Figure 3.11 illustrates the facial recognition dataset.

The application starts by requesting a name to be entered, which is stored with the captured images. The face detection system then captures 25 images of the person's face without a mask, ensuring optimal brightness levels before proceeding. Following this, the user is required to wear a mask and perform basic rotations to the left and right, as displayed in the application window. The program loops until it successfully collects 50 viable images, enhancing the efficiency of data collection for face recognition. A flowchart illustrating the image acquisition process is provided in Figure 3.12.
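A minimal capture-loop sketch under stated assumptions (hypothetical output folder and file naming; a Haar cascade face check before saving; the 50-image target described above):

```python
import os
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)                          # laptop webcam
out_dir, count, target = "dataset/user01", 0, 50   # hypothetical layout
os.makedirs(out_dir, exist_ok=True)

while count < target:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, 1.1, 5)
    for (x, y, w, h) in faces:
        # Keep only frames in which a face was actually found
        cv2.imwrite(f"{out_dir}/{count:03d}.jpg", frame[y:y+h, x:x+w])
        count += 1
        break  # save at most one image per frame
cap.release()
```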

Figure 3.11 Facial recognition dataset (Duong and Hung)

Figure 3.12 Flowchart of image acquisition for face recognition.

3.3.2 Facial Recognition System

a. Face detection with the "face recognition" library

This project utilizes a face recognition library to identify and verify faces through a four-step process: detecting the face, obtaining 68 facial landmarks, acquiring 128 measurements, and implementing a machine learning algorithm. The described face recognition methods are illustrated in Figure 3.13.

Figure 3.13 The whole process of face recognition

• Step 1: Detect the face

− First of all, we generate face patterns based on the HOG algorithm

− We then find the part of the simplified image that looks most similar to a known HOG face pattern

− Finally, a bounding box is drawn around the detected face

• Step 2: Obtain 68 points and an adjusted face

− The face landmark estimation algorithm will be used to figure out 68 specific points that exist on every face

− OpenCV's affine transformation employs fundamental image transformations, including rotation, scaling, and shearing, to ensure that key facial features, such as the eyes and lips, consistently align in the same position across various images

• Step 3: Obtain 128 measurements: the centered face images are passed through a deep convolutional neural network to obtain 128 measurements, which lie on a 128-dimensional unit hypersphere

• Step 4: Implement your preferred machine learning algorithm for tasks such as clustering, similarity detection, and classification; for face recognition, the focus is on classification. Face alignment and feature extraction, discussed below, are crucial supporting tasks that enhance the model's accuracy.

b. Face alignment and feature extraction

The system enhances face recognition accuracy by normalizing facial images using geometric and photometric data. It primarily processes 2-D images to align them with existing public photographs and databases. The algorithm identifies key facial features, including eye-socket depth, cheekbone shape, eye distance, and the dimensions of the forehead, chin, lips, and ears. Additionally, face alignment ensures consistency in patch scales, resolution, brightness, zoom levels, and orientations, serving as a crucial preparatory step for effective face recognition.

Individual face patches are extracted from the normalized photos, allowing the system to convert facial images into data based on unique traits. This process involves identifying and extracting key facial features while filtering out irrelevant information. The facial recognition technology performs tasks such as information packing, noise reduction, dimensionality reduction, and salience extraction. During feature extraction, a unique faceprint is generated for each individual; Figure 3.14 exemplifies the most significant attributes obtained from a given image.

c. Face recognition with KNN classifier

K-Nearest Neighbor (KNN) is a data classification approach that may be used to identify faces. Face recognition with the K-Nearest Neighbor method is divided into two phases: training and testing. Each pixel in the face represents a unique piece of information. Based on per-pixel categorization, this project detected faces; the face was identified by the most common class across the pixel categorizations. Before classification, the pixel matrix of the facial picture is reshaped into a vector. The proposed KNN facial recognition algorithm is described as follows:

− To categorize a sample effectively in KNN, measure the separation between the sample and each training sample using a dissimilarity measure to avoid matching issues. Commonly, either Manhattan distance or Euclidean distance is employed to compare neighboring samples, with their respective equations outlined as 3.1 and 3.2; their standard forms are sketched after this list

− Sort distances from shortest to greatest

− Choose the training samples with the shortest distances. A lower K value can reduce training error by focusing on samples in a smaller neighborhood, but it can lead to higher generalization error, making the model more complex and susceptible to overfitting. Conversely, increasing K reduces generalization error but raises training error, suggesting the model may become oversimplified and its fitting performance may decline

− To assess the probability of occurrence for each category among the K training samples, utilize a weighted algorithm that emphasizes samples closer to the testing sample. Giving greater weight to these nearby samples makes the calculation of sample probability more accurate and relevant

− The category of the testing samples can be assumed to be the one among the K samples with the highest likelihood of occurrence
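The standard forms of these two measures, presumably what Equations 3.1 and 3.2 denote, are:

```latex
% Manhattan distance (Eq. 3.1) and Euclidean distance (Eq. 3.2)
d_{\text{Manhattan}}(x, y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert
\qquad
d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```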

The KNN classifier is trained using a labeled dataset of known faces to identify individuals in unknown images. It determines the K most similar faces based on the Euclidean distance and conducts a weighted majority vote on their labels to make accurate identifications. The modeling process follows a structured sequence of steps to ensure effective implementation.
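A hedged sketch of this pipeline using the face_recognition library together with scikit-learn's KNN (file names are hypothetical; K is set to 1 only because the toy gallery holds two images):

```python
import face_recognition
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labeled gallery: one 128-d encoding per known image
encodings, names = [], []
for path, name in [("duong_01.jpg", "Duong"), ("hung_01.jpg", "Hung")]:
    image = face_recognition.load_image_file(path)
    encodings.append(face_recognition.face_encodings(image)[0])
    names.append(name)

# Distance-weighted vote over the K nearest stored encodings
knn = KNeighborsClassifier(n_neighbors=1, weights="distance",
                           metric="euclidean")
knn.fit(encodings, names)

# Each face found in a new frame is classified by its nearest neighbors
unknown = face_recognition.load_image_file("frame.jpg")
for enc in face_recognition.face_encodings(unknown):
    print(knn.predict([enc])[0])
```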

RESULTS AND DISCUSSION

Result

The proposed method outlines a user-friendly face recognition attendance system developed using the Tkinter GUI. The interface features several buttons with specific functions: the start button activates the camera and automatically recognizes faces, the register button allows the enrollment of new users, and the update button trains the system with the latest registered images. Additionally, the browse and recognize buttons enable users to select images from a database and test the system's identification capabilities. To set up the system, the administrator must first register their user data, including their name. For testing, a training dataset consisting of five individuals, each with 50 images, was created.

Figure 4.1 Face Mask Recognition GUI function

Figure 4.2 Register new user GUI window

The entire program was tested under a varied set of conditions. Some of the tested conditions, as detected by our system, are displayed in the following figures:

In test case 1 (Figure 4.3a), the system detects all authentic facial features, including the mouth, nose, and chin, confirming that the user is not wearing a mask. Consequently, the label displayed reads "Name; Real Face; Please wear your mask."

In test case 2 (Figure 4.3b), the system detects visible fake facial features, including the mouth, nose, and chin, indicating that the user is not wearing a mask and is presenting an invalid face. Consequently, the label "Name; Invalid; Fake Face" is displayed on the screen.

In test case 3 (Figure 4.3c), facial features are obscured by a mask, prompting our system to accurately identify the mask. The display shows the label "Name; Valid; Real Face," confirming the authenticity of the individual while adhering to safety protocols.

In test case 4, facial features are obscured by a mask on a fake face, prompting our system to display the message "Name; Invalid; FakeFace" on the screen to indicate the issue.

The results of testing real-time face mask recognition cover the following scenarios: a real face without a mask, a fake face without a mask, a real face wearing a mask, a fake face wearing a mask, an identification camera positioned at the exit, and instances of incorrect identification. An appropriate automated message is displayed for each outcome.

After pressing the start button, the automated procedure captures the user's face image from the video frame, facilitating face recognition. The identification results for successful check-ins and check-outs are displayed in the command window, as shown in Figure 4.4. For optimal results, it is crucial that the frontal face is clearly visible and that photos are taken in ambient lighting. Additionally, slight variations in the user's posture or facial expression enhance the accuracy of the recorded images. Faces are accurately captured under typical lighting conditions when the user's posture is correctly positioned. To prevent false detection, maintaining optimal illumination is essential; images are disregarded if the database is empty.

The attendance system can recognize images under various lighting and angle conditions, but unknown faces not included in the training dataset cannot be identified. For real-time attendance tracking, users must have identifiable photos: the check-in image at the entrance camera must carry a valid face label, and the check-out image is captured at the exit camera. The automated system then saves this data and imports it into an Excel sheet, generating a file that includes the relevant name, date, and time, while also calculating the total engagement time for attendance.
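As an illustration of the exported sheet (column names, timestamps, and paths are assumptions, not the project's exact schema), pandas can compute the engaged time and write the Excel file:

```python
import pandas as pd

# Hypothetical check-in/check-out log built up during recognition
records = [{"Name": "Duong",
            "Check-in": "2022-06-01 07:58:12",
            "Check-out": "2022-06-01 09:30:05"}]
df = pd.DataFrame(records)
df["Check-in"] = pd.to_datetime(df["Check-in"])
df["Check-out"] = pd.to_datetime(df["Check-out"])
df["Total time"] = df["Check-out"] - df["Check-in"]  # engaged time
# Writing .xlsx requires an Excel engine such as openpyxl
df.to_excel("Attendance_Result/attendance.xlsx", index=False)
```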

Figure 4.4 Identification results are printed in the command window.

Figure 4.5 Attendance information is saved in an Excel sheet.

Managing data for models and tests can be challenging, requiring more than just organization and tracking, as it demands significant time and effort to keep everyone updated and aligned. Effective folder management is crucial, and the project folder is categorized into six main sections: FunctionContribution, GUI, Models, Dataset, Attendance_Result, and temp files. The FunctionContribution folder houses programs essential for executing the project's main functions, such as creating user datasets and training newcomers and detectors. The GUI folder contains elements necessary for developing a user-friendly interface, while the Models folder stores all trained models based on the available datasets. The Dataset folder is designated for user datasets, facilitating streamlined data management and collaboration.

The Attendance_Result folder is designated for storing attendance data, including photos captured during check-in and check-out, along with a CSV file that summarizes this information. Additionally, the temp folder holds temporary data files generated during program execution; this data is also utilized for training the facial recognition system.

Discussion

The study on face identification using the K-Nearest Neighbor (KNN) system is divided into training and testing phases. It utilizes a parameter k, representing the number of neighbors drawn from previously trained individuals. The training data is categorized into k classes based on proximity, with the clustering process iteratively refined to enhance homogeneity within each cluster. Two primary methods for calculating cluster distance are the minimum and maximum distance presentations, with single linkage also employed to define distances between clusters. Feature vectors within a cluster encapsulate encoded face images of the same individual under similar conditions, such as facial expressions and lighting. When a live face is introduced, it is matched against existing clusters to identify the correct group.

The approach, while exact, demands considerable computational resources due to the need to recalculate cluster distances each time a new observation is added. A significant challenge arises from the fact that similarities between facial images of the same person captured from different angles often exceed those of different individuals in identical positions and expressions. This complicates the determination of the number of clusters and their locations, leading to a high likelihood of overlaps in real-world applications. Consequently, the presence of incorrectly clustered elements can hinder the convergence of the partition algorithm.

The overlapping boundaries between clusters can expand over time, leading to the incorrect merging of two distinct clusters. Rectifying this issue requires significant additional effort, making the task quite challenging. As a result, relying solely on minimum or maximum distance methods for clustering training sequence sets proves ineffective.

The training process, known as backpropagation, is essential for adjusting model weights to accurately reflect input data and achieve the correct output class In this study, a CNN model was trained for face mask detection and liveness detection, with training data divided into batches of 1024 for face mask detection and 2048 for liveness detection By segmenting the dataset into smaller batches, the model can be trained more efficiently, allowing for quicker adjustments to gradient precision Additionally, a learning rate of 0.0001 was utilized to determine the step size for minimizing the loss function.

Both models are trained in a similar manner: the optimization algorithm, forward pass, loss function, backward pass, and weight updates are applied to the labeled data.

In this study, the Adam optimizer was used to train the CNN models; it is an adaptive learning rate optimization technique designed for deep neural networks. Adam computes an individual learning rate for each model parameter, and by tracking the first and second moments of the gradient it adapts the step size for every network weight, improving the training process.
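In the standard formulation (the notation below is the usual one from the Adam literature, not taken from the project), the moment estimates of the gradient $g_t$ and the resulting weight update are:

$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2$$

$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \frac{\alpha\,\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}$$

where $\alpha$ is the learning rate (0.0001 here) and $\beta_1$, $\beta_2$, $\epsilon$ take their usual default values.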

The Mean Square Error (MSE) loss function was chosen for this strategy. MSE measures the mean of the squared differences between the actual target values and the model's predictions. The function is given in Equation 4.1, where n is the number of data points, Y_true,i is the actual value for data point i, and Y_predicted,i is the model's output.
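With these definitions, Equation 4.1 takes the standard MSE form:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{\mathrm{true},i} - Y_{\mathrm{predicted},i}\right)^2 \tag{4.1}$$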

An epoch is one complete pass of the model over the entire training dataset. Training a neural network for multiple epochs improves its ability to predict unseen data, such as the test set, leading to better results.

The face mask detection model was trained for 100 epochs to reach optimal accuracy on the test data. The relationship between the number of epochs and the resulting loss is illustrated in Figure 4.7.

Figure 4.7 Result of training model of Face Mask Detection

Figure 4.8 Result of training model of Liveness Detection.
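The training configuration described above corresponds, as a hedged sketch, to a Keras setup along the following lines; the model constructor and the train/validation arrays are assumed to exist:

```python
# Sketch of the training setup: Adam at a 1e-4 learning rate, MSE loss,
# batches of 1024 (2048 for liveness detection), and 100 epochs.
import tensorflow as tf

model = build_mask_model()   # hypothetical CNN constructor, not from the project
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="mse",
    metrics=["accuracy"],
)
history = model.fit(
    x_train, y_train,                    # assumed training arrays
    validation_data=(x_val, y_val),
    batch_size=1024,
    epochs=100,
)
```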

The liveness model underfits during the first five epochs: the training loss is still high, which shows that the model cannot yet represent the training data and produces large errors. This also indicates that further training is needed to bring the training loss down.

During training of both the face mask and liveness models, the validation loss initially decreases and then begins to rise, within the first 13 epochs for the face mask model and after the 50th epoch for the liveness model. This pattern indicates overfitting: the models do well on the training data but struggle on unseen validation data. Likely causes include excessive model complexity or training for too long. Early stopping, which halts training while the loss is still low and stable, is an effective way to improve generalization.

Early stopping is a regularization technique that helps prevent overfitting to the training dataset: training ends if the validation loss stops decreasing for a set number of consecutive epochs.

The training process monitors the validation loss and writes a file containing the best weights. The patience parameter sets how many epochs to wait without improvement before stopping. Since the validation loss stopped decreasing after the 80th epoch for the face mask detection model and after the 50th epoch for the liveness detection model, early stopping ends training at exactly 80 and 50 epochs respectively. A sketch of this setup follows.
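The sketch uses Keras callbacks; the patience value and checkpoint path are assumptions, since the text only states when the validation loss stopped improving:

```python
# Early stopping monitors the validation loss and saves the best weights to disk.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor="val_loss", patience=10,     # patience value is assumed
                  restore_best_weights=True),
    ModelCheckpoint("Models/mask_detector.h5",         # illustrative path
                    monitor="val_loss", save_best_only=True),
]
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=100, batch_size=1024, callbacks=callbacks)
```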

The test results on each module and integrated module are presented in Table 4.1

The completed system covers most of the requirements set out in the project goals.

Table 4.1 System test list report

| No | Action | Inputs | Expected Output | Actual Output | Test Result |
|----|--------|--------|-----------------|---------------|-------------|
| 1 | Create user dataset | — | Images (In/Out) are captured and stored | Images (In/Out) are captured and stored | Pass |
| 2 | Train model on new user | — | Vector features are created and stored | Vector features are created and stored | Pass |
| 3 | Recognize face | A live stream of a person's face | The name of the detected person is displayed on the screen | The name of the detected person is displayed on the screen | Pass |
| 4 | Detect liveness | A live stream of a person's face | Label (Real/Fake face) of the detected person is displayed on the screen | Label (Real/Fake face) of the detected person is displayed on the screen | Pass |
| 5 | Detect face mask | A live stream of a person's face | Label (Mask/Without Mask) of the detected person is displayed on the screen | Label (Mask/Without Mask) of the detected person is displayed on the screen | Pass |
| 6 | Update attendance time | Time In/Out after successful ID recognition | Data (In/Out, time engaged) is written to the CSV file | Data (In/Out, time engaged) is written to the CSV file | Pass |
| 7 | Detect multiple faces and update attendance time to CSV file | Multiple faces from a live video stream | Attendance time is updated for all detected faces | Attendance is updated only for a single face | Fail |

The prediction values for face mask and liveness detection were analyzed using a matching threshold of 0.5, with the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts detailed in Tables 4.2, 4.3, and 4.4. To assess performance, five additional volunteers whose faces were not stored in the database took part in the testing. The evaluation covered two scenarios, indoor and outdoor, under varying brightness conditions. Each subject was tested three times at a distance of 50 to 100 cm, covering the frontal, right, and left (±30 degrees) facial angles.
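As a sketch of how these counts are obtained, the model's predicted probabilities can be thresholded at 0.5 and tallied; the model and test arrays here are assumptions:

```python
# Threshold predicted probabilities at 0.5 and tally the confusion-matrix counts.
from sklearn.metrics import confusion_matrix

scores = model.predict(x_test).ravel()       # probabilities in [0, 1]
predicted = (scores >= 0.5).astype(int)      # the 0.5 matching threshold
tn, fp, fn, tp = confusion_matrix(y_test, predicted).ravel()
```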

To evaluate the system's effectiveness, we use diverse datasets that account for varying conditions, including illumination and expression. The analysis also extends to real-time operation using our own database. Prior literature establishes recognition accuracy as the standard metric for validating such systems.

Accuracy, or recognition rate, is defined by the following formula:
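$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

The precision and recall figures reported below follow the standard definitions in terms of the same counts:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$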

The experimental results demonstrate the effectiveness of the proposed attendance system built on deep learning models. Input sets ranged from 200 to 1000 images, evenly distributed across labels, and each input size was run twice. Overall accuracy reached 95.3% for the face mask detection model and 75.1% for the face liveness detection model. Precision was 96.4% for face mask detection and 76.5% for fake face detection, while recall was 94% and approximately 74.3% respectively, as illustrated in the accompanying confusion matrices.

Table 4.2 Testing actual accuracy for Face Mask detection

Table 4.3 Testing actual accuracy for Liveness Face detection

Table 4.4 Testing actual accuracy for Face and Face Mask Recognition

The strict matching threshold keeps misclassified instances to a minimum and therefore yields a high accuracy rate. Testing outcomes can vary, however, with a different threshold, a different camera distance, or a new camera. Testing subjects at 50 to 100 cm gives high facial resolution, producing more accurate and informative feature vectors; performance testing was nonetheless conducted remotely, with the camera and computer as separate devices, which may introduce minor issues. To establish a suitable threshold for future work, performance should be tested against a range of matching criteria, as sketched below.
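A short sketch of such a sweep, reusing the scores and labels from the earlier snippet (all names are assumptions):

```python
# Evaluate accuracy across a range of matching thresholds to pick a suitable one.
import numpy as np

for threshold in np.arange(0.30, 0.80, 0.05):
    predicted = (scores >= threshold).astype(int)
    accuracy = float((predicted == y_test).mean())
    print(f"threshold={threshold:.2f}  accuracy={accuracy:.3f}")
```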

CONCLUSION AND FUTURE WORK

Traditional attendance methods are fraught with issues that create significant challenges for many institutions. Integrating facial recognition technology into attendance monitoring addresses these shortcomings and ensures accurate attendance tracking. This advancement improves efficiency, reduces costs, and minimizes human involvement by automating complex tasks.

This thesis explores deep learning methods for face mask detection and facial recognition by developing a machine learning model. Despite the challenge of faces half-obscured by masks, the technique delivers fast and precise results suitable for security systems. The findings show an accuracy of approximately 85% in classifying individuals as mask-wearing, non-mask-wearing, or incorrectly masked. The research offers a practical strategy for mitigating COVID-19 by enabling biometric verification while checking mask compliance. The approach has limitations, however: it requires frontal input images, must first recognize a single face, is sensitive to lighting conditions, and can be inaccurate on blurry images.

Implementing a biometric attendance system that verifies mask usage can significantly support public healthcare during the coronavirus pandemic. Unlike fingerprint readers, which require users to place a finger on a shared sensor, a face-based system minimizes contact with potentially contaminated surfaces. The solution suits colleges, universities, and workplaces with automatic doors, tracking attendance while enforcing health guidelines, and it can be scaled to other environments, including urban settings where it could identify unmasked individuals in crowds. The approach addresses limitations of previous models, streamlines access to attendance data, and shows how technology can support public health initiatives.

