VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY
UNIVERSITY OF INFORMATION TECHNOLOGY
ADVANCED PROGRAM IN INFORMATION SYSTEMS
Le Ngoc Thuy An - 19521176, Pham Minh Tri - 19522391
BACHELOR OF ENGINEERING IN INFORMATION SYSTEMS
THESIS ADVISOR: Assoc. Prof. Dr.techn. Quan LE-TRUNG
HO CHI MINH CITY, 2024
Table of Contents
1 Overview
1.1 Abstract
1.2 Problem Statement
1.2.1 Current Payment Methods: An Overview and Comparative Analysis
1.2.2 Methodological Examination of Biometric Authentication Modalities in Payment Systems: Advancing the Case for Facial Recognition
1.2.3 Current Landscape of Face Recognition in Payment
1.2.4 Liveness Detection Methods
1.3 Survey
1.4 Motivation
1.5 Scope
1.6 Objectives
2 Methodology
2.1 Related Work
2.2 Theoretical Basis
2.2.1 Convolutional Neural Network
2.2.2 VGGFace
2.2.3 MediaPipe
2.2.4 Flask for API Server
...
3.5 Building Liveness Detection
3.6 Building Face Recognition Model
Building Model
Table of Figures
Figure 2: Card Payment Value
Figure 3: CNN Architecture
Figure 4: System ...
Figure 5: Processing Flow
Figure 6: GCVM Configuration
Figure 7: GCVM Configuration
Figure 8: Testing ...
Figure 10: Choose bank
Figure 11: Turn On Camera
Figure 12: Do Question
Figure 13: Capture Face
Figure 14: Account Information
Figure 15: Payment Successful
Table of Tables
Table 1: CNN Comparison
Table 2: VGG Comparison
Table 3: Liveness Accuracy
Acknowledgements
First and foremost, we would like to extend our deepest gratitude to Assoc. Prof. Dr.techn. Quan LE-TRUNG of the Faculty of Information Systems, IoT Laboratory. His invaluable guidance, expertise, and unwavering support were instrumental throughout the journey of this thesis. His insightful feedback, constructive criticism, and commitment to excellence have significantly enriched the quality and depth of our research.
Assoc. Prof. Dr.techn. Quan LE-TRUNG's dedication to fostering academic growth and his profound knowledge in the field of information systems have been pivotal in shaping our perspectives, refining our methodologies, and navigating the intricate nuances of our study. His mentorship has not only facilitated the realization of this thesis but has also instilled in us a deeper appreciation for rigorous academic inquiry and innovation.
Furthermore, his encouragement, patience, and willingness to engage in meaningful discourse have been invaluable assets, ensuring that our research endeavors remained focused, relevant, and impactful. It is with sincere appreciation that we recognize his contributions and express our heartfelt thanks for his pivotal role in the culmination of this academic endeavor.
In conclusion, our journey would not have been possible without the unwavering support and guidance of Assoc. Prof. Dr.techn. Quan LE-TRUNG. His mentorship, expertise, and dedication to academic excellence have left an indelible mark on our research journey, for which we are profoundly grateful.
1 Overview
1.1 Abstract
Modern society has been moving toward a digital, cashless economy. With the advent of payment methods such as credit cards, online banking, and digital wallets, contactless and cardless transactions are now feasible both online and offline. Nevertheless, these payment methods are susceptible to theft and may require users to memorize and store unique passwords. Although biometric payments may appear to be a feasible alternative, the fingerprint method can fail or be deceived because its sensors are susceptible to dirt and damage. In contrast to current card, mobile, and biometric payment systems, face recognition payments offer a more seamless experience by eliminating the need for a physical device to execute the transaction, while remaining dependable, secure, and effective. As a result, both the consumer and the retailer save time. The primary emphasis of this research is a payment system that integrates two essential components: face recognition and liveness detection. The liveness detection module achieves 99.996% accuracy and increases security by distinguishing living subjects from non-living entities. The face recognition module achieves a more modest accuracy of 69.8%, but it still contributes significantly to user convenience during transactions. Integrating these two modules yields a comprehensive strategy for face recognition in payment systems that balances user experience and security. The high precision of liveness detection provides an additional safeguard against fraudulent activity, while ongoing efforts to improve face recognition accuracy contribute to the overall efficacy of the payment system.
1.2 Problem Statement
1.2.1 Current Payment Methods: An Overview and Comparative Analysis
Several payment methods are currently in use around the world, including:
Cash: Cash is a traditional way to pay for things. People use physical money, such as bills or coins.
Biometric authentication: Biometric authentication technologies are being developed to make payment methods safer and easier to use. These technologies include facial recognition, fingerprint scanning, and iris scanning.
Cryptocurrencies: Digital currencies such as Bitcoin and Ethereum are becoming increasingly popular as a way to pay for things online. They offer secure and largely pseudonymous transfers.
Finally, biometric authentication stands out as one of the most important and game-changing ways to pay in a world full of different options, from standard cash transactions to digital currencies. Using technologies such as face recognition, fingerprint scanning, and iris scanning, it is changing the way payments are made by providing a safe, quick, and user-centered authentication process. Biometric authentication uses unique physical or behavioral traits to confirm people's identities, which lowers the risk of identity theft and other fraudulent activities. This is in contrast to traditional methods, which may be open to fraud or unauthorized access. Since biometric technologies are continually improving, adding them to payment systems could greatly improve security, speed up transactions, and build trust and confidence among users in a world that is becoming more and more digital. Biometric identification is therefore an important part of this thesis, because it helps shape the future of safe and effective payment systems.
1.2.2 Methodological Examination of Biometric Authentication Modalities in Payment Systems: Advancing the Case for Facial Recognition
Many different biometric authentication methods have been used in payment systems to make sure that transactions are safe and go smoothly.
Face Recognition: This method uses complex algorithms to capture, examine, and verify facial features, such as the distance between the eyes, the structure of the nose, and the shape of the mouth. When a user's face is scanned, it is compared with previously enrolled templates to confirm that the transaction is valid.
Fingerprint (Dermatoglyphic) Authentication: This method uses the unique patterns and ridges on a person's fingers to confirm their identity by comparing images of the impressions they leave with stored records.
Iris Biometrics: This method uses infrared light to capture clear images of the iris, the intricate ring-shaped structure that surrounds the pupil. Because iris patterns are highly unique, they serve as a strong identifier.
Vocal Biometrics: Voice recognition identifies the characteristics that make each person's voice distinct by analyzing features such as intonation, timbre, and rhythmic patterns. This confirms that users are who they say they are during transactions.
In conclusion, many biometric authentication methods can make payment systems safer, but facial recognition stands out as the preferred option in this thesis because of its combination of user experience, accuracy, and adaptability. As the technology advances, the seamless integration of facial recognition into payment infrastructures could have a major effect on creating safe, streamlined, and user-centered ways to make transactions.
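The template comparison described for face recognition above can be illustrated with a minimal sketch. It assumes each face has already been converted into a fixed-length numeric feature vector; the example vectors, the helper names `cosine_similarity` and `is_match`, and the 0.8 threshold are hypothetical choices made for illustration and are not taken from this thesis.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def is_match(probe, template, threshold=0.8):
    """Accept the probe face when it is close enough to the stored template.

    The threshold is an illustrative assumption; a real system would tune it
    on validation data to balance false accepts against false rejects.
    """
    return cosine_similarity(probe, template) >= threshold


# Hypothetical 4-dimensional feature vectors (real models use many more values).
enrolled_template = [0.9, 0.1, 0.3, 0.5]
same_person_scan = [0.85, 0.15, 0.28, 0.52]
other_person_scan = [0.1, 0.9, 0.6, 0.05]

print(is_match(same_person_scan, enrolled_template))   # True
print(is_match(other_person_scan, enrolled_template))  # False
```

Raising the threshold makes false acceptances less likely at the cost of rejecting more genuine users, which is the trade-off between security and convenience that a payment setting has to weigh.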
1.2.3 Current Landscape of Face Recognition in Payment
In the realm of payment technology, face recognition has emerged as a prominent method, transforming the landscape with its convenience, speed, and security features. The ease of identity verification through quick facial scans enhances the payment experience for users, contributing to its widespread adoption.
Security is a paramount concern in payment applications, and face recognition systems address this by integrating high-security measures such as data encryption and decryption. The synergy with other technologies such as artificial intelligence (AI) and machine learning further refines the accuracy and performance of face recognition systems, ensuring a robust and secure payment environment.
However, the widespread use of face recognition in payments has prompted societal reactions and privacy concerns. Regulatory frameworks and policies governing the use of facial data have become pivotal in discussions, highlighting the importance of ethical considerations and responsible data practices. Striking a balance between technological innovation and privacy protection is crucial for the continued evolution of face recognition in payment systems.
In conclusion, the current state of face recognition in payments reflects a dynamic landscape with opportunities and challenges. The adaptability of these systems across various environments, coupled with ongoing improvements to address diverse facial characteristics, underscores their potential impact. As the technology evolves, a mindful approach to privacy and security will be essential to foster trust and acceptance in the broader societal context.
1.2.4 Liveness Detection Methods:
Ensuring the authenticity of a presented face in face recognition systems is paramount to preventing spoofing. Several liveness detection methods exist, each offering distinct advantages. Here are some commonly used methods for liveness detection:
Facial Movement Analysis:
• Method: Analyzing dynamic facial movements like blinking, smiling, or nodding.
• How it works: Monitors the dynamic nature of facial expressions to distinguish a live face from a static image or video.
3D Depth Sensing:
• Method: Capturing the three-dimensional structure of the face.
• How it works: Uses technologies such as structured light or time-of-flight cameras to assess spatial information, making it challenging for attackers to spoof with flat images or videos.
Texture Analysis:
• Method: Analyzing surface details, pores, and microexpressions.
• How it works: Assesses the texture and fine details of the face to verify authenticity.
Eye Blink Detection:
• Method: Monitoring natural eye blink patterns.
• How it works: Verifies the presence of spontaneous and regular eye blinks, which are challenging to replicate in static images or videos.
Infrared Imaging:
• Method: Detecting thermal patterns emitted by living skin.
• How it works: Utilizes infrared sensors to capture heat signatures associated with living skin, making it difficult to mimic with printed images or videos.
Voice Recognition:
• Method: Integrating voice-based challenges alongside face recognition.
• How it works: Verifies that the voice associated with the presented face matches the expected vocal characteristics of a live person.
Random Challenge Prompts:
• Method: Introducing unpredictable challenges during authentication.
• How it works: Requests users to perform random actions, such as turning their head, speaking a specific phrase, or responding to dynamic prompts.
Blood Flow Monitoring:
• Method: Assessing blood flow patterns in the face.
• How it works: Uses advanced techniques like photoplethysmography to detect changes in blood circulation, ensuring the presented face exhibits physiological characteristics of a living person.
Behavioral Analysis:
• Method: Analyzing user behavior during the authentication process.
• How it works: Assesses the consistency and naturalness of user interactions, identifying signs of automation or artificial manipulation.
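As an illustration of the eye blink detection method listed above, the following sketch uses the eye aspect ratio (EAR), a widely used heuristic that is introduced here as an example and is not described in this text. It assumes six (x, y) landmarks per eye are already available from some face landmark detector; the landmark coordinates, the 0.2 threshold, and the minimum of two closed frames are hypothetical values.

```python
import math


def eye_aspect_ratio(eye):
    """Compute the eye aspect ratio (EAR) from six (x, y) eye landmarks.

    Landmarks are ordered p1..p6: p1 and p4 are the horizontal eye corners,
    p2 and p3 lie on the upper lid, p6 and p5 on the lower lid.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def count_blinks(ear_values, threshold=0.2, min_closed_frames=2):
    """Count blinks in a sequence of per-frame EAR values.

    A blink is a run of at least `min_closed_frames` consecutive frames whose
    EAR is below `threshold`, followed by the eye reopening. Both parameters
    are illustrative assumptions.
    """
    blinks = 0
    closed_run = 0
    for ear in ear_values:
        if ear < threshold:
            closed_run += 1
        else:
            if closed_run >= min_closed_frames:
                blinks += 1
            closed_run = 0
    return blinks


# Hypothetical landmarks for an open eye (coordinates in pixels).
open_eye = [(0, 0), (10, -6), (20, -6), (30, 0), (20, 6), (10, 6)]
print(round(eye_aspect_ratio(open_eye), 2))  # 0.4

# Hypothetical EAR trace: open, closed for three frames, open again.
trace = [0.31, 0.30, 0.12, 0.10, 0.11, 0.29, 0.32]
print(count_blinks(trace))  # 1
```

A printed photo produces a nearly constant EAR, so no blink is counted; a replayed video can still contain blinks, which is one reason blink detection is often combined with other checks.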
The adoption of Random Challenge Prompts for liveness detection stands out by not only
meeting the immediate need for secure face recognition but also providing a solution that
is adaptable, engaging, and continuously evolving against emerging threats This method
significantly contributes to the overall effectiveness and reliability of face recognition
applications, delivering a secure and user-friendly authentication experience
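The Random Challenge Prompts approach can be sketched as follows. This is a minimal illustration under stated assumptions: the list of actions, the function names `issue_challenges` and `verify_session`, and the idea that a separate vision component reports the actions it observed are all hypothetical and do not describe the implementation in this thesis.

```python
import secrets

# Hypothetical set of actions the user can be asked to perform.
CHALLENGES = ["blink twice", "turn head left", "turn head right", "smile", "nod"]


def issue_challenges(count=3):
    """Pick `count` distinct challenges in an unpredictable order."""
    if not 0 < count <= len(CHALLENGES):
        raise ValueError("count must be between 1 and the number of challenges")
    pool = list(CHALLENGES)
    chosen = []
    for _ in range(count):
        # secrets.choice draws from a cryptographically strong random source,
        # so an attacker cannot predict the sequence and pre-record a video.
        pick = secrets.choice(pool)
        pool.remove(pick)
        chosen.append(pick)
    return chosen


def verify_session(issued, observed):
    """Pass only if the observed actions match the issued ones, in order."""
    return list(observed) == list(issued)


issued = issue_challenges()
print(issued)
print(verify_session(issued, issued))        # a cooperative live user passes
print(verify_session(issued, issued[::-1]))  # the wrong order is rejected
```

Because the sequence changes on every attempt, a replayed recording of an earlier session is unlikely to match the newly issued prompts.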
1.3 Survey
Figure 1: 2021 Retail Transactions in Vietnam (chart: Share of Payment Modes at Retail Shops, in Percentage, Vietnam, 2021)
In 2021, non-cash payments accounted for 70% of overall retail transactions in Vietnam. Significantly, 89.3% of retailers evaluate non-cash payments favorably, regarding them as a prevailing and enduring trend. New cashless payment systems are expected to be introduced to alleviate the challenges businesses currently face.
According to the survey conducted by OpenGov Asia, bank account payments were the predominant mode of transaction at retail establishments, restaurants, and cafés, comprising 36.5% of all transactions. Cash accounted for 29.8% of transactions, followed by e-wallets (14.8%), QR codes (9.9%), bank cards (8.5%), and payment gateways (0.5%).
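As a quick consistency check, the shares quoted above can be summed to confirm that they cover all transactions and that the non-cash modes together account for roughly 70%. The snippet uses only the percentages stated in the text; the dictionary keys are shortened labels.

```python
# Shares of retail payment modes in Vietnam, 2021 (percent), as quoted above.
shares = {
    "bank account": 36.5,
    "cash": 29.8,
    "e-wallet": 14.8,
    "QR code": 9.9,
    "bank card": 8.5,
    "payment gateway": 0.5,
}

total = round(sum(shares.values()), 1)
non_cash = round(total - shares["cash"], 1)

print(total)     # 100.0
print(non_cash)  # 70.2
```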
According to a report from the State Bank of Vietnam [5], by the end of 2022 more than 77.41% of Vietnamese adults held bank payment accounts. During the first 7 months of 2023, non-cash payment transactions increased notably compared with the same period in 2022. Specifically, the number of non-cash payment transactions rose by 51.14%. Transactions conducted through the Internet channel increased by 66.46% in quantity and 4.01% in value. Similarly, transactions made via the mobile phone channel increased by 63.09% in quantity and 8.79% in value. Lastly, transactions using the QR Code method increased by 124.15% in quantity and 16.12% in value. Online account opening began at the end of March 2021. As of June 2023, there were around 27 million active payment accounts that had been opened electronically with eKYC, and 10.8 million cards in use that had been issued using the electronic Know Your Customer (eKYC) method.
GlobalData, a prominent data and analytics business, predicts that the Vietnamese card payments market will grow by 23.8%, reaching VND859.2 trillion ($37.6 billion) in 2022 [1]. This growth will be driven by increased consumer spending and the government's efforts to promote digital payments.
Figure 2: Card Payment Value (chart: Vietnam: Card Payments Value (VND trillion), 2017-26f, showing card payments value and growth rate; "e" refers to "estimated" and "f" to "forecast". Source: GlobalData Banking and Payments Intelligence Center)
GlobalData's Payment Cards Analytics report shows that card payments in Vietnam grew by 13.7% in 2021, compared with a modest 2.2% in 2020. The weak 2020 figure can likely be attributed to decreased consumer spending during the pandemic, while the strong growth in 2021 followed a modest economic recovery and the reopening of businesses.
1.4 Motivation
In an era dominated by digital transformation, the fusion of biometric technologies and payment systems stands as a pivotal frontier. The application of face recognition in payment processes is an area of immense potential, promising both heightened security and a seamless user experience. This thesis is motivated by the desire to explore and enhance this potential by integrating liveness detection alongside face recognition, aiming to create a robust and trustworthy framework for secure financial transactions.
The primary motivation lies in addressing the evolving landscape of cybersecurity threats, where traditional authentication methods often fall short. Face recognition, coupled with liveness detection, presents a dynamic solution to combat identity fraud and unauthorized access. Liveness detection ensures that the presented face is not a static image, thereby fortifying the security of the payment system against spoofing attacks.
This research aims to delve into the technical details of combining face recognition and liveness detection in payment applications. By understanding the challenges associated with both technologies and exploring their synergies, the thesis endeavors to develop a comprehensive system that not only authenticates users based on facial features but also verifies the liveness of the presented face in real time.
The integration of liveness detection is crucial not only for security but also for cultivating user trust. As we transition towards a cashless and contactless society, users expect seamless and trustworthy payment experiences. This thesis seeks to contribute to the growing body of knowledge on biometric authentication, particularly in the realm of payment systems, and aims to provide insights into the practical implementation and effectiveness of such a combined approach.
The outcomes of this research could have a profound impact on shaping the future of secure payment systems, influencing the design of biometric technologies, and contributing to the ongoing discourse on the intersection of security, usability, and technological innovation. Ultimately, the thesis aspires to contribute towards building a safer and more user-friendly landscape for digital payments.
1.5 Scope
The scope of the thesis involves designing and implementing a module that integrates both liveness detection and face recognition technologies, aiming to enhance the security and accuracy of identity verification systems. The focus will be on developing algorithms for liveness detection, such as eye blink detection, in conjunction with robust face recognition techniques. The module aims to address the challenges posed by fraudulent attempts using static images or videos, providing a more secure and reliable solution for identity verification in various applications, including but not limited to payment systems, access control, and user authentication. The research will encompass the exploration of existing methodologies, the development of novel algorithms, and the evaluation of the module's performance through extensive testing and comparison with existing solutions.
1.6 Objectives
The main goal of this thesis is to study and develop a facial recognition system that can be used for payments in stores. In particular, the study aims to address the following:
• Foundational Ideas: Learn about the latest techniques and algorithms used for face recognition, with a focus on how they can be used with payment systems.
• System Requirements and Standards: Examine the specific needs and rules that must be met for a face recognition system to work well in a retail payment setting, paying special attention to safety, privacy, and scalability.
• Facial Recognition Model Development: Use advanced methods from deep learning and machine learning to build and improve a facial recognition model. The model should aim for the best possible accuracy and performance, especially in store settings that change quickly.
• Integration Infrastructure Design: Build a strong foundation that makes it easy to add the face recognition system to current banking and electronic payment systems. This will ensure that the systems work together well and efficiently.
• Simulation and Evaluation: Run thorough simulations using relevant data sets to assess the effectiveness, reliability, and potential problems of the face recognition system in a variety of retail settings.
• Future Directions: To conclude, summarize the study results and discuss possible improvements, new ideas, and broader integration strategies in the ever-changing world of payment technologies.
2 Methodology
2.1 Related Work
Dhikhi et al. implemented a novel credit card transaction system that combines face recognition and detection technologies, using the Haar-Cascade and GLCM algorithms [8]. The system focuses on protecting Mastercard users in cases where illegal access occurs because card information has been exposed or the card has been lost. The study presents a holistic method for improving the security of credit card transactions through face recognition and detection. The proposed system aims to reduce the likelihood of credit card theft by comparing the user's face image with a dataset linked to that user, and authentication relies on this comparison. If the face image matches, confirming the user's validity, the transaction is authorized. If there is no match, the user is not allowed to proceed with the transaction, which strengthens security and reduces vulnerability to credit card fraud.
M. Du et al. [9] introduced a lightweight improvement scheme for face recognition tailored to mobile payment systems. Using dynamic heteroscedasticity and the classical scale transformation algorithm, the proposed method adapts autonomously by adding reliable test samples, resulting in significant performance gains. Tests on the ORL and Yale face databases showed recognition rate improvements of 6.13% and 14.11%, respectively, compared with traditional methods. Moreover, it achieved a 74.05% recognition rate on the ORL database, surpassing classic algorithms such as PCA and LBP. The scheme was successfully implemented on Android smartphones, confirming its feasibility. Additionally, a cloud-based architecture enhancement was proposed for future scalability.
2.2 Theoretical Basis
2.2.1 Convolutional Neural Network
A Convolutional Neural Network (CNN) [10] is a powerful deep learning approach designed specifically for the analysis and interpretation of visual content, particularly images. CNNs are inspired by the human visual system and excel at automatically recognizing and organizing the spatial features present in input images. These networks are essential to a wide variety of computer vision applications, including, but not limited to, image classification, object detection, and facial recognition.
Figure 3: CNN Architecture (diagram of a Convolutional Neural Network: input, kernel, three convolution + ReLU and pooling stages, flatten, activation function)
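The convolution, ReLU, pooling, and flatten stages of a CNN can be sketched in plain Python on a tiny example. This is an illustrative toy under stated assumptions, not the model used in this thesis: the 5x5 image and the 2x2 edge kernel are hypothetical, and in a real network the kernel values are learned during training.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    """Turn a 2D feature map into a 1D vector for a dense layer."""
    return [v for row in feature_map for v in row]


# Hypothetical 5x5 grayscale image with a bright vertical stripe on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
# Hand-picked 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

features = flatten(max_pool(relu(conv2d(image, kernel))))
print(features)  # [0, 18, 0, 18]
```

The non-zero entries mark where the kernel found the edge between the dark and bright regions; stacking several such stages lets later layers respond to progressively larger patterns.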
CNN Advantages:
• Hierarchical Feature Learning: CNNs excel at automatically learning hierarchical features from raw input. By eliminating the need for manual feature extraction, the model can effectively capture the complex patterns, textures, and structures inherent in visual data.
• Spatial Invariance: CNNs can identify objects or patterns regardless of their location within an image. This property improves the model's robustness and accuracy across diverse computer vision tasks.
• Parameter Sharing: CNNs apply the same set of parameters across many regions of the input. This reduces memory usage and improves computational efficiency, making CNNs well suited to large visual datasets.
• Versatility: CNNs perform well in tasks such as image classification, object detection, facial recognition, image generation, and more, which demonstrates their adaptability and effectiveness in diverse applications.
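The effect of parameter sharing noted above can be made concrete with a small parameter count. The layer sizes below (a 224x224 RGB input, a 3x3 convolution with 64 filters, and a fully connected layer with 64 units) are hypothetical values chosen only to show the difference in scale; the two layers do not produce equivalent outputs.

```python
def conv_layer_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output channel; independent of image size."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def dense_layer_params(in_features, out_features):
    """One weight per input-output pair plus one bias per output."""
    return (in_features + 1) * out_features


# Hypothetical sizes: a 224x224 RGB image, 64 filters versus 64 dense units.
height, width, channels = 224, 224, 3
conv = conv_layer_params(3, 3, channels, 64)
dense = dense_layer_params(height * width * channels, 64)

print(conv)   # 1792
print(dense)  # 9633856
```

Because the same 3x3 kernel is reused at every image position, the convolutional layer's parameter count does not grow with the image resolution, whereas the fully connected layer's count grows in proportion to it.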
CNN Disadvantages:
• Computational Complexity: Training CNNs requires substantial computational resources, such as powerful Graphics Processing Units (GPUs), because of their complex structure, depth, and large parameter space. This can lead to long training times and computationally demanding calculations.
• Overfitting: CNNs are prone to overfitting if regularization techniques and data augmentation are not properly applied. Overfitting occurs when the model performs well on training data but fails to generalize to unseen data, resulting in reduced reliability and performance.
• Interpretability: The deep and intricate structure of CNNs often reduces interpretability, making it difficult to understand the model's decisions, feature representations, and underlying mechanisms. This limitation can be a problem in critical applications that require transparency.
• Data Dependency: The performance of CNNs is strongly influenced by the quality, diversity, and quantity of the training data. Inadequate or biased datasets can lead to suboptimal performance, which underscores the importance of rigorous data collection, preprocessing, and augmentation.
2.2.2 VGGFace
VGG Face [11] is a convolutional neural network (CNN) model created by the Visual Geometry Group (VGG) at the University of Oxford. It is pre-trained to recognize faces. VGG Face is very good at extracting detailed facial features, patterns, and structures from images because it uses the deep architecture of VGG networks. This makes face recognition and verification accurate and reliable in a wide range of situations and applications.
Aspect    VGGFace              OpenFace           DeepFace
Author    Oxford University    Carnegie Mellon    Facebook
Advantages of VGGFace
• High Accuracy: VGG Face is very good at recognizing faces because it has a deep architecture, is trained on large datasets, and has advanced feature extraction capabilities that let it work well across a wide range of camera angles, lighting conditions, and facial expressions.
• Feature Representation: VGG Face's hierarchical convolutional layers capture multi-level facial features, textures, and patterns in a systematic way. This yields complete and accurate feature representations that improve the model's ability to tell people apart.
• Transfer Learning: VGG Face's pre-trained model is useful for transfer learning because developers and researchers can reuse its learned representations, architecture, and weights to speed up training, improve convergence, and achieve better results on custom face recognition datasets and applications.
• Scalability and Adaptability: VGG Face's modular architecture, scalability, and adaptability make it easy to integrate, customize, and deploy in a wide range of face recognition systems, platforms, and environments. This makes it useful in many fields, including biometrics, security, surveillance, and personalized user experiences.
• Community Support and Contributions: The deep learning community actively researches, develops, and contributes to VGG Face. This leads to collaborative innovation, improvements in face recognition technologies, and the ongoing refinement and optimization of the model's architecture, algorithms, and performance metrics.
Disadvantages of VGGFace
• Computational Intensity: VGG Face's complex architecture and many layers require a lot of memory bandwidth, processing power, and computational resources. This leads to longer inference times, higher resource utilization, and higher operational costs, especially in real-time or latency-sensitive applications.
• Model Size and Complexity: VGG Face's neural network architecture may be hard to deploy, store, and scale because of its size, complexity, and depth. To address these problems, optimization techniques, model pruning, or changes to the architecture may be needed to make the model smaller or less complex.
• Overfitting Risks: VGG Face has high accuracy and performance metrics, but it may overfit on certain face recognition tasks, datasets, or environments with little variation. To make it work well in a wide range of situations and conditions, it needs careful regularization, data augmentation, and fine-tuning.
• Training and Data Needs: Training and fine-tuning VGG Face on custom datasets, applications, or domains may require a lot of computing power, annotated data, subject knowledge, and testing to get the best performance and reduce biases.
• Licensing and Usage Restrictions: VGG Face's research and pre-trained models are freely available for academic and research use. However, commercial use, licensing agreements, and intellectual property rights may impose restrictions, limitations, or compliance requirements depending on the application, industry, and jurisdiction. These need to be carefully considered and legally reviewed, and the models must be used in line with relevant policies, regulations, and ethical guidelines.
2.2.3 MediaPipe
MediaPipe [13] is a highly adaptable platform that is well known for its diverse range of capabilities in AI applications across several areas, including facial recognition, hand tracking, and pose estimation. MediaPipe is an open-source framework that provides a wide range of tools for easy integration. It supports the development of projects that rely on visual data analysis. MediaPipe stands out for its real-time processing, which allows quick analysis of video streams and image sequences. This greatly improves project efficiency and flexibility.
Advantages of MediaPipe
• Cross-Platform Interoperability: MediaPipe's cross-platform interoperability guarantees excellent performance on many devices and operating systems, highlighting its adaptability and versatility.
• Efficient Development: MediaPipe streamlines the development process by offering smooth integration and easy access to pre-trained models, resulting in reduced complications, faster deployment, and improved resource efficiency.
• Advanced Methodologies: MediaPipe utilizes state-of-the-art computer vision techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to accurately analyze complex visual features.
• Efficiency and Scalability: MediaPipe's modular architecture enables easy customization, integration of pre-trained models, and scalability, accommodating various application needs and promoting creativity.
Disadvantages of MediaPipe
• Integration Complexity: Although MediaPipe provides flexible integration options,
the sophistication of certain functionality can complicate the integration process, which therefore requires careful attention to ensure a smooth implementation.
• High Resource Usage: Due to its powerful real-time processing and complex
techniques, MediaPipe may require a significant amount of computational
resources, which could be problematic in contexts with limited resources.
• Learning Curve: Developers who are not familiar with the subtleties of MediaPipe
may need specific training and experience because of the wide range
of programming languages, frameworks, and advanced techniques used in this
technology.
2.2.4 Flask for API Server
Flask[12] serves as our API server and is a crucial component in the structure of our
project. It is a robust and adaptable framework designed for developing
web applications. Flask is known for combining a
lightweight design with strong features, which enables us to create a user-focused experience. This allows us to integrate dynamic and interactive content with little effort, enhancing our user interface.
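To make the role of Flask as an API server more concrete, the following is a minimal sketch of a JSON API. The routes `/health` and `/verify` and the `image` field are hypothetical placeholders chosen for illustration. They are not the actual endpoints of the system described in this thesis, and the verification step is left as a stub.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/health", methods=["GET"])
def health():
    """Simple liveness probe for the API server."""
    return jsonify(status="ok")


@app.route("/verify", methods=["POST"])
def verify():
    """Accept a JSON payload with an encoded face image (hypothetical route)."""
    payload = request.get_json(silent=True) or {}
    if "image" not in payload:
        return jsonify(error="missing 'image' field"), 400
    # Placeholder: a real handler would decode the image and run the
    # face recognition pipeline here. This stub only echoes the size.
    return jsonify(received_chars=len(payload["image"])), 200


if __name__ == "__main__":
    # Development server only; a production deployment would typically
    # run the app behind a WSGI server.
    app.run(host="127.0.0.1", port=5000)
```

Flask's built-in test client (`app.test_client()`) can exercise these routes without starting a server, which fits the quick iteration cycle described in this section.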
Advantages of Using Flask
• Flask's modular and scalable design, enhanced by several extensions, enables the development of a flexible and expandable user interface that can easily adjust to
changing project needs.
• Flask's minimalist approach facilitates efficient development cycles by streamlining
the process and accelerating the deployment of new features. This allows for quick iterations and improvements.
• Flexibility and Creativity: Flask's unopinionated design gives developers the
freedom to customize and extend features, encouraging original solutions while still meeting project requirements.
• Flask's extensive ecosystem enhances our API server by providing strong support
for creating RESTful APIs, connecting to databases, and working with third-party libraries. This allows us to integrate various functionality and
improve the user experience.
Disadvantages of Using Flask
• Learning Curve: Although Flask provides flexibility and customization, its extensive capabilities may present a learning curve for developers who are not familiar with its intricacies, requiring specialized training and knowledge.
• Configuration Complexity: Utilizing Flask's wide range of features and extensions can lead to intricate configuration and optimization challenges, necessitating careful
attention to guarantee optimal performance and scalability.
2.2.5 OpenCV
OpenCV[14], also called the Open Source Computer Vision Library, is a well-known library for computer vision applications with advanced features.
This open-source project gives researchers and developers a wide range of algorithms, modules, and tools for different tasks. These include complex image processing tasks such as filtering, transformations, and morphological analysis, as well as advanced object detection and recognition tasks that use algorithms such as HOG, SSD, YOLO, and Haar cascades. OpenCV also works well with popular machine learning tools, which makes it easier to use trained models for tasks such as classification, regression, and clustering. Its support for real-time data handling, parallel computing, and hardware acceleration makes it faster, more scalable, and more responsive. For these reasons, it is an important choice for projects that need robust processing and analysis of visual data.
Advantages of OpenCV:
• Flexibility: OpenCV has many different methods and tools that make it useful for
many different computer vision tasks, from simple image processing to complex
machine learning pipelines.
• Community Support: Since OpenCV is an open-source platform, it gets a lot of help from a large group
of developers. This means that it is continually improved, updated, and supported.
• Cross-Platform Compatibility: OpenCV works with many operating systems, including
Windows, Linux, macOS, iOS, and Android. This adaptability makes deployment
easy across different platforms.
• Large Number of Features: The library has many features, such as the ability to
identify objects, recognize faces and gestures, and track movements, among others. These tools make it much easier to build different kinds of applications.
• Machine Learning Integration: OpenCV works well with popular machine learning
frameworks, which makes it easy for developers to add advanced machine learning models and methods to their applications.
Disadvantages of OpenCV:
• Steep Learning Curve: For beginners, OpenCV is hard to master because it has so many
features and is technically complex. It takes a lot of time and effort to learn.
• Performance Limitations: OpenCV can handle complicated tasks, but it may need a
lot of computing power, which could slow things down when resources are limited.
• Documentation Issues: OpenCV's documentation is extensive, but it is
sometimes fragmented or missing complete examples. This can make things hard for
developers who want clear instructions on how to use certain features.
• Integration Complexity: Combining OpenCV with other frameworks or tools can get complicated,
especially when different versions have to be kept in sync or all of the parts have to work together.
• Possible Overhead: OpenCV has many advanced features and capabilities, which can
add unnecessary overhead for projects that only need a few of them. This
can hurt performance and efficiency if it is not carefully
managed.
2.2.6 WebSocket
WebSockets[15] are a communication protocol that lets clients and servers exchange data in real time over a single, persistent connection. Unlike regular HTTP connections,
which are stateless and typically follow a separate
request-response cycle for each exchange, WebSockets keep one connection open, which lets the client and server send
and receive data quickly and easily.
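The persistent connection starts as an ordinary HTTP request that is upgraded to the WebSocket protocol. The text above does not describe this step, so the sketch below follows the opening handshake defined in RFC 6455, where the server derives a `Sec-WebSocket-Accept` value from the client's `Sec-WebSocket-Key` header. The function name `websocket_accept_key` is a hypothetical choice. In practice a WebSocket library performs this step automatically.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's handshake key."""
    # The server appends the GUID to the client's key, hashes the result
    # with SHA-1, and returns the digest encoded as base64.
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the handshake example in RFC 6455.
accept = websocket_accept_key("dGhlIHNhbXBsZSBub25jZQ==")
```

Once the server answers with status 101 and this header, both sides keep the same TCP connection open and exchange WebSocket frames in either direction without further HTTP requests.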
The advantages of WebSockets:
• Real-Time Communication: WebSockets allow clients and servers to talk to each other in real time, in both directions. This makes them well suited to applications that need to send and receive data right away, such as chat applications, online games, and trading platforms.
• Reduced Latency: By keeping a persistent connection, WebSockets reduce latency and avoid the overhead of setting up multiple connections. This means that responses
are faster and users have better experiences.
• Efficient Use of Resources: WebSockets allow event-driven communication, which saves server resources and bandwidth by sending data only when it is needed. This
is different from polling methods, which continuously request data at set intervals.
• Improvements to the User Experience: WebSockets' real-time features help make
dynamic, interactive user experiences possible by letting clients and servers share updates, alerts, and data instantly.
• Scalability: WebSockets support scalability by making it easy for multiple clients and servers to talk to each other. This means that they can be used to deploy
applications across distributed architectures and handle multiple connections at the