
Graduation thesis: Building a hotel management and check-in application using facial recognition


DOCUMENT INFORMATION

Basic information

Title: Building a Hotel Management and Check-in Application Using Facial Recognition
Author: Dang Hai Trang Phuc
Supervisors: Ph.D. Tran Van Thanh, M.Sc. Vo Tan Khoa
Institution: University of Information Technology
Major: Information Systems
Document type: Graduation Thesis
Publication year: 2023
City: Ho Chi Minh City
Pages: 80
File size: 49.07 MB

Structure

  • 1.2 Problem Statement
  • 1.3 Research Objectives
  • 1.4 Significance and Importance of the Topic
  • 1.5 Scope and Limitations of the Topic
  • 1.6 Overview of the Thesis
  • 1.7 Project Schedule
  • CHAPTER 2 LITERATURE REVIEW
    • 2.1 Overview of Hotel Management and Check-in Application
    • 2.2 Overview of Facial Recognition
    • 2.3 Related Work
    • 3.1 Android Studio
    • 3.2 Java Programming Language
    • 3.3 Mobile Application
    • 3.4 Firebase
      • 3.4.2 Services
    • 3.5 Convolutional Neural Networks (CNN)
      • 3.5.1 Introduction
      • 3.5.2 CNN Model used
    • 3.6 Plugins used
      • 3.6.2 Tensorboard and TensorFlow
      • 3.6.3 ML Kit
      • 3.6.4 NumPy
      • 3.6.4 Utils
      • 3.6.5 OpenCV - Python (3.x)
      • 3.6.6 Other important libraries
    • 3.7 TensorFlow
      • 3.7.3 TFLite Model Format
      • 3.7.4 Performance and Integration
      • 3.7.5 Run-time environment options
      • 3.7.6 Development APIs and libraries
  • CHAPTER 4 METHODOLOGY
    • 4.2 The Weakness of Common Mobile Networks for Facial Recognition
    • 4.3 Global Depthwise Convolution
    • 4.4 MobileFaceNet Architectures
    • 4.5 Experiments
      • 4.5.1 Dataset
      • 4.5.2 Comparative Empirical Models
      • 4.5.3 Training Settings & Accuracy Comparison on LFW
  • CHAPTER 5 SYSTEM IMPLEMENTATION
    • 5.1 Application Description
    • 5.2 Database Design
      • 5.2.1 Normalized Data
      • 5.2.2 Physical Data Sheets
    • 5.3 Application Installation
    • 5.4 Application Interface
      • 5.4.1 Sign Up & Sign In Page
      • 5.4.2 Main Menu
      • 5.4.3 Booking - Receive room information
      • 5.4.4 Booking History
      • 5.4.5 Facial Recognition
  • 6.1 Summary

Content

UNIVERSITY OF INFORMATION TECHNOLOGY
ADVANCED PROGRAM IN INFORMATION SYSTEMS

GRADUATION THESIS

BUILDING A HOTEL MANAGEMENT AND CHECK-IN APPLICATION USING FACIAL RECOGNITION

DANG HAI TRANG PHUC

1.4 Significance and Importance of the Topic

- The study shows how to use a Convolutional Neural Network (CNN) model, supporting libraries, and the Android Studio integrated development environment to design, develop, and deploy mobile applications for Android-based facial recognition.

- The application helps admins as well as customers capture booking information easily and more intuitively, because it does not go through a third-party application.

- By using the application, customers will feel safer because it is a dedicated Android application of a hotel & resort system; they cannot mistakenly book somewhere else or run into problems with impersonation and fraud. They can easily track room availability, monitor the admin's room approval, and view their booking history.

- Similarly, managing a large number of customers of the hotel system is extremely easy for the admin. The application helps admins manage empty rooms, booked rooms, and rooms under maintenance very easily. In addition, managing customers' booking history can provide statistics on the average check-in/check-out time of guests. From there, the system can schedule cleaning staff and prepare services thoroughly.

- The facial recognition feature is extremely useful for large hotel & resort chains with many services worldwide. Once customers have registered for facial recognition, the system saves their information, and when they check in at hotels, resorts, or services belonging to the system's chain, they will not need documents such as ID cards, citizen identification cards, or passports to check in again, or to use services that require gate tickets.

- Customer information security is also an extremely important issue. If a customer is dissatisfied or annoyed that his or her information is saved via facial recognition, the system will of course ensure the deletion of that data.

- In Vietnam, almost no hotels have implemented this method of operation. They have not yet built a dedicated application for management that also lets customers book rooms or communicate directly with the system. Therefore, in the near future, I think this research can be applied to hotel management systems in Vietnam.

1.5 Scope and Limitations of the Topic

This research is restricted to the design, development, and implementation of an Android-based mobile application for managing hotel booking and a facial recognition system, using the Java programming language and the Android Studio Integrated Development Environment (IDE) combined with several Android libraries. However, the created application has the following limitations:

- Booking and facial recognition are entirely platform-dependent. Therefore, the program only functions on Android-powered phones such as Samsung, OPPO, Xiaomi, Huawei, Realme, Vivo, etc. It is not compatible with the iOS operating system and thus with iPhone users.

- Customer information is kept confidential by creating an account, so to access the application to book rooms and use services, customers must have an internet connection.

- Although the CNN model MobileFaceNet can recognize faces with extremely high efficiency (beyond previous state-of-the-art mobile CNNs) for real-time facial recognition on mobile devices, recognition errors can still occur, for example when twins' faces are too similar or when lighting conditions are insufficient for recognition.

- When many images are used in the model, the application runs slowly and sometimes crashes slightly, degrading recognition performance.

Given the above limitations, the scope of this application should be businesses large enough to own high-end hotel & resort chains (in Vietnam). Customers using the service must also have a smartphone running the Android operating system.

1.6 Overview of the Thesis

This thesis consists of six chapters; a brief summary of each is given below.

General content of each chapter:

+ Chapter 1: Introducing the current situation, overviewing how to develop an Android hotel management system combining check-in with facial recognition, research challenges, research objectives, and relevance of the research and limitations in research & applications.

+ Chapter 2: This study evaluates a variety of publications from various academic sources on the design, development, and deployment of mobile applications, and refers to long-standing CNN models used to identify and recognize faces.

+ Chapter 3: The framework and related technologies will be discussed in this chapter as they are used in the design, creation, and deployment of an Android-based facial recognition application platform.

+ Chapter 4: How to develop and design Android hotel management applications that combine check-in with facial recognition is covered in this chapter, in particular how to approach advanced CNN models for face recognition.

+ Chapter 5: Outlines the details of gradually implementing a hotel management Android application that combines check-in with facial recognition, using two interfaces: Admin and Customer.

+ Chapter 6: This chapter completes the design and implementation of a relatively complete and separate Android mobile application for a hotel It also outlines the limitations and gives some suggestions on how to make the app even better.

1.7 Project Schedule

The table below provides an overview of the process of completing the design, development, and implementation of a hotel management and check-in application using facial recognition. This Android application was completed in about 21 weeks, as shown in Table 2.

STUDY COMPLETED              DURATION
Project Feasibility Studies  1 week
Program Testing              1 week
Write Report                 4 weeks

Figure 1.1 shows a Gantt chart giving a more detailed view of the research process.


Figure 1.1 Gantt chart of the study

CHAPTER 2 LITERATURE REVIEW

2.1 Overview of Hotel Management and Check-in Application

In today's context, the tourism and hotel industry is growing strongly, accompanied by an increase in customer demand. To meet this trend, mobile Hotel Management and Check-in applications have become an important part of this industry. They not only bring convenience to customers but also improve management efficiency for hotel businesses.

A Hotel Management and Check-in application is not simply a tool to help customers book rooms conveniently; it also extends to many different management features. Room management and booking features, customer care, and even check-in with facial recognition are all integrated to create a complete and flexible system. Instead of waiting in line at the reception desk and then using identification documents to check in or check out, customers can perform the check-in process in advance using the application on their mobile phone. This not only reduces waiting times but also creates a unique and modern experience for customers. In addition, the application regularly updates information about the hotel's latest promotions, events, and services. This helps increase interaction between customers and hotels, as well as create effective marketing opportunities.

In short, a Hotel Management and Check-in application is an important part of the management and business development strategy of businesses in the hotel sector. Integrating technology into the customer management process brings great benefits in terms of convenience and interaction, while enhancing the customer experience during every trip.

2.2 Overview of Facial Recognition

Facial recognition is an important identity authentication technology used in more and more mobile and embedded applications such as device unlocking, application login, and mobile payment. Nowadays, many mobile applications equipped with facial recognition technology, such as smartphone unlocking, run entirely offline. To achieve user-friendliness with limited computational resources, facial recognition models deployed locally on mobile devices are expected to be not only accurate but also small and fast. However, modern high-accuracy facial recognition models are built upon deep and large convolutional neural networks (CNNs) supervised by novel loss functions during the training stage. Such big CNN models require high computational resources and are not suitable for many mobile and embedded applications.

Several highly efficient neural network architectures, for example MobileNetV1 [6], NasNetMobile [7], and MobileNetV2 [8], have been proposed in recent years for common visual recognition tasks rather than face recognition. It is straightforward to use these common CNNs unchanged for face recognition, but this achieves only very inferior accuracy compared with state-of-the-art results, according to our experiments (see Table 4 in Chapter 4).

Figure 2.2 A typical face feature embedding CNN and the receptive field (RF)

The last 7x7 feature map is denoted FMap-end. RF1 and RF2 correspond to the corner unit and the center unit of FMap-end, respectively. The corner unit should be of less importance than the center unit. When a global depthwise convolution (GDConv) is used as the global operator, for a fixed spatial position the norm of the weight vector consisting of the GDConv weights across all channels can be considered the spatial importance. We show that, after training, GDConv learns very different importances at different spatial positions.
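The idea of reading per-position weight norms as spatial importances can be sketched in a few lines of NumPy. This is an illustrative toy, not the thesis code: the 7x7 spatial size comes from FMap-end above, while the 512-channel count and the random weights are assumptions.

```python
import numpy as np

# Hypothetical GDConv kernel for a 7x7x512 feature map:
# one learned weight per (row, col, channel), no sliding window.
rng = np.random.default_rng(0)
gdconv_weights = rng.normal(size=(7, 7, 512))

# Spatial importance at each position = L2 norm of the 512-dim
# weight vector across channels at that position.
importance = np.linalg.norm(gdconv_weights, axis=2)  # shape (7, 7)

# GDConv output: per-channel weighted sum over spatial positions,
# giving one embedding vector for the whole face image.
feature_map = rng.normal(size=(7, 7, 512))
embedding = (feature_map * gdconv_weights).sum(axis=(0, 1))  # shape (512,)

print(importance.shape, embedding.shape)
```

After training, a map like `importance` would typically show larger values near the center unit (RF2) than the corner unit (RF1), which is exactly the "equal weights or not?" question the figure raises.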

2.3 Related Work

In today's technological age, mobile applications have become an indispensable part of daily life, especially in the tourism and hospitality sectors. Using mobile applications to book hotel rooms on the Android platform has become increasingly popular, bringing convenience and flexibility to users. In this section, we review some work related to hotel booking applications on the Android platform, especially those developed using the Java programming language.

One example is the dedicated Android application of Barcelo Hotel Group (2020), used in European and South American countries. Barcelo Hotel Group is a large international hotel group operating in many destinations around the world. Headquartered in Palma de Mallorca, Spain, it is known for managing and operating many high-end hotels and resorts. The group offers a wide range of accommodation services, including hotels, resorts, and luxury options, and is present in famous tourist destinations throughout Asia, Europe, America, and Africa. Barcelo locations are often in prime spots near important tourist areas and feature modern design and high-end amenities. Beyond accommodation, the group focuses on the customer experience through amenities such as restaurants, spas, swimming pools, meeting rooms, and entertainment activities, and runs a membership program to encourage customer loyalty. With its global scale and strong reputation in the travel industry, Barcelo Hotel Group plays a key role in providing high-quality accommodation experiences to travelers around the world. Its booking application provides convenient online booking capabilities for users. Common features include:

+ Search and Book: Users can search for Barcelo hotels, resorts or accommodations in many locations around the world and make the booking process online.

+ Secure Payment: A secure payment system helps users pay for their rooms conveniently, often through secure online payment methods such as credit cards.

+ Hotel & Resort Information: The app can provide detailed information about hotels, including images, amenities, customer reviews and descriptions of services.

+ Special Offers: Users can receive information about special offers, promotions or discounts when booking through the application.

+ Customer Care: Some apps have a customer care feature, allowing users to contact hotel staff directly, ask questions or request additional services.

+ Facial Recognition: Users can check in/check out and enter gated facilities (playground, gym, pool) by face instead of using identification documents such as ID cards or passports.

Another example is B&B HOTELS (released 2021) in European countries. Common features include:

+ Search and Book: Users can easily search for B&B HOTELS hotels in many locations and make online reservations through the application.

+ Hotel Information: The app provides detailed information about hotels, including images, amenities, customer reviews and descriptions of services.

+ Secure Payment: Secure payment system helps users pay for rooms conveniently and securely.

+ Manage Reservations: Users can manage and edit reservation information, including changing arrival dates, number of people, and special requests.

+ Special Offers: The application may notify users about special offers, promotions or discounts.

+ Facial Recognition: Users can check in/check out by face instead of using identification documents such as ID cards or passports.

In Vietnam, Vingroup also has a booking and check-in application that uses facial recognition: the My Vinpearl application. All 43 hotel, resort, and entertainment establishments in Vinpearl's system across Vietnam will deploy facial recognition technology in the near future. Visitors will then no longer have to worry if they do not bring personal documents; they will still be welcomed at the check-in counter. The technology also allows features such as opening the room door, shopping at the store system, and payment to be integrated into one operation. As the first hotel, resort, and entertainment system to apply artificial intelligence in management and operation, Vinpearl has helped Vietnam's tourism and hotel industry take another breakthrough step to affirm its position and brand in the international market and keep up with the global smart-travel trend. The facial recognition technology applied at Vinpearl has five advantages: recognition in just one second, a large-scale data processing system capable of recognizing millions of faces, flexible real-time security alerts, near-absolute accuracy, and customer information security at the highest level. Facial recognition technology identifies and authenticates personal identity by comparing a digital image or video frame with faces stored in a database, matching facial features and other biometric factors to determine whether there is a match.

However, due to objective factors such as the COVID-19 pandemic and weak marketing, these applications have not yet been able to stand out the way today's established applications have.

Tuning deep neural architectures to strike an optimal balance between accuracy and performance has been an area of active research for the last several years [8]. For common visual recognition tasks, many efficient architectures have been proposed recently [6, 7, 8, 14]. Some efficient architectures can be trained from scratch. For example, SqueezeNet [14] uses a bottleneck approach to design a very small network and achieves AlexNet-level [15] accuracy on ImageNet [16, 17] with 50x fewer parameters (i.e., 1.25 million). InceptionV3 shows equivalent accuracy when trained under the two MegaFace protocols, where the training dataset size is taken into consideration (small if it has fewer than 0.5 million images, large otherwise); it excelled, showing up to 98% verification accuracy when the Facescrub dataset is used as the probe set. MobileNetV1 [6] uses depthwise separable convolutions to build lightweight deep neural networks, one of which, i.e., MobileNet-160 (0.5x), achieves 4% better accuracy on ImageNet than SqueezeNet at about the same size. ShuffleNet [7] utilizes pointwise group convolution and channel shuffle operations to reduce computation cost and achieve higher efficiency than MobileNetV1. The MobileNetV2 [8] architecture is based on an inverted residual structure with linear bottlenecks and improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks.
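The parameter savings of the depthwise separable convolutions mentioned above can be checked with simple arithmetic. The layer shape below (3x3 kernel, 128 to 256 channels) is a hypothetical example, not taken from any of the cited models.

```python
# Parameter counts for one conv layer mapping c_in -> c_out channels
# with a k x k kernel (biases omitted), illustrating why depthwise
# separable convolutions (MobileNetV1-style) are so much cheaper.

def standard_conv_params(c_in, c_out, k):
    # Every output channel has its own k x k x c_in filter.
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv mixes channels
    return depthwise + pointwise

# Example layer: 3x3 kernel, 128 -> 256 channels.
std = standard_conv_params(128, 256, 3)        # 294,912 parameters
sep = depthwise_separable_params(128, 256, 3)  # 33,920 parameters
print(std, sep, round(std / sep, 1))
```

For this layer the separable form needs roughly 8.7x fewer parameters, which is the basic mechanism behind the lightweight networks surveyed here.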

The NasNetMobile [18] model, an architecture-search result obtained with reinforcement learning, has a much more complex structure and much longer actual inference time on mobile devices than MobileNetV1, ShuffleNet, SqueezeNet, MobileNetV2, etc. However, these lightweight basic architectures are not very accurate for facial recognition when trained from scratch; we have verified this experimentally, as shown in Table 4 in Chapter 4.

Accurate lightweight architectures specifically designed for facial recognition have rarely been researched. [19] presents a light CNN framework to learn a compact embedding on large-scale face data, in which the Light CNN-29 model achieves 99.33% facial recognition accuracy on LFW with 12.6 million parameters. Compared with MobileNetV1, Light CNN-29 is not lightweight for mobile and embedded platforms, and Light CNN-4 and Light CNN-9 are much less accurate than Light CNN-29. [20] proposes ShiftFaceNet, based on the ShiftNet-C model with 0.78 million parameters, which only achieves 96.0% facial recognition accuracy on LFW. In [5], an improved version of MobileNetV1, namely LMobileNetE, achieves facial recognition accuracy comparable to state-of-the-art big models, but LMobileNetE is actually a big model with a 112 MB model size rather than a lightweight one. All of the above models are trained from scratch.

Another approach to obtaining lightweight facial recognition models is compressing pretrained networks by knowledge distillation [21]. In [22], a compact student network (denoted MobileID), trained by distilling knowledge from the teacher network DeepID2+ [24], achieves 97.32% accuracy on LFW with a 4.0 MB model size. In [6], several small MobileNetV1 models for facial recognition are trained by distilling knowledge from the pretrained FaceNet [23] model, and only the facial recognition accuracy on the authors' private test dataset is reported.
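The knowledge-distillation objective used in this line of work can be sketched as a cross-entropy between temperature-softened teacher and student outputs. This is a generic, simplified sketch with made-up logits; it is not the training code of [22] or [6].

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between the teacher's softened distribution and the
    # student's: the core objective of knowledge distillation.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(-np.sum(p_t * np.log(p_s + 1e-12)))

teacher = [8.0, 2.0, -1.0]        # hypothetical big-model logits
close_student = [7.5, 2.2, -0.8]  # mimics the teacher well
poor_student = [-1.0, 2.0, 8.0]   # disagrees with the teacher

print(distillation_loss(close_student, teacher))
print(distillation_loss(poor_student, teacher))
```

A student whose logits track the teacher's gets a much lower loss, which is what lets a small mobile network inherit the behavior of a large pretrained one.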

MobileFaceNet, an efficient face recognition model tailored for mobile devices, has garnered significant attention in recent research. The paper "MobileFaceNets: Efficient CNNs for Accurate Real-Time Face Verification on Mobile Devices" (2018) introduced the MobileFaceNet model with a focus on optimizing it for on-device face verification while ensuring high performance. Another noteworthy study, "MobileFaceNet: Deep Learning-based Face Recognition on Mobile Devices" (2018), explores the application of MobileFaceNet to face recognition on mobile devices, emphasizing both performance and computational efficiency. The integration of MobileFaceNet with Multi-task Cascaded Convolutional Networks (MTCNN) is addressed in "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN) with MobileFaceNet for Mobile Applications" (2019), which focuses on combining MobileFaceNet with MTCNN for face detection and alignment in mobile applications. Furthermore, "MobileFaceNet: A Fast and Accurate Network for Face Recognition in Mobile Environments" (2019) delves into the design of MobileFaceNet, emphasizing the balance between speed and accuracy in mobile environments. These studies collectively provide valuable insight into the design, applications, and optimization strategies of MobileFaceNet for deployment on mobile devices, and contribute to a comprehensive understanding of the model's significance in mobile face recognition.

3.1 Android Studio

The most well-known operating system created by Google for use on portable electronics like smartphones and tablets is Android. It can be found on smartphones made by a number of different companies, giving users more options for device design and price. The demand for iOS and Android app developers has expanded dramatically over the past several years, along with the growth of mobile usage. As the world becomes more digital, many businesses look for remote Android developers for their development projects because it saves costs.

This IDE's IntelliJ IDEA capabilities allow for quick code completion and immediate workflow evaluation. Android Studio offers features including code push for modifications and an excellent code editor for efficient coding. By allowing developers to push code and make rapid changes without completely restarting the app, Android Studio lets developers quickly incorporate changes. This guarantees great flexibility for making minor app modifications while the app is still in use. One of Android Studio's main benefits, speedier programming, is made possible by its user-friendly code editor. It also provides cutting-edge refactoring, code completion, and code analysis. The emulator included with Android Studio helps launch the full app more quickly than an actual device. The emulator can simulate a variety of hardware capabilities such as GPS, multiple touch inputs, and motion and acceleration sensors, enabling you to test the app across a variety of devices, including phones, tablets, Android Wear, and Android TV [24].

3.2 Java Programming Language

Due to its user-friendly nature and proven effectiveness, Java was selected as the programming language for this project. The language's long-standing role in mobile development, and in Android application development in particular, further influenced this decision.

Advantages to choosing Java for Android application development:

Java is known for its "write once, run anywhere" principle, making it platform-independent. Code written in Java can be executed on any device that has a Java Virtual Machine (JVM), providing flexibility in deployment.

Java has a vast and active developer community. This means extensive resources, libraries, and community support are available, facilitating problem-solving and knowledge sharing.

Java offers a robust ecosystem with a wide range of libraries and frameworks that expedite the development process. Android itself relies heavily on Java for its standard libraries.

Java provides built-in security features, which is crucial for mobile applications, especially considering the sensitive nature of data handled by many Android apps.

Java's object-oriented nature promotes code organization, reusability, and maintainability, which are essential aspects of large-scale Android app development.

Java has been the official language for Android development since its inception This official support means that the majority of Android applications are written in Java, and there is a wealth of documentation and resources available.

Java's performance, while not the absolute fastest, is more than sufficient for most Android applications. Additionally, advancements in the Java Virtual Machine (JVM) contribute to optimized execution speed.

Java seamlessly integrates with other languages and technologies, allowing developers to incorporate diverse functionalities into their Android applications.

In summary: platform independence, a large developer community, a strong ecosystem, security features, support for object-oriented programming, official status for Android development, good performance, and easy integration. These are the reasons Java was chosen as the programming language for this project.

3.3 Mobile Application

The creation of mobile apps is similar to the creation of web applications, but mobile apps are platform-dependent, which means that an app created exclusively for Android cannot run in an iOS environment. The mobile development framework shown in Figure 3.2 is used to design, develop, and implement mobile apps. This framework was also used for the design and development of this project.


Figure 3.2 Mobile development framework (Kulathumani, 2015) [26]

3.4 Firebase

Firebase is a multi-purpose mobile and web application development platform. It combines the cloud with Google's server system to focus on two main objectives:

+ Develop & test your app: develop and test designed apps.

+ Grow & engage your audience: analyze data and optimize the admin & user experience.

Firebase provides us with simple, important, and cross-platform APIs for managing and using databases.

Firebase has many services suitable for programmers, below are a few services that hotel room management applications use:

+ Firebase Authentication: allows user authentication through many means such as email, phone number, and social services like Google, Facebook, etc.


Figure 3.3 Sign-in method of Application

+ Firestore and Realtime Database: provides a real-time database to store and synchronize data between devices. This application uses Firestore.


Figure 3.4 Cloud Firestore of Application

+ Firebase Storage: Cloud storage service for storing and managing files, images, and videos.

Convolutional neural networks (CNNs) basically classify images into groups, cluster them according to how similar they are, and perform object detection with the aid of artificial neural networks. A convolutional neural network uses the image's data to analyze the image as a tensor, a matrix of integers with additional dimensions, and performs convolution operations over that representation [28].

There are many models created for image recognition in general and facial recognition in particular. They have existed for some time, such as MobileNetV1, MobileNetV2, InceptionV3, NasNetMobile, etc. Prominent among them is the CNN model MobileFaceNet. MobileFaceNet uses fewer than 1 million parameters and is specifically designed for highly accurate real-time facial recognition on mobile and embedded devices. Under the same testing conditions, MobileFaceNet achieves significantly superior accuracy as well as twice the actual speed of MobileNetV2, and earlier weaknesses have been well overcome.

Overall, to perform face detection optimally, this application uses MobileFaceNet, an extremely efficient convolutional neural network that achieves significantly improved performance compared to previous state-of-the-art mobile CNNs.

There are many plugins used for the model in this implementation but some essential plugins are as follows:

The Google-developed TensorFlow library is used to perform rapid mathematical computation. The most recent Python package can be used to directly create deep learning models. This AI/ML library, which uses neural networks, is a type of math library. The TensorFlow pip command makes a variety of deep learning models available.

TensorFlow supports data augmentation before the model training process ever begins. Additionally, it is utilized to optimize the algorithm's performance before downloading ImageNet's pre-trained weights. TensorFlow is useful in this research since it helps identify faces in real-time video (webcam) and allows graphical data representations. It is also applicable to a camera on a mobile device.
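As a minimal illustration (an assumption for this report, not the project's actual training code), a horizontal-flip data augmentation over an image array can be sketched with NumPy:

```python
import numpy as np

# Minimal data-augmentation sketch: a horizontal flip of an image array.
# Values and shapes here are illustrative placeholders.
image = np.arange(12).reshape(2, 2, 3)        # a tiny 2 x 2 RGB "image"
flipped = image[:, ::-1, :]                   # flip along the width axis
augmented_batch = np.stack([image, flipped])  # original + augmented copy
assert augmented_batch.shape == (2, 2, 2, 3)
```

In real training pipelines, such flips (along with crops and brightness changes) multiply the effective amount of training data without collecting new images.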

TensorBoard is a tool that provides the measurements and visualizations needed during a machine learning workflow. It allows you to track experimental indicators such as loss and accuracy, visualize the model graph, and project embeddings into a low-dimensional space.

The TensorFlow Estimator API has the following advantages:

> Estimator-based models can run on a local host or in a distributed multi-server environment without changing the model. In addition, an Estimator-based model can run on a CPU, GPU, or TPU without recoding the model.

> The Estimator provides a securely distributed training loop that controls the following methods and timings:
+ Load data
+ Handle exceptions
+ Save summaries for TensorBoard
+ Create checkpoint files and recover from failures

When using an Estimator to create an application, it is necessary to separate the data input pipeline from the model. This separation simplifies experimentation with different datasets.

ML Kit is a Google service that provides a set of tools and APIs (Application Programming Interfaces) for integrating machine learning features into mobile applications. ML Kit helps developers leverage the power of machine learning without requiring in-depth knowledge of the field.

ML Kit's contribution to facial recognition:

> Face Detection: ML Kit provides a face detection API, allowing mobile apps to easily identify and localize faces in photos or videos.

> Face Contour Detection: In addition to face detection, ML Kit also supports the recognition of facial contours, helping in identifying detailed facial structures and features.

> Text Recognition: Some facial recognition applications can incorporate text recognition from images to extract information from photos containing faces, names, addresses, etc.

> Cloud-Based Face Recognition: ML Kit also supports cloud-based facial recognition, helping applications access and use powerful machine learning models available on the Google Cloud Platform.


NumPy is a package for multidimensional arrays that facilitates complex mathematical operations. NumPy can be used to execute array operations related to mathematics, such as algebraic, statistical, and trigonometric routines. The image is transformed into a matrix, and the convolutional neural network analyzes the image in this matrix form. An image is resized to 224 x 224 pixels during the pre-processing stages, the image's annotations then adopt a NumPy array style, and finally the dataset contains the precise labels for each image. SciPy was also developed on top of NumPy; it provides higher performance utilizing NumPy arrays and is required for various scientific and engineering tasks.

Utils is a collection of functions and classes within the Python library that mostly supports the convolutional neural network's implementation. TensorFlow's Utils library will be used for this project.

OpenCV is used to analyze a wide variety of images and videos, covering face recognition, object detection, image editing, advanced robotic vision, optical character recognition, and much more. Image processing is carried out through OpenCV, whose main focus is real-time computer vision. Python scripts built for this model can test a newly trained recognition classifier on any webcam feed, pictures, or videos; these scripts make use of the OpenCV Python library.

- Scikit-learn (sklearn) is an open-source library in the Python programming language, designed to assist in building and deploying machine learning models. It provides many tools and algorithms for tasks such as classification, regression, clustering, dimensionality reduction, and many other machine learning tasks.

- MXNet is a popular open-source deep learning library developed by the Apache Software Foundation. It provides a robust set of tools for building, training, and deploying machine learning models, especially those used in computer vision.

- Matplotlib provides plotting functions for the Python programming language. In this project, it will be used to draw a bounding box indicating the image name and the evaluation area; the bounding box displays the detected object's name in the evaluation area of the image. Matplotlib provides an object-oriented application programming interface, and NumPy is one of its mathematical extensions.

- Pandas is a Python package that provides fast, flexible and expressive data structures that allow you to work with "relational" or "labeled" data easily and intuitively It aims to be a basic high-level building block for practical real-world data analysis in Python In addition, it has the broad goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.
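As a small illustration of the distance-based matching these libraries support, two face feature vectors can be compared by cosine similarity. The 128-dimensional size and the random data below are assumptions for illustration, not values from the project:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
emb_a = rng.random(128)                  # embedding of face A (illustrative)
emb_b = emb_a + 0.01 * rng.random(128)   # a near-duplicate embedding of face A
assert cosine_similarity(emb_a, emb_b) > 0.99   # same person: high similarity
```

In practice, a threshold on this similarity (or on a distance) decides whether two embeddings belong to the same person.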

TensorFlow Lite

TensorFlow Lite is a compact and optimized version of the TensorFlow machine learning library, specifically developed to run on resource-constrained mobile and IoT (Internet of Things) devices. TensorFlow Lite is an important tool that makes mobile machine learning applications flexible, efficient, and easy to deploy.

Figure 3.6 Functional execution flow for TensorFlow Lite models in Android

- TensorFlow Lite is designed to support the deployment of machine learning models on mobile devices, smartwatches, embedded devices, and other IoT devices.

- It helps reduce the size of the model and enhance computational performance, suitable for the limited resources of mobile devices.

- The machine learning model is packaged and deployed in the TFLite format, which is reduced in size compared to the standard TensorFlow SavedModel format.

- This format is optimized for running on mobile processors and supports a wide variety of devices. It can be integrated into mobile applications to perform tasks related to facial recognition, such as face verification, or other image processing scenarios on devices with limited resources.

- TensorFlow Lite supports the Android and iOS platforms, helping developers build mobile machine learning applications. In this project, it is used on the Android operating system.

- It is also available on several embedded platforms and IoT devices.

- TensorFlow Lite is optimized to run models quickly on devices with limited resources and low latency.

- It provides easy-to-use integration tools for adding machine learning models to mobile applications. In this project, it is combined with the MobileFaceNet model to provide computationally and resource-efficient facial recognition on mobile devices.

There are various ways to provide a run-time environment for executing models in an Android app. These are the preferred options:

- Standard TensorFlow Lite run-time environment

- Google Play services run-time environment for TensorFlow Lite (Beta)

Generally, the standard TensorFlow Lite run-time environment is recommended, as it is the more adaptable environment for running models on Android. The run-time environment offered by Google Play services is more space-efficient than the standard environment because it is loaded from Google Play store resources rather than bundled in a specific app, though some advanced use cases require customization of the model run-time environment. To access these run-time environments in an Android app, the TensorFlow Lite development libraries must be added to the app development environment.

There are two main APIs that can be used to integrate TensorFlow Lite machine learning models into the Android app:

- TensorFlow Lite Task API
- TensorFlow Lite Interpreter API

The Interpreter API provides classes and methods for running inferences with existing TensorFlow Lite models. The TensorFlow Lite Task API wraps the Interpreter API and delivers a high-level programming interface for conducting common machine learning tasks on visual, audio, and text data.

a. Obtain models

Running a model in an Android app requires a TFLite-format model. Pre-built models can be used, or a model can be built with TensorFlow and converted to the Lite format.

b. Handle input data

Any data passed into the ML model must be in the form of a tensor with a specific data structure, known as the shape of the tensor. To process data with a model, app code must transform data from its native format, such as image, text, or audio data, into a tensor of the required shape for the individual model.

The TensorFlow Lite Task library provides data handling logic for transforming visual, text, and audio data into tensors with the correct shape to be processed by a TensorFlow Lite model.

c. Run inferences

Processing data through a model to generate a prediction result is known as running an inference. Running an inference in an Android app requires a TensorFlow Lite run-time environment, a model, and input data.

The speed at which a model generates an inference on a particular device depends on the size of the data processed, the complexity of the model, and the available computing resources, such as memory and the CPU or specialized processors known as accelerators. Machine learning models can run swiftly on such processors, including graphics processing units (GPUs) and tensor processing units (TPUs), using TensorFlow Lite hardware drivers known as delegates.

d. Handle output results

Models produce prediction results as tensors, which must be handled by the Android app by taking action or displaying a result to the user. Model output results can be as simple as a number corresponding to a single result for image classification, or much more complex, such as multiple bounding boxes for several classified objects in an image, with prediction confidence ratings between 0 and 1.
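The input and output handling described above can be sketched with NumPy. This is an illustration of the data flow, not the TensorFlow Lite API itself; the tensor shape (1, 112, 112, 3) and the subtract-127.5/divide-by-128 normalization follow the preprocessing described later for the face embedding model, and the score values are placeholders:

```python
import numpy as np

# Handle input data: reshape a 112 x 112 RGB image into the tensor shape
# the model expects, assumed here to be (1, 112, 112, 3) float32, with each
# pixel normalized by subtracting 127.5 and dividing by 128.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(112, 112, 3)).astype(np.float32)
input_tensor = np.expand_dims((image - 127.5) / 128.0, axis=0)
assert input_tensor.shape == (1, 112, 112, 3)
assert input_tensor.min() >= -1.0 and input_tensor.max() < 1.0

# Handle output results: raw classifier scores become confidence ratings
# between 0 and 1 via softmax, and the best class is selected.
raw_scores = np.array([2.0, 0.5, 1.0])
exp = np.exp(raw_scores - raw_scores.max())
confidences = exp / exp.sum()
best = int(np.argmax(confidences))
assert abs(float(confidences.sum()) - 1.0) < 1e-6
```

In the Android app itself, the same steps run in Java around an `Interpreter.run()` call, with the tensors held in byte buffers or arrays.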

CHAPTER 4 METHODOLOGY

4.2 The Weakness of Common Mobile Networks for Facial Recognition

In most of the recent state-of-the-art mobile networks proposed for common visual recognition tasks, it has been observed that they have a global average pooling layer (denoted GAPool). For recognition, and especially for facial recognition, some researchers have observed that CNNs with GAPool layers (such as MobileNetV1, MobileNetV2, InceptionV3, etc.) are less accurate than those without GAPool. However, no theoretical analysis of this phenomenon has yet been provided. Therefore, in this study, we perform a simple analysis of this phenomenon in terms of receptive field theory [24].

A typical deep facial recognition pipeline includes preprocessing face images, extracting face features with a trained deep model, and matching two faces by the similarity or distance of their features. Following the preprocessing method in [10], we use MTCNN to detect faces and five facial landmarks in images. We then align the faces by similarity transformation according to the five landmarks. The aligned face images are of size 112 x 112, and each pixel in the RGB images is normalized by subtracting 127.5 and then dividing by 128. Finally, a face feature embedding CNN maps each aligned face to a feature vector, as shown in Figure 2.2 in Chapter 2. Without loss of generality, we use MobileNetV2 as the face feature embedding CNN in the following discussion. To preserve the same output feature map sizes as the original network with 224 x 224 input, we use stride = 1 in the first convolutional layer instead of stride = 2, since the latter setting leads to very poor accuracy. So, before the global average pooling layer, the output feature map of the last convolutional layer, denoted FMap-end for convenience, has spatial resolution 7 x 7. Although the theoretical receptive fields of the corner units and the central units of FMap-end are of the same size, they are at different positions of the input image.

The receptive fields' center of FMap-end's corner units is in the corner of the input image, and the receptive fields' center of FMap-end's central units is in the center of the input image, as shown in Figure 2.1 in Chapter 2. According to the effective receptive field theory for deep convolutional neural networks, pixels at the center of a receptive field have a much larger impact on an output, and the distribution of impact within a receptive field on the output is nearly Gaussian. The effective receptive field sizes of FMap-end's corner units are therefore much smaller than those of FMap-end's central units. When the input image is an aligned face, a corner unit of FMap-end carries less information about the face than a central unit. Consequently, different units of FMap-end are of different importance for extracting a face feature vector.

In MobileNetV2, the flattened FMap-end is unsuitable to be used directly as a face feature vector since its dimension, 62720, is too high. A natural choice is to use the output of the global average pooling (GAPool) layer as the face feature vector, which achieves inferior recognition accuracy in many researchers' experiments [19, 10] as well as ours (see Table 4 in Chapter 4). The global average pooling layer treats all units of FMap-end with equal importance, which is unreasonable according to the above analysis. Another popular choice is to replace the global average pooling layer with a fully connected layer that projects FMap-end to a compact face feature vector, but this adds a large number of parameters to the whole model. Even when the face feature vector has a low dimension of 128, the fully connected layer after FMap-end brings an additional 8 million parameters to MobileNetV2. We do not consider this choice since small model size is one of our pursuits.
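The parameter count quoted above can be checked directly: projecting the flattened 7 x 7 x 1280 feature map to a 128-dimensional vector with a fully connected layer costs 62720 x 128 weights (bias terms ignored for simplicity):

```python
# Checking the figures quoted above (bias terms ignored for simplicity).
fmap_end = 7 * 7 * 1280          # flattened FMap-end dimension
fc_params = fmap_end * 128       # fully connected projection to 128-D
assert fmap_end == 62720
assert fc_params == 8028160      # ~8 million additional parameters
```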

4.3 Global Depthwise Convolution

To treat different units of FMap-end with different importance, we replace the global average pooling layer with a global depthwise convolution layer (denoted GDConv). A GDConv layer is a depthwise convolution [6] layer with kernel size equal to the input size, padding = 0, and stride = 1. The output of the global depthwise convolution layer is computed as:

G_m = Σ_{i,j} K_{i,j,m} · F_{i,j,m}

where F is the input feature map of size W x H x M, K is the depthwise convolution kernel of size W x H x M, G is the output of size 1 x 1 x M, the m-th channel in G has only one element G_m, (i, j) denotes the spatial position in F and K, and m denotes the channel index.

Global depthwise convolution has a computational cost of:

W · H · M multiply-adds (MAdds)

When used after FMap-end in MobileNetV2 for face feature embedding, the global depthwise convolution layer of kernel size 7 x 7 x 1280 outputs a 1280-dimensional face feature vector with a computational cost of 62720 MAdds (i.e., the number of operations measured by multiply-adds [8]) and 62720 parameters. Let MobileNetV2-GDConv denote MobileNetV2 with a global depthwise convolution layer. When both MobileNetV2 and MobileNetV2-GDConv are trained on CASIA-WebFace [10] for facial recognition with the ArcFace loss, the latter achieves significantly better accuracy on LFW (see Table 2). The global depthwise convolution layer is an efficient structure for the design of MobileFaceNet.
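The GDConv computation above can be sketched with NumPy; the random feature map and kernel values are placeholders, while the 7 x 7 x 1280 shape matches the discussion:

```python
import numpy as np

# Global depthwise convolution (GDConv) over FMap-end:
#   G_m = sum_{i,j} K[i, j, m] * F[i, j, m]
W, H, M = 7, 7, 1280
rng = np.random.default_rng(0)
F = rng.random((W, H, M)).astype(np.float32)  # input feature map (FMap-end)
K = rng.random((W, H, M)).astype(np.float32)  # GDConv kernel
G = np.sum(K * F, axis=(0, 1))                # 1 x 1 x M output, shape (M,)
assert G.shape == (1280,)
assert K.size == W * H * M == 62720           # parameters = MAdds = 62720
```

Because each output channel depends only on its own input channel, the layer learns a per-position weighting of FMap-end at very low cost.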

Table 3 MobileFaceNet architecture for feature embedding

(remaining table rows omitted in extraction; the final layers are a linear GDConv7x7 over the 7² x 512 feature map, producing a 1² x 512 output, followed by a linear conv1x1)

We use almost the same notation as MobileNetV2 [8]. Each line describes a sequence of operators, repeated n times. All layers in the same sequence have the same number c of output channels. The first layer of each sequence has a stride s, and all others use stride 1. All spatial convolutions in the bottlenecks use 3 x 3 kernels. The expansion factor t is always applied to the input size. GDConv7x7 denotes GDConv with 7 x 7 kernels.

4.4 MobileFaceNet Architectures

Now we describe our MobileFaceNet architectures in detail. The residual bottlenecks proposed in MobileNetV2 [8] are used as our main building blocks. For convenience, we use the same conceptions as those in [8]. The detailed structure of our primary MobileFaceNet architecture is shown in Table 1. In particular, the expansion factors for bottlenecks in our architecture are much smaller than those in MobileNetV2. We use PReLU as the non-linearity, which is slightly better for facial recognition than ReLU (see Table 2). In addition, we use a fast downsampling strategy at the beginning of the network, an early dimension-reduction strategy at the last several convolutional layers, and a linear 1 x 1 convolution layer following a linear global depthwise convolution layer as the feature output layer. Batch normalization is utilized during training, and batch normalization folding is applied before deployment.

4.5 Experiments

From MobileFaceNet-M, removing the 1 x 1 convolution layer before the linear GDConv layer produces the smallest network, called MobileFaceNet-S. The effectiveness of these MobileFaceNet networks is demonstrated by the experiments in the section below.

In this section, we first describe the training settings of our MobileFaceNet models and our baseline models. We then compare the performance of our trained facial recognition models with some previously published facial recognition models, including several state-of-the-art big models.

Figure 4.1 Labeled Faces in the Wild (LFW) dataset

Labeled Faces in the Wild (LFW) is a famous dataset in the field of facial recognition, created and released by the computer vision research group at the University of Massachusetts, Amherst. This dataset contains photos of more than 5,000 celebrities from around the world, with more than 13,000 facial images, focusing on diversifying the angles, lighting, and other variations of facial images. LFW is widely used to evaluate the performance of facial recognition algorithms under real-world conditions and is an important testing standard in facial recognition research. As can be seen from the following image, the human faces in the LFW dataset have a variety of poses and expressions as well as different lighting conditions. Some faces are even occluded, which greatly increases the difficulty of facial recognition.

Figure 4.2 Sample images of LFW dataset

An LFW dataset file with more than 2 GB of facial data will be used in this experiment.

Google released the network MobileNetV2 in 2018, mainly targeting mobile and embedded devices. As shown in Figure 4.3, the diagram of MobileNetV2 includes 17 IRMs; the inverted residual module (IRM) is the basic and key unit of MobileNetV2.


Figure 4.3 Diagram of MobileNetV2, which includes 17 inverted residual modules (IRMs)

First, a complete IRM performs a pointwise convolution to inflate the number of feature channels to six times the input number, and then applies a depthwise convolution and another pointwise convolution. The depthwise convolution changes the width and height of the input feature maps, while the second pointwise convolution reduces the number of feature map channels back to the input number. Furthermore, to improve model convergence speed and avoid gradient explosion, batch normalization (BN) layers are attached to each convolution layer. The ReLU6 (rectified linear unit 6) non-linear activation function increases the sparsity of the network and reduces the interdependence between parameters, and is thus widely used after BN layers. Nevertheless, the ReLU6 function leads to crucial information loss in the case of low-dimensional input. Hence, the feature maps are delivered directly to the next convolution layer without a ReLU6 after the second pointwise convolution layer and the BN layer.
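The channel flow through one inverted residual module can be traced with simple arithmetic. The input channel count of 24 and the stride-2 spatial size below are illustrative assumptions, not values from the diagram:

```python
# Channel/shape flow through one IRM with expansion factor 6 (illustrative).
c_in = 24
expanded = c_in * 6        # first pointwise conv inflates channels 6x
h, w = 56, 56
h2, w2 = h // 2, w // 2    # a stride-2 depthwise conv halves width/height
c_out = c_in               # second pointwise conv reduces channels back
assert (expanded, h2, w2, c_out) == (144, 28, 28, 24)
```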

Google also proposed InceptionV3 in 2015, which can likewise run on Android phones. It has three core modules, namely InceptionV3 module 1, InceptionV3 module 2, and InceptionV3 module 3, as shown in Figure 4.4. All three modules follow the same principle: increasing the depth and width of the network while incurring little computational cost.


Figure 4.4 Diagram of InceptionV3 and its three core parts: InceptionV3 module I (IM I), InceptionV3 module II (IM II), and InceptionV3 module III (IM III)

In these three modules, the n x n convolution kernel is split into two kernels, 1 x n and n x 1, which greatly reduces network parameters and notably improves detection speed. The last layer of these three modules is a filter concat layer, which contains the LRN (Local Response Normalization) layer and the DepthConcat layer, where the DepthConcat layer fuses the features extracted by the convolutional layers. Finally, InceptionV3 uses an average pooling layer and a dropout layer before the fully connected layer and appends a softmax layer to avoid the vanishing gradient problem. Every convolution layer is followed by a BN layer and ReLU non-linearity.
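The parameter saving from this kernel factorization is easy to verify: per input/output channel pair, an n x n kernel holds n² weights, while the 1 x n plus n x 1 pair holds only 2n:

```python
# Kernel factorization arithmetic for n = 7 (per input/output channel pair).
n = 7
full_kernel = n * n      # one n x n convolution kernel
factored = n + n         # a 1 x n kernel followed by an n x 1 kernel
assert (full_kernel, factored) == (49, 14)
```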

NasNetMobile is an efficient CNN architecture built from cardinal building blocks (cells) and improved with the aid of reinforcement learning. Each block performs only a few operations, such as separable convolutions and pooling, and is replicated many times depending on the required network capacity. The mobile version contains 12 cells with 5.3 million parameters and 564 million multiply-accumulate operations. NasNetMobile is a fairly straightforward and tiny pre-trained model. The overall arrangement of the convolutional network is pre-established by hand in this method; it consists of convolutional cells that are duplicated multiple times, where each cell has the same architecture but different weights. Two types of convolutional cells are needed to provide two key roles when constructing scalable architectures for an image of any scale:


Figure 4.5 Nasnet Normal and Reduction Cell Architecture

1. A convolution cell that returns a feature map of the same dimension.

2. A convolution cell that returns a feature map with the height and width of the feature map reduced by half.

Experts have suggested searching for components of the architecture on a smaller dataset and then transferring the resulting block to a larger dataset. Specifically, the search first scans CIFAR-10 for the best convolutional layer or cell, and then stacks copies of that cell for ImageNet.

A new regularization strategy, ScheduledDropPath, has been proposed, significantly improving the generalization of the NASNet model.

MobileFaceNet is a facial recognition model designed to work on mobile devices with limited resources. It is one of the important models in the field of mobile facial recognition and has achieved high performance. The main goal of MobileFaceNet is to optimize model size and computational resources so that it can run on resource-limited devices such as mobile phones, which is usually achieved by reducing the number of weights and parameters in the model. Despite its small size and lower resource use compared to larger models, MobileFaceNet still achieves good performance in facial recognition. For all of the above reasons, we choose it as the main CNN model for the application in this thesis.

We implemented it using the TensorFlow Lite format, a lightweight model format that runs on resource-limited devices like mobile phones.

So the asset "mobilefacenet_tflite" contains an optimized version of the MobileFaceNet model in TensorFlow Lite format, which can be integrated into mobile applications to perform face recognition tasks.

4.5.3 Training Settings & Accuracy Comparison on LFW


Figure 4.6 TensorFlow Lite libraries in app and models in Assets folder

The libraries implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly' and implementation 'org.tensorflow:tensorflow-lite-support:0.0.0-nightly' provide the TensorFlow Lite runtime and the TensorFlow Lite Android Support Library, which was developed to handle the inputs and outputs of TensorFlow Lite models and make the TensorFlow Lite interpreter easier to use.

TensorFlow Lite also supports multiple hardware accelerators through additional delegate libraries.
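Part of what the support library handles is converting camera pixels into the normalized float buffer the model expects. The (v - 127.5) / 128 scaling to roughly [-1, 1] below is a common MobileFaceNet convention and is assumed here rather than taken from the thesis code.

```java
// Plain-Java sketch of input preparation for a TensorFlow Lite face
// model: converting packed ARGB pixels into a normalized float buffer.
// The (v - 127.5) / 128 normalization is an assumed convention.
public class InputPreprocessor {
    // Convert one packed ARGB int pixel into three normalized floats (R, G, B).
    public static float[] normalizePixel(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return new float[] {
            (r - 127.5f) / 128f,
            (g - 127.5f) / 128f,
            (b - 127.5f) / 128f,
        };
    }

    // Flatten an array of packed pixels into the float buffer fed to the model.
    public static float[] toInputBuffer(int[] pixels) {
        float[] out = new float[pixels.length * 3];
        int i = 0;
        for (int p : pixels) {
            float[] rgb = normalizePixel(p);
            out[i++] = rgb[0];
            out[i++] = rgb[1];
            out[i++] = rgb[2];
        }
        return out;
    }

    public static void main(String[] args) {
        // A mid-gray pixel (0x80 per channel) normalizes to values near zero.
        float[] rgb = normalizePixel(0xFF808080);
        System.out.println(rgb[0] + " " + rgb[1] + " " + rgb[2]);
    }
}
```

On Android this conversion is typically done for each frame before calling the interpreter's run method.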

CHAPTER 5 SYSTEM IMPLEMENTATION

5.2 Database Design

5.2.1 Normalize Data

- UserData (name, email, password)
- RoomData (id, description, imageUrl, isAvailable, location, price, title)
- BookingData (bookingDays, customerEmail, endDate, id, imageUrl, price, roomID, roomTitle, startDate, status, totalPayment)
- RecognitionData (name, email, phoneNumber, roomID, roomTitle, service)

The database includes the UserData, RoomData, BookingData and RecognitionData tables, which store customer information, room information, booking information, and facial recognition information.

> Table UserData
- Function: Save information about customers
- List of attributes:

Name of attribute | Data type | Description
name | String | Full name of customer
email | String | E-mail of customer
password | String | Password of the account

> Table RoomData
- Function: Save information about hotel rooms
- List of attributes:

Name of attribute | Data type | Description
id | String | ID of room
description | String | Description of the room
imageUrl | String | Overall photo of room
isAvailable | Boolean | Whether the room is available
location | String | Location of the room
price | Number | Price of room
title | String | Title displayed for the room

> Table BookingData
- Function: Save information about a customer's booking
- List of attributes:

Name of attribute | Data type | Description
bookingDays | Number | Number of booked nights
customerEmail | String | E-mail of customer
id | String | ID of booking
imageUrl | String | Overall photo of room
price | Number | Price of booking (per night)
roomID | String | ID of the customer's room
roomTitle | String | Title displayed for the room
startDate | String | Check-in date
endDate | String | Check-out date
status | String | Status of booking
totalPayment | Number | Total payment price

> Table RecognitionData
- Function: Save information about facial recognition of customers/admins
- List of attributes:

Name of attribute | Data type | Description
name | String | Full name of customer
email | String | E-mail of customer
phoneNumber | String | Customer phone number
roomID | Number | ID of the customer's room
roomTitle | String | Customer's room type
service | String | Service used by the customer

For high-end hotel & resort chains, when customers use the hotel's services they are instructed to install the application, either directly at the hotel & resort or online through the hotel's homepage. After the Android application is installed on the phone, a shortcut icon is added to the menu so customers can access it conveniently. The application is launched by clicking on the newly created icon.

Figure 5.3 Android application shortcut icon

The application needs to be granted basic permissions when used by admins or customers:

+ Access rights for E-mail authentication.

+ Access rights to photo album.

+ Access rights to save data.

5.4 Application Interface

5.4.1 Sign Up & Sign In Page

Figure 5.4 First interface when starting the application


The hotel reception provides accounts for customers, but if they want to create more for relatives, they just need to click "VUI LÒNG ĐĂNG KÝ TẠI ĐÂY" ("please register here") to register an account.

Depending on the customer's choice, the application directs them to the sign-up or sign-in page as shown in Figure 5.5 below.


Figure 5.5 Account sign up and sign in page

When customers register a new account, they must fill in their full name, E-mail and password. The system then sends a verification e-mail, and the customer must open the link to verify their E-mail address. The customer's screen displays the content shown in Figure 5.6, and when they click on the text "liên kết" (link), they are redirected to the verification e-mail, also shown in Figure 5.6.


Information for all registered customer accounts is stored in Firebase Authentication. This system is managed and monitored by the Admin.
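Before handing the three sign-up fields to Firebase Authentication, the client can sanity-check them. The thesis does not show its validation code, so the rules below are assumptions: a non-empty name, a basic e-mail shape, and a 6-character minimum password matching Firebase's default policy.

```java
// Hypothetical client-side validation of the sign-up fields before
// calling Firebase Authentication. The specific rules (non-empty name,
// simple e-mail pattern, 6+ character password) are assumptions, not
// code from the thesis.
public class SignUpValidator {
    public static boolean isValid(String name, String email, String password) {
        if (name == null || name.trim().isEmpty()) return false;
        // One '@' separating non-empty parts, with a dot in the domain.
        if (email == null || !email.matches("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$")) return false;
        // Firebase Authentication rejects passwords shorter than 6 characters.
        return password != null && password.length() >= 6;
    }

    public static void main(String[] args) {
        System.out.println(isValid("Dang Hai Trang Phuc", "guest@gmail.com", "secret1")); // true
        System.out.println(isValid("", "guest@gmail.com", "secret1"));                    // false
    }
}
```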

Figure 5.7 Registered customer accounts stored in Firebase Authentication

As mentioned above, the application has 2 interfaces: Admin and Customer.


Figure 5.8 Main menu of admin and customer

Each interface has its corresponding functions; the Customer interface has more limited functions than the Admin interface.

Table 9 Main menu functions of each interface

Admin interface:

- "THÊM PHÒNG" button: Admin can add new rooms to the system; they will be displayed in the customer interface.
- "THÔNG TIN TẤT CẢ PHÒNG" button: Admin can edit the hotel's current room information, including deletion. Changes are displayed in real time on the customer interface.
- "YÊU CẦU ĐẶT PHÒNG" button: Admin receives customer booking requests and can accept or reject them.
- "QUẢN LÝ PHÒNG ĐANG ĐẶT" button: Admin can view information about currently booked rooms. When customers check out, Admin can click "check-out".
- "LỊCH SỬ ĐẶT PHÒNG" button: Admin can review the booking history of customers who have checked out, including service, date and time, name, and E-mail.
- "THÔNG TIN KHÁCH HÀNG" button: Admin can view information about all customers who have been using the hotel's application.

Customer interface:

- "YÊU CẦU ĐẶT PHÒNG" button: Customers can review information once they have successfully requested a room, and can cancel the booking.
- "THÔNG TIN ĐẶT PHÒNG" button: Customers can check successfully booked room information after the Admin has approved it.
- "DỊCH VỤ PHÒNG" button: Customers can use hotel services such as facial recognition, wifi, and the hotline.
- "LỊCH SỬ ĐẶT PHÒNG" button: Customers can review their booking history upon check-out, including service, date and time, name, and E-mail.

When customers start the application, they can view all available hotel rooms (which the Admin has added) and select a suitable one, then proceed to book a room as shown in Figure 5.9.


Figure 5.9 Viewing and booking interface

After selecting the date and number of nights for the booking, the system displays the message “Booking successful! Please check booking request to update details”.

Next, customers click on "Yêu cầu đặt phòng" to check the room information awaiting Admin approval. If they feel it is not suitable, they can cancel it as shown in Figure 5.10.


Figure 5.10 Booking and request page

After customers make a booking, the system receives the booking information and the Admin clicks on "Yêu cầu đặt phòng" to check and approve it. The booking data is stored in Firestore - Firebase as shown in Figure 5.11.

Figure 5.11 Requests page and "Firestore - Firebase" storing it

When the Admin accepts a customer's booking, the system displays the message "Xác nhận đặt phòng!". The "Quản lý phòng đang đặt" page then displays the room information: customer name, e-mail, room service, and check-in/check-out dates. After customers pay for their room, the Admin can click "CHECK-OUT" as shown in Figure 5.12.


Figure 5.12 Admin confirms customer's booking and manages it
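The screenshots in this section imply a booking lifecycle: "Yêu cầu" (requested) → "Đã duyệt" (approved by Admin) → "Đã trả phòng" (checked out). Enforcing that order in one place keeps the status field in Firestore consistent; the guard below is our sketch of such a rule, not code from the thesis.

```java
// Sketch of the booking lifecycle implied by the screenshots:
// requested -> approved -> checked out, with no other transitions.
// The status strings match those shown in the application's UI.
public class BookingStatus {
    public static final String REQUESTED   = "Yêu cầu";
    public static final String APPROVED    = "Đã duyệt";
    public static final String CHECKED_OUT = "Đã trả phòng";

    // Returns true when moving from 'from' to 'to' follows the lifecycle.
    public static boolean canTransition(String from, String to) {
        if (REQUESTED.equals(from)) return APPROVED.equals(to);
        if (APPROVED.equals(from))  return CHECKED_OUT.equals(to);
        return false; // checked-out bookings are final
    }

    public static void main(String[] args) {
        System.out.println(canTransition(REQUESTED, APPROVED));    // true
        System.out.println(canTransition(REQUESTED, CHECKED_OUT)); // false: needs approval first
    }
}
```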

Similarly, customers are notified that the booking has been accepted by the Admin, and the booked information is displayed on the "Thông tin đặt phòng" page as shown in Figure 5.13.


Figure 5.13 Admin confirms booking and customers view booked room information

After the customer has paid for the room, the Admin clicks the CHECK-OUT button and the system displays the message "Đã trả phòng". Paid rooms are displayed on the "Lịch sử đặt phòng" page and stored in Firestore - Firebase as shown in Figure 5.14.


Customer information and the rooms they have booked are displayed on the "Lịch sử đặt phòng" page of both the Admin and Customer interfaces, as shown in Figure 5.15.


Reviewing booking history helps customers make more suitable choices for their next visit.

In addition, Admin can also manage and compile customer check-in and check-out times to make it easier to arrange facilities and personnel.

Facial recognition is the most distinctive function of this app. In the Admin interface it is on the "Thông tin khách hàng" page, and in the Customer interface it is on the "Dịch vụ phòng" page, as shown in Figure 5.16.


Figure 5.16 Facial recognition in 2 interfaces: Admin and Customer

Overall, the two interfaces are relatively similar (as shown in Figure 5.17), but deeper use clearly reveals the differences between them.


To ensure security and proper functionality for Admins as well as Customers, the menu options of the two interfaces differ. Table 10 below details each function of the buttons in the menu.

Table 10 Detailed functions of each button

This button lets customers switch between the front and back camera of their phone.

When a face is in the camera view, this button lets the Admin or customers add that face to the application's recognition system.

Once a face has been detected, this button lets the Admin or customers fill in their information in the system. It is displayed to the right of the recognized face.


The face recognition information panel displays the following fields:

- Họ tên: full name of the recognized face
- Loại phòng: type of room they are using
- Số phòng: room number they are using
- Số điện thoại: their phone number
- E-mail liên hệ: E-mail to contact them
- Dịch vụ sử dụng: tick boxes so customers can select the service they are checking in to

After the facial recognition information has been filled in, this button verifies whether the recognized face is really you.

If correct, your information is displayed on the screen together with a success status line.

If incorrect, a message that the face cannot be identified is displayed, together with a failure status line.
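The verification step plausibly compares the live embedding against every stored face and either returns the best match or reports failure. The sketch below illustrates that lookup; the cosine-similarity metric and the threshold are assumptions, not code from the thesis.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the lookup behind the verification button: compare the
// live embedding against every stored embedding and return the best
// matching name, or null when nothing clears the (assumed) similarity
// threshold -- which would drive the success/failure status line.
public class RecognitionStore {
    private final Map<String, float[]> known = new LinkedHashMap<>();
    private final float threshold;

    public RecognitionStore(float threshold) { this.threshold = threshold; }

    public void register(String name, float[] embedding) { known.put(name, embedding); }

    // Returns the best-matching name, or null if no stored face is close enough.
    public String identify(float[] query) {
        String best = null;
        float bestSim = threshold;
        for (Map.Entry<String, float[]> e : known.entrySet()) {
            float sim = cosine(query, e.getValue());
            if (sim >= bestSim) { bestSim = sim; best = e.getKey(); }
        }
        return best;
    }

    private static float cosine(float[] a, float[] b) {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i]*b[i]; na += a[i]*a[i]; nb += b[i]*b[i]; }
        return dot / (float) (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        RecognitionStore store = new RecognitionStore(0.8f);
        store.register("Guest A", new float[]{1f, 0f});
        System.out.println(store.identify(new float[]{0.9f, 0.1f})); // Guest A
        System.out.println(store.identify(new float[]{0f, 1f}));     // null: no match
    }
}
```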

Depending on the interface, when the Admin or customers press this button, functional options are displayed: the face recognition list, list editing, current recognition storage, etc.

Admin menu options:
- "Danh sách nhận dạng khuôn mặt"
- "Chỉnh sửa danh sách nhận dạng"
- "Lưu trữ nhận dạng hiện tại"
- "Load nhận dạng đã lưu"
- "Nhận dạng khuôn mặt bằng ảnh"
- "Xóa hết lưu trữ nhận dạng"

Customer menu options:
- "Danh sách nhận dạng khuôn mặt"
- "Lưu trữ nhận dạng hiện tại"
- "Load nhận dạng đã lưu"

"Danh sách nhận dạng khuôn mặt": compiles a list of customers' facial recognitions, including detailed information such as name, room, phone number, E-mail, etc.

"Chỉnh sửa danh sách nhận dạng": the Admin can permanently delete the facial recognition of one or more customers.

"Lưu trữ nhận dạng hiện tại": after adding a customer's face to the system, the Admin can store that face by clicking this button.

"Load nhận dạng đã lưu": to make sure saved face recognitions work fully and smoothly, the Admin can click this button to load them from the system.

"Nhận dạng khuôn mặt bằng ảnh": performs facial recognition from images by uploading photos from the photo album to the system and then extracting the face for recognition.

"Xóa hết lưu trữ nhận dạng": the Admin can click this button to delete all customer facial recognitions saved in the system.

"Chế độ ADMIN": when the Admin clicks this button, Admin mode is turned on to perform tasks such as scanning faces both far and near.

*Note: this button is being tested.


All facial recognitions are displayed by name.

*Note: only the Admin can view detailed information for each facial recognition.

All facial recognitions are displayed as names with tick boxes so the Admin can delete any of them.

After clicking this button, the system asks the customer for permission to access their photo album and then lets them select photos for facial recognition.

After selecting a suitable image, facial recognition is performed as above by clicking on the + sign.

*Note: this function is only suitable for customers who cannot use their live face for specific and legitimate reasons.

As a result, this research has completed an Android application with 2 interfaces: Admin and Customer. As mentioned, once customers have downloaded the application, they work directly with the Admin: view room information, make reservations, view booked room information, view history, receive new notifications, and contact the hotel directly, etc. We tested booking, check-in and check-out with both interfaces, and both were 100% successful. This is the first step for the hotel to reach regular and potential customers.

Secondly, the application applies a face feature embedding CNN, namely MobileFaceNet, which is extremely efficient for real-time face identification, so that customers can check in instead of using many types of complicated identification documents. The research used the TensorFlow, TensorFlow Lite and ML Kit libraries to build the CNN model that solves the face recognition problem on the Android platform. In addition, Firebase solves the problem of storing the application's database, including customer information, room information, reservation information, etc. Overall, face recognition achieved a high accuracy rate of 99.28% on the LFW dataset at quite fast speed.

Although the actual evaluation results are generally favorable, there are still some cases of misidentification because the model's training data differs from reality in some respects. The application will therefore need further improvements.

Due to the development of information technology, many hotel and resort booking applications have become popular; however, there are very few applications dedicated to a single hotel. As mentioned, hotel & resort systems in Vietnam need their own room booking applications, and check-in using facial recognition, still undeveloped in this market, has been integrated into this application. To help hotel & resort systems have a separate application to easily notify, manage and contact customers, this research creates a mobile application that operates on the most widely used mobile platform, Android.

The application connects the Admin and Customers through registration, booking, check-in and check-out, and also helps customers use the services of the hotel system by checking in by face. The Android-based facial recognition application was created using Java as the programming language and Android Studio as the IDE and SDK. This means the app is platform-dependent: it can only be used on smartphones running Android. The application is well suited to large hotel & resort systems in Vietnam.

Currently, the application should only be deployed by large hotel & resort corporations in Vietnam with high investment capital, which shows that it needs adjustment to suit other hotels & resorts, both large and small. In addition, the hotel's internal application needs an online messaging section between customers and admins, and the interface must be more user-friendly. Furthermore, facial recognition errors should be minimized to save guests' time so they can use hotel services more comfortably.

The application is specifically created and developed to work only on Android-based devices for room management/booking and facial recognition. The program is therefore still platform-dependent, so we see the need to make it compatible with other smartphone operating systems, such as Windows Phone, Symbian OS and iOS, among many others. Additionally, the app should include a cloud-based backup feature in case a customer's device is lost or malfunctions, allowing them to retrieve all of their facial recognition data.

In conclusion, the comments and feedback we received throughout our research have provided valuable perspectives and insights that shaped our study. The emphasis on addressing bias and fairness, ensuring privacy and data security, obtaining user consent, monitoring system performance, and considering the social impact of facial recognition technology has reinforced the need for responsible and ethical practices. By incorporating these comments into our research, we aim to contribute to the ongoing dialogue surrounding facial recognition technology and to ensure that its development and deployment align with societal values, ethical considerations, and user expectations. The application will be further improved based on the comments of the council, professors and teachers.

