MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT
SMART LOCK SYSTEM BASED ON FACE RECOGNITION

NGUYEN TAN NHAT (Student ID: 18151025)
NGUYEN MINH NHAT (Student ID: 18151099)
Major: AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY
Advisor: Dr. NGUYEN MINH TAM

Ho Chi Minh City, July 2023
COMMITMENT

Title: SMART LOCK SYSTEM BASED ON FACE RECOGNITION
Advisor: Dr. Nguyen Minh Tam
Student 1: Nguyen Tan Nhat, ID: 18151025, Tel: 0338484540, Email: 18151025@student.hcmute.edu.vn
Student 2: Nguyen Minh Nhat, ID: 18151099, Tel: 0901241381, Email: 18151099@student.edu.vn.com

We assure that this graduation report is our own research and study. We did not copy from any published documents without citation. If any violation is found, we will take full responsibility.

Project Team
Nguyen Tan Nhat
Nguyen Minh Nhat

WORKING TIMETABLE

The project ran over 17 consecutive weeks, from Week 1 (06/03 – 10/03) to Week 17 (26/06 – 30/06). The planned activities, in order, were: project confirmation and title selection; outlining the main ideas; researching the theory; writing the report; deep learning for face recognition; testing and evaluating the self-built model; adjusting data for the built model; applying a pre-trained model; hardware research and device selection (Jetson Nano); building the dataset for the model; programming, combining, and testing the overall code; code cleanup and improvement; transferring the program to the Jetson and setting up its environment; optimizing; writing the results and conclusion; advisor comments; and finishing.

ACKNOWLEDGEMENT

To begin, we would like to express our heartfelt gratitude to Dr. Nguyen Minh Tam, our project instructor. Despite having a busy schedule, Dr. Tam generously dedicated his time to guide us on what to do and how to proceed. This invaluable support greatly aided us in enhancing our soft skills and in conducting thorough research on relevant papers.

Furthermore, we extend warm appreciation to all the esteemed teachers and advisors at Ho Chi Minh City University of Technology and Education. Their comprehensive teaching and practical projects equipped us with essential knowledge and enabled us to apply it successfully in our graduation project. This project stands as a tangible testament to the achievements we have made throughout our years as students, and it would not have been possible without their unwavering dedication.

Lastly, we would like to express our profound love and gratitude to our families, who have been, currently are, and will always be our strongest pillars of support, both emotionally and financially. We assure you that we will exert our utmost efforts to make you proud through our contributions to our nation and society, striving not to let you down.

TASK COMPLETION

1. Nguyễn Tấn Nhật: Face Recognition (100%), Study and Setup Hardware (100%), Build Program into Jetson (100%)
2. Nguyễn Minh Nhật: Finger Identification (100%), Workflow Design (100%), Write Report (100%)

The face registration process consists of the following steps:
- Face detection and alignment: An ultra-lightweight model is used to detect human faces in the captured frames and align them. It is important for the user to collect face images from every angle to ensure comprehensive data collection.
- Saving aligned faces: The aligned human faces are saved to the dataset, associating them with the respective user's name.
- Face encoding: Once the required data is collected, the system proceeds to encode the faces, extracting unique features that represent each individual's facial characteristics.
- Saving encoded faces: The encoded faces are saved, along with the user's name, for future reference during the face recognition process, as sketched below.
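The registration flow above can be condensed into a short sketch. This is a minimal illustration rather than the project's actual code: the `detect_aligned_faces` and `encode_face` callables stand in for the ultra-lightweight detector and the FaceNet encoder, and the on-disk layout is an assumption.

```python
import os
import pickle
import cv2  # OpenCV, for camera capture and image writing

def register_user(name, detect_aligned_faces, encode_face,
                  num_images=30, dataset_dir="dataset"):
    """Collect, save, and encode `num_images` aligned faces for one user."""
    os.makedirs(os.path.join(dataset_dir, name), exist_ok=True)
    cap = cv2.VideoCapture(0)            # default camera
    embeddings, saved = [], 0
    while saved < num_images:
        ok, frame = cap.read()
        if not ok:
            continue
        for face in detect_aligned_faces(frame):   # aligned face crops
            # save the aligned face image under the user's name
            cv2.imwrite(os.path.join(dataset_dir, name, f"{saved:02d}.jpg"), face)
            embeddings.append(encode_face(face))   # 128-d feature vector
            saved += 1
            if saved >= num_images:
                break
    cap.release()
    # persist the user's name together with the encoded faces
    with open(os.path.join(dataset_dir, f"{name}.pkl"), "wb") as f:
        pickle.dump({"name": name, "embeddings": embeddings}, f)
```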
Figure 4.3: Flowchart for Face Registration

By following these steps, users can efficiently register their faces for facial recognition, leading to improved recognition accuracy when using the system.

Figure 4.4: Flowchart for Face Recognition

The recognition process is outlined in detail as follows:
- Face detection: The model detects the presence of a face in an input image or frame.
- Face feature extraction: The FaceNet recognition model is used to extract features from the detected human face. These features are encoded into a 128-dimensional vector, which represents the unique measurements of the input face.
- Face encoding: The FaceNet model encodes the face by assigning it a specific 128-dimensional embedding vector based on its unique characteristics and measurements.
- Distance comparison: The system compares the embeddings of the input face and the registered faces using the Euclidean distance metric. This distance measures the dissimilarity between the face embeddings.
- Similarity determination: If the calculated distance between the face embeddings is within the required threshold, the system recognizes the two faces as the same, indicating a successful match.

If the face that requires recognition resembles identities in the dataset, the system selects the identity whose photos the face most closely resembles. For instance, consider a dataset comprising two identity files, "A" and "B," each containing three embeddings generated from the first three registered images. Suppose we need to recognize a face labeled "C." During the comparison, "C" is found to be similar to two images in the "A" file and one image in the "B" file. Based on this, the system determines that "C" is more similar to "A" (2 matches out of 3) than to "B" (1 match out of 3), and concludes that the identity of "C" is most likely "A."
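The comparison-and-voting step can be sketched as follows. This is a minimal sketch under stated assumptions: `registered` maps each name to its stored 128-dimensional embeddings, and the 1.0 threshold is purely illustrative (the real value must be tuned on the dataset).

```python
import numpy as np

def identify(probe_embedding, registered, threshold=1.0):
    """Return the registered name with the most below-threshold matches.

    probe_embedding: 128-d vector of the input face.
    registered: dict mapping name -> list of 128-d embeddings.
    """
    votes = {}
    for name, embeddings in registered.items():
        for emb in embeddings:
            # Euclidean distance measures dissimilarity of the two faces
            dist = np.linalg.norm(np.asarray(probe_embedding) - np.asarray(emb))
            if dist <= threshold:
                votes[name] = votes.get(name, 0) + 1
    if not votes:
        return None   # no registered identity is close enough
    # e.g. 2 matches against "A" vs. 1 against "B"  ->  identity "A"
    return max(votes, key=votes.get)
```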
The detailed process of the system's liveness detection is as follows:
- The detector identifies the presence of a human face.
- 68 facial landmarks are placed on the key parts of the detected face.
- The blinking-eyes ratio is calculated.
- If the number of detected blinks matches the required number of blinks, the detected face is considered genuine.

Figure 4.5: Flowchart for Liveness Detection

To detect blinking eyes, the system calculates the blinking ratio of each eye and then takes the average of the two. The blinking ratio of each eye is obtained by dividing the length of the vertical eye line by the length of the horizontal eye line, where each length is computed with the Euclidean distance formula:

d = √((x₂ − x₁)² + (y₂ − y₁)²)

ratio_eye = vertical length / horizontal length

The blinking-eyes ratio is then the average over both eyes:

ratio = (ratio_left + ratio_right) / 2
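A minimal sketch of this calculation is given below, assuming six (x, y) landmarks per eye in the usual 68-point ordering (two corners plus two upper-lid and two lower-lid points); the landmark indexing and the example blink threshold are assumptions, not values taken from the project.

```python
import math

def euclidean(p1, p2):
    # straight-line distance between two (x, y) landmark points
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def eye_ratio(eye):
    """Vertical/horizontal ratio for one eye given six (x, y) landmarks.

    eye[0], eye[3]       : left and right eye corners (horizontal line)
    eye[1..2], eye[4..5] : upper- and lower-lid points (vertical lines)
    """
    vertical = (euclidean(eye[1], eye[5]) + euclidean(eye[2], eye[4])) / 2.0
    horizontal = euclidean(eye[0], eye[3])
    return vertical / horizontal

def blinking_ratio(left_eye, right_eye):
    # average of the two per-eye ratios; a dip below a tuned
    # threshold (e.g. ~0.2, illustrative only) counts as one blink
    return (eye_ratio(left_eye) + eye_ratio(right_eye)) / 2.0
```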
Figure 4.6: Flowchart for General System Operation

4.3 Environment and Dataset

The researchers gathered data using the registration function of the system. Data was collected from five different individuals, with thirty images for each person, covering various aspects of the human face. The system was also validated in different scenarios: faces directly facing the camera, faces positioned to one side, faces with eyeglasses, faces with full masks, and faces with half masks.

To evaluate the processing speed of the system, the researchers used frames per second (FPS). After a frame is captured from the camera, the detector provides a list of rectangular boxes, each value representing the position of the top-left or bottom-right corner of a bounding box. The FPS value is calculated by dividing one second by the time interval between two consecutive frames:

FPS = 1 / (new frame time − previous frame time)

Furthermore, the researchers assessed the accuracy of the system by counting the number of true identifications, also referred to as true accepted (TA), that the system could correctly recognize among the given identifications. Accuracy is the ratio of the number of true identifications to the total number of given identifications:

Accuracy = (TA / number of identifications) × 100%
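The two metrics above can be computed as in the following sketch; `capture_frame` and `process_frame` are placeholders for the camera read and the recognition pipeline, not names from the project.

```python
import time

def average_fps(capture_frame, process_frame, num_frames=100):
    """Average FPS, with per-frame FPS = 1 / (time between consecutive frames)."""
    fps_values = []
    prev = time.time()
    for _ in range(num_frames):
        process_frame(capture_frame())
        now = time.time()
        fps_values.append(1.0 / (now - prev))  # per-frame FPS
        prev = now
    return sum(fps_values) / len(fps_values)

def accuracy(true_accepted, num_identifications):
    # Accuracy = TA / number of identifications x 100%
    return true_accepted / num_identifications * 100.0

# e.g. 33 correct out of 40 identifications -> 82.5 %
print(accuracy(33, 40))
```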
4.4 Performance of the System

Device name: Jetson Nano B01
Operating system: Ubuntu 18.04.5
Processor: ARM A57 core (1.43 GHz)
Memory: 4 GB 64-bit LPDDR4, 25.6 GB/s
Table 4.4: System Performance

This configuration is well suited to the experiments precisely because of its limited hardware and the absence of a discrete graphics card.
- With face recognition plus liveness detection, the system achieved a processing speed of 5–15 FPS on the Jetson Nano.
- With face recognition alone, the system achieved 9–20 FPS on the Jetson Nano.
- Tested on 40 identifications, the system achieved 82.5% accuracy with face recognition plus liveness detection and 97.5% accuracy with face recognition alone.

4.4.1 Operation Results

4.4.1.1 Good Brightness

The result for face recognition is shown below:

Figure 4.7: Good Brightness Results of Face Recognition

The result for face recognition plus liveness detection is shown below:

Figure 4.8: Good Brightness Results of Face Recognition + Liveness Detection

Table 4.5: General Result in Good Brightness

According to the data presented in the table, the face recognition system achieved a good accuracy of 97.5% when recognizing faces directly facing the camera. However, the accuracy dropped to 82.5% when faces were positioned to one side. Where individuals were wearing eyeglasses, the accuracy decreased to 87.5%, and when faces were partially covered with a mask, it decreased to 67.5%. The face recognition system combined with liveness detection achieved a good accuracy of 95% when recognizing faces directly facing the camera, but the accuracy dropped to 82.5% when the faces were wearing eyeglasses.

4.4.1.2 Low Brightness

To cover all scenarios, we also compared the performance of the system under low brightness. This scenario corresponds to heavy rain or a power cut, when the system can only run on its backup battery and no lights are on. The result for face recognition is shown below:

Figure 4.9: Low Brightness Results of Face Recognition

The result for face recognition plus liveness detection is shown below:

Figure 4.10: Low Brightness Results of Face Recognition + Liveness Detection

Table 4.6: General Result in Low Brightness

According to the data presented in the table, the face recognition system achieved a relatively decent accuracy of 82.5% when recognizing faces directly facing the camera. However, the accuracy dropped to an unstable 55% when faces were positioned to one side. Where individuals were wearing eyeglasses, the accuracy decreased to 72.5%, and when faces were partially covered with a mask, it fell to 12.5%. The face recognition system combined with liveness detection achieved a reasonably good accuracy of 85% when recognizing faces directly facing the camera, but the accuracy dropped to 2.5% when the faces were wearing eyeglasses, making identification nearly impossible.

4.4.1.3 Backup Solution

A backup solution is implemented in case issues with the user's facial recognition leave the system unable to identify them. The solution uses a fingerprint recognition sensor and controls the door lock from a smartphone through the Blynk app. This setup ensures both high security and ease of use, with quick response times.

Figure 4.11: Lock Control through the Blynk App (IC ESP8266)

Figure 4.12: Fingerprint Recognition

4.4.2 Hardware Result

The hardware was built with all the components necessary to support the investigation.

Figure 4.13: Hardware Result

The enclosure is built mainly from mica (acrylic) and was designed in SolidWorks.

Figure 4.14: SolidWorks Design

4.4.3 Face Datasets

The dataset was collected from five individuals, with 30 images taken from each person. Increasing the dataset size allows the stability and accuracy of the system to be evaluated more effectively, and taking 30 images per individual enhances the realism and robustness of the system using the Ultra-Light-Fast detection algorithm.

Figure 4.15: Face Dataset Stored

Chapter 5: CONCLUSION

In this project, the researchers proposed a deep learning model for real-time human face recognition and liveness detection. The experimental results confirmed that the system effectively met the specified requirements. The researchers also thoroughly evaluated the system and identified its notable strengths and weaknesses.

Strengths:
- It can be used on hardware with limited performance and resources.
- It demonstrates stable, fast, and accurate recognition in well-lit environments.
- The system has a user-friendly interface that is easy to use and navigate.
- It has a backup plan in case the primary solution encounters issues, specifically the fingerprint sensor and control via mobile phone.

Weaknesses:
- Recognition accuracy is lower in low-light environments.
- Running multi-threaded programs on the Jetson slows the system down and limits its practicality.
- The blinking-detection ratio still requires manual rather than automatic adjustment, and continuous improvement is needed to determine the most suitable threshold.
- The anti-spoofing capability of the face recognition is not yet optimized for real conditions, as recognition errors still occur.

Table 5.1: Strengths and weaknesses of the system

Improvements:
- Replace the blinking-detection anti-spoofing method with techniques such as active flash, a 3D camera, or thermal body cameras to enhance accuracy and recognition speed.
- Integrate the entire program into a single processor to save resources and optimize the system.
- Further upgrade the door access solutions to enhance realism and alignment with real-world scenarios.
REFERENCES

[1] M. A. Sid-Ahmed, Image Processing: Theory, Algorithms and Architectures, January 1995.
[2] E. N. Codaro, "An image processing method for morphology characterization and pitting corrosion evaluation," September 2002.
[3] Vishnu Subramanian, Deep Learning with PyTorch, February 2018.
[4] Aurélien Géron, Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, June 2019.
[5] Ken Hanazawa, Fumihiro Adachi, Ryosuke Isotani, "Voice recognition system, voice recognition method, and program for voice recognition," January 2018.
[6] Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa, "Voice recognition device, voice recognition method, and voice recognition program," December 2013.
[7] Tuba Siddiqui, "Voice Recognition System," July 2020.
[8] Takayuki Arakawa, "Voice recognition system and voice recognition method," April 2015.
[9] Hung-Pang Lin, Yu-Jia Zhang, Chia-Ping Chen, "Systems for Low-Resource Speech Recognition Tasks in Open Automatic Speech Recognition and Formosa Speech Recognition Challenges," August 2021.
[10] Madhumitha Sakthi, Ahmed H. Tewfik, Raj Pawate, "Speech Recognition Model Compression," May 2020.
[11] Nir Shlezinger, Yonina C. Eldar, Stephen Boyd, "Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization," May 2022.
[12] Sulis Setiowati, Zulfanahri, Eka Legya Franita, Igi Ardiyanto, "A review of optimization method in face recognition: Comparison of deep learning and non-deep learning methods," October 2017.
[13] Salman Baig, Kasuni Geetadhari, Mohd Atif Noor, Amarkant Sonkar, "Face recognition-based attendance management system by using machine learning," April 2022.
[14] M. Arif Wani, Farooq Ahmad Bhat, Saduf Afzal, Asif Iqbal Khan, "Supervised Deep Learning in Face Recognition," January 2020.
[15] Akhil Awdhutrao Sambhe, "Face Detection and Recognition System," January 2022.
[16] Ye. Amirgaliyev, A. Sadykova, Ch. Kenshimov, "Comparison of Face Detection Tools," December 2021.
[17] Ajay Kumar, Shivansh Chaudhary, Sonik Sangal, Raj Dhama, "Face Detection and Recognition using OpenCV," May 2022.
[18] Juan Carlos Bonilla-Robles, José Alberto Hernández Aguilar, Guillermo Santamaria-Bonfil, "Face Detection with Applications in Education," October 2021.
[19] Leyla Muradkhanli, Eshgin Mammadov, "Real-time face detection on a Raspberry Pi," July 2022.
[20] Wenxin Ji, Lina Jin, "Face Shape Classification Based on MTCNN and FaceNet," November 2021.
[21] Ning Zhang, Junmin Luo, Wuqi Gao, "Research on Face Detection Technology Based on MTCNN," September 2020.
[22] Yang Wang, Guowu Yuan, Dong Zheng, Hao Wu, Yuanyuan Pu, "Research on face detection method based on improved MTCNN network," August 2019.
[23] Il-Sik Chang, Goo-Man Park, "Implementation of Jetson Nano Based Face Recognition System," December 2021.
[24] EunJin Jeong, Jangryul Kim, Soonhoi Ha, "TensorRT-based Framework and Optimization Methodology for Deep Learning Inference on Jetson Boards," July 2022.
[25] Florian Schroff, Dmitry Kalenichenko, James Philbin, "FaceNet: A Unified Embedding for Face Recognition and Clustering," June 2015.
[26] Ketan Doshi, "Audio Deep Learning Made Simple: Automatic Speech Recognition (ASR), How it Works," March 2021.
[27] Jianxin Wu, "Convolutional neural networks," February 2020.
[28] "Boot Jetson Nano from USB," jetsonhacks.com.
[29] Philipp Kopp et al., "Analysis and improvement of facial landmark detection," Swiss Federal Institute of Technology Zurich, Zurich, 2019.
[30] AS608 Processor Datasheet, Hangzhou Synochip Data Security Technology Co., Ltd., 2015.
[31] C. Szegedy, S. Ioffe, V. Vanhoucke, A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in AAAI, 2017.
[32] K. He, X. Zhang, S. Ren, J. Sun, "Deep residual learning for image recognition," in CVPR, 2016.
[33] L.-C. Chen, G. Papandreou, F. Schroff, H. Adam, "Rethinking atrous convolution for semantic image segmentation," arXiv:1706.05587, 2017.
[34] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, "SSD: Single Shot MultiBox Detector," in Computer Vision – ECCV 2016, 14th European Conference, 2016, pp. 21–37.
[35] R. Girshick, "Fast R-CNN," in ICCV, 2015.
[36] Songtao Liu, Di Huang, Yunhong Wang, "Receptive Field Block Net for Accurate and Fast Object Detection," Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China.
[37] K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv:1409.1556, 2014.
[38] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al., "ImageNet large scale visual recognition challenge," IJCV, 2015.