HCMC UNIVERSITY OF TECHNOLOGY AND EDUCATION
Faculty of Electrical and Electronics Engineering

SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
******

THESIS TASKS

Student name: Vo Anh Quoc    Student ID: 15151254
Student name: Tran Van Son    Student ID: 15151211
Major: Automation and Control Engineering Technology
Program: Full-time program
Cohort: 2015 – 2019
Class: 151511C

I. THESIS NAME: BUILDING ATTENDANCE SYSTEM PROTOTYPE BASED ON FACE RECOGNITION USING MACHINE LEARNING
II. TASKS:
INITIAL FIGURES AND DOCUMENTS:
CONTENT OF THE THESIS:
- Designing and constructing a real-time classroom attendance system prototype based on face recognition using Machine Learning, running on the NVIDIA Jetson Nano Developer Kit with a Logitech C270 webcam
- Understanding and applying algorithms for the Face Detection and Face Verification stages
III. RECEIVED DATE:
IV. THESIS COMPLETED DATE:
V. ADVISOR: My-Ha Le, PhD

Ho Chi Minh City, 4th July 2019
Advisor

Ho Chi Minh City, 4th July 2019
Head of department

SCHEDULE

Week 1 – 2 (18/3 – 1/4)
Student: Tran Van Son, Vo Anh Quoc
- Choosing a topic from the advisor's suggestions
- Making a plan and a detailed outline

Week 3 – 4 (2/4 – 15/4)
Student: Tran Van Son, Vo Anh Quoc
- Researching the theory of Convolutional Neural Networks (CNN) and Machine Learning (ML)

Week 5 – 6 (16/4 – 29/4)
Student: Tran Van Son, Vo Anh Quoc
- Researching Face Recognition theory
- Choosing the method and algorithm

Week 7 – 8 (30/4 – 13/5)
Student: Tran Van Son
- Preparing to build the software on Windows 10 (PC)
- Setting up the Python libraries and environment for Face Recognition

Week 9 – 10 (14/5 – 27/5)
Student: Vo Anh Quoc
- Collecting samples of people’s faces
- Training and testing the result
- Adjusting and completing the software

Week 11 – 12 (28/5 – 10/6)
Student: Tran Van Son
- Setting up the hardware (NVIDIA Jetson Nano) and installing the libraries and environment to build the code
- Testing and completing the system on the NVIDIA Jetson Nano

Week 13 – 14 (11/6 – 24/6)
Student: Tran Van Son, Vo Anh Quoc
- Writing the final report
- Preparing for the presentation meeting

Ho Chi Minh City, 4th July 2019
ADVISOR

ASSURANCE STATEMENT

We hereby certify that the work on this topic was carried out by ourselves, building on previously published documents. We have not reused or copied from any document without citing it.

Ho Chi Minh City, 4th July 2019
Implementers
Tran Van Son
Vo Anh Quoc

ACKNOWLEDGEMENT

First and foremost, we would like to express our sincere thanks to our thesis advisor, My-Ha Le, PhD. Without his assistance and dedicated involvement in every step of the process, this thesis could not have been accomplished. He is one of the prominent lecturers at Ho Chi Minh City University of Technology and Education and has published studies and papers in fields related to Artificial Intelligence in general and Image Processing in particular. We therefore consider ourselves very lucky to have worked with him. He guided us in what we needed to do from the very beginning of the research, from finding and reading notable papers to choosing the solution to follow. He also showed us the core values to focus on in our graduation thesis and final-year project. Moreover, he taught us how to present and debate in front of the graduation thesis council.

Second, we would like to thank all the lecturers of the Faculty of Electrical and Electronics Engineering. This thesis is the combination of knowledge that we
learned over the past years. We know that every lecture will be useful for our career paths after graduation.

Third, we would like to thank the Intelligent System Laboratory (ISLab), Faculty of Electrical and Electronics Engineering, for providing facilities and enabling us to carry out the thesis.

Last but not least, special thanks go to our parents and to the members of ISLab, who are our kind friends. Even though we did not work on the same topic, they supported us enthusiastically whenever we got stuck.

Ho Chi Minh City, 4th July 2019
Tran Van Son
Vo Anh Quoc

ADVISOR’S COMMENT SHEET

About the thesis contents:
- The students fulfilled the requirements of the graduation thesis
Advantage:
- The system works stably
- The system identifies objects with high accuracy
Disadvantage:
- The system needs to be tested on more diverse data
Propose defending thesis? Yes
Rating: Excellent
Mark: 9.8 (In writing: Nine point eight)

Ho Chi Minh City, 4th July 2019
Advisor

REVIEWER’S COMMENT SHEET

About the thesis contents:
Advantage:
Disadvantage:
Propose defending thesis?
Rating:
Mark: ……………… (In writing: )

Ho Chi Minh City, 9th July 2019
Reviewer

TABLE OF CONTENTS

THESIS TASKS
SCHEDULE
ASSURANCE STATEMENT
ACKNOWLEDGEMENT
ADVISOR’S COMMENT SHEET
REVIEWER’S COMMENT SHEET
TABLE OF CONTENTS
ABBREVIATIONS AND ACRONYMS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: OVERVIEW
  1.1 INTRODUCTION
    1.1.1 Introduction to Face Recognition Problem
    1.1.2 Face Recognition Application
  1.2 THESIS OBJECTIVE
  1.3 RESEARCH SCOPE OF THE THESIS
  1.4 RESEARCH METHOD
  1.5 MAIN CONTENT
CHAPTER 2: ARTIFICIAL INTELLIGENCE
  2.1 OVERVIEW
    2.1.1 Artificial Intelligence
    2.1.2 Machine Learning
    2.1.3 Deep Learning
  2.2 CONVOLUTIONAL NEURAL NETWORK
    2.2.1 Introduction
    2.2.2 Convolutional Neural Network Architectures
      2.2.2.1 Convolutional Layer
      2.2.2.2 Non-linearity
      2.2.2.3 Stride and Padding
      2.2.2.4 Pooling layer
      2.2.2.5 Flattening Layer
      2.2.2.6 Fully-Connected Layer
  2.3 MTCNN (Multi-task Cascaded Convolutional Neural Networks)
    2.3.1 Multi-Task
    2.3.2 CNN Architectures
  2.4 ONE SHOT LEARNING AND TRIPLET LOSS
    2.4.1 One Shot Learning
    2.4.2 Triplet Loss
  2.5 SVM CLASSIFIER
    2.5.1 Image classification
      2.5.1.1 Definition of SVM
      2.5.1.2 Hyperplanes and Support Vectors
      2.5.1.3 Maximal-Margin Classifier
      2.5.1.4 Soft Margin Classifier
      2.5.1.5 Support Vector Machines (Kernels)
CHAPTER 3: HARDWARE AND RELATED WORKS
  3.1 HARDWARE COMPONENTS
    3.1.1 NVIDIA Jetson Nano Developer Kit
    3.1.2 Logitech Webcam C270
    3.1.3 Arduino Uno R3
  3.2 HARDWARE BLOCK DIAGRAM AND WIRING DIAGRAM
  3.3 THE CONSTRUCTION OF HARDWARE PLATFORM
CHAPTER 4: SOFTWARE AND ALGORITHM
  4.1 SOFTWARE AND INTEGRATED DEVELOPMENT ENVIRONMENT
    4.1.1 FaceNet Introduction
    4.1.2 Python
    4.1.3 Visual Studio Code (VSC)
  4.2 NETWORK ARCHITECTURE
    4.2.1 Inception v1
      4.2.1.1 The Premise
      4.2.1.2 The Solution
    4.2.2 Inception-ResNet v1 and v2
      4.2.2.1 The Premise
      4.2.2.2 The Solution
  4.3 ALGORITHM AND FLOWCHART
    4.3.1 Flowchart of collecting training data
    4.3.2 Flowchart of face identification using trained data
CHAPTER 5: EXPERIMENTAL RESULT
  5.1 DATA COLLECTING
  5.2 DATA DESCRIPTION
  5.3 FACE RECOGNITION PROCESS
CHAPTER 6: CONCLUSION AND FUTURE WORKS
  6.1 CONCLUSION
    6.1.1 Advantages
    6.1.2 Disadvantages
  6.2 FUTURE WORKS
REFERENCES

Figure 5.2 Facial images collected under different emotions

The test dataset covers six test conditions so that the model can be validated effectively:
Condition 1: 1 meter away from the camera, with fluorescent light
Condition 2: 1 meter away from the camera, without fluorescent light
Condition 3: [...] meters away from the camera, with fluorescent light
Condition 4: [...] meters away from the camera, without fluorescent light
Condition 5: [...] meters away from the camera, with fluorescent light
Condition 6: [...] meters away from the camera, without fluorescent light

For each test condition there are 100 facial images per class (person), so each condition contains 1000 images in total. The accuracy for each class is the number of correct predictions divided by 100, and the average accuracy for that condition is the sum of the ten class accuracies divided by 10.

Figure 5.3 Dataset in Condition 1
Figure 5.4 Dataset in Condition 2
Figure 5.5 Dataset in Condition 3
Figure 5.6 Dataset in Condition 4
Figure 5.7 Dataset in Condition 5
Figure 5.8 Dataset in Condition 6

5.3 FACE RECOGNITION PROCESS

The stored data is then copied to an Asus K550V (Core i5 at 2.3 GHz, 8 GB RAM, NVIDIA GeForce GTX 950M), where we store the dataset and test it against the pre-trained model for validation. We use Google Colab to train on the dataset, starting from the pre-trained model, and to generate new weights. These weights are then deployed to the NVIDIA Jetson Nano for recognition. The training dataset consists of 400 facial images. A separate folder stores full frames of every person for testing: for each person there are 100 frames containing their face under different conditions (exposure, distance from
camera, emotion). The accuracy for a person is the number of frames in which that person is correctly identified divided by 100. To present the results visually, a confusion matrix is a good way to validate accuracy.

In the field of machine learning, and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix). Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa). The name stems from the fact that it makes it easy to see whether the system is confusing two classes (i.e. commonly mislabeling one as another). It is a special kind of contingency table, with two dimensions ("actual" and "predicted") and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).

If a classification system has been trained to distinguish between cats and dogs, a confusion matrix will summarize the results of testing the algorithm for further inspection. Assuming a sample of 13 animals (8 cats and 5 dogs), the resulting confusion matrix could look like the table below:

Table 5-1 Example of confusion matrix

                          Actual class
                          Cat     Dog
Predicted class    Cat    5       2
                   Dog    3       3

In this confusion matrix, of the 8 actual cats, the system predicted that three were dogs, and of the five dogs, it predicted that two were cats. All correct predictions are located on the diagonal of the table, so it is easy to inspect the table visually for prediction errors, as they are represented by values outside the diagonal.

For each of the six test conditions, the test returns a confusion matrix that shows all prediction results: how many frames were classified correctly and how many incorrectly. The diagonal of the confusion matrix holds the number of correct predictions for each person. Because every person has 100 test frames, the average accuracy as a percentage is the sum of the values on the diagonal divided by 10.

Condition 1 (1 meter away from the camera, with fluorescent light): 92.4%
Table 5-2 Testing result in Condition 1

Condition 2 (1 meter away from the camera, without fluorescent light): 88.7%
Table 5-3 Testing result in Condition 2

Condition 3 ([...] meters away from the camera, with fluorescent light): 87.9%
Table 5-4 Testing result in Condition 3

Condition 4 ([...] meters away from the camera, without fluorescent light): 85%
Table 5-5 Testing result in Condition 4

Condition 5 ([...] meters away from the camera, with fluorescent light): 73.5%
Table 5-6 Testing result in Condition 5

Condition 6 ([...] meters away from the camera, without fluorescent light): 70%
Table 5-7 Testing result in Condition 6

As the results above show, the first condition (1 meter away from the camera, with fluorescent light) gives the best accuracy, 92.4%, because the distance and illumination are similar to those of the training dataset and the images are clear, bright, and high-resolution. Accuracy is affected by both illumination and distance, but according to the confusion matrices distance has the larger effect. Accuracy falls as the camera is moved further away, and it falls significantly when the camera is moved from 1 meter to [...] meters. As for illumination, accuracy falls by about 3 to 4 percentage points when all the fluorescent lights are turned off (92.4% to 88.7%, 87.9% to 85%, and 73.5% to 70%).

For the real classroom test, we set up the system in a classroom at HCMUTE. The room has fluorescent lights and glass windows on two sides, so the illumination is a mix of natural and fluorescent light. The 10 people the model was trained on sat in the class in front of the camera, with the farthest person [...] meters away from the camera. The result is shown in the figure below:

Figure 5.9 Testing result in classroom with fluorescent light

For more confidence, the system was also tested in low illumination, with all the fluorescent lights turned off; the result is less accurate:

Figure 5.10 Testing result in classroom without fluorescent light

To record attendance, when a person is recognized, his or her name is saved to an Excel file together with the time, as in the figure below:

Figure 5.11 Saved timesheet

As for the processing speed of the system, since the configuration of the
Jetson Nano is not as powerful as that of a PC, the system cannot run smoothly, so processing one frame every 15 seconds is acceptable.

CHAPTER 6: CONCLUSION AND FUTURE WORKS

6.1 CONCLUSION

6.1.1 Advantages

This thesis presented our project on building a real-time attendance system prototype based on face recognition using Machine Learning. We trained on a dataset of 10 different people with 40 frontal face images per person, and validated the model with 100 frontal face images per person under classroom lighting conditions. The program is deployed on the NVIDIA Jetson Nano Developer Kit.

The project has several notable advantages. Although accuracy drops in poor conditions, the model still achieves reasonable accuracy in every condition: from 92.4% for frontal faces 1 meter away from the camera with good illumination (fluorescent light) down to 70% for frontal faces [...] meters away from the camera without fluorescent light. In real-time testing in the classroom, the model worked stably. In addition, the combination of the MTCNN and FaceNet algorithms is highly stable as long as the limiting conditions of the project are satisfied. MTCNN worked well, detecting faces stably across different distances and illumination conditions. FaceNet (Inception-ResNet-v1) is a good algorithm for face recognition when the training data is limited, and it returns good accuracy.

The goal of the project is to take attendance in class based on face recognition. We decided to save the result every 15 seconds, but in practice the interval could be [...] to 10 minutes, because over that period the result would probably not be affected.

Finally, we fully built the hardware, whose main components are the NVIDIA Jetson Nano Developer Kit and the Logitech C270 webcam. The total hardware cost is low because the NVIDIA Jetson Nano Developer Kit is a small, powerful computer that costs only $99. All components are neatly placed in a compact box measuring 22 x 22 x 11 cm. The hardware has worked stably and safely, and it is easy to
set up; it is also neat and light.

6.1.2 Disadvantages

For the MTCNN detector to work at its best, our project requires frames that contain frontal faces, with the camera position and resolution as good as possible. Moreover, people with similar facial contours, as well as differences in exposure, affect the accuracy of the model. The real-time part does not work well because of the processing speed: the NVIDIA Jetson Nano Developer Kit is not powerful enough for this workload.

6.2 FUTURE WORKS

Face recognition is a classical problem, but there is still a lot to explore. For example, standardizing the input data is a direction we will consider: if the input images are of high quality, the model can work effectively. This system can be developed into an attendance management system for companies and schools.
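
The accuracy calculation described in Section 5.3 (a confusion matrix, the accuracy of each class, and the average over all classes) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the thesis: the function names confusion_matrix, per_class_accuracy and average_accuracy are hypothetical, and the data is the cat/dog example of Table 5-1 rather than the face dataset.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Build matrix[i][j] = samples predicted as labels[i] whose actual class is labels[j].

    Rows are predicted classes and columns are actual classes,
    the same layout as Table 5-1.
    """
    counts = Counter(zip(predicted, actual))
    return [[counts[(p, a)] for a in labels] for p in labels]


def per_class_accuracy(matrix):
    """For each actual class (column), the share of its samples on the diagonal."""
    size = len(matrix)
    accuracies = []
    for j in range(size):
        column_total = sum(matrix[i][j] for i in range(size))
        accuracies.append(matrix[j][j] / column_total if column_total else 0.0)
    return accuracies


def average_accuracy(matrix):
    """Mean of the per-class accuracies."""
    accuracies = per_class_accuracy(matrix)
    return sum(accuracies) / len(accuracies)


# Example data (assumed ordering): 8 actual cats followed by 5 actual dogs.
labels = ["cat", "dog"]
actual = ["cat"] * 8 + ["dog"] * 5
predicted = ["cat"] * 5 + ["dog"] * 3 + ["cat"] * 2 + ["dog"] * 3

matrix = confusion_matrix(actual, predicted, labels)
class_accuracy = per_class_accuracy(matrix)
mean_accuracy = average_accuracy(matrix)

print(matrix)
print(class_accuracy)
print(mean_accuracy)
```

When every class has the same number of test images, as in the thesis (100 frames for each of 10 people), the mean of the per-class accuracies is the same quantity as the sum of the diagonal divided by 10, read as a percentage.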