Design and implementation of attendance and student monitoring system using image processing and artificial intelligence

MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT
AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY

DESIGN AND IMPLEMENTATION OF ATTENDANCE AND STUDENT MONITORING SYSTEM USING IMAGE PROCESSING AND ARTIFICIAL INTELLIGENCE

SUPERVISOR: LE MY HA, Assoc. Prof.
STUDENT: BUI MINH TRI

Ho Chi Minh City, August, 2022

APPENDIX 3: (Graduation Project Assignment)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 2022

GRADUATION PROJECT ASSIGNMENT

Student name: Bui Minh Tri
Student ID: 18151041
Major: Automation and control engineering technology
Advisor: Assoc. Prof. Le My Ha
Class: 18151CLA2
Date of assignment:
Date of submission:
Phone number: 036.971.8404
Project title: Design and implementation of attendance and student monitoring system using image processing and artificial intelligence
Initial materials provided by the advisor:
- Image processing and machine learning documents such as papers and books
- Related theses of previous students
Content of the project:
- Read, survey and summarize reference documents to determine the project direction
- Survey and choose a suitable model
- Write programs for Jetson
Nano
- Test and evaluate the completed system
- Design the graphical user interface
- Upload data to and retrieve data from the database
- Send warning emails for absent students or students whose cumulative time is shorter than the standard time
- Collect new data and retrain the classifier
- Write a report
- Prepare slides for the presentation
Final product: The hardware and software of an attendance system, real-time database, user interface

CHAIR OF THE PROGRAM          ADVISOR
(Sign with full name)         (Sign with full name)

APPENDIX 4: (Advisor’s Evaluation sheet)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 2022

ADVISOR’S EVALUATION SHEET

Student name: Bui Minh Tri
Student ID: 18151041
Major: Automation and control engineering technology
Project title: Design and implementation of attendance and student monitoring system using image processing and artificial intelligence
Advisor: Assoc. Prof. Lê Mỹ Hà

EVALUATION
Content of the project: The thesis has six chapters and 61 pages. It covers the construction and design of an attendance and student monitoring system in a particular room. The real system is completed following the objectives in the proposal.
Strengths: The system can support teachers and administrators in taking attendance and monitoring students who enter or leave the class. The system is designed with image processing and artificial intelligence. The project is low-cost. The execution time is suitable for practical application. The accuracy of the system is guaranteed.
Weaknesses: The system has not been tested in different environments, and the dataset is limited. Different environments require different camera positions to avoid backlighting.
Approval for oral defense?
(Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: …………… (In words: )
Ho Chi Minh City, month day , year
ADVISOR
(Sign with full name)

APPENDIX 5: (Pre-Defense Evaluation sheet)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 2022

PRE-DEFENSE EVALUATION SHEET

Student name: Bui Minh Tri
Student ID: 18151041
Major: Automation and control engineering technology
Project title: Design and implementation of attendance and student monitoring system using image processing and artificial intelligence
Name of Reviewer:

EVALUATION
Content and workload of the project
Strengths:
Weaknesses:
Approval for oral defense? (Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ……………… (In words: )
Ho Chi Minh City, month day , year
REVIEWER
(Sign with full name)

APPENDIX 6: (Evaluation sheet of Defense Committee Member)

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

EVALUATION SHEET OF DEFENSE COMMITTEE MEMBER

Student name:
Student ID:
Major:
Project title:
Name of Defense Committee Member:

EVALUATION
Content and workload of the project
Strengths:
Weaknesses:
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ……………… (In words: )
Ho Chi Minh City, month day , year
COMMITTEE MEMBER
(Sign with full name)

ACKNOWLEDGEMENT

We would like to sincerely thank Professor Le My Ha for his thorough instruction, which gave us the information we needed to complete this thesis. Although we did our best throughout the process, mistakes are still inevitable. We look forward to our advisor's continued guidance to help us gain more experience and successfully complete the project. We would also like to express our sincere thanks to the Faculty for High Quality Training and the Faculty of Electrical and Electronics Engineering, where we obtained our basic knowledge and experience. Moreover, we also
would like to thank the ISLab members who helped us with the details of this project. They shared valuable experience and knowledge with us. Finally, we would like to express our gratitude to our families for their support throughout the implementation of this thesis. Sincere thanks for everything!

GUARANTEE

We hereby formally declare that this thesis is the result of our own study and implementation. We did not plagiarize from any published article without the author's permission. We take full responsibility for any violations that may have occurred.

Authors
Bui Minh Tri

ABSTRACT

Face recognition is a computer application that detects, tracks, identifies and verifies human faces in images or videos captured with a digital camera. There has been significant progress in face detection and recognition for security, identification, and attendance purposes, and this inspires us to apply face recognition technology more widely. At the university, we see teachers spend a lot of time taking students’ attendance. Moreover, some workplace attendance systems are not suitable for taking student attendance. These reasons motivate us to undertake this project. Furthermore, we propose this solution as a substitute for the traditional attendance system because of its proactive recognition. The attendance system is embedded in a Jetson Nano because we want to design an affordable system. We use the face detection and recognition functions that are available in traditional systems and add a face tracker. To manage information, we upload all recorded data to the server database. To make the system easy to operate, we design a graphical user interface. The experiment is conducted in a single room that simulates a practical lesson, with a limited number of identities in the dataset. The lowest frame rate of the system is 8 fps, which still satisfies real-time practical applications.

... set. In total, we prepare 700 images of 10
different identities. Figure 5.3 shows a sample of our dataset.

Figure 5.3 Our dataset

The FaceNet model is trained on the CASIA-WebFace [17] dataset and evaluated on the standard LFW (Labeled Faces in the Wild) [18] benchmark. Figure 5.4 shows a sample of the Labeled Faces in the Wild dataset.

Figure 5.4 Labeled Faces in the Wild dataset

5.2 Training processing

5.2.1 Face and facemask detection

To train YOLOv4, we set the configuration parameters listed in table 5.1 in advance. The model is trained in Google Colab and takes hours to train completely; the result is shown in figure 5.6.

Parameter name    Parameter value
Image size        416 × 416
Max batches       6000
Subdivision
Learning rate     0.00261
Momentum          0.9
Weight decay      0.0001

Table 5.1 Training parameters of the detection model

5.2.2 Face recognition

To train FaceNet, we set the configuration parameters listed in table 5.2 in advance. The model is trained on a server and takes days to train completely. Figure 5.5 shows the accuracy during training (solid line) and validation (dashed line), evaluated at fixed epoch intervals.

Parameter name      Parameter value
Image size          160 × 160
Max epoch           150
Keep probability    0.8
Learning rate       0.0005
Momentum            0.9
Weight decay        5e-4

Table 5.2 Training parameters of the recognition model

Figure 5.5 Training graph of FaceNet

5.2.3 KNN classifier

We first train KNN with k ranging up to 1000 in order to choose a suitable k for our classifier based on accuracy. The drawback of this search is its training time: it takes many minutes on Jetson Nano. We therefore choose k following chapter 26 of [19], with the equation clarified in [20], which reduces to formula 5.1.

$K = \sqrt{\text{number of images}}$    (5.1)

Figure 5.6 Training IoU and loss graph of the detection model

5.3 Evaluation

5.3.1 Detection model

Table 5.3 compares the average precision of each class, table 5.4 gives the number of TP, TN, FP and FN, and table 5.5 uses the counts in table 5.4 to compute precision, recall and F1-score.

Class name    AP (%)    TP     FP
Mask          92.63     791    142
No mask       84.63     261    61

Table 5.3 Average precision for each class

         Positive    Negative
True     1052
False    203         115

Table 5.4 TP, TN, FP and FN for confidence threshold = 0.25, with average IoU = 65.40 %

Precision                            0.84
Recall                               0.90
F1-score                             0.87
Mean average precision (mAP@0.50)    0.888

Table 5.5 Precision, recall, F1-score and mAP for confidence threshold = 0.25

5.3.2 Face recognition

During training, we also evaluate the model on the LFW dataset and obtain the result shown in figure 5.7, with a final accuracy of 99.07%.

Figure 5.7 Evaluation graph of FaceNet on LFW dataset

5.3.3 SORT

The evaluation below is taken from [13] and compares the performance of SORT with that of other trackers. SORT appears to outperform the other trackers in terms of MOTA.

Figure 5.8 Evaluating SORT

5.3.4 KNN classifier

To evaluate the classifier, we plot the confusion matrix in figure 5.9 and compute the precision, recall and F1-score listed in table 5.6. The results are good, with 91% accuracy, and the classes are clearly distinguished.

Figure 5.9 Confusion matrix at K=24. There are 10 people in the dataset.

Class           Precision    Recall    F1-score    Support
Cuong           1.00         0.90      0.95        10
Dat             0.91         1.00      0.95        10
Duy             0.75         0.90      0.82        10
Hung            0.82         0.90      0.86        10
Linh            0.90         0.90      0.90        10
Manh            1.00         0.89      0.94        10
Minh            1.00         0.91      0.95        10
Nam             0.90         0.90      0.90        10
Nien            0.90         0.90      0.90        10
Tri             1.00         0.90      0.95        10
Accuracy                               0.91        100
Macro avg       0.92         0.91      0.91        100
Weighted avg    0.92         0.91      0.91        100

Table 5.6 Evaluating classifier

5.4 Result

5.4.1 Models

The models work well in a particular room. In detail, face recognition works well at short range. Face detection also performs well, but we only want it to detect faces or face masks at medium range. The speed of the whole system is not high: the lowest recorded frame rate is about 8 fps. The accuracy threshold of the
detector equals 0.7, and that of the recognizer equals 0.6.

Figure 5.10 The final result

We encounter serious backlighting when we test in the daytime with the camera placed in front of the door. Some software solutions exist, but they reduce the speed of the system, so we decide to adjust the position of the cameras instead. If two people pass through the workspace at the same time, the system works relatively well, but the fps fluctuates over a wide range, which affects the detector. This makes the recorded data discontinuous, but the issue is handled by the logic in the flowchart. The distance parameter in the tracker is set to a small value to avoid confusion between bounding boxes. Moreover, the workspace is small, so the best way to make students go in or out one by one is to use line barrier poles.

5.4.2 GUI

The GUI connects to the Firebase server well, and all functions work correctly. Users need to log in before they can monitor and control the system. The GUI also allows users to modify, add, and remove values in the database and to add new face data to the local dataset. Recorded data is not lost even when the Internet connection drops during transmission. Users should log out afterwards to maintain system security.

Figure 5.11 The GUI requires login before controlling and monitoring the system

Figure 5.12 Retrieval of database

In figure 5.12, the columns are, respectively, the recorded date, student name, value (false: attend, true: absent), cumulative time sum, and note. In the note column, “auto” marks results generated automatically, and “lack info” marks results generated despite a lack of information.

5.4.3 Database

As the figures below show, all data is recorded correctly in the desired fields and tables.

Figure 5.13 The result after updating data into Firebase. (a) The result in the attendance table. (b) The result in the check table.

At a particular time every day, the system generates an attendance table recording the cumulative time that students
study in the classroom. The system sends warning emails, using the information declared at setup, for absent students or students with little cumulative time in the classroom. The email form is shown in figure 5.15.

Figure 5.14 The result in the student list after updating

Figure 5.15 Send the warning email

5.4.4 Website

The website updates and displays the correct parameters from the database. The results are presented in figure 5.16.

Figure 5.16 The retrieved data from the database on 17 August

5.5 Comparison

5.5.1 Detection model

The comparison is conducted on Jetson Nano. The results show that the TensorRT model is faster than the original model (23 fps versus 8.5 fps) at the cost of lower accuracy (mAP of 0.79 versus 0.888).

Type of model                      Frame per second (FPS)    Memory space (MB)    mAP @ IoU=0.5
YOLO Tiny v4 Original (Float32)    8.5                       22.3                 0.888
YOLO Tiny v4 TensorRT (Float16)    23                        30.1                 0.79

Table 5.7 Comparison between the original model and the TensorRT model

5.5.2 Recognition model

Reducing the embedding size contributes to a significant increase in fps. Moreover, the OpenFace model is also superior to the other models in this respect.

Type of model                        Frame per second (FPS)    Memory space (MB)    Accuracy
Original (512 embeddings) [21]                                 91.3                 0.996
FaceNet TFLite (512 embeddings)                                89.5                 0.97
FaceNet TensorRT (128 embeddings)                              30.1                 0.95

Table 5.8 Comparison between the models with OpenFace model

CHAPTER 6 CONCLUSION AND FUTURE WORK

In conclusion, in this thesis our team designed and implemented an attendance and student monitoring system using image processing and artificial intelligence that works at a basic level in a particular room. The longer-term aim of this work is for the system to operate smoothly in all working environments. The functions of the system include “face detection”, “face mask detection”, “face tracking”, “face recognition”, “database management”, and “graphical user interface”. This thesis proposes using face tracking after the face detector to improve the performance of the face recognition system in real time. In
other words, each person is recognized only once when passing through the door. Besides, we suggest how to manage data for the attendance system in the database. Furthermore, the GUI is designed to make the system more user-friendly. The detection model is run with TensorRT to optimize the inference time of the model.

To push the work further and commercialize it, our team will improve the system through the tasks listed below:
▪ Integrate more features into the GUI, including saving operating videos to retrieve the history of check-in and check-out, and retrieving the check table and student list pages from the Firebase database
▪ Build an app and a website to monitor and control the system and to retrieve and modify data remotely
▪ Scale the current system to multiple classes
Other work will be identified during future implementation.

REFERENCE

[1] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” Apr. 2018, doi: 10.48550/arxiv.1804.02767.
[2] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” Apr. 2020, doi: 10.48550/arxiv.2004.10934.
[3] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely Connected Convolutional Networks,” Proc. - 30th IEEE Conf. Comput. Vis. Pattern Recognition, CVPR 2017, vol. 2017-January, pp. 2261–2269, Aug. 2016, doi: 10.48550/arxiv.1608.06993.
[4] C. Y. Wang, H. Y. Mark Liao, Y. H. Wu, P. Y. Chen, J. W. Hsieh, and I. H. Yeh, “CSPNet: A New Backbone that can Enhance Learning Capability of CNN,” IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. Work., vol. 2020-June, pp. 1571–1580, Nov. 2019, doi: 10.48550/arxiv.1911.11929.
[5] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 8691 LNCS, no. PART 3, pp. 346–361, Jun. 2014, doi: 10.1007/978-3-319-10578-9_23.
[6] L. Xu, J. Huang, A. Nitanda, R. Asaoka, and K. Yamanishi, “A Novel
Global Spatial Attention Mechanism in Convolutional Neural Network for Medical Image Classification,” Jul. 2020, doi: 10.48550/arxiv.2007.15897.
[7] Z. Yao, Y. Cao, S. Zheng, G. Huang, and S. Lin, “Cross-Iteration Batch Normalization,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 12326–12335, Feb. 2020, doi: 10.48550/arxiv.2002.05712.
[8] Z. Q. Wen and Z. X. Cai, “Mean shift algorithm and its application in tracking of objects,” Proc. 2006 Int. Conf. Mach. Learn. Cybern., vol. 2006, pp. 4024–4028, 2006, doi: 10.1109/ICMLC.2006.258803.
[9] Z. Jiang, L. Zhao, S. Li, and Y. Jia, “Real-time object detection method based on improved YOLOv4-tiny,” Nov. 2020, doi: 10.48550/arxiv.2011.04244.
[10] V. Kazemi and J. Sullivan, “One millisecond face alignment with an ensemble of regression trees,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 1867–1874, Sep. 2014, doi: 10.1109/CVPR.2014.241.
[11] “i·bug - resources - 300 Faces In-the-Wild Challenge (300-W), ICCV 2013.” https://ibug.doc.ic.ac.uk/resources/300-W/ (accessed Aug. 05, 2022).
[12] F. Schroff, D. Kalenichenko, and J. Philbin, “FaceNet: A Unified Embedding for Face Recognition and Clustering,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 07-12-June-2015, pp. 815–823, Mar. 2015, doi: 10.1109/CVPR.2015.7298682.
[13] A. Bewley, Z. Ge, L. Ott, F. Ramos, and B. Upcroft, “Simple Online and Realtime Tracking,” Proc. - Int. Conf. Image Process. ICIP, vol. 2016-August, pp. 3464–3468, Feb. 2016, doi: 10.1109/ICIP.2016.7533003.
[14] “The Hungarian Method for the Assignment Problem.” https://www.researchgate.net/publication/239059434_The_Hungarian_Method_for_the_Assignment_Problem (accessed Aug. 05, 2022).
[15] T. Basar, “A New Approach to Linear Filtering and Prediction Problems,” Control Theory, Feb. 2010, doi: 10.1109/9780470544334.CH9.
[16] “Face Mask Detection | Kaggle.” https://www.kaggle.com/datasets/andrewmvd/face-mask-detection (accessed Aug. 05, 2022).
[17] “CASIA-WebFace Dataset | Papers With Code.”
https://paperswithcode.com/dataset/casia-webface (accessed Aug. 05, 2022).
[18] “LFW Face Database : Main.” http://vis-www.cs.umass.edu/lfw/ (accessed Aug. 05, 2022).
[19] L. Devroye, L. Györfi, and G. Lugosi, “A Probabilistic Theory of Pattern Recognition,” vol. 31, 1996, doi: 10.1007/978-1-4612-0711-5.
[20] “Rates of Convergence for Nearest Neighbor Classification.” https://papers.nips.cc/paper/2014/hash/db957c626a8cd7a27231adfbf51e20eb-Abstract.html (accessed Aug. 07, 2022).
[21] “GitHub - davidsandberg/facenet: Face recognition using Tensorflow.” https://github.com/davidsandberg/facenet (accessed Aug. 05, 2022).
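
As a worked check on the detection results in section 5.3.1, the precision, recall and F1-score of table 5.5 can be recomputed from the TP, FP and FN counts of table 5.4. The sketch below is illustrative only and is not taken from the thesis code; the function name detection_metrics is an assumption.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall and F1-score from detection counts."""
    precision = tp / (tp + fp)   # share of predicted boxes that are correct
    recall = tp / (tp + fn)      # share of ground-truth boxes that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Counts reported in table 5.4 (confidence threshold 0.25)
metrics = detection_metrics(tp=1052, fp=203, fn=115)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the values agree with table 5.5 (0.84, 0.90 and 0.87).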

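
The value K = 24 used for the confusion matrix in section 5.3.4 is consistent with the common square-root rule of thumb, K ≈ √N, where N is the number of training images. Reading formula 5.1 this way is an assumption, and so is the split of 600 training images (the 700 images in total minus the 100 test samples listed as support in the classifier evaluation). Under those assumptions, a minimal sketch with a hypothetical helper choose_k:

```python
import math


def choose_k(num_training_images: int) -> int:
    """Square-root rule of thumb for K (assumed reading of formula 5.1)."""
    return max(1, math.isqrt(num_training_images))


total_images = 700  # dataset size reported in the thesis
test_images = 100   # total support in the classifier evaluation
k = choose_k(total_images - test_images)  # assumes a 600/100 split
print(k)
```

Compared with sweeping k up to 1000, this rule needs no training runs at all, which matters on a device such as Jetson Nano.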
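
The macro averages in the classifier evaluation (table 5.6) can be reproduced from the per-class scores. The sketch below is a plain arithmetic check, not the thesis code; the variable names are assumptions. Because every class has the same support (10), the weighted average equals the macro average.

```python
from statistics import mean

# Per-class scores as listed in table 5.6 (Cuong ... Tri)
precision = [1.00, 0.91, 0.75, 0.82, 0.90, 1.00, 1.00, 0.90, 0.90, 1.00]
recall = [0.90, 1.00, 0.90, 0.90, 0.90, 0.89, 0.91, 0.90, 0.90, 0.90]
f1_score = [0.95, 0.95, 0.82, 0.86, 0.90, 0.94, 0.95, 0.90, 0.90, 0.95]

# Macro average: unweighted mean over the 10 classes
macro = {
    "precision": mean(precision),
    "recall": mean(recall),
    "f1": mean(f1_score),
}
for name, value in macro.items():
    print(f"macro {name}: {value:.2f}")
```

Rounded to two decimals, the results match the macro-average row of the table (0.92, 0.91 and 0.91).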