
Service robot for students based on computer vision and natural language processing



MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT
AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY

SERVICE ROBOT FOR STUDENTS BASED ON COMPUTER VISION AND NATURAL LANGUAGE PROCESSING

LECTURER: ASSOC PROF PHD LE MY HA
STUDENT: NGUYEN TUAN THANH

SKL009325

Ho Chi Minh City, August 2022

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION
FACULTY FOR HIGH QUALITY TRAINING

GRADUATION PROJECT

SERVICE ROBOT FOR STUDENTS BASED ON COMPUTER VISION AND NATURAL LANGUAGE PROCESSING

NGUYỄN TUẤN THANH
Student ID: 17151028
Major: AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY
Advisor: Assoc Prof PhD LE MY HA

Ho Chi Minh City, August 2022

Faculty for High Quality Training – HCMC University of Technology and Education

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness

Ho Chi Minh City, August 6th, 2022

GRADUATION PROJECT ASSIGNMENT

Student name: Nguyen Tuan Thanh
Student ID: 17151028
Major: Automation and Control Engineering Technology
Class: 17151CLA1
Advisor: Assoc Prof PhD Le My Ha
Phone number: 0938811201
Date of assignment: Feb 21st, 2022
Date of submission: August 6th, 2022

Project title: Service robot for students based on computer vision and natural language processing

Initial materials provided by the advisor: References, reference programs, data sets, expected parameters of the robot

Content of the project:
- Design and implement a service robot with two functions: chat and talk
- Apply computer vision to identify whether the user is wearing a mask and to identify user information
- Apply natural language processing in a virtual voice assistant to communicate with humans
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans
- Build a database and collect more data while communicating with humans

Final product: A finished service robot that can recognize humans with high accuracy and communicate with them using a given knowledge database

CHAIR OF THE PROGRAM (Sign with full name)
ADVISOR (Sign with full name)

Ho Chi Minh City, August 6, 2022

ADVISOR'S EVALUATION SHEET

Student name: Nguyen Tuan Thanh
Student ID: 17151028
Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language processing
Advisor: Assoc Prof PhD Le My Ha

EVALUATION

Content of the project:
- Design and implement a service robot with two functions: chat and talk
- Apply computer vision to identify whether the user is wearing a mask and to identify user information
- Apply natural language processing in a virtual voice assistant to communicate with humans
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans
- Build a database and collect more data while communicating with humans

Strengths:
Weaknesses:
Approval for oral defense? (Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ………… (in words: )

Ho Chi Minh City, August 6th, 2022
ADVISOR (Sign with full name)

Ho Chi Minh City, August 6, 2022

PRE-DEFENSE EVALUATION SHEET

Student name: Nguyen Tuan Thanh
Student ID: 17151028
Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language processing
Name of Reviewer:

EVALUATION

Content of the project:
- Design and implement a service robot with two functions: chat and talk
- Apply computer vision to identify whether the user is wearing a mask and to identify user information
- Apply natural language processing in a virtual voice assistant to communicate with humans
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans
- Build a database and collect more data while communicating with humans

Strengths:
Weaknesses:
Approval for oral defense? (Approved or denied)
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ………… (in words: )

Ho Chi Minh City, August 6th, 2022
REVIEWER (Sign with full name)

EVALUATION SHEET OF DEFENSE COMMITTEE MEMBER

Student name: Nguyen Tuan Thanh
Student ID: 17151028
Major: Automation and Control Engineering Technology
Project title: Service robot for students based on computer vision and natural language processing
Name of Defense Committee Member:

EVALUATION

Content of the project:
- Design and implement a service robot with two functions: chat and talk
- Apply computer vision to identify whether the user is wearing a mask and to identify user information
- Apply natural language processing in a virtual voice assistant to communicate with humans
- Apply the Natural Language Toolkit (NLTK) to build a chatbot to communicate with humans
- Build a database and collect more data while communicating with humans

Strengths:
Weaknesses:
Overall evaluation: (Excellent, Good, Fair, Poor)
Mark: ………… (in words: )

Ho Chi Minh City, August 6th, 2022
COMMITTEE MEMBER (Sign with full name)

Graduation Thesis

ACKNOWLEDGEMENT

In the process of completing the graduation project, in addition to my own efforts, I received a great deal of support and dedicated help.

First, I would like to express my deep gratitude to Associate Professor Dr Le My Ha, who has been a teacher, a supporter, and an inspiration for me in completing this thesis. He guided me to the right topic and showed me how to carry it out, and he gave objective feedback that helped me prepare to defend the thesis in front of the council. I feel very fortunate to have worked with him.

Next, I would like to thank the Faculty of Electrical and Electronic Engineering as well as the Faculty for High Quality Training for imparting useful knowledge during my four years at the university. This knowledge plays a fundamental role in the implementation of my graduation
thesis.

In addition, I would also like to thank the Intelligent Systems Laboratory (ISLAB) of the Faculty of Electrical and Electronic Engineering for supporting me with facilities as well as useful knowledge during the project. I also owe deep thanks to my friend Tran Thanh Hung, who supported and guided me in developing this topic.

Finally, I would like to thank my family for always supporting, caring for, and motivating me to complete the project in the best possible way.

Ho Chi Minh City, August 6th, 2022
Student

Table of Contents

CHAPTER 1: INTRODUCTION
  1.1 Define a problem
  1.2 Project objectives
  1.3 Project task
  1.4 Project scopes
  1.5 Approach and research
  1.6 Project description
CHAPTER 2: LITERATURE REVIEW
  2.1 Survey of robots being used in service industry
    2.1.1 Mission of robots in the service industry
    2.1.2 Pepper robot
  2.2 Background of face recognition system
    2.2.1 Concept
    2.2.2 Structure and procedure for face recognition
    2.2.3 Face Detection
  2.3 Color spaces in image processing
    2.3.1 RGB color space (Red-Green-Blue)
    2.3.2 HSV color space (Hue-Saturation-Value)
  2.4 Histogram of Oriented Gradients algorithm
  2.5 Support Vector Machine algorithm
  2.6 Background of speech recognition system
    2.6.1 Concept
    2.6.2 Speech Recognition
    2.6.3 Applications
  2.7 Framework and libraries
    2.7.1 Framework Pytorch
    2.7.2 Pandas
    2.7.3 Numpy
  2.8 Voice Assistant
  2.9 ChatBot
CHAPTER 3: SYSTEM DESIGN AND CONSTRUCTION
  3.1 Requirements of the system
  3.2 System description
    3.2.1 The block diagram of the system
    3.2.2 The function of each block
  3.3 System design
    3.3.1 Face detection
    3.3.2 Face recognition and identification
    3.3.3 Face mask detection
    3.3.4 Speech recognition and voice assistant
    3.3.5 Chatbot
CHAPTER 4: EXPERIMENT RESULTS, FINDINGS AND ANALYSIS
  4.1 Face detection
  4.2 Face recognition and identification
    4.2.1 Training image data
    4.2.2 Performing the face recognition
  4.3 Face mask detection
  4.4 Speech recognition and voice assistant
  4.5 Chatbot
    4.5.1 Create Training Data
    4.5.2 NLP Basics
    4.5.3 Complete chatbot
  4.6 User interface
CHAPTER 5: CONCLUSIONS AND DIRECTIONS OF DEVELOPMENT
  5.1 Conclusion
  5.2 Direction of development
REFERENCES

CHAPTER 4: EXPERIMENT RESULTS, FINDINGS AND ANALYSIS

In this chapter, the student tests the solutions proposed in Chapter 3 and assesses the system objectively against the original goals.

4.1 Face detection

To demonstrate the effectiveness of the mediapipe library mentioned in Chapter 3, the student ran experiments and measured the frame rate while detecting human faces. The results are shown in Figures 4.1 and 4.2.

Figure 4.1: Six facial features are displayed when a human face is detected, and the frame rate is measured

Figure 4.2: Detecting multiple faces in the same frame

The results show a frame rate above 60 fps during face detection, high detection accuracy, and the ability to detect several people in one frame. This meets the system requirements set by the student.

4.2 Face recognition and identification

4.2.1 Training image data

First, the student collects images of each person and saves them in a folder named after that person. Each image then passes through face detection using the Haar cascade, which extracts the face it contains. Finally, the extracted face images are used as input for LBPH training, so that each image is assigned a corresponding ID identifying the person. The whole process is shown in Figure 4.3.

Figure 4.3: The process of training image data

4.2.2 Performing the face recognition

After training, the student tests the system in real time on the camera. The system takes an image from the camera, extracts the face using the Haar cascade algorithm, and uses LBPH face recognition to compare its histogram with those of the images in the training set. If the training image with the closest histogram meets the threshold set by the student, the ID of that image is assigned to the image taken from the camera. The result is shown in Figure 4.4.

Figure 4.4: Username recognition and display

The results show that face recognition and identification using LBPH gives good recognition results, and the accuracy is broadly in line with the goals set by the student. The results are occasionally wrong, owing to factors such as poor-quality camera images, inconsistent training results, and possibly an unsuitable comparison threshold.

4.3 Face mask detection

The face is extracted from the camera image using the mediapipe library, and 68 facial landmarks are then detected on it. Next, the mouth is localized, and the average saturation of the mouth area in the HSV color space is calculated and compared with that of the whole face. If the average saturation of the mouth area is greater than the threshold set by the student, the system warns the user that no mask is being worn; otherwise no warning is given. The results are shown in Figures 4.5, 4.6, and 4.7.

Figure 4.5: Detecting 68 landmarks on the user's face

Figure 4.6: Bounding the mouth and warning when the user is not wearing a mask

Figure 4.7: When the user wears a mask, the system does not give an alert

The accuracy of detecting whether the user is wearing a mask depends on
the quality of the image taken from the camera; the brightness of the surrounding environment also affects the average saturation calculation and can distort the result. The threshold therefore has to be adjusted appropriately to keep the results reliable.

4.4 Speech recognition and voice assistant

First, the student collects question-and-answer data and saves it in an Excel file. Some examples from the data set are shown in Table 4.1.

Table 4.1: Sample data collected by the student

Question: Trường có bãi giữ xe
Answer: Các bãi giữ xe nằm cạnh giảng đường khu A, B, D, E

Question: Trường có khoa
Answer: Trường Đại Học Sư Phạm Kỹ Thuật Thành Phố Hồ Chí Minh có tổng cộng mười ba khoa viện sư phạm kĩ thuật

Question: Trường thành lập năm
Answer: Trường Đại Học Sư Phạm Kỹ Thuật Thành Phố Hồ Chí Minh thành lập vào năm 1962

Question: Đăng ký lấy giấy xác nhận sinh viên đâu
Answer: Bạn đăng ký giấy xác nhận sinh viên trang online cá nhân, sau nhận giấy phòng công tác tuyển sinh sinh viên tầng khu A

Question: Đăng ký xin phiếu điểm
Answer: Bạn xin phiếu điểm phòng công tác tuyển sinh sinh viên tầng khu A

Question: Vị trí khoa điện
Answer: Khoa Điện Điện tử nằm khu C khu D

Question: Phòng thí nghiệm hệ thống thông minh ISLAB đâu
Answer: Phòng thí nghiệm hệ thống thông minh phòng C103 quản lý thầy Lê Mỹ Hà

Question: Có tuyến xe buýt qua trường
Answer: Có nhiều tuyến xe buýt qua trường xe số 8, số 56, số 6, số 89, số 141, số 611, số 99 Bạn dùng bus map để tra cứu đường

Next, the system receives questions from the user. In this step, as mentioned in Chapter 3, the student uses the Speech Recognition package in Python to convert speech into text. The system then compares the recognized text with the questions in the data set and calculates a matching accuracy. If the accuracy for a question is greater than the threshold set by the student, the system takes the answer to that question and uses the virtual assistant to speak it to the user. Figure 4.8 illustrates the process of identifying and
answering questions from users.

Figure 4.8: Identifying and answering a user's question when the question is in the data set

When the question is not in the data set, the system asks the user to repeat the question three times and then answers that the question is not in its data. An example is given in Figure 4.9.

Figure 4.9: Identifying and answering a user's question when the question is not in the data set

Questions that are not in the data set are saved to the Excel file so that the student can later add them to the question file. This process is illustrated in Figure 4.10.

Figure 4.10: Saving unknown questions to the unknown-question sheet in Excel

To end the conversation, the user says "cảm ơn" ("thank you"), and the system ends the program automatically.

In addition, the student compared the speed of the two speech libraries used for the virtual assistant, gtts and pyttsx3. The comparison uses the same speech recognition setup and a stable Internet connection. As seen in Figures 4.11 and 4.12, pyttsx3 responds much faster than gtts, owing to the advantages of pyttsx3 discussed earlier in the thesis.

Figure 4.11: Relative calculation of the response speed of the gtts library

Figure 4.12: Relative calculation of the response speed of the pyttsx3 library

4.5 Chatbot

4.5.1 Create Training Data

The student collects training data and puts it in a JSON file with the structure shown in Figure 4.13.

Figure 4.13: Training data made by the student

As can be seen, the intents contain many tags. Each tag represents a Q&A topic and includes many patterns and responses. This structure is well suited to training because each person has a different way of asking the same question, so the student collects many different ways of asking for
better training data.

4.5.2 NLP Basics

First, the system collects all the patterns in the training data set and tokenizes them. Figure 4.14 shows this process in detail.

Figure 4.14: Tokenizing all questions from the data file

Next, the system removes punctuation marks and converts all tokens to lowercase. The result is shown in Figure 4.15.

Figure 4.15: All tokenized words in lowercase with punctuation characters removed

Then the system removes duplicate words and sorts the remaining words in alphabetical order. The result is shown in Figure 4.16.

Figure 4.16: All words after removing duplicates and sorting

Finally, the system carries out the bag-of-words step for each pattern. Each question becomes a separate binary array, and the desired tag is the output that corresponds to that array. An example is given in Figure 4.17.

Figure 4.17: Example of the bag of words for all patterns

4.5.3 Complete chatbot

The NLP processing results are fed into a feed-forward neural network for training, with the expectation that each bag of words produces the corresponding tag. When the user then enters a question, the system applies the same NLP steps as above and determines which tag the resulting bag of words matches most closely. The student experiments to choose an appropriate threshold for the system to answer. If the match score for a question is lower than the threshold, the system records that question in the unknown data set, just as in the voice communication described above.

In addition, to make communication more convenient for users, the student built an interface for the chatbot, shown in Figure 4.18.

Figure 4.18: Chatbot interface

4.6 User interface

After completing the two functions, communication by talking and communication by chatting, the student designed an interface that gives the user a choice between them when using this service robot. The
interface is shown in Figure 4.19.

Figure 4.19: User interface designed by the student

This interface is built in C# and connects the two separate Python programs to two buttons, "Talk" and "Chat", for the user's convenience.

CHAPTER 5: CONCLUSIONS AND DIRECTIONS OF DEVELOPMENT

5.1 Conclusion

The completed project has addressed the problems raised by the student concerning face detection, face recognition and identification, face mask detection, speech recognition, the voice assistant, and the chatbot. It is also a small study contributing to the use of robots to assist people, in the university environment in particular and in other fields in general. In addition, given the current COVID pandemic, human-to-human contact needs to be limited, so a robot that supports students can both answer their questions and meet that requirement.

However, this project still has many limitations to be solved:

- In the face recognition and identification part, the system misrecognizes or fails to recognize a user when that user's training images are few, of poor quality, or cover few facial expressions. In addition, the image taken from the camera for comparison with the training data may be affected by the brightness of the surrounding environment, making the image quality unstable.
- In the face mask detection part, the accuracy depends on the brightness of the surrounding environment and on choosing an appropriate threshold value. The system designer therefore has to readjust the threshold every time the ambient light changes.
- In the speech recognition part, voice recognition depends on the Internet, so a noisy environment or an unstable Internet connection causes the system to recognize the question incorrectly, which affects its answer.

5.2 Direction of development

In the future, the student will improve the system in the following ways:

- Improve the accuracy of face recognition, identification, and face mask detection so that they are less affected by ambient light and are more stable.
- Improve speech recognition so that it does not depend on the Internet and is robust to noise.
- Offer suggestions to users for questions whose answers are not in the data set, such as looking up information on the website.
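
The preprocessing steps described in Section 4.5.2 (tokenize, lowercase and strip punctuation, remove duplicates and sort, then build a binary bag of words) can be illustrated with a minimal sketch. This is not the student's code: the thesis uses NLTK for tokenization, whereas this sketch assumes plain whitespace splitting so that it runs with the standard library only. The function names and the two example patterns are hypothetical.

```python
import string


def tokenize(sentence):
    """Split a sentence into tokens (whitespace split; a stand-in for NLTK)."""
    return sentence.split()


def normalize(tokens):
    """Lowercase each token, strip ASCII punctuation, drop empty results."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = (token.lower().translate(table) for token in tokens)
    return [token for token in cleaned if token]


def build_vocabulary(patterns):
    """Collect the unique normalized words of all patterns, sorted alphabetically."""
    words = set()
    for pattern in patterns:
        words.update(normalize(tokenize(pattern)))
    return sorted(words)


def bag_of_words(sentence, vocabulary):
    """Return a binary list: 1 where the vocabulary word occurs in the sentence."""
    present = set(normalize(tokenize(sentence)))
    return [1 if word in present else 0 for word in vocabulary]


# Hypothetical example patterns, not taken from the thesis data set.
patterns = ["Where is the library?", "Is the library open today?"]
vocabulary = build_vocabulary(patterns)
vector = bag_of_words("Where is the library", vocabulary)

print(vocabulary)  # ['is', 'library', 'open', 'the', 'today', 'where']
print(vector)      # [1, 1, 0, 1, 0, 1]
```

Each pattern's vector, paired with its tag, would then serve as one training sample for the feed-forward network described in Section 4.5.3.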

Date posted: 25/05/2023, 12:21
