MINISTRY OF EDUCATION AND TRAINING
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY AND EDUCATION

GRADUATION THESIS
AUTOMATION AND CONTROL ENGINEERING TECHNOLOGY

DESIGN CHAT-BOT FOR PLANNING TASK BY DIALOGFLOW AND GOOGLE CALENDAR

SUPERVISOR: NGUYEN MANH HUNG
STUDENT: TRAN MINH DUC
STUDENT ID: 14151028
STUDENT: THANH QUANG HIEP
STUDENT ID: 14151034

HO CHI MINH CITY, 18 JULY 2018

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
***
Ho Chi Minh City, 18 July 2018

TASK OF THE THESIS

Full name of student 1: Tran Minh Duc
Student ID: 14151028
Full name of student 2: Thanh Quang Hiep
Student ID: 14151034
Major: Automation and Control Engineering Technology
Class: 14151CLC
Full name of supervisor: Nguyen Manh Hung, Ph.D.
Phone number: 0937421386

Name of the thesis: Design chat-bot for planning task by Dialogflow and Google Calendar
Initial figures and documents:
The content of the thesis:
- Choosing the approach to extract information from text requests (NLTK, Stanford CoreNLP, and Dialogflow)
- Manually collecting data for the training task
- Designing the intent system for the planning task in Dialogflow
- Choosing the training methods for the agent
- Using Google Calendar APIs to do tasks in Google Calendar
- Building a back-end system to process the data from Dialogflow and Google Calendar API
- Evaluating the effectiveness of the NLU model and comparing the agent with Google Assistant
Product:
- Dialogflow Agent
- Planning Assistant program

Dean of Faculty
Supervisor
Faculty of High Quality - HCMUTE

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
*******
SUPERVISOR’S COMMENT SHEET

Full name of student 1: Tran Minh Duc
Student ID: 14151028
Full name of student 2: Thanh Quang Hiep
Student ID: 14151034
Major: Automation and Control Engineering Technology
Class: 14151CLC
Full name of supervisor: Nguyen Manh Hung, Ph.D.
Name of the thesis: Design chat-bot for planning task by Dialogflow and Google Calendar

COMMENT
About thesis’s contents:
Advantage:
Disadvantage:
Propose defending thesis?
Rating:
Mark: ………………. (In writing: )
Ho Chi Minh City, July 2018
Supervisor

THE SOCIALIST REPUBLIC OF VIETNAM
Independence – Freedom – Happiness
*******
REVIEWER’S COMMENT SHEET

Full name of student 1: Tran Minh Duc
Student ID: 14151028
Full name of student 2: Thanh Quang Hiep
Student ID: 14151034
Major: Automation and Control Engineering Technology
Class: 14151CLC
Full name of supervisor: Nguyen Manh Hung, Ph.D.
Name of the thesis: Design chat-bot for planning task by Dialogflow and Google Calendar

COMMENT
About thesis’s contents:
Advantage:
Disadvantage:
Propose defending thesis?
Rating:
Mark: ………………. (In writing: )
Ho Chi Minh City, July 2018
Reviewer

ABSTRACT

Virtual assistants (or chat-bots) are very popular nowadays. The technology behind them is Natural Language Processing (NLP), a specialized area of Artificial Intelligence (AI). Some of the most famous assistants are Apple’s Siri, Google Assistant, and Amazon’s Alexa. People use digital assistants to do daily tasks such as setting an alarm, making a call, or searching for information. In this project, we create an assistant for scheduling plans that can do more tasks than Google Assistant does. Specifically, Plan Assistant accepts requests from users to set up, delete, or change a plan on Google Calendar. It also has a system that recommends a suitable schedule so that users can finish their tasks smoothly. This chat-bot overcomes a limitation of Google Assistant: it can set a plan over a wider range of time, compared with only one day when the plan is set through Google Assistant. In addition, this app is well suited to managers, because they can set multiple, reasonably scheduled tasks and keep track of their plans everywhere with Google Calendar. To build this application, we use Dialogflow, a Google toolkit for building chat-bots, and Google Calendar API, a Google toolkit for processing calendars in Google Calendar, then
combine them with our back-end system. Overall, processing tasks on Google Calendar works really well with the help of the Google API. In contrast, due to a lack of data, Plan Assistant still has many limitations: it can answer only some simple requests and can fail on small changes in sentence structure. In the future, the understanding of chat-bots is expected to improve considerably thanks to Deep Learning with Recurrent Neural Networks (RNN), and at that time we may have chat-bots that talk like humans.

ACKNOWLEDGMENT

We are deeply thankful to Dr. Nguyen Manh Hung for his support and consultation in giving us ideas, solutions, and knowledge to run our project. With his help, we have gained a good general comprehension of AI and NLP to pursue our future career goals. Finally, we thank all of our friends, who have been our motivation during this study.

TABLE OF CONTENTS

TASK OF THE THESIS
SUPERVISOR’S COMMENT SHEET
REVIEWER’S COMMENT SHEET
ABSTRACT
ACKNOWLEDGMENT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
Section 1: INTRODUCTION
1.1 CHAT-BOT
1.1.1 Definition
1.1.2 History of chat-bots
1.1.3 Applications:
1.1.4 Chat-bot technology:
1.2 MACHINE LEARNING
1.2.1 Definition:
1.2.2 History of Machine Learning:
1.2.3 Advantage of Machine Learning:
1.2.4 Type of Machine Learning tasks:
1.2.5 Application of Machine Learning:
1.3 NATURAL LANGUAGE PROCESSING
1.3.1 Definition
1.3.2 Extracting information from text:
1.4 Google Calendar
1.4.1 Introduction
1.4.2 Features
1.5 APPLICATION PROGRAMMING INTERFACE (API)
1.5.1 Definition
1.5.2 API nowadays
Section 2: MATERIALS AND METHODS
2.1 SYSTEM STRUCTURE
2.2 DIALOGFLOW
2.2.1 Introduction
2.2.2 Advantage of Dialogflow:
2.2.3 Case study: Domino’s pizza
2.2.4 How Dialogflow chat-bot works?
2.2.5 Features
2.3 Google Calendar API
2.3.1 Overview
2.3.2 Authorization
2.3.3 Methods for event in Google Calendar
2.4 BACK-END SYSTEM
2.4.1 Overall
2.4.2 Libraries:
2.4.3 Functions:
2.4.2 Functions:
Section 3: RESULT
3.1 DEMO
3.1.1 Feature:
3.2 TEST RESULTS:
Section 4: DISCUSSION
4.1 NLU SYSTEM’S TRAINING LIMITATION
4.2 FUTURE WORKS
Section 5: CONCLUSION
Section 6: REFERENCES

LIST OF FIGURES

Figure 1.1.1 Early Chat-bot components
Figure 1.1.2.a Generative model architecture
Figure 1.1.2.b Retrieval-based models architecture
Figure 1.2.1 The traditional approach
Figure 1.2.2 The ML-based approach
Figure 1.2.3 Regression example
Figure 1.2.4 Classification example: Spam mail filter
Figure 1.2.5 Clustering example
Figure 1.3.1 The simple pipeline of information extraction
Figure 1.4.1 Google Calendar UI
Figure 1.5.1 API definition
Figure 2.1.1 System structure
Figure 2.2.1 Domino’s pizza, a big brand in the pizza business
Figure 2.2.2 Domino’s chat-bot example
Figure 2.2.3 Dialogflow chat-bot model architecture
Figure 2.2.4 Handling of a user request. An agent encompasses the Dialogflow components (Note: DB: Database)
Figure 2.2.5 Dialogflow’s first agent UI
Figure 2.2.6 The setting screen for an agent
Figure 2.2.7 Training phrases section
Figure 2.2.8 Editing entities
Figure 2.2.9 Response section in Intent
Figure 2.2.10 Contexts section
Figure 2.2.11.a Contexts in intent Set alarm
Figure 2.2.11.b Contexts in intent Set alarm-cancel

Duplicated event: If an event given by the user duplicates an existing one, the chatbot shows the plan for that day so that the user can choose to delete the old event or set the plan at a different time. Google Calendar and the extracted information remain unchanged (as shown in Figure 3.1.6).

Figure 3.1.6 Interaction for duplicated event

3.1.1.3.2 Remove events:
To remove events, the user needs to provide the exact date and timeS to start searching for plans
in that period and then delete them. If timeE is not provided, it defaults to 23:59:59 of the same day, which means everything on that day from timeS onward is deleted. If timeS is not provided, the chatbot asks for it. The user needs to confirm before the plans are removed.
Example: I have created multiple plans for July 14th (today). Then I ask “Delete plan today”. The example and its result are shown in Figure 3.1.7.

Figure 3.1.7.a Google Calendar before removing
Figure 3.1.7.b Google Calendar after removing

Google Calendar: All plans listed on July 14th from 9:30 am are removed.

3.1.1.3.3 Check event:
You can also see what you have planned for a day by giving the chatbot the day you want to check (as shown in Figure 3.1.8).

Figure 3.1.8 Comparing chatbot and Google Calendar

3.2 TEST RESULTS:
In this part, we evaluate our NLU system with a method based on the test method in [23].

Table 3.1: Test results (Hybrid training)
User message | Intent | Time (Dialogflow) | Date (Dialogflow) | Datetime (back-end) | Intent confidence score
Make a plan at 4am | T | T | T | T | 0.92
Make a plan at | T | T | T | T | 0.92
Make a plan am | T | T | T | T |
Make a plan 7am | T | T | T | T | 0.92
Call me up at | T | T | T | T |
Go to school from 7am to 4pm | T | T | T | T | 0.93
Make an alarm at 7am | T | T | T | T |
Make alarm | T | T | T | T | 0.92
Alarm 6am | T | T | T | T |
Watch a movie at 19:00 | T | T | T | T | 0.96
Meeting at 2pm | T | T | T | T |
Conference meeting at 13:00 | T | T | T | T | 0.96
Create a plan tomorrow 4am | F | N/A | N/A | N/A |
Make an alarm 10am | T | T | T | T | 0.92
Pick Harry up at 20:00 | T | T | T | T | 0.96
Pick up son at 4pm | T | T | T | T | 0.96
Notify me about the meeting at 14:00 tomorrow | T | T | T | T | 0.54
I want to make a date | F | N/A | N/A | N/A |
Arrange meeting a | T | N/A | N/A | N/A |
I wanna go shoping at 7pm | T | T | T | T | 0.83
I want alarm at 8am tomorrow | T | T | T | T |
I will get a gift from my friend next Monday 9am | T | T | T | T | 0.9
I need you to make a plan for me, to write 4000 words in days | T | N/A | N/A | N/A | 0.53
o’clock 30 meeting | T | T | T | T |
Meeting o’clock 30 tomorrow | T | N/A | N/A | N/A |
July 20th, 7am go to school | F | N/A | N/A | N/A |
Meeting o’clock 30 | T | T | T | T |

Intent true percentage: 88%
Time true percentage: 78%
Date true percentage: 78%
Note: The results taken from a user message change with the current time, which is included in the test result chart.

Table 3.2: Test results (ML only)
User message | Intent | Time (Dialogflow) | Date (Dialogflow) | Datetime (back-end) | Intent confidence score
Make a plan at 4am | T | T | T | T |
Make a plan at | T | T | T | T |
Make a plan am | T | T | T | T |
Make a plan 7am | T | T | T | T |
Call me up at | T | T | T | T |
Go to school from 7am to 4pm | T | T | T | T | 0.97
Make an alarm at 7am | T | T | T | T |
Make alarm | T | T | T | T |
Alarm 6am | T | T | T | T |
Watch movie at 19:00 | T | T | T | T |
Meeting at 2pm | T | T | T | T |
Conference meeting at 13:00 | T | T | T | T |
Create a plan tomorrow 4am | F | N/A | N/A | N/A |
Make an alarm 10am | T | T | T | T |
Pick Harry up at 20:00 | T | T | T | T |
Pick up son at 4pm | T | T | T | T |
Notify me about the meeting at 14:00 tomorrow | T | T | T | T | 0.56
I want to make a date | T | N/A | N/A | N/A | 0.74
Arrange meeting a | T | N/A | N/A | N/A |
I wanna go shoping at 7pm | T | T | T | T | 0.93
I want alarm at 8am tomorrow | T | T | T | T |
I will get a gift from my friend next Monday 9am | T | T | F | F | 0.82
I need you to make a plan for me, to write 4000 words in days | F | N/A | N/A | N/A |
o’clock 30 meeting | T | F | F | F |
Meeting o’clock 30 tomorrow | T | N/A | N/A | N/A |
July 20th, 7am go to school | F | N/A | N/A | N/A |
Meeting o’clock 30 | T | T | T | T |

Intent true percentage: 88%
Time true percentage: 70%
Date true percentage: 70%
Note: The results taken from a user message change with the current time, which is included in the test result chart.

Evaluate: The two training methods produce different results. The ML-only method has slightly better intent confidence scores than the Hybrid method but has difficulty recognizing unseen expressions. Overall, the Hybrid method gives better results. With certain built-in rules, the app is able to recognize more natural messages than it can without those rules. Although there are still some mistakes in recognizing date and time, this can be improved by collecting data from users.

Google Assistant is
Google’s service that helps Android users with simple tasks on their devices. One of the available features is adding a clock alarm. The following table compares the results of giving the same phrases to both our chat-bot and Google Assistant.

Table 3.3: Comparing to Google Assistant
User message | App | Google Assistant
Make a plan at 4am | T | F
Make a plan at | T | F
Call me up at | T | F
Go to school from 7am to 4pm | T | F
Make an alarm at 7am | T | T
Make alarm | T | T
Alarm 6am | T | T
Watch movie at 19:00 | T | F
Meeting at 2pm | T | F
Conference meeting at 13:00 | T | F
Create a plan tomorrow 4am | F | F
Make an alarm 10am | T | F
Pick Harry up at 20:00 | T | F
Pick up son at 4pm | T | F
Notify me about the meeting at 14:00 tomorrow | T | F
I wanna go shoping at 7pm | T | F
I want alarm at 8am tomorrow | T | F
o’clock 30 meeting | T | F
Meeting o’clock 30 tomorrow | T | F
July 20th, 7am go to school | F | F
Meeting 9 o’clock 30 | T | F

Comment: According to Table 3.3, Google Assistant cannot detect expressions as well as our agent does. Google Assistant requires the word “alarm” and a specific time before it complies. Most natural-language requests instead lead the user to the Google search service. Note that our evaluation does not mean that we have a better model than Google’s. The reason for this better performance is that we focus on a single domain; in other words, Google Assistant is still smarter than our agent because it works across a variety of domains.

Section 4: DISCUSSION

4.1 NLU SYSTEM’S TRAINING LIMITATION
Although Dialogflow’s pre-built NLU model is very powerful, there are issues that it does not handle well, especially when it deals with a specific task. Specifically, in Table 3.1, we can see that the NLU model becomes vulnerable to expressions without the word “at”. For example, even though we trained our model with the phrase “Create a plan tomorrow 4am”, when we tried a similar phrase such as “Create a plan tomorrow 10pm”, our chat-bot matched it to the wrong intent. However, when we changed the word
“pm” to “am” and got the phrase “Create a plan tomorrow 10am”, our agent did very well (as shown in Figure 4.1.1).

Figure 4.1.1 Wrong recognition with phrases without “at”

These results were checked with the Hybrid training method at a threshold of 0.5. Although these expressions contain wrong grammar and typos, a human can undoubtedly still understand them, and so can the agent in some cases. We do not know the reason behind this, but we can address the problem by adding more data of that type to our training set.

4.2 FUTURE WORKS
There are many potential features that we want to implement in our agent to improve its performance in the planning task. These features are listed below:

Multiple platforms and environments: In our project, only Google Calendar is available on multiple platforms and environments. Users can use our chat-bot to set up plans only on their laptops. If we can deploy our agent on phones or the web, users will get more options and we can collect more data features supporting our program.

Time estimator: Because users’ behaviors differ, the estimated time for a plan is never fixed. If we can collect more data features, such as the user’s location by GPS, our agent can estimate a more suitable time for a specific task. For example, user A wants to attend a conference at location 1 and then attend another conference at location 2. Our current agent only sets 1 hour between two events by default if the user does not provide the time duration. However, with a time estimator, the agent can recommend a suitable arrival time so that A can make sure all of his/her plans are on time.

Build a model from scratch: Building a model in Dialogflow leads to some limitations. We cannot know exactly what the model’s architecture is, nor can we fine-tune its hyperparameters. However, building a model from scratch is a trade-off, because it may cost us a lot of resources such as time or energy. Data limitation is also an issue we should
consider.

Section 5: CONCLUSION
The main purpose of our study is to apply AI, especially chat-bots, in everyday life. Dialogflow is a really powerful tool for building chat-bots, and it helped us a lot given that we do not have a background in computer science. Our agent meets the baseline requirement: it can do basic tasks in Google Calendar such as inserting and removing events. However, we believe it can do better with some features that we will integrate into it in the future.

Section 6: REFERENCES
[1] Oxford University Press, "English Oxford Living Dictionaries," 2018. [Online]. Available: https://en.oxforddictionaries.com/definition/chatbot
[2] A. Turing, "Computing Machinery and Intelligence," Mind, vol. LIX, no. 236, pp. 433-460, Oct. 1950.
[3] J. Rembach, "The Different Generations of Chatbot Technology," 12 Dec. 2017. [Online]. Available: http://blog.rul.ai/3-different-generations-chatbot-technology
[4] Anadea, "What is a Chatbot and How to Use It for Your Business," Jan. 2018. [Online].
[5] P. B. Brandtzaeg and A. Følstad, "Why people use chatbots," in 4th International Conference on Internet Science, Oslo, 2017.
[6] L. F. M. R. Donald J. Stoner, "Simulating Military Radio Communications Using Speech Recognition and Chat-Bot Technology," The Titan Corporation, p. 3, 2004.
[7] P. Surmenok, "Chatbot Architecture," 12 Sep. 2016. [Online]. Available: https://medium.com/@surmenok/chatbot-architecture-496f5bf820ed
[8] B. Marr, "A Short History of Machine Learning Every Manager Should Read," 19 Feb. 2016. [Online]. Available: https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#32dae6d515e7
[9] V. H. Tiệp, "Bài 35: Lược sử Deep Learning," 22 Jun. 2018. [Online]. Available: https://machinelearningcoban.com/2018/06/22/deeplearning/
[10] A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow, California: O'Reilly, 2017.
[11] Daffodil Software, "9 Applications of Machine Learning from Day-to-Day Life," 31 Jul. 2017. [Online]. Available:
https://medium.com/app-affairs/9-applications-of-machine-learning-from-day-to-day-life-112a47a429d0
[12] "Introduction to Natural Language Processing (NLP)," 11 Aug. 2016. [Online]. Available: https://blog.algorithmia.com/introduction-natural-language-processing-nlp/
[13] "Information extraction," [Online]. Available: https://en.wikipedia.org/wiki/Information_extraction
[14] S. Bird, E. Klein, and E. Loper, Natural Language Processing with Python, California: O'Reilly, 2009.
[15] Y. Shao, "Tokenization and Sentence Segmentation," Department of Linguistics and Philology, Uppsala University, 2017.
[16] C. Trim, "Language Processing," 24 Jan. 2013. [Online]. Available: https://www.ibm.com/developerworks/community/blogs/nlp/entry/tokenization?lang=en
[17] L. Zettlemoyer, "Relation Extraction," CSE 517, p. 13, Winter 2013.
[18] Google Corporation, "Domino's simplifies ordering pizza using Dialogflow's conversational technology," [Online]. Available: https://dialogflow.com/case-studies/dominos/
[19] Google Corporation, "Agents," [Online]. Available: https://dialogflow.com/docs/agents
[20] D. M. W. Powers, "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation," Journal of Machine Learning Technologies, pp. 37-63, 2011.
[21] Google Corporation, "Dialogflow SDKs," [Online]. Available: https://dialogflow.com/docs/sdks
[22] Google Corporation, "OAuth 2.0," [Online]. Available: https://developers.google.com/api-client-library/python/guide/aaa_oauth
[23] D. Dutta, "Developing an Intelligent Chat-bot Tool to assist high school students for learning general knowledge subjects," Georgia Institute of Technology, p. 2, 2017.
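As a supplementary illustration of the removal rule described in Section 3.1.1.3.2, the following Python sketch shows one way the search window for deleting events could be resolved. It is a minimal sketch under stated assumptions, not the actual back-end code: the function name resolve_removal_window and the constant END_OF_DAY are hypothetical. Only two behaviors come from the description in that section: timeE defaults to 23:59:59 of the same day, and a missing timeS means the chatbot has to ask the user for it.

```python
from datetime import date, datetime, time
from typing import Optional, Tuple

# Default end of the removal window when the user gives no timeE (from Section 3.1.1.3.2).
END_OF_DAY = time(23, 59, 59)


def resolve_removal_window(
    day: date,
    time_s: Optional[time],
    time_e: Optional[time] = None,
) -> Optional[Tuple[datetime, datetime]]:
    """Return the (start, end) window in which plans would be searched and removed.

    Hypothetical helper: returns None when timeS is missing, which signals
    that the chatbot should ask the user for a start time first.
    """
    if time_s is None:
        return None
    end = time_e if time_e is not None else END_OF_DAY
    return datetime.combine(day, time_s), datetime.combine(day, end)


if __name__ == "__main__":
    # "Delete plan today" asked on July 14th at 9:30 am, with no timeE given.
    print(resolve_removal_window(date(2018, 7, 14), time(9, 30)))
```

Returning None instead of raising an error keeps the decision to prompt the user in the calling dialogue logic, which is one possible design choice among several.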