
Ripple down rules for question analysis


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

NGUYEN QUOC DAT

RIPPLE DOWN RULES FOR QUESTION ANALYSIS

Major: Computer Science
Code:

MASTER THESIS

Supervised by: Dr Pham Bao Son

Hanoi - 2011

Ripple Down Rules for Question Analysis

Nguyen Quoc Dat
Faculty of Information Technology
University of Engineering and Technology
Vietnam National University, Hanoi

Supervised by Dr Pham Bao Son

A thesis submitted in fulfillment of the requirements for the degree of Master of Science in Computer Science

August 2011

ORIGINALITY STATEMENT

'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET/Coltech) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET/Coltech or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'

Hanoi, August 23rd, 2011
Signed

ABSTRACT

For the task of turning a natural language question into an explicit intermediate representation of the complexity in question answering systems, all published works so far use a rule-based approach, to the best of our knowledge. We believe this is because of the complexity of the representation and the variety of question types, and because there are no publicly available corpora of a decent size. In these rule-based approaches, the process of creating rules is not discussed. It is clear that manually creating the rules in an ad-hoc manner is very expensive and error-prone.
This thesis first describes an ad-hoc method to convert Vietnamese natural language questions into intermediate representation elements over semantic annotations via grammar rules. Importantly, this thesis focuses on proposing a language-independent approach to the process of creating those rules manually, in such a way that consistency between rules is maintained and the effort to create a new rule is independent of the size of the current rule set. Experimental results are promising and show that our language-independent approach is easy to adapt to a new domain and a new language.

Publications:

- Dat Quoc Nguyen, Dai Quoc Nguyen and Son Bao Pham. Systematic Knowledge Acquisition for Question Analysis. In Proc. of the 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011).
- Dat Quoc Nguyen, Dai Quoc Nguyen, Son Bao Pham and Dang Duc Pham. Ripple Down Rules for Part-Of-Speech Tagging. In Proc. of 12th International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2011), Springer-Verlag LNCS, part I, pp. 190-201.
- Dai Quoc Nguyen, Dat Quoc Nguyen and Son Bao Pham. A Vietnamese question answering system. In Proc. of the 2009 International Conference on Knowledge and Systems Engineering (KSE 2009), IEEE CS, pp. 26-32.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to my supervisor, Dr Pham Bao Son, for his patient guidance and continuous support throughout the years. He always appears when I need help, and responds to queries helpfully and promptly.

I would like to give my honest appreciation to my brother, Nguyen Quoc Dai, for his great support.

I would like to specially thank Prof Bui The Duy and my colleagues for their help during my time at the Human Machine Interaction Laboratory, UET/Coltech.

I would also like to thank my friend, Nguyen Le Trang, for her kind help.

I sincerely acknowledge the Vietnam National University, Hanoi, NAFOSTED Vietnam, the Toshiba Foundation Scholarship, and especially Dr Pham Bao Son for financially supporting my master's study.

Finally, this thesis would not have been possible without the support and love of my mother and my father. Thank you!
To my family

Table of Contents

1 Introduction
2 Literature review
  2.1 Question analysis in question answering systems
    2.1.1  2.1.2  2.1.3  2.1.4  2.1.5
  2.2 GATE
    2.2.1  2.2.2
  2.3 Single Classification Ripple Down Rules
3 Our Question Answering System Architecture
  3.1 Introduction
  3.2 Preprocessing module
  3.3 Syntactic analysis module
    3.3.1  3.3.2  3.3.3
  3.4 Semantic analysis module
  3.5 Answer retrieval component
4 Systematic Knowledge Acquisition for Question Analysis
  4.1 Recall Intermediate Representation of an input question
  4.2 Rule language
  4.3 Knowledge Acquisition Process
5 Evaluation
  5.1 Question Analysis for Vietnamese
  5.2 Question Analysis for English
6 Conclusion
A Definitions of question-class types
B Definitions of question-structures
C Intermediate Representation Elements of English questions
D Embedding Java code in JAPE

List of Figures

2.1 Parse tree of question "which rock contains magnesium?"
2.2 The syntactic-semantic tree example
2.3 Aqualog's architecture
2.4 GATE's architecture
2.5 A set of Token annotations in GATE
3.1 Architecture of our question answering system
3.2 An example of intermediate representation element
3.3 An example of redefining the TokenVn annotation
3.4 NounPhrase annotations
3.5 QU-E-L-MC and QUTerm annotations
3.6 Relation between phrases
3.7 Relation annotations
3.8 Question structures
4.1 Question analyzer's GUI
4.2 Question processing component to create the intermediate representation of question "trường Đại học Công Nghệ có bao nhiêu sinh viên?" (How many students are there in the College of Technology?)
C.1 Question-structure of Definition
C.2 Question-structure of UnknTerm
C.3 Question-structure of UnknRel
C.4 Question-structure of Normal
C.5 Question-structure of Affirm
C.6 Question-structure of ThreeTerm
C.7 Question-structure of Affirm_3Term
C.8 Question-structure of And

Appendix C

Figure C.6: Question-structure of ThreeTerm
Figure C.7: Question-structure of Affirm_3Term
Figure C.8: Question-structure of And
Figure C.9: Question-structure of And (2)
Figure C.10: Question-structure of And (3)
Figure C.11: Question-structure of And (4)
Figure C.12: Question-structure of Or
Figure C.13: Question-structure of Clause
Figure C.14: Question-structure of Clause (2)

Appendix D
Embedding Java code in JAPE

    Phase: EditYesnoAnno
    Input: TokenVn Split
    Options: control = appelt

    Macro: YESNO
    /* Macro YESNO is used to match the question-word phrases
       "phải không", "đúng không", "có đúng là", "có phải là",
       "có đúng", "có phải", "Có đúng", "Có phải", "Có đúng là",
       and "Có phải là". These phrases mean "is that", "is this",
       "are these", "are those" in English. */
    (
      (
        ({TokenVn.string == "phải"} | {TokenVn.string == "đúng"})?
        {TokenVn.string == "không"}
      )
      |
      (
        ({TokenVn.string == "Có"} | {TokenVn.string == "có"})
        ({TokenVn.string == "đúng"} | {TokenVn.string == "phải"})
        ({TokenVn.string == "là"})?
      )
    )

    Rule: editYesNoTerm
    Priority: 50
    (
      YESNO
    ):ynSet
    -->
    {
      // Retrieve the YesNoSet annotations bound on the LHS
      gate.AnnotationSet YesNoSet = (gate.AnnotationSet)bindings.get("ynSet");
      // Create a new list to hold the YesNoSet annotations
      List listTerm = new ArrayList(YesNoSet);
      // Get an iterator over the created list
      Iterator termIter = (Iterator)listTerm.iterator();
      // Declare variables
      gate.Annotation yesnoAnn;
      gate.FeatureMap yesnoAnnFeatures;
      String string = "";
      // Concatenate the "string" feature of every matched annotation
      while(termIter.hasNext()){
        yesnoAnn = (gate.Annotation)termIter.next();
        yesnoAnnFeatures = (gate.FeatureMap)yesnoAnn.getFeatures();
        string += (String)yesnoAnnFeatures.get("string") + " ";
      }
      // Create the features of the new annotation
      gate.FeatureMap features = Factory.newFeatureMap();
      features.put("string", string.trim());
      features.put("category", "questionword");
      features.put("type", "YesNo");
      /* Remove all of the old TokenVn annotations corresponding to
         the words in the phrase that the LHS matched */
      inputAS.removeAll(YesNoSet);
      /* Create a new TokenVn annotation covering the matched phrase */
      outputAS.add(YesNoSet.firstNode(), YesNoSet.lastNode(), "TokenVn", features);
    }

Bibliography

I. Androutsopoulos, G. Ritchie, and P. Thanisch. Masque/sql: an efficient and portable natural language query interface for relational databases. In Proceedings of the 6th international conference on Industrial and engineering applications of artificial intelligence and expert systems, pages 327-330, 1993.

Ion Androutsopoulos, Graeme Ritchie, and Peter Thanisch. Natural language interfaces to databases - an introduction. Natural Language Engineering, 1:29-81, 1995.

Paolo Atzeni, Roberto Basili, Dorte Haltrup Hansen, Paolo Missier, Patrizia Paggio, Maria Teresa Pazienza, and Fabio Massimo Zanzotto. Ontology-based question answering in a federation of university sites: The MOSES case study. In Proceedings of 9th International Conference on Applications of Natural Languages to Information Systems, NLDB 2004, pages 413-420, 2004.
Benjamin Van Durme, Yifen Huang, Anna Kupsc, and Eric Nyberg. Towards light semantic processing for question answering. In Proceedings of the HLT-NAACL 2003 workshop on Text meaning - Volume 9, pages 54-61, 2003.

Noam Chomsky. Syntactic Structures. Mouton, The Hague, 1957.

Philipp Cimiano, Peter Haase, Jorg Heizmann, Matthias Mantel, and Rudi Studer. Towards portable natural language interfaces to knowledge bases - the case of the orakel system. Data Knowl. Eng., 65:325-354, 2008.

Stephen Clark, Mark Steedman, and James R. Curran. Object-extraction and question-parsing using ccg. In Proceedings of the SIGDAT Conference on Empirical Methods in Natural Language Processing, pages 111-118, 2004.

William W. Cohen, Pradeep Ravikumar, and Stephen E. Fienberg. A comparison of string distance metrics for name-matching tasks. In Proceedings of IJCAI-03 Workshop on Information Integration, pages 73-78, 2003.

P. Compton and R. Jansen. A philosophical basis for knowledge acquisition. Knowledge Acquisition, 2(3):241-257, 1990.

Paul Compton and Bob Jansen. Knowledge in context: A strategy for expert system maintenance. In Proceedings of the second Australian joint conference on Artificial intelligence, volume 406, pages 292-306, 1988.

Hamish Cunningham, Diana Maynard, Kalina Bontcheva, and Valentin Tablan. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, pages 168-175, 2002.

Danica Damljanovic, Valentin Tablan, and Kalina Bontcheva. A text-based query interface to owl ontologies. In Proceedings of 6th Language Resources and Evaluation Conference, 2008.

Christiane D. Fellbaum. WordNet: An Electronic Lexical Database. MIT Press, 1998.

A. Galea. Open-domain surface-based question answering system. In Proceedings of the Computer Science Annual Workshop (CSAW), 2003.

Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus, and Paul Morarescu. Falcon: Boosting knowledge for answer engines. In Proceedings of the Ninth Text REtrieval Conference, pages 479-488, 2000.

Sanda M. Harabagiu, Steven J. Maiorano, and Marius A. Pasca. Open-domain textual question answering techniques. Natural Language Engineering, 9(3):231-267, 2003.

Zhiheng Huang, Marcus Thint, and Zengchang Qin. Question classification using head words and their hypernyms. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pages 927-936, 2008.

John Judge, Yuqing Guo, Gareth J. F. Jones, and Bin Wang. An analysis of question processing of english and chinese for the ntcir cross-language question answering task. 2005.

Daniel Jurafsky and James H. Martin. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition (Second Edition). Prentice Hall, 2008.

Boris Katz. Annotating the world wide web using natural language. In Proceedings of the 5th RIAO Conference on Computer Assisted Information Searching on the Internet - RIAO 1997, pages 136-159, 1997.

Boris Katz, Gary C. Borchardt, and Sue Felshin. Syntactic and semantic decomposition strategies for question answering from multiple resources. In Proceedings of the AAAI 2005 Workshop on Inference for Textual Question Answering, pages 35-41, 2005.

Boris Katz, Gary C. Borchardt, and Sue Felshin. Natural language annotations for question answering. In Proceedings of the 19th International Florida Artificial Intelligence Research Society Conference, pages 303-306, 2006.

Krystle Kocik. Question classification using maximum entropy models. Master's thesis, University of Sydney, 2004.

Wei Li. Question classification using language modeling. Technical report, In CIIR Technical Report: University of Massachusetts, 2002.

Xin Li and Dan Roth. Learning question classifiers. In Proceedings of the 19th international conference on Computational linguistics - Volume 1, COLING '02. Association for Computational Linguistics, 2002.

Xin Li and Dan Roth. Learning question classifiers: the role of semantic information. Natural Language Engineering, 12(3):229-249, 2006.

Vanessa Lopez, Victoria Uren, Enrico Motta, and Michele Pasin. Aqualog: An ontology-driven question answering system for organizational semantic intranets. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):72-105, 2007.

Christopher D. Manning and Hinrich Schütze. Foundations of statistical natural language processing. MIT Press, Cambridge, MA, USA, 1999.

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.

Donald Metzler and W. Bruce Croft. Analysis of statistical question classification for fact-based questions. Inf. Retr., 8:481-504, May 2005. ISSN 1386-4564.

Wu Min and Strzalkowski Tomek. Utilizing entity relation to bridge the language gap in crosslingual question answering system. In Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access, 2006.

D. Moldovan, S. Harabagiu, R. Girju, P. Morarescu, F. Lacatusu, A. Novischi, A. Badulescu, and O. Bolohan. Lcc tools for question answering. In Voorhees and Buckland, editors, Proceedings of the 11th Text REtrieval Conference (TREC-2002), 2002.

Anh Kim Nguyen and Huong Thanh Le. Natural language interface construction using semantic grammars. In Proceedings of the 10th Pacific Rim International Conference on Artificial Intelligence, pages 728-739, 2008.

Dai Quoc Nguyen, Dat Quoc Nguyen, and Son Bao Pham. A vietnamese question answering system. In Proceedings of the 2009 International Conference on Knowledge and Systems Engineering, pages 26-32, 2009.

Dat Quoc Nguyen, Dai Quoc Nguyen, and Son Bao Pham. Systematic knowledge acquisition for question analysis. In Proceedings of 8th International Conference on Recent Advances in Natural Language Processing, (In press), September, 2011a.

Dat Quoc Nguyen, Dai Quoc Nguyen, Son Bao Pham, and Dang Duc Pham. Ripple down rules for part-of-speech tagging. In Proc. of 12th International Conference on Computational Linguistics and Intelligent Text Processing, pages 190-201, 2011b.

Ahad Niknia and Leila Sharif Hassanabadi. A question answering system based on grammatical structure matching. In Proceedings of the IADIS International Conference Applied Computing 2009, pages 165-172, 2009.

Dang Duc Pham, Giang Binh Tran, and Son Bao Pham. A hybrid approach to vietnamese word segmentation using part of speech tags. In Proceedings of the 2009 International Conference on Knowledge and Systems Engineering, pages 154-161, 2009.

Son Bao Pham and Achim Hoffmann. Efficient knowledge acquisition for extracting temporal relations. In Proceeding of the 17th European Conference on Artificial Intelligence, pages 521-525, 2006.

T.T. Phan and T.C. Nguyen. Question semantic analysis in vietnamese qa system. In Edited book "Advances in Intelligent Information and Database Systems" of The 2nd Asian Conference on Intelligent Information and Database Systems (CIIDS2010), pages 29-40, 2010.

Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. Towards a theory of natural language interfaces to databases. In Proceedings of the 8th international conference on Intelligent user interfaces, IUI '03, pages 149-157, 2003.

Debbie Richards. Two decades of ripple down rules research. Knowledge Engineering Review, 24(2):159-184, 2009.

Ashish Kumar Saxena, Ganesh Viswanath Sambhu, Saroj Kaushik, and L. Venkata Subramaniam. Iitd-ibmirl system for question answering using pattern matching, semantic type and semantic category recognition. In Proceedings of The Sixteenth Text REtrieval Conference, 2007.

Sanjay Silakari, Mahesh Motwani, and Neelu Nihalani. Natural language interface for database: A brief review. IJCSI International Journal of Computer Science Issues, 8:600-608, 2011.

Eriks Sneiders. Automated question answering using question templates that cover the conceptual model of the database. In Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers, NLDB '02, pages 235-239, 2002.

Niculae Stratica, Leila Kosseim, and Bipin C. Desai. Nlidb templates for semantic parsing. In Proceedings of the 8th International Conference on Applications of Natural Language to Information Systems, pages 235-241, 2003.

Valentin Tablan, Diana Maynard, Kalina Bontcheva, and Hamish Cunningham. Gate - an application developer's guide. http://gate.ac.uk/sale/pg/pg.pdf, 2004.

Marjorie Templeton and John Burger. Problems in natural-language interface to dbms with examples from eufid. In Proceedings of the first conference on Applied natural language processing, pages 16, 1983.

M. Vargas-Vera and E. Motta. An ontology-driven similarity algorithm. Technical report, Knowledge Media Institute, The Open University, 2004.

David L. Waltz. An english language question answering system for a large relational database. Commun. ACM, 21:526-539, July 1978.

W. A. Woods, Ron Kaplan, and Nash B. Webber. The LUNAR sciences natural language information system: Final report. Technical Report BBN Report No. 2378, Bolt Beranek and Newman, 1972.

Min Wu, Xiaoyu Zheng, Michelle Duan, Ting Liu, and Tomek Strzalkowski. Question answering by pattern matching, web-proofing, semantic form proofing. In Proceedings of the Twelfth Text REtrieval Conference (TREC 2003), pages 578-585, 2003.

Dell Zhang and Wee Sun Lee. Question classification using support vector machines. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 26-32, 2003.

Copyright © 2011 by Nguyen Quoc Dat. Printed and bound by Nguyen Quoc Dat.

...
2.3 Single Classification Ripple Down Rules

Ripple Down Rules (RDR) (Compton and Jansen, 1988, 1990; Richards, 2009) were developed to allow users to incrementally add rules ...

... the question-structure and question-class, the best semantic answer will be returned.

Chapter 4
Systematic Knowledge Acquisition for Question Analysis

Unlike existing approaches for question analysis ...

... Patterns Engine
ANNIE   A Nearly-New Information Extraction
RDR     Ripple Down Rules
SCRDR   Single Classification Ripple Down Rules
QC      Question Classification
SVM     Support Vector Machine
SRW     Semantically ...
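
The idea of incrementally adding rules in Single Classification Ripple Down Rules, mentioned in the excerpt above, can be illustrated with a minimal sketch. It assumes the commonly described SCRDR structure: a binary tree in which each node holds a condition and a conclusion, an "except" link that is followed when the condition holds, and an "if-not" link that is followed when it fails. The class names (ScrdrSketch, Node), the string-based conditions and the labels Other, Quantity, Money and YesNo are all hypothetical and are not taken from the thesis.

```java
import java.util.Locale;
import java.util.function.Predicate;

public class ScrdrSketch {

    /** One SCRDR node: a condition, a conclusion, and two child links. */
    public static final class Node {
        final Predicate<String> condition;
        final String conclusion;
        Node except; // followed when this node's condition holds
        Node ifNot;  // followed when this node's condition fails

        Node(Predicate<String> condition, String conclusion) {
            this.condition = condition;
            this.conclusion = conclusion;
        }
    }

    /** Returns the conclusion of the last node on the path whose condition held. */
    public static String classify(Node root, String question) {
        String text = question.toLowerCase(Locale.ROOT);
        String result = null;
        Node current = root;
        while (current != null) {
            if (current.condition.test(text)) {
                result = current.conclusion; // this rule fires; look for exceptions
                current = current.except;
            } else {
                current = current.ifNot;     // this rule does not apply; try the alternative
            }
        }
        return result;
    }

    /** Builds a tiny, purely illustrative rule tree (hypothetical rules and labels). */
    public static Node buildExampleTree() {
        Node root = new Node(q -> true, "Other"); // default rule, always fires
        Node howMany = new Node(q -> q.startsWith("how many"), "Quantity");
        Node money = new Node(q -> q.contains("cost"), "Money");
        Node yesNo = new Node(q -> q.startsWith("is ") || q.startsWith("are "), "YesNo");

        root.except = howMany;  // exception to the default rule
        howMany.except = money; // exception added later, local to "how many" questions
        howMany.ifNot = yesNo;  // alternative tried when "how many" does not match
        return root;
    }

    public static void main(String[] args) {
        Node root = buildExampleTree();
        System.out.println(classify(root, "How many students are there in the College of Technology?"));
        System.out.println(classify(root, "How many dollars does it cost?"));
        System.out.println(classify(root, "Is this a rock?"));
        System.out.println(classify(root, "Which rock contains magnesium?"));
    }
}
```

In this sketch a new rule is attached only at the node where a wrong conclusion was produced, so questions that never reach that node keep their previous classification. This is one way to read the claim that consistency between rules is maintained as the rule set grows.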

Date posted: 11/11/2020, 22:20
