Some Techniques for Entity Search Based on Implicit Semantic Relations and Context-Aware Query Suggestion

MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

Trần Lâm Quân

SOME TECHNIQUES FOR ENTITY SEARCH BASED ON IMPLICIT SEMANTIC RELATIONS AND CONTEXT-AWARE QUERY SUGGESTION

Major: Mathematical Foundations for Informatics
Code: 9.46.01.10

DOCTORAL THESIS IN MATHEMATICS

SCIENTIFIC SUPERVISOR: Dr. Vũ Tất Thắng

Hanoi, 2020

DECLARATION

I declare that this is my own research work, completed under the supervision of Dr. Vũ Tất Thắng. The results presented in the thesis are truthful and have not been published in any other work. I take full responsibility for this declaration.

Hanoi, December 2020
Author: Trần Lâm Quân

ACKNOWLEDGMENTS

This thesis was completed through the author's sustained effort and with the help of the supervisor, family, friends, and colleagues.

First, the author expresses deep gratitude to Dr. Vũ Tất Thắng, who devotedly guided the author to the completion of this thesis and who, with exceptional patience, gave direction throughout the research.

The author thanks the lecturers and staff of the Institute of Information Technology and the Graduate University of Science and Technology (Vietnam Academy of Science and Technology) for their generous help, for a good research environment in which to complete this work, and for the precise feedback that led to the publications the author has today.

The author also thanks the Board of Directors of Vietnam Airlines, the Research and Application Center, and colleagues at the author's workplace for their support in completing the thesis.

Finally, I thank all members of my family and my friends, who have always supported, shared with, and encouraged me in my studies and research.

Hanoi, December 2020
Trần Lâm Quân

TABLE OF CONTENTS

Title page
Declaration
Acknowledgments
Table of contents
List of symbols and abbreviations
List of tables
List of figures and charts
INTRODUCTION
CHAPTER 1: OVERVIEW
1.1 The problem of entity search based on implicit semantic relations
1.2 Related work on entity search based on implicit semantics
1.2.1 Structure Mapping Theory (SMT)
1.2.2 Vector Space Model (VSM)
1.2.3 Latent Relational Analysis (LRA)
1.2.4 Latent Relational Mapping Engine (LRME)
1.2.5 Latent Semantic Relation (LSR)
1.2.6 WordNet-based relational similarity
1.2.7 The Word2Vec model for learning word vector representations
1.3 The entity search method based on implicit semantic relations in relation to related work
1.4 The context-aware query suggestion problem
1.5 Related work on query suggestion
1.5.1 Session-based query suggestion techniques
1.5.2 Cluster-based query suggestion techniques
1.6 The context-aware query suggestion method in relation to related work
1.7 Results achieved in the thesis
CHAPTER 2: ENTITY SEARCH BASED ON IMPLICIT SEMANTIC RELATIONS
2.1 Problem statement
2.2 The entity search method based on implicit semantic relations
2.2.1 Architecture and model
2.2.2 Semantic relation extraction component
2.2.3 Semantic relation clustering component
2.2.4 Component for computing relational similarity between entity pairs
2.3 Experimental results and evaluation
2.3.1 Dataset
2.3.2 Testing and parameter tuning
2.3.3 Evaluation with the MRR measure
2.3.4 Experimental system
2.4 Chapter conclusion
CHAPTER 3: CONTEXT-AWARE QUERY SUGGESTION
3.1 Problem statement
3.2 The context-aware method
3.2.1 Definitions and terminology
3.2.2 Motivation and illustrative example
3.2.3 Architecture and model
3.2.4 Offline phase
3.2.5 Online phase: the query suggestion algorithm
3.2.6 Analysis of strengths and weaknesses
3.2.7 Technical proposals
3.2.8 Classifying search results with a concept lattice
3.3 Experimental results and evaluation
3.3.1 Dataset
3.3.2 Evaluation and comparison
3.3.3 Experimental system
3.4 Chapter conclusion
CHAPTER 4: CONCLUSIONS AND RECOMMENDATIONS
4.1 Conclusions
4.2 Recommendations
LIST OF THE AUTHOR'S PUBLICATIONS
REFERENCES

LIST OF SYMBOLS AND ABBREVIATIONS

Symbols: CBOW, C, CL, Dataset, FCA (Formal Concept Analysis), Fe, FC, IRES, IR, IRS, LM, LRME (Latent Relational Mapping Engine), LRA (Latent Relational Analysis), LSR (Latent Semantic Relation), MRR (Mean Reciprocal Rank), NE, PMI, q, QLogs, Q-suggest, Re, RelSim (Relational Similarity), RR, SE, SL, Session, SR, SMT (Structure Mapping Theory), term, mining, VS, VSM (Vector Space Model), Word2Vec

... in the experiments, the AJAX (Asynchronous JavaScript and XML) technique is used to send, process, and receive character strings between client and server without reloading the whole page. To make searching convenient and save users' time, the popular queries with the highest combined weights (highest score) are suggested as soon as the user types part of a query:

Figure 3.20: Quick suggestions

Result classification (applying a concept lattice): After each query, search engines (such as Google, Bing, Yahoo! Search, Ask, etc.) usually return a long, multi-topic list (millions of results). A user who wants to search in depth within a specific field must personally sift through a large volume of data to find the information needed. Classifying and grouping the result documents into specific fields keeps information from being buried in a long list and helps users survey the result set and decide which documents are relevant.

Figure 3.21: Result classification

The general-purpose search engines mentioned above collect data from the Internet, an enormous repository that is multilingual, multi-domain, and varied in structure and format. Classifying results after the search is an online technique. Because of the time constraint (results must be returned to the user immediately), classifying documents is nearly infeasible in a general-purpose search engine.

Another significant obstacle that online classification must overcome is labeling (assigning a title to each topic). A title must carry enough meaning and be easy to understand so that users can choose among topics.

The search engine in this thesis performs in-depth search over a specific data domain (aviation operational data, which is "absent" from the Internet). The set of documents is known in advance, so the lattice construction algorithm can be run offline and combined with a manual review of the label set (adding a human processing step), which suits classifying results before the search.

3.4 Chapter conclusion

From a theoretical perspective, Chapter 3 presents the context-aware method explicitly (its ideas, principles, models, formulas, algorithms, and so on) and puts forward proposals for technical improvements. From an experimental perspective, implementation (variables, data structures, algorithms, instant responses for query suggestion, and so on) proves entirely feasible. The experimental results provide three forms of suggestion: query suggestions, document suggestions, and topic suggestions.

The main contributions of the chapter are:
1) Applying the context-aware technique to build a specialized search engine over a private knowledge-base domain (aviation data).
2) Proposing a combined similarity measure for context-aware query suggestion in order to improve suggestion quality.

In addition, the chapter makes supplementary experimental contributions:
i) Integrating Vietnamese speech recognition and synthesis as an option in the search engine, producing a search system with voice interaction.
ii) Applying the concept lattice structure to classify the returned result set.

Context-aware query suggestion is only one branch of the search engine problem; nevertheless, it is a practical issue that attracts research interest and is clearly a hard problem. Mastering the principles and implementing the context-aware method efficiently provides a good way to support users in searching for information. A Vietnamese search engine that applies the context-aware method promises striking, interesting, and effective results in query suggestion.

Knowledge discovery continues to raise many questions, because Query Logs still contain much latent knowledge. For example, {IP, query} data reflects the user's history and can be mined for personalized search or personalized query suggestion, and {URL, title} pairs can be mined to find related results. The bipartite graph can also be mined to find document-query relationships even when the document set (vertex set U) and the query set (vertex set Q) share no terms: if a document set D' is frequently clicked and read for a set of queries Q', then the terms in Q' are strongly related to the terms in D'. Likewise, query suggestion and result-set classification are in essence two separate processes, so applying parallel computation to them should be studied.

CHAPTER 4: CONCLUSIONS AND RECOMMENDATIONS

In this concluding part, the author summarizes the main results and contributions of the thesis. The author also presents some of its limitations and discusses directions for future research.

4.1 Conclusions

Formal Concept Analysis (FCA) and the concept lattice structure were applied to mine and search text data. The lattice is a mathematically elegant structure, well suited to mining, analyzing, and clustering data, but it is not entirely suited to search.

The thesis therefore concentrates on two main research directions: i) entity search based on semantic relations, which aims to emulate the ability to infer unknown information or knowledge by analogical reasoning, a "natural" human ability; ii) context-aware query suggestion, which considers a continuous sequence of queries in order to capture the search intent and then offers the directions that most users commonly ask about after the current query.

The main contributions of the thesis are as follows.

For the method of entity search based on implicit semantic relations, which addresses the first problem:
- The thesis studies and builds an entity search technique based on implicit semantic relations that uses clustering to improve search effectiveness.

For the method of context-aware query suggestion, which addresses the second problem:
- Applying the context-aware technique to build a specialized search engine over a private knowledge-base domain (aviation data).
- Proposing a combined similarity measure for context-aware query suggestion in order to improve suggestion quality.

4.2 Recommendations

For entity search based on implicit semantic relations, the search model is rigidly tied to its input entities, which is also a weakness. To overcome it, on the one hand, additional types of relation mapping and a time factor can be considered so that search results are up to date and accurate. On the other hand, entity search can be extended to input queries consisting of a single entity. For example, for the query "Which river is the longest in China?", the implicit-semantics entity search model would give the exact answer "Trường Giang" (the Yangtze), even though the corpus contains only the source sentence "Trường Giang is the largest river in China".

For query suggestion based on the context-aware technique, on the one hand, the research still has some shortcomings and even defects. Future work includes filtering noise from the input audio to improve recognition quality and applying machine learning to optimize the parameters α, β, γ used in computing the combined similarity of the context-aware search method. On the other hand, variants of the relational similarity measure RelSim (Relational Similarity) [100] and methods that combine Word2Vec, Doc2Vec, and word embeddings [101] … can be studied for the search engine.

As a direction for development, the work will focus on studying and applying adaptive algorithms and statistical models, which are core components of natural language processing systems.

LIST OF THE AUTHOR'S PUBLICATIONS

Trần Lâm Quân, Vũ Tất Thắng, "Tìm kiếm thực thể dựa trên quan hệ ngữ nghĩa ẩn" [Entity search based on implicit semantic relations]. 21st National Conference: Selected Issues in Information and Communication Technology, 27-28 July 2018.

Trần Lâm Quân, Vũ Tất Thắng, "Search for entities based on the Implicit Semantic Relations". Journal of Computer Science and Cybernetics (Tạp chí Tin học và Điều khiển học), 2019 (Volume 35, Number 2019).

Trần Lâm Quân, Đỗ Quốc Trường, Phan Đăng Hưng, Đinh Anh Tuấn, Phi Tùng Lâm, Vũ Tất Thắng, Lương Chi Mai, "A study of applying Vietnamese voice interaction for a context-based Aviation search engine". The IEEE RIVF 2013 International Conference on Computing and Communication Technologies, 10-13 November 2013.

Trần Lâm Quân, Vũ Tất Thắng, "Context-aware and voice interactive search" (the SoCPaR 2013 special issue). Journal of Network and Innovative Computing, ISSN 2160-2174, Volume 2, pages 233-239, 2014.

Trần Lâm Quân, Phan Đăng Hưng, Vũ Tất Thắng, "Tìm kiếm bằng giọng nói với kĩ thuật hướng ngữ cảnh" [Voice search with the context-aware technique]. Journal of Science and Technology, Vietnam Academy of Science and Technology, ISSN: 0886 768X, No. 52 (1B), 29 June 2014.

Trần Lâm Quân, Lê Đức Hiếu, Lê Ngọc Thế, Vũ Tất Thắng, "Một cách tiếp cận sử dụng cấu trúc dàn khái niệm để khai phá và tìm kiếm dữ liệu văn bản" [An approach using the concept lattice structure to mine and search text data]. 17th National Conference: Selected Issues in Information and Communication Technology, 30-31 October 2014.

REFERENCES

[1] Christoph Kofler, Martha Larson, Alan Hanjalic, User Intent in Multimedia Search: A Survey of the State of the Art and Future Challenges. ACM Journals Computing Surveys, Vol. 49, No. 2, August 2016.
[2] R. Song, Z. Luo, J.-Y. Nie, Y. Yu and H.-W. Hon, Identification of ambiguous queries in web search. Information Processing & Management, 45(2), pages 216–229, 2009.
[3] W. Song, Y. Liu, L. Liu et al., Semantic composition of distributed representations for query subtopic mining. Frontiers Inf. Technol. Electronic Eng. 19, 2018.
[4] J. Xu, F. Ye, Query Recommendation Using Hybrid Query Relevance. Future Internet, 2018.
[5] S. Gaou, A. Bekkari, The Optimization of Search Engines to Improve the Ranking to Detect User’s Intent. In Advanced Information Technology, Services and Systems (AIT2S), 2017.
[6] Dirk Lewandowski, Jessica Drechsler, Sonja von Mach, Deriving query intents from web search engine queries. Journal of the American Society for Information Science and Technology, September 2012.
[7] Imrattanatrai, Wiradee & Kato, Makoto & Tanaka, Katsumi & Yoshikawa, Masatoshi, Entity Ranking for Queries with Modifiers Based on Knowledge Bases and Web Search Results. In IEICE Transactions on Information and Systems, 2018.
[8] Li, Jing & Sun, Aixin & Han, Ray & Li, Chenliang, A Survey on Deep Learning for Named Entity Recognition. In IEEE Transactions on Knowledge and Data Engineering, 2020.
[9] H. Cao, D. Jiang, J. Pei, Q. He, Z. Liao, E. Chen and H. Li, Towards context-aware search by learning a very large variable length hidden markov model from search logs. In Proceedings of the 18th international conference on World wide web, pages 191–200, April 2009.
[10] H. Cao, D. Jiang, J. Pei, Q. He, Z. Liao, E. Chen and H. Li, Context-aware query suggestion by mining click-through and session data. In Proceedings of KDD, pages 875-883, 2008.
[11] Peter D. Turney, The latent relation mapping engine: Algorithm and experiments. Journal of Artificial Intelligence Research (JAIR), 33, pages 615-655, 2008.
[12] Dedre Gentner, Structure-mapping: A Theoretical Framework for Analogy. Elsevier Cognitive Science, Volume 7, Issue 2, pages 155-170, April–June 1983.
[13] Peter D. Turney, M.L. Littman, Corpus-based Learning of Analogies and Semantic Relations. Machine Learning, 60(1–3), pages 251–278, 2005.
[14] Peter D. Turney, Distributional semantics beyond words: Supervised learning of
analogy and paraphrase. Transactions of the Association for Computational Linguistics (TACL), 1, pages 353-366, 2013.
[15] Peter D. Turney and P. Pantel, From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research (JAIR), 37, pages 141-188, 2010.
[16] Peter D. Turney, Similarity of semantic relations. Computational Linguistics, 32(3), 2006.
[17] Bollegala, Danushka & Matsuo, Yutaka & Ishizuka, Mitsuru, Measuring the Similarity between Implicit Semantic Relations from the Web. Proceedings of WWW, pages 651-660, 2009.
[18] Duc, N., Bollegala et al., Cross-Language Latent Relational Search: Mapping Knowledge across Languages. In Association for the Advancement of AI, 2011.
[19] Kato et al., Query by analogical example: relational search using web search engine indices. In Proceedings of the 18th ACM conference on Information and knowledge management. ACM, 2009.
[20] Y.J. Cao et al., Relational Similarity Measure: An Approach Combining Wikipedia and WordNet. Journal of Applied Mechanics and Materials, 2011.
[21] E. Agirre, E. Alfonseca, K. Hall, J. Kravalova, M. Pasca and A. Soroa, A study on similarity and relatedness using distributional and wordnet-based approaches. In NAACL ’09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 19–27, 2009.
[22] Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems 26 (NIPS), 2013.
[23] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov, Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, Vol. 5, pages 135-146, 2017.
[24] Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov, Learning Word Vectors for 157 Languages. In LREC (Language Resources and Evaluation), Feb 19, 2018.
[25] Tomas Mikolov et al., Efficient Estimation of Word
Representations in Vector Space. In ICLR (Workshop Poster), 2013.
[26] Kata Gábor, Haïfa Zargayouna, Isabelle Tellier, Davide Buscaldi, Thierry Charnois, Exploring Vector Spaces for Semantic Relations. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1814–1823, 2017.
[27] Hugo Caselles-Dupré, Florian Lesaint, Jimena Royo-Letelier, Word2vec applied to recommendation: hyperparameters matter. In Proceedings of the 12th ACM Conference on Recommender Systems, pages 352–356, September 2018.
[28] S. Yilmaz, S. Toklu, A deep learning analysis on question classification task using Word2vec representations. Neural Comput & Applic 32, pages 2909–2928, 2020.
[29] Prajakta Shinde, Pranjali Joshi, Survey of various query suggestion system. International Journal of Engineering And Computer Science ISSN:2319-7242; Volume Issue 12, pages 9576-9580, December 2014.
[30] Susan Dumais, Personalized search: potential and pitfalls. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, October 2016.
[31] Jinyoung Kim, Jaime Teevan, Nick Craswell, Explicit In Situ User Feedback for Web Search Results. SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pages 829–832, July 2016.
[32] Sørig, Esben; Collignon, Fiebrink and Kando, Evaluation of Rich and Explicit Feedback for Exploratory Search. In Second Workshop on Evaluation of Personalisation in Information Retrieval (WEPIR), March, 2019.
[33] Thorsten Joachims et al., Accurately Interpreting Clickthrough Data as Implicit Feedback. SIGIR, Volume 51, Issue 1, June 2017.
[34] Edward Rolando Núñez-Valdéz et al., Implicit feedback techniques on recommender systems applied to electronic books. Computers in Human Behavior Volume 28, Issue 4, ScienceDirect, 2012.
[35] Gai Li and Qiang Che, Exploiting Explicit and Implicit Feedback for Personalized Ranking. Hindawi Publishing Corporation - Mathematical
Problems in Engineering, Article ID 2535329, 11 pages, 2016.
[36] Keping Bi, Choon Hui Teo, Yesh Dattatreya, Vijai Mohan, W. Bruce Croft, Leverage Implicit Feedback for Context-aware Product Search. In SIGIR 2019 eCom, Paris, France, July 2019.
[37] W. Chen, F. Cai, H. Chen, M. De Rijke, Personalized query suggestion diversification. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 817–820, 2017.
[38] W. Chen, F. Cai, H. Chen et al., Personalized query suggestion diversification in information retrieval. Springer Link, Front. Comput. Sci. 14, 143602, 19 December 2019.
[39] C. Bouhini, M. Géry and C. Largeron, Personalized information retrieval models integrating the user's profile. IEEE Tenth International Conference on Research Challenges in Information Science (RCIS), Grenoble, pages 1-9, 2016.
[40] Hiteshwar Kumar Azad, Akshay Deepak, A new approach for query expansion using Wikipedia and WordNet. Elsevier, Information Sciences Volume 492, pages 147-163, August 2019.
[41] Hiteshwar Kumar Azad, Akshay Deepak, Query expansion techniques for information retrieval: A survey. Elsevier Information Processing & Management Volume 56, Issue 5, pages 1698-1735, September 2019.
[42] Claveau, Vincent, Kijak, Ewa, Distributional thesauri for information retrieval and vice versa. In Language and Resource Conference, LREC, 2016.
[43] Q. Chen, L. Yao and J. Yang, Short text classification based on LDA topic model. International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, pages 749-753, 2016.
[44] J. Xu, F. Ye, Query Recommendation Using Hybrid Query Relevance. Future Internet Journals Volume 10, Issue 11, 2018.
[45] Wanyu Chen, Fei Cai, Honghui Chen, Maarten de Rijke, Personalized Query Suggestion Diversification. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 817–820, August 2017.
[46] Choudhary, Durga & Chandra, Subhash, Adaptive Query
Recommendation Techniques for Log Files Mining to Analysis User’s Session Pattern. In International Journal of Computer Applications, 2016.
[47] J. Guo, X. Zhu, Y. Lan et al., Modeling users’ search sessions for high utility query recommendation. Information Retrieval Journal 20, 2017.
[48] Lingling Meng, A Survey on Query Suggestion. International Journal of Hybrid Information Technology Vol. 7, No. 6, 2014.
[49] Bai, Lu, Jiafeng Guo, Xueqi Cheng, Xiubo Geng and Pan Du, Exploring the Query-Flow Graph with a Mixture Model for Query Recommendation. SIGIR Workshop on Query Representation and Understanding, July 2011.
[50] P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis and S. Vigna, The query-flow graph: model and applications. In Proceeding of the 17th ACM conference on Information and knowledge management (CIKM’08), pages 609–618, 2008.
[51] P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis and S. Vigna, Query Suggestions Using Query-Flow Graphs. In Proceedings of the 2009 workshop on Web Search Click Data (WSCD ’09), pages 56–63, Feb 9, 2009.
[52] Xinbao Shao, Qingshan Li, Yishuai Lin, Boyu Zhou, A meta-search group recommendation mechanism based on user intent identification. In Proceedings of the 6th International Conference on Software and Computer Applications (ICSCA '17), pages 102–106, February 2017.
[53] E. Sadikov, J. Madhavan, L. Wang and A. Halevy, Clustering query refinements by user intent. In Proceedings of the International World Wide Web Conference (WWW’10), pages 841–850, 2010.
[54] Saxena et al., A Review of Clustering Techniques and Developments. Article in Neurocomputing, July 2017.
[55] T. Sajana, C. M. Sheela Rani and K. V. Narayana, A Survey on Clustering Techniques for Big Data Mining. Indian Journal of Science and Technology, Vol. 9(3), January 2016.
[56] Parth Ritin Saraiya et al., Study of Clustering Techniques in the Data Mining Domain. In International Journal of Computer Science and Mobile Computing, Vol.7 Issue.11, pages 31-37, November 2018.
[57] K. Sathiyakumari, G.
Manimekalai, V. Preamsudha and M. P. Scholar, A survey on various approaches in document clustering. Int. J. Comput. Technol., pages 1534–1539, 2011.
[58] Manpreet Kaur, Usvir Kaur, A Survey on Clustering Principles with K-means Clustering Algorithm Using Different Methods in Detail. IJCSMC, Vol. 2, Issue 5, pages 327–331, May 2013.
[59] Gursharan Saini, Harpreet Kaur, K-Mean Clustering and PSO: A Review. International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249–8958, Volume-3, Issue-5, June 2014.
[60] Hamada M. Zahera, Gamal F. El Hady, F. Waiel, Abd El-Wahed, Query Recommendation for Improving Search Engine Results. In Proceedings of the World Congress on Engineering and Computer Science (WCECS), October 20-22, 2010.
[61] Naeem, Arshia; Rehman, Mariam; Anjum, Maria; Asif, Muhammad, Development of an efficient hierarchical clustering analysis using an agglomerative clustering algorithm. Current Science (00113891), Vol. 117 Issue 6, pages 1045–1053, 9/25/2019.
[62] Dhiliphanrajkumar Thambidurai, Suruliandi Aandavar and Selvaperumal Prakasam, Query Recommendation by Coupling Personalization with Clustering for Search Engine. I.J. Information Technology and Computer Science, pages 82-91, 11/2016.
[63] W. Wu, H. Li, and J. Xu, Learning query and document similarities from clickthrough bipartite graph with metadata. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, 2013.
[64] L. Noce, I. Gallo and Zamberletti, A Query and Product Suggestion for Price Comparison Search Engines based on Query-product Click-through Bipartite Graphs. In Proceedings of the 12th International Conference on Web Information Systems and Technologies (WEBIST 2016) - Volume 1, pages 17-24, 2016.
[65] Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, and Jacky Montmain, Semantic Similarity from Natural Language and Ontology Analysis. Synthesis Lectures on Human Language Technologies, Vol. 8, No (Arxiv, 167 pages), May 2015.
[66] Slimani, Thabet, Description and
Evaluation of Semantic Similarity Measures Approaches. International Journal of Computer Applications Vol. 80, 25-33, 10.5120/13897-1851, 2013.
[67] Christoph Lofi, Measuring Semantic Similarity and Relatedness with Distributional and Knowledge-based Approaches. Information and Media Technologies, Volume 10, Issue Online ISSN 1881-0896, pages 493-501, September 15, 2015.
[68] N. Craswell, Mean Reciprocal Rank. In Encyclopedia of Database Systems. Springer, Boston, MA, 2009.
[69] Yao, Yuan et al., DocRED: A Large-Scale Document-Level Relation Extraction Dataset. ACL (Association for Computational Linguistics), 2019.
[70] Michele Banko and Oren Etzioni, The Tradeoffs Between Open and Traditional Relation Extraction. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, ACL 2008, Columbus, Ohio, USA, pages 28-36, 2008.
[71] Yun Liu, Mingxin Li, Hui Liu, Junjun Cheng, Yanping Fu, Research of Unsupervised Entity Relation Extraction. Journal of Computers Vol. 30 No. 1, pages 31-41, 2019.
[72] Bollegala et al., Relational Duality: Unsupervised Extraction of Semantic Relations between Entities on the Web. In Proceedings of the 19th International Conference on World Wide Web, WWW 2010, pages 151-160, Raleigh, North Carolina, USA, 2010.
[73] Parapar, Javier & Losada, David & Presedo-Quindimil, Manuel & Barreiro, Alvaro, Using score distributions to compare statistical significance tests for information retrieval evaluation. Journal of the Association for Information Science and Technology 71 (10.1002/asi.24203), 2019.
[74] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schutze, Introduction to Information Retrieval. Cambridge University Press, 2008.
[75] W. Chen, F. Cai, H. Chen et al., Personalized query suggestion diversification in information retrieval. Front. Comput. Sci. 14, 143602, 2020.
[76] Trần Lâm Quân, Vũ Tất Thắng, Kỹ thuật gợi ý truy vấn hướng ngữ cảnh trong bài toán tìm kiếm. Hội thảo Quốc gia lần thứ XV: Một số vấn đề chọn lọc của Công nghệ Thông tin và Truyền
thông, 03-04.12.2012.
[77] Z. Liao, D. Jiang, E. Chen, P. Pei, H. Cao, H. Li, Mining Concept Sequences from Large-Scale Search Logs for Context-Aware Query Suggestion. ACM Trans. Intell. Syst. Technol. 9, 4, Article 87, 40 pages, 2011.
[78] T. Ruotsalo, G. Jacucci & S. Kaski, Interactive faceted query suggestion for exploratory search: Whole-session effectiveness and interaction engagement. Journal of the Association for Information Science and Technology, 2019.
[79] Souvick Ghosh, Chirag Shah, Session-based Search Behavior in Naturalistic Settings for Learning-related Tasks. In CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pages 2449–
[80] Sowmya Yalamanchili (IBM), Log mining in Query recommendation. International Journal of Information Technology & Systems, Vol. 4; No. 1: ISSN: 2277-9825, 2015.
[81] X. Fei, S. Zheng, L. Yan and C. Fan, An improved sequential pattern mining algorithm based on PrefixSpan. World Automation Congress (WAC), Rio Grande, 2016.
[82] Zhengshen Jiang, Hongzhi Liu, Bin Fu, Zhonghai Wu, Tao Zhang, Recommendation in Heterogeneous Information Networks based on Generalized Random Walk Model and Bayesian Personalized Ranking. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (WSDM '18), pages 288–296, February 2018.
[83] Gao, J., et al., Smoothing clickthrough data for web search ranking. SIGIR'09, pages 355-362, 2009.
[84] C. Rasell and M. Szummer, Random walks on the click graph. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR’07), pages 239-246, 2007.
[85] M. Shajalal, M. Z. Ullah, A. N. Chy and M. Aono, Query subtopic diversification based on cluster ranking and semantic features. International Conference On Advanced Informatics: Concepts, Theory And Application (ICAICTA), George Town, pages 1-6, 2016.
[86] Xiaofei, Zhu., et al., A unified framework for recommending diverse and relevant queries. In
Proceedings of the 20th international conference on World wide web (WWW '11), pages 37–46, March 2011.
[87] Wanyu Chen, Fei Cai, Honghui Chen, Maarten de Rijke, Personalized Query Suggestion Diversification. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17), pages 817–820, August 2017.
[88] Claudio Carpineto, Sergei O. Kuznetsov, Amedeo Napoli, Formal Concept Analysis Meets Information Retrieval. Workshop co-located with the 35th European Conference on Information Retrieval (ECIR), 2013.
[89] Frano Škopljanac-Mačina, Bruno Blašković, Formal Concept Analysis – Overview and Applications. ScienceDirect, 24th DAAAM International Symposium on Intelligent Manufacturing and Automation, 2013.
[90] Larry González, Aidan Hogan, Modelling Dynamics in Semantic Web Knowledge Graphs with Formal Concept Analysis. In Proceedings of the 2018 World Wide Web Conference (WWW '18), pages 1175–1184, April 2018.
[91] A. Abid, M. Rouached & N. Messai, Semantic web service composition using semantic similarity measures and formal concept analysis. Multimed. Tools Appl. 79, 6569–6597, Dec 2019.
[92] Claudio Carpineto and Giovanni Romano, Using Concept Lattices for Text Retrieval and Mining. In Formal Concept Analysis, pages 161-179, 2005.
[93] Singh, Prem & Cherukuri, Aswani Kumar, Concept lattice reduction using different subset of attributes as information granules. In Granular Computing. Springer International Publishing Switzerland, 2016.
[94] Bernhard Ganter, Sebastian Rudolph, Gerd Stumme, Explaining Data with Formal Concept Analysis. Springer International Publishing, 2019.
[95] Nizar Messai, Marie-Dominique Devignes, Amedeo Napoli, and Malika Smail-Tabbone, BR-Explorer: An FCA-based algorithm for Information Retrieval. Fourth International Conference, CLA, 2006.
...
