Developing Neural Network-Based Models for Aspect-Based Sentiment Analysis

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

PHẠM ĐỨC HỒNG

DEVELOPING NEURAL NETWORK-BASED MODELS FOR ASPECT-BASED SENTIMENT ANALYSIS

DOCTORAL THESIS IN COMPUTER SCIENCE

Specialization: Computer Science
Code: 9480101.01
Scientific supervisor: Assoc. Prof. Dr. Lê Anh Cường

Hanoi - 2019

ACKNOWLEDGMENTS

This doctoral thesis in Computer Science was partially funded by the Government of Vietnam through Project 911. It was carried out at the Department of Computer Science, Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi, with procedural support from the Finance and Accounting Office of the University of Engineering and Technology. Funding for the SCIE- and SCI-indexed publications was also provided by Electric Power University and by NAFOSTED project 102.01-2014.22 of the National Foundation for Science and Technology Development. I sincerely thank these units and organizations for their help during my research.

The thesis also benefited from the cooperation and support of individuals who contributed greatly to completing its research problems. First of all, I sincerely thank Assoc. Prof. Dr. Lê Anh Cường, who supervised me directly, helped me, and was always willing to create favorable conditions for my study and research.

I sincerely thank Assoc. Prof. Dr. Hoàng Xuân Huấn, Assoc. Prof. Dr. Phan Xuân Hiếu, Dr. Nguyễn Văn Vinh, Dr. Lê Nguyên Khôi, Dr. Nguyễn Bá Đạt, and Dr. Nguyễn Thị Ngọc Điệp (University of Engineering and Technology, Vietnam National University, Hanoi), Assoc. Prof. Dr. Lê Thanh Hương (Hanoi University of Science and Technology), Dr. Nguyễn Thị Minh Huyền (University of Science, Vietnam National University, Hanoi), Assoc. Prof. Dr. Trần Đăng Hưng (Hanoi National University of Education), and Dr. Đặng Thị Thu Hiền (Thuyloi University) for their sincere and frank comments, which helped me improve the thesis.

I am sincerely grateful to Assoc. Prof. Dr. Nguyễn Lê Minh (Japan Advanced Institute of Science and Technology) and Dr. Trần Quốc Long (University of Engineering and Technology, Vietnam National University, Hanoi). Both taught me directly and shared with me much knowledge related to the research content.

I thank all my friends, colleagues, and fellow PhD students at the Department of Computer Science, Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi, who helped me complete the plans and administrative procedures during my time as a PhD student. I also thank my colleagues and the lecturers of the Faculty of Information Technology, Electric Power University, who encouraged me and stood by me throughout the research.

Finally, I give special thanks to my wife, Lê Thị Kim Chung, and my son, Phạm Công Phúc, for their love and understanding, which allowed me to devote much time and attention to my research. I am wholeheartedly grateful to my parents for the love and great dedication that brought me to where I am today, and I thank my brothers and sisters for their family love, care, and help with my work.

DECLARATION

I declare that this thesis is the result of my own research, carried out under the supervision of Assoc. Prof. Dr. Lê Anh Cường. Content taken from the research of other authors and presented in this thesis is clearly attributed in the references section.

Phạm Đức Hồng

Table of contents

Acknowledgments
Declaration
Table of contents
List of abbreviations
List of tables
List of figures
Preface
1 Overview of the research problem
  1.1 Introduction to the problem
  1.2 Sentiment analysis problems
    1.2.1 Overview of a sentiment analysis system
    1.2.2 Document-level sentiment analysis
    1.2.3 Aspect-based sentiment analysis
    1.2.4 Aspect-based sentiment analysis tasks
  1.3 Related work
    1.3.1 Aspect term extraction
    1.3.2 Aspect category detection
    1.3.3 Aspect segmentation
    1.3.4 Aspect-based sentiment classification
    1.3.5 Aspect ranking
    1.3.6 Determining latent aspect ratings and aspect weights
  1.4 Approaches to solving the problems
  1.5 Research worldwide and in Vietnam
  1.6 Discussion
2 Background
  2.1 Notation and related concepts
  2.2 Baseline machine learning models for aspect-based sentiment analysis
    2.2.1 The Latent Rating Regression model
    2.2.2 The probabilistic aspect ranking algorithm
  2.3 Representation learning models at the word, sentence, and paragraph/document levels
    2.3.1 The Word2Vec model
    2.3.2 The GloVe model
    2.3.3 The Paragraph vector model
    2.3.4 The convolutional neural network (CNN) model
    2.3.5 The compositional vector model
  2.4 Conclusion and discussion
3 Proposed neural network-based models for determining aspect ratings and aspect weights of entities
  3.1 Introduction
  3.2 Problem formulation
    3.2.1 Determining the latent aspect ratings and aspect weights of an entity
    3.2.2 Determining the overall aspect weights of entities
  3.3 Proposed methods
    3.3.1 Determining latent aspect ratings and aspect weights of an entity using a neural network model with a hidden layer
    3.3.2 Determining latent aspect ratings and aspect weights of an entity using a multi-layer representation learning model
    3.3.3 Determining the overall aspect weights of entities using a neural network model
  3.4 Experiments
    3.4.1 Evaluation measures
    3.4.2 Model settings
    3.4.3 Experimental results
    3.4.4 Evaluation
    3.4.5 Effect of the parameters of the LRNN-ASR model
  3.5 Conclusion
4 Learning word embeddings for aspect-based sentiment analysis
  4.1 Introduction
  4.2 Problem formulation
    4.2.1 The word embedding fine-tuning problem
    4.2.2 The word embedding learning problem
  4.3 Proposed methods
    4.3.1 The word embedding fine-tuning model
    4.3.2 The SSCWE word embedding learning model
  4.4 Experiments
    4.4.1 Experimental data and evaluation measures
    4.4.2 Evaluation measures
  4.5 Settings and evaluation of the WEFT word embedding fine-tuning model
    4.5.1 Model settings
    4.5.2 Model evaluation
  4.6 Settings and evaluation of the SSCWE model
    4.6.1 Model settings
    4.6.2 Model evaluation
    4.6.3 Comparison of the WEFT and SSCWE models
  4.7 Conclusion
5 A CNN-based multichannel model exploiting multiple word embeddings and character vectors for aspect-based sentiment analysis
  5.1 Introduction
  5.2 Problem description
  5.3 Proposed method
    5.3.1 The convolution component
    5.3.2 The multichannel convolutional neural network model for aspect-based sentiment analysis
  5.4 Experiments
    5.4.1 Experimental data and MCNN model settings
    5.4.2 Experimental environment and running time
    5.4.3 Evaluation
    5.4.4 Effect of the parameter types
  5.5 Conclusion
Conclusion
List of the author's publications related to the thesis
References

List of abbreviations

LRNN    Latent Rating Neural Network
LRR     Latent Rating Regression
ASR     Aspect Semantic Representation
NNAWs   Neural Network Aspect Weights
CNN     Convolutional Neural Network
MCNN    Multichannel Convolutional Neural Network
NLP     Natural Language Processing
POS     Part of Speech
SVM     Support Vector Machine

List of tables

3.1 Seed words selected for the aspect segmentation algorithm
3.2 Statistics of the experimental data
3.3 Hotel rating prediction results
3.4 Results of determining hotel aspect weights
3.5 Comparison of the LRNN model with the LRR method under four aspect representations
3.6 Top 10 words with positive and negative weights for each aspect
3.7 Experimental results comparing the models on aspect rating determination
3.8 Comparison of the quality of the overall aspect weights
3.9 Experimental results for different initializations of the aspect weights
3.10 Experimental results of the proposed model using overall aspect weights versus entity-specific aspect weights
4.1 Dataset statistics
4.2 Aspect category detection results
4.3 Aspect-based sentiment classification results
4.4 The four words semantically closest to a given word for each model
4.5 Aspect category detection results
4.6 Sentiment classification results
4.7 The five words semantically closest to a given word for each model
4.8 Comparison of the sentiment classification results of the WEFT and SSCWE models
4.9 Comparison of the running time of the WEFT and SSCWE models
5.1 Number of sentences used in the experiments

5.5 Conclusion

Chapter 5 presented a multichannel convolutional neural network model that exploits multiple word embeddings together with character vectors. The proposed model was published as item [1] in the author's list of publications, in the International Journal of Approximate Reasoning. The model was evaluated on two aspect-based sentiment analysis tasks: aspect category detection and aspect sentiment classification. The experimental results show that the proposed model is effective. In particular, character-level information plays an important role when combined with word-level information. In the future, we plan to apply the MCNN model to more datasets in other languages. The proposed model could also be applied to other sentiment analysis tasks, such as predicting aspect weights and ranking aspects.

CONCLUSION

Aspect-based sentiment analysis is an important problem in natural language processing, machine learning, and knowledge mining. It has attracted much research and is of great value to commerce systems and to the management of products, services, events, and reputation. This thesis focuses on developing models based on representation learning and deep learning to improve the quality of aspect-based sentiment analysis systems. It addresses two specific tasks: (1) determining aspect ratings and aspect weights; (2) building effective models for aspect category detection and aspect-based sentiment classification. We proposed models, ran experiments, and compared the results with related studies to establish the reliability of the proposed models. The contributions of the thesis are:

• A neural network model that determines the latent aspect ratings and aspect weights of an entity, taking as input aspect representation vectors learned by the Paragraph vector model.
• For entities of the same type that share aspects (for example, hotels share the same aspects), a neural network model that determines the overall aspect weights of the entities.
• Also for the problem of determining latent aspect ratings and aspect weights, a neural network-based model that represents semantics hierarchically, from the word level up to the sentence level and the paragraph level, and integrates the aspect representations into one unified model.
• Two word embedding learning models. The first fine-tunes vectors learned by the Word2Vec and GloVe models. The second learns word embeddings with two components: one designed on the Word2Vec model to capture semantic relations between words, and one that uses supervised information to capture aspect information and aspect sentiment.
• A model that combines different sources of data representation, called the multichannel CNN model, which exploits multiple versions of word embeddings together with character vectors.

All proposed models were evaluated in detail on English datasets from domains in which entities have aspects that customers discuss and rate in their opinions and reviews. The proposed models achieved better results than the related studies they were compared with. In particular, the multi-layer neural network model for representation learning proved markedly more effective than the other methods at determining latent aspect ratings and aspect weights. In the future we will evaluate the proposed models on other English datasets. We also plan to focus on applying the proposed models in systems that analyze real-world Vietnamese data, such as banking, securities, and mobile phone data.

List of the author's publications related to the thesis

[1] Duc-Hong Pham, and Anh-Cuong Le, “Exploiting Multiple Word Embeddings and One-hot Character Vectors for Aspect-Based Sentiment Analysis”, International Journal of Approximate Reasoning (IJAR), 103, 2018, pp. 1-10 (ISI-SCI).
[2] Duc-Hong Pham, and Anh-Cuong Le, “Learning Multiple Layers of Knowledge Representation for Aspect Based Sentiment Analysis”, Data & Knowledge Engineering (DKE), 114, 2018, pp. 26-39 (ISI-SCIE).
[3] Duc-Hong Pham, Thi-Thanh-Tan Nguyen, and Anh-Cuong Le, “Fine-Tuning Word Embeddings for Aspect-based Sentiment Analysis”, Proceedings of the 20th International Conference on Text, Speech and Dialogue (TSD), 2017, pp. 500-508 (Rank B1).
[4] Duc-Hong Pham, Anh-Cuong Le, and Thi-Kim-Chung Le, “Learning Word Embeddings for Aspect-based Sentiment Analysis”, Proceedings of the 15th International Conference of the Pacific Association for Computational Linguistics (PACLING), 2017, pp. 28-40 (Rank B).
[5] Duc-Hong Pham, Anh-Cuong Le, and Thi-Thanh-Tan Nguyen, “Determing Aspect Ratings and Aspect Weights from Textual Reviews
by Using Neural Network with Paragraph Vector Model”, Proceedings of the 5th International Conference on Computational Social Networks (CSONet), 2016, pp 309-320 [6] Duc-Hong Pham, and Anh-Cuong Le, “A Neural Network based Model for Determining Overall Aspect Weights in Opinion Mining and Sentiment Analysis”, Indian Journal of Science and Technology, 2016, pp 1-6 112 Tài liệu tham khảo [1] H Wang, Y Lu, C Zhai, Latent aspect rating analysis on review text data: A rating regression approach, in: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10, ACM, New York, NY, USA, 2010, pp 783–792 [2] Q V Le, T Mikolov, Distributed representations of sentences and documents, in: ICML, Vol 32 of JMLR Workshop and Conference Proceedings, JMLR.org, 2014, pp 1188–1196 [3] Y Kim, Convolutional neural networks for sentence classification, in: EMNLP, ACL, 2014, pp 1746–1751 [4] T Wong, W Lam, Hot item mining and summarization from multiple auction web sites, in: ICDM, IEEE Computer Society, 2005, pp 797–800 [5] W Jin, H H Ho, New York, NY, USA [6] F Li, C Han, M Huang, X Zhu, Y.-J Xia, S Zhang, H Yu, Structure-aware review mining and summarization, in: Proceedings of the 23rd International Conference on Computational Linguistics, COLING ’10, 2010, pp 653–661 [7] S Poria, E Cambria, A F Gelbukh, Aspect extraction for opinion mining with a deep convolutional neural network, Knowl.-Based Syst 108 (2016) 42–49 [8] X Li, L Bing, P Li, W Lam, Z Yang, Aspect term extraction with history attention and selective transformation, in: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI’18, AAAI Press, 2018, pp 4194–4200 URL http://dl.acm.org/citation.cfm?id=3304222.3304353 [9] J Zhang, G Xu, X Wang, X Sun, T Huang, Syntax-aware representation for aspect term extraction, in: Q Yang, Z.-H Zhou, Z Gong, M.-L Zhang, S.-J Huang (Eds.), Advances in Knowledge Discovery and Data Mining, Springer International 
Publishing, Cham, 2019, pp 123–134 113 [10] M Hu, B Liu, Mining and summarizing customer reviews, in: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’04, ACM, New York, NY, USA, 2004, pp 168–177 [11] A.-M Popescu, O Etzioni, Extracting product features and opinions from reviews, in: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT ’05, Association for Computational Linguistics, Stroudsburg, PA, USA, 2005, pp 339–346 [12] O Etzioni, M Cafarella, D Downey, A.-M Popescu, T Shaked, S Soderland, D S Weld, A Yates, Unsupervised named-entity extraction from the web: An experimental study, Artif Intell 165 (1) (2005) 91–134 [13] Q Mei, X Ling, M Wondra, H Su, C Zhai, Topic sentiment mixture: modeling facets and opinions in weblogs, in: WWW, ACM, 2007, pp 171–180 [14] Y Wu, Q Zhang, X Huang, L Wu, Phrase dependency parsing for opinion mining, in: EMNLP, ACL, 2009, pp 1533–1541 [15] Z Luo, S Huang, F F Xu, B Y Lin, H Shi, K Zhu, ExtRA: Extracting prominent review aspects from customer feedback, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Brussels, Belgium, 2018, pp 3477–3486 [16] M Dragoni, M Federici, A Rexha, An unsupervised aspect extraction strategy for monitoring real-time reviews stream, Information Processing & Management 56 (3) (2019) 1103 – 1118 [17] G Ganu, N Elhadad, A Marian, Beyond the stars: Improving rating predictions using review text content, in: WebDB, 2009 [18] S Kiritchenko, X Zhu, C Cherry, S Mohammad, Nrc-canada-2014: Detecting aspects and sentiment in customer reviews, in: SemEval@COLING, The Association for Computer Linguistics, 2014, pp 437–442 [19] J J McAuley, J Leskovec, D Jurafsky, Learning attitudes and attributes from multi-aspect reviews, in: 12th IEEE International Conference on Data Mining, ICDM 2012, Brussels, Belgium, 
December 10-13, 2012, 2012, pp 1020–1025 [20] X Zhou, X Wan, J Xiao, Representation learning for aspect category detection in online reviews, in: Proceedings of the 29th Conference on Artificial Intelligence, AAAI 2015, Austin, Texas, USA, Association for the Advancement of Artificial Intelligence, 2015, pp 417–424 114 [21] C Sun, L Huang, X Qiu, Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence, CoRR abs/1903.09588 [22] J Devlin, M Chang, K Lee, K Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 [23] M Hu, S Zhao, L Zhang, K Cai, Z Su, R Cheng, X Shen, CAN: constrained attention networks for multi-aspect sentiment analysis, CoRR abs/1812.10735 [24] S Movahedi, E Ghadery, H Faili, A Shakery, Aspect category detection via topic-attention network, CoRR abs/1901.01183 [25] Y Lu, C Zhai, N Sundaresan, Rated aspect summarization of short comments, in: Proceedings of the 18th International Conference on World Wide Web, WWW ’09, ACM, New York, NY, USA, 2009, pp 131–140 [26] Z Zha, J Yu, J Tang, M Wang, T Chua, Product aspect ranking and its applications, IEEE Trans Knowl Data Eng 26 (5) (2014) 1211–1224 [27] X Ding, B Liu, P S Yu, A holistic lexicon-based approach to opinion mining, in: WSDM, ACM, 2008, pp 231–240 [28] W Xu, Y Tan, Semi-supervised target-oriented sentiment classification, Neurocomputing 337 (2019) 120 – 128 [29] D Tang, B Qin, T Liu, Aspect level sentiment classification with deep memory network, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, Association for Computational Linguistics, 2016, pp 214–224 [30] Y Wang, M Huang, X Zhu, L Zhao, Attention-based LSTM for aspect-level sentiment classification, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, Association for Computational Linguistics, 2016, pp 606–615 
[31] F Fan, Y Feng, D Zhao, Multi-grained attention network for aspect-level sentiment classification, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Brussels, Belgium, 2018, pp 3433–3442 URL https://www.aclweb.org/anthology/D18-1380 [32] W Xue, T Li, Aspect based sentiment analysis with gated convolutional networks, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Melbourne, Australia, 2018, pp 2514–2523 115 [33] B Snyder, R Barzilay, Multiple aspect ranking using the good grief algorithm, in: Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, April 22-27, 2007, Rochester, New York, USA, 2007, pp 300–307 [34] K Crammer, Y Singer, Pranking with ranking, in: Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada], 2001, pp 641–647 [35] I Titov, R T McDonald, A joint model of text and aspect ratings for sentiment summarization, in: ACL, The Association for Computer Linguistics, 2008, pp 308–316 [36] W Wang, H Wang, Y Song, Ranking product aspects through sentiment analysis of online reviews, Journal of Experimental & Theoretical Artificial Intelligence 29 (2) (2017) 227–246 [37] Y Liu, J.-W Bi, Z.-P Fan, Ranking products through online reviews: A method based on sentiment analysis technique and intuitionistic fuzzy set theory, Information Fusion 36 (2017) 149 – 161 [38] C Guo, Z Du, X Kou, Products ranking through aspect-based sentiment analysis of online heterogeneous reviews, Journal of Systems Science and Systems Engineering 27 (2018) 542–558 [39] H Wang, Y Lu, C Zhai, Latent aspect rating analysis without aspect keyword supervision, in: Proceedings of the 17th ACM 
SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’11, ACM, New York, NY, USA, 2011, pp 618–626 [40] Y Xu, T Lin, W Lam, Z Zhou, H Cheng, A M.-C So, Latent aspect mining via exploring sparsity and intrinsic information, in: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM ’14, ACM, New York, NY, USA, 2014, pp 879–888 [41] H Wang, M Ester, A sentiment-aligned topic model for product aspect rating prediction, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Doha, Qatar, 2014, pp 1192–1202 [42] F Wang, L Chen, Review mining for estimating users’ ratings and weights for product aspects, Web Intelligence 13 (3) (2015) 137–152 doi:10.3233/ web-150317 116 [43] Y Li, C Shi, H Zhao, F Zhuang, B Wu, Aspect mining with rating bias, in: P Frasconi, N Landwehr, G Manco, J Vreeken (Eds.), Machine Learning and Knowledge Discovery in Databases, Springer International Publishing, Cham, 2016, pp 458–474 [44] D Xiao, J Yugang, Y Li, F Zhuang, C Shi, Coupled matrix factorization and topic modeling for aspect mining, Information Processing and Management 54 (2018) 861–873 doi:10.1016/j.ipm.2018.05.002 [45] B Pang, L Lee, S Vaithyanathan, Thumbs up? 
sentiment classification using machine learning techniques, in: EMNLP, 2002 [46] V Narayanan, I Arora, A Bhatia, Fast and accurate sentiment classification using an enhanced naive bayes model, in: Proceedings of the 14th International Conference on Intelligent Data Engineering and Automated Learning — IDEAL 2013 - Volume 8206, IDEAL 2013, Springer-Verlag New York, Inc., New York, NY, USA, 2013, pp 194–201 [47] J Kramer, C Gordon, Improvement of a naive bayes sentiment classifier using mrs-based features, 2014, pp 22–29 [48] Y Bengio, A C Courville, P Vincent, Representation learning: A review and new perspectives, IEEE Trans Pattern Anal Mach Intell 35 (8) (2013) 1798 1828 [49] A Hyvăarinen, E Oja, Independent component analysis: Algorithms and applications, Neural Netw 13 (4-5) (2000) 411–430 [50] G Hinton, R Salakhutdinov, Reducing the dimensionality of data with neural networks, Science (New York, N.Y.) 313 (2006) 504–7 [51] M Weimer, A Karatzoglou, Q V Le, A J Smola, Cofi rank - maximum margin matrix factorization for collaborative ranking, in: J C Platt, D Koller, Y Singer, S T Roweis (Eds.), Advances in Neural Information Processing Systems 20, Curran Associates, Inc., 2008, pp 1593–1600 [52] Y Bengio, A connectionist approach to speech recognition, International Journal on Pattern Recognition and Artificial Intelligence (4) (1993) 647–667 [53] G E Hinton, R R Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (2006) 504–507 [54] Y Bengio, P Lamblin, D Popovici, H Larochelle, Greedy layer-wise training of deep networks, in: Advances in Neural Information Processing Systems, MIT Press, 2006, pp 153–160 117 [55] G E Hinton, Learning distributed representations of concepts, in: Proceedings of the eighth annual conference of the cognitive science society, Vol 1, Amherst, MA, 1986, p 12 [56] Y Bengio, R Ducharme, P Vincent, C Janvin, A neural probabilistic language model, Journal of Machine Learning Research (2003) 
1137–1155 [57] T Mikolov, K Chen, G Corrado, J Dean, Efficient estimation of word representations in vector space, CoRR abs/1301.3781 [58] J Pennington, R Socher, C D Manning, Glove: Global vectors for word representation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, Doha, Qatar, Association for Computational Linguistics, 2014, pp 1532–1543 [59] J Pavlopoulos, I Androutsopoulos, Aspect term extraction for sentiment analysis: New datasets, new evaluation measures and an improved unsupervised method, in: Proceedings of the 5th Workshop on Language Analysis for Social Media (LASM), Association for Computational Linguistics, Gothenburg, Sweden, 2014, pp 44–52 [60] L Zhuang, F Jing, X.-Y Zhu, Movie review mining and summarization, in: Proceedings of the 15th ACM International Conference on Information and Knowledge Management, CIKM ’06, ACM, New York, NY, USA, 2006, pp 43–50 [61] P D Turney, Thumbs up or thumbs down? semantic orientation applied to unsupervised classification of reviews, in: ACL, ACL, 2002, pp 417–424 [62] R Mihalcea, C Banea, J Wiebe, Learning multilingual subjective language via cross-lingual projections, in: ACL, The Association for Computational Linguistics, 2007 [63] F Su, K Markert, From words to senses: A case study of subjectivity recognition, in: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Coling 2008 Organizing Committee, Manchester, UK, 2008, pp 825–832 [64] B Pang, L Lee, A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts, in: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 21-26 July, 2004, Barcelona, Spain., 2004, pp 271–278 [65] B Pang, L Lee, Opinion mining and sentiment analysis, Found Trends Inf Retr (1-2) (2008) 1–135 118 [66] N Jindal, B Liu, Review spam detection, in: Proceedings of the 16th International Conference on World Wide Web, WWW 
’07, ACM, New York, NY, USA, 2007, pp 1189–1190 [67] E.-P Lim, V.-A Nguyen, N Jindal, B Liu, H W Lauw, Detecting product review spammers using rating behaviors, in: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10, ACM, New York, NY, USA, 2010, pp 939–948 [68] N Jindal, B Liu, Opinion spam and analysis, in: Proceedings of the 2008 International Conference on Web Search and Data Mining, WSDM ’08, ACM, New York, NY, USA, 2008, pp 219–230 [69] W Jin, H H Ho, A novel lexicalized hmm-based learning framework for web opinion mining, in: Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09, ACM, New York, NY, USA, 2009, pp 465–472 [70] M Hu, B Liu, Mining and summarizing customer reviews, in: KDD, ACM, 2004, pp 168–177 [71] J Yu, Z Zha, M Wang, T Chua, Aspect ranking: Identifying important product aspects from online consumer reviews, in: The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011, pp 1496–1505 [72] N X Bach, P D Van, N D Tai, T M Phuong, Mining vietnamese comparative sentences for sentiment analysis, in: 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), 2015, pp 162–167 [73] V D Nguyen, K V Nguyen, N L Nguyen, Variants of long short-term memory for sentiment analysis on vietnamese students’ feedback corpus, in: 2018 10th International Conference on Knowledge and Systems Engineering (KSE), 2018, pp 306–311 [74] K V Nguyen, V D Nguyen, P X V Nguyen, T T H Truong, N L Nguyen, Uit-vsfc: Vietnamese students’ feedback corpus for sentiment analysis, in: 2018 10th International Conference on Knowledge and Systems Engineering (KSE), 2018, pp 19–24 [75] Q Vo, H Nguyen, B Le, M Nguyen, Multi-channel lstm-cnn model for vietnamese sentiment analysis, in: 2017 9th International Conference on Knowledge and Systems Engineering (KSE), 
2017, pp 24–29 119 [76] L Mai, B Le, Aspect-Based Sentiment Analysis of Vietnamese Texts with Deep Learning, 2018, pp 149–158 [77] D Van Thin, V Duc Nguye, K Nguyen, N Luu-Thuy Nguyen, Deep learning for aspect detection on vietnamese reviews, 2018, pp 104–109 [78] A L Maas, R E Daly, P T Pham, D Huang, A Y Ng, C Potts, Learning word vectors for sentiment analysis, in: ACL, The Association for Computer Linguistics, 2011, pp 142–150 [79] D Tang, F Wei, N Yang, M Zhou, T Liu, B Qin, Learning sentiment-specific word embedding for twitter sentiment classification, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 1: Long Papers, 2014, pp 1555–1565 [80] Y Ren, Y Zhang, M Zhang, D Ji, Improving twitter sentiment classification using topic-enriched multi-prototype word embeddings, in: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA., 2016, pp 3038–3044 [81] X Zhang, J J Zhao, Y LeCun, Character-level convolutional networks for text classification, in: Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS 2015, Montreal, Canada, Curran Associates Inc., 2015, pp 649–657 [82] C N dos Santos, M Gatti, Deep convolutional neural networks for sentiment analysis of short texts, in: Proceedings of 25th International Conference on Computational Linguistics, COLING 2014, Dublin, Ireland, Association for Computational Linguistics, 2014, pp 69–78 [83] Y Kim, Y Jernite, D Sontag, A M Rush, Character-aware neural language models, in: AAAI, AAAI Press, 2016, pp 2741–2749 [84] K Ganesan, C Zhai, Opinion-based entity ranking, Information Retrieval 15 (2) (2012) 116–150 [85] T Mikolov, K Chen, G Corrado, J Dean, Efficient estimation of word representations in vector space, CoRR abs/1301.3781 [86] X Rong, word2vec parameter learning explained, CoRR abs/1411.2738 [87] Y Lecun, L Bottou, Y 
Bengio, P. Haffner, Gradient-based learning applied to document recognition, in: Proceedings of the IEEE, 1998, pp. 2278–2324.
[88] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, P. Kuksa, Natural language processing (almost) from scratch, J. Mach. Learn. Res. 12 (2011) 2493–2537.
[89] J. Mitchell, M. Lapata, Vector-based models of semantic composition, in: ACL 2008, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, June 15-20, 2008, Columbus, Ohio, USA, 2008, pp. 236–244.
[90] K. M. Hermann, P. Blunsom, Multilingual models for compositional distributed semantics, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 1: Long Papers, 2014, pp. 58–68.
[91] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, CoRR abs/1301.3781.
[92] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: NIPS, 2013, pp. 3111–3119.
[93] C.-Y. Liou, W.-C. Cheng, J.-W. Liou, D.-R. Liou, Autoencoder for words, Neurocomput. 139 (2014) 84–96.
[94] L. Bottou, Stochastic learning, in: Advanced Lectures on Machine Learning, Vol. 3176 of Lecture Notes in Computer Science, Springer, 2003, pp. 146–168.
[95] A. Cotter, O. Shamir, N. Srebro, K. Sridharan, Better mini-batch algorithms via accelerated gradient methods, in: Proceedings of the 24th International Conference on Neural Information Processing Systems, NIPS'11, Curran Associates Inc., USA, 2011, pp. 1647–1655.
[96] K. Toutanova, D. Klein, C. D. Manning, Y. Singer, Feature-rich part-of-speech tagging with a cyclic dependency network, in: Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2003, Edmonton, Canada, May 27 - June 1, 2003, 2003.
[97] H. Wang, M. Ester, A sentiment-aligned topic model for product aspect rating prediction, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, 2014, pp. 1192–1202.
[98] D. E. Rumelhart, G. E. Hinton, R. J. Williams, Neurocomputing: Foundations of research, MIT Press, Cambridge, MA, USA, 1988, Ch. Learning Representations by Back-propagating Errors, pp. 696–699.
[99] R. Collobert, J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, in: ICML, Vol. 307 of ACM International Conference Proceeding Series, ACM, 2008, pp. 160–167.
[100] C. D. Manning, P. Raghavan, H. Schütze, Introduction to information retrieval, Cambridge University Press, 2008.
[101] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for modelling sentences, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Baltimore, Maryland, 2014, pp. 655–665.
[102] M. Lakshmana, S. Sellamanickam, S. K. Shevade, K. Selvaraj, Learning semantically coherent and reusable kernels in convolution neural nets for sentence classification, CoRR abs/1608.00466.
[103] Y. Shen, X. He, J. Gao, L. Deng, G. Mesnil, Learning semantic representations using convolutional neural networks for web search, in: 23rd International World Wide Web Conference, WWW '14, Seoul, Republic of Korea, April 7-11, 2014, Companion Volume, 2014, pp. 373–374.
[104] W. Yin, H. Schütze, Multichannel variable-size convolution for sentence classification, in: Proceedings of the 19th Conference on Computational Natural Language Learning, CoNLL 2015, Beijing, China, Association for Computational Linguistics, 2015, pp. 204–214.
[105] Y. Zhang, S. Roller, B. C. Wallace, MGNC-CNN: A simple approach to exploiting multiple word embeddings for sentence classification, in: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016, San Diego, California, USA, Association for Computational Linguistics, 2016, pp. 1522–1527.
[106] W. X. Zhao, J. Jiang, H. Yan, X. Li, Jointly modeling aspects and opinions with a maxent-lda hybrid, in: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP 2010, 9-11 October 2010, MIT Stata Center, Massachusetts, USA, A meeting of SIGDAT, a Special Interest Group of the ACL, 2010, pp. 56–65.
[107] R. Collobert, J. Weston, A unified architecture for natural language processing: deep neural networks with multitask learning, in: ICML, Vol. 307 of ACM International Conference Proceeding Series, ACM, 2008, pp. 160–167.
[108] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, Journal of Machine Learning Research 15 (1) (2014) 1929–1958.
[109] S. Brody, N. Elhadad, An unsupervised aspect-sentiment model for online reviews, in: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, Association for Computational Linguistics, Stroudsburg, PA, USA, 2010, pp. 804–812.
[110] L. Wang, K. Liu, Z. Cao, J. Zhao, G. de Melo, Sentiment-aspect extraction based on restricted boltzmann machines, in: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, 2015, pp. 616–625.
[111] X. Glorot, A. Bordes, Y. Bengio, Domain adaptation for large-scale sentiment classification: A deep learning approach, in: Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011, 2011, pp. 513–520.
[112] R. Astudillo, S. Amir, W. Ling, M. Silva, I. Trancoso,
Learning word representations from scarce and noisy data with embedding subspaces, in: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Association for Computational Linguistics, 2015, pp. 1074–1084.
