
Aspect based sentiment analysis for text documents




DOCUMENT INFORMATION

Basic information

Title: Aspect Based Sentiment Analysis For Text Documents
Authors: Trần Công Toàn Trí, Nguyễn Phú Thiện
Supervisor: Dr. Lê Thanh Vân
University: Ho Chi Minh City National University
Major: Computer Science
Type: graduate thesis
Year of publication: 2021
City: Ho Chi Minh
Pages: 93
File size: 2.35 MB

Content

HO CHI MINH CITY NATIONAL UNIVERSITY
UNIVERSITY OF TECHNOLOGY
FACULTY OF COMPUTER SCIENCE AND ENGINEERING

GRADUATE THESIS

ASPECT BASED SENTIMENT ANALYSIS FOR TEXT DOCUMENTS

Major: Computer Science
Council: KHMT
Supervisor: Dr Le Thanh Van
Examiner: Dr Nguyen Quang Hung
Students: Tran Cong Toan Tri 1713657, Nguyen Phu Thien 1713304

Ho Chi Minh, December 2021

HO CHI MINH CITY NATIONAL UNIVERSITY - UNIVERSITY OF TECHNOLOGY
Faculty: Computer Science and Engineering
Department: Systems and Networks

SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness

GRADUATION THESIS ASSIGNMENT

Note: Students must attach this sheet to the thesis report.

Full name: TRẦN CÔNG TOÀN TRÍ, Student ID: 1713657
Full name: NGUYỄN PHÚ THIỆN, Student ID: 1713304
Major: Computer Science
Class: MTKH03

Thesis title: Phân tích cảm xúc theo khía cạnh từ dữ liệu văn bản (Aspect based sentiment analysis for text documents)

Tasks (requirements on content and initial data):
- Study the characteristics of the aspect-based sentiment analysis problem for text data.
- Study related work.
- Research and propose a model that identifies aspects, and the sentiment toward each aspect, in text data.
- Collect data to train and test the model.
- Implement the proposed model, run experiments, and compare and evaluate the results.

Assignment date: 30/08/2021
Completion date: 31/12/2021
Supervisor: 1) Dr Lê Thanh Vân. Share of supervision: 100%

The content and requirements of the thesis have been approved by the Department.
Date:
HEAD OF DEPARTMENT (signature and full name)
MAIN SUPERVISOR (signature and full name): Lê Thanh Vân

FOR THE FACULTY AND DEPARTMENT:
Reviewer (preliminary grading): _
Unit: _
Defense date: _
Final grade: _
Thesis archive location: _

UNIVERSITY OF TECHNOLOGY
FACULTY OF COMPUTER SCIENCE AND ENGINEERING

SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness

27 December 2021

THESIS DEFENSE EVALUATION FORM
(For the supervisor/reviewer)

Students: Trần Công Toàn Trí, Nguyễn Phú Thiện
Student IDs: 1713657, 1713304
Major: Computer Science
Topic: Phân tích cảm xúc theo khía cạnh từ dữ liệu văn bản (Aspect based sentiment analysis for text documents)
Supervisor/reviewer: Dr Lê Thanh Vân

Overview of the written report:
Number of pages: 88
Number of chapters: (including appendix chapters)
Number of data tables: 16
Number of figures: 38
Number of references:
Computational software:
Physical product:

Overview of drawings:
- Number of drawings: A1 sheets: A2 sheets: Other sizes:
- Hand-drawn drawings: Computer-drawn drawings:

Main strengths of the thesis:
The thesis aims to propose a model for aspect-based sentiment analysis of text data. To achieve this goal, the students carried out the following work well:
- Studied the characteristics of sentiment analysis in general, and of aspect-based sentiment analysis in particular, for text data.
- Studied prominent related research from recent years.
- Proactively contacted the VLSP and UIT research groups to obtain sample datasets for building the training and test sets. Built a crawler to collect data from booking.com so that real-world data was available for evaluating the proposed model.
- Studied and analyzed well the strengths of the BERT and PhoBert natural language processing models, of models that integrate hidden layers, of models that classify hierarchically by entity, aspect, and sentiment, and of Bert-based models that construct auxiliary sentences.
- Applied the NLI-B model, which constructs auxiliary sentences on top of PhoBert, to the Vietnamese language.
- Proposed the HSUM-HC model, which combines and makes good use of the strengths of PhoBert and of the model's hidden layers to improve semantic understanding, integrated with a hierarchical classification model.
- Evaluated NLI-B and two HSUM-HC variants that differ in the number of hidden layers on the VLSP, UIT, and Booking.com datasets for the restaurant and hotel domains. The experiments gave better results in most cases when compared with Linear SVM, Multilayer Perceptron, CNN, BiLSTM+CNN, PhoBert, and viBert. In addition, the model achieves higher evaluation scores when the data is represented at document level, because it captures the semantic links between sentences well.

Furthermore, seeing that the model gave good evaluation results, in the final stage the thesis added a simple search application that lets users enter hotel requirements freely instead of following the predefined criteria of booking sites. The application analyzes the request, identifies the requirements, and returns the hotels that match the search criteria. It demonstrates the practical applicability of the problem, and the thesis suggests that a later project group develop it into a complete recommender system.

The students also wrote a scientific paper, "HSUM-HC: Integrating Bert-based hidden aggregation to hierarchical classifier for Vietnamese aspect-based sentiment analysis", which was accepted for presentation at the IEEE 2021 8th NAFOSTED Conference on Information and Computer Science (NICS) on 21/12/2021 in Hanoi, Vietnam.

Main shortcomings of the thesis:
Some sentences in the thesis report are long and should be split so that the meaning is clearer and easier to follow.

Recommendation: Approved for defense [x]
Needs additions before defense [ ] Not approved for defense [ ]
Questions the students must answer before the Council: a. b. c.
Overall assessment (in words: excellent, good, average): Excellent
Score: 10/10
Signature (full name): Lê Thanh Vân

DECLARATION

We, Tran Cong Toan Tri and Nguyen Phu Thien, declare that this thesis titled "Sentiment Analysis of User Comments" and the work presented in it are our own and that, to the best of our knowledge and belief, it contains no material previously published or written by another person (except where explicitly defined in the acknowledgments), nor material which to a substantial extent has been submitted for the award of any other degree or diploma of a university or other institution of higher learning.

Acknowledgements

It gives us great pleasure and satisfaction to present our thesis on "Aspect based sentiment analysis for text documents". This is our last project as bachelor students, and it reflects what we have learned and the skills we have acquired during our years at the University of Technology.

For that reason, we would like to express our deepest gratitude to our supervisor, Dr Le Thanh Van, for allowing us to work on this project, for her continuous support throughout our study and research, and for the patience, motivation, and knowledge that she has given us. We could not have imagined a better advisor and mentor for our thesis.

Besides our advisor, we would like to thank the teachers and staff of the Computer Science and Engineering Faculty for the knowledge, experience, and support they have given us during our time at university. Their teaching helped us acquire the foundational knowledge for this thesis.

Lastly, we would like to thank all of our friends and family, who have motivated us every step of the way. We would not have been able to finish this without them, and we are grateful for everything we have been given to this day.

Abstract

In today's world, customers and their feedback are
vital to any business's survival. Competition is harsh in every market, and the competitor who best understands and pleases their customers will be more successful. For that to happen, businesses need to gather information about their customers' opinions on a large scale. Aspect-based sentiment analysis (ABSA) is one way to achieve this. It has been studied rigorously by researchers in the past, and since the creation of Bert, ABSA methods have become more and more advanced, showing better and better results in recent years. For Vietnamese, however, ABSA is still not as developed, due to limited resources and the nuances of the language.

In our work, we want to improve the capabilities of Vietnamese ABSA. We use PhoBert, the state-of-the-art (SOTA) pre-trained model for Vietnamese, and build two models on it. One has a custom classifier, made from a combination of previous methods that proved effective. The other utilizes Bert's sequence-pair feature, constructing auxiliary sentences and turning ABSA into a question-answering problem. With our work, we hope to set new baseline results for the Vietnamese ABSA datasets and to provide useful knowledge for any researchers who want to improve them further. Our implementation achieved SOTA scores on both public Vietnamese ABSA datasets, with considerably higher scores than previous works.

To demonstrate that our model works not only on filtered data but also on actual user reviews, we also obtained reviews from a booking site and applied our model to them. From that data, we made a profile for each hotel, finding its pros and cons, and then built a search engine that helps users book their accommodation by providing immediate access to the necessary information. In this work, we describe how we acquired these data, their evaluation results, and our process of designing the search engine.

Contents

1 Introduction
1.1 Why we chose this project
1.2 Project goal
1.3 Project scope
1.4 Project structure
2 Aspect Based Sentiment Analysis
2.1 What is ABSA
2.2 ABSA
research overview
2.3 Vietnamese ABSA shared task
2.4 Related work
3 Our proposed models
3.1 Bert sequence-pair with auxiliary sentences
3.2 HSUM-HC
4 Experimental results and discussion
4.1 Exploratory data analysis
4.2 Training process
4.3 Training cost
4.4 Experimental Results
4.5 Analysis and Discussion
4.6 Evaluation on real-life data
4.7 Survey results
5 Model application for a recommender system
5.1 Inspiration
5.2 Overview
5.3 Technology
5.4 Design

Ho Chi Minh city National University - University of Technology
Faculty of Computer Science and Engineering

5.6 Evaluation

5.6.1 Hotel profiling evaluation

With traditional scoring that relies on users, the score fluctuates from customer to customer: a score of 8.0 can be high to some but low to others. In addition, a large number of users leave a rating without a review that explains it. This is a difficult problem for hotels, since they can receive low ratings without knowing the reason.

By calculating the score of each hotel from past reviews, we can build a more trustworthy and systematic scoring system. Every rating that contributes to the final score is backed by an actual review. With this system, hotel owners can monitor their business and see clearly which aspect is making their customers dissatisfied. For users, it is also useful to know that each hotel score comes directly from customer reviews, together with the reasoning behind it. And because we display the relevant reviews, users do not have to wonder about arbitrary bad scores without explanation. This gives them more confidence in our recommendations.

5.6.2 Recommendation evaluation

With Travel Link, we hope to give users a list of suitable hotels. Because we store the reviews for each hotel, we can analyze and sort these reviews by how closely they relate to the user's needs, which saves them from having to filter
through too many comments and presents them with the necessary information to make their choice easier. Overall, we judge that our system succeeds in this task: it gives users a generally accurate list of hotels, and by reading the top comments sorted by our system, users can grasp each hotel's pros and cons. When they are interested, they can get more information by following the link to the hotel's official booking.com page.

Although our method helps users search for accommodation, we recognize the flaws of this system: the predefined labels are too general for some queries and cannot cover every user requirement. In that case, the recommended hotels and related comments may have similar features, but not exactly what the user wants. If a feature matters to a user, they still have to spend a little more effort reading through hotel comments to find it. However, this process is still faster than reading through the original comments without labeling and sorting.

Chapter 6
Conclusion

In this chapter, we conclude our work by summarizing the key findings with respect to the research goals and by discussing the results and their contribution. We also review the limitations of our research and propose opportunities for future work.

6.1 Our contribution

This study aimed not only to improve on and set new baseline results for the Vietnamese ABSA task, but also to provide useful knowledge for any researchers who want to work on this task in the future. We did extensive research on ABSA implementations for both Vietnamese and English to learn their methodology, advantages, and disadvantages. From that knowledge, we presented two ABSA models: HSUM-HC and Vietnamese NLI_B. We evaluated our models on two different datasets (VLSP 2018 ABSA and UIT ABSA) at different review levels and achieved strong results compared to previous methods, with the HSUM-HC model achieving SOTA results on both datasets. We also find that since
Bert-large has 24 hidden layers, using layers for aggregation gives better performance than the originally proposed layers usage.

We also analyzed the performance of our HSUM-HC model by using it to classify real-life data obtained from Booking.com and by conducting a small survey to get others' opinions on how accurate our model's labels are. Both showed very promising results and assured us that the model works well not only with pre-made datasets but also on real users' comments.

To further demonstrate our model's potential, we also built a hotel recommender system that uses the HSUM-HC model to create hotel profiles from past customer reviews and comments. The system works as follows: given a query from a user, we analyze the aspects and sentiments of that query with our model and use the result to suggest suitable hotels, as well as giving users quick access to the most useful reviews according to their interest.

6.2 Research Paper

At the time of writing this thesis, we have submitted a research paper titled "HSUM-HC: Integrating Bert-based hidden aggregation to hierarchical classifier for Vietnamese aspect-based sentiment analysis" to the 2021 8th NAFOSTED Conference on Information and Computer Science (NICS), and it has been accepted for presentation. In the paper, we describe how we constructed the HSUM-HC model by integrating PhoBert's top-level hidden layers into a hierarchical classifier, and we show that it outperforms previous models on two public Vietnamese datasets. The NICS 2021 conference will take place on December 21-22, 2021 in Hanoi, and the paper publication will follow soon after. We hope this research paper will help inspire and elevate research on Vietnamese NLP, and on ABSA in particular.

6.3 Limitations

Even though we carried out our study and implemented it very
carefully, certain limitations could not be avoided, both objective and subjective, and they need more work and study in the future.

Limited resources and data for the ABSA task. As we have stated throughout the thesis, resources and studies for Vietnamese NLP tasks in general, and for the ABSA task in particular, are limited. This leads to a lack of standard datasets with which to fully train and test our models across different domains and review levels.

Sensitivity to spelling mistakes and sentences without accent marks. Even though our model performed quite well on real-life data, it still has some trouble with non-standard input, i.e. input with spelling mistakes or without accent marks. Such inputs appear in the training and test datasets as well as in our real-life data and, although few in number, they still hinder our model's performance.

PhoBert's maximum length of 256 subword tokens. We built both of our models on PhoBert-large, which employs the RoBERTa [27] implementation with a maximum length of 256 subword tokens. This means that any input longer than 256 subword tokens causes the model to fail and stops the whole training process. To accommodate this, we set the maximum input length to 256 at document level. Any longer input is truncated, which leads to information loss and hinders our models' performance on long inputs.

Google Colab's limitations. We built our models using Google Colab with the Colab+ package, which gives us access to a Tesla P100 GPU with 16 GB of VRAM. This amount of resources is still quite limited for a model of this size. Since we are using PhoBert-large, with a hidden layer size of 1024, and setting our input's maximum length to the model's limit of 256 tokens, we have to reduce our batch size to a very small number, or else we would run into
the out-of-memory problem. This prevents us from experimenting with a wide range of parameters and significantly increases our training time.

Even though we have Google Colab+, which is a premium service compared to standard Colab, our use of the GPU is still limited. Because our model is large and our experiments run for long hours, our GPU access is sometimes revoked and only restored after some time. Furthermore, Colab+'s maximum session runtime is 24 hours, and our NLI_B training run, with only a limited number of epochs, already takes 21 hours, which means we are unable to test the full extent of NLI_B's capabilities with more epochs.

We also find that Colab's connection is not stable enough for long training runs. When training our NLI_B models, we had to re-train several times before reaching a satisfactory result. Sometimes the training process stops without warning, or the internet connection drops briefly, and if we are not present to take action, the training process is lost.

6.4 Future work

In the future, we would like to apply our models to different domains and different languages. We would also like to combine our model with a Vietnamese spelling correction model and/or a model that restores Vietnamese accent marks. We expect this to increase our models' performance considerably, especially on real-life data. If we can accomplish that, we can use our models to classify online data and build new, standard ABSA datasets, which would help Vietnamese NLP research grow faster than before.

For Travel Link, our travel search engine, we would like to deploy it and invite others to test it and give us feedback, so that we can make improvements accordingly. Then we would like to expand it further, adding more features to help users filter information. We can add logins to save user profiles and store their preferences; adding these preferences to
the suggestion algorithm would help us give better and better suggestions over time.

Bibliography

[1] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding, 2019.
[2] Bing Liu. Web data mining. July 2011.
[3] Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. SemEval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 27–35, Dublin, Ireland, August 2014. Association for Computational Linguistics.
[4] Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeniy Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, and Gülşen Eryiğit. SemEval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 19–30, San Diego, California, June 2016. Association for Computational Linguistics.
[5] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need, 2017.
[6] Papers with Code. Aspect-based sentiment analysis on SemEval 2014 task, sub task. https://paperswithcode.com/sota/aspect-based-sentiment-analysis-on-semeval. Accessed: 2021-08-12.
[7] Kiet Van Nguyen, Vu Duc Nguyen, Phu X. V. Nguyen, Tham T. H. Truong, and Ngan Luu-Thuy Nguyen. UIT-VSFC: Vietnamese students' feedback corpus for sentiment analysis. In 2018 10th International Conference on Knowledge and Systems Engineering (KSE), pages 19–24, 2018.
[8] Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, and Ngan Luu-Thuy Nguyen. Emotion recognition for Vietnamese social media text. CoRR, abs/1911.09339, 2019.
[9] Huyen T. M. Nguyen, Hung V. Nguyen, Quyen T. Ngo, Luong X. Vu, Vu Mai Tran, Bach X. Ngo, and Cuong A. Le. VLSP shared task: sentiment analysis. Journal of Computer Science and Cybernetics, 34(4):295–310, 2018.
[10] Dang Van Thin, Ngan Luu-Thuy Nguyen, Tri Minh Truong, Lac Si Le, and Duy Tin Vo. Two new large corpora for Vietnamese aspect-based sentiment analysis at sentence level. ACM Trans. Asian Low-Resour. Lang. Inf. Process., 20(4), May 2021.
[11] Thin-Dang Van, Kiet-Van Nguyen, and Ngan Luu-Thuy Nguyen. NLP@UIT at VLSP 2018: A supervised method for aspect based sentiment analysis. In Proceedings of the Fifth International Workshop on Vietnamese Language and Speech Processing (VLSP), 2018.
[12] Pham Quang Nhat Minh and Tuan Anh Nguyen. Using multilayer perceptron for aspect-based sentiment analysis at VLSP-2018 SA task. In Proceedings of the Fifth International Workshop on Vietnamese Language and Speech Processing (VLSP 2018), 2018.
[13] Thin Dang, Vu Duc, Kiet Nguyen, and Ngan Nguyen. Deep learning for aspect detection on Vietnamese reviews. Pages 104–109, November 2018.
[14] Thin Dang, Duc-Vu Nguyen, Kiet Nguyen, Ngan Nguyen, and Tu-Anh Hoang. Multi-task Learning for Aspect and Polarity Recognition on Vietnamese Datasets, pages 169–180. July 2020.
[15] Ning Liu, Bo Shen, Zhenjiang Zhang, Zhiyuan Zhang, and Kun Mi. Attention-based sentiment reasoner for aspect-based sentiment analysis. Human-centric Computing and Information Sciences, 9, December 2019.
[16] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need, 2017.
[17] Wilson L. Taylor. "Cloze procedure": A new tool for measuring readability. Journalism & Mass Communication Quarterly, 30:415–433, 1953.
[18] Dat Quoc Nguyen and Anh Tuan Nguyen. PhoBERT: Pre-trained language models for Vietnamese. In Findings of
the Association for Computational Linguistics: EMNLP 2020, pages 1037–1042, Online, November 2020. Association for Computational Linguistics.
[19] Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras, and Mark Johnson. VnCoreNLP: A Vietnamese natural language processing toolkit. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, 2018.
[20] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. August 2015.
[21] Dang Van Thin, Lac Si Le, Vu Xuan Hoang, and Ngan Luu-Thuy Nguyen. Investigating monolingual and multilingual BERT models for Vietnamese aspect category detection, 2021.
[22] Natesh Reddy, Pranaydeep Singh, and Muktabh Mayank Srivastava. Does BERT understand sentiment? Leveraging comparisons between contextual and non-contextual embeddings to improve aspect-based sentiment models. CoRR, abs/2011.11673, 2020.
[23] Ganesh Jawahar, Benoît Sagot, and Djamé Seddah. What does BERT learn about the structure of language?
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3651–3657, Florence, Italy, July 2019.
[24] Akbar Karimi, Leonardo Rossi, and Andrea Prati. Improving BERT performance for aspect-based sentiment analysis. arXiv preprint arXiv:2010.11731, 2020.
[25] Oanh Thi Tran and Viet The Bui. A BERT-based hierarchical model for Vietnamese aspect based sentiment analysis. In 2020 12th International Conference on Knowledge and Systems Engineering (KSE), pages 269–274, 2020.
[26] Chi Sun, Luyao Huang, and Xipeng Qiu. Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence. CoRR, abs/1903.09588, 2019.
[27] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach, 2019.

Appendix A
Our Research Paper

HSUM-HC: Integrating Bert-based hidden aggregation to hierarchical classifier for Vietnamese aspect-based sentiment analysis

Tri Cong-Toan Tran
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
tri.tran.1713657@hcmut.edu.vn

Thien Phu Nguyen
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
thien.nguyen.phu@hcmut.edu.vn

Thanh-Van Le*
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
ltvan@hcmut.edu.vn

* Corresponding Author

Abstract: Aspect-Based Sentiment Analysis (ABSA), which aims to identify sentiment polarity towards specific aspects in customers' comments or reviews, has been an attractive topic of research in social listening. In this paper, we construct a specialized model that integrates PhoBert's top-level hidden layers into a hierarchical classifier, taking advantage of these components to propose an effective classification method for the ABSA task. We evaluated our model's
performance on two public datasets in Vietnamese, and the results show that our implementation outperforms previous models on both datasets.

Index Terms: aspect based sentiment analysis, PhoBert, BERT, hidden layer aggregation, hierarchical classifier, Vietnamese corpus

I. INTRODUCTION

The fast growth of e-commerce, particularly the B2C (business-to-customer) model, has resulted in a rise in online purchasing. It makes day-to-day transactions extremely simple for the general public and has become one of the most popular ways of shopping, especially during a global pandemic like COVID-19. With the development of social media platforms, customers are encouraged to provide reviews and comments expressing their positive or negative sentiments about the products or services they have experienced. Analyzing a huge amount of data to mine public opinion is a time-consuming and labor-intensive operation. An automatic sentiment analysis system can therefore help consumers make use of other people's judgments about the products they are interested in. Moreover, such a system helps businesses manage their reputation, understand business requirements that match their customers' needs, and avoid marketing disasters. For this reason, sentiment analysis has become one of the most attractive study fields in machine learning among academic and business researchers in recent years.

There has been interesting previous research on sentiment analysis for Vietnamese text using the VLSP 2016 datasets¹. However, plain sentiment analysis no longer provides enough information, since it assumes the entire review has only one topic and one sentiment, whereas a product can have both pros and cons across many aspects. The challenge of aspect-based sentiment analysis (ABSA) is to detect not only the aspects in a review but also the sentiment attached to each aspect. A review can consist of dozens or hundreds of words about multiple aspects with different
sentiments to each, and determining which sentiment words go with which aspect can be very difficult. With ABSA, reviews about a product can be analyzed in detail, showing the reviewer's opinion on each aspect of that product.

The main problem of ABSA is as follows: given a customer review about a domain (e.g. hotel or restaurant), the goal is to identify the set of (Aspect, Polarity) pairs that fit the opinion expressed in the review. Each aspect is a pair of an entity and an attribute, and the polarity is one of negative, neutral, and positive. For each domain, all possible combinations of entities and attributes are predefined. The ABSA task is divided into two phases: (i) identify the entity and attribute pairs, and (ii) analyze the sentiment polarity toward each aspect (entity#attribute) identified in the previous phase. For example, the review "Nơi có quang cảnh tuyệt đẹp, đồ ăn ngon phục vụ tệ" (This place has an amazing view, the food is great too but the service is bad) will output (Entity#Attribute: Polarity) as follows: (Hotel#Design&Features: Positive), (Food&Drinks#Quality: Positive), (Service#General: Negative).

¹ https://vlsp.org.vn/vlsp2016/eval/sa

In this paper, we propose a method that uses several of Bert's top-level hidden layers for classification, combined with an intuitive hierarchical classifier, for the ABSA task. Our results demonstrate that a large model with many hidden layers contains useful information that can be used to get better results. We achieved the highest score when applying our method to two Vietnamese ABSA datasets, VLSP² and UIT ABSA [1].

II. RELATED WORK

In recent years, sentiment analysis has taken off and has been strongly developed by advanced research for social listening. Many corpora and tasks have been developed, such as SemEval 2015 (Task 12) [2] and 2016 (Task 5) [3], for various languages, including English, Chinese, etc. The first public Vietnamese benchmark datasets were released by the VLSP (Vietnamese
Language and Speech Processing) community in 2018. The organizers built two benchmark document-level corpora with 4,751 and 5,600 reviews for the restaurant and hotel domains, respectively.

Several interesting methods have been proposed to handle these tasks. The earliest works are heavily based on feature engineering (Wagner et al. [4]; Kiritchenko et al. [5]) and combine n-grams with sentiment lexicon features to solve various ABSA tasks in SemEval 2014. Nguyen and Shirai [6], Wang et al. [7], and Tang et al. [8] achieved higher accuracy by improving neural networks with a hierarchical structure that integrates dependency relations and phrases [6], with an attention module [7], or with a target-dependent mechanism [8]. Ma et al. [9] incorporated useful commonsense knowledge into a deep neural network to further enhance the model.

Recently, language models pre-trained on large text corpora, such as ELMo (Peters et al. [10]), OpenAI GPT (Radford et al. [11]), and especially BERT (Devlin et al. [12]), have proven effective at alleviating the effort of feature engineering. Chi Sun et al. [13] proposed four methods that convert the ABSA task into a sentence-pair classification task, such as question answering (QA) or natural language inference (NLI), by constructing auxiliary sentences, and fine-tuned a BERT model to solve the task. The sentence pair is created by concatenating the original sentence with an auxiliary sentence generated from the target-aspect pair.

Karimi et al. [14] proposed two modules, called Parallel Aggregation and Hierarchical Aggregation, that utilize the hidden layers of the BERT language model to produce deeper semantic representations of input sequences. They perform predictions on each one of the selected modules and compute the loss. These losses are then aggregated to produce the final loss of the model. They used Conditional Random Fields (CRFs) for the sequence labeling task, which yielded better
results. In addition, their experiments show that training BERT for a large number of epochs does not cause the model to overfit.

For a low-resource language such as Vietnamese, aspect-based sentiment analysis has received little study over the years, but progress has been steady. Oanh et al. [15] proposed a BERT-based hierarchical model that integrates the context information of the entity layer into the prediction of the aspect layer, optimizing the global loss function to capture the information from all layers. Their model consists of two main components. The BERT component encodes the context information of the review into a representation vector. This vector is then used as input to the hierarchical model, which generates multiple outputs (entity; aspect; polarity), one per layer. Thin et al. [16] investigated the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem. Their study showed the effectiveness of PhoBERT compared to several models, including XLM-R [17], mBERT [12], and another version of BERT for the Vietnamese language.

III. PROPOSED MODEL

In this section we introduce HSUM-HC, our ABSA approach, which combines the benefits of PhoBERT with hidden layer aggregation and hierarchical classifiers for Vietnamese text (Fig. 1). Having analyzed the characteristics of each component, we believe this combination gives a model that is well suited to the ABSA task. PhoBERT is a monolingual pre-trained model built specifically for the Vietnamese language. Input sequences are tokenized and fed into the PhoBERT model; we then take the top n hidden layers as the context input for the hierarchical aggregation layer. The output of the aggregation layer is in turn fed into a hierarchical classifier that predicts the set of aspects and their sentiment polarity.

1) Bert Model: There have been many
multilingual pre-trained BERT models that support Vietnamese, but as pointed out in [18], these models have two main problems: little pre-training data and no recognition of compound words. PhoBERT was built to address these problems; it is also the first monolingual BERT model pre-trained for Vietnamese. PhoBERT's pre-training approach is based on RoBERTa [19], which improves BERT's pre-training procedure for better performance. Pre-training used 20 GB of monolingual text (Vietnamese Wikipedia and a Vietnamese news corpus (footnote 3)) and employs a segmenter, VnCoreNLP (footnote 4), to segment compound words (e.g., khách_sạn, thức_uống). We use PhoBERT as our pre-trained model because we aim to process Vietnamese text for ABSA tasks. For fine-tuning, we follow the steps taken when pre-training the model: VnCoreNLP performs word segmentation, and PhoBERT's tokenizer splits sequences into tokens and maps the tokens to their indices, adding the [CLS] token at the start and the [SEP] token at the end of each sequence. The tokenizer also produces the attention masks and pads sequences to equal length. The token lists and attention masks are then fed into the BERT model.

Footnote 2: https://vlsp.org.vn/vlsp2018/eval/sa
Footnote 3: https://github.com/binhvq/news-corpus
Footnote 4: https://github.com/vncorenlp/VnCoreNLP

Figure 1: Our HSUM-HC model for the ABSA task

2) Hidden layer aggregation with hierarchical classifiers: A BERT-based model with a hierarchical classifier was created by Oanh et al. [15] to deal with ABSA; its architecture follows how a human would annotate the same task manually. It classifies in three layers: Entity, Aspect, and Sentiment. The process is to first label the entity (e.g., Hotel, Room), then identify the entity's attribute (e.g., Design, Comfort) to form an aspect, and lastly analyze the sentiment for that aspect in the review. Every layer contributes its output as context to the next layer. With this architecture, ABSA can be solved with an end-to-end model, without the need for multiple classifiers.

In the original BERT with hierarchical classifier implementation from Oanh et al. [15], we observe that some improvements can be made to achieve better performance on this task. Firstly, they used a multilingual BERT model and trained it further for Vietnamese. This yields a pre-trained model accustomed to Vietnamese, but it is still not specialized: without a segmenter, compound words are not handled properly. We experimented with the model architecture in their paper and saw that we could improve the result by around 3% by using PhoBERT as the pre-trained model and VnCoreNLP for word segmentation. Secondly, their implementation uses only the last hidden layer to make the prediction; the top layer is treated as the most important, and the information in the earlier hidden layers is not utilized. Jawahar et al. [20] showed that all hidden layers of BERT can contain information and that the higher-level layers hold valuable semantic information. Thus, we can enhance the BERT-based model by using these layers. For that reason, we implemented the hierarchical hidden-layer aggregation architecture of [14], which adds a BERT layer on top of the hidden layers. The output is then aggregated with the previous hidden layer and passed through the hierarchical classifier, and the total loss is the sum of every classifier's losses. The Binary Cross-Entropy loss function for each layer L_i of the classifier is calculated as follows:

$$L_i = -\sum_{c=1}^{C} \left[ y_c \cdot \log(\sigma(\hat{y}_c)) + (1 - y_c) \cdot \log(1 - \sigma(\hat{y}_c)) \right] \tag{1}$$

with C being the number of classes for that layer, y_c the gold label for class c, \hat{y}_c the corresponding output logit, and σ the sigmoid function. The loss for each classifier is the sum of the three prediction layers' losses calculated above:

$$\text{classifier\_loss} = L_1 + L_2 + L_3 \tag{2}$$

The total loss is the sum of all classifiers' losses, with H being the number of classifiers:

$$\text{total\_loss} = \sum_{h=1}^{H} \text{classifier\_loss}_h \tag{3}$$

With this implementation, we obtain an enhanced model with the goal of achieving the best performance possible
for the aspect-based sentiment analysis task: a monolingual pre-trained model for Vietnamese, a mechanism to exploit this pre-trained model to its full potential, and a hierarchical classifier. Our promising results are presented in detail in the experiment section.

IV. EXPERIMENTS

A. Datasets

We evaluated our model on the VLSP 2018 ABSA dataset, which was the first public Vietnamese dataset for the ABSA task. This dataset was collected from user comments on Agoda (footnote 5) and consists of document-level reviews. Review length varies widely: some reviews are short sentences, while others contain hundreds of words, with the longest containing around 1000 words. We also evaluated our model on the UIT ABSA dataset, which consists of sentence-level reviews with relatively short sentences and only 1.65 aspects per review on average. Its data was collected from mytour (footnote 6). For both datasets, multiple annotators were employed and the raw data were manually annotated under strict guidelines. The datasets cover the hotel and restaurant domains and are divided into training, development, and test sets with similar label ratios. There are 34 aspects for the hotel domain, and each review can have a varying number of aspects. Details about the datasets are given in Table I and Table II. From the standard deviation for each dataset, it is apparent that the aspect distribution is very uneven, with the most frequent aspect appearing around 2000 times and the rarest aspects appearing only a few times.

Footnote 5: https://www.agoda.com/vi-vn
Footnote 6: https://mytour.vn

Table I: Dataset Details for VLSP 2018 ABSA

Type    #Reviews  #Aspect  Avg Aspect  σ       Avg Length
train   3000      13948    4.64        439.21  47
dev     2000      7111     3.55        252.34  23
test    600       2584     4.31        84.55   30

Table II: Dataset Details for UIT ABSA

Type    #Reviews  #Aspect  Avg Aspect  σ       Avg Length
train   7180      11812    1.65        469.00  18.25
dev     795       1318     1.66        52.18   18.54
test    2030      3283     1.62        130.26  18.27

B. Evaluation Metrics

To evaluate the performance of ABSA models, we use the micro-average method. The evaluation is done in two phases: Phase A evaluates the model's capability in detecting the aspects of a review, and Phase B evaluates the detection of (aspect, polarity) pairs. The Precision, Recall, and F1 scores are computed with the following formulas, where TP_{c_i}, FP_{c_i}, and FN_{c_i} are the numbers of true positives, false positives, and false negatives for class c_i in the class set C:

$$\text{Precision} = \frac{\sum_{c_i \in C} TP_{c_i}}{\sum_{c_i \in C} \left( TP_{c_i} + FP_{c_i} \right)}$$

$$\text{Recall} = \frac{\sum_{c_i \in C} TP_{c_i}}{\sum_{c_i \in C} \left( TP_{c_i} + FN_{c_i} \right)}$$

$$F_1 = 2 \cdot \frac{\text{Pre} \cdot \text{Rec}}{\text{Pre} + \text{Rec}}$$

C. Experimental Setup

As mentioned above, we use VnCoreNLP's segmenter to segment each review before using PhoBERT. We then use PhoBERT's tokenizer to obtain token ids and attention masks, and perform padding. We use Hugging Face's AdamW optimizer (footnote 7) together with the constant scheduler with warmup (footnote 8); the base learning rate is 2e-5 for the document-level dataset and 5e-6 for the sentence-level dataset. We set the warmup ratio to 0.25 and the batch size to 10, and train each model for 100 epochs. The BERT model we use is PhoBERT-large, which has 24 Transformer blocks (25 hidden-state outputs when the embedding output is counted) and a hidden size of 1024. We test two settings: 4-layer aggregation (HSUM-HC_4) and 8-layer aggregation (HSUM-HC_8).

Footnotes 7, 8: https://huggingface.co/transformers/main_classes/optimizer_schedules.html

D. Experimental Results and Discussion

We compared our model's performance with previous work on the same datasets. For the UIT ABSA dataset, all results besides ours are the baseline results in [1], taken from the multi-task approach (except for SVM).

1) Experimental Results: Results for the two datasets are shown in Tables III and IV. Overall, our implementation outperforms previous methods on the same task. For the VLSP 2018 dataset, our model achieved an F1 score of 85.20% for Phase A and 80.08% for Phase B, a significant improvement over previous deep learning models. Notably, compared to [15], our model performs considerably better when applying a hierarchical classifier with a language-specific pre-trained model and hidden layers, improving by 3.14 percentage points in Phase A and 5.39 in Phase B. The F1 score of our model is 6.04 percentage points higher than that of [16], which used PhoBERT-base with a linear layer for aspect detection. For the UIT ABSA dataset, our model obtained 80.78% and 75.25% in Phase A and Phase B, respectively, improving by at least 1.68 percentage points in Phase A and 1.56 in Phase B over the baseline models in [1].

The results also show that aggregating the top 8 hidden layers performs better than aggregating only 4. We attribute this to using a large model with more hidden layers, so that more layers can carry useful semantic information.

Table III: Results on the test set of VLSP 2018 Dataset, Hotel domain ("-" marks a phase with no reported result)

                         Phase A (Aspect Detection)    Phase B (Aspect Polarity Detection)
Models                   Precision  Recall  F1         Precision  Recall  F1
Linear SVM               83         58      68         71         49      58
Multilayer Perceptron    85         42      56         80         39      53
CNN                      82.35      59.75   69.25      -          -       -
BiLSTM + CNN             84.03      72.52   77.85      76.53      66.04   70.90
PhoBert-based            81.49      76.96   79.16      -          -       -
viBERT                   83.93      80.26   82.06      80.04      70.01   74.69
Our method HSUM-HC_8     86.79      83.66   85.20      84.52      76.08   80.08
Our method HSUM-HC_4     85.59      83.39   84.67      83.50      74.65   78.83

Table IV: Results on the test set of UIT ABSA Dataset, Hotel domain

                         Phase A (Aspect Detection)    Phase B (Aspect Polarity Detection)
Models                   Precision  Recall  F1         Precision  Recall  F1
Multiple SVM             76.68      74.70   75.68      69.06      67.28   68.16
CNN                      78.61      74.35   76.42      71.48      67.61   69.49
LSTM + Attention         83.47      69.07   75.59      76.22      63.07   69.03
BiLSTM + Attention       82.02      72.08   76.73      74.68      65.63   69.86
CNN-LSTM                 10.74      42.35   17.14      7.72       30.43   12.32
CNN-LSTM + Attention     76.92      70.76   73.71      69.02      63.50   66.14
BiLSTM-CNN               77.11      78.22   77.66      70.23      71.23   70.72
PhoBert-base             83.46      75.18   79.10      77.75      70.03   73.69
Our method HSUM-HC_8     80.26      81.31   80.78      76.87      73.71   75.25
Our method HSUM-HC_4     79.75      80.96   80.34      76.89      72.97   74.88

On the sentence-level UIT ABSA dataset, our implementation has lower precision but much higher recall than previous models, which leads to a higher F1 score than the deep learning baselines, so it outperforms them overall. This is even more apparent on the document-level dataset, whose longer reviews require the model to capture long-range dependencies and contain more aspects per review on average. The document-level task can therefore be considered more challenging than the sentence-level one. Nevertheless, our model scores significantly higher at the document level than at the sentence level. This suggests that, rather than being challenged by long sequences and forgetting information, our model learns the extra information in these sequences and uses it to achieve a better result; it shows its potential when given a more demanding task with more information to learn. Overall, the results show that our implementation is effective for ABSA, and that all three components (PhoBERT, HSUM, and the hierarchical classifier) are essential for improving the model's performance.

2) Loss and performance curve: In our experiments, we trained our model for a large number of epochs on relatively little data. The loss curves are shown in Fig. 2. At first glance, the model appears to start overfitting very early, since the validation loss keeps increasing. However, we observe that this is not the case. Even though the validation loss increases, performance still slowly improves, as can be seen in Fig. 3. This behavior was also observed in [14] and [21], and indicates that the model still learns, but at a slow and steady pace. At some point performance plateaus and the learning process stops. A possible explanation is that BERT was pre-trained on an enormous amount of data and therefore does not easily overfit.

Figure 2: The loss curves on the validation and test sets for VLSP 2018 (left, panel a) and UIT ABSA dataset (right, panel b)

Figure 3: The F1 curves on the validation and test sets for VLSP 2018 (left, panel a) and UIT ABSA dataset (right, panel b)

E. Conclusion

We implemented an effective method that uses the hidden layers of BERT with a hierarchical classifier for the Vietnamese ABSA task. We experimented on two datasets at different review levels and significantly outperformed previous methods, achieving state-of-the-art results on both datasets. We find that, since BERT-large exposes 25 hidden-layer outputs, using 8 layers for aggregation gives better performance than the original 4-layer setting. For future work, we plan to apply our model to different domains and languages and to test it on online customer reviews to assess its potential applications.

ACKNOWLEDGMENT

We would like to thank the VLSP 2018 organizers and the UIT NLP Group for providing us with the ABSA datasets.

REFERENCES

[1] D. Van Thin, N. L.-T. Nguyen, T. M. Truong, L. S. Le, and D. T. Vo, “Two new large corpora for Vietnamese aspect-based sentiment analysis at sentence level,” ACM Trans. Asian Low-Resour. Lang. Inf. Process., vol. 20, no. 4, May 2021. [Online]. Available: https://doi.org/10.1145/3446678
[2] B. Phạm and S. McLeod, “Consonants, vowels and tones across Vietnamese dialects,” International Journal of Speech-Language Pathology, vol. 18, no. 2, pp. 122–134, 2016, PMID: 27172848. [Online]. Available: https://doi.org/10.3109/17549507.2015.1101162
[3] M. Pontiki, D. Galanis, H. Papageorgiou, I. Androutsopoulos, S. Manandhar, M. AL-Smadi, M. Al-Ayyoub, Y. Zhao, B. Qin, O. De Clercq, V. Hoste, M. Apidianaki, X. Tannier, N. Loukachevitch, E. Kotelnikov, N. Bel, S. M. Jiménez-Zafra, and G. Eryiğit, “SemEval-2016 task 5: Aspect based sentiment analysis,” in Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). San Diego, California: Association for Computational Linguistics, Jun. 2016, pp. 19–30.
[4] J. Wagner, P. Arora, S. Cortes, U. Barman, D. Bogdanova, J. Foster, and L. Tounsi, “DCU: Aspect-based polarity classification for SemEval task 4,” in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Dublin, Ireland: Association for
Computational Linguistics, Aug. 2014, pp. 223–229.
[5] S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad, “NRC-Canada-2014: Detecting aspects and sentiment in customer reviews,” in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Dublin, Ireland: Association for Computational Linguistics, Aug. 2014, pp. 437–442.
[6] T. H. Nguyen and K. Shirai, “PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: Association for Computational Linguistics, Sep. 2015, pp. 2509–2514.
[7] Y. Wang, M. Huang, X. Zhu, and L. Zhao, “Attention-based LSTM for aspect-level sentiment classification,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas: Association for Computational Linguistics, Nov. 2016, pp. 606–615.
[8] D. Tang, B. Qin, X. Feng, and T. Liu, “Effective LSTMs for target-dependent sentiment classification,” 2016.
[9] Y. Ma, H. Peng, and E. Cambria, “Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Apr. 2018. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/12048
[10] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). New Orleans, Louisiana: Association for Computational Linguistics, Jun. 2018, pp. 2227–2237.
[11] A. Radford and K. Narasimhan, “Improving language understanding by generative pre-training,” 2018.
[12] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” 2019.
[13] C. Sun, L. Huang, and X. Qiu, “Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019, pp. 380–385.
[14] A. Karimi, L. Rossi, and A. Prati, “Improving BERT performance for aspect-based sentiment analysis,” arXiv preprint arXiv:2010.11731, 2020.
[15] O. T. Tran and V. T. Bui, “A BERT-based hierarchical model for Vietnamese aspect based sentiment analysis,” in 2020 12th International Conference on Knowledge and Systems Engineering (KSE), Can Tho, Viet Nam, 2020, pp. 269–274.
[16] D. V. Thin, L. S. Le, V. X. Hoang, and N. L.-T. Nguyen, “Investigating monolingual and multilingual BERT models for Vietnamese aspect category detection,” 2021.
[17] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, “Unsupervised cross-lingual representation learning at scale,” 2020.
[18] D. Q. Nguyen and A. T. Nguyen, “PhoBERT: Pre-trained language models for Vietnamese,” CoRR, vol. abs/2003.00744, 2020. [Online]. Available: https://arxiv.org/abs/2003.00744
[19] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach,” CoRR, vol. abs/1907.11692, 2019. [Online]. Available: http://arxiv.org/abs/1907.11692
[20] G. Jawahar, B. Sagot, and D. Seddah, “What does BERT learn about the structure of language?” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Jul. 2019, pp. 3651–3657.
[21] X. Li, L. Bing, W. Zhang, and W. Lam, “Exploiting BERT for end-to-end aspect-based sentiment analysis,” in Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), Hong Kong, China, Nov. 2019, pp. 34–41.
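As a supplementary illustration of the micro-averaged Precision, Recall, and F1 defined in the Evaluation Metrics subsection, the following is a minimal sketch in Python and not the authors' evaluation script. It assumes each review's gold and predicted labels are stored as sets: aspect strings for Phase A and (aspect, polarity) tuples for Phase B. The function name `micro_prf`, the toy reviews, and the label `ROOMS#CLEANLINESS` are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence, Set, Tuple


def micro_prf(gold: Sequence[Set[Hashable]],
              pred: Sequence[Set[Hashable]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over a corpus of reviews.

    TP, FP and FN are pooled over all reviews (and therefore over all
    classes) before the ratios are taken, which is the micro-average.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of reviews")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels that are both predicted and correct
        fp += len(p - g)   # labels predicted but absent from the gold set
        fn += len(g - p)   # gold labels the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Phase A (aspect detection): labels are Entity#Attribute strings.
# Toy data, not taken from either dataset.
gold_a = [
    {"HOTEL#DESIGN&FEATURES", "FOOD&DRINKS#QUALITY", "SERVICE#GENERAL"},
    {"ROOMS#COMFORT"},
]
pred_a = [
    {"HOTEL#DESIGN&FEATURES", "SERVICE#GENERAL"},
    {"ROOMS#COMFORT", "ROOMS#CLEANLINESS"},  # hypothetical extra label
]

# Phase B (aspect polarity detection): labels are (aspect, polarity) pairs,
# so a correct aspect with the wrong polarity counts as FP and FN.
gold_b = [{
    ("HOTEL#DESIGN&FEATURES", "positive"),
    ("FOOD&DRINKS#QUALITY", "positive"),
    ("SERVICE#GENERAL", "negative"),
}]
pred_b = [{
    ("HOTEL#DESIGN&FEATURES", "positive"),
    ("SERVICE#GENERAL", "positive"),
}]

if __name__ == "__main__":
    print("Phase A:", micro_prf(gold_a, pred_a))
    print("Phase B:", micro_prf(gold_b, pred_b))
```

Representing Phase B labels as pairs means the same function serves both phases; only the label type changes.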
ROOMS#COMFORT • Step 3: Identifying the sentiment for the aspect. Looking at the review, we see that the reviewer said that their room is very comfortable, so the sentiment for ROOMS#COMFORT is...
“Integrating Bert-based hidden aggregation to hierarchical classifier for Vietnamese aspect-based sentiment analysis” was accepted for presentation at the IEEE conference, 2021 8th NAFOSTED Conference on Information and...
Acknowledgements: It gives us great pleasure and satisfaction in presenting our thesis on “Aspect based sentiment analysis for text documents”. This would be our last project as bachelor students in university,

Date posted: 02/06/2022, 20:17
