(Master's thesis) A Feature-Based Opinion Mining Model on Product Reviews in Vietnamese


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

VU TIEN THANH

A FEATURE-BASED OPINION MINING MODEL ON PRODUCT REVIEWS IN VIETNAMESE

Major: Computer Science
Code: 60 48 01

MASTER THESIS OF INFORMATION TECHNOLOGY

Supervisor: Assoc. Prof. Ha Quang Thuy

Hanoi – 2012

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and, to the best of my knowledge, it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET/Coltech) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET/Coltech or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.’

Hanoi, November 25th, 2012
Signed

ABSTRACT

Feature-based opinion mining and summarizing (FOMS) of reviews is an interesting and attractive problem in the opinion mining field. With the development of e-commerce in Vietnam, there are more and more commercial sites and technical forums where people can review or express their opinions on the products they have used. As a result, the number of reviews has been increasing rapidly, to hundreds or even thousands for a
popular product in recent years. It is difficult not only for customers to read all of them in order to decide whether to buy a product, but also for producers to process customer opinions in order to improve their products. In this thesis, we describe a feature-based opinion mining and summarizing model for Vietnamese product reviews. Our model performs the following four steps: (1) pre-processing the input customer reviews by standardizing the reviews, segmenting tokens, and POS tagging; (2) extracting explicit product features and opinion words by using Vietnamese syntax rules, identifying implicit product features through their relationships with opinion words, and automatically grouping synonymous product features by combining the HAC clustering method with the semi-supervised SVM-kNN classification method; (3) identifying the opinion sentences in each review and deciding whether each opinion sentence is positive, negative or neutral by using a VietSentiWordNet extended from an initial SentiWordNet 3.0; (4) summarizing the results, which differs from traditional text summarization because we focus only on the product features that customers reviewed and on whether the opinions are positive, negative or neutral. Experimental results on Vietnamese reviews in the mobile phone domain demonstrate the effectiveness of the model.

Publications:

• Huyen-Trang Pham, Tien-Thanh Vu, Mai-Vu Tran and Quang-Thuy Ha. A Solution for Grouping Vietnamese Synonym Feature Words in Product Reviews. In Proceedings of the 6th International Conference on Asia-Pacific Services Computing (APSCC 2011).
• Quang-Thuy Ha, Tien-Thanh Vu, Huyen-Trang Pham and Cong-To Luu. An Upgrading Feature-based Opinion Mining Model on Vietnamese Product Reviews. In Proceedings of the 7th International Conference on Active Media Technology (AMT 2011), pp. 173-185.
• Tien-Thanh Vu, Huyen-Trang Pham, Cong-To Luu and Quang-Thuy Ha. A Feature-Based Opinion Mining Model on Product Reviews in Vietnamese. In Semantic Methods for Knowledge Management and Communication
(SCI 381), pp. 23-33.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Ha Quang Thuy, for his patient guidance and continuous support throughout the years. He is always there when I need help, and responds to queries helpfully and promptly. I would like to express my sincere appreciation to my colleagues at the Knowledge and Technology laboratory for their great support. I would also like to thank my friend, Nguyen Quoc Dat, for his kind help. I sincerely acknowledge the Vietnam National University, Hanoi, NAFOSTED Vietnam and, especially, the QG.10.38 and KC.01.TN04/11-15 projects for financially supporting my master's study. Finally, this thesis would not have been possible without the support and love of my parents and my wife. Thank you!

To my family ♥

Table of Contents

1 Introduction
2 Literature review
    2.1 Opinion Mining
        2.1.1 The demand of opinion mining
        2.1.2 The basic concepts in the opinion mining field
        2.1.3 Opinion mining problems
    2.2 Feature-based Opinion Mining
        2.2.1 Problem Definition
        2.2.2 Features Extraction
        2.2.3 Opinion Orientation Identification
        2.2.4 Feature-based Opinion Mining System on Vietnamese Product Reviews
3 Our Feature-based Opinion Mining Model
    3.1 Introduction
    3.2 Phase 1: Pre-processing
        3.2.1 Data Standardizing
        3.2.2 Token Segmenting and POS Tagging
    3.3 Phase 2: Product Features and Opinion Words Extraction
        3.3.1 Explicit Product Features Extraction
        3.3.2 Opinion word Extraction
        3.3.3 Implicit Features identification
        3.3.4 Grouping Synonym Features
        3.3.5 Frequent Features Identification
    3.4 Phase 3: Determining the opinion orientation
    3.5 Phase 4: Summarization
4 Evaluation
    4.1 Environment and Experimental Data
        4.1.1 Environment
        4.1.2 Experimental Data
    4.2 Product Features Extraction Evaluation
    4.3 Opinion Words Extraction Evaluation
    4.4 The Whole System Evaluation
5 Conclusion

List of Figures

1.1 An example summarization of Samsung Galaxy Tab
2.1 OM documents on Google Scholar (in title)
2.2 OM documents on Google Scholar (anywhere)
2.3 The tree of Nokia N72 object
2.4 A customer review
3.1 Model for Feature-based Opinion Mining and Summarizing in Vietnamese Product Reviews
3.2 A summarization output
4.1 (Precision values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis
4.2 (Recall values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis
4.3 (F1 values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis
4.4 A summarization of Nokia C5-03
4.5 A summarization of LG Wink Touch T300

3.4 Phase 3: Determining the opinion orientation

... 0.4 and their synonym synsets. After that, all synsets that are antonyms of a synset in $Tr_n^0$ are added to $Tr_p^0$, and vice versa. Then $Tr_o^0$ is the set of remaining synsets in the initial VietSentiWordNet. A VietnameseAV dictionary is then constructed by adding all adjectives and modal verbs from a general Vietnamese dictionary.

– Secondly, two SVM classifiers are trained with a boosting method on two training sets, $Tr_p^0 \cup Tr_o^0$ (for the positive classifier) and $Tr_p^0 \cup Tr_o^0$ (for the negative classifier). Then all adjectives in the VietnameseAV dictionary are labelled by these two classifiers. After normalization of all opinion words, the extended VietSentiWordNet has 9333 synsets and 9533 words.

Let $ts$ denote the opinion weight of a feature in a customer review, let $ts_i$ denote the weight of the $i$-th opinion word on the feature in the review (denoted $word_i$), and let $w_i$ be the opinion weight of $word_i$, obtained from the VietSentiWordNet dictionary by
subtracting the negative score of $word_i$ from its positive score. Then $ts$ is determined as:

$$ts = \sum_{i=1}^{m} ts_i$$

where $m$ is the number of opinion words on the feature in the review. If a negation word such as “không” (not) is present, the value of $ts_i$ is reversed (that is, $ts_i = -1 \times ts_i$). Otherwise, $ts_i$ equals $w_i$ if there is no gradable word such as “rất” (very), and $ts_i$ is determined as $h \times w_i$ if there is a gradable word with weight $h$.

• In the second step, the opinion orientation for the feature is classified into one of three classes, positive, negative or neutral, based on the weight $ts$:
– if $ts > +0.2$, the opinion is positive;
– if $-0.2 \le ts \le +0.2$, the opinion is neutral;
– if $ts < -0.2$, the opinion is negative.

For example, consider the customer review “Con có đầy đủ tính năng. Nó dễ dùng” (“This mobile has a full set of functions. It is also quite easy to use”). After phase 2, the feature word “tính năng” (function) and the opinion words “đầy đủ” (full) and “dễ” (easy) have been extracted from the review. The feature word “tính năng” (function) has then been grouped into the feature “ứng dụng” (application). In phase 3, the opinion weight of the “ứng dụng” (application) feature in the customer review is determined as follows. The opinion weights of “đầy đủ” (full) and “dễ” (easy) are 0.625 and 0.625, respectively. Therefore, the opinion weight of the customer review on the “ứng dụng” (application) feature is 1.25, the sum of 0.625 and 0.625. Since this weight is greater than the threshold of 0.2, the opinion orientation of the customer on “ứng dụng” (application) is positive.

3.5 Phase 4: Summarization

The summary is produced by enumerating all customer opinion orientations on all product features. The result is shown as a table or diagram, as in Figure 3.2.

Figure 3.2: A summarization output

Chapter 4
Evaluation

Based on the model proposed in Chapter 3, this thesis implements
experiments on building a Vietnamese FOMS system for “mobile phone” product reviews. In this chapter, we describe the results of three experiments: product features extraction, opinion words extraction and the whole-system evaluation. After these experiments, we implement the summarization task and show the summarizing results in column charts.

4.1 Environment and Experimental Data

4.1.1 Environment

• Chip: Intel(R) Core i5 @ 2.53 GHz
• RAM: 3.00 GB
• OS: Microsoft Windows
• Programming tool: Java Eclipse SDK

4.1.2 Experimental Data

We crawled 743 customer reviews of ten popular “mobile phone” products from the website http://www.thegioididong.com. Table 4.1 shows the number of crawled and standardized reviews for each product.

Table 4.1: Total of crawled reviews

Product name              Number of comments
LG GS290 Cookie Fresh     77
LG Optimus One P500       45
LG Wink Touch T300        102
Nokia C5-03               102
Nokia E63                 61
Nokia E72                 68
Nokia N8                  88
Nokia X2-01               79
Samsung Galaxy Tab        42
Samsung Star S5233W       79

4.2 Product Features Extraction Evaluation

Table 4.2: Results of frequent product features extraction (MF: number of manually listed product features; SF: number of product features found by the system; CSF: number of correct product features found by the system)

Product name              MF   SF/CSF   Precision (%)   Recall (%)   F1 (%)
LG GS290 Cookie Fresh     18   19/18    94.74           100          97.37
LG Optimus One P500       17   18/16    88.89           94.12        91.50
LG Wink Touch T300        11   11/11    100             100          100
Nokia C5-03               22   23/20    86.96           90.91        88.93
Nokia E63                 23   23/21    91.30           91.30        91.30
Nokia E72                 26   28/23    82.14           88.46        85.30
Nokia N8                  22   24/21    87.50           95.45        91.48
Nokia X2-01               15   19/14    73.68           93.33        83.51
Samsung Star S5233W       15   20/14    85.00           93.33        90.42
Samsung Galaxy Tab        15   16/14    87.50           93.33        88.92
Average                                 87.06           93.58        90.32

Subsequently, we evaluate the results of the feature extraction phase, which uses Vietnamese syntax rules. Table 4.2 illustrates the effectiveness of
the feature extraction. For each product, we read all of its reviews and list all product features mentioned in them. Then we count the correct features returned by the system. Precision, recall and F1 are given in columns 4, 5 and 6, respectively, of Table 4.2. The results of the frequent features extraction step are good, with an average F1 above 90%. Furthermore, to illustrate the effectiveness of our feature extraction step, we compare it with the features generated by the baseline method of (Hu and Liu, 2004), which we adapted to Vietnamese reviews.

Discussion: The F1 score of the baseline method is just under 67%, its average recall is 60.68%, and its precision is 70.13%, all significantly lower than those of our method. We find three major reasons for its poor results. Firstly, Vietnamese syntax rules differ in many ways from those of English. For example, in Vietnamese nouns come before adjectives, whereas in English the order is the opposite. Secondly, the baseline does not group synonymous features, so its results are not very high. Finally, the baseline does not handle implicit features, which makes its recall quite low. Comparing with the average results in Table 4.2, we can clearly see that the proposed method is much more effective for our task than (Hu and Liu, 2004).

4.3 Opinion Words Extraction Evaluation

After the product features extraction step, we evaluate the results of the opinion words extraction phase, which uses Vietnamese syntax rules. Table 4.3 illustrates the effectiveness of the opinion words extraction step.

Table 4.3: Results of opinion words extraction (MO: number of manually listed opinion words; SO: number of opinion words found by the system; CSO: number of correct opinion words found by the system)

Product name              MO    SO/CSO    Precision (%)   Recall (%)   F1 (%)
LG GS290 Cookie Fresh     117   114/94    82.47           80.34        81.39
LG Optimus One P500       83    62/60     96.77           72.29        82.76
LG Wink Touch T300        119   102/94    92.16           78.99        85.07
Nokia C5-03               125   102/97    95.10           77.60        85.46
Nokia E63                 148   134/120   89.55           81.08        85.11
Nokia E72                 148   154/110   71.43           74.32        72.85
Nokia N8                  124   128/97    75.78           78.23        76.98
Nokia X2-01               120   128/102   79.69           85.00        82.26
Samsung Star S5233W       157   162/131   80.86           83.44        82.13
Samsung Galaxy Tab        60    61/54     88.52           90.00        89.26
Average                                   85.23           80.13        82.33

For each product, we read all of its reviews and list all opinion words in them. Then we count the correct opinion words returned by the system. Precision, recall and F1 are given in columns 4, 5 and 6, respectively, of Table 4.3. The results of the opinion words extraction step are good, with an average F1 above 82%.

4.4 The Whole System Evaluation

For each feature extracted in the previous experiment, the system first extracts the opinion words from the reviews that mention this feature among the 743 crawled reviews. Secondly, the system calculates the opinion weight of the opinion words. Thirdly, the orientation of the opinions is identified as positive, negative or neutral by using the threshold of 0.2 (as mentioned in Chapter 3). Finally, we obtain the positive, negative and neutral comments for all features of each product, and then we evaluate the performance of the whole system by precision, recall and F1 for each product. According to Table 4.4, the precision and recall of our system are quite satisfactory, both at approximately 69%.

Discussion: To demonstrate the effectiveness of our approach, we also implemented the same system using the method presented in (Vu et al., 2011). The comparative results shown in Figures 4.1, 4.2 and 4.3 indicate that our method achieves better results than (Vu et al., 2011) on all three measures, Precision, Recall and F1 (69.51% vs. 64.44% in average Precision, 69.27% vs. 61.09% in average Recall, and 69.21% vs.
62.76% in average F1). Our method in this thesis is better than that in (Vu et al., 2011) because the VietSentiWordNet dictionary used in this thesis, with 9333 synsets, is much richer than the one with 977 synsets used in (Vu et al., 2011).

The summarization task starts from the product features extracted in the first evaluation. Firstly, we group the product features by the method presented in Section 3.3.4 to obtain groups/categories such as “Giá” (price) (“giá cả” (price), “chi phí” (expense), “số tiền” (amount of money), etc.), “Tính năng” (function) (“đài” (radio), “từ điển” (dictionary), “trò chơi” (game), etc.), etc. Secondly, the system generates a column chart summarizing the extracted information. Figures 4.4 and 4.5 show summarizations of the customer reviews on each feature of two products, Nokia C5-03 and LG Wink Touch T300.

Table 4.4: Precision, Recall and F1 of the Feature-based Opinion Mining Model on Vietnamese mobile phone reviews

Product name              Precision (%)   Recall (%)   F1 (%)
LG GS290 Cookie Fresh     77.12           77.78        77.45
LG Optimus One P500       67.19           55.81        60.97
LG Wink Touch T300        70.59           62.07        66.06
Nokia C5-03               65              57           60.74
Nokia E63                 71.01           66.22        68.53
Nokia E72                 70.25           75           72.55
Nokia N8                  71.32           78.23        74.62
Nokia X2-01               68.18           75.00        71.43
Samsung Star S5233W       64.18           71.67        67.72
Samsung Galaxy Tab        70.30           73.89        72.05
Average                   69.51           69.27        69.21

Figure 4.1: (Precision values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis

Figure 4.2: (Recall values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis

Figure 4.3: (F1 values (%)) A comparison between our method in (Vu et al., 2011) and in this thesis

Figure 4.4: A summarization of Nokia C5-03

Figure 4.5: A summarization of LG Wink Touch T300
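
The precision, recall and F1 columns reported in Chapter 4 are derived from three counts per product, for example MO, SO and CSO in Table 4.3. The sketch below shows one way to compute them, assuming the usual definitions (precision = correct / found, recall = correct / manual, F1 = harmonic mean of the two). The function name and its parameters are illustrative and do not come from the thesis implementation, which was written in Java; Python is used here only for brevity.

```python
def precision_recall_f1(found: int, correct: int, manual: int):
    """Return (precision, recall, F1) as percentages.

    found   -- items returned by the system (e.g. SO in Table 4.3)
    correct -- returned items judged correct (e.g. CSO)
    manual  -- items in the manually built reference list (e.g. MO)
    """
    # Guard against empty denominators so the function never divides by zero.
    precision = correct / found if found else 0.0
    recall = correct / manual if manual else 0.0
    total = precision + recall
    # F1 is assumed here to be the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return 100 * precision, 100 * recall, 100 * f1


# LG Optimus One P500 row of Table 4.3: MO = 83, SO/CSO = 62/60.
p, r, f1 = precision_recall_f1(found=62, correct=60, manual=83)
print(f"{p:.2f} {r:.2f} {f1:.2f}")  # 96.77 72.29 82.76
```

For this row the sketch reproduces the values printed in Table 4.3 (96.77, 72.29 and 82.76).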
Conclusion

In this thesis, we presented, in Chapter 3, an approach to building an opinion mining system for customer reviews according to product features, based on Vietnamese syntax rules and the VietSentiWordNet dictionary, in four phases: (1) pre-processing; (2) extracting explicit/implicit product features and opinion words, and grouping synonymous product features; (3) identifying the orientation of opinions; and (4) summarizing the results. The three main contributions are as follows:

• Firstly, in phase 1, we built a Vietnamese accent restoration system that combines an N-gram statistical model and a Hidden Markov Model (HMM) to convert a sentence without accents into an accented Vietnamese sentence.
• Secondly, in phase 2, we constructed a mapping dictionary to identify implicit features by mapping them to the corresponding opinion words; we proposed a method that uses SVM-kNN semi-supervised learning, with the HAC clustering method generating the training set for SVM-kNN, to group synonymous features; after that, co-reference was resolved by using some Vietnamese rules.
• Finally, in phase 3, we extended the initial VietSentiWordNet dictionary (a Vietnamese sentiment resource), which had only 977 sentiment synsets and 1179 sentiment words, to a new VietSentiWordNet with 9333 synsets and 9533 words.

Our proposed model addresses limitations that current FOMS systems have not yet resolved.

In Chapter 4, we applied the opinion mining model to implement a FOMS system for “mobile phone” reviews in Vietnamese and achieved good results, with F1 measures above 90% on product features extraction, 82% on opinion words extraction, and 69% on the whole-system evaluation. These results support the soundness of our approach.

In the future, we will improve the model to identify implicit features automatically (without using the mapping dictionary) and to resolve the problem of mining comparative opinions. For example, “Pin nokia C5-03 tốt hơn
pin LG Wink Touch T300” (“The battery of the Nokia C5-03 is better than that of the LG Wink Touch T300”): the current system produces a wrong opinion orientation, in which both the C5-03 and the LG Wink Touch T300 are positive, because the positive opinion word “tốt” (good) is attached to both products. After that, we will implement the Vietnamese opinion mining system in other domains such as computers, household items, jewelry, etc.

Bibliography

Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, editors, Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), Valletta, Malta, May 2010. European Language Resources Association (ELRA). ISBN 2-9517408-6-7.

Giuseppe Carenini, Raymond T. Ng, and Ed Zwart. Extracting knowledge from evaluative text. In K-CAP, pages 11–18, 2005.

Amitava Das and Sivaji Bandyopadhyay. SentiWordNet for Indian languages. In Proceedings of the 8th Workshop on Asian Language Resources, pages 56–63, 2010.

Andrea Esuli. Automatic generation of lexical resources for opinion mining: models, algorithms and applications. SIGIR Forum, 42:105–106, November 2008. ISSN 0163-5840.

Andrea Esuli and Fabrizio Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC’06), pages 417–422, 2006.

Quang-Thuy Ha, Tien-Thanh Vu, Huyen-Trang Pham, and Cong-To Luu. An upgrading feature-based opinion mining model on Vietnamese product reviews. In Proceedings of the 7th International Conference on Active Media Technology, AMT’11, pages 173–185, Berlin, Heidelberg, 2011. Springer-Verlag. ISBN 978-3-642-23619-8.

Vasileios Hatzivassiloglou and Kathleen R. McKeown. Predicting the semantic
orientation of adjectives. In Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics, EACL ’97, pages 174–181, Stroudsburg, PA, USA, 1997. Association for Computational Linguistics.

Chih-Wei Hsu and Chih-Jen Lin. A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2):415–425, 2002. URL http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=991427&isnumber=21380.

Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’04, pages 168–177, New York, NY, USA, 2004. ACM. ISBN 1-58113-888-1.

Pham Huyen-Trang, Vu Tien-Thanh, Tran Mai-Vu, and Ha Quang-Thuy. A solution for grouping Vietnamese synonym feature words in product reviews. In Proceedings of the APSCC 2011 conference, in press, Korea, 2011.

Binh Thanh Kieu and Son Bao Pham. Sentiment analysis for Vietnamese. In Proceedings of the 2010 Second International Conference on Knowledge and Systems Engineering, KSE ’10, pages 152–157, Washington, DC, USA, 2010. IEEE Computer Society. ISBN 978-0-7695-4213-3.

Soo-Min Kim and Eduard Hovy. Automatic identification of pro and con reasons in online reviews. In Proceedings of the COLING/ACL on Main conference poster sessions, COLING-ACL ’06, pages 483–490, Stroudsburg, PA, USA, 2006. Association for Computational Linguistics.

Kunlun Li, Xuerong Luo, and Ming Jin. Semi-supervised learning for SVM-kNN. Journal of Computers, 5(5):671–679, 2010.

Bing Liu. Sentiment analysis and subjectivity. In Nitin Indurkhya and Fred J. Damerau, editors, Handbook of Natural Language Processing, Second Edition. CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010. ISBN 978-1420085921.

Bruno Ohana. Opinion mining with the SentWordNet lexical resource. PhD thesis, 2009.

Bo Pang and Lillian Lee. Opinion mining and sentiment analysis. Found. Trends Inf. Retr., 2:
1–135, January 2008. ISSN 1554-0669. doi: 10.1561/1500000011. URL http://dl.acm.org/citation.cfm?id=1454711.1454712.

Dang Duc Pham, Giang Binh Tran, and Son Bao Pham. A hybrid approach to Vietnamese word segmentation using part of speech tags. Knowledge and Systems Engineering, International Conference on, 0:154–161, 2009.

Ana-Maria Popescu and Oren Etzioni. Extracting product features and opinions from reviews. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT ’05, pages 339–346, Stroudsburg, PA, USA, 2005. Association for Computational Linguistics.

Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Expanding domain sentiment lexicon through double propagation. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI’09, pages 1199–1204, San Francisco, CA, USA, 2009. Morgan Kaufmann Publishers Inc.

Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Opinion word expansion and target extraction through double propagation. Comput. Linguist., 37:9–27, 2011. ISSN 0891-2017.

Christopher Scaffidi, Kevin Bierhoff, Eric Chang, Mikhael Felker, Herman Ng, and Chun Jin. Red Opal: product-feature scoring from reviews. In Proceedings of the 8th ACM conference on Electronic commerce, EC ’07, pages 182–191, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-653-0.

Veselin Stoyanov and Claire Cardie. Topic identification for fine-grained opinion analysis. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1, COLING ’08, pages 817–824, Stroudsburg, PA, USA, 2008. Association for Computational Linguistics. ISBN 978-1-905593-44-6.

Mike Thelwall. Myspace comments. Online Information Review, 33(1):58–76, 2009.

Peter D. Turney. Thumbs up or thumbs down?
Semantic orientation applied to unsupervised classification of reviews. Computational Linguistics, pages (July):8, 2002. URL http://cogprints.org/2321/.

Peter D. Turney and Michael L. Littman. Measuring praise and criticism: Inference of semantic orientation from association. ACM Trans. Inf. Syst., 21:315–346, October 2003. ISSN 1046-8188.

Tien-Thanh Vu, Huyen-Trang Pham, Cong-To Luu, and Quang-Thuy Ha. A feature-based opinion mining model on product reviews in Vietnamese. In Radoslaw Katarzyniak, Tzu-Fu Chiu, Chao-Fu Hong, and Ngoc Nguyen, editors, Semantic Methods for Knowledge Management and Communication, volume 381 of Studies in Computational Intelligence, pages 23–33. Springer Berlin Heidelberg, 2011. ISBN 978-3-642-23417-0.

Zhongwu Zhai, Bing Liu, Hua Xu, and Peifa Jia. Grouping product features using semi-supervised learning with soft-constraints. In Proceedings of the 23rd International Conference on Computational Linguistics, COLING ’10, pages 1272–1280, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.

Zhongwu Zhai, Bing Liu, Hua Xu, and Peifa Jia. Clustering product features for opinion mining. In WSDM’11, pages 347–354, 2011a.

Zhongwu Zhai, Bing Liu, Hua Xu, and Peifa Jia. Constrained LDA for grouping product features in opinion mining. In Joshua Huang, Longbing Cao, and Jaideep Srivastava, editors, Advances in Knowledge Discovery and Data Mining, volume 6634 of Lecture Notes in Computer Science, pages 448–459. Springer Berlin / Heidelberg, 2011b. ISBN 978-3-642-20840-9.

Hao Zhang, Alexander C. Berg, Michael Maire, and Jitendra Malik. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In CVPR (2), pages 2126–2136, 2006.

Lei Zhang, Bing Liu, Suk Hwan Lim, and Eamonn O’Brien-Strain. Extracting and ranking product features in opinion documents. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING ’10, pages 1462–1470, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
Copyright © 2012 by Vu Tien Thanh. Printed and bound by Vu Tien Thanh.

... Opinion Mining; Feature-based Opinion Mining; Feature-based Opinion Mining and Summarizing; Natural Language Processing; Pointwise Mutual Information; Support Vector Machine; Hierarchical Agglomerative...

... there are more and more user opinions on all aspects of life and society appearing on the internet, creating a rich data resource for opinion mining and summarization. This brings not only advantages...

... 2.1 Opinion Mining

• Opinion passage on a feature: An opinion passage on a feature f of an object O evaluated in d is a group of consecutive sentences in
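
As a worked illustration of the Phase 3 scoring rule (Section 3.4), the sketch below sums the per-word weights for one feature and applies the ±0.2 thresholds. It is a minimal sketch and not the thesis implementation. The function names and the grade value 1.5 are hypothetical, and letting negation and a gradable word apply to the same opinion word is an assumption, since the text describes them as separate cases. The only lexicon values used are the two weights of 0.625 quoted in the thesis example.

```python
THRESHOLD = 0.2  # orientation threshold stated in Section 3.4


def word_score(weight, negated=False, grade=None):
    """Return ts_i for one opinion word.

    weight  -- w_i, the positive score minus the negative score of the word
    negated -- True when a negation word such as "không" applies
    grade   -- weight h of a gradable word such as "rất", or None if absent
    """
    score = weight if grade is None else grade * weight
    # Assumption: negation flips the sign after any grading has been applied.
    return -score if negated else score


def feature_orientation(word_scores):
    """Sum the ts_i values for one feature and map the total to a label."""
    ts = sum(word_scores)
    if ts > THRESHOLD:
        label = "positive"
    elif ts < -THRESHOLD:
        label = "negative"
    else:
        label = "neutral"
    return ts, label


# Thesis example: "đầy đủ" (full) and "dễ" (easy), both with weight 0.625,
# attached to the feature "ứng dụng" (application).
ts, label = feature_orientation([word_score(0.625), word_score(0.625)])
print(ts, label)  # 1.25 positive
```

With the two weights from the example in Section 3.4 the total is 1.25, which is above the threshold, so the feature is labelled positive, as in the text.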

Date posted: 27/06/2022, 09:12
