(Master's thesis) Predicting the Popularity of Social Curation (predicting prominent social network content)

Predicting the Popularity of Social Curation

Kieu Thanh Binh
Faculty of Information Technology
University of Engineering and Technology
Vietnam National University, Hanoi

Supervised by Assoc. Prof. Pham Bao Son

A thesis submitted in fulfillment of the requirements for the degree of Master of Science in Computer Science

December 2015

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at University of Engineering and Technology (UET/Coltech) or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UET/Coltech or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.’

Hanoi, December 30th, 2015
Signed

ABSTRACT

The amount and variety of social media content, such as statuses, images, movies, and music, are increasing rapidly. Accordingly, social curation services are emerging as a new way to connect, select, and organize information on a massive scale. One noticeable feature of social curation services is that they are loosely supervised: the content that users create in the service is manually collected, selected, and maintained. A large proportion of this content is arbitrarily created by inexperienced users. In this thesis, we look into social curation, in particular the Storify website. This is the most popular social curation service for creating stories that include content from various domains such as Twitter, Flickr, and YouTube. We implemented a machine learning method with feature extraction to filter this content and to predict the popularity of social curation data.

Publication: Binh Thanh Kieu, Son Bao Pham and Ryutaro Ichise. Predicting the Popularity of Social Curation. In Proceedings of the 6th International Conference on Knowledge and Systems Engineering, pp. 413-424, Springer (KSE 2014).

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to my supervisor, Assoc. Prof. Pham Bao Son, for his patient guidance and continuous support throughout the years. He was always there when I needed help, and responded to queries helpfully and promptly.

I would like to specially thank Prof. Ryutaro Ichise and his colleagues for their help during my time at Ichise Laboratory, NII.

I sincerely acknowledge the Vietnam National University, Hanoi, the Toshiba Foundation Scholarship, and especially Assoc. Prof. Pham Bao Son for financially supporting my master's study.

Finally, this thesis would not have been possible without the support and love of my mother and my father. Thank you!
Table of Contents

1 Introduction
  1.1 Social Curation
  1.2 Predicting the popularity
  1.3 Thesis Organisation
2 Literature review
  2.1 Social Curation
    2.1.1 Definition
    2.1.2 Social Curation Service
  2.2 Storify
  2.3 Related Work
3 Predicting the Popularity of Social Curation
  3.1 Problem Formulation
    3.1.1 Regression
    3.1.2 Classification
  3.2 Feature Extraction
    3.2.1 Curator features
    3.2.2 Curation features
    3.2.3 Text features
    3.2.4 Regression and classification model
4 Experimental Results
  4.1 The Experimental Dataset
  4.2 Results
    4.2.1 Regression
    4.2.2 Classification
    4.2.3 T-test Evaluation
5 Conclusion

List of Figures

2.1 Content Creators
2.2 Content Curators
2.3 Example of a Storify list

List of Tables

2.1 Statistics of curated domains
2.2 Element types
2.3 Storify action statistics
4.1 Mean Square Errors (MSE) of view count regression by SVR
4.2 Prediction accuracy
4.3 Accuracy of 10 tests

List of Abbreviations

BoW   Bag-of-Words
ML    Machine Learning
NLP   Natural Language Processing
SNS   Social Networking Service
SVM   Support Vector Machine

Chapter 2 Literature review

[...] developed a coarse multi-class classifier-based approach to determine whether given Twitter hashtags are retweeted x ≤ (0; 100; 10000; ∞) times (Hong et al., 2011). Similarly, Lakkaraju and Ajmera used support vector machines (SVMs) to predict whether a given content falls into a group that attracts x ≤ (10%; 25%; 50%; 75%; 100%) of the attention in a system (Lakkaraju and Ajmera, 2011), while Jamali and Rangwala predicted the popularity of content by using an entropy measure (Jamali and Rangwala, 2009). Finally, Szabo and Huberman presented a linear
regression model based on the number of views (Szabo and Huberman, 2010); this method was used to build popularity predictors by applying regression to different feature spaces (Bandari et al., 2012; Hogg and Lerman, 2012; Lerman and Hogg, 2010; Tsagkias et al., 2010).

In this work, the popularity of social curation is measured by the number of views that the content will receive in the near future. We propose three groups for categorizing the popularity level of social curation. We build a predictor based on a machine learning method, SVM, with feature selection to classify content into these groups.

Chapter 3 Predicting the Popularity of Social Curation

3.1 Problem Formulation

3.1.1 Regression

Formally, we predict the view count $y_i$ of content $i$ from information about the content, $x_i$. This is a typical regression problem: we try to minimize the error between the predicted view count $\hat{y}_i$ and the true view count $y_i$ by adjusting an unknown parameter $w$ that governs the regression function $\hat{y}_i = f(x_i; w)$. Given content and social curation lists, we extract several features $x_i$ and predict a view count for each content. Social curation lists contain many kinds of information that are useful for predicting view counts.

3.1.2 Classification

As with normal content, the popularity of social curation is defined by the number of user views. We predict how many views stories will receive in the near future. However, it is difficult to predict the exact amount of attention, and people are mostly interested in whether content is popular; thus, instead of predicting the exact number, we cast the task as a multi-class classification problem that predicts the popularity level a curation list will reach after three months, based on the number of views. Although our system cannot predict the exact amount of attention, this system partly helps users
to identify popular and unpopular content. We divide the number of views into three classes:

class 1: not popular, with fewer than 10 views;
class 2: less popular, with between 10 and 1000 views;
class 3: very popular, with more than 1000 views.

We used an SVM to classify stories into these classes. LibSVM (Chang and Lin, 2011) with a radial basis function (RBF) kernel and default parameters, together with the feature selection tool (wei Chen, 2005), was used to optimize the result. We extracted three types of features, namely curation features, curator features and text features. Curator features describe the users who collect and organize elements from some domains and create curation lists. Curation features relate to the content of the curation lists. Text features cover all text content of the curation lists.

3.2 Feature Extraction

Social curation lists contain many kinds of information that are useful for classification. For example, if the curation list includes many Twitter contents, the view count of the contents is expected to increase; or, if elements match the context of the curation list, the content will attract much more attention. In this study, as the social curation lists included a large number of Twitter messages, we used features that have proved applicable for predicting the number of retweets and microblogging popularity. We divided the features into the three distinct sets mentioned above: curator features (which are related to the author of the story), curation features (which encompass various statistics of the content in the story) and text features.

3.2.1 Curator features

The following are the five curator features:
(i) The number of users who follow the curator of the content
(ii) The number of users whom the curator of the content follows
(iii) The number of stories written by the curator
(iv) The user's language (English or not)
(v) When the
curator of the content started using Storify

These features were selected from the content creator features proposed by Ishiguro et al. (2012). We implemented these features as our baseline system. The number of followers and friends has been consistently shown to be a good indicator of retweetability, whereas the number of stories has not been found to have a significant impact (Suh et al., 2010). Our prior analysis also showed that stories written in English are more likely to be viewed, so we used a binary feature indicating whether the user's language is English. The date when a curator started using Storify reflects their experience. Normally, long-time users have more experience and produce more popular stories than new users. We are not aware of any prior work that analyzes the effect of language or date on content popularity.

3.2.2 Curation features

The following are the seven curation features:
(i) The number of hashtags
(ii) The number of versions
(iii) The number of embeds
(iv) The story's language (English or not)
(v) The ratio of popular tweet elements to total elements (a tweet is popular if it has more than 100 retweets)
(vi) The ratio of popular image and video elements to total elements (an image or video is popular if it has more than 1000 views)
(vii) The total number of elements

As a large proportion of the elements in the curation lists come from the Twitter domain, hashtags are an important feature for predicting popularity. One paper showed that hashtags, URLs and mentions are highly correlated with the popularity of Twitter messages (Suh et al., 2010). Although the Storify API provides the hashtags, URLs and mentions of each story, URLs and mentions have an insignificant impact on the result. The version feature reflects that users who revise their story can improve its quality and attract more attention. The embed feature shows that more sharing
means more popularity. English is the most widely known language in the world, so stories written in English are read by more people than those in any other language. Although this feature is quite similar to the curator's language feature, not all curators write stories in their main language. In our experiments, tests using this feature achieved higher accuracy. Finally, higher proportions of Twitter elements and media elements also increase the accuracy. Moreover, stories with many elements draw more attention than stories with fewer elements.

3.2.3 Text features

Text messages can directly represent the intentions, opinions, or emotions of content creators and curators. Thus, carefully designed text features should be useful in predicting responses to content. Our assumption is that if the topics or contexts of the list and the comments attached to the content match well, then the content will attract much attention and gain view counts.

Our text features are computed as follows. First, we extract three parts of text found in Storify curation lists, and compute a Bag-of-Words (BoW) histogram from each part. The first part is the title and description of the curation list. This part is edited directly by curators, so we expect it to describe the entire context of the list concisely. The second part is all text of the curated contents in the list: tweets, Facebook, and YouTube. The BoW histogram of this part is a direct summarization of the curation list. The third part is all comments on the curation list. The histogram of this part encompasses the responses of SNS users to the content.

From these BoW histograms, we compute three cosine distances based on our assumption:
(i) Distance between the first and the second BoWs
(ii) Distance between the first and the third BoWs
(iii) Distance between the second and the third BoWs

Feature (i) computes text context
similarities between the title and description and the text in the list. In other words, this feature measures the similarity between the curator's intention and the actual context of the list. Feature (ii) computes text context similarities between the title and description and the responses to the content. Namely, this feature measures the similarity between the curator's intention and the observed responses to the content in SNSs. Feature (iii) computes text context similarities between the tweets in the list and the responses to the content. In other words, this feature measures the similarity between the actual context of the list and the observed responses to the content.

We also compute binarized versions of the three BoWs. We binarize the BoW histograms by thresholding in order to absorb differences in the lengths and numbers of tweets among the curation lists. We compute three cosine distances for these binarized BoWs in the same manner. Thus, we finally obtain six text features.

3.2.4 Regression and classification model

To the best of our knowledge, no prior work has analyzed the effect of these features on content popularity. Therefore, the features we propose were chosen through experiments and a feature selection tool to obtain the best result. The feature selection tool, used together with LibSVM, selects features by F-score (wei Chen, 2005). The F-score is a simple technique that measures the discrimination between two sets of real numbers. The larger the F-score, the more discriminative the feature is likely to be. Therefore, this score is used as a feature selection criterion. Moreover, LibSVM provides a feature scaling function to absorb scale differences among feature values; we used it to rescale the features to [0, 1]. Finally, the features above gave the best result for predicting the popularity of Storify data.

Support Vector Regression (SVR) is known for its strong regression performance and is used as one of the standard regression models. We
employ SVR as the regression function and also use an SVM as the classification model. As the kernel function, we choose the standard RBF kernel. We experimentally optimized the soft margin parameter and the kernel parameter. Other parameters were set to their default values.

Chapter 4 Experimental Results

4.1 The Experimental Dataset

We used Storify's streaming API to collect a random sample of public stories created from March 1, 2013 to March 31, 2013, yielding 34,810 curation lists. We assume that these stories have the same publication time. We crawled them in June 2013, so we predict how much attention this content receives three months later. Finally, we divided this dataset into 10 groups and ran 10-fold cross-validation.

4.2 Results

4.2.1 Regression

The distribution of view counts is skewed: the minimum view count is 0, while the maximum is 1,389,705. Therefore, we used the logarithm of view counts in the experiment. The mean and the variance of the log view counts are 4.4589 and 3.1035, respectively. The result is shown in Table 4.1.

Table 4.1: Mean Square Errors (MSE) of view count regression by SVR

Type of feature             No. of features   MSE (10-fold)
Curation features           7                 1.5470
Text features               6                 1.8542
Curator features            5                 1.2642
Curation + Text             13                1.4785
Curation + Curator          12                1.3774
Curation + Text + Curator   18                1.4234

4.2.2 Classification

The popularity levels are given by the three classes defined in Section 3.1.2. Nearly half of the stories are class 1, nearly 20% are class 2, and the remainder are class 3. The prediction accuracy for each feature set is shown in Table 4.2. Between the curation and curator features, the curation features (7 features) perform worse at 75.08%, the curator features (5 features) reach 80.02%, and the best result is the combined features (curator plus curation features, 12 features in total) at 82.62%. Therefore, both types of features are necessary for high-accuracy prediction.

Table 4.2: Prediction accuracy

Type of feature             No. of features   Classification (10-fold)
Curation features           7                 75.08%
Text features               6                 70.68%
Curator features            5                 80.20%
Curation + Text             13                74.20%
Curation + Curator          12                82.62%
Curation + Text + Curator   18                76.42%

Table 4.3 shows more detailed results for the 10 tests. The curator features (as baseline features) and the combined features (curation and curator) show different results. Most tests using combined features are more accurate than tests using only curator features, except for test 6. We analyzed test 6 and found that the percentage of class 2 is approximately 40%, which is double the normal percentage. This indicates that combined features do not perform well for class 2. In addition, most tests using combined features attain roughly 83% accuracy, except for tests 6 and 9 (lower accuracy, barely over 70%) and tests 8 and 10 (high accuracy, around 90% or above). Although the class distribution in these tests differs from the others, the difference is irregular and not significant. This is an open problem in our research; answering this question would improve the result.

Table 4.3: Accuracy of 10 tests

Test   Curator features   Curation + Curator features
1      83.61%             87.42%
2      82.26%             85.58%
3      79.88%             83.91%
4      78.55%             80.38%
5      82.23%             85.49%
6      73.56%             71.19%
7      76.31%             78.92%
8      87.60%             89.83%
9      68.44%             70.38%
10     88.78%             93.22%

4.2.3 T-test Evaluation

The (Student's) t-test is a statistical test that compares two population means. In simple terms, the t-test assesses whether the means of two groups are statistically different from each other. It is commonly used when the variances of two normal distributions are unknown and when an experiment uses a small sample size. In our case, we used the t-test to evaluate two group results
of the above 10 tests (small sample size). The 95% confidence interval of the difference is from −3.8929 to −1.1271, and we obtain t = −4.1059. Therefore, we conclude that this difference is very significant. It indicates that our proposal to use both feature sets is effective for predicting the popularity of social curation data.

Chapter 5 Conclusion

In this paper, we presented a method to predict the popularity of social curation content as the first step for mining social curation. A key insight is that a curation list, unlike other social data, is the product of manual collection, selection, and maintenance by curators. We used a machine learning approach and selected key features. Analyzing the features, we found that social features (curator features) perform very well, but the system can be improved by combining them with the content features (curation features). A comparison by the t-test showed that the improvement is significant.

However, the paper investigated only a specific curation dataset for a specific task. We are aware that there are many open problems. We have to investigate social features in a larger dataset or other domains. In addition, analyzing and explaining the effect of features for predicting the popularity of social curation could improve the result.

Finally, our research is the first task for mining social curation data. Based on this research, we could consider future tasks such as an automatic system or a recommendation system for curating social data.

Bibliography

Mohamed Ahmed, Stella Spagna, Felipe Huici, and Saverio Niccolini. A peek into the future: Predicting the evolution of popularity in user generated content. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, WSDM ’13, pages 607–616, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1869-3. doi: 10.1145/2433396.2433473. URL http://doi.acm.org/10.1145/2433396.2433473

Roja Bandari, Sitaram Asur, and Bernardo A. Huberman. The pulse of news in social media: Forecasting popularity. CoRR, abs/1202.0332, 2012. URL http://arxiv.org/abs/1202.0332

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, March 2003. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=944919.944937

Meeyoung Cha, Alan Mislove, and Krishna P. Gummadi. A measurement-driven analysis of information propagation in the flickr social network. In Proceedings of the 18th International Conference on World Wide Web, WWW ’09, pages 721–730, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-487-4. doi: 10.1145/1526709.1526806. URL http://doi.acm.org/10.1145/1526709.1526806

Chih-Chung Chang and Chih-Jen Lin. Libsvm: A library for support vector machines. ACM Trans. Intell. Syst. Technol., 2(3):27:1–27:27, May 2011. ISSN 2157-6904. doi: 10.1145/1961189.1961199. URL http://doi.acm.org/10.1145/1961189.1961199

Kevin Duh, Tsutomu Hirao, Akisato Kimura, Katsuhiko Ishiguro, Tomoharu Iwata, and Ching-Man Au Yeung. Creating stories: Social curation of twitter messages, 2012. URL https://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4578

Keylly Fincham. Review: Storify (2011). Journal of Media Literacy Education, Volume 3, issue 1, 2011. URL http://digitalcommons.uri.edu/jmle/vol3/iss1/15/

Derek Greene, Gavin Sheridan, Barry Smyth, and Pádraig Cunningham. Aggregating content and network information to curate twitter user lists. In Proceedings of the 4th ACM RecSys Workshop on Recommender Systems and the Social Web, RSWeb ’12, pages 29–36, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1638-5. doi: 10.1145/2365934.2365941. URL http://doi.acm.org/10.1145/2365934.2365941

Catherine Hall and Michael Zarro. Social curation on the website pinterest.com. Proceedings of the American Society for Information Science and Technology, 49(1):1–9, 2012. ISSN 1550-8390. doi: 10.1002/meet.14504901189. URL http://dx.doi.org/10.1002/meet.14504901189

Tad Hogg and Kristina Lerman. Social dynamics of digg. CoRR, abs/1202.0031, 2012. URL http://arxiv.org/abs/1202.0031

Liangjie Hong, Ovidiu Dan, and Brian D. Davison. Predicting popular messages in twitter. In Proceedings of the 20th International Conference Companion on World Wide Web, WWW ’11, pages 57–58, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0637-9. doi: 10.1145/1963192.1963222. URL http://doi.acm.org/10.1145/1963192.1963222

Amanda Lenhart, Deborah Fallows, and John Horrigan. Content creation online. February 2004.

Yuheng Hu, Ajita John, Dorée Duncan Seligmann, and Fei Wang. What were the tweets about? Topical associations between public events and twitter feeds. In Proceedings of the Sixth International Conference on Weblogs and Social Media, Dublin, Ireland, June 4-7, 2012, 2012. URL http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4658

Katsuhiko Ishiguro, Akisato Kimura, and Koh Takeuchi. Towards automatic image understanding and mining via social curation. In Proceedings of the 2012 IEEE 12th International Conference on Data Mining, ICDM ’12, pages 906–911, Washington, DC, USA, 2012. IEEE Computer Society. ISBN 978-0-7695-4905-7. doi: 10.1109/ICDM.2012.37. URL http://dx.doi.org/10.1109/ICDM.2012.37

Salman Jamali and Huzefa Rangwala. Digging digg: Comment mining, popularity prediction, and social network analysis. In Proceedings of the 2009 International Conference on Web Information Systems and Mining, WISM ’09, pages 32–38, Washington, DC, USA, 2009. IEEE Computer Society. ISBN 978-0-7695-3817-4. doi: 10.1109/WISM.2009.15. URL http://dx.doi.org/10.1109/WISM.2009.15

Su-Do Kim, Sung-Hwan Kim, and Hwan-Gue Cho. Predicting the virtual temperature of webblog articles as a measurement tool for online popularity. In Proceedings of the 2011 IEEE 11th International Conference on Computer and Information Technology, CIT ’11, pages 449–454, Washington, DC, USA, 2011. IEEE Computer Society. ISBN 978-0-7695-4388-8. doi: 10.1109/CIT.2011.104. URL http://dx.doi.org/10.1109/CIT.2011.104

Juhi Kulshrestha, Farshad Kooti, Ashkan Nikravesh, and Krishna Gummadi. Geographic dissection of the twitter network, 2012. URL http://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4685

Himabindu Lakkaraju and Jitendra Ajmera. Attention prediction on social media brand pages. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM ’11, pages 2157–2160, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0717-8. doi: 10.1145/2063576.2063915. URL http://doi.acm.org/10.1145/2063576.2063915

Jong Gun Lee, Sue Moon, and Kave Salamatian. An approach to model and predict the popularity of online contents with explanatory factors. In Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT ’10, pages 623–630, Washington, DC, USA, 2010. IEEE Computer Society. ISBN 978-0-7695-4191-4. doi: 10.1109/WI-IAT.2010.209. URL http://dx.doi.org/10.1109/WI-IAT.2010.209

Jong Gun Lee, Sue Moon, and Kavé Salamatian. Modeling and predicting the popularity of online contents with cox proportional hazard regression model. Neurocomput., 76(1):134–145, January 2012. ISSN 0925-2312. doi: 10.1016/j.neucom.2011.04.040. URL http://dx.doi.org/10.1016/j.neucom.2011.04.040

Kristina Lerman and Tad Hogg. Using a model of social dynamics to predict popularity of news. In Proceedings of the 19th International Conference on World Wide Web, WWW ’10, pages 621–630, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-799-8. doi: 10.1145/1772690.1772754. URL http://doi.acm.org/10.1145/1772690.1772754

Yelena Mejova and Padmini Srinivasan. Crossing media streams with sentiment: Domain adaptation in blogs, reviews and twitter, 2012. URL https://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/view/4580

Bongwon Suh, Lichan Hong, Peter Pirolli, and Ed H. Chi. Want to be retweeted? Large scale analytics on factors impacting retweet in twitter network. In Proceedings of the 2010 IEEE Second International Conference on Social Computing, SOCIALCOM ’10, pages 177–184, Washington, DC, USA, 2010. IEEE Computer Society. ISBN 978-0-7695-4211-9. doi: 10.1109/SocialCom.2010.33. URL http://dx.doi.org/10.1109/SocialCom.2010.33

Gabor Szabo and Bernardo A. Huberman. Predicting the popularity of online content. Commun. ACM, 53(8):80–88, August 2010. ISSN 0001-0782. doi: 10.1145/1787234.1787254. URL http://doi.acm.org/10.1145/1787234.1787254

Manos Tsagkias, Wouter Weerkamp, and Maarten de Rijke. News comments: Exploring, modeling, and online prediction. In Proceedings of the 32nd European Conference on Advances in Information Retrieval, ECIR’2010, pages 191–203, Berlin, Heidelberg, 2010. Springer-Verlag. ISBN 3-642-12274-4, 978-3-642-12274-3. URL http://doi.acm.org/10.1007/978-3-642-12275-0_19

Roelof van Zwol, Adam Rae, and Lluis Garcia Pueyo. Prediction of favourite photos using social, visual, and textual signals. In Proceedings of the International Conference on Multimedia, MM ’10, pages 1015–1018, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-933-6. URL http://doi.acm.org/10.1145/1873951.1874138

Yi wei Chen. Combining svms with various feature selection strategies. In Taiwan University. Springer-Verlag, 2005.

Copyright © 2015 by Binh Thanh Kieu
Printed and bound by Binh Thanh Kieu

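The t-test of Section 4.2.3 can be checked against the per-test accuracies in Table 4.3. The following is a minimal sketch, not the procedure actually used in the thesis: it assumes a paired t-test over the 10 tests (the thesis does not say whether the test was paired), and the names `CURATOR_ACC`, `COMBINED_ACC`, `T_CRIT_95_DF9` and `paired_t_test` are hypothetical, introduced only for this illustration.

```python
"""Paired t-test on the per-test accuracies of Table 4.3 (illustrative sketch)."""
import math
from statistics import mean, stdev

# Per-test accuracies (%) copied from Table 4.3, tests 1 to 10.
CURATOR_ACC = [83.61, 82.26, 79.88, 78.55, 82.23, 73.56, 76.31, 87.60, 68.44, 88.78]
COMBINED_ACC = [87.42, 85.58, 83.91, 80.38, 85.49, 71.19, 78.92, 89.83, 70.38, 93.22]

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (standard table value, hard-coded so that only the standard library is needed).
# It is only valid for 10 paired observations.
T_CRIT_95_DF9 = 2.262157


def paired_t_test(a, b, t_crit=T_CRIT_95_DF9):
    """Return (mean difference, t statistic, CI low, CI high) for the pairs a - b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # stdev() is the sample standard deviation (divides by n - 1).
    std_err = stdev(diffs) / math.sqrt(n)
    t_stat = mean_diff / std_err
    half_width = t_crit * std_err
    return mean_diff, t_stat, mean_diff - half_width, mean_diff + half_width


if __name__ == "__main__":
    mean_diff, t_stat, ci_low, ci_high = paired_t_test(CURATOR_ACC, COMBINED_ACC)
    print(f"mean difference = {mean_diff:.4f}")
    print(f"t = {t_stat:.4f}")
    print(f"95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

Under the paired assumption, the Table 4.3 values give a mean difference of about −2.51 percentage points, t ≈ −4.106, and a 95% interval of roughly (−3.893, −1.127), in line with the figures reported in Section 4.2.3. This suggests, although the thesis does not state it, that the reported test was a paired one.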
Title	Predicting the Popularity of Social Curation
Author	Kieu Thanh Binh
Supervisor	Assoc. Prof. Pham Bao Son
University	University of Engineering and Technology
Major	Computer Science
Type	thesis
Year of publication	2015
City	Hanoi

Format
Number of pages	36
File size	1.05 MB