VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

Dat Mai-Cong

THE ROLE OF SOCIAL TIES IN SOCIAL RECOMMENDATION SYSTEMS

Major: Computer Science
Supervisor: Assoc. Prof. Dr. Thuy Ha-Quang
Co-Supervisor: MSc. Le Luong-Thai

HA NOI - 2015

AUTHORSHIP

“I hereby declare that the work contained in this thesis is my own and has not been previously submitted for a degree or diploma at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference or acknowledgement is made.”

Signature:………………………………………………

SUPERVISOR’S APPROVAL

“I hereby approve that the thesis in its current form is ready for committee examination as a requirement for the Bachelor of Computer Science degree at the University of Engineering and Technology.”

Signature:………………………………………………

ACKNOWLEDGEMENT

First of all, I would like to express my sincere thanks to my advisor, Assoc. Prof. Dr. Thuy Ha-Quang, for his support and guidance throughout this thesis work. I am grateful to MSc. Le Luong-Thai for her support and her reviews of this thesis. I would like to thank the members of the Knowledge and Technology Laboratory (KT-lab) who supported me in completing this research. I would also like to express my gratitude to the University of Engineering and Technology for providing the environment and conditions for my studies. I am greatly indebted to my family for their encouragement, unconditional support and patience. Because time was limited, this thesis inevitably has shortcomings; I look forward to comments from teachers and from readers interested in this topic.

ABSTRACT

Social Recommendation Systems have received increasing attention from scientists in recent
years. Many studies have been published in this field, such as Jiliang Tang et al. (2013) [1] and Jiliang Tang, Jie Tang and Huan Liu (2014) [2]. The rapid growth of social networks also brings many opportunities to improve Recommendation Systems [3] [4]. Social theories and models for Social Recommendation Systems have been developed to explain and demonstrate the positive effect of social relations on the quality of Social Recommendation Systems [4]. Among these, social tie strength is also used to improve the quality of Recommendation Systems. This thesis focuses on exploiting the effect of social ties on the performance of Recommendation Systems, based on the studies in [3] [5] [6]. Building on these studies, the thesis proposes a model for mining social tie strength to enhance the quality of Recommendation Systems along two dimensions of tie strength: appearances together in photos and number of friends in common. The thesis also implements this model as an experiment, using data collected from a survey in which 80 Facebook users rated 99 movies. Experimental results show that exploiting tie strength is initially effective in improving social recommendation.

Keywords: Social Recommendation Systems, Recommendation Systems, Social Ties, Tie Strength, Collaborative filtering, Social Theory, Social media

TÓM TẮT (ABSTRACT IN VIETNAMESE)

In recent years, social recommendation systems have received growing attention from scientists, and many studies on them have been published, such as those of Jiliang Tang et al. (2013) [1] and Jiliang Tang, Jie Tang and Huan Liu (2014) [2]. The development of social networks brings many opportunities to improve the quality of recommendation systems [3] [4]. Social theories and a number of recommendation models have been developed to explain and demonstrate the role of social relations in recommendation systems [4]. Among them, the tie strength between users of a social network is used to increase recommendation quality. This thesis focuses on exploiting the tie strength between users of social networks, based on the studies in [3] [5] [6]. On the basis of these studies, the thesis proposes a model that exploits social ties to enhance social recommendation, based on tie strength computed from two parameters: “number of mutual friends” and “number of photos in common”. The thesis builds and implements the model and collects data from a survey in which 80 Facebook users rated 99 movies. Experimental results show that exploiting tie strength has an initial positive effect on recommendation quality.

Keywords: Social Recommendation Systems, Recommendation Systems, Social Ties, Tie Strength, Collaborative filtering, Social Theory, Social media

TABLE OF CONTENTS

AUTHORSHIP
SUPERVISOR’S APPROVAL
ACKNOWLEDGEMENT
ABSTRACT
TÓM TẮT
TABLE OF CONTENTS
List of Figures
List of Tables
ABBREVIATIONS
1 INTRODUCTION
  1.1 Motivation
    1.1.1 Social Network with Tie Strength
  1.2 Contributions and thesis overview
2 LITERATURE REVIEW
  2.1 Traditional Recommendation Systems
    2.1.1 Content-based filtering approach
    2.1.2 Collaborative filtering approach
      2.1.2.1 Memory based approach
      2.1.2.2 Model based approach
    2.1.3 Hybrid Recommendation Systems
    2.1.4 Evaluation Recommendation Systems
    2.1.5 Some problems in Recommendation Systems
      2.1.5.1 Cold-start problem
      2.1.5.2 Data sparsity problem
      2.1.5.3 Attacks problem
      2.1.5.4 Privacy concerns
      2.1.5.5 Explanation problem
  2.2 Social Recommendation
    2.2.1 Social media and Social theories
      2.2.1.1 Social media
      2.2.1.2 Social Theories
    2.2.2 Social Recommendation
      2.2.2.1 Special feature of Social Recommendation
      2.2.2.2 Social Recommendation systems
  2.3 Social Tie Theories
    2.3.1 Introduction
    2.3.2 Social Tie Strength
  2.4 Summary
3 THE METHOD
  3.1 The role of Social Tie Strength
  3.2 A model to indicate the effect of Social Tie strength to Recommendation Systems
    3.2.1 General Idea
    3.2.2 A model to indicate the effect of Tie strength to Recommendation Systems
      3.2.2.1 Data preprocessing
      3.2.2.2 Collaborative filtering systems
      3.2.2.3 Collaborative filtering combined with Tie strength
      3.2.2.4 Evaluation
    3.2.3 Summary
4 EXPERIMENTS AND DISCUSSIONS
  4.1 Overview
  4.2 Tools in use
  4.3 Data
  4.4 Result and Discussion
5 CONCLUSIONS
  5.1 Conclusions
  5.2 Future Works
REFERENCES

3.2.2.1 Data preprocessing

Purpose: Preprocess the raw data.
Input: Survey data.
Output: Rating data (user IDs, item IDs, rating matrix) and social data (number of mutual friends and number of photos in common between each user and his or her sources).
Method: The survey data yields two tables. The first, called the rating table, holds the users' ratings of movies; the second, called the source table, holds the three sources chosen by each user. In this phase, the rating table is first parsed into a list of users, a list of items, and a rating matrix, which together form the rating data. For the source table, a Facebook URL request is used to obtain the mutual friends and photos in common between each user and his or her sources; these form the social data.
Steps:
1. Parse the rating table into rating data by hand and save it to a file.
2. Parse the source table:
   a. Read the current user ID and source ID from the source table.
   b. Request the URL https://facebook.com/ + “currentUserID” + “?and=” + “sourceID” in a browser.
   c. Read the number of mutual friends and the number of photos in common between the current user and the source from the browser.
   d. Save these data to a file; this is the social data.

3.2.2.2 Collaborative filtering systems

Purpose: Implement collaborative filtering based on the user-based approach with Pearson correlation.
Input: Rating data (user IDs, item IDs, rating matrix).
Output: Predictions matrix.
Method: In the second phase, user-based collaborative filtering with Pearson correlation is used to implement the CF system. User-based CF and Pearson correlation are described in Chapter 2.
Steps:
1. Read all user IDs from the rating data file into an array.
2. Loop over each user ID in the array:
   a. Calculate the Pearson correlation between the current user and all remaining users using equation 2.4.
   b. Take the three largest Pearson correlation values to obtain the neighbor set.
   c. Calculate the prediction using equation 2.2.
3. Save all prediction values into a Predictions matrix.
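The steps of Section 3.2.2.2 can be sketched in Java roughly as follows. This is a minimal illustration, not the thesis code: the class and method names are hypothetical, a rating of 0 is assumed to mean "not rated", and because equations 2.2 and 2.4 are not reproduced in this section, the sketch assumes their standard forms (Pearson correlation computed over co-rated items, and a mean-centered weighted average for the prediction). Falling back to the user's own mean when no neighbor rated the item is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of user-based CF with Pearson correlation (0 = not rated). */
final class UserBasedCfSketch {

    /** Mean of the ratings a user actually gave; unrated entries (0) are skipped. */
    static double meanRating(double[] ratings) {
        double sum = 0;
        int count = 0;
        for (double r : ratings) {
            if (r > 0) { sum += r; count++; }
        }
        return count == 0 ? 0 : sum / count;
    }

    /** Pearson correlation over co-rated items (assumed form of equation 2.4); 0 if undefined. */
    static double pearson(double[] a, double[] b) {
        double sumA = 0, sumB = 0;
        int n = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0 && b[i] > 0) { sumA += a[i]; sumB += b[i]; n++; }
        }
        if (n < 2) return 0; // not enough co-rated items
        double meanA = sumA / n, meanB = sumB / n;
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0 && b[i] > 0) {
                double da = a[i] - meanA, db = b[i] - meanB;
                cov += da * db;
                varA += da * da;
                varB += db * db;
            }
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0 ? 0 : cov / denom;
    }

    /** Predicts the rating of user u for an item from the k most correlated users. */
    static double predict(double[][] ratings, int u, int item, int k) {
        int n = ratings.length;
        double[] sim = new double[n];
        List<Integer> candidates = new ArrayList<>();
        for (int v = 0; v < n; v++) {
            if (v == u) continue;
            sim[v] = pearson(ratings[u], ratings[v]);
            candidates.add(v);
        }
        // Neighbor set: the k users with the largest correlation (k = 3 in the text).
        candidates.sort((x, y) -> Double.compare(sim[y], sim[x]));

        double meanU = meanRating(ratings[u]);
        double num = 0, den = 0;
        for (int idx = 0; idx < Math.min(k, candidates.size()); idx++) {
            int v = candidates.get(idx);
            if (ratings[v][item] <= 0) continue; // this neighbor did not rate the item
            num += sim[v] * (ratings[v][item] - meanRating(ratings[v]));
            den += Math.abs(sim[v]);
        }
        // Assumed fallback: with no usable neighbor, return the user's own mean.
        return den == 0 ? meanU : meanU + num / den;
    }

    public static void main(String[] args) {
        double[][] ratings = {
            {5, 3, 4, 0}, // user 0 has not rated item 3
            {4, 2, 3, 4},
            {1, 5, 2, 1},
            {5, 3, 4, 5},
            {3, 1, 2, 3}
        };
        System.out.println(predict(ratings, 0, 3, 3)); // prints 4.75
    }
}
```

In this toy matrix, users 1, 3 and 4 correlate perfectly with user 0 and each rated item 3 at 0.75 above their own mean, so the prediction is user 0's mean of 4 plus 0.75. The Commons-math library listed among the tools also provides a Pearson correlation class, which may be preferable to a hand-written version when the rating vectors have no missing entries.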
3.2.2.3 Collaborative filtering combined with Tie strength

Purpose: Implement collaborative filtering using tie strength as the weight instead of Pearson correlation.
Input: Rating data (user IDs, item IDs, rating matrix) and social data (number of mutual friends and number of photos in common between each user and his or her sources).
Output: Predictions matrix.
Method: In this phase, tie strength is applied to collaborative filtering. The equation presented by Ofer Arazy et al. in [3] is used to generate predictions, as equation 3.1:

\[ p_{u,i} = \bar{r}_u + \frac{\sum_{u' \in N} s(u,u')\,(r_{u',i} - \bar{r}_{u'})}{\sum_{u' \in N} |s(u,u')|} \tag{3.1} \]

where s(u, u') is the tie strength between two users u and u'.
Steps:
1. Read all user IDs from the rating data.
2. Loop over each user ID:
   a. Read the three sources of this user ID from the social data.
   b. Calculate the prediction using equation 3.1.
3. Save all predictions into the Predictions matrix.

3.2.2.4 Evaluation

Purpose: Compare the prediction results of the algorithms in phases 2 and 3.
Input: Predictions matrices (from phases 2 and 3).
Output: MAE value (used for evaluation).
Method: Use the MAE measure to calculate the MAE value.
Steps:
1. Read the two Predictions matrices from phases 2 and 3.
2. Calculate the MAE value using equation 2.13.

3.2.3 Summary

This chapter presented the effect of tie strength on Recommendation Systems and introduced a model to evaluate that effect. The next chapter presents the results of implementing the model from Chapter 3.

Chapter 4
EXPERIMENTS AND DISCUSSIONS

4.1 Overview

This experiment follows the method of Arazy O. et al. in [6]. In that article, Arazy O. et al. implement recommendation algorithms, traditional CF and CF combined with social relations and social tie strength, and then compare them to see the effect on CF of combining it with social relations. This thesis uses the same method but changes the parameters to suit the thesis. In detail, social tie strength is used in two dimensions: Appearances
together in photos (or number of photos in common) and number of friends in common (or number of mutual friends), which are combined with traditional Recommendation Systems. The aim of the experiment is to compare two algorithms, traditional collaborative filtering and collaborative filtering combined with mutual friends and photos in common, in order to highlight the positive effect of social tie strength on Recommendation Systems. The model in Chapter 3 is implemented for the experiment. As mentioned in Chapter 3, six modules are constructed:

- com.data: processes the rating data (file processing, input formatting, …)
- com.TSprocess: processes the social ties data (file processing, input formatting, …)
- com.similarity: calculates the similarity of users
- com.prediction: computes the predictions; it includes the algorithms of phases 2 and 3
- com.evaluation: implements the MAE measurement
- com.math: implements some basic math functions, such as average calculation

4.2 Tools in use

Table 4.1 and Table 4.2 show the hardware configuration and the list of software in use.

Configuration of hardware:

Hardware component | Information
Processor | Intel(R) Core(TM) i3-2350M CPU, 2.30GHz
RAM | 4GB
Operating System | Windows 64bit
Hard Disk Drive | 500GB

Table 4.1: Systems configuration information

List of software:

Index | Software | Author | Source
1 | Eclipse IDE for Java Developers, Version: Luna Release (4.4.0) | Open source software | https://www.eclipse.org
2 | Commons-math library, version 3.5 release | Open source software | http://commons.apache.org/proper/commons-math
3 | Microsoft Excel 2013 | Microsoft | https://store.office.com

Table 4.2: List of tools in use

4.3 Data

In order to study tie strength, a survey of 80 users of the social network Facebook was carried out; these users are referred to as candidates. The candidates rated 99 movies released from 2005 to 2014 on a 5-point scale (very bad, bad, normal, good, very good). From this survey, a list of users (candidates), a list of items (movies), and rating data (candidates' ratings of movies) were collected. Each candidate was asked to choose three people (trusted sources) from the candidate list from whom he or she would want advice on choosing movies. The candidates were then asked about appearances together in photos and the number of friends in common with these sources. We afterwards checked the answers using the Facebook URL request: https://www.facebook.com/ + “user ID” + “?and=” + “source ID”. Almost all of the 80 candidates are friends of ours on Facebook, so the rating values can be considered trustworthy. The composition of the candidates is shown in Table 4.3:

Group | Number | Percent (%)
University friends | 43 | 53.75
High School friends | 22 | 27.5
Family | 3 | 3.75
Unknown | 8 | 10
Other friends | 4 | 5
Total | 80 | 100

Table 4.3: The component of candidates

Data collection ran from 19/4/2015 to 1/5/2015. Because of difficulties in collecting social ties data, only 80 users completed the survey. However, other studies of social ties, such as [5] [24] [6], also collected their data through surveys with a modest number of participants. For example, Oliver Oechslein et al. used 193 participants in [5], Eric Gilbert and Karrie Karahalios used 35 participants in [24], and Ofer Arazy et al. used 99 participants in [6]. Figure 4.1, Figure 4.2 and Figure 4.3 below illustrate the processed data:

Figure 4.1: Example about items list

Figure 4.2: Example about users list

Figure 4.3: Example about the rating matrix collected from survey

4.4 Result and Discussion

In the experiment, the data are divided and compared using a 10-fold method, with each fold generated at random. In each fold, 80% of the data is used for training and 20% for testing. Table 4.4 below shows the MAE measurement for each fold; note that a lower MAE value is better than a higher one.

Method | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 | Fold 9 | Fold 10
CF | 1.4612 | 1.4448 | 1.4346 | 1.4407 | 1.3807 | 1.4151 | 1.3912 | 1.3565 | 1.4117 | 1.4490
CF + Mutual Friend | 1.4109 | 1.4178 | 1.392 | 1.3719 | 1.3119 | 1.3164 | 1.3474 | 1.2837 | 1.3392 | 1.3629
CF + Photo in common | 1.4119 | 1.4185 | 1.3949 | 1.3735 | 1.314 | 1.3182 | 1.3495 | 1.2862 | 1.3418 | 1.3646

Table 4.4: The
MAE value of CF method and CF + tie strength method

Figure 4.4: MAE value over 10 folds in graph (series: CF, CF + Mutual Friend, CF + Photo in common; x-axis: Fold 1 to Fold 10 and Average)

The results of the experiment are clearly positive across the ten folds. In every fold, CF + Mutual Friend gives the best result, and CF + Photo in common gives results close to those of CF + Mutual Friend. Both methods outperform the traditional CF method. For a clearer view, consider Figure 4.4: in fold 1, fold 2 and fold 3, the MAE values of CF + Mutual Friend and CF + Photo in common are close to each other and smaller than that of the CF method, but the gap between the CF line and the other two is not large. In the seven remaining folds, this gap becomes larger. Notably, in fold 6, the MAE of the CF method (1.4151) is clearly larger than those of the two other methods (1.3182 and 1.3164). The MAE values of fold 8 are the best, while those of fold 1, fold 2 and fold 10 are quite high. The MAE of CF + Mutual Friend is the smallest in all ten folds, which means that the mutual friend factor is slightly better than the photo in common factor for Recommendation Systems on this data.

Chapter 5
CONCLUSIONS

5.1 Conclusions

In conclusion, this thesis shows the influence of social ties on Recommendation Systems. To this end, traditional Recommendation Systems, Social Recommendation and their algorithms were first introduced (Chapter 2), and the effect of social media on users in Social Recommendation was investigated through social theories. Secondly, the dimensions of social ties (Chapter 2) and how they can affect Recommendation Systems (Chapter 3) were studied. Finally, collaborative filtering with Pearson correlation and collaborative filtering combined with social ties were implemented and compared, and the results are positive, showing the effect of social ties on Recommendation Systems (Chapter 3 and Chapter 4). In order to complete the implementation, a survey was carried out to collect data (including the social tie strength factors) from Facebook users. This work took us much effort. However, it is also a weak point of the thesis because of the limited number of users who took part in the survey.

5.2 Future Works

In the future, firstly, research will be expanded to other dimensions of social ties for Recommendation Systems, such as:

- Duration variable: days since first communication
- Intimacy variables: days since last communication, inbox intimacy words, …
- Emotional support variables: wall and inbox positive emotion words
- Predictive intensity variables: wall words exchanged, inbox messages exchanged, …

Secondly, more data will be collected in order to make the data more objective than in this thesis.

REFERENCES

[1] Tang, Jiliang and Hu, Xia and Liu, Huan, "Social recommendation: a review," Social Network Analysis and Mining, vol. 3, pp. 1113-1133, 2013.
[2] Tang, Jiliang and Tang, Jie and Liu, Huan, "Recommendation in social media: recent advances and new frontiers," in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2014, pp. 1977-1977.
[3] Arazy, Ofer and Kumar, Nanda and Shapira, Bracha, "Improving Social Recommender Systems," IT professional, vol. 11, no. 4, pp. 38-44, 2009.
[4] Tang, Jiliang and Chang, Yi and Liu, Huan, "Mining social media with social theories: A survey," ACM SIGKDD Explorations Newsletter, vol. 15, pp. 20-29, 2014.
[5] Oechslein, Oliver and Hess, Thomas, "The Value of a Recommendation: The Role of Social Ties in Social Recommender Systems," in System Sciences (HICSS), 2014 47th Hawaii International Conference on, IEEE, 2014, pp. 1864-1873.
[6] Arazy, Ofer and Elsane, I. and Shapira, Bracha and Kumar, Nanda, "Social relationships in recommender systems," in Proc. of the 17th Workshop on Information Technologies & Systems, 2007.
[7] Granovetter, Mark S., "The strength of weak ties," American journal of sociology,
pp. 1360-1380, 1973.
[8] Koroleva, Ksenia and Štimac, Vid, "Tie Strength vs Network Overlap: Why Information from Lovers is More Valuable than from Close Friends on Social Network Sites?," Proceedings of the 33rd International Conference on Information Systems (ICIS), pp. 1-17, 2012.
[9] Li, Seth and Karahanna, Elena, "Peer-based recommendations in online B2C e-commerce: comparing collaborative personalization and social network-based personalization," in System Science (HICSS), 2012 45th Hawaii International Conference on, IEEE, 2012, pp. 733-742.
[10] Konstan, Joseph A. and Riedl, John, "Recommender systems: from algorithms to user experience," User Modeling and User-Adapted Interaction, vol. 22, pp. 101-123, 2012.
[11] Ricci, Francesco and Rokach, Lior and Shapira, Bracha, "Introduction to recommender systems handbook," in Recommender systems handbook, Springer, 2011, pp. 1-35.
[12] Ekstrand, Michael D. and Riedl, John T. and Konstan, Joseph A., "Collaborative filtering recommender systems," Foundations and Trends in Human-Computer Interaction, vol. 4, pp. 81-173, 2011.
[13] Pazzani, Michael J. and Billsus, Daniel, "Content-based recommendation systems," in The adaptive web, Springer, 2007, pp. 325-341.
[14] Zafarani, Reza and Abbasi, Mohammad Ali and Liu, Huan, Social media mining: an introduction, Cambridge University Press, 2014.
[15] Ma, Hao and King, Irwin and Lyu, Michael R., "Effective missing data prediction for collaborative filtering," in Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, ACM, 2007, pp. 39-46.
[16] Linden, Greg and Smith, Brent and York, Jeremy, "Amazon.com recommendations: Item-to-item collaborative filtering," Internet Computing, IEEE, vol. 7, pp. 76-80, 2003.
[17] Resnick, Paul and Iacovou, Neophytos and Suchak, Mitesh and Bergstrom, Peter and Riedl, John, "GroupLens: an open architecture for collaborative filtering of netnews," in Proceedings of the 1994 ACM conference on Computer supported cooperative work, ACM, 1994, pp. 175-186.
[18] Sarwar, Badrul and Karypis, George and Konstan, Joseph and Riedl, John, "Item-based collaborative filtering recommendation algorithms," in Proceedings of the 10th international conference on World Wide Web, ACM, 2001, pp. 285-295.
[19] Bell, Robert M. and Koren, Yehuda, "Scalable collaborative filtering with jointly derived neighborhood interpolation weights," in Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on, IEEE, 2007, pp. 43-52.
[20] Gong, SongJie and Ye, HongWu and Tan, HengSong, "Combining memory-based and model-based collaborative filtering in recommender system," in Circuits, Communications and Systems, 2009. PACCS'09. Pacific-Asia Conference on, IEEE, 2009, pp. 690-693.
[21] Burke, Robin, "Hybrid recommender systems: Survey and experiments," User modeling and user-adapted interaction, vol. 12, pp. 331-370, 2002.
[22] Lakshmi, Soanpet Sree and Lakshmi, T. Adi, "Recommendation Systems: Issues and challenges," (IJCSIT) International Journal of Computer Science and Information Technologies, vol. 5, no. 4, pp. 5771-5772, 2014.
[23] Guo, Guibing, "Resolving data sparsity and cold start in recommender systems," in User Modeling, Adaptation, and Personalization, Springer, 2012, pp. 361-364.
[24] Gilbert, Eric and Karahalios, Karrie, "Predicting tie strength with social media," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2009, pp. 211-220.
[25] Leskovec, Jure and Huttenlocher, Daniel and Kleinberg, Jon, "Signed networks in social media," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2010, pp. 1361-1370.
[26] Khanafiah, Deni and Situngkir, Hokky, "Social balance theory: revisiting Heider’s balance theory for many agents," 2004.
[27] Kautz, Henry and Selman, Bart and Shah, Mehul, "Referral Web: combining social networks and collaborative filtering," Communications of the ACM, vol. 40, pp. 63-65, 1997.
[28] Victor, Patricia and Cornelis, Chris and De Cock, Martine and Teredesai, Ankur, "A Comparative Analysis of Trust-Enhanced Recommenders for Controversial Items," in ICWSM, 2009.
[29] Victor, Patricia and De Cock, Martine and Cornelis, Chris, "Trust and recommendations," in Recommender systems handbook, Springer, 2011, pp. 645-675.
[30] Jamali, Mohsen and Ester, Martin, "Trustwalker: a random walk model for combining trust-based and item-based recommendation," in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2009, pp. 397-406.
[31] Golbeck, Jennifer, Generating predictive movie recommendations from trust in social networks, Springer, 2006.
[32] Massa, Paolo and Avesani, Paolo, "Trust-aware recommender systems," in Proceedings of the 2007 ACM conference on Recommender systems, ACM, 2007, pp. 17-24.
[33] Marsden, Peter V. and Campbell, Karen E., "Measuring tie strength," Social forces, vol. 63, no. 2, pp. 482-501, 1984.
[34] Steffes, Erin M. and Burgee, Lawrence E., "Social ties and online word of mouth," Internet research, vol. 19, pp. 42-59, 2009.
[35] Easley, David and Kleinberg, Jon, Networks, crowds, and markets: Reasoning about a highly connected world, Cambridge University Press, 2010.