Research on node ranking in peer-to-peer networks


Specialized reference material in telecommunications: Research on node ranking in peer-to-peer networks

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

Hoàng Cường

Research on node ranking in peer-to-peer networks

UNDERGRADUATE GRADUATION THESIS, FULL-TIME PROGRAM
Major: Information Technology

HANOI - 2010

Research on Node Ranking – Peer to Peer … Hoàng Cường

Acknowledgments

First of all, I would like to express my deep gratitude to Dr. Nguyễn Hoài Sơn, my supervisor and the source of inspiration for my research. I would also like to thank the lecturers of the Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi, who taught and guided us and gave us the best conditions for our studies throughout our undergraduate years, especially during the work on this graduation thesis.

Hanoi, 22 … 2010
Hoàng Cường

ABSTRACT

This paper defines and describes a fully distributed node ranking algorithm for peer-to-peer systems. The research puts forward a new approach to ranking nodes in peer-to-peer networks, synthesizing existing foundations and proposing a new method that is feasible for such networks. Integrating this algorithm into P2P keyword search can produce dramatic benefits, both in effectiveness for users and in reduced network traffic. The incremental search algorithm provided approximately a ten-fold reduction in network traffic for two-word and three-word queries.

Table of Contents

Abstract
List of images
List of tables
Chapter 1: Peer to Peer and Ranking Problem
1.1 Peer to Peer
1.1.1 Peer to Peer overview
1.1.2 Architecture of Peer to Peer Systems
1.1.3 Distributed hash tables
1.2 Ranking in Peer to Peer networks
1.2.1 Introduction
1.2.2 Ranking Roles
1.2.3 Research’s important objects
Chapter 2: Ranking on DHT Peer to Peer Networks
2.1 Chord Protocol
2.2 PageRank
2.2.1 Description
2.2.2 Algorithms
2.3 Distributed Computing
2.3.1 Introduction
2.3.2 Algorithms
2.4 tf-idf
Chapter 3: Building a new algorithm for ranking in Chord networks
3.1 Targets and Missions of Research
3.2 Idea
3.2.1 Major problems to exploit
3.2.2 Ranking Idea
Chapter 4: Ranking on Details
4.1 Ranking algorithm
4.2 Ranking’s features
Chapter 5: Evaluation
Chapter 6: Related Work
Chapter 7: Contributions and future work
References

List of images

Image 1.1.1: Peer to Peer means connected together
Image 1.1.3: Distributed hash tables example
Image 1.2.1: The system must have a ranking engine to find the one
Image 2.1: A 16-node Chord network example
Image 2.2.2: How PageRank works
Image 2.3: Distributed Nodes Graph example
Image 3.2.1: Google almost is not exact
Image 3.2.2: Intersect Idea
Image 3.4: Factor Percent
Image 4.1: Bandwidth is the key to trusted ranking
Image 4.1.2: Example of sub-graph semantic rank
Fig 4: A global graph of both local nodes and external nodes
Fig 5: An external local graph without a strategy
Fig 6: An external local graph
Image 4.2: Eigenvalue
Image 4.2.3: Random walk
Image 4.2.4: (n+1) graph nodes
Image 4.2.5: Graph example - nodes
Image 4.2.6: Multiplication result example
Image 4.2.7: Multiplication result example at iterations

List of tables

Table 3.2.1: PageRank convergence and HITS convergence
Table 3.2.2: PageRank convergence increasing too fast
Table 3.2.3: PageRank convergence is not steady when epsilon is small
Table 3.2.4: HITS convergence (takes more time than PageRank)
Table 5.1: The number of iterations to convergence

Chapter 1: Peer to Peer and Ranking Problem

1.1 Peer to Peer

A peer-to-peer network, commonly abbreviated to P2P, is any distributed network architecture composed of participants that make a portion of their resources (such as processing power, disk storage or network bandwidth) directly available to other network participants, without the need for central coordination instances (such as servers or stable hosts). Peers are both suppliers and consumers of resources, in contrast to the traditional client–server model, where only servers supply and clients consume.

Peer-to-peer was popularized by file sharing systems like Napster. File sharing is the practice of distributing or providing access to digitally stored information, such as computer programs, multimedia (audio, video), documents, or electronic books. It may be implemented through a variety of storage, transmission, and distribution models. Common methods of file sharing include manual sharing using removable media, centralized computer file server installations on
computer networks, World Wide Web-based hyperlinked documents, and the use of distributed peer-to-peer networking.

1.1.1 Peer to Peer overview

In its simplest form, a peer-to-peer (P2P) network is created when two or more PCs are connected and share resources without going through a separate server. A P2P network can be an ad hoc connection, such as a couple of computers connected via a Universal Serial Bus to transfer files. A P2P network can also be a permanent infrastructure that links a half-dozen computers in a small office over copper wires. Or it can be a network on a much grander scale, in which special protocols and applications set up direct relationships among users over the Internet.

The initial use of P2P networks in business followed the deployment in the early 1980s of free-standing PCs. In contrast to the mini-mainframes of the day (e.g. Fujitsu/ICL, IBM AS/400, IBM Mainframe, Unisys, …), which used by over 16,000

Q1 is an N x (n + 1) matrix. Assume N = 6, n = 4. In matrix form:

Matrix Q2:

Similarly, assume the outside nodes are node 5 and node 6 (N = 6, n = 4), with
R[5] = 0.206
R[6] = 0.2862
Sum = 0.206 + 0.2862 = 0.4922

Matrix E:
E = (R[5]/Sum, R[6]/Sum) = (0.418529053, 0.581470947)

A_deal_matrix = Q2AQ1 =
[Matrix: 5 x 5; last row: 0  0  0  0.79073547  0.20926453]

Reconsider the formula, with R[5] = 0.206 and R[6] = 0.2862:
A[5][5] = 0
A[5][6] = 1/2
A[6][5] = 0
A[6][6] = 0
A[5][4] = 1/2
A[6][4] = 1
Sum = 0.4922

[Matrix: the same 5 x 5 matrix with each merged entry written as a sum over the two outside columns.
Last column, rows 1 to 4: 0 = 0 + 0; 0 = 0 + 0; 1/3 = 1/3 + 0; 1 = 1/2 + 1/2
Last row: 0 = 0 + 0; 0 = 0 + 0; 0 = 0 + 0; 0.79073547; 0.20926453]

0.79073547 = (0.206*0.5 + 0.2862*1)/0.4922 (correct)
0.20926453 = (0.206*0 + 0.206*1/2 + 0.2862*0 + 0.2862*0)/0.4922 (correct)

And the matrix A:

Image 4.2.6: Multiplication result example

The idea of
multiplying the values of the entries in A by the two matrices Q1 and Q2, where Q1 is derived from the ranking vector for outside nodes, is key to the approach of A_deal_matrix. It has the effect of distributing the probability flow from the outside nodes in a manner that is proportional to the importance of each outside node in the original PageRank vector.

Recall that the personalization vector in the original PageRank is defined as a uniform vector. Instead, for Rank_local_idea we define the personalization vector Pideal according to the number of outside nodes and the total number of nodes in the graph. More specifically, the i-th entry of Pideal, Pideal[i], can be expressed as follows:

Pideal[i] = 1/N for 1 <= i <= n
Pideal[n + 1] = (N - n)/N

In summary, according to the equation: …

Convergence of Rank_local_idea

Let Rank_ideal be the final ranking vector of Rank_local_idea, where the first n elements are scores for local nodes and the (n+1)-th element is the score for the outside node graph EG. We show that the scores of the first n elements are identical to the true PageRank scores.

Theorem 1: In Rank_ideal, the scores for the first n nodes converge to the true PageRank scores. The score for the (n + 1)-th element, graph EG, converges to the sum of the true PageRank scores of all outside nodes.

Proof: Let R be the true PageRank vector, so that R is the converged stationary distribution for A. Let R’ = … be a vector with n + 1 entries. We also know that R = … It is obvious that R’[i] = R[i] for the first n elements and R’[n + 1] = R[n + 1] + … + R[N]. We will show that R’ is the Rank_local_idea vector. Next, consider a left multiplication with … to obtain the following: … Since A_deal_matrix is stochastic and the Markov chain defined by Rank_local_idea is irreducible and aperiodic, there is a unique stationary distribution for A_deal_matrix. Therefore, R’ = Rank_ideal.

The Rank_local_idea algorithm addresses several applications. One is where some sub-graph of the Web graph has been updated. A second case is when the personalized
authority transfer is limited to the sub-graph. In these cases, the known PageRank scores can potentially be relied on to estimate new ranking scores.

THE PAGERANK_LOCAL ALGORITHM

Unlike the previous scenario, where the PageRank values of outside nodes are known, we now consider scenarios where the PageRank scores are not known a priori. To cover this case, our framework has an approximate solution, Pagerank_local. The key difference is that Pagerank_local cannot differentiate the (previously weighted) contribution of authority from each individual outside page, since these PageRank scores are unknown. Instead, Pagerank_local considers the authority flow from outside nodes assuming they are all equally important. We analyze the L1 distance between the Rank_local_idea scores and the Pagerank_local scores of the sub-graph and show that it is within a constant factor of the L1 distance between the true PageRank scores and uniform scores of the outside nodes. We will show through experiments that Pagerank_local is a good approximation.

A. The Pagerank_local algorithm

The Pagerank_local vector Rapprox is defined as follows: … Pagerank_local adopts the same personalization vector as Rank_local_idea. It, however, defines its own transition matrix, Matrix_A_approx.

B. Matrix_A_approx definition

Matrix_A_approx is an (n + 1) x (n + 1) matrix. It is defined as follows: …

For example, for the graph above:

The matrix A:

New matrix Matrix_A_approx:
[ 0.5 0.5 0 ]
[ 0 0 0 ]
[ 1/3 1/3 0 1/3 ]
[ 0 0 0.5 ]
[ 0 0 0.75 1/4 ]

Calculating local PageRank

Choose alpha = 0.85 (as in PageRank).

At iteration 0:
Rapprox = [1/6, 1/6, 1/6, 1/6, 1/3]
Pideal = [1/6, 1/6, 1/6, 1/6, 1/3]

At iteration 1:
Rapprox = 0.85 * Matrix_A_approx * Rapprox + 0.25 * [0.25, 0.25, 0.25, 0.25, 0.5]
…

Program results after 10 iterations:

Image 4.2.7: Multiplication
result example at iterations

It can clearly be seen that the two orderings of nodes 1 to 4 are the same.

Matrix_A_approx differs from A_deal_matrix only in the last row, since Rank_local_idea does not use knowledge about the PageRank scores of outside nodes in the first n rows. For the first n entries in the last row, the value represents the (average) probability flow accumulated from the (N - n) outside nodes to each local page. The last entry in this (n + 1)-th row of the matrix is the (average) probability flow from outside nodes to other outside nodes. Similar to A_deal_matrix = Q1AQ2, Matrix_A_approx can be formally defined as Matrix_A_approx = Q’1AQ2, where the vector E is replaced by a vector Eapprox in Q’1.

In Matrix_A_approx, the values in the last row are as follows:
1) For the first n values, (1
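
The Pagerank_local iteration described in this section can be sketched in Python. This is a minimal sketch under stated assumptions, not the program used in the thesis. The function name `pagerank_local`, the constants `EXAMPLE_MATRIX` and `P_IDEAL`, the stopping tolerance and the update rule r <- alpha * A^T r + (1 - alpha) * p (the standard personalized PageRank form) are all assumptions. The first four rows of the example matrix are illustrative and only chosen to be row-stochastic; the last row uses the averaged values 0.75 and 1/4 from the example, and the personalization vector is the Pideal given for N = 6, n = 4.

```python
def pagerank_local(matrix, personalization, alpha=0.85, tol=1e-12, max_iter=1000):
    """Iterate r <- alpha * A^T r + (1 - alpha) * p until the L1 change drops below tol.

    matrix is assumed row-stochastic: matrix[i][j] is the probability of moving
    from node i to node j. Returns (scores, iterations_used).
    """
    size = len(matrix)
    scores = list(personalization)  # iteration 0 starts from the personalization vector
    for iteration in range(1, max_iter + 1):
        updated = [
            alpha * sum(matrix[i][j] * scores[i] for i in range(size))
            + (1.0 - alpha) * personalization[j]
            for j in range(size)
        ]
        change = sum(abs(new - old) for new, old in zip(updated, scores))
        scores = updated
        if change < tol:
            return scores, iteration
    return scores, max_iter


# Hypothetical example: 4 local nodes plus one collapsed outside node (n = 4, N = 6).
# Rows 1 to 4 are illustrative assumptions; the last row holds the averaged
# outside-node values 0.75 and 1/4 from the example in the text.
EXAMPLE_MATRIX = [
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [1 / 3, 1 / 3, 0.0, 0.0, 1 / 3],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.75, 0.25],
]
P_IDEAL = [1 / 6, 1 / 6, 1 / 6, 1 / 6, 1 / 3]

if __name__ == "__main__":
    scores, iterations = pagerank_local(EXAMPLE_MATRIX, P_IDEAL)
    print("iterations:", iterations)
    print("scores:", [round(s, 4) for s in scores])
```

Because every row of the matrix sums to 1 and the personalization vector sums to 1, each update keeps the scores summing to 1, so the last entry can be read as the total score assigned to the collapsed outside nodes.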

Posted: 20/11/2012, 11:36
