
03 link analysis pagerank


CS224W: Analysis of Networks
Jure Leskovec, Stanford University
http://cs224w.stanford.edu

• Today we will look at what the Web graph looks like:
  ▪ 1) We will take a real system: the Web
  ▪ 2) We will represent it as a directed graph
  ▪ 3) We will use the language of graph theory: Strongly Connected Components
  ▪ 4) We will design a computational experiment: find the In- and Out-components of a given node v
  ▪ 5) We will learn something about the structure of the Web: BOWTIE!

10/2/18, Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu

Q: What does the Web "look like" at a global level?
• Web as a graph:
  ▪ Nodes = web pages
  ▪ Edges = hyperlinks
  ▪ Side issue: What is a node?
    ▪ Dynamic pages created on the fly
    ▪ "Dark matter": inaccessible, database-generated pages

[Figure: example web pages as nodes joined by hyperlinks: "I teach a class on Networks", "CS224W: Classes are in the Huang building", "Computer Science Department at Stanford", "Stanford University"]

• In the early days of the Web, links were navigational
• Today many links are transactional (used not to navigate from page to page, but to post, comment, like, buy, …)

[Figures: other information networks: citations; references in an encyclopedia]

• How is the Web linked?
• What is the "map" of the Web?
Web as a directed graph [Broder et al. 2000]:
  ▪ Given node v, what can v reach?
  ▪ What other nodes can reach v?
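Both reachability questions reduce to graph search: a breadth-first search along the edges gives everything v can reach, and the same search on the reversed graph gives everything that can reach v. The sketch below is only an illustration; the helper names `reachable` and `reverse` and the toy edge list are assumptions, not the graph drawn on the slides.

```python
from collections import deque


def reachable(adj, start):
    """Return the set of nodes reachable from `start` (including itself) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def reverse(adj):
    """Return the graph with every edge u -> w flipped to w -> u."""
    rev = {u: [] for u in adj}
    for u, nbrs in adj.items():
        for w in nbrs:
            rev.setdefault(w, []).append(u)
    return rev


# Hypothetical toy graph (NOT the one from the slide figure).
edges = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": [], "E": ["A"]}

out_a = reachable(edges, "A")           # what A can reach
in_a = reachable(reverse(edges), "A")   # what can reach A
scc_a = out_a & in_a                    # strongly connected component containing A
print(sorted(out_a), sorted(in_a), sorted(scc_a))
```

Intersecting the two sets yields the strongly connected component that contains the start node, which is why two searches per node are enough for this kind of experiment.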
[Figure: example directed graph on nodes A, B, C, D, E, F, G]
In(v) = {w | w can reach v}
Out(v) = {w | v can reach w}
For example: In(A) = {A, B, C, E, G}, Out(A) = {A, B, C, D, F}

• Two types of directed graphs:
  ▪ Strongly connected: any node can reach any node via a directed path
    [Figure: strongly connected graph on nodes A, B, C, D, E] In(A) = Out(A) = {A, B, C, D, E}
  ▪ Directed Acyclic Graph (DAG): has no cycles; if u can reach v, then v cannot reach u
    [Figure: DAG on nodes A, B, C, D, E]
• Any directed graph (the Web) can be expressed in terms of these two types!
  ▪ Is the Web a big strongly connected graph or a DAG?

• Input: Graph $G$ and parameter $\beta$
  ▪ Directed graph $G$ with spider traps and dead ends
  ▪ Parameter $\beta$
• Output: PageRank vector $r$
  ▪ Set: $r_j^{(0)} = \frac{1}{N}$, $t = 1$
  ▪ do:
    ▪ $\forall j$: $r'^{(t)}_j = \sum_{i \to j} \beta \, \frac{r_i^{(t-1)}}{d_i}$, where $d_i$ is the out-degree of node $i$
      $r'^{(t)}_j = 0$ if the in-degree of $j$ is 0
    ▪ Now re-insert the leaked PageRank:
      $\forall j$: $r_j^{(t)} = r'^{(t)}_j + \frac{1 - S}{N}$, where $S = \sum_j r'^{(t)}_j$
    ▪ $t = t + 1$
  ▪ while $\sum_j \left| r_j^{(t)} - r_j^{(t-1)} \right| > \varepsilon$

[Figure: example graph; node size is proportional to the PageRank score]
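The PageRank iteration with re-insertion of the leaked score can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a tuned implementation: the function name `pagerank`, the default β = 0.85, the tolerance, the iteration cap, and the toy graph are all choices made here rather than values given in the slides.

```python
def pagerank(out_links, beta=0.85, eps=1e-10, max_iter=1000):
    """Power iteration that re-inserts score leaked by teleports and dead ends.

    out_links maps every node to the list of nodes it links to;
    every link target must also appear as a key.
    beta, eps and max_iter are assumed defaults, not values from the slides.
    """
    nodes = list(out_links)
    n = len(nodes)
    r = {v: 1.0 / n for v in nodes}                 # r_j^(0) = 1/N
    for _ in range(max_iter):
        r_new = {v: 0.0 for v in nodes}             # stays 0 if in-degree is 0
        for i in nodes:
            d_i = len(out_links[i])
            for j in out_links[i]:
                r_new[j] += beta * r[i] / d_i       # r'_j = sum_{i->j} beta * r_i / d_i
        s = sum(r_new.values())                     # S = sum_j r'_j
        for v in nodes:
            r_new[v] += (1.0 - s) / n               # re-insert the leaked score evenly
        diff = sum(abs(r_new[v] - r[v]) for v in nodes)
        r = r_new
        if diff <= eps:                             # stop once sum_j |change| <= eps
            break
    return r


# Hypothetical toy graph; "d" is a dead end (no out-links).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": []}
scores = pagerank(graph)
print({v: round(x, 3) for v, x in scores.items()})
```

Because the leaked amount 1 − S is redistributed evenly on every pass, the scores always sum to 1, whether the leak comes from the (1 − β) teleport share or from dead ends.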
• Given: conferences-to-authors graph
• Goal: proximity on graphs
[Figure: bipartite graph linking conferences (IJCAI, KDD, ICDM, SDM, AAAI, NIPS, …) to authors (Philip S. Yu, Ning Zhong, R. Ramakrishnan, M. Jordan, …)]

[Figure: example graph with numbered nodes]

• Goal: evaluate pages not just by popularity but by how close they are to the topic
• Teleporting can go to:
  ▪ Any page with equal probability: PageRank (we used this so far)
  ▪ A topic-specific set of "relevant" pages: topic-specific (personalized) PageRank, with teleport set $S$:
    $M'_{ij} = \beta M_{ij} + (1 - \beta)/|S|$ if $i \in S$
    $M'_{ij} = \beta M_{ij}$ otherwise
  ▪ A single page/node ($|S| = 1$): Random Walk with Restarts

• Graphs and web search:
  ▪ Ranks nodes by "importance"
• Personalized PageRank:
  ▪ Ranks proximity of nodes to the teleport set $S$
• Proximity on graphs:
  ▪ Q: What is the most related conference to ICDM?
  ▪ Random Walks with Restarts
  ▪ Teleport back to the starting node: S = {single node}
[Figure: the conferences-to-authors graph]

[Figure: example graph with nodes 1 to 12 and teleport set S = {4}; per-node scores in the ranking vector range from 0.02 to 0.13]
Notice: nearby nodes have higher scores (are more red)

[Figure: graph of conferences around ICDM (PKDD, SDM, PAKDD, KDD, CIKM, ICML, ICDE, ECML, SIGMOD, DMKD), with scores between 0.004 and 0.011]

Q: Which conferences are closest to KDD & ICDM?
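Topic-specific teleporting changes only where the walker lands when it jumps: the (1 − β) share is spread over the teleport set S instead of over all nodes. The sketch below illustrates this with a power iteration. It assumes a graph without dead ends, and the function name `personalized_pagerank`, the default β = 0.85, and the small conference/author graph (with placeholder authors a1 to a4) are hypothetical, not data from the slides.

```python
def personalized_pagerank(out_links, teleport_set, beta=0.85, eps=1e-10, max_iter=1000):
    """Power iteration in which teleports land only on nodes in teleport_set.

    Assumes every node has at least one out-link (no dead ends).
    beta, eps and max_iter are assumed defaults, not values from the slides.
    """
    nodes = list(out_links)
    s = set(teleport_set)
    r = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(max_iter):
        # Teleport share (1 - beta) is split evenly among the nodes of S.
        r_new = {v: (1.0 - beta) / len(s) if v in s else 0.0 for v in nodes}
        for i in nodes:
            share = beta * r[i] / len(out_links[i])
            for j in out_links[i]:
                r_new[j] += share
        diff = sum(abs(r_new[v] - r[v]) for v in nodes)
        r = r_new
        if diff <= eps:
            break
    return r


# Hypothetical conference/author graph; each undirected edge is stored in both directions.
graph = {
    "ICDM": ["a1", "a2"],
    "KDD": ["a1", "a2", "a3"],
    "NIPS": ["a3", "a4"],
    "a1": ["ICDM", "KDD"],
    "a2": ["ICDM", "KDD"],
    "a3": ["KDD", "NIPS"],
    "a4": ["NIPS"],
}

# |S| = 1 gives a Random Walk with Restarts from ICDM.
rwr = personalized_pagerank(graph, {"ICDM"})
others = sorted((c for c in ("KDD", "NIPS")), key=lambda c: rwr[c], reverse=True)
print(others)
```

Passing a larger set, for example `{"KDD", "ICDM"}`, uses the same code path; only the teleport distribution changes.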
[Figure: graph of CS conferences]
A: Personalized PageRank with teleport set S = {KDD, ICDM}

• The problem of measuring "similarity" of objects (nodes in a graph) arises in many applications
• The approach should be applicable in any domain with object-to-object relationships
• SimRank intuition: two objects are similar if they are related to similar objects

• We model objects and relationships as a directed graph $G = (V, E)$
• For a node $v$ in a graph, we denote by $I(v)$ and $O(v)$ the sets of in-neighbors and out-neighbors
• Basic SimRank equation:
  ▪ If $a = b$ then $s(a, b)$ is defined to be 1. Otherwise,
    $s(a, b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} s(I_i(a), I_j(b))$
    where $C$ is a constant between 0 and 1
  ▪ Set $s(a, b) = 0$ when $I(a) = \emptyset$ or $I(b) = \emptyset$

• Computing SimRank: naive method
  ▪ $R_0(a, b)$ is initialized with the lower bound on $s(a, b)$:
    $R_0(a, b) = 0$ if $a \neq b$
    $R_0(a, b) = 1$ if $a = b$
  ▪ To compute $R_{k+1}(a, b)$ from $R_k(\cdot, \cdot)$:
    $R_{k+1}(a, b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b))$ when $a \neq b$
    $R_{k+1}(a, b) = 1$ when $a = b$
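The naive SimRank iteration can be sketched directly from the two update rules. The code below is an illustration under assumptions: the function name `simrank`, the decay constant C = 0.8, the fixed iteration count, and the five-node toy graph are choices made here, not values from the slides.

```python
def simrank(in_nbrs, C=0.8, iterations=10):
    """Naive SimRank: iterate R_{k+1} from R_k over all node pairs.

    in_nbrs maps every node to the list of its in-neighbors I(v).
    C and iterations are assumed values, not taken from the slides.
    """
    nodes = list(in_nbrs)
    # R_0(a, b) = 1 if a == b, else 0 (the lower bound on s(a, b)).
    R = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        R_next = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    R_next[(a, b)] = 1.0
                elif not in_nbrs[a] or not in_nbrs[b]:
                    R_next[(a, b)] = 0.0            # no in-neighbors: similarity is 0
                else:
                    total = sum(R[(x, y)] for x in in_nbrs[a] for y in in_nbrs[b])
                    R_next[(a, b)] = C * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        R = R_next
    return R


# Hypothetical graph: u -> a, u -> b, a -> x, b -> y (stored as in-neighbor lists).
in_nbrs = {"u": [], "a": ["u"], "b": ["u"], "x": ["a"], "y": ["b"]}
sim = simrank(in_nbrs)
print(sim[("a", "b")], sim[("x", "y")])
```

In this toy graph a and b share the single in-neighbor u, so their similarity equals C; x and y inherit similarity from a and b one step later, scaled by C again. Each pass touches every node pair, which is what makes the naive method expensive on large graphs.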

Date posted: 26/07/2023, 19:35
