
Girvan-Newman Algorithm


How to compute betweenness? How to select the number of clusters?

J Leskovec, A Rajaraman, J Ullman: Mining of Massive Datasets, http://www.mmds.org

Computing betweenness from one starting node
• Goal: compute the betweenness of paths starting at a given node.
• Run a breadth-first search (BFS) from that node.
• Count the number of shortest paths from the starting node to every other node of the network.
• Compute betweenness by working up the tree. If there are multiple shortest paths, count them fractionally.

The algorithm:
• Add edge flows:
  – node flow = 1 + ∑ (flows on the node's child edges)
  – split that flow among the edges to the node's parents based on the parent values (the number of shortest paths reaching each parent)
• Repeat the BFS procedure for each starting node.

[Figure: BFS tree with edge flows. Annotations: "1+1 paths to H, split evenly"; "1+0.5 paths to J, split 1:2"; "1 path to K, split evenly".]

Graph Partitioning
• Methods to break a network into sets of connected components called regions
• Many general approaches
  – Divisive methods: repeatedly identify and remove edges connecting densely connected regions
  – Agglomerative methods: repeatedly identify and merge nodes that likely belong in the same region

[Girvan-Newman ‘02]
• Divisive hierarchical clustering based on the notion of edge betweenness: the number of shortest paths passing through the edge
• Girvan-Newman Algorithm:
  – Works on undirected, unweighted networks
  – Repeat until no edges are left:
    – Calculate the betweenness of the edges
    – Remove the edges with the highest betweenness
  – The connected components are the communities
  – Gives a hierarchical decomposition of the network

Girvan-Newman Algorithm
• Divisive method proposed by Girvan and Newman in 2002
• Uses edge betweenness to identify edges to remove
• Edge betweenness: the total amount of "flow" an edge carries between all pairs of nodes, where a single unit of flow between two nodes divides itself evenly among all shortest paths between them (1/k units flow along each of k shortest paths)

Steps:
1. Calculate the betweenness of all edges.
2. Remove the edge(s) with the highest betweenness.
3. Repeat steps 1 and 2 until the graph is partitioned into as many regions as desired.

[Figure: flow being passed up the BFS tree on an example graph with nodes A through K. Successive panels show B, C, D, and E each keeping their own flow and passing the rest along; one panel notes "No flow yet…". Edge labels shown: 1, 2, ½.]

Computing Edge Betweenness Efficiently
For each node N in the graph (repeat for B, C, etc.):
1. Perform a breadth-first search of the graph starting at node N.
2. Determine the number of shortest paths from N to every other node.
3. Based on these numbers, determine the amount of flow from N to all other nodes that uses each edge.
Finally, divide the summed flow on every edge by 2, since the sum includes the flow from A → B and from B → A, etc.

Example
Breadth-first search is a means for doing many things. Here it is used to compute the number of geodesics (shortest paths) from every node to g in a graph with nodes a, b, c, d, e, f, g.

[Figure: the example graph, built up over several panels. Each node is labelled with d, its BFS distance from g, and w, the number of shortest paths to g:
• g: d=0, w=1
• the two nodes at distance 1: d=1, w=1
• the three nodes at distance 2: d=2, w=2
• the node at distance 3: d=3, w=4
At this point all the information needed for edge betweenness is available. The later panels add edge flows: 2/4 on each of the two edges leaving the d=3 node, 1/2 on two edges, ½(1+2/4) on four edges, and 1/1[1 + ½(1+2/4) + ½(1+2/4) + 1/2] = 3 on an edge into g.]

Note: a and f are like leaves: no geodesic to g from other nodes passes through them.

Complexity
• Edge betweenness for all edges can be computed in time O(mn) (m = #edges, n = #nodes) [Newman 2001].
• Recalculating betweenness after every removal makes the algorithm O(m²n), so it is not feasible for large networks.
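The procedure described above (BFS from each node, count shortest paths, push flow back up the tree, halve the totals, then repeatedly remove the highest-betweenness edges) can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the function names (`graph`, `flows_from`, `edge_betweenness`, `components`, `girvan_newman`) are hypothetical, the graph is assumed undirected, unweighted and free of self-loops, and the seven-node `example` graph is an assumed reconstruction consistent with the d/w labels of the example, since the original figure layout is not recoverable.

```python
from collections import deque

def graph(edges):
    """Build an undirected adjacency dict {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def flows_from(adj, s):
    """Edge flows for shortest paths starting at s (one BFS pass)."""
    dist, paths, order = {s: 0}, {s: 1}, []
    queue = deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in dist:                  # first time v is reached
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:         # u is a BFS parent of v
                paths[v] += paths[u]
    flow = {v: 1.0 for v in order}             # node flow starts at 1
    edge_flow = {}
    for v in reversed(order):                  # work up the tree
        for p in adj[v]:
            if dist[p] == dist[v] - 1:         # split among parents by path count
                share = flow[v] * paths[p] / paths[v]
                edge_flow[frozenset((p, v))] = share
                flow[p] += share
    return edge_flow

def edge_betweenness(adj):
    """Sum the per-source flows and halve them (each pair is counted twice)."""
    total = {}
    for s in adj:
        for e, f in flows_from(adj, s).items():
            total[e] = total.get(e, 0.0) + f
    return {e: f / 2 for e, f in total.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj, k):
    """Remove highest-betweenness edges until there are at least k regions."""
    adj = {u: set(vs) for u, vs in adj.items()}    # work on a copy
    while len(components(adj)) < k and any(adj.values()):
        bet = edge_betweenness(adj)                # recalculated every round
        top = max(bet.values())
        for e, b in bet.items():
            if abs(b - top) < 1e-9:
                u, v = tuple(e)
                adj[u].discard(v)
                adj[v].discard(u)
    return components(adj)

# Assumed reconstruction of the a..g example (edge list is a guess).
example = graph([("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "d"),
                 ("c", "e"), ("f", "d"), ("f", "e"), ("d", "g"), ("e", "g")])
# Two triangles joined by the bridge 3-4.
two_triangles = graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)])

print(flows_from(example, "g")[frozenset(("d", "g"))])
print(sorted(sorted(c) for c in girvan_newman(two_triangles, 2)))
```

Recomputing `edge_betweenness` inside the loop is what drives the overall cost to O(m²n); the sketch favours readability over speed and is only suitable for small graphs.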

Posted: 26/07/2023, 19:42