05 motifs and graphlets


CS224W: Analysis of Networks
Jure Leskovec, Stanford University
http://cs224w.stanford.edu

Network Metrics
• Many metrics at the node level:
  - E.g., node degree, PageRank score, node clustering
• Many metrics at the whole-network level:
  - E.g., diameter, clustering, size of giant component
• What about something in-between?
  - A mesoscale characterization of networks
[Figure (Pedro Ribeiro): three scales of network metrics. Microscopic: single node (e.g., degree, betweenness, closeness). Mesoscopic: ?? Macroscopic: whole network (e.g., average distance, density, clustering coefficient).]

10/9/18, Jure Leskovec, Stanford CS224W: Analysis of Networks

Building Blocks of Networks
• Subnetworks, or subgraphs, are the building blocks of networks
• They have the power to characterize and discriminate networks
[Figure credit: Pedro Ribeiro]

[Figure: subgraph decomposition of an electronic circuit (Oxford Protein Informatics Group)]

Example Application
• Consider all possible (non-isomorphic) directed subgraphs of a given size
[Figure credit: Pedro Ribeiro]
• For each subgraph:
  - Imagine you have a metric capable of classifying the subgraph's "significance" [more on that later]
  - Negative values indicate under-representation
  - Positive values indicate over-representation
• We create a network significance profile:
  - A feature vector with values for all subgraph types
• Next: compare profiles of different networks:
  - Regulatory network (gene regulation)
  - Neuronal network (synaptic connections)
  - World Wide Web (hyperlinks between pages)
  - Social network (friendships)
  - Language networks (word adjacency)

Example Application
[Figure: network significance profiles, with panels for gene regulation networks, neurons, web and social networks, and language networks. Image: Milo et al., Science 2004]
• Different networks have similar significance profiles ("fingerprints")

Example Application
[Figure: network significance profile similarity. Clustering of networks based on their significance profiles, and correlation in significance profile of the English and French language networks. Image: Milo et al., Science 2004]
• Closely related networks have more similar significance profiles

Example Application: Science
[Figure: network significance profiles of co-authorship networks in different scientific areas. The X-axis of the plot lists the subgraph types. Image: Choobdar et al., ASONAM 2012]

Graphlet Degree Vector
• An automorphism orbit takes into account the symmetries of a subgraph
• Graphlet Degree Vector (GDV): a vector with the frequency of the node in each orbit position
• Example: graphlet degree vector of node $v$

For a node $v$ of graph $G$, the automorphism orbit of $v$ is

$$\mathrm{Orb}(v) = \{\, u \in V(G) \;;\; u = f(v) \text{ for some } f \in \mathrm{Aut}(G) \,\}$$

Here $\mathrm{Aut}(G)$ denotes the automorphism group of $G$; an automorphism is an isomorphism from $G$ to itself.
[Figure credit: Pedro Ribeiro]

• Graphlet degree vector counts the number of graphlets that a node touches at a particular orbit
• Considering graphlets on 2 to 5 nodes we get:
  - A vector of 73 coordinates is a signature of a node that describes the topology of the node's neighborhood
  - It captures the node's interconnectivities out to a distance of 4 hops
• Graphlet degree vector provides a measure of a node's local network topology:
  - Comparing the vectors of two nodes provides a highly constraining measure of local topological similarity between them

[Figure 2, from "ergm.graphlets: A Package for ERG Modeling Based on Graphlet Statistics": an illustration of the graphlet signature of node A in a six-node graph (nodes A to F), with a table of GDV(A) entries by orbit. The number of graphlets that touch node A at orbit $i$ is the $i$-th element of the graphlet degree vector of the node (Kuchaiev et al. 2010).]
• Graphlet Degree Vector (GDV) of node A:
  - $i$-th element of GDV(A): number of graphlets that touch A at orbit $i$
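As a rough illustration of the definition above, the following sketch computes a truncated graphlet degree vector that covers only graphlets on 2 and 3 nodes (4 orbits instead of 73). The function name `gdv_up_to_3`, the adjacency-dictionary input format, and the orbit numbering (orbit 0: endpoint of an edge; orbit 1: end of an induced 3-node path; orbit 2: middle of an induced 3-node path; orbit 3: node of a triangle) are assumptions made for this example and do not come from the slides.

```python
from itertools import combinations


def gdv_up_to_3(adj):
    """Truncated graphlet degree vector (orbits 0..3) for every node.

    adj: dict mapping each node to the set of its neighbors
         (undirected graph, no self-loops). Hypothetical input format.
    Returns a dict mapping node -> [orbit0, orbit1, orbit2, orbit3].
    """
    gdv = {v: [0, 0, 0, 0] for v in adj}
    for v in adj:
        # Orbit 0: v touches one 2-node graphlet (an edge) per neighbor.
        gdv[v][0] = len(adj[v])

        # Every pair of neighbors of v forms a 3-node subgraph centered at v.
        for a, b in combinations(adj[v], 2):
            if b in adj[a]:
                gdv[v][3] += 1  # a, b adjacent: triangle, v at orbit 3
            else:
                gdv[v][2] += 1  # induced path a - v - b, v in the middle

        # Orbit 1: v is an end of an induced path v - a - w,
        # so w must differ from v and must not be adjacent to v.
        for a in adj[v]:
            for w in adj[a]:
                if w != v and w not in adj[v]:
                    gdv[v][1] += 1
    return gdv


# Small example: a triangle A-B-C with a pendant node D attached to C.
example = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
example_gdv = gdv_up_to_3(example)
```

In this example, node C sits in the middle of two induced paths (A-C-D and B-C-D) and in one triangle, while D is the end of both paths. A full 73-coordinate vector would extend the same idea to all graphlets on 4 and 5 nodes, which in practice calls for a dedicated counting tool rather than this brute-force sketch.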
• Highlighted in the figure are the graphlets that touch node A at orbits 15, 19, 27, and 35, from left to right
(Yaveroglu et al., Journal of Statistical Software 2015)

• Finding size-k motifs/graphlets requires solving two challenges:
  - 1) Enumerating all size-k connected subgraphs
  - 2) Counting the number of occurrences of each subgraph type
• Just knowing whether a certain subgraph exists in a graph is a hard computational problem!
  - Subgraph isomorphism is NP-complete
• Computation time grows exponentially as the size of the motif/graphlet increases
  - Feasible motif size is usually small (3 to 8)

• Network-centric approaches:
  - Step 1) Enumerate all connected sets of k nodes
  - Step 2) Count subgraphs of each type via graph isomorphism tests
• Algorithms:
  - Exact subgraph enumeration (ESU) [Wernicke 2006]
  - Kavosh [Kashani et al. 2009]
  - Subgraph sampling [Kashtan et al. 2004]
• Today: ESU algorithm

• Two sets:
  - $V_{subgraph}$: currently constructed subgraph (motif)
  - $V_{extension}$: set of candidate nodes to extend the motif
• Idea: starting with a node $v$, add those nodes to the $V_{extension}$ set that have two properties:
  - Their node_id must be larger than that of $v$
  - They must be neighbors of the newly added node but not of any node already in $V_{subgraph}$
• ESU is implemented as a recursive function:
  - The execution of this function can be displayed as a tree-like structure of depth $k$, called the ESU-Tree

[Figure: pseudocode for the algorithm ESU, which enumerates all size-k subgraphs in a given graph $G$. The definition of the exclusive neighborhood $N_{excl}(v, V')$ is given in Section 2.1 of the source paper. Source: Wernicke, IEEE/ACM Transactions on Computational Biology and Bioinformatics 2006.]
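The two-set idea described above can be sketched in Python as follows. This is a minimal illustration written for this text, not the pseudocode from the paper: the function name `esu`, the adjacency-dictionary input, and the use of integer node ids are assumptions of the example. It only covers the enumeration step; classifying each enumerated node set by isomorphism type is left out.

```python
def esu(adj, k):
    """Yield every connected k-node subgraph once, as a frozenset of nodes.

    adj: dict mapping each node id (assumed comparable, e.g. int) to the
         set of its neighbors in an undirected graph. Hypothetical format.
    """

    def extend(v_subgraph, v_extension, v):
        # A leaf of the ESU-Tree: the subgraph has reached size k.
        if len(v_subgraph) == k:
            yield frozenset(v_subgraph)
            return
        # Nodes in or adjacent to the current subgraph; used to keep only
        # the "exclusive" neighbors of the newly added node.
        closed = set(v_subgraph)
        for u in v_subgraph:
            closed |= adj[u]
        v_extension = set(v_extension)
        while v_extension:
            w = v_extension.pop()  # remove an arbitrary candidate
            # New candidates: neighbors of w with id larger than the root v
            # that are neither in nor adjacent to the current subgraph.
            exclusive = {u for u in adj[w] if u > v and u not in closed}
            yield from extend(v_subgraph | {w}, v_extension | exclusive, v)

    for v in adj:
        yield from extend({v}, {u for u in adj[v] if u > v}, v)


# Triangle 1-2-3 with a pendant node 4 attached to node 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
size3 = sorted(tuple(sorted(s)) for s in esu(graph, 3))
```

Because a candidate is removed from the extension set before recursing, and only nodes with an id larger than the root are ever added, each connected node set is reached along a single path of the ESU-Tree, so no duplicate check is needed.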

Date posted: 26/07/2023, 19:35
