09 node2vec (graph representation learning)

CS224W: Analysis of Networks
Jure Leskovec, Stanford University
http://cs224w.stanford.edu
10/23/18

[Figures: machine learning on networks, e.g., node classification, where the labels of the nodes marked "?" are predicted]

(Supervised) machine learning lifecycle: requires feature engineering every single time!
Raw Data -> Structured Data -> Learning Algorithm -> Model -> Downstream task
- Instead of feature engineering between raw data and structured data: automatically learn the features

Goal: Efficient task-independent feature learning for machine learning in networks!
- node $u$ -> vec: $f: u \to \mathbb{R}^d$
- $\mathbb{R}^d$: feature representation, embedding

What is network embedding?
- Task: map each node in a network into a low-dimensional space
  - Distributed representation for nodes
  - Similarity between node embeddings indicates their network similarity (link strength)
  - Encode network information and generate node representations

Example: 2D embedding of the nodes of Zachary's Karate Club network
[Figure. Image from: Perozzi et al., DeepWalk: Online Learning of Social Representations, KDD 2014]

- Modern deep learning toolbox is designed for simple sequences or grids
  - CNNs for fixed-size images/grids
  - RNNs or word2vec for text/sequences

- But networks are far more complex!
  - Complex topological structure (i.e., no spatial locality like grids)
  - No fixed node ordering or reference point (i.e., the isomorphism problem)
  - Often dynamic, with multimodal features

- Different kinds of biased random walks:
  - Based on node attributes (Dong et al., 2017)
  - Based on learned weights (Abu-El-Haija et al., 2017)
- Alternative optimization schemes:
  - Directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE from Tang et al., 2015)
- Network preprocessing techniques:
  - Run random walks on modified versions of the original network (e.g., Ribeiro et al. 2017's struct2vec, Chen et al. 2016's HARP)

How to use embeddings $z_i$ of nodes:
- Clustering/community detection: cluster points $z_i$
- Node classification: predict label $f(z_i)$ of node $i$ based on $z_i$
- Link prediction: predict edge $(i, j)$ based on $f(z_i, z_j)$
  - We can concatenate, average, take a product of, or take a difference between the embeddings:
  - Concatenate: $f(z_i, z_j) = g([z_i, z_j])$
  - Hadamard: $f(z_i, z_j) = g(z_i * z_j)$ (per-coordinate product)
  - Sum/Avg: $f(z_i, z_j) = g(z_i + z_j)$
  - Distance: $f(z_i, z_j) = g(\lVert z_i - z_j \rVert_2)$

- Basic idea: Embed nodes so that distances in embedding space reflect node similarities in the original network
- Different notions of node similarity:
  - Adjacency-based (i.e., similar if connected)
  - Multi-hop similarity definitions
  - Random walk approaches (covered today)

So what method should I use?
- No one method wins in all cases.
  - E.g., node2vec performs better on node classification, while multi-hop methods perform better on link prediction (Goyal and Ferrara, 2017 survey)
- Random walk approaches are generally more efficient
- In general: you must choose a definition of node similarity that matches your application!

- Goal: embed an entire graph $G$ as a vector $z_G$
- Tasks:
  - Classifying toxic vs. non-toxic molecules
  - Identifying anomalous graphs

Simple idea:
- Run a standard graph embedding technique on the (sub)graph $G$
- Then just sum (or average) the node embeddings in the (sub)graph $G$:
  $$z_G = \sum_{v \in G} z_v$$
- Used by Duvenaud et al., 2016 to classify molecules based on their graph structure
(Source: Representation Learning on Networks, snap.stanford.edu/proj/embeddings-www, WWW 2018)

- Idea: Introduce a "virtual node" to represent the (sub)graph and run a standard graph embedding technique
- Proposed by Li et al., 2016 as a general technique for subgraph embedding

States in an anonymous walk correspond to the index of the first time we visited the node in a random walk.
(Source: Anonymous Walk Embeddings, ICML 2018, https://arxiv.org/pdf/1805.11921.pdf)

Number of anonymous walks grows exponentially:
- There are 5 anonymous walks $a_i$ of length 3: $a_1 = 111$, $a_2 = 112$, $a_3 = 121$, $a_4 = 122$, $a_5 = 123$

- Enumerate all possible anonymous walks $a_i$ of $l$ steps and record their counts
- Represent the graph as a probability distribution over these walks
- For example:
  - Set $l = 3$
  - Then we can represent the graph as a 5-dimensional vector
  - Since there are 5 anonymous walks $a_i$ of length 3: 111, 112, 121, 122, 123
  - $Z_G[i]$ = probability of anonymous walk $a_i$ in $G$

- Complete counting of all anonymous walks in a large graph may be infeasible
- Sampling approach to approximating the true distribution: independently generate a set of $m$ random walks and calculate the corresponding empirical distribution of anonymous walks
- How many random walks $m$ do we need?
  - We want the distribution to have error of more than $\varepsilon$ with probability less than $\delta$:
    $$m = \left\lceil \frac{2}{\varepsilon^2} \left( \log(2^{\eta} - 2) - \log \delta \right) \right\rceil$$
    where $\eta$ is the number of anonymous walks of length $l$
  - For example: there are $\eta = 877$ anonymous walks of length $l = 7$. If we set $\varepsilon = 0.1$ and $\delta = 0.01$, then we need to generate $m = 122500$ random walks

- Learn an embedding $z_i$ of every anonymous walk $a_i$
- The embedding of a graph $G$ is then the sum/avg/concatenation of the walk embeddings $z_i$
How to embed walks?
- Idea: Embed walks such that the next walk can be predicted
  - Set $z_i$ such that we maximize $P(w_t^u \mid w_{t-\Delta}^u, \ldots, w_{t-1}^u) = f(z)$
  - where $w_t^u$ is the $t$-th random walk starting at node $u$

- Run $T$ different random walks from $u$, each of length $l$: $N_R(u) = \{w_1^u, w_2^u, \ldots, w_T^u\}$
  - Let $a_i$ be the anonymous version of walk $w_i$
- Learn to predict walks that co-occur in a $\Delta$-size window
- Estimate the embedding $z_i$ of the anonymous walk $a_i$ of $w_i$:
  $$\max \frac{1}{T} \sum_{t=\Delta}^{T} \log P(w_t \mid w_{t-\Delta}, \ldots, w_{t-1})$$
  where $\Delta$ is the context window size
  - $P(w_t \mid w_{t-\Delta}, \ldots, w_{t-1}) = \dfrac{\exp(y(w_t))}{\sum_{i=1}^{\eta} \exp(y(w_i))}$
  - $y(w_t) = b + U \cdot \left( \dfrac{1}{\Delta} \sum_{i=1}^{\Delta} z_i \right)$
  - where $b \in \mathbb{R}$, $U \in \mathbb{R}^D$, and $z_i$ is the embedding of the anonymized version of walk $w_i$

We discussed three approaches to graph embeddings:
- Approach 1: Embed nodes and sum/avg them
- Approach 2: Create a super-node that spans the (sub)graph and then embed that node
- Approach 3: Anonymous Walk Embeddings
  - Idea 1: Represent the graph via the distribution over all the anonymous walks
  - Idea 2: Sample the walks to approximate the distribution
  - Idea 3: Embed anonymous walks
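
Ideas 1 and 2 of the anonymous walk approach (anonymize each walk, then use the empirical distribution over sampled walks as the graph representation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy graph, the uniform choice of start node, and the helper names `anonymize`, `random_walk`, and `anonymous_walk_distribution` are all assumptions made for the example. Length is counted in nodes, matching the "length 3" walks 111, 112, 121, 122, 123 above.

```python
import random
from collections import Counter


def anonymize(walk):
    """Replace each node by the 1-based index of its first appearance in the walk."""
    first_seen = {}
    states = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        states.append(first_seen[node])
    return tuple(states)


def random_walk(adj, start, length, rng):
    """Uniform random walk visiting `length` nodes (hypothetical helper)."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def anonymous_walk_distribution(adj, length, num_walks, seed=0):
    """Empirical distribution of anonymous walks over `num_walks` sampled walks.

    Assumption: each walk starts at a node drawn uniformly at random.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    counts = Counter()
    for _ in range(num_walks):
        start = rng.choice(nodes)
        counts[anonymize(random_walk(adj, start, length, rng))] += 1
    return {pattern: c / num_walks for pattern, c in counts.items()}


# Toy undirected graph (an assumption): triangle A-B-C with a pendant node D on C.
adj = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

dist = anonymous_walk_distribution(adj, length=3, num_walks=1000)
for pattern, prob in sorted(dist.items()):
    print(pattern, round(prob, 3))
```

On a simple graph without self-loops, consecutive nodes always differ, so only the patterns 121 and 123 can appear among the five possible anonymous walks of length 3; the other three entries of the 5-dimensional vector are zero.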
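
The sample-size example (877 anonymous walks of length 7, giving 122500 random walks) can be checked numerically. The sketch below assumes the bound from the Anonymous Walk Embeddings paper, $m = \lceil \frac{2}{\varepsilon^2} (\log(2^{\eta} - 2) - \log \delta) \rceil$, and uses the fact that anonymous walks over $l$ nodes are counted by the Bell numbers; the function names are illustrative only.

```python
import math


def count_anonymous_walks(length):
    """Number of anonymous walks visiting `length` nodes (the Bell number B(length)).

    Computed with the Bell triangle: each row starts with the last entry of the
    previous row, and each further entry adds the entry above it.
    """
    row = [1]
    for _ in range(length - 1):
        new_row = [row[-1]]
        for above in row:
            new_row.append(new_row[-1] + above)
        row = new_row
    return row[-1]


def walks_needed(eta, eps, delta):
    """Number of sampled walks m for error > eps with probability < delta."""
    return math.ceil((2.0 / eps ** 2) * (math.log(2 ** eta - 2) - math.log(delta)))


eta = count_anonymous_walks(7)
print(eta)                            # number of anonymous walks of length 7
print(walks_needed(eta, 0.1, 0.01))   # number of random walks to sample
```

Because the required $m$ grows only linearly in $\eta$ while $\eta$ itself grows very quickly with the walk length, sampling stays practical for moderate lengths where exhaustive enumeration of all walks in a large graph would not.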
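
Returning to the node-level use of embeddings: the four ways of combining a pair of node embeddings $z_i$, $z_j$ into an edge feature for link prediction (concatenate, Hadamard, sum/avg, distance) can be sketched in plain Python. The downstream function $g$ (for example a classifier) is left out, and the function names and example vectors are assumptions made for illustration.

```python
import math


def concatenate(z_i, z_j):
    """[z_i, z_j]: keeps both embeddings, so the feature is order-dependent."""
    return list(z_i) + list(z_j)


def hadamard(z_i, z_j):
    """Per-coordinate product z_i * z_j."""
    return [a * b for a, b in zip(z_i, z_j)]


def average(z_i, z_j):
    """(z_i + z_j) / 2; the plain sum differs only by a constant factor."""
    return [(a + b) / 2 for a, b in zip(z_i, z_j)]


def l2_distance(z_i, z_j):
    """||z_i - z_j||_2 returned as a one-element feature vector."""
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(z_i, z_j)))]


# Hypothetical 2-dimensional embeddings of two nodes.
z_i = [1.0, 2.0]
z_j = [3.0, 4.0]

for op in (concatenate, hadamard, average, l2_distance):
    print(op.__name__, op(z_i, z_j))
```

Hadamard, average, and distance are symmetric in $i$ and $j$, which suits undirected edges; concatenation is not, which may be preferable when edge direction matters.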
