
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

http://cs224w.stanford.edu

4/26/2020

[Figure: Network community profile plot over k (cluster size). Clusters first get better and better, then worse and worse; the best cluster has ~100 nodes. Large networks have a denser and denser core with small good communities at the edge: a nested core-periphery structure.]

 

Intuition: Self-similarity

An object is self-similar if it is similar to a part of itself: the whole has the same shape as one or more of the parts. The idea is to mimic recursive graph/community growth.

[Figure: Initial graph → recursive expansion.]

Kronecker Product

The Kronecker product is a way of generating self-similar matrices.

[PKDD '05]

[Figure: Initiator graph (3x3) → intermediate stage (9x9) → after the growth phase.]

Kronecker product

The Kronecker product of an N x M matrix A and a K x L matrix B is the (N*K) x (M*L) matrix C = A ⊗ B obtained by replacing each entry a_ij of A with the block a_ij * B.

We define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices.

[PKDD '05]

Kronecker graph: a growing sequence of graphs obtained by iterating the Kronecker product:

K_k = K_{k-1} ⊗ K_1 = K_1 ⊗ K_1 ⊗ ... ⊗ K_1 (k times)

Note: one can easily use multiple initiator matrices (K_1', K_1'', K_1'''), even of different sizes.

[PKDD '05]

For K_1 on N_1 nodes and E_1 edges, the k-th Kronecker power K_k of K_1 has:
- N_1^k nodes
- E_1^k edges

We get the densification power law E(t) ∝ N(t)^a. What is a?

a = log E(t) / log N(t) = log(E_1) / log(N_1)

[PKDD '05]

Continuing to multiply with K_1, we obtain K_4 and so on.

[Figure: K_1 (3x3) → K_2 (9x9) → ... → K_4 adjacency matrix.]


[PKDD '05]

Kronecker graphs have many properties found in real networks:

Properties of static networks:
- Power-law-like degree distribution
- Power-law eigenvalue and eigenvector distributions
- Small diameter

Properties of dynamic networks:
- Densification power law
- Shrinking/stabilizing diameter

[PKDD '05]

Observation: edge ([a,b], [a',b']) is in G ⊗ H iff (a,a') is an edge of G and (b,b') is an edge of H, where [a,b] denotes the node of G ⊗ H built from node a of G and node b of H.

Why? An entry in matrix G ⊗ H is a multiplication of entries in G and H.

[PKDD '05]

Theorem (constant diameter): if G and H have diameter d, then G ⊗ H also has diameter d.

What is the distance between nodes u and v in G ⊗ H?
- Let u = [a,b], v = [a',b'] (using the notation from the last slide). Then (u,v) is an edge of G ⊗ H iff (a,a') is in G and (b,b') is in H.
- The path from a to a' in G takes at most d steps: a_1, a_2, a_3, ..., a_d
- And the path from b to b' in H takes at most d steps: b_1, b_2, b_3, ..., b_d
- Then edge ([a_1,b_1], [a_2,b_2]) is in G ⊗ H
- So it takes at most d steps to get from u to v in G ⊗ H

Consequence: if K_1 has diameter d, then graph K_k also has diameter d.

[PKDD '05]

Stochastic Kronecker graphs:
- Create an N_1 x N_1 probability matrix Θ_1
- Compute the k-th Kronecker power Θ_k
- For each entry p_uv of Θ_k, include edge (u,v) in K_k with probability p_uv

Example:

Θ_1 =
  0.5 0.2
  0.1 0.3

Kronecker multiplication gives the probability p_ij of each edge:

Θ_2 = Θ_1 ⊗ Θ_1 =
  0.25 0.10 0.10 0.04
  0.05 0.15 0.02 0.06
  0.05 0.02 0.15 0.06
  0.01 0.03 0.03 0.09

Flip biased coins to obtain the instance matrix K_2.

What is known about Stochastic Kronecker?

For the undirected Kronecker graph model with 2x2 initiator Θ_1 = [a b; b c], a > b > c:
- Connected, if b + c > 1
- Connected component of size Θ(n), if (a + b)(b + c) > 1
- Constant diameter, if b + c > 1
- Not searchable by a decentralized algorithm

Naïve coin flipping is too slow: it flips a coin for every one of the N^2 entries of Θ_k. Instead, edges can be generated one at a time by an R-MAT-like recursive descent into the initiator matrix; double (duplicate) edges then become a problem and must be discarded.

Given a real network G, we want to estimate the initiator matrix Θ_1 = [a b; b c].

Method 1: Method of moments [Gleich & Owen '09]
- Compare counts of small subgraphs and solve the resulting system of equations
- For each of the 4 subgraphs we get an equation, e.g.: 2 E[#edges] = (a + 2b + c)^k - (a + c)^k, where k = log_2(N)
- Then solve the system of equations by trying all possible values of (a, b, c)

Method 2: Maximum Likelihood Estimation. Here we just explain the formulation and how to compute the likelihood:

arg max_{Θ_1} P(G | Θ_1)

- Naïve estimation takes O(N! N^2):
  - N! for the different node labelings
  - N^2 for traversing the graph adjacency matrix
- Do gradient descent to find a good Θ_1 = [a b; c d]
- We will get this down to O(E)!

KronFit: Maximum likelihood estimation

Given a real graph G, find the Kronecker initiator Θ that maximizes the likelihood: arg max_Θ P(G | Θ).

We then need to (efficiently) calculate P(G | Θ), and maximize over Θ (e.g., using gradient descent).

The likelihood: edge probability and the outcome. Given a graph G and a Kronecker matrix Θ, we calculate the probability that Θ generated G:

P(G | Θ) = ∏_{(u,v) ∈ G} Θ_k[u,v] · ∏_{(u,v) ∉ G} (1 - Θ_k[u,v])

[Figure: Θ_1 → Θ_k (edge-probability matrix) → sampled adjacency matrix G → P(G | Θ).]
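In log space the product becomes a sum; a minimal sketch (the 4-node adjacency matrix is arbitrary, and diagonal entries are treated like any other entry):

```python
import numpy as np

def log_likelihood(G, theta_k):
    """log P(G | Theta) for a fixed node ordering: sum log Theta_k[u,v]
    over present edges plus log(1 - Theta_k[u,v]) over absent ones."""
    return float(np.sum(np.where(G == 1,
                                 np.log(theta_k),
                                 np.log1p(-theta_k))))

theta1 = np.array([[0.5, 0.2],
                   [0.1, 0.3]])
theta2 = np.kron(theta1, theta1)
G = np.array([[1, 0, 1, 0],          # arbitrary example adjacency
              [0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 0, 1, 1]])
print(log_likelihood(G, theta2))     # negative log-probability
```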

[ICML '07]

Nodes are unlabeled, so graphs G' and G'' that differ only by a relabeling should have the same probability: P(G' | Θ) = P(G'' | Θ). One therefore needs to consider all node correspondences σ:

P(G | Θ) = Σ_σ P(G | Θ, σ) P(σ)

All correspondences are a priori equally likely, and there are O(N!) of them.

[ICML '07]

Assume that we have solved the node correspondence problem. Calculating

P(G | Θ, σ) = ∏_{(u,v) ∈ G} Θ_k[σ_u, σ_v] · ∏_{(u,v) ∉ G} (1 - Θ_k[σ_u, σ_v])

takes O(N^2) time.

Node correspondence:
- The permutation σ defines the mapping of nodes of G to the rows and columns of Θ_k
- Randomly search over σ (e.g., swap node labels 1 and 4) to find good mappings

Calculating the likelihood P(G | Θ, σ):
- Calculate the likelihood of the empty graph (G with 0 edges)
- Correct it for the edges that we observe in the graph
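The empty-graph trick can be sketched as follows. The useful fact (used here with a Taylor approximation log(1-p) ≈ -p - p²/2, an assumption of this sketch rather than the exact KronFit code) is that sums over Θ_k factorize: Σ Θ_k = (Σ Θ_1)^k and Σ Θ_k² = (Σ Θ_1²)^k. So the empty-graph term is O(1), and only the E observed edges need individual corrections:

```python
import numpy as np

def edge_prob(u, v, theta1, k):
    # Theta_k[u, v] factorizes into a product over the k bit
    # positions of the node ids u and v.
    p = 1.0
    for i in range(k):
        p *= theta1[(u >> i) & 1, (v >> i) & 1]
    return p

def approx_log_likelihood(edges, theta1, k):
    """O(E) sketch: approximate log-likelihood of the empty graph via
    factorizing sums, then correct only the observed edges."""
    s1 = theta1.sum() ** k           # sum of all entries of Theta_k
    s2 = (theta1 ** 2).sum() ** k    # sum of squared entries
    ll = -s1 - 0.5 * s2              # empty graph: sum log(1 - p) approx.
    for (u, v) in edges:
        p = edge_prob(u, v, theta1, k)
        ll += np.log(p) + p + 0.5 * p ** 2   # swap (1-p) term for log p
    return ll

theta1 = np.array([[0.9, 0.5],
                   [0.5, 0.3]])
print(approx_log_likelihood([(0, 1), (1, 2)], theta1, 3))
```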

Experimental setup:
- Given a real graph G
- Gradient descent from a random initial point
- Obtain estimated parameters Θ = [a b; c d]
- Generate a synthetic graph K using Θ
- Compare the properties of graphs G and K

Note: we do not fit the graph properties themselves; we fit the likelihood and then compare the properties.

[ICML '07]

Real and Kronecker properties are very close:

Θ_1 =
  0.99 0.54
  0.49 0.13

What do the estimated parameters tell us about the network structure?

[Figure: Θ_1 = [a b; c d]; the four entries correspond to the four quadrants of the adjacency matrix: a edges, b edges, c edges, d edges.]

[JMLR '10]

What do the estimated parameters tell us about the network structure?

Θ_1 =
  0.9 0.5
  0.5 0.1

[Figure: Nested core-periphery. The core is held together by 0.9 edges, the periphery by 0.1 edges, and core-periphery links by 0.5 edges; the structure repeats recursively.]

[JMLR '10]

Small and large networks are very different:

Θ =
  0.99 0.17
  0.17 0.82

vs.

Θ =
  0.99 0.54
  0.49 0.13

Large-scale network structure:
- Large networks are different from small networks and manifolds
- Nested core-periphery: a recursive, onion-like structure in which each layer decomposes into a core and a periphery

Remember the SKG theorems, applied to Θ_1 = [0.99 0.55; 0.55 0.15]:
- Connected, if b + c > 1: is 0.55 + 0.15 > 1? No!
- Giant component, if (a + b)(b + c) > 1: is (0.99 + 0.55)(0.55 + 0.15) = 1.078 > 1? Yes!

Real graphs are in the parameter region analogous to the giant component of an extremely sparse G_np, between p = 1/n and p = log(n)/n.

Each node has a set of categorical attributes. Example:
- Gender: Male, Female
- Home country: US, Canada, Russia, etc.

How do node attributes influence link formation?

Link probability that u is friends with v, by gender:

              v: FEMALE   v: MALE
u: FEMALE        0.3        0.6
u: MALE          0.6        0.2

Let the values of the i-th attribute for nodes u and v be a_i(u) and a_i(v); they take values in {0, ..., d_i - 1}.

Question: how can we capture the influence of the attributes on link formation?

Attribute matrix Θ:

              a_i(v) = 0   a_i(v) = 1
a_i(u) = 0      Θ[0,0]       Θ[0,1]
a_i(u) = 1      Θ[1,0]       Θ[1,1]

P(u,v) = Θ[a_i(u), a_i(v)]

Each entry of the attribute matrix captures the probability of a link between two nodes with those attribute values.

Flexibility in the network structure:

Homophily: love of the same (e.g., political parties, hobbies)
  0.9 0.1
  0.1 0.8

Heterophily: love of the opposite (e.g., genders)
  0.2 0.9
  0.9 0.1

Core-periphery: love of the core (e.g., extrovert personalities)
  0.9 0.5
  0.5 0.2

How do we combine the effects of multiple attributes? Multiply the probabilities from all attributes.

Example: node attributes a(u) = [0 0 1 0] and a(v) = [0 1 1 0], with attribute matrices Θ_i = [α_i β_i; β_i γ_i]:

P(u,v) = α_1 × β_2 × γ_3 × α_4 (link probability)

Multiplicative Attribute Graph M(n, l, a, Θ):
- A network contains n nodes
- Each node has l categorical attributes; a_i(u) is the i-th attribute of node u
- Each attribute a_i(·) is linked to a d_i × d_i attribute link-affinity matrix Θ_i
- Edge probability between nodes u and v:

P(u,v) = ∏_{i=1}^{l} Θ_i[a_i(u), a_i(v)]
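The edge-probability formula is a one-line product; a sketch using the three example affinity patterns from above (homophily, heterophily, core-periphery; the specific attribute values are arbitrary):

```python
import numpy as np

def mag_link_prob(a_u, a_v, thetas):
    """MAG model M(n, l, a, Theta): P(u,v) is the product over the
    l attributes of Theta_i[a_i(u), a_i(v)]."""
    p = 1.0
    for i, theta in enumerate(thetas):
        p *= theta[a_u[i], a_v[i]]
    return p

# one homophily, one heterophily, one core-periphery attribute
thetas = [np.array([[0.9, 0.1], [0.1, 0.8]]),
          np.array([[0.2, 0.9], [0.9, 0.1]]),
          np.array([[0.9, 0.5], [0.5, 0.2]])]
a_u = [0, 0, 1]
a_v = [0, 1, 1]
print(mag_link_prob(a_u, a_v, thetas))  # 0.9 * 0.9 * 0.2
```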

The initiator matrix K_1 acts like an affinity matrix. Probability of a link between nodes u and v:

P(u,v) = ∏_{i=1}^{k} K_1(A_u(i), A_v(i))

where A_u(i) is the i-th bit of node u's binary id.

Example: K_1 = [a b; c d], v_2 = (0,1), v_3 = (1,0):

P(v_2, v_3) = b · c

Each node in a Kronecker graph has a node id (e.g., 0, ..., 2^l - 1). The binary representation of the node id is its attribute vector in a MAG model. Then the (stochastic) adjacency matrices of the two models are equivalent.

Example: K = K_1 ⊗ K_1 with K_1 = [a b; c d] on nodes v_0, ..., v_3; a(v_1) = [0 1], a(v_2) = [1 0], so P(v_1, v_2) = b · c.
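This equivalence can be checked numerically for small k (the 2x2 probability initiator here is arbitrary; bits are read most-significant first, matching the block structure of the Kronecker power):

```python
import numpy as np

# Check: Theta_k[u, v] equals the MAG product of K1[bit_i(u), bit_i(v)]
# over the k bit positions of the node ids.
K1 = np.array([[0.9, 0.5],
               [0.4, 0.2]])
k = 3
theta_k = K1
for _ in range(k - 1):
    theta_k = np.kron(theta_k, K1)

def mag_prob(u, v):
    # most-significant bit first, matching Kronecker block order
    bits = lambda x: [(x >> (k - 1 - i)) & 1 for i in range(k)]
    p = 1.0
    for bu, bv in zip(bits(u), bits(v)):
        p *= K1[bu, bv]
    return p

ok = all(np.isclose(theta_k[u, v], mag_prob(u, v))
         for u in range(2 ** k) for v in range(2 ** k))
print(ok)  # True
```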

2 ingredients of the Kronecker model:
- (1) Each of the 2^k nodes has a unique binary vector of length k (the node id expressed in binary is the vector)
- (2) The initiator matrix K_1

Question: what if ingredient (1) is dropped? I.e., do we need high variability of feature vectors?

Adjacency matrices:
