Influence propagation in large graphs - theorems and algorithms
B. Aditya Prakash
http://www.cs.cmu.edu/~badityap
Christos Faloutsos
http://www.cs.cmu.edu/~christos
Carnegie Mellon University
AUTH’12
Networks are everywhere!
Facebook Network [2010]
Gene Regulatory Network
[Decourty 2008]
Human Disease Network
[Barabasi 2007]
The Internet [2005]
AUTH '12
Prakash and Faloutsos 2012
Dynamical Processes over networks
are also everywhere!
Why do we care?
• Information Diffusion
• Viral Marketing
• Epidemiology and Public Health
• Cyber Security
• Human mobility
• Games and Virtual Worlds
• Ecology
• Social Collaboration
........
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks
Diseases over contact networks [AJPH 2007]
[Figure: CDC data: visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts]
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks
[Figure: US-MEDICARE NETWORK 2005]
• Each circle is a hospital
• ~3000 hospitals
• More than 30,000 patients transferred
Problem: Given k units of disinfectant, whom to immunize?
Why do we care? (1: Epidemiology)
[Figure: US-MEDICARE NETWORK 2005, CURRENT PRACTICE vs. OUR METHOD: ~6x fewer!]
Hospital-acquired infections took 99K+ lives and cost $5B+ (both per year)
Why do we care? (2: Online Diffusion)
[Figure: online services; captions: > 800m users, ~$1B revenue [WSJ 2010]; ~100m active users; > 50m users]
Why do we care? (2: Online Diffusion)
• Dynamical Processes over networks
Social Media Marketing
[Figure: a Celebrity tells Followers “Buy Versace™!”]
High Impact – Multiple Settings
epidemic out-breaks
Q. How to squash rumors faster?
products/viruses
Q. How do opinions spread?
transmit s/w patches
Q. How to market better?
Research Theme
DATA: Large real-world networks & processes
ANALYSIS: Understanding
POLICY/ACTION: Managing
In this talk
Given propagation models:
Q1: Will an epidemic happen?
ANALYSIS: Understanding
In this talk
Q2: How to immunize and control out-breaks better?
POLICY/ACTION: Managing
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
A fundamental question
Strong Virus
Epidemic?
example (static graph)
Weak Virus
Epidemic?
Problem Statement
[Figure: # infected vs. time; curves above (epidemic) and below (extinction). Separate the regimes?]
Find a condition under which
– virus will die out exponentially quickly
– regardless of initial infection condition
Threshold (static version)
Problem Statement
• Given:
–Graph G, and
–Virus specs (attack prob. etc.)
• Find:
–A condition for virus extinction/invasion
Threshold: Why important?
• Accelerating simulations
• Forecasting (‘What-if’ scenarios)
• Design of contagion and/or topology
• A great handle to manipulate the spreading
– Immunization
– Maximize collaboration
…..
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
“SIR” model: lifetime immunity (mumps)
• Each node in the graph is in one of three states
– Susceptible (i.e. healthy)
– Infected
– Removed (i.e. can’t get infected again)
[Figure: example spread at t=1, t=2, t=3; transition labeled Prob. δ]
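The three SIR states can be illustrated with a minimal simulation sketch. Several things here are assumptions and not taken from the slides: the attack probability is called beta, updates are synchronous, infected nodes are removed with probability delta at the end of each step, and the function name `simulate_sir` and the ring graph are purely illustrative.

```python
import numpy as np

SUSCEPTIBLE, INFECTED, REMOVED = 0, 1, 2


def count_states(state):
    """Return the (S, I, R) counts for a state vector."""
    return tuple(int(np.sum(state == s)) for s in (SUSCEPTIBLE, INFECTED, REMOVED))


def simulate_sir(adj, beta, delta, seeds, steps, rng):
    """Discrete-time SIR sketch; returns the (S, I, R) counts after each step.

    Assumed rules (hypothetical, for illustration): each infected node attacks
    every susceptible neighbor with probability beta, then is removed with
    probability delta.
    """
    state = np.full(adj.shape[0], SUSCEPTIBLE)
    state[list(seeds)] = INFECTED
    history = [count_states(state)]
    for _ in range(steps):
        nxt = state.copy()
        for u in np.flatnonzero(state == INFECTED):
            for v in np.flatnonzero(adj[u]):
                if state[v] == SUSCEPTIBLE and rng.random() < beta:
                    nxt[v] = INFECTED
            if rng.random() < delta:
                nxt[u] = REMOVED  # removed nodes can never be infected again
        state = nxt
        history.append(count_states(state))
    return history


# Illustrative contact network: a ring of 20 nodes.
n = 20
ring = np.zeros((n, n))
idx = np.arange(n)
ring[idx, (idx + 1) % n] = 1
ring[(idx + 1) % n, idx] = 1

history = simulate_sir(ring, beta=0.3, delta=0.2, seeds=[0], steps=50,
                       rng=np.random.default_rng(0))
print(history[-1])
```

The removed count can only grow over time, which is what "lifetime immunity" means in this model.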
Terminology: continued
• Other virus propagation models (“VPM”)
– SIS : susceptible-infected-susceptible, flu-like
– SIRS : temporary immunity, like pertussis
– SEIR : mumps-like, with virus incubation
(E = Exposed)
….………….
• Underlying contact-network – ‘who-can-infect-whom’
Related Work
• R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
• A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
• F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969.
• D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
• D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
• A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of epidemics. IEEE INFOCOM, 2005.
• Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-mat/0305549 v2, Aug. 6 2003.
• H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
• H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
• J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
• J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
• R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
………
All are about either:
• Structured topologies (cliques, block-diagonals, hierarchies, random)
• Specific virus propagation models
• Static graphs
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
What should the answer look like?
• Answer should depend on:
– Graph
– Virus Propagation Model (VPM)
• But how??
– Graph – average degree? max. degree? diameter?
– VPM – which parameters?
– How to combine – linear? quadratic? exponential?
[Candidate formulas combining d_avg, d_max, and the diameter; exact expressions lost in extraction]
Static Graphs: Our Main Result
• Informally, for
– any arbitrary topology (adjacency matrix A)
– any virus propagation model (VPM) in standard literature
• the epidemic threshold depends only
1. on λ, the first eigenvalue of A, and
2. on some constant C_VPM, determined by the virus propagation model
No epidemic if λ * C_VPM < 1
w/ Deepayan Chakrabarti
In Prakash+ ICDM 2011 (Selected among best papers).
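A minimal sketch of how the condition λ * C_VPM < 1 could be checked numerically. It assumes an undirected graph (symmetric A) and uses C_VPM = β/δ, the flu-like strength given later in the talk; the helper names and the 5-node clique are hypothetical.

```python
import numpy as np


def largest_eigenvalue(adj):
    """First (largest) eigenvalue of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(adj)[-1])


def below_threshold(adj, c_vpm):
    """True when lambda * C_VPM < 1, i.e. no epidemic is expected."""
    return largest_eigenvalue(adj) * c_vpm < 1


# Illustrative graph: a clique on 5 nodes, whose first eigenvalue is 4.
clique = np.ones((5, 5)) - np.eye(5)
lam = largest_eigenvalue(clique)

# Assumed SIS-style constant: C_VPM = beta / delta.
weak = below_threshold(clique, c_vpm=0.1 / 0.5)    # strength 0.8
strong = below_threshold(clique, c_vpm=0.2 / 0.5)  # strength 1.6
print(lam, weak, strong)
```

Only one number from the graph (λ) and one from the model (C_VPM) enter the test, which is the point of the result.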
Our thresholds for some models
• s = effective strength
• s < 1 : below threshold
• s = 1 : threshold (tipping point)

Models                  Effective Strength (s)
SIS, SIR, SIRS, SEIR    s = λ · (β / δ)
SIV, SEIV               s = λ · C_VPM [constant lost in extraction]
SI1I2V1V2 (H.I.V.)      s = λ · C_VPM [constant lost in extraction]
Our result: Intuition for λ
“Official” definition:
• Let A be the adjacency matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A – xI)].
• Doesn’t give much intuition!
“Un-official” intuition
• λ ~ # paths in the graph
• A^k ≈ λ^k · u · u^T (u: the eigenvector of A for λ)
• A^k(i, j) = # of paths i → j of length k
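The path-counting intuition can be checked numerically. The sketch below assumes a small, connected, non-bipartite example graph chosen only for illustration, and counts paths in the sense of walks (nodes may repeat). It compares A^k with the rank-one approximation λ^k · u · u^T and shows that the total number of length-k walks grows by a factor of about λ per step.

```python
import numpy as np

# Illustrative graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

eigvals, eigvecs = np.linalg.eigh(A)     # ascending order for symmetric A
lam, u = eigvals[-1], eigvecs[:, -1]     # first eigenvalue and unit eigenvector

k = 60
A_k = np.linalg.matrix_power(A, k)       # A_k[i, j] = # walks i -> j of length k
rank_one = lam ** k * np.outer(u, u)
rel_err = np.linalg.norm(A_k - rank_one) / np.linalg.norm(A_k)

ones = np.ones(A.shape[0])
walks_k = ones @ A_k @ ones
walks_k1 = ones @ np.linalg.matrix_power(A, k + 1) @ ones
growth = walks_k1 / walks_k              # approaches lambda as k grows

print(lam, rel_err, growth)
```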
Largest Eigenvalue (λ)
better connectivity → higher λ
[Figure: three example graphs on N nodes, in order of increasing connectivity]
• λ ≈ 2 (N = 1000: λ ≈ 2)
• λ ≈ √N (N = 1000: λ = 31.67)
• λ = N-1 (N = 1000: λ = 999)
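A short sketch that computes λ for three standard topologies. The transcript does not name the pictured graphs, so a chain, a star, and a clique are assumed here; the computed values may therefore differ slightly from the numbers on the slide. The helper names are illustrative.

```python
import numpy as np


def largest_eigenvalue(adj):
    """First (largest) eigenvalue of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(adj)[-1])


def chain(n):
    """Path graph: node i is linked to node i + 1."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1
    adj[idx + 1, idx] = 1
    return adj


def star(n):
    """Star graph: node 0 is linked to every other node."""
    adj = np.zeros((n, n))
    adj[0, 1:] = 1
    adj[1:, 0] = 1
    return adj


def clique(n):
    """Complete graph: every pair of distinct nodes is linked."""
    return np.ones((n, n)) - np.eye(n)


N = 1000
lam_chain = largest_eigenvalue(chain(N))
lam_star = largest_eigenvalue(star(N))
lam_clique = largest_eigenvalue(clique(N))
print(lam_chain, lam_star, lam_clique)
```

For these families the closed forms are 2·cos(π/(N+1)) for the chain, √(N-1) for the star, and N-1 for the clique, so better-connected graphs do have higher λ.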
Examples: Simulations – SIR (mumps)
[Figure: (a) Infection profile: fraction of infections vs. time ticks; (b) “Take-off” plot: footprint vs. effective strength]
PORTLAND graph: synthetic population, 31 million links, 6 million nodes
Examples: Simulations – SIRS (pertussis)
[Figure: (a) Infection profile: fraction of infections vs. time ticks; (b) “Take-off” plot: footprint vs. effective strength]
PORTLAND graph: synthetic population, 31 million links, 6 million nodes
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
See paper for full proof
• General VPM structure (model-based)
• Topology and stability (graph-based)
• Together: λ * C_VPM < 1
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Dynamic Graphs: Epidemic?
Alternating behaviors
DAY (e.g., work): 8 × 8 adjacency matrix
Dynamic Graphs: Epidemic?
Alternating behaviors
NIGHT (e.g., home): 8 × 8 adjacency matrix
Model Description
• SIS model
– recovery rate δ
– infection rate β
[Figure: state diagram Healthy ⇄ Infected (infection Prob. β, recovery Prob. δ); node X with neighbors N1, N2, N3]
• Set of T arbitrary graphs: day, night, weekend….. (each an N × N adjacency matrix)
Our result: Dynamic Graphs Threshold
• Informally, NO epidemic if eig(S) < 1
– eig(S): largest eigenvalue of the system matrix S. Single number!
– S = [definition not recoverable from the extracted text]
In Prakash+, ECML-PKDD 2010
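A hedged sketch of the dynamic-graph test. The definition of S did not survive in this transcript, so the code assumes the product form S = ∏ ((1 - δ)·I + β·A_i) over the T alternating graphs; treat that formula, the helper names, and the two 4-node graphs as assumptions for illustration only.

```python
import numpy as np


def system_matrix(adjs, beta, delta):
    """Assumed system matrix: product over graphs of (1 - delta) * I + beta * A_i."""
    n = adjs[0].shape[0]
    S = np.eye(n)
    for A in adjs:
        S = S @ ((1 - delta) * np.eye(n) + beta * A)
    return S


def spectral_radius(M):
    """Largest eigenvalue magnitude; S need not be symmetric."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


# Illustrative 4-node graphs: a chain by day, a star by night.
day = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
night = np.array([[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]], dtype=float)

beta, delta = 0.1, 0.5
eig_S = spectral_radius(system_matrix([day, night], beta, delta))
no_epidemic = eig_S < 1
print(eig_S, no_epidemic)
```

Under this assumption, a single static graph (T = 1) gives eig(S) = 1 - δ + βλ, so eig(S) < 1 reduces to λβ/δ < 1, which agrees with the static flu-like threshold.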
Infection-profile
[Figure: log(fraction infected) vs. time on the Synthetic and MIT Reality Mining datasets; curves ABOVE, AT, and BELOW the threshold]
“Take-off” plots
[Figure: footprint (# infected @ “steady state”, log scale) on the Synthetic and MIT Reality datasets; our threshold separates NO EPIDEMIC from EPIDEMIC]
Outline
• Motivation
• Epidemics: what happens? (Theory)
– Background
– Result (Static Graphs)
– Proof Ideas (Static Graphs)
– Bonus 1: Dynamic Graphs
– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Competing Contagions
iPhone v Android
Blu-ray v HD-DVD
Biological: common flu vs. avian flu, pneumococcal infections, etc.
A simple model
• Modified flu-like
• Mutual Immunity (“pick one of the two”)
• Susceptible-Infected1-Infected2-Susceptible
[Figure: Virus 1 and Virus 2]
Question: What happens in the end?
ASSUME: Virus 1 is stronger than Virus 2
[Figure: number of infections over time; green: virus 1, red: virus 2]
Footprint @ Steady State = ?
Question: What happens in the end?
ASSUME: Virus 1 is stronger than Virus 2
[Figure: number of infections over time; green: virus 1, red: virus 2]
Footprint @ Steady State = ?? [candidate relations between the footprints and the viruses' strengths, including a squared term; exact expressions lost in extraction]
Answer: Winner-Takes-All
Number of
Infections
green: virus 1
red: virus 2
ASSUME:
Virus 1 is stronger than Virus 2
Our Result: Winner-Takes-All
Given our model, and any graph, the
weaker virus always dies-out completely
1. The stronger survives only if it is above threshold
2. Virus 1 is stronger than Virus 2, if:
strength(Virus 1) > strength(Virus 2)
3. Strength(Virus) = λ β / δ → same as before!
In Prakash+ WWW 2012
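The three rules above can be written as a small decision function. This is a sketch under the slide's model only; the function names, the return labels, and the handling of exactly equal strengths (which the slide does not cover) are assumptions.

```python
def strength(lam, beta, delta):
    """Strength(Virus) = lambda * beta / delta, as stated on the slide."""
    return lam * beta / delta


def steady_state_winner(lam, virus1, virus2):
    """Return which virus survives at steady state.

    virus1 and virus2 are (beta, delta) pairs. The weaker virus always dies
    out; the stronger one survives only if its strength is above 1.
    """
    s1 = strength(lam, *virus1)
    s2 = strength(lam, *virus2)
    if s1 == s2:
        return "tie"  # not covered by the slide; labeled here as an assumption
    name, s = ("virus 1", s1) if s1 > s2 else ("virus 2", s2)
    return name if s > 1 else "none"


lam = 4.0  # illustrative first eigenvalue of the contact graph
print(steady_state_winner(lam, (0.5, 0.5), (0.2, 0.4)))   # strengths 4 and 2
print(steady_state_winner(lam, (0.05, 0.5), (0.1, 0.5)))  # strengths 0.4 and 0.8
```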
Real Examples
[Google Search Trends data]
[Figure: Reddit v Digg; Blu-Ray v HD-DVD]
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
Full Static Immunization
Given: a graph A, virus prop. model and budget k;
Find: k ‘best’ nodes for immunization (removal).
[Figure: example graph with k=2; which nodes (marked ?) to immunize?]
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
Challenges
• Given a graph A, budget k,
Q1 (Metric) How to measure the ‘shield-value’ for a set of nodes (S)?
Q2 (Algorithm) How to find a set of k nodes with highest ‘shield-value’?
Proposed vulnerability measure λ
λ is the epidemic threshold
[Figure: example graphs labeled “Safe”, “Vulnerable”, “Deadly”]
Increasing λ → increasing vulnerability
A1: “Eigen-Drop”: an ideal shield value
Eigen-Drop(S): Δλ = λ - λ_s, where λ_s is the first eigenvalue after removing the nodes in S
[Figure: Original Graph vs. the same graph Without {2, 6}; Δ is the resulting drop in λ]
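A minimal sketch of the eigen-drop computation, assuming an undirected graph and treating immunization as deleting the chosen nodes together with their edges. The function names and the small star graph are illustrative, not from the slides.

```python
import numpy as np


def largest_eigenvalue(adj):
    """First eigenvalue of a symmetric adjacency matrix (0 for an empty graph)."""
    if adj.shape[0] == 0:
        return 0.0
    return float(np.linalg.eigvalsh(adj)[-1])


def eigen_drop(adj, nodes):
    """Delta lambda = lambda - lambda_s after deleting the given nodes."""
    removed = set(nodes)
    keep = [i for i in range(adj.shape[0]) if i not in removed]
    reduced = adj[np.ix_(keep, keep)]
    return largest_eigenvalue(adj) - largest_eigenvalue(reduced)


# Illustrative graph: a star with center 0 and five leaves (lambda = sqrt(5)).
star = np.zeros((6, 6))
star[0, 1:] = 1
star[1:, 0] = 1

drop_center = eigen_drop(star, [0])  # leaves only isolated nodes behind
drop_leaf = eigen_drop(star, [1])    # leaves a star with four leaves
print(drop_center, drop_leaf)
```

Removing the center drops λ to zero, while removing a leaf barely changes it, which is why the drop is a natural measure of a node set's shield value.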
(Q2) - Direct Algorithm too expensive!
• Immunize k nodes which maximize Δ λ
S = argmax Δ λ
• Combinatorial!
• Complexity:
– Example:
• 1,000 nodes, with 10,000 edges
• It takes 0.01 seconds to compute λ
• It takes 2,615 years to find 5-best nodes!
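The arithmetic behind the example can be reproduced in a few lines. This sketch assumes one λ computation per candidate set of 5 nodes and a 365-day year; the variable names are illustrative.

```python
import math

n_nodes = 1000
k = 5
seconds_per_lambda = 0.01

# Number of distinct k-node subsets that brute force would have to evaluate.
candidate_sets = math.comb(n_nodes, k)

total_seconds = candidate_sets * seconds_per_lambda
years = total_seconds / (365 * 24 * 3600)

print(candidate_sets, round(years))
```

The result lands within about a year of the slide's figure; the small difference depends on the year length assumed.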
A2: Our Solution
• Part 1: Shield Value
– Carefully approximate Eigen-drop (Δ λ)
– Matrix perturbation theory
• Part 2: Algorithm
– Greedily pick best node at each step
– Near-optimal due to submodularity
• NetShield (linear complexity)
– O(nk^2 + m), n = # nodes; m = # edges
In Tong, Prakash+ ICDM 2010
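To illustrate the greedy idea only, here is a naive stand-in that is not NetShield: at each step it recomputes λ exactly for every candidate node and removes the one that lowers λ the most. NetShield instead approximates the eigen-drop with matrix perturbation theory, which is what makes it fast. The function names and the example graph are assumptions.

```python
import numpy as np


def largest_eigenvalue(adj):
    """First eigenvalue of a symmetric adjacency matrix (0 for an empty graph)."""
    if adj.shape[0] == 0:
        return 0.0
    return float(np.linalg.eigvalsh(adj)[-1])


def greedy_immunize(adj, k):
    """Pick up to k nodes, each time removing the node that minimizes lambda."""
    remaining = list(range(adj.shape[0]))
    chosen = []
    for _ in range(min(k, len(remaining))):
        sub = adj[np.ix_(remaining, remaining)]
        best_node, best_lam = None, float("inf")
        for pos, node in enumerate(remaining):
            # Delete row and column `pos` to simulate immunizing `node`.
            reduced = np.delete(np.delete(sub, pos, axis=0), pos, axis=1)
            lam = largest_eigenvalue(reduced)
            if lam < best_lam:
                best_node, best_lam = node, lam
        chosen.append(best_node)
        remaining.remove(best_node)
    return chosen


# Illustrative graph: a star with center 0 and five leaves.
star = np.zeros((6, 6))
star[0, 1:] = 1
star[1:, 0] = 1

picked = greedy_immunize(star, 2)
print(picked)
```

This stand-in needs one eigenvalue computation per candidate per step, so it is far slower than the O(nk^2 + m) bound quoted for NetShield; it is meant only to show the "pick the best node at each step" structure.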
Experiment: Immunization quality
[Figure: log(fraction of infected nodes) vs. time; lower is better. Methods compared: PageRank, Betweenness (shortest path), Degree, Acquaintance, Eigs (=HITS), NetShield]
Outline
• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)
– Fractional Immunization
Fractional Immunization of Networks
B. Aditya Prakash, Lada Adamic, Theodore
Iwashyna (M.D.), Hanghang Tong, Christos
Faloutsos
Under review
Fractional Asymmetric Immunization
Drug-resistant Bacteria (like XDR-TB)
[Figure: Hospital and Another Hospital]
Fractional Asymmetric
Immunization
Problem: Given k units of disinfectant,
how to distribute them to maximize
hospitals saved?
[Figure: Hospital and Another Hospital]
Our Algorithm “SMART-ALLOC”
[US-MEDICARE NETWORK 2005]
• Each circle is a hospital, ~3000 hospitals
• More than 30,000 patients transferred
[Figure: CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer!]
Running Time
[Figure: wall-clock time, lower is better. Simulations: > 1 week; SMART-ALLOC: ≈ 14 secs; > 30,000x speed-up!]
Experiments
[Figure: lower is better. PENN-NETWORK (K = 200): ~5x; SECOND-LIFE (K = 2000): ~2.5x]
Acknowledgements
Funding
References
1. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) – In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
2. Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos) – In ECML-PKDD 2010, Barcelona, Spain
3. Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) – In IEEE NETWORKING 2011, Valencia, Spain
4. Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos) – In WWW 2012, Lyon
5. On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos) – In IEEE ICDM 2010, Sydney, Australia
6. Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos) – Under Submission
7. Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) – Under Submission
http://www.cs.cmu.edu/~badityap/
Propagation on Large Networks
B. Aditya Prakash
Christos Faloutsos
Data, Analysis, Policy/Action