Felix Correlation Method for Topology Discovery


Felix Project
Inferential Topology Discovery: From Delay Data to Network Graph
Mark W. Garrett
J. Baron, D. Shallcross, C. Huitema, J. DesMarais, B. Siegell, P. Seymour, F. Chung
14 February 2001
Darpa ITO Intrusion Detection Program
An SAIC Company
The Felix Project
Goals
• Evaluate network status independently from the usual network management protocols and data.
– E.g., no use of routing protocols, ping, traceroute, ICMP, SNMP, etc.
• Measure the network by sending sparse probe packets among a set of monitors. Collect delay and loss data.
• From these data, discover the network topology and evaluate the performance of all links in the network.
• A small new field of research is developing, called “Inferential Topology Discovery” (Kurose, Towsley, Paxson, McCanne, Caceres, Duffield, et al.)
• This talk presents a particular method based on modeling correlation across the observations.
MWG Felix Project Jan01 2
Network Monitoring
Felix Data Analysis Approach
[Figure: Felix data analysis pipeline. A measurement system (monitors A to F exchanging probes across the Internet) produces raw data. Topology Discovery derives the common component matrix, identifies links to obtain the path component matrix, creates a graph specification (nodes and links), adds geographic information, and renders the network graph and network map. Performance Assessment reports network element and link performance (Delay, Loss, Load, Throughput, Pr cong). A Simulator block also appears. The matrices and the graph specification are labeled as intermediate results.]

Raw data (first records shown in the figure):
898896670 145718 F A : Fri Jun 26 17:31:10 1998 Fri Jun 26 17:31:10 1998 1 0 0 0 0
898896693 159087 D E : Fri Jun 26 17:31:33 1998 Fri Jun 26 17:31:33 1998 22 0 0 0 0
898896707 184151 C D : Fri Jun 26 17:31:47 1998 Fri Jun 26 17:31:47 1998 6 0 0 0 0

Common component matrix (rows and columns are paths):
     AB AC AD AE AF BC BD BE BF CD CE CF DE DF EF
AB    1  1  1  1  1  1  1  1  1  1  1  1  0  0  0
AC    1  1  1  1  1  1  1  1  1  1  1  1  0  0  0
AD    1  1  1  1  1  0  1  1  1  1  1  1  1  1  1
AE    1  1  1  1  1  0  1  1  1  1  1  1  1  1  1
AF    1  1  1  1  1  0  1  1  1  1  1  1  0  1  1
BC    1  1  0  0  0  1  1  1  1  1  1  1  0  0  0
BD    1  1  1  1  1  1  1  1  1  1  1  1  0  1  1
BE    1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
BF    1  1  1  1  1  1  1  1  1  1  1  1  0  1  1
CD    1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
CE    1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
CF    1  1  1  1  1  1  1  1  1  1  1  1  0  1  1
DE    0  0  1  1  0  0  1  1  0  1  1  0  1  1  1
DF    0  0  1  1  1  0  1  1  1  1  1  1  1  1  1
EF    0  0  1  1  1  0  1  1  1  1  1  1  1  1  1

Path component matrix (rows are links e1 to e9, columns are paths):
     AB AC AD AE AF BC BD BE BF CD CE CF DE DF EF
e1    1  1  1  1  1  0  0  0  0  0  0  0  0  0  0
e2    1  1  0  0  0  0  1  1  1  1  1  1  0  0  0
e3    1  0  0  0  0  1  1  1  1  0  0  0  0  0  0
e4    0  1  0  0  0  1  0  0  0  1  1  1  0  0  0
e5    0  0  1  1  1  0  1  1  1  1  1  1  0  0  0
e6    0  0  0  0  1  0  0  0  1  0  0  1  0  1  1
e7    0  0  1  1  0  0  1  1  0  1  1  0  0  1  1
e8    0  0  0  1  0  0  0  1  0  0  1  0  1  0  1
e9    0  0  1  0  0  0  1  0  0  1  0  0  1  1  0
Network Discovery
Terminology for Network Topology and Monitoring
[Figure: network with monitors M1–M15 attached around a set of interior nodes; legend: Monitor, (Interior) Node, Cloud, Path]
– For m monitors, there are $n_p = m(m-1)$ paths
– The number of links is between $m$ (star) and $m^2$ (full mesh)
– Links are unidirectional
– … So a line in the graph usually represents two links
Network Discovery
Reduced Graph Concept
[Figure: network with monitors M1–M5, highlighting “Series Equivalent Edges” and “Links Not Traversed by Monitor Packets”]
– Define Reduced Graph as the sub-graph within the network that
is discoverable.
– Excludes links not traversed by monitor packets
– Combines equivalent edges, i.e. edges traversed by exactly the
same set of paths.
– Non-series equivalent edges can occur when reducing a real
graph, but they are very rare.
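The reduction rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the link and path names are made up, and the input is taken to be a known map from each link to the set of paths that traverse it.

```python
from collections import defaultdict

def reduce_links(link_paths):
    """Group links into the edges of the reduced graph.

    link_paths maps a link name to the set of monitor paths that traverse it.
    Links traversed by no path are dropped (not discoverable); links traversed
    by exactly the same set of paths are merged as equivalent edges.
    """
    groups = defaultdict(list)
    for link, paths in link_paths.items():
        if paths:
            groups[frozenset(paths)].append(link)
    return {paths: sorted(links) for paths, links in groups.items()}

# Hypothetical example: l1 and l2 are in series on path M1M2, l4 is unused.
example = {
    "l1": {"M1M2"},
    "l2": {"M1M2"},
    "l3": {"M1M2", "M1M3"},
    "l4": set(),
}
reduced = reduce_links(example)
```

Here `l1` and `l2` collapse into one reduced edge, `l3` stays on its own, and `l4` disappears from the reduced graph.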
Network Discovery
Example of Complete Network and Reduced Graph
Complete network: 3150 nodes, 100 monitors, WAN-MAN-LAN design
Reduced graph: 187 nodes, 698 (unidirectional) links
Reduced graph tends to include more of backbone and less of edges
Network Discovery
Reduced Graph – Non-series Equivalent Edges
[Figure: symmetrical graph between nodes A and B, highlighting “Non-Series Equivalent Edges”]
– Here is an (artificially) symmetrical graph with equivalent edges.
– We have seen non-series equivalent edges only once in reducing randomly generated graphs (out of 100+ examples)
Network Discovery
Reduced Graph Related to Paths
– The reduced graph determined by n = 2, 3, 4, … 10, … monitors is a successive approximation to the network.
[Figure sequence: the reduced graph for n = 2 through n = 10 monitors]
A Relationship Between Observable Path Metric,
Topology and Link Performance
• The delay along a path = sum of delays for each link:
\[ D_P = X \cdot d_L \]
– X identifies topology (in terms of links on paths), and is always rank deficient.
– To illustrate, consider adding a constant delay c to each link into a particular node, and subtracting c from each outgoing link. The delay of every path passing through that node is unchanged, so the link delays cannot be determined uniquely from path delays.
[Figure: node with +c on its incoming links and −c on its outgoing links]
• A variation on this general relationship can be formulated with each performance metric: packet loss, link load, throughput, congestion probability.
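A small numeric check of the rank-deficiency argument, written as a sketch. The star topology, link names and delay values are made up for illustration; the routing table plays the role of X stored row by row.

```python
# Hypothetical star topology: monitors A, B, C attached to interior node X.
# Each route lists the unidirectional links a path traverses (one row of X).
ROUTES = {
    "AB": ["A>X", "X>B"], "AC": ["A>X", "X>C"],
    "BA": ["B>X", "X>A"], "BC": ["B>X", "X>C"],
    "CA": ["C>X", "X>A"], "CB": ["C>X", "X>B"],
}

def path_delays(routes, link_delay):
    """Compute D_P = X * d_L with X stored sparsely as a link list per path."""
    return {path: sum(link_delay[link] for link in links)
            for path, links in routes.items()}

def shift_through_node(link_delay, node, c):
    """Add c to every link into `node` and subtract c from every link out of it."""
    shifted = {}
    for link, delay in link_delay.items():
        src, dst = link.split(">")
        shifted[link] = delay + c * (dst == node) - c * (src == node)
    return shifted

base = {"A>X": 4, "B>X": 7, "C>X": 5, "X>A": 6, "X>B": 3, "X>C": 9}  # ms, made up
shifted = shift_through_node(base, "X", 2)
same = path_delays(ROUTES, base) == path_delays(ROUTES, shifted)
```

Two different link-delay vectors give identical path delays, which is what rank deficiency of X means in practice.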
Felix Data Measurements
Routing Changes Apparent in Data
Data courtesy of Advanced Network Solutions
Felix Topology Discovery
Correlation Method: Concept
Felix Correlation Method
Identifying Links By Correlation of Paths
[Figure: congestion event sequences (1 = congested, 0 = not congested) over time for Paths A, B, C and D; coincident events across paths are marked as Groups 1 to 5]
Felix Correlation Method
Abstracting Congestion Event Sequence From Data
• Open problem: how exactly to get from a delay measurement on a real network to a series of thresholded congestion “events”.
• Several approaches:
– Average delay in a fixed-length sliding window
– Cross-correlation function (pair-wise between paths, but promising…)
– Congestion decision can be a complex combination of delay and loss in a window. This is probably the most robust method, but needs some empirical experience to create a useful methodology.
• We assume a solution and solve the next part…
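As an illustration of the first approach only (average delay in a fixed-length sliding window), here is a minimal sketch. The window length, threshold and delay values are assumptions chosen for the example, not values from the project.

```python
def congestion_events(delays, window, threshold):
    """Return a 0/1 series: 1 where the mean delay over the trailing
    `window` samples exceeds `threshold`."""
    events = []
    running = 0.0
    for i, delay in enumerate(delays):
        running += delay
        if i >= window:
            running -= delays[i - window]   # drop the sample leaving the window
        count = min(i + 1, window)          # shorter window at the start
        events.append(1 if running / count > threshold else 0)
    return events

# Made-up delay samples (ms) with a burst in the middle.
sample = [10, 10, 10, 50, 60, 55, 10, 10, 10]
events = congestion_events(sample, window=3, threshold=30)
```

The averaging delays the onset of the event by a sample or two and stretches its tail, which is one reason the choice of window and threshold needs empirical tuning.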
Felix Correlation Method
Network Model Assumptions
• Node processing delay is negligible, so paths sharing nodes (but not links) do not show correlation. Queueing delay is associated with the link.
• Network links congest independently.
• Congestion is modeled as fixed-length discrete-time events.
• Congestion rate is fixed for each link, but can vary over a range for the set of links in the network.
• Routes are stable.
• Monitor packets are exchanged frequently enough that congestion events will be recorded consistently across all paths crossing a given link.
– Note, this does not require every event to be noticed, and real congestion events do occur over a wide range of time scales.
Felix Correlation Method
Observations and Triggers
• An Observation is a measurement of congestion (however defined) on a path between two monitors.
• A Trigger is a hypothetical cause of congestion, such as a link, or a group of links, in the network.
• Method of solution: Based on joint observations across all paths, define a model that discriminates statistically between the true triggers, that represent links in the network, and the apparent (or false) triggers that are due to combinations of true links congesting simultaneously. Then reduce the triggers down to single links.
Felix Correlation Method
Observations and Triggers
Illustration of observations, triggers, paths and links:
– Observation “a” = path M1M3
– Observation “b” = path M2M4
– Trigger a = all links on path a
– Trigger ab = links in common between paths a and b
[Figure: network with monitors M1–M5]
Definitions and Notation:
• An observation event occurs at time t, when a set of paths are congested and not congested as specified.
• For example, $O^{(t)}_{ab\bar{c}d\bar{g}k}$ is the observation that paths a, b, d, k are congested and paths c, g are not congested at time t. Paths not included in the subscript are “don’t care” for this observation variable.
Felix Correlation Method
Observations and Triggers
• A trigger event occurs at time t when at least one link congests that is a member (or not a member) of a set of paths as specified.
• For example, $T_v^{(t)} = T^{(t)}_{ab\bar{c}d\bar{g}k}$ is the event that some link congests that is shared by paths a, b, d, k, and is not on path c or path g.
• We refer to paths in the specification as “included” or “excluded”.
• If all paths are included or excluded, the trigger is “fully specified”.
• Observation and trigger probabilities follow these examples:
\[ P^o_{ab\bar{c}d\bar{g}k} = \Pr[\text{paths } a, b, d, k \text{ are congested and paths } c, g \text{ are not congested}] \]
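An observation probability can be estimated from the thresholded event series as a simple relative frequency. The sketch below assumes each path already has a 0/1 event series of equal length; the function name and data are illustrative only.

```python
def observation_probability(events, congested, quiet):
    """Estimate P^o for the observation that all paths in `congested` are
    congested and all paths in `quiet` are not; other paths are don't-care.

    events maps a path name to its 0/1 congestion series (equal lengths).
    """
    n_slots = len(next(iter(events.values())))
    hits = sum(
        1 for t in range(n_slots)
        if all(events[p][t] == 1 for p in congested)
        and all(events[p][t] == 0 for p in quiet)
    )
    return hits / n_slots

# Made-up event series for three paths over four time slots.
series = {"a": [1, 1, 0, 0], "b": [1, 0, 0, 1], "c": [0, 0, 0, 1]}
p_ab_not_c = observation_probability(series, congested=["a", "b"], quiet=["c"])
p_not_a = observation_probability(series, congested=[], quiet=["a"])
```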
Felix Correlation Method
Relationship Between Observations and Triggers
• Now we can relate the observation and trigger probabilities in several interesting ways. E.g., [Ratnasamy & McCanne]
\[ P^o_{ab} = P^t_{ab} + (1 - P^t_{ab})\, P^t_{a\bar{b}}\, P^t_{\bar{a}b} \]
\[ P^o_{a\bar{b}} = (1 - P^t_{ab})\, P^t_{a\bar{b}}\, (1 - P^t_{\bar{a}b}) \]
\[ P^o_{\bar{a}b} = (1 - P^t_{ab})\, P^t_{\bar{a}b}\, (1 - P^t_{a\bar{b}}) \]
• This set says, considering only two paths, if we see congestion on both paths, then it is caused either by a link the two paths share in common, or by one link on each of the paths (not in common) congesting together.
• Similarly, if we see congestion on only one path, it must be due to a link that is on that path, and not on the other.
• Note, this forces us to explicitly write the combinations of triggers that can cause an observation (not very scalable).
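The two-path relations can be checked numerically. The sketch below evaluates them for given trigger probabilities and adds the fourth case (no congestion on either path) as the product of the three quiet triggers, which follows from the same independence assumption. The numeric inputs are made up.

```python
def two_path_observations(p_ab, p_a_only, p_b_only):
    """Observation probabilities for two paths a and b.

    p_ab     : trigger on links shared by a and b
    p_a_only : trigger on links on a but not b
    p_b_only : trigger on links on b but not a
    Returns (both, only_a, only_b, neither).
    """
    both = p_ab + (1 - p_ab) * p_a_only * p_b_only
    only_a = (1 - p_ab) * p_a_only * (1 - p_b_only)
    only_b = (1 - p_ab) * p_b_only * (1 - p_a_only)
    neither = (1 - p_ab) * (1 - p_a_only) * (1 - p_b_only)
    return both, only_a, only_b, neither

obs = two_path_observations(0.10, 0.05, 0.20)
total = sum(obs)
```

The four values sum to one, since they cover every joint outcome on the two paths.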
Felix Correlation Method
Relationship Between Observations and Triggers
• Another interesting and useful relationship is this:
\[ P^o_{\bar{a}\bar{b}} = (1 - P^t_{ab})(1 - P^t_{a\bar{b}})(1 - P^t_{\bar{a}b}) \]
\[ P^o_{\bar{a}} = (1 - P^t_{ab})(1 - P^t_{a\bar{b}}) \]
\[ P^o_{\bar{b}} = (1 - P^t_{ab})(1 - P^t_{\bar{a}b}) \]
• This one says that we observe no congestion on a set of paths only when none of the triggers that are on those paths are active.
• We say a path (in the trigger specification) contradicts the observation when a path turned off in the observation is included in the trigger. (It is easy to write down these combinations.)
• Inclusion of observations with multiple paths makes this model more powerful than an earlier method ($D_P = X \cdot d_L$) that relied on a rank-deficient matrix.
Felix Correlation Method
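Organization of Triggers
• Tree contains all potential triggers, i.e., all possible combinations of paths that can specify a link or group of links.
• Triggers on a level partition the set of (potential) links in the graph.
• The tree grows exponentially as we add paths, but the number of true triggers is bounded by the number of links in the network.
[Figure: binary tree of trigger probabilities, where each level adds one path as either included or excluded. Level 1: $P^t_a$, $P^t_{\bar{a}}$. Level 2: $P^t_{ab}$, $P^t_{a\bar{b}}$, $P^t_{\bar{a}b}$, $P^t_{\bar{a}\bar{b}}$. Level 3: $P^t_{abc}$, $P^t_{ab\bar{c}}$, $P^t_{a\bar{b}c}$, $P^t_{a\bar{b}\bar{c}}$, $P^t_{\bar{a}bc}$, $P^t_{\bar{a}b\bar{c}}$, $P^t_{\bar{a}\bar{b}c}$, $P^t_{\bar{a}\bar{b}\bar{c}}$.]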
Felix Correlation Method
Some More Useful Stuff From the Model…
\[ P^o_a = P^t_a \]
\[ P^o_{ab} + P^o_{a\bar{b}} + P^o_{\bar{a}b} + P^o_{\bar{a}\bar{b}} = 1 \]
\[ (1 - P^t_{ab}) = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}}) \]
\[ P^o_{\bar{a}\bar{b}\ldots\bar{p}} = \prod_{\text{all } v \text{ with } n \text{ paths}} (1 - P^t_v) \]
• Observation of congestion on a path means some link on that path is congesting (single-path observation and trigger).
• Something must be happening, so the sum over all possible observations with n paths specified equals unity.
• Child triggers are related to their parent.
• No congestion observed anywhere means all triggers are quiet. (The product of all inverse triggers on any level is constant.)
Felix Correlation Method
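Solving for Trigger Probabilities – 3 Path Example
• Observation of no congestion on 3, 2, 1 paths implies no activity on any trigger that includes one of the named paths.
• Triangular form: each equation produces one $P^t_v$.
\[ P^o_{\bar{a}\bar{b}\bar{c}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{a\bar{b}c})(1 - P^t_{\bar{a}bc})(1 - P^t_{a\bar{b}\bar{c}})(1 - P^t_{\bar{a}b\bar{c}})(1 - P^t_{\bar{a}\bar{b}c}) \tag{1} \]
\[ P^o_{\bar{a}\bar{b}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{a\bar{b}c})(1 - P^t_{\bar{a}bc})(1 - P^t_{a\bar{b}\bar{c}})(1 - P^t_{\bar{a}b\bar{c}}) \tag{2} \]
\[ P^o_{\bar{a}\bar{c}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{a\bar{b}c})(1 - P^t_{\bar{a}bc})(1 - P^t_{a\bar{b}\bar{c}})(1 - P^t_{\bar{a}\bar{b}c}) \tag{3} \]
\[ P^o_{\bar{b}\bar{c}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{a\bar{b}c})(1 - P^t_{\bar{a}bc})(1 - P^t_{\bar{a}b\bar{c}})(1 - P^t_{\bar{a}\bar{b}c}) \tag{4} \]
\[ P^o_{\bar{a}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{a\bar{b}c})(1 - P^t_{a\bar{b}\bar{c}}) \tag{5} \]
\[ P^o_{\bar{b}} = (1 - P^t_{abc})(1 - P^t_{ab\bar{c}})(1 - P^t_{\bar{a}bc})(1 - P^t_{\bar{a}b\bar{c}}) \tag{6} \]
\[ P^o_{\bar{c}} = (1 - P^t_{abc})(1 - P^t_{a\bar{b}c})(1 - P^t_{\bar{a}bc})(1 - P^t_{\bar{a}\bar{b}c}) \tag{7} \]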
Felix Correlation Method
Generalization of Solution to Any Number of Paths
• Count various things:
– n = number of paths in the triggers = level in tree diagram
– k = number of paths in the observation (varying from n down to 1)
– j = number of paths excluded in the triggers (varying from 0 to n−1)
[Figure: a specific equation $P^o_{\bar{a}\bar{b}\bar{c}\ldots} = (1 - P^t_{abcde})(1 - P^t_{\bar{a}bcde}) \cdots$ with n paths in each trigger and k paths in the observation. Its factors run from j = 0 to j = n−1 and are bracketed as class 1 triggers (0 ≤ j < k), class 2 triggers (j = k) and class 3 triggers (k < j ≤ n−1). The Master equation has k = n.]
• Divide the “Master” equation by each “Specific” equation to find one trigger probability.
Felix Correlation Method
Generalization of Solution to Any Number of Paths
• For n paths there are $2^n - 1$ equations and $2^n - 1$ triggers.
• The “Master” equation has all possible triggers, i.e., any active trigger contradicts the observation of no congestion anywhere.
• For class 1 triggers (0 ≤ j < k):
– The j paths excluded in the trigger cannot cover all k paths in the observation, so at least one path is included in the trigger that contradicts the observation.
– All triggers then occur in both the master and specific equations, and cancel out in the division.
• For class 2 triggers (j = k):
– The j paths excluded in the trigger can cover the k paths in the observation, but there is only one combination. Call this the target trigger. All other triggers contradict the observation and cancel out.
– There is one equation in which each such target trigger survives the division.
Felix Correlation Method
Generalization of Solution to Any Number of Paths
• For class 3 triggers (k < j ≤ n−1):
– There are $\binom{n-k}{n-j}$ such triggers.
– No class 3 triggers exist in the first two stages (k = n, and k = n–1)
– All class 3 triggers are computed at previous stages, when they appear as class 2 triggers.
– For example, consider the case k = 8 < j = 9. In the previous stage when we had k = 9, the class 2 triggers with j = 9 were solved.
• Each “Quotient” equation is left with one unknown trigger.
Felix Correlation Method
Generalization of Solution to Any Number of Paths
• General form of solution, for trigger probabilities with paths excluded (first case), and with no paths excluded (second case):
\[ P^t_{\bar{E}I} = 1 - \frac{P^O_{\bar{N}} / P^O_{\bar{E}}}{\prod_w (1 - P_w)} \]
\[ P^t_I = 1 - \frac{P^O_{\bar{N}}}{\prod_u (1 - P_u)} \]
Where:
• E is the set of excluded paths in the trigger
• I is the set of included paths in the trigger
• N is the set of all paths
• w is the set of class-3 trigger probabilities in the master equation, but not in the specific equation
• u is the set of all trigger probabilities with at least one path excluded.
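The stage-by-stage division can be sketched as follows. This is an illustrative reading of the general form, not the project's code: triggers are keyed by their set of included paths (all other paths are excluded), the input is the exact probability of no congestion on each subset of paths, and the ground-truth numbers are made up.

```python
from itertools import combinations
from math import prod

def quiet_probability(triggers, quiet_paths):
    """P^O of 'no congestion on quiet_paths': every trigger that includes
    one of those paths must be inactive (links congest independently)."""
    return prod(1.0 - p for included, p in triggers.items()
                if included & quiet_paths)

def solve_triggers(paths, p_quiet):
    """Recover fully specified trigger probabilities, keyed by included paths.

    p_quiet maps each non-empty frozenset E to P^O of no congestion on E.
    """
    all_paths = frozenset(paths)
    master = p_quiet[all_paths]                      # master equation, k = n
    solved = {}
    for k in range(len(all_paths) - 1, 0, -1):       # stages k = n-1 ... 1
        for excluded in map(frozenset, combinations(sorted(all_paths), k)):
            included = all_paths - excluded          # target (class 2) trigger
            # Class 3 triggers exclude E plus more paths; solved in earlier stages.
            known = prod(1.0 - p for inc, p in solved.items() if inc < included)
            solved[included] = 1.0 - (master / p_quiet[excluded]) / known
    # Trigger with no path excluded: divide out every other trigger.
    known = prod(1.0 - p for p in solved.values())
    solved[all_paths] = 1.0 - master / known
    return solved

# Made-up ground truth for three paths: only four triggers correspond to links.
truth = {frozenset("a"): 0.05, frozenset("ab"): 0.10,
         frozenset("bc"): 0.20, frozenset("abc"): 0.02}
subsets = [frozenset(c) for k in (1, 2, 3) for c in combinations("abc", k)]
p_quiet = {e: quiet_probability(truth, e) for e in subsets}
estimate = solve_triggers("abc", p_quiet)
```

With exact inputs the four true triggers are recovered and the three false ones come out as zero. With measured frequencies the false triggers would only be approximately zero, and an observation probability of zero would need special handling that this sketch omits.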
Felix Correlation Method
Pruning Tree Reduces Computational Complexity
• Returning to the tree of trigger probabilities…
• For triggers that specify actual links in the network, the trigger probability is the (aggregate) congestion rate on that set of links.
• False triggers (for which no link exists) are approximately zero.
• (True) triggers on the last level identify single links and their associated paths (reduced graph).
• Therefore, a trigger probability of zero can be pruned out along with all of its descendants.
• Number of triggers to compute is bounded by (paths • links).
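The pruning idea can be illustrated with a level-by-level walk of the trigger tree. In this sketch the trigger probabilities come from a known, made-up set of links (an oracle standing in for the estimation step), so it only demonstrates how zero-probability branches stop being expanded.

```python
from math import prod

def trigger_probability(links, considered, included):
    """Aggregate congestion rate of the links whose paths, restricted to
    the paths considered so far, are exactly `included`."""
    quiet = prod(1.0 - rate for paths, rate in links
                 if paths & considered == included)
    return 1.0 - quiet

def grow_pruned_tree(links, path_order, threshold=1e-9):
    """Add one path per level; keep only triggers above `threshold`.

    Returns the surviving triggers on the last level (as sets of included
    paths) and the number of trigger probabilities evaluated.
    """
    level = [frozenset()]
    considered = frozenset()
    evaluated = 0
    for path in path_order:
        considered = considered | {path}
        next_level = []
        for included in level:
            for child in (included | {path}, included):  # include or exclude
                evaluated += 1
                if trigger_probability(links, considered, child) > threshold:
                    next_level.append(child)
        level = next_level
    return level, evaluated

# Hypothetical three-path example: each link is shared by two of the paths.
links = [(frozenset({"AB", "AC"}), 0.05),
         (frozenset({"AB", "BC"}), 0.05),
         (frozenset({"AC", "BC"}), 0.05)]
survivors, evaluated = grow_pruned_tree(links, ["AB", "AC", "BC"])
```

The three survivors are exactly the path sets of the three links. The saving is small on this toy example (12 evaluations instead of 14 for the full tree) and grows with the number of paths, since a pruned trigger removes its whole subtree.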
[Figure: the binary tree of trigger probabilities, from $P^t_a$, $P^t_{\bar{a}}$ down through $P^t_{ab}$, $P^t_{a\bar{b}}$, … to $P^t_{abc}$, $P^t_{ab\bar{c}}$, …]
Let’s see some results…
Felix Correlation Method
Results
18 monitors
23 nodes
95 (unidirectional) links
Felix Correlation Method
Results
19 monitors
27 nodes
114 (unidirectional) links
Felix Correlation Method
Results
20 monitors
29 nodes
121 (unidirectional) links
Felix Correlation Method
Results
50 monitors
61 nodes
269 (unidirectional) links
• Run with link congestion rate
of 1% (best efficiency)
• Approx 12 hours to compute
Felix Correlation Method
Algorithm Complexity
• Complexity of correlation algorithm is more than (paths • links) because the computation of triggers increases with number of paths…
• …but it is polynomial: $O(LPN + L^2 P)$ for L links, P paths, N simulated time intervals.
• However, the overall run-time is apparently exponential, because it takes more data to discriminate the true and false triggers as the network gets larger.
Felix Correlation Method
Algorithm Complexity
• Running time of simulation and correlation code as function of network size (number of links)
– Exponential increase if quality of result held constant.
– Link Congestion Rate = 10% (constant).
[Figure: running time on a log scale from 10^0 to 10^5 versus number of links from 20 to 100; three curves: simulation (parameter = 10), correlation (parameter = 5), correlation (parameter = 10)]
Felix Correlation Method
Results With Variable Link Congestion
• Constant link congestion rate is an artificial constraint.
• Algorithm works well with links congesting in a range, e.g., tried 1% – 5%, 1% – 10%, 1% – 15%, etc.
• Effect is to spread the distribution of true trigger probabilities
– Longer convergence time
• Probably all of the simplifying assumptions in the model can be relaxed at the cost of increased convergence time.
• Correlation algorithm ran fastest with 1% link congestion
– Probably an artifact of implementation…
Felix Correlation Method
Statistical Discrimination Problem
• Nice scaling property of the algorithm depends on being able to discriminate true from false triggers.
• False triggers are approximately zero, but at the edge of the solvable parameter space, both populations are more noisy
– Too little data (from simulation or measurement)
– Too much variability in link loss rates
– Too much dependence between link congestions, etc.
• Need to set threshold, group triggers and evaluate “goodness” of resulting topology.
[Figure: two sketches (1 and 2) of the distributions of false-trigger and true-trigger probabilities along an axis starting at 0]
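One simple way to set such a threshold, offered purely as a hypothetical heuristic and not as the project's method, is to sort the estimated trigger probabilities and cut at the widest gap. The trigger labels and values below are made up, with `~` marking an excluded path.

```python
def split_at_largest_gap(trigger_probs):
    """Place a threshold in the widest gap between sorted trigger estimates.

    Returns (threshold, set of triggers classified as true).
    Requires at least two triggers.
    """
    ranked = sorted(trigger_probs.items(), key=lambda item: item[1])
    gaps = [(ranked[i + 1][1] - ranked[i][1], i) for i in range(len(ranked) - 1)]
    _, i = max(gaps)
    threshold = (ranked[i][1] + ranked[i + 1][1]) / 2
    true_triggers = {name for name, _ in ranked[i + 1:]}
    return threshold, true_triggers

# Made-up noisy estimates: false triggers scatter around zero.
estimates = {"ab": 0.048, "a~b": 0.051, "~ab": 0.002, "abc": -0.001, "ab~c": 0.047}
threshold, true_triggers = split_at_largest_gap(estimates)
```

When the two populations overlap, as in the noisy case described above, no single gap separates them cleanly and the resulting topology would still need to be evaluated for goodness.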
Felix Project
General Discussion
• We can make use of the multicast idea (MINC project) to reduce load on the network: each source multicasts packets to all receivers.
– This will improve coincidence of measurements in time across all paths.
Felix Topology / Performance Inference
Applicability
• Does not replace “traditional” autodiscovery methods (SNMP)
• May augment autodiscovery in difficult environments:
– Military network under physical attack
– Military or commercial network under cyber-attack
– Network with buggy software (e.g. routing implementation)
– Multiple protocol layers, not all included in autodiscovery
– Protocols too old or new for the autodiscovery technology
• Good for observing networks not under your control
– Commercial context: ISP tries to locate fault between networks
– Military context: Map out foreign network
• Future networks will probably be more chaotic
– Track changing topology & performance with minimal extra load
Felix Project
Further Work
• Augment algorithms to work in a more fully realistic environment:
– Non-discrete time: congestion events with “ragged edges”
– Less stable routing (this is hard)
– Dependence in link congestion – cross traffic routed through net
– More volatile delay and loss patterns (most significant issue)
– Wider range of congestion rates; more erratic time dependence
• Variation with delay metric (instead of probability of congestion) is possible.
– Result would be bounds on mean, variance, (higher moments) of delay distribution on each link.
– Procedure is analogous (but not identical) to present algorithm.
• Progressive version of algorithm to update existing topology estimate based on continuous data.
• More experience with real data
Felix Correlation Method
Summary: Three Stages in Topology Discovery
[Figure: Packet Delay & Loss Data → Event Abstraction Algorithm → Event Time Series → Event Correlation Algorithm → Path-Link Matrix → Graph Construction Algorithm (“Matroid” Alg) → Network Graph]
• Reduced graph concept: limitation of observability
• Decomposition of topology/performance inference into separable problems
– Allows optimization and variation of algorithms at each stage
• Correlation Method:
– Uses entire time series of data for each path.
– Takes advantage of joint statistics across all paths
Felix Project
Extra Slides
Topology Discovery and Performance Assessment:
6 Methods
• “Matrix” method
– Evaluates “goodness” of topology, solves for link delay or loss
• Tree-growing method
– Composes topology as a tree, solves for link delays, goodness of fit.
• Spike-tail method
– Uses delay distributions to solve for link loads given topology.
• Correlation method
– Uses time-dependent delay data to find common path components.
• Matroid method
– Graph theoretic method - complements correlation method by solving from path-component list to topology
• Distance-Realization method
– Graph theoretic method - finds topologies rooted at each monitor and merges for complete system topology
Time Series Example A G
MWG Felix Project Jan01 52
Time Series Example G A
MWG Felix Project Jan01 53
Heavy-tailed Distribution of Packet Delay
MWG Felix Project Jan01 54
Clock Drift Correction
• Algorithm
– Compute lower envelope of time series in both directions.
– Shift lower envelopes so centered around zero.
– Compute “average” of envelopes (one flipped).
– Add/subtract average from original time series data.
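The four steps can be sketched as below. Several choices here are assumptions rather than details from the slides: the lower envelope is taken as a running minimum over a centered window, "centered around zero" is read as subtracting the mean, and the test data (a linear drift plus two isolated queueing spikes) are synthetic.

```python
def lower_envelope(series, half_window):
    """Running minimum over a centered window (one possible lower envelope)."""
    n = len(series)
    return [min(series[max(0, i - half_window):min(n, i + half_window + 1)])
            for i in range(n)]

def center(series):
    """Shift a series so that its mean is zero."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]

def correct_clock_drift(forward, reverse, half_window):
    """Remove clock drift from one-way delays measured in both directions.

    Drift raises delays in one direction and lowers them in the other, so the
    two centered envelopes are averaged with one of them flipped.
    """
    env_f = center(lower_envelope(forward, half_window))
    env_r = center(lower_envelope(reverse, half_window))
    drift = [(f - r) / 2 for f, r in zip(env_f, env_r)]
    fwd = [x - d for x, d in zip(forward, drift)]
    rev = [x + d for x, d in zip(reverse, drift)]
    return fwd, rev

# Synthetic data: 20 ms base delay, clock offset growing 0.5 ms per sample.
forward = [20 + 0.5 * t for t in range(20)]
reverse = [20 - 0.5 * t for t in range(20)]
forward[5] += 8    # isolated queueing spikes
reverse[12] += 6
fwd, rev = correct_clock_drift(forward, reverse, half_window=1)
```

After correction the baseline of each direction is nearly flat while the queueing spikes remain. A constant clock offset is not removed, only the drift, and the window must be long enough to bridge busy periods.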
Clock Drift Problem in One-way Delay Measurements
Time series data - adjusted delay from buzzard to brooklyn
Time series data - adjusted delay from brooklyn to buzzard
Felix Matroid Method
Summary
• Partial solution - goes with Correlation Method
• Input here is unordered “path-component” list
• 3 stages with increasing level of assumptions
• Clouds: Incomplete solution is still useful when uncertainty is geographically localized. Internet graphs usually have no clouds.
• Split nodes in solution - we can surely fix this problem.
• Monitor placement changes discovered graph -- also changes discoverable “reduced” graph
• Two examples - used GeorgiaTech code to generate realistic-looking Internet topologies
[Figure: example graph with nodes A, B, C, D, E, F]
Felix Matroid Method
Example of Reconstructed Network Graph
3150 nodes, WAN-MAN-LAN design
100 monitors, 187 nodes, 698 (unidirectional) links
74 split nodes, 2 clouds with 3 links each