Optimal estimation of audience size - Sophia
Parameter Estimation and Performance Analysis of Several Network Applications
Sara Alouf
Ph.D. defense - November 8, 2002
Advisor: Philippe Nain
Thesis topics
Adaptive unicast applications
Background: network does not offer guarantee
Objective: estimate network internal state
Large audience multicast applications
Background: need for membership estimates
Objective: efficiently track membership
Mobile code applications
Background: existence of several mechanisms for
communication between objects
Objective: determine the faster of two of them
Thesis topics
Adaptive unicast applications
Background: network does not offer guarantee
Objective: estimate network internal state
Challenges:
efficient congestion control, good QoS
Two distinct approaches:
adding intelligence to network
adding intelligence to applications
acquire some knowledge on network
change application policy accordingly
Adaptive unicast applications
[Figure: the source sends Poisson probes alongside the application data packets; both go through a queue of capacity K before reaching the sink]
Methodology:
source probes network
having feedback from destination, source measures
some performance metrics (e.g. loss probability, end-to-end delay, conditional loss probability, etc.)
given model for connection, metrics are expressed in
terms of network internal state
given performance metrics, source infers network
internal state
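The inference step above can be illustrated with a deliberately simplified sketch. It assumes a plain single-class M/M/1/K queue (not the M+M/M/1/K model with separate probe and cross traffic analysed in the thesis), uses the loss probability as the only measured metric, and inverts the textbook loss formula by bisection. The function names and the search range for the load are illustrative choices.

```python
def mm1k_loss_probability(load: float, capacity: int) -> float:
    """Stationary loss probability of a single-class M/M/1/K queue.

    load     = arrival rate / service rate
    capacity = K, the maximum number of packets in the system
    """
    if abs(load - 1.0) < 1e-12:
        return 1.0 / (capacity + 1)
    return (1.0 - load) * load**capacity / (1.0 - load**(capacity + 1))


def infer_load(measured_loss: float, capacity: int) -> float:
    """Invert the loss formula by bisection (the loss grows with the load)."""
    low, high = 1e-9, 10.0  # assumed search range for the load
    for _ in range(200):
        middle = 0.5 * (low + high)
        if mm1k_loss_probability(middle, capacity) < measured_loss:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)


if __name__ == "__main__":
    loss = mm1k_loss_probability(0.8, 10)
    print(f"loss at load 0.8: {loss:.5f}, inferred load: {infer_load(loss, 10):.5f}")
```

With two unknowns (for instance the cross-traffic rate and the buffer size), a second metric such as the response time would be needed, which is the situation the slides describe.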
Adaptive unicast applications
Main contributions:
Detailed analysis of the M+M/M/1/K queue
(expressions for 5 metrics of interest, including
loss-related conditional probabilities)
New analysis of the M+M/D/1/K queue (explicit
information on stationary distribution;
expressions for 3 metrics of interest)
Identification of “best” way of inferring network
internal characteristics:
use loss rate and network response time
given by M+M/M/1/K queue model
Large audience multicast applications
Motivation - Objective
Kalman filter
Wiener filter
Least square estimation
Extension
Motivation
Interesting multicast applications (distance
learning, video-conferences, events, radios,
televisions (?), live sports(?), etc.)
Membership is required for:
feedback suppression (RTP, SRM)
tuning amount of FEC packets for reliability
pricing
stopping transmission when no more receivers
and especially for radios and future TVs, to:
adapt transmission content, advertise, ...
Previous work
Scheme                       #ACKs needed   Previous estimate   Bias       Feedback implosion
Bolot, Turletti & Wakeman    at least one   no                  possible   no if N ≤ 2^16
Nonnenmacher & Biersack      at least one   yes                 yes        no
Friedman & Towsley           at least one   no                  no         no
Liu & Nonnenmacher           at least one   no                  possible   possible
Need for unbiased estimator that efficiently uses
previous estimates
Methodology
Receivers:
each S seconds, send an ACK to the source with probability p
Source:
periodically asks the receivers to send an ACK
with probability p every S seconds
stores $Y_n$, the number of ACKs received at time $nS$
Objective: use the noisy observation $Y_n$ to
estimate the membership $N_n = N(nS)$
Naive estimation
$\hat{N}_n = \dfrac{Y_n}{p}$
Drawbacks:
very noisy (s.l.l.n.: $\lim_{N \to \infty} Y/N = p$)
no profit from correlation (no use of previous
estimate)
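A minimal simulation sketch of the naive estimator, assuming a fixed audience whose receivers ACK independently with probability p. Under that assumption the number of ACKs is binomial, so the estimate Y/p has variance N(1-p)/p, which is why a small p gives a very noisy estimate. The function names and the audience size of 1000 are illustrative.

```python
import random


def count_acks(n_receivers: int, p: float, rng: random.Random) -> int:
    """Each receiver independently sends an ACK with probability p."""
    return sum(1 for _ in range(n_receivers) if rng.random() < p)


def naive_estimate(acks: int, p: float) -> float:
    """Naive estimator: number of ACKs divided by the ACK probability."""
    return acks / p


def empirical_spread(n_receivers: int, p: float, rounds: int, seed: int):
    """Mean and variance of the naive estimate over independent polling rounds."""
    rng = random.Random(seed)
    values = [naive_estimate(count_acks(n_receivers, p, rng), p) for _ in range(rounds)]
    mean = sum(values) / rounds
    variance = sum((v - mean) ** 2 for v in values) / rounds
    return mean, variance


if __name__ == "__main__":
    for p in (0.01, 0.50):
        mean, variance = empirical_spread(1000, p, rounds=500, seed=1)
        print(f"p={p:.2f}: mean={mean:.1f}, variance={variance:.1f}")
```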
[Figure: naive estimation, p = 0.01]
[Figure: naive estimation, p = 0.50]
EWMA estimation
$\hat{N}_{n,a} = a\,\hat{N}_{n-1,a} + (1-a)\,\dfrac{Y_n}{p}, \qquad 0 < a < 1$
Advantages:
use of previous estimate
no a priori information needed
Drawbacks:
what value for a ?
estimator does not depend on ACK interval S
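The EWMA recursion, each new estimate being a times the previous estimate plus (1 - a) times the naive estimate Y_n / p, can be sketched as follows. The starting value `initial` is an assumption of this sketch; the slides do not say how the recursion is initialised.

```python
def ewma_estimates(acks, p: float, a: float, initial: float):
    """Run the EWMA recursion over a sequence of ACK counts.

    acks    : ACK counts Y_1, Y_2, ...
    p       : probability with which each receiver sends an ACK
    a       : smoothing weight given to the previous estimate
    initial : assumed starting estimate
    """
    estimate = initial
    history = []
    for y in acks:
        estimate = a * estimate + (1.0 - a) * y / p
        history.append(estimate)
    return history


if __name__ == "__main__":
    print(ewma_estimates([50, 50], p=0.5, a=0.5, initial=0.0))
```

A larger a smooths more but reacts more slowly to membership changes, which is the tuning question raised in the drawbacks.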
EWMA estimation
Objective
Use optimal filtering techniques to find
estimator
Notation
$T_i$: join time of participant $i$
$T_i + D_i$: leave time of participant $i$
$N(t)$: number of participants at time $t$
$N(t) = \sum_{i \ge 1} \mathbf{1}\{T_i \le t < T_i + D_i\}$
Occupation process in the G/G/∞ queue
… not much is known about it …
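As a small illustration of the notation, the membership at time t is a count of the participants whose join time is at most t and whose leave time is after t. The sketch below assumes the interval is closed on the left and open on the right, which the extracted slide does not show explicitly.

```python
def membership(t: float, join_times, on_times) -> int:
    """Number of participants present at time t.

    join_times[i] is T_i and on_times[i] is D_i; participant i is counted
    when T_i <= t < T_i + D_i (boundary convention assumed).
    """
    return sum(1 for T, D in zip(join_times, on_times) if T <= t < T + D)


if __name__ == "__main__":
    joins, durations = [0.0, 1.0, 2.0], [5.0, 1.0, 10.0]
    print([membership(t, joins, durations) for t in (1.5, 2.0, 6.0)])
```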
M/M/∞ model - heavy traffic case
Assumptions:
Poisson arrival process, intensity $\lambda T$
exponential on-times, parameter $\mu$
Occupation process in the M/M/∞ queue
average membership: $\rho T = \lambda T / \mu$
Define normalized membership $Z_T(t) = \dfrac{N_T(t) - \rho T}{\sqrt{\rho T}}$
if $T \to \infty$, $Z_T(t) \to$ Ornstein-Uhlenbeck process
$X(t) = X(0)\,e^{-\mu t} + \sqrt{2\mu} \int_0^t e^{-\mu (t-u)}\, dB(u)$
$\{B(t), t \ge 0\}$: standard Brownian motion
Optimal estimation - Kalman filter
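Ornstein-Uhlenbeck process in discrete time
$X((n+1)S) = e^{-\mu S}\, X(nS) + \sqrt{2\mu} \int_{nS}^{(n+1)S} e^{-\mu((n+1)S - u)}\, dB(u)$
$\xi_{n+1} = \gamma\,\xi_n + w_n$
with $\xi_n = X(nS)$, $\gamma = e^{-\mu S}$
and $w_n = \sqrt{2\mu} \int_{nS}^{(n+1)S} e^{-\mu((n+1)S - u)}\, dB(u)$
$w_n$ are white noise with variance $Q = 1 - \gamma^2$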
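The value of the noise variance Q can be checked in one line. The sketch below assumes the reading of the slide in which $w_n = \sqrt{2\mu} \int_{nS}^{(n+1)S} e^{-\mu((n+1)S-u)}\, dB(u)$ and $\gamma = e^{-\mu S}$ (the Greek symbols are reconstructed from context), and applies the Itô isometry:

```latex
\begin{align*}
\mathrm{Var}(w_n)
  &= 2\mu \int_{nS}^{(n+1)S} e^{-2\mu((n+1)S - u)}\, du \\
  &= 2\mu \int_0^{S} e^{-2\mu s}\, ds \\
  &= 1 - e^{-2\mu S} \;=\; 1 - \gamma^2 \;=\; Q .
\end{align*}
```

The integrals over disjoint intervals of the Brownian motion are independent, which is why the sequence $w_n$ is white.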
Optimal estimation - Kalman filter
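Number of ACKs at step $n$: $Y_n$
Define normalized measurement
$M_n = \dfrac{Y_n - p\,\rho T}{\sqrt{\rho T}}, \quad n = 0, 1, \ldots$
$M_n = \underbrace{\dfrac{Y_n - p\,N_T(nS)}{\sqrt{\rho T}}}_{V_T(n)} + p\,\underbrace{\dfrac{N_T(nS) - \rho T}{\sqrt{\rho T}}}_{Z_T(nS)}$
Weak limit $T \to \infty$:
$m_n = p\,\xi_n + v_n$
$v_n$ are white noise with variance $R = p(1-p)$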
Optimal estimation - Kalman filter
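Stationary version
Optimal filter: minimal mean-square error
System dynamics: $\xi_{n+1} = \gamma\,\xi_n + w_n$
Measurement: $m_n = p\,\xi_n + v_n$
$w_n$ and $v_n$: white noise with variances $Q$ and $R$
Error variance: $P = \left( [\gamma^2 P + Q]^{-1} + p^2 / R \right)^{-1}$
Filter gain: $K = P p / R$
State estimator: $\hat{\xi}_n = \underbrace{\gamma\,\hat{\xi}_{n-1}}_{\text{prediction}} + \underbrace{K\,(m_n - p\,[\gamma\,\hat{\xi}_{n-1}])}_{\text{update}}$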
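The stationary gain has to be obtained from the error-variance equation, which defines P only implicitly. A minimal sketch, assuming the reading P = 1 / (1 / (gamma^2 P + Q) + p^2 / R) with gamma = exp(-mu S), Q = 1 - gamma^2 and R = p(1 - p), is to iterate that equation until it settles. Fixed-point iteration is a choice made here for simplicity; the slides do not say how P was computed.

```python
import math


def steady_state_kalman(mu: float, S: float, p: float, iterations: int = 10000):
    """Iterate the error-variance equation to its fixed point.

    Returns (gamma, P, K) with gamma = exp(-mu*S), Q = 1 - gamma^2, R = p(1-p).
    Requires 0 < p < 1.
    """
    gamma = math.exp(-mu * S)
    Q = 1.0 - gamma**2
    R = p * (1.0 - p)
    P = 1.0  # arbitrary positive starting value
    for _ in range(iterations):
        predicted = gamma**2 * P + Q            # variance after the prediction step
        P = 1.0 / (1.0 / predicted + p**2 / R)  # variance after the update step
    K = P * p / R
    return gamma, P, K


def kalman_step(xi_prev: float, m: float, gamma: float, K: float, p: float) -> float:
    """One filter step: predict gamma*xi_prev, then correct with the innovation."""
    prediction = gamma * xi_prev
    return prediction + K * (m - p * prediction)


if __name__ == "__main__":
    print(steady_state_kalman(mu=0.5, S=1.0, p=0.1))
```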
Optimal estimation - Kalman filter
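$\hat{\xi}_n$: estimator of $\xi_n$
Define $\hat{N}_n$, estimator of $N_n$: $\hat{N}_n = \hat{\xi}_n \sqrt{\rho T} + \rho T$
Finally
$\hat{N}_n = \gamma(1 - Kp)\,\hat{N}_{n-1} + K\,Y_n + \rho T\,(1 - \gamma)(1 - Kp)$
$Y_n$: number of ACKs at the $n$th observation step
$\rho T$ and $\gamma$ assumed known
For comparison, EWMA estimator: $\hat{N}_{n,a} = a\,\hat{N}_{n-1,a} + (1-a)\,\dfrac{Y_n}{p}$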
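The final recursion can be run directly on the ACK counts. The sketch below assumes the reading in which the new estimate is gamma(1 - Kp) times the previous estimate, plus K times Y_n, plus the constant rho T (1 - gamma)(1 - Kp); the gain K is taken as given, and starting from the mean membership is an assumption of this sketch.

```python
def track_membership(acks, p: float, gamma: float, K: float,
                     mean_membership: float, initial=None):
    """Membership estimates from ACK counts with a fixed (stationary) gain K.

    mean_membership stands for rho*T, the average audience size.
    """
    estimate = mean_membership if initial is None else initial
    weight = gamma * (1.0 - K * p)                         # weight of previous estimate
    offset = mean_membership * (1.0 - gamma) * (1.0 - K * p)
    history = []
    for y in acks:
        estimate = weight * estimate + K * y + offset
        history.append(estimate)
    return history


if __name__ == "__main__":
    print(track_membership([100, 120, 90], p=0.1, gamma=0.9, K=1.0, mean_membership=1000.0))
```

A quick sanity check of the constant term: when every observation equals its mean p times rho T and the recursion starts at rho T, the estimate stays at rho T.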
To summarize
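System state (continuous time / discrete time):
$N_T(t)$ / $N_n = N_T(nS)$
$Z_T(t)$ / $Z_n = Z_T(nS)$
$X(t)$ / $\xi_n = X(nS)$, with $\xi_{n+1} = \gamma\,\xi_n + w_n$
Measurement:
$Y_n$
$M_n = p\,Z_n + V_T(n)$
$m_n = p\,\xi_n + v_n$
Estimation:
$\hat{N}_n = \gamma(1 - Kp)\,\hat{N}_{n-1} + K\,Y_n + \rho T\,(1 - \gamma)(1 - Kp)$
$\hat{\xi}_n = \gamma\,\hat{\xi}_{n-1} + K\,(M_n - p\,[\gamma\,\hat{\xi}_{n-1}])$
Kalman filter: $\hat{\xi}_n = \gamma\,\hat{\xi}_{n-1} + K\,(m_n - p\,[\gamma\,\hat{\xi}_{n-1}])$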
Simulations
Objective: validate model
Assumptions made in theory
Poisson arrivals
Exponential on-times
Heavy-traffic regime
Simulations:
2 regimes investigated: light load/heavy-load
2 distributions: Exponential/Pareto
8 different scenarios simulated
Validation with real traces
Objective: further validate model
Robustness to “real” distributions?
Independence-related assumptions are
violated
Distribution of traces investigated
              Best fit for inter-arrivals sequence   Best fit for on-times sequence
Short audio   Weibull                                Weibull
Long audio    Lognormal                              Lognormal
Membership in real traces vs. time
Objective
Find optimal estimator under more general
assumptions
M/G/∞ model
Assumptions:
Poisson arrival process, intensity $\lambda$
on-times have common probability distribution
$D$ denotes a generic random variable
Occupation process in the M/G/∞ queue
Characteristics of $N(t)$ in steady-state:
Poisson random variable, Mean = Variance = $\rho = \lambda E[D]$
Autocorrelation function
$\mathrm{Cov}(N(t), N(t+h)) = \lambda \int_h^{\infty} P(D > u)\, du$
Notation: $\mathrm{Cov}_N(k) = \mathrm{Cov}(N(nS), N((n+k)S))$
Optimal estimation - Wiener filter
[Diagram: the noisy observation $Y_n$, centered as $y_n$, is fed to the Wiener filter $H_o(z)$]
Optimal linear filter: minimal mean-square error
Optimal estimation - Wiener filter
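Introduce (with centered processes $\nu_n = N_n - \rho$ and $y_n = Y_n - p\rho$, so that $\mathrm{Cov}_\nu(k) = \mathrm{Cov}_N(k)$):
cross-power spectrum of $y_n$ and $\nu_n$: $S_{\nu y}(z) = \sum_{k=-\infty}^{\infty} \mathrm{Cov}_{\nu y}(k)\, z^{-k}$
z-transform of $\mathrm{Cov}_y(k)$: $S_y(z) = \sum_{k=-\infty}^{\infty} \mathrm{Cov}_y(k)\, z^{-k}$
We have:
$\mathrm{Cov}_{\nu y}(k) = p\,\mathrm{Cov}_\nu(k)$
$\mathrm{Cov}_y(k) = p^2\,\mathrm{Cov}_\nu(k) + \mathbf{1}\{k = 0\}\,\rho\,p(1-p)$
Canonical factorization: $S_y(z) = G(z)\,G(z^{-1})$
Compute
$H(z) = \left[ \dfrac{S_{\nu y}(z)}{G(z^{-1})} \right]_+$ (causal part)
$H_o(z) = \dfrac{H(z)}{G(z)}$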
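Application to M/M/∞ model
When $D \sim \mathrm{Exp}(\mu)$: $\mathrm{Cov}_\nu(k) = \rho\,\gamma^{|k|}$, $\gamma = \exp(-\mu S)$
We find $H_o(z) = \dfrac{B}{1 - A z^{-1}}$
where
$A = \dfrac{1 + \gamma^2(1 - 2p) - \sqrt{(1 - \gamma^2)\,(1 - \gamma^2 (1 - 2p)^2)}}{2\gamma(1 - p)}$
$B = \dfrac{\sqrt{(1 - \gamma^2)\,(1 - \gamma^2 (1 - 2p)^2)} - (1 - \gamma^2)}{2\gamma^2\, p(1 - p)}$
Transfer function: $H_o(z) = \dfrac{B}{1 - A z^{-1}}$
Impulse response: $\hat{\nu}_n = A\,\hat{\nu}_{n-1} + B\,y_n$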
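Application to M/M/∞ model
Centered processes: $\hat{\nu}_n = A\,\hat{\nu}_{n-1} + B\,y_n$
Non-centered processes:
$\hat{N}_n = A\,\hat{N}_{n-1} + B\,Y_n + \rho\,(1 - A - pB)$
Mean square error
$\epsilon_{\min} = E\left[(N_n - \hat{N}_n)^2\right] = \rho \left( 1 - \dfrac{Bp}{1 - A\gamma} \right)$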
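The coefficients of the first-order filter and its mean-square error can be evaluated numerically. The closed forms coded below are a reconstruction, re-derived from the stationary Kalman equations with gamma = exp(-mu S), Q = 1 - gamma^2 and R = p(1 - p), so they should be treated as an assumption rather than as a verbatim copy of the slide. The function names are illustrative.

```python
import math


def wiener_coefficients(mu: float, S: float, p: float):
    """Return (gamma, A, B) for the filter  nu_hat_n = A*nu_hat_{n-1} + B*y_n."""
    gamma = math.exp(-mu * S)
    g2 = gamma * gamma
    root = math.sqrt((1.0 - g2) * (1.0 - g2 * (1.0 - 2.0 * p) ** 2))
    A = (1.0 + g2 * (1.0 - 2.0 * p) - root) / (2.0 * gamma * (1.0 - p))
    B = (root - (1.0 - g2)) / (2.0 * g2 * p * (1.0 - p))
    return gamma, A, B


def min_mean_square_error(rho: float, mu: float, S: float, p: float) -> float:
    """Mean-square error rho * (1 - B*p / (1 - A*gamma)) of the filter."""
    gamma, A, B = wiener_coefficients(mu, S, p)
    return rho * (1.0 - B * p / (1.0 - A * gamma))


if __name__ == "__main__":
    print(wiener_coefficients(mu=0.5, S=1.0, p=0.1))
    print(min_mean_square_error(rho=100.0, mu=0.5, S=1.0, p=0.1))
```

Two relations tie these values to the Kalman form: A = gamma (1 - Bp), and the error equals rho B (1 - p). Both hold numerically for the values above.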
Kalman filter vs. Wiener filter
Estimators are the same!
But
Kalman filter: M/M/∞ queue, heavy traffic
Wiener filter: M/M/∞ queue
so we relaxed one assumption
Optimal first-order linear filter
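Find $A \in [0, 1)$ and $B$ such that
$\hat{\nu}_n = A\,\hat{\nu}_{n-1} + B\,y_n$
mean-square error $\epsilon = E\left[(\nu_n - \hat{\nu}_n)^2\right]$ minimized
Steady-state: $\hat{\nu}_n = B \sum_{k=0}^{\infty} A^k\, y_{n-k}$
Minimize
$\epsilon = \rho - 2pB\,g(A) + \dfrac{pB^2}{1 - A^2} \left( 2p\,g(A) + \rho\,(1 - 2p) \right)$
where
$g(z) = \sum_{k=0}^{\infty} z^k\, \mathrm{Cov}_\nu(k)$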
Optimal first-order linear filter
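System to solve: $\dfrac{\partial \epsilon}{\partial A} = 0$, $\dfrac{\partial \epsilon}{\partial B} = 0$
Solution is unique
$D \sim \mathrm{Exp}(\mu)$: same solution as Wiener filter
$D \sim \mathrm{Hyperexponential}(L;\ \mu_i, p_i,\ i = 1, \ldots, L)$: numerical solving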
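One possible way to carry out the numerical solving for hyperexponential on-times is sketched below. It assumes the error expression eps = rho - 2pB g(A) + p B^2 / (1 - A^2) * (2p g(A) + rho(1 - 2p)) with g(A) the sum over k >= 0 of A^k Cov(k), and the M/G/∞ covariance Cov(k) = lambda * sum_i (w_i / mu_i) exp(-mu_i k S). For a fixed A the best B has a closed form, so only a one-dimensional search over A remains; the grid plus golden-section search is a choice made here, not the method used in the thesis. The `branches` argument (pairs of branch probability and rate) is a hypothetical interface.

```python
import math


def g_hyperexp(A: float, lam: float, S: float, branches) -> float:
    """g(A) = sum_k A^k Cov(k), summed in closed form as a geometric series."""
    return lam * sum(w / mu / (1.0 - A * math.exp(-mu * S)) for w, mu in branches)


def best_B(A: float, p: float, rho: float, g: float) -> float:
    """B that cancels the derivative of the error with respect to B, for fixed A."""
    return g * (1.0 - A * A) / (2.0 * p * g + rho * (1.0 - 2.0 * p))


def mse(A: float, B: float, p: float, rho: float, g: float) -> float:
    return rho - 2.0 * p * B * g + p * B * B / (1.0 - A * A) * (2.0 * p * g + rho * (1.0 - 2.0 * p))


def optimal_first_order_filter(lam: float, S: float, p: float, branches):
    """Return (A, B, error) minimising the mean-square error over A in [0, 0.999]."""
    rho = lam * sum(w / mu for w, mu in branches)  # mean membership lam * E[D]

    def error_at(A: float) -> float:
        g = g_hyperexp(A, lam, S, branches)
        return mse(A, best_B(A, p, rho, g), p, rho, g)

    grid = [i / 1000 for i in range(1000)]
    i_best = min(range(len(grid)), key=lambda i: error_at(grid[i]))
    low = grid[max(i_best - 1, 0)]
    high = grid[min(i_best + 1, len(grid) - 1)]
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):  # golden-section refinement around the best grid point
        c = high - ratio * (high - low)
        d = low + ratio * (high - low)
        if error_at(c) < error_at(d):
            high = d
        else:
            low = c
    A = 0.5 * (low + high)
    g = g_hyperexp(A, lam, S, branches)
    B = best_B(A, p, rho, g)
    return A, B, mse(A, B, p, rho, g)


if __name__ == "__main__":
    # Two-branch hyperexponential on-times (illustrative parameters).
    print(optimal_first_order_filter(lam=10.0, S=1.0, p=0.1, branches=[(0.7, 1.0), (0.3, 0.1)]))
```

With a single branch the on-times are exponential, and the search should return the same A and B as the Wiener filter for the M/M/∞ model.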
Validation with real traces
Distribution of inter-arrivals and on-times
Best fit for inter-arrivals sequence | Best fit for on-times sequence
video1
video2
video3
video4
Lognormal
Lognormal
Weibull
Weibull
Weibull
Weibull
Lognormal
Weibull
Mean & variance of the error $N_n - \hat{N}_n$
video1 Nˆ nE
Mean Variance
min, min
0.1121 12.6641 13.9424
0.0469 12.8508 12.1198 1.1504
0.0062
0.4947 1.4068
video2
Nˆ nH 2
Nˆ E
video3
Nˆ nH 2
Nˆ E
n
0.0188
0.0373
0.7851
0.2065
0.3955 3.5570
0.7370
video4
Nˆ nH 2
Nˆ E
n
0.0194
0.0523
0.2291
0.9105
0.2084 3.5365
1.5656
Nˆ nH 2
0.0651
1.4231
0.6755 2.3177
n
empirical
theoretical
And the winner is …
Estimator $\hat{N}_n^E$!
Advantages:
optimal for M/M/∞ queue
efficient over real traces
only two parameters required
Drawbacks:
a priori knowledge needed
Extension
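How to estimate $\rho$ and $\mu$ (which gives $\gamma = e^{-\mu S}$)?
estimate $\lambda$ and $\rho$ (recall $\rho = \lambda / \mu$)
Receivers:
on arrival, send "hello" message with prob. $q$
Source:
records time of receipt $t_m$ of $m$th hello message
estimates $\lambda$ as $\hat{\lambda} = \dfrac{m}{q\, t_m}$ (MLE)
estimates $\rho$ as $\hat{\rho} = E[\hat{N}_n]$ (initially $\hat{\rho} = E[Y_n / p]$)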
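A minimal sketch of these a priori estimates, assuming that hello messages are recorded from time 0, that the arrival rate is estimated as m / (q t_m), and that the expectation defining the mean membership is approximated by a sample average (that last step is an assumption of this sketch). The on-time parameter then follows from the relation mean membership = arrival rate / on-time rate. The function names are illustrative.

```python
def estimate_arrival_rate(hello_times, q: float) -> float:
    """MLE of the arrival rate from receipt times of hello messages.

    hello_times : increasing receipt times t_1 < ... < t_m, measured from time 0
    q           : probability that an arriving receiver sends a hello message
    """
    m = len(hello_times)
    return m / (q * hello_times[-1])


def estimate_mean_membership(membership_estimates) -> float:
    """Sample average standing in for the expectation of the membership estimate."""
    return sum(membership_estimates) / len(membership_estimates)


def estimate_on_time_rate(arrival_rate: float, mean_membership: float) -> float:
    """On-time parameter recovered from mean membership = arrival rate / on-time rate."""
    return arrival_rate / mean_membership


if __name__ == "__main__":
    lam_hat = estimate_arrival_rate([1.0, 2.0, 4.0], q=0.5)
    rho_hat = estimate_mean_membership([90.0, 110.0])
    print(lam_hat, rho_hat, estimate_on_time_rate(lam_hat, rho_hat))
```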
Large audience multicast applications
Main contributions:
Proposition of several unbiased estimators that
efficiently track membership
Validation through simulated and real traces
Identification of “best” estimator among those
proposed
Proposition of estimators for a priori parameters
Mobile code applications
Code mobility paradigm
Forwarders mechanism
Centralized mechanism
Simulations & experiments
Contributions
Code mobility paradigm
Definition:
components of application might change host
(migrate) during execution
Utility:
load balancing
data mining (data available on different hosts)
e-commerce (find the cheapest airline fare)
Issue:
ensure communications with mobile objects
Code mobility paradigm
Two widely used solutions:
distributed approach (use forwarders)
centralized approach (use server)
Objective: identify best approach in terms
of response time
Forwarders mechanism: description
[Figure in four steps; legend: S = Source, O = mobile Object, F = Forwarder; Hosts A, B, C, D]
Step 1: S (on Host A) holds a reference to O.
Step 2: each time O migrates, it leaves a forwarder F on the host it quits; a message from S is forwarded from forwarder to forwarder until it reaches O.
Step 3: an update gives S a reference to the current location of O.
Step 4: subsequent messages use new reference
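The forwarding chain and the reference update can be mimicked with a toy sketch. The classes below are hypothetical and purely sequential: they ignore network delays and messages sent while the object is migrating, which are precisely what the Markov model of the thesis accounts for.

```python
class MobileObject:
    """Hypothetical mobile object that leaves a forwarder on every host it quits."""

    def __init__(self, host: str):
        self.host = host
        self.forwarders = {}  # host that was left -> host the object moved to

    def migrate(self, new_host: str) -> None:
        self.forwarders[self.host] = new_host
        self.host = new_host


class Source:
    """Hypothetical source holding a possibly outdated reference to the object."""

    def __init__(self, obj: MobileObject):
        self.reference = obj.host

    def send(self, obj: MobileObject) -> int:
        """Deliver one message; return how many forwarders it went through."""
        host, hops = self.reference, 0
        while host != obj.host:
            host = obj.forwarders[host]  # forwarding step
            hops += 1
        self.reference = host  # update: later messages go straight to the object
        return hops


if __name__ == "__main__":
    obj = MobileObject("B")
    src = Source(obj)
    obj.migrate("C")
    obj.migrate("D")
    print(src.send(obj), src.send(obj))
```

The first message after two migrations crosses two forwarders; the next one crosses none because the reference has been updated.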
Centralized mechanism: description
[Figure in four steps; legend: S = Source, O = mobile Object; Hosts A, B, C, D and a Server]
Step 1: S (on Host A) holds a reference to O.
Step 2: when O migrates, an Update is sent to the Server.
Step 3: a Message sent by S with an outdated reference fails.
Step 4: S sends a Query to the Server, receives the location of O in a Reply, and sends the Message again.
! Object may have moved in the meantime
Centralized mechanism: the server
[Figure: server handling events from O and S; the server may need to send Reply after processing request from Source]
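A toy sketch of the centralized scheme, with hypothetical class names. It is sequential, so the case flagged on the slide (the object moving again between the Reply and the retransmitted Message) shows up only as an extra pass through the retry loop; timing is not modelled.

```python
class LocationServer:
    """Hypothetical server that records the current host of each mobile object."""

    def __init__(self):
        self.location = {}

    def update(self, name: str, host: str) -> None:
        self.location[name] = host

    def query(self, name: str) -> str:
        return self.location[name]


class MobileObject:
    def __init__(self, name: str, host: str, server: LocationServer):
        self.name, self.host, self.server = name, host, server
        server.update(name, host)

    def migrate(self, new_host: str) -> None:
        self.host = new_host
        self.server.update(self.name, new_host)  # Update sent to the server


class Source:
    def __init__(self, obj: MobileObject):
        self.reference = obj.host

    def send(self, obj: MobileObject, server: LocationServer) -> int:
        """Deliver one message; return the number of server queries it needed."""
        queries = 0
        while self.reference != obj.host:  # message fails: reference is outdated
            self.reference = server.query(obj.name)  # Query, then Reply
            queries += 1
        return queries


if __name__ == "__main__":
    server = LocationServer()
    obj = MobileObject("O", "B", server)
    src = Source(obj)
    obj.migrate("C")
    obj.migrate("D")
    print(src.send(obj, server), src.send(obj, server))
```

Unlike the forwarder chain, the cost of a stale reference here does not grow with the number of migrations: one query suffices however many times the object has moved.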
Mobile code applications
Forwarders mechanism:
infinite state-space Markov chain
expression for expected response time TF
expression for expected number of forwarders
Centralized mechanism:
finite state-space Markov chain
expression for expected response time TS
Models validated through simulations and
experiments (LAN & MAN)
[Figure: Forwarder, LAN (100 Mb/s). Mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments vs. model]
[Figure: Server, LAN (100 Mb/s). Mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments vs. model]
[Figure: Forwarder, MAN (7 Mb/s). Mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments vs. model]
[Figure: Server, MAN (7 Mb/s). Mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments vs. model]
Overall performance is fair, so the
models can safely be
used for performance evaluation
Mobile code applications
Main contributions:
Proposition of Markovian models for two
communication mechanisms
Validation through simulations and experiments
(LAN & MAN)
Theoretical comparison:
prediction of fastest mechanism in general
Conclusion
General methodology
Propose mathematical models for system at hand
Derive metrics of interest or estimators under
models assumptions
Validate models via simulations and/or
experiments
Simple tools applicable over wide range of
applications
Conclusion
Optimal filtering techniques
estimation of RTT in TCP protocol
estimation of average queue size in RED routers
…
Performance analysis tools
very useful in design of mobile code applications
(high cost of implementation)
protocol evaluation
…
Thank you!