Optimal estimation of audience size - Sophia


Parameter Estimation and
Performance Analysis
of Several Network Applications
Sara Alouf
Ph.D. defense - November 8, 2002
Advisor: Philippe Nain
Thesis topics
Adaptive unicast applications
Background: network does not offer guarantee
 Objective: estimate network internal state

Large audience multicast applications
Background: need for membership estimates
 Objective: efficiently track membership

Mobile code applications
Background: existence of several mechanisms for
objects communication
 Objective: determine fastest among two of them

Adaptive unicast applications
Background: network does not offer guarantee
 Objective: estimate network internal state

Challenges:

efficient congestion control, good QoS
Two distinct approaches:
adding intelligence to network
 adding intelligence to applications
 acquire some knowledge on network
 change application policy accordingly

Adaptive unicast applications
[Figure: the source sends Poisson probes and application data packets through a queue of size K to the sink]
Methodology:
- source probes network
- having feedback from destination, source measures some performance metrics (e.g. loss probability, end-to-end delay, conditional loss probability, etc.)
- given model for connection, metrics are expressed in terms of network internal state
- given performance metrics, source infers network internal state
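The last two steps (express a metric through the model, then invert it) can be illustrated on a deliberately simplified case. The sketch below assumes a plain M/M/1/K queue with a known size K and a single unknown, the load `rho`; the function names are hypothetical, and the thesis itself relies on the richer M+M/M/1/K and M+M/D/1/K models and on several metrics.

```python
def mm1k_loss_probability(rho: float, K: int) -> float:
    """Stationary probability that an M/M/1/K queue (K places in total) is full."""
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (K + 1)  # limit of the general formula as rho -> 1
    return (1.0 - rho) * rho**K / (1.0 - rho ** (K + 1))


def infer_load(measured_loss: float, K: int, tol: float = 1e-12) -> float:
    """Invert the loss formula by bisection (the loss grows with the load)."""
    lo, hi = 1e-9, 50.0  # assumed search interval for rho
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mm1k_loss_probability(mid, K) < measured_loss:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    loss = mm1k_loss_probability(0.8, K=10)  # plays the role of the measured metric
    print(infer_load(loss, K=10))            # recovers a load close to 0.8
```

With two unknowns (for instance load and buffer size), two metrics would be needed, which is consistent with the use of loss rate and response time together.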
Adaptive unicast applications
Main contributions:



Detailed analysis of the M+M/M/1/K queue
(expressions for 5 metrics of interest, including
loss-related conditional probabilities)
New analysis of the M+M/D/1/K queue (explicit
information on stationary distribution;
expressions for 3 metrics of interest)
Identification of “best” way of inferring network
internal characteristics:
use loss rate and network response time
given by M+M/M/1/K queue model

Large audience multicast applications

Motivation - Objective

Kalman filter

Wiener filter

Least square estimation

Extension
Motivation


Interesting multicast applications (distance
learning, video-conferences, events, radios,
televisions (?), live sports(?), etc.)
Membership is required for:
feedback suppression (RTP, SRM)
 tuning amount of FEC packets for reliability
 pricing
 stopping transmission when no more receivers

and especially for radios and future TVs, to:

adapt transmission content, advertise, ...
Previous work
Authors                   | #ACKs needed | Previous estimate | Bias     | Feedback implosion
Bolot, Turletti & Wakeman | at least one | no                | possible | no if N ≤ 2^16
Nonnenmacher & Biersack   | at least one | yes               | yes      | no
Friedman & Towsley        | at least one | no                | no       | no
Liu & Nonnenmacher        | at least one | no                | possible | possible
 Need for unbiased estimator that efficiently uses
previous estimates
Methodology

Source:
- periodically requests receivers to send an ACK with probability p every S seconds
- stores $Y_n$, the number of ACKs received at time nS
Receivers:
- every S seconds, send an ACK to the source with probability p
Objective: use the noisy observation $Y_n$ to estimate the membership $N_n = N(nS)$
Naive estimation
\[ \hat{N}_n = \frac{Y_n}{p} \]
Drawbacks:
- very noisy (s.l.l.n.: $\lim_{N \to \infty} Y/N = p$)
- no profit from correlation (no use of previous estimate)
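To make the noise of the naive estimator concrete, here is a small sketch. It assumes that each of N members answers independently with probability p, so that Y is binomial and the variance of Y/p is N(1 - p)/p; the function names and the simulation setup are illustrative only.

```python
import random


def naive_estimate(num_acks: int, p: float) -> float:
    """Naive membership estimate: number of ACKs divided by the ACK probability."""
    return num_acks / p


def naive_variance(n_members: int, p: float) -> float:
    """Variance of Y/p when Y ~ Binomial(n_members, p): N (1 - p) / p."""
    return n_members * (1.0 - p) / p


def simulate_naive(n_members: int, p: float, rounds: int, seed: int = 0) -> list:
    """Draw independent polling rounds and return the naive estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(rounds):
        acks = sum(1 for _ in range(n_members) if rng.random() < p)
        estimates.append(naive_estimate(acks, p))
    return estimates


if __name__ == "__main__":
    for p in (0.01, 0.50):
        est = simulate_naive(n_members=1000, p=p, rounds=500)
        mean = sum(est) / len(est)
        var = sum((e - mean) ** 2 for e in est) / len(est)
        # empirical variance next to the binomial prediction
        print(p, round(mean, 1), round(var, 1), naive_variance(1000, p))
```

Under this assumption the variance for p = 0.01 is about a hundred times larger than for p = 0.50, which is the trade-off between estimation noise and ACK traffic.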

Naive estimation : p = 0.01
Naive estimation : p = 0.50
EWMA estimation
\[ \hat{N}_{n,a} = a \hat{N}_{n-1,a} + (1 - a) \frac{Y_n}{p}, \qquad 0 < a < 1 \]
Advantages:
use of previous estimate
 no a priori information needed

Drawbacks:
what value for a ?
 estimator does not depend on ACK interval S
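The EWMA recursion is short enough to write out directly. The sketch below is an illustration, not the implementation used in the thesis; the function name and the `initial` starting value are assumptions.

```python
def ewma_estimates(acks, p, a, initial=0.0):
    """Return the EWMA membership estimates for a sequence of ACK counts.

    Each step blends the previous estimate (weight a) with the naive
    estimate Y_n / p (weight 1 - a).
    """
    if not 0.0 < a < 1.0:
        raise ValueError("smoothing weight a must lie strictly between 0 and 1")
    estimates = []
    previous = initial
    for y in acks:
        previous = a * previous + (1.0 - a) * y / p
        estimates.append(previous)
    return estimates


if __name__ == "__main__":
    # constant ACK counts: the estimate climbs towards 10 / 0.1 = 100
    print(ewma_estimates([10, 10, 10, 10, 10], p=0.1, a=0.5))
```

Nothing in the recursion involves the ACK interval S, which is the second drawback listed above: the same weight a is applied whether successive observations are strongly or weakly correlated.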

EWMA estimation
Objective
Use optimal filtering techniques to find
estimator
Notation
- $T_i$: join time of participant i
- $T_i + D_i$: leave time of participant i
- $N(t)$: number of participants at time t
\[ N(t) = \sum_{i \ge 1} \mathbf{1}\{T_i \le t < T_i + D_i\} \]
Occupation process in the G/G/∞ queue
… not much is known about it …
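The definition of N(t) as a count of participants whose stay covers time t translates directly into code. The sketch assumes the interval convention T_i <= t < T_i + D_i; the function name is hypothetical.

```python
def membership(t, join_times, durations):
    """Number of participants present at time t.

    Participant i is counted when join_times[i] <= t < join_times[i] + durations[i].
    """
    return sum(1 for T, D in zip(join_times, durations) if T <= t < T + D)


if __name__ == "__main__":
    joins = [0.0, 1.0, 2.5]
    stays = [2.0, 5.0, 1.0]
    print([membership(t, joins, stays) for t in (0.5, 1.5, 3.0, 7.0)])  # [1, 2, 2, 0]
```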

M/M/ model - heavy traffic case

Assumptions:


Poisson arrival process, intensity T
exponential on-times, parameter 
 Occupation process in the M/M/ queue
T
average membership: T =


NT t   T
Define normalized membership ZT t  =
T
if T  , ZT(t)  Ornstein-Ühlenbeck process
t   t u 
 t
X t  = X 0e  2 e
dBu 
0
{B(t), t  0} standard Brownian motion
Optimal estimation - Kalman filter

Ornstein-Uhlenbeck process in discrete time
\[ X((n+1)S) = e^{-\mu S} X(nS) + \sqrt{2\lambda} \int_{nS}^{(n+1)S} e^{-\mu ((n+1)S - u)} \, dB(u) \]
\[ \xi_{n+1} = \gamma \xi_n + w_n \]
with $\xi_n = X(nS)$, $\gamma = e^{-\mu S}$ and
\[ w_n = \sqrt{2\lambda} \int_{nS}^{(n+1)S} e^{-\mu ((n+1)S - u)} \, dB(u) \]
The $w_n$ are white noise with variance $Q = \rho (1 - \gamma^2)$
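A quick way to check the discrete-time recursion is to simulate it and compare the empirical variance with the stationary value Q / (1 - gamma^2) = rho. The sketch below assumes Gaussian noise, rho = lam / mu and gamma = exp(-mu * S); the function name and the parameter values are illustrative.

```python
import math
import random


def simulate_discrete_ou(mu, lam, S, steps, seed=0):
    """Sample xi_{n+1} = gamma * xi_n + w_n with w_n ~ Normal(0, Q)."""
    rho = lam / mu
    gamma = math.exp(-mu * S)
    Q = rho * (1.0 - gamma**2)
    rng = random.Random(seed)
    xi = rng.gauss(0.0, math.sqrt(rho))  # start in the stationary regime
    path = [xi]
    for _ in range(steps):
        xi = gamma * xi + rng.gauss(0.0, math.sqrt(Q))
        path.append(xi)
    return path, gamma, Q


if __name__ == "__main__":
    path, gamma, Q = simulate_discrete_ou(mu=1.0, lam=4.0, S=0.5, steps=20000)
    empirical = sum(x * x for x in path) / len(path)
    print(empirical, Q / (1.0 - gamma**2))  # both should be close to rho = 4
```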
Optimal estimation - Kalman filter

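Number of ACKs at step n: $Y_n$

Define the normalized measurement
\[ M_n = \frac{Y_n - p \rho T}{\sqrt{T}} = p \underbrace{\frac{N_T(nS) - \rho T}{\sqrt{T}}}_{Z_T(nS)} + \underbrace{\frac{Y_n - p N_T(nS)}{\sqrt{T}}}_{V_T(n)}, \qquad n = 0, 1, \ldots \]

Weak limit $T \to \infty$:
\[ m_n = p \xi_n + v_n \]
The $v_n$ are white noise with variance $R = \rho p (1 - p)$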
Optimal estimation - Kalman filter

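Stationary version

Optimal filter: minimal mean-square error
System dynamics: $\xi_{n+1} = \gamma \xi_n + w_n$
Measurement: $m_n = p \xi_n + v_n$
$w_n$ and $v_n$: white noise with variances $Q$ and $R$
Error variance: $P = \left( [\gamma^2 P + Q]^{-1} + p^2 / R \right)^{-1}$
Filter gain: $K = P p / R$
State estimator:
\[ \hat{\xi}_n = \underbrace{\gamma \hat{\xi}_{n-1}}_{\text{prediction}} + \underbrace{K (m_n - p[\gamma \hat{\xi}_{n-1}])}_{\text{update}} \]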
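The stationary error variance is defined implicitly, as the fixed point of P = ([gamma^2 P + Q]^-1 + p^2/R)^-1. One simple way to obtain it, sketched below, is to iterate that map until it stops moving. The sketch assumes Q = rho (1 - gamma^2) and R = rho p (1 - p) with 0 < p < 1 and 0 < gamma < 1; the function names are hypothetical.

```python
def stationary_kalman(gamma, p, rho, tol=1e-14, max_iter=100000):
    """Iterate P <- ([gamma^2 P + Q]^-1 + p^2 / R)^-1 to its fixed point."""
    Q = rho * (1.0 - gamma**2)  # variance of the state noise w_n
    R = rho * p * (1.0 - p)     # variance of the measurement noise v_n
    P = rho                     # start from the stationary state variance
    for _ in range(max_iter):
        P_next = 1.0 / (1.0 / (gamma**2 * P + Q) + p**2 / R)
        converged = abs(P_next - P) <= tol * rho
        P = P_next
        if converged:
            break
    K = P * p / R               # stationary filter gain
    return P, K


def kalman_step(xi_prev, m, gamma, p, K):
    """One filter step: prediction gamma * xi_prev, then correction with gain K."""
    prediction = gamma * xi_prev
    return prediction + K * (m - p * prediction)


if __name__ == "__main__":
    P, K = stationary_kalman(gamma=0.9, p=0.1, rho=1.0)
    print(P, K)
```

Eliminating the inverses shows that x = P / rho solves p gamma^2 x^2 + (1 - gamma^2) x - (1 - p)(1 - gamma^2) = 0, which gives an independent check on the iteration.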
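Optimal estimation - Kalman filter
- $\hat{\xi}_n$: estimator of $\xi_n$
- Define $\hat{N}_n$, estimator of $N_n$:
\[ \hat{N}_n = \hat{\xi}_n \sqrt{T} + \rho T \]
Finally
\[ \hat{N}_n = \gamma (1 - Kp) \hat{N}_{n-1} + K Y_n + \rho T (1 - \gamma)(1 - Kp) \]
$Y_n$: number of ACKs at the nth observation step; $\rho T$ and $\gamma$ assumed known
For comparison, the EWMA estimator:
\[ \hat{N}_{n,a} = a \hat{N}_{n-1,a} + (1 - a) \frac{Y_n}{p} \]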
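The following sketch puts the final recursion to work on simulated data and compares it with the naive estimate Y_n / p. It makes several assumptions: membership follows an M/M/∞ queue sampled every S seconds (each member stays with probability gamma, and a Poisson number of newcomers with mean rho T (1 - gamma) is still present at the next sample), ACKs are binomial, and `mean_members` stands for rho T. All names and parameter values are illustrative.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def binomial(rng, n, prob):
    """Number of successes among n independent trials."""
    return sum(1 for _ in range(n) if rng.random() < prob)


def stationary_gain(gamma, p, mean_members):
    """Stationary Kalman gain K, from the fixed point of the error variance."""
    Q = mean_members * (1.0 - gamma**2)
    R = mean_members * p * (1.0 - p)
    P = mean_members
    for _ in range(10000):
        P = 1.0 / (1.0 / (gamma**2 * P + Q) + p**2 / R)
    return P * p / R


def compare_estimators(mean_members=200.0, gamma=0.9, p=0.1, steps=2000, seed=1):
    """Return (mse_filter, mse_naive) on a simulated membership process."""
    rng = random.Random(seed)
    K = stationary_gain(gamma, p, mean_members)
    n = poisson(rng, mean_members)   # stationary start
    estimate = mean_members
    se_filter = se_naive = 0.0
    for _ in range(steps):
        # survivors of the previous sample plus newcomers still present
        n = binomial(rng, n, gamma) + poisson(rng, mean_members * (1.0 - gamma))
        y = binomial(rng, n, p)      # ACKs received at this step
        estimate = (gamma * (1.0 - K * p) * estimate + K * y
                    + mean_members * (1.0 - gamma) * (1.0 - K * p))
        se_filter += (n - estimate) ** 2
        se_naive += (n - y / p) ** 2
    return se_filter / steps, se_naive / steps


if __name__ == "__main__":
    print(compare_estimators())
```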
To summarize
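System state (continuous time) | System state (discrete time) | Measurement | Estimation
$N_T(t)$ | $N_n = N_T(nS)$ | $Y_n$ | $\hat{N}_n = \gamma (1 - Kp) \hat{N}_{n-1} + K Y_n + \rho T (1 - \gamma)(1 - Kp)$
$Z_T(t)$ | $Z_n = Z_T(nS)$ | $M_n = p Z_n + V_T(n)$ | $\hat{\xi}_n = \gamma \hat{\xi}_{n-1} + K (M_n - p[\gamma \hat{\xi}_{n-1}])$
$X(t)$ | $\xi_n = X(nS)$, $\xi_{n+1} = \gamma \xi_n + w_n$ | $m_n = p \xi_n + v_n$ | Kalman filter: $\hat{\xi}_n = \gamma \hat{\xi}_{n-1} + K (m_n - p[\gamma \hat{\xi}_{n-1}])$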
Simulations

Objective: validate model

Assumptions made in theory
Poisson arrivals
 Exponential on-times
 Heavy-traffic regime


Simulations:
2 regimes investigated: light load/heavy-load
 2 distributions: Exponential/Pareto

 8 different scenarios simulated
Validation with real traces
Objective: further validate model
 Robustness to “real” distributions?
 Independence-related assumptions are
violated

Distribution of traces investigated
Trace       | Best fit for inter-arrivals sequence | Best fit for on-times sequence
Short audio | Weibull                              | Weibull
Long audio  | Lognormal                            | Lognormal
Membership in real traces vs. time
Objective
Find optimal estimator under more general
assumptions
M/G/ model

Assumptions:
 Poisson arrival process, intensity 
 on-times have common probability distribution
D denotes a generic random variable
 Occupation process in the M/G/ queue

Characteristics of N(t) in steady-state:
 Poisson random variable, Mean = Variance =  =  E[D]
 Autocorrelation function

CovN t , N t  h  =   PD  u du
h

Notation: Cov N k  = CovN nS , N n  k S 
Optimal estimation - Wiener filter

Noisy observation $Y_n$
[Diagram: observation $y_n$ passed through the Wiener filter $H_o(z)$]
Optimal linear filter: minimal mean-square error
Optimal estimation - Wiener filter
Introduce (centered processes $\nu_n = N_n - \rho$ and $y_n = Y_n - p\rho$):
- power spectrum of $\{y_n\}_n$: $S_y(z) = \sum_{k=-\infty}^{\infty} \mathrm{Cov}_y(k) z^{-k}$
- z-transform of $\mathrm{Cov}_{\nu y}(k)$: $S_{\nu y}(z) = \sum_{k=-\infty}^{\infty} \mathrm{Cov}_{\nu y}(k) z^{-k}$

We have:
\[ \mathrm{Cov}_{\nu y}(k) = p \, \mathrm{Cov}_\nu(k) \]
\[ \mathrm{Cov}_y(k) = p^2 \, \mathrm{Cov}_\nu(k) + \mathbf{1}\{k = 0\} \, \rho p (1 - p) \]
Canonical factorization: $S_y(z) = G(z) G(z^{-1})$
Compute
\[ H(z) = \left[ \frac{S_{\nu y}(z)}{G(z^{-1})} \right]_+ , \qquad H_o(z) = \frac{H(z)}{G(z)} \]
Application to M/M/ model
When D ~ Expμ 
Cov ν k =  ,  =exp S 
B


We find H o  z  =
1 Az
1 1 2 p  1   1   1  2 p  
where A =
2 1 p 
 1     1   1   1  2 p  
B=
2 p1  p 
B


Transfer function H o  z  =
1 Az
Impulse response ˆ = Aˆ  By
k
1
2
2
2
2
2
2
2
2
2
1
n
n 1
n
Application to M/M/ model
Centered processes : ˆ = Aˆ  By
n
n 1
n
Non-centered processes:
 Nˆ n = ANˆ n1  BYn   1 A pB
Mean square error

min
2


= E  N n  Nˆ n  
 



 Bp 
=  1

 1A 
Kalman filter vs. Wiener filter
Estimators are the same!
But
Kalman filter: M/M/∞ queue, heavy traffic
Wiener filter: M/M/∞ queue

we relaxed one assumption
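The claim that both filters give the same estimator can be checked numerically. The sketch below assumes the closed forms A = [1 + gamma^2 (1 - 2p) - sqrt((1 - gamma^2)(1 - gamma^2 (1 - 2p)^2))] / (2 gamma (1 - p)) and B = [sqrt((1 - gamma^2)(1 - gamma^2 (1 - 2p)^2)) - (1 - gamma^2)] / (2 gamma^2 p (1 - p)) for the Wiener filter, and compares them with gamma (1 - Kp) and K obtained from the stationary Kalman equations with Q = rho (1 - gamma^2) and R = rho p (1 - p). Function names are hypothetical.

```python
import math


def wiener_coefficients(gamma, p):
    """Closed-form A and B of the first-order Wiener filter (exponential on-times)."""
    root = math.sqrt((1 - gamma**2) * (1 - gamma**2 * (1 - 2 * p) ** 2))
    A = (1 + gamma**2 * (1 - 2 * p) - root) / (2 * gamma * (1 - p))
    B = (root - (1 - gamma**2)) / (2 * gamma**2 * p * (1 - p))
    return A, B


def kalman_coefficients(gamma, p, rho=1.0):
    """Return (gamma * (1 - K p), K, P) from the stationary Kalman equations."""
    Q = rho * (1 - gamma**2)
    R = rho * p * (1 - p)
    P = rho
    for _ in range(10000):
        P = 1.0 / (1.0 / (gamma**2 * P + Q) + p**2 / R)
    K = P * p / R
    return gamma * (1 - K * p), K, P


if __name__ == "__main__":
    for gamma, p in [(0.9, 0.1), (0.5, 0.5), (0.99, 0.01)]:
        A, B = wiener_coefficients(gamma, p)
        a_k, b_k, P = kalman_coefficients(gamma, p)
        mse = 1.0 - B * p / (1.0 - A * gamma)  # Wiener mean square error for rho = 1
        print(A - a_k, B - b_k, mse - P)       # all three differences should be ~0
```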
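Optimal first-order linear filter
- Find $A \in [0, 1)$ and $B$ such that $\hat{\nu}_n = A \hat{\nu}_{n-1} + B y_n$ with the mean-square error $\epsilon = E\left[ (\nu_n - \hat{\nu}_n)^2 \right]$ minimized
- Steady state: $\hat{\nu}_n = B \sum_{k=0}^{\infty} A^k y_{n-k}$
- Minimize
\[ \epsilon = \rho - 2 p B \, g(A) + \frac{p B^2}{1 - A^2} \left( 2 p \, g(A) + \rho (1 - 2p) \right) \]
where
\[ g(z) = \sum_{k=0}^{\infty} z^k \, \mathrm{Cov}_\nu(k) \]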
Optimal first-order linear filter
- System to solve: $\partial \epsilon / \partial A = 0$, $\partial \epsilon / \partial B = 0$
- Solution is unique
- $D \sim \mathrm{Exp}(\mu)$: same solution as Wiener filter
- $D \sim \mathrm{Hyperexponential}(L; \mu_i, p_i, i = 1, \ldots, L)$: numerical solving
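One possible way to carry out the numerical solving is sketched below. It assumes the error expression eps = rho - 2 p B g(A) + p B^2 / (1 - A^2) * (2 p g(A) + rho (1 - 2p)), uses the fact that for fixed A this is quadratic in B (so the best B is available in closed form), and then searches over A with a coarse grid followed by a ternary search. The hyperexponential parametrisation (`rates`, `weights`), the function names and the search procedure are assumptions for illustration, not the method used in the thesis.

```python
import math


def g_hyperexp(A, lam, S, rates, weights):
    """g(A) = sum_k A^k Cov(k) for hyperexponential on-times.

    Assumes P(D > u) = sum_i weights[i] * exp(-rates[i] * u), which gives
    Cov(k) = lam * sum_i weights[i] / rates[i] * exp(-rates[i] * k * S).
    """
    return sum(lam * w / r / (1.0 - A * math.exp(-r * S))
               for r, w in zip(rates, weights))


def best_first_order_filter(p, lam, S, rates, weights, grid=2000):
    """Return (A, B, error) minimising the mean-square error over A in [0, 1)."""
    rho = g_hyperexp(0.0, lam, S, rates, weights)  # Cov(0) = lam * E[D]

    def best_B(A):
        g = g_hyperexp(A, lam, S, rates, weights)
        return g * (1.0 - A * A) / (2.0 * p * g + rho * (1.0 - 2.0 * p))

    def error(A):
        g = g_hyperexp(A, lam, S, rates, weights)
        B = best_B(A)
        return (rho - 2.0 * p * B * g
                + p * B * B / (1.0 - A * A) * (2.0 * p * g + rho * (1.0 - 2.0 * p)))

    # coarse grid over [0, 1), then ternary search around the best grid point
    best = min((i / grid for i in range(grid)), key=error)
    lo = max(0.0, best - 1.0 / grid)
    hi = min(1.0 - 1e-12, best + 1.0 / grid)
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if error(m1) < error(m2):
            hi = m2
        else:
            lo = m1
    A = 0.5 * (lo + hi)
    return A, best_B(A), error(A)


if __name__ == "__main__":
    # two-phase hyperexponential on-times (illustrative values)
    print(best_first_order_filter(p=0.1, lam=50.0, S=0.2,
                                  rates=[0.5, 5.0], weights=[0.3, 0.7]))
```

With a single phase (`rates=[mu]`, `weights=[1.0]`) the search should land on the Wiener coefficients, in line with the statement that exponential on-times give the same solution.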
Validation with real traces
Distribution of inter-arrivals and on-times
Trace  | Best fit for inter-arrivals sequence | Best fit for on-times sequence
video1 | Lognormal                            | Weibull
video2 | Lognormal                            | Weibull
video3 | Weibull                              | Lognormal
video4 | Weibull                              | Weibull
Mean & variance of the error $N_n - \hat{N}_n$
video1 Nˆ nE
Mean Variance
min, min
0.1121 12.6641 13.9424
0.0469 12.8508 12.1198 1.1504
0.0062
0.4947 1.4068
video2
Nˆ nH 2
Nˆ E
video3
Nˆ nH 2
Nˆ E
n
0.0188
0.0373
0.7851
0.2065
0.3955 3.5570
0.7370
video4
Nˆ nH 2
Nˆ E
n
0.0194
0.0523
0.2291
0.9105
0.2084 3.5365
1.5656
Nˆ nH 2
0.0651
1.4231
0.6755 2.3177
n
empirical
theoretical
And the winner is …
Estimator $\hat{N}^E_n$!
Advantages:
- optimal for M/M/∞ queue
- efficient over real traces
- only two parameters required
Drawbacks:
- a priori knowledge needed
Extension
- How to estimate $\rho$ and $\gamma$?
- Estimate $\lambda$ and $\rho$ (recall $\rho = \lambda / \mu$)
- Receivers:
  - on arrival, send "hello" message with prob. q
- Source:
  - records time of receipt $t_m$ of the mth hello message
  - estimates $\lambda$ as $\hat{\lambda} = \dfrac{m}{q \, t_m}$ (MLE)
  - estimates $\rho$ as $\hat{\rho} = E[\hat{N}_n]$ (initially $\hat{\rho} = E[Y_n / p]$)
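The arrival-rate estimate m / (q t_m) can be tried out on simulated data. The sketch below assumes Poisson arrivals of rate `lam`, each newcomer sending a hello message independently with probability q; the function names and parameter values are illustrative.

```python
import random


def estimate_arrival_rate(hello_times, q):
    """Estimate lambda from hello receipt times: m / (q * t_m)."""
    m = len(hello_times)
    return m / (q * hello_times[-1])


def simulate_hello_times(lam, q, m, seed=0):
    """Poisson arrivals of rate lam; each arrival sends a hello with probability q."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while len(times) < m:
        t += rng.expovariate(lam)  # time until the next arrival
        if rng.random() < q:
            times.append(t)        # receipt time of this hello message
    return times


if __name__ == "__main__":
    times = simulate_hello_times(lam=2.0, q=0.1, m=3000)
    print(estimate_arrival_rate(times, q=0.1))  # should be close to 2.0
```

The relative error of this estimate shrinks roughly like 1 / sqrt(m), so a small q calls for a longer observation period to reach a given accuracy.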
Large audience multicast applications
Main contributions:




Proposition of several unbiased estimators that
efficiently track membership
Validation through simulated and real traces
Identification of “best” estimator among those
proposed
Proposition of estimators for a priori parameters
Mobile code applications

Code mobility paradigm

Forwarders mechanism

Centralized mechanism

Simulations & experiments

Contributions
Code mobility paradigm


Definition:
components of application might change host
(migrate) during execution
Utility:
load balancing
 data mining (data available on different hosts)
 e-commerce (find the cheapest airline fare)


Issue:
ensure communications with mobile objects
Code mobility paradigm

Two widely used solutions:
distributed approach (use forwarders)
 centralized approach (use server)


Objective: identify best approach in terms
of response time
Forwarders mechanism: description
Legend: S = source, O = mobile object, F = forwarder; S holds a reference to O.
[Figure: four steps over Hosts A to D]
1. S (Host A) holds a reference to O.
2. O migrates from host to host, leaving a forwarder F on each host it leaves; a message from S is forwarded along the chain of forwarders until it reaches O.
3. An update refreshes the reference held by S.
4. Subsequent messages use the new reference.
Centralized mechanism: description
Legend: S = source, O = mobile object; a server keeps track of the location of O.
[Figure: four steps over Hosts A to D and the server]
1. S (Host A) holds a reference to O.
2. When O migrates, an update is sent to the server.
3. A message sent by S with an outdated reference fails.
4. S queries the server for the location of O, the server replies, and S sends its message. The object may have moved in the meantime!
Centralized mechanism: the server
may need to send Reply after processing request from Source
[Figure: the server processing requests from S and O, with a "send Reply" step after a request from S]
Mobile code applications

Forwarders mechanism:
 infinite state-space Markov chain
 expression for expected response time TF
 expression for expected number of forwarders

Centralized mechanism:
 finite state-space Markov chain
 expression for expected response time TS

Models validated through simulations and
experiments (LAN & MAN)
Forwarder LAN (100 Mb/s)
[Figure: mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments compared with model]
Server LAN (100 Mb/s)
[Figure: mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments compared with model]
Forwarder MAN (7 Mb/s)
[Figure: mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments compared with model]
Server MAN (7 Mb/s)
[Figure: mean response time (ms) vs. communication rate, for migration rates 1, 5 and 10; experiments compared with model]
Overall performance is fair: the models can safely be used for performance evaluation
Mobile code applications
Main contributions:



Proposition of Markovian models for two
communication mechanisms
Validation through simulations and experiments
(LAN & MAN)
Theoretical comparison:
 prediction of fastest mechanism in general
Conclusion

General methodology
Propose mathematical models for system at hand
 Derive metrics of interest or estimators under
models assumptions
 Validate models via simulations and/or
experiments


Simple tools applicable over wide range of
applications
Conclusion
Optimal filtering techniques



estimation of RTT in TCP protocol
estimation of average queue size in RED routers
…
Performance analysis tools
very useful in design of mobile code applications
(high cost of implementation)
 protocol evaluation
 …

Thank you!