COST-ACTION: TMA meeting @ Samos’08
Accurate & scalable models for wireless traffic workload
Maria Papadopouli
Assistant Professor
Department of Computer Science, University of Crete &
Institute of Computer Science, Foundation for Research & Technology-Hellas (FORTH)
Joint research with: F. Hernandez-Campos, M. Karaliopoulos, H. Shen, E. Raftopoulos
IBM Faculty Award, EU Marie Curie IRG, GSRT “Cooperation with non-EU countries” grants
Wireless landscape
• Growing demand for wireless access
• Mechanisms for better than best-effort service provision
• Performance analysis of these mechanisms
– Typically using simplistic traffic models
• Empirical measurements drive modeling efforts toward more realistic models
– Enable more meaningful performance analysis studies
Wireless infrastructure
[Figure: users A and B roam among AP 1, AP 2, and AP 3 of the wireless network, with periods of disconnection; the APs connect through a switch and the wired network to a router and the Internet. Levels shown: associations, flows, packets.]
Dimensions in modeling wireless access
• Intended user demand
• User mobility patterns
– Arrival at APs
– Roaming across APs
– Duration the user is connected to an infrastructure
• Link conditions
• Network topology
[Figure: the same infrastructure diagram (users A and B, AP 1, AP 2, AP 3, switch, wired network, router, Internet), with a timeline of events within a session: flow arrivals at times t1 through t7 along a time axis.]
Our parameters and models
Parameter                          Model                  Related papers
Association, session duration      BiPareto               EW'06
Session arrival                    Time-varying Poisson   WICON'06
AP of first association/session    Lognormal              LANMAN'05, WICON'06
Flow interarrival/session          Lognormal              WICON'06
Flow number/session                BiPareto               WICON'06
Flow size                          BiPareto               WICON'06
Client roaming between APs         Markov-chain           INFOCOM'04
[Probability Density Function column: formulas not recoverable. Surviving notes: Session arrival, N: # of sessions between t1 and t2; Flow interarrival/session and Flow size, same as the row above.]
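The slide names the BiPareto distribution, but its formula did not survive in this transcript. As a rough illustration only, the sketch below assumes the form of the BiPareto CCDF commonly given in the literature, P[X > x] = (x/k)^(-alpha) * ((x/k + c)/(1 + c))^(alpha - beta) for x >= k, and draws samples by numerically inverting it. The function names and all parameter values are hypothetical and are not the fitted values from the talk.

```python
import random


def bipareto_ccdf(x, alpha, beta, c, k):
    """P[X > x] under the assumed BiPareto form; equals 1 for x <= k."""
    if x <= k:
        return 1.0
    r = x / k
    return r ** (-alpha) * ((r + c) / (1.0 + c)) ** (alpha - beta)


def bipareto_sample(rng, alpha, beta, c, k, iterations=60):
    """Draw one value by inverse-transform sampling with bisection."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so the bracket search ends
    lo, hi = k, 2.0 * k
    # Grow the bracket until the CCDF at hi has dropped to u or below.
    while bipareto_ccdf(hi, alpha, beta, c, k) > u:
        lo, hi = hi, 2.0 * hi
    # Bisect; the CCDF is strictly decreasing when alpha and beta are positive.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bipareto_ccdf(mid, alpha, beta, c, k) > u:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    rng = random.Random(1)
    sizes = [bipareto_sample(rng, 0.1, 1.2, 50.0, 100.0) for _ in range(5)]
    print([round(s) for s in sizes])
```

Bisection is used here because the assumed CCDF has no simple closed-form inverse; the tail index beta controls how heavy the upper tail is, while alpha shapes the body near the minimum value k.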
Wireless infrastructure & acquisition
• 26,000 students, 3,000 faculty, 9,000 staff on a campus of over 729 acres
• 488 APs (April 2005), 741 APs (April 2006)
• SNMP data collected every 5 minutes
• Several months of SNMP & SYSLOG data from all APs
• Packet-header traces:
– Two weeks (in April 2005 & April 2006)
– Captured on the link between UNC & rest of Internet via a high-precision monitoring card
Modeling process
1. Selection of models (e.g., various distributions); determines spatial & temporal scale
2. Fitting parameters using empirical traces
3. Evaluation and comparison of models
• Visual inspection, e.g., CCDFs & QQ plots of models vs. empirical data
• Statistical-based criteria, e.g., QQ/simulation envelopes, statistical tests
• Systems-based criteria
4. Validation of models
5. Generalization of models
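The fitting and visual-inspection steps above can be sketched roughly as follows, using the lognormal model as the example. This is a minimal illustration and not the authors' tooling: the function names are hypothetical, the fit is a plain maximum-likelihood estimate on the log of the samples, and the QQ comparison uses a handful of arbitrarily chosen probabilities.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_lognormal(samples):
    """Maximum-likelihood estimates (mu, sigma) of a lognormal model."""
    logs = [math.log(x) for x in samples]
    mu = fmean(logs)
    sigma = math.sqrt(fmean([(v - mu) ** 2 for v in logs]))
    return mu, sigma


def empirical_ccdf(samples):
    """Return (x, fraction of samples strictly greater than x), sorted by x."""
    xs = sorted(samples)
    n = len(xs)
    return [(x, 1.0 - (i + 1) / n) for i, x in enumerate(xs)]


def qq_points(samples, mu, sigma, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Pairs (model quantile, empirical quantile) for a QQ plot."""
    xs = sorted(samples)
    n = len(xs)
    standard_normal = NormalDist()
    points = []
    for p in probs:
        empirical_q = xs[min(n - 1, int(p * n))]
        model_q = math.exp(mu + sigma * standard_normal.inv_cdf(p))
        points.append((model_q, empirical_q))
    return points


if __name__ == "__main__":
    # Stand-in data; a real study would load empirical trace values here.
    rng = random.Random(3)
    data = [rng.lognormvariate(1.0, 0.5) for _ in range(10000)]
    mu, sigma = fit_lognormal(data)
    for model_q, empirical_q in qq_points(data, mu, sigma):
        print(f"model {model_q:.3f}  empirical {empirical_q:.3f}")
```

Points that fall near the diagonal of the QQ plot suggest a good fit; systematic departures in the upper quantiles are the usual sign that a heavier-tailed model is needed.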
Modeling in various spatio-temporal scales
Scales                 Objective
Hourly period @ AP     Sufficient spatial detail
Network-wide           Scalable, amenable to analysis
Tradeoff with respect to accuracy, scalability, reusability & tractability
Synthetic trace generation
Simulation/Emulation testbed
• TCP flows
• UDP
• Wired clients: senders
• Wireless clients: receivers
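As a rough picture of how a synthetic trace could be assembled from the models named earlier in the deck, the sketch below generates session arrivals from a time-varying Poisson process (via thinning, with an assumed piecewise-constant hourly rate) and places flows inside each session using lognormal interarrivals. All names and parameter values are hypothetical. The number of flows per session is supplied by the caller as a stand-in; the deck models it with a BiPareto distribution, which is not implemented here.

```python
import random


def session_arrivals(rng, hourly_rates):
    """Time-varying Poisson session arrivals via thinning.

    hourly_rates[i] is an assumed arrival rate (sessions per second) in hour i;
    the generated arrivals cover len(hourly_rates) hours.
    """
    horizon = 3600.0 * len(hourly_rates)
    rate_max = max(hourly_rates)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_max)  # candidate from a rate_max process
        if t >= horizon:
            return arrivals
        rate_now = hourly_rates[int(t // 3600)]
        if rng.random() * rate_max < rate_now:  # keep w.p. rate_now / rate_max
            arrivals.append(t)


def synthetic_trace(rng, hourly_rates, mu, sigma, flows_per_session):
    """Return (flow_start_time, session_id) pairs sorted by start time."""
    trace = []
    for session_id, start in enumerate(session_arrivals(rng, hourly_rates)):
        t = start
        for _ in range(flows_per_session(rng)):
            trace.append((t, session_id))
            t += rng.lognormvariate(mu, sigma)  # in-session flow interarrival
    trace.sort()
    return trace


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical rates for a quiet hour followed by a busy hour.
    trace = synthetic_trace(rng, [0.01, 0.05], mu=0.0, sigma=1.0,
                            flows_per_session=lambda r: r.randint(1, 5))
    print(len(trace), trace[:3])
```

A replay tool would then attach a flow size to each entry and hand the flows to the wired senders of the testbed; that step is omitted here.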
Simulation & emulation testbeds
[Figure: scenario of wireless access under various traffic conditions; users A, B, C, and D reach AP 1, AP 2, and AP 3 of the wireless network, which connect through a switch, the wired network, and a router to the Internet.]
Scenario:
• User A generates a flow of size X @ T1
• User B generates a flow of size Y @ T2
Main results
• Accurate and scalable models of wireless demand
• Same distributions/models persist:
– over two different periods (2005 and 2006)
– over two different campus-wide infrastructures
– over heavy & normal traffic conditions @ AP
– using statistical- & systems-based metrics
• Empirical traces used as “ground truth” for the comparison with synthetic traces based on various models
Main results (cont’d)
Accuracy:
• our models perform very close to the empirical traces
• popular models deviate substantially from the empirical traces
Scalability:
• same distributions at various spatial & temporal scales
• group of APs per bldg addresses scalability-accuracy tradeoffs
Application mix of AP traffic
• mostly web: very accurate models
• both web & p2p: models are adequate
• mostly p2p: larger deviations from empirical data
In progress …
• Improve modeling of non-web traffic
• Client profiling
• Impact of underlying network conditions on application and
usage patterns
• Evaluate the performance of AP or channel selection, load
balancing & admission control protocols under real-life traffic
conditions
– Mesh testbed
– Heterogeneous wireless networks
UNC/FORTH web archive
• Online repository of models, tools, and traces
– Packet header, SNMP, SYSLOG, synthetic traces, …
http://netserver.ics.forth.gr/datatraces/
• Free login/password to access it
• Simulation & emulation testbeds that replay synthetic traces for various traffic conditions
Mobile Computing Group @ University of Crete/FORTH
http://www.ics.forth.gr/mobile/

Hourly aggregate throughput
[Figure: series labeled by flow-size and flow-arrival model: EMPIRICAL, BIPARETO-LOGNORMAL-AP, BIPARETO-LOGNORMAL.]
Impact of flow size
[Figure panels: fixed flow sizes & empirical flow arrivals (aggregate traffic as in EMPIRICAL); Pareto flow sizes & empirical flow arrivals.]
Scalability vs. accuracy: flow interarrivals
[Figure: series EMPIRICAL, BDLG(DAY), BDLGTYPE(DAY), NETWORK(TRACE).]
Scalability vs. accuracy: number of flow arrivals in an hour
[Figure: series EMPIRICAL, BDLG(DAY), BDLGTYPE(TRACE), NETWORK(TRACE).]
Per-flow throughput
[Figure: series labeled by flow-size and flow-arrival model: EMPIRICAL, BIPARETO-LOGNORMAL-AP, BIPARETO-LOGNORMAL.]
[Figure: series for Pareto flow sizes; fixed flow sizes & empirical flow arrivals; Pareto flow sizes & uniform flow arrivals in tracing period. Annotation: due to large % of small size flows. Inset: histogram of flow sizes.]
Aggregate hourly downloaded traffic
UDP traffic scenario
• Wireless hotspot AP
• Wireless clients downloading
• Wired traffic transmitted at 25 Kbps
• Total aggregate traffic sent in CBR and in empirical is the same
Empirical: 1.4 Kbps
Bipareto-Lognormal-AP: 2.4 Kbps
Bipareto-Lognormal: 2.6 Kbps
Large differences in the distributions
Impact of application mix on per-flow throughput
TCP-based scenario
[Figure: panels for an AP with 85% web traffic, an AP with 80% p2p traffic, and an AP with 50% web & 40% p2p traffic. Metrics named on the slides: goodput, per-flow delay, jitter per flow.]
Impact of application mix of AP traffic
[Figure: series for 50% web & 40% p2p, 80% p2p, and 85% web.]
Session-level flow related variation
In-session flow interarrival can be modeled with the same distribution for all building types but with different parameters
[Figure axis: mean in-session flow interarrival f]
Session-level flow size variation
[Figure axis: mean flow size f (bytes)]
Flow size vs. flow-interarrival on hourly throughput
TCP scenario
[Figure: series labeled by flow size and flow interarrival: empirical; avg flow size fixed with original flow interarrival; original flow size with avg flow interarrivals fixed.]
Flow interarrivals have slightly higher impact
Flow size vs. flow-interarrival on per-flow throughput
[Figure: series labeled by flow size and flow interarrival: original trace; avg flow size fixed with original flow interarrivals; original flow size with avg flow interarrivals fixed.]
Flow size has higher impact
Per-flow statistics for hours that have produced the same aggregate download traffic
Our models persist for traffic generated during busy periods
Empirical trace: one hour of a hotspot AP with heavy workload conditions
Number of flows per session
Simplicity at the cost of higher loss of information
[Figure axis: number of flows per session]