Transcript Document
COST-ACTION: TMA meeting @ Samos’08
Accurate & scalable models for wireless traffic workload
Maria Papadopouli
Assistant Professor, Department of Computer Science, University of Crete & Institute of Computer Science, Foundation for Research & Technology-Hellas (FORTH)
Joint research with: F. Hernandez-Campos, M. Karaliopoulos, H. Shen, E. Raftopoulos
IBM Faculty Award, EU Marie Curie IRG, GSRT “Cooperation with non-EU countries” grants

Wireless landscape
• Growing demand for wireless access
• Mechanisms for better than best-effort service provision
• Performance analysis of these mechanisms
  – Typically using simplistic traffic models
• Empirically-based measurements impel modeling efforts to produce more realistic models
  – Enable more meaningful performance analysis studies

Wireless infrastructure
[Figure: Internet, router, wired network, switch, and wireless network with AP 1, AP 2, AP 3; Users A and B roaming and disconnecting; timeline levels 0 to 3: associations (1), flows (2), packets (3)]

Dimensions in modeling wireless access
• Intended user demand
• User mobility patterns
  – Arrival at APs
  – Roaming across APs
  – Duration the user is connected to an infrastructure
• Link conditions
• Network topology
[Figure: same network diagram with an event timeline for one session; flow arrivals at times t1 to t7]

Our parameters and models
Parameter | Model | Probability density function | Related papers
Association, session duration | BiPareto | | EW'06
Session arrival | Time-varying Poisson | N: # of sessions between t1 and t2 | WICON'06
AP of first association/session | Lognormal | | LANMAN'05, WICON'06
Flow interarrival/session | Lognormal | Same as above | WICON'06
Flow number/session | BiPareto | | WICON'06
Flow size | BiPareto | Same as above | WICON'06
Client roaming between APs | Markov-chain | | INFOCOM'04

Wireless infrastructure & acquisition
• 26,000 students, 3,000 faculty, 9,000 staff on a campus of over 729 acres
• 488 APs (April 2005), 741 APs (April 2006)
• SNMP data collected every 5 minutes
• Several months of SNMP & SYSLOG data from all APs
• Packet-header traces:
  – Two weeks (in April 2005 & April 2006)
  – Captured on the link between UNC & rest of Internet via a high-precision monitoring card

Modeling process
1. Selection of models (e.g., various distributions)
2. Fitting parameters using empirical traces
3. Evaluation and comparison of models
   • Visual inspection, e.g., CCDFs & QQ plots of models vs. empirical data
   • Statistical-based criteria, e.g., QQ/simulation envelopes, statistical tests
   • Systems-based criteria
4. Validation of models
5. Generalization of models
[Callout beside steps 1 and 2: determines spatial & temporal scale]

Modeling in various spatio-temporal scales
Scales: Hourly period @ AP; Network-wide
Objective: Sufficient spatial detail; Scalable; Amenable to analysis
Tradeoff with respect to accuracy, scalability, reusability & tractability

Synthetic trace generation
Simulation/Emulation testbed
• TCP flows
• UDP
• Wired clients: senders
• Wireless clients: receivers

Simulation & emulation testbeds
[Figure: Internet, router, wired network, switch, and wireless network with AP 1, AP 2, AP 3 and Users A, B, C, D; labels: scenario of wireless access, various traffic conditions]
Scenario:
▪ User A generates a flow of size X @ T1
▪ User B generates a flow of size Y @ T2

Main results
Accurate and scalable models of wireless demand
Same distributions/models persist:
  – over two different periods (2005 and 2006)
  – over two different campus-wide infrastructures
  – over heavy & normal traffic conditions @ AP
  – using statistical- & systems-based metrics
Empirical traces used as “ground truth” for the comparison with synthetic traces based on various models

Main results (cont’d)
Accuracy:
• our models perform very close to the empirical traces
• popular models deviate substantially from the empirical traces
Scalability:
• same distributions at various spatial & temporal scales
• group of APs per bldg addresses scalability-accuracy tradeoffs
Application mix of AP traffic
• mostly web: very accurate models
• both web & p2p: models are OK
• mostly p2p: larger deviations from empirical data

In progress …
• Improve modeling of non-web traffic
• Client profiling
• Impact of underlying network conditions on application and usage patterns
• Evaluate the performance of AP or channel selection, load balancing & admission control protocols under real-life traffic conditions
  – Mesh testbed
  – Heterogeneous wireless networks

UNC/FORTH web archive
Online repository of models, tools, and traces
  – Packet header, SNMP, SYSLOG, synthetic traces, …
http://netserver.ics.forth.gr/datatraces/
Free login/password to access it
Simulation & emulation testbeds that replay synthetic traces for various traffic conditions
Mobile Computing Group @ University of Crete/FORTH
http://www.ics.forth.gr/mobile/
[email protected]

Hourly aggregate throughput
[Figure: curves by flow-size and flow-arrival model (FLOWSIZE-FLOWARRIVAL): EMPIRICAL, BIPARETO-LOGNORMAL-AP, BIPARETO-LOGNORMAL]

Impact of flow size
[Figure: fixed flow sizes & empirical flow arrivals (aggregate traffic as in EMPIRICAL); Pareto flow sizes & empirical flow arrivals]

Scalability vs. Accuracy: flow interarrivals
[Figure: EMPIRICAL, BDLG(DAY), BDLGTYPE(DAY), NETWORK(TRACE)]

Scalability vs. Accuracy: Number of flow arrivals in an hour
[Figure: BDLGTYPE(TRACE), BDLG(DAY), EMPIRICAL, NETWORK(TRACE)]

Per-flow throughput
[Figure: curves by flow-size and flow-arrival model (FLOWSIZE-FLOWARRIVAL): BIPARETO-LOGNORMAL, EMPIRICAL, BIPARETO-LOGNORMAL-AP; Pareto flow sizes; fixed flow sizes & empirical flow arrivals; Pareto flow sizes & uniform flow arrivals in tracing period; annotation: due to large % of small size flows]

Histogram of flow sizes

Aggregate hourly downloaded traffic

UDP traffic scenario
• Wireless hotspot AP
• Wireless clients downloading
• Wired traffic transmitted at 25 Kbps
• Total aggregate traffic sent in CBR and in empirical is the same
• Empirical: 1.4 Kbps
• Bipareto-Lognormal-AP: 2.4 Kbps
• Bipareto-Lognormal: 2.6 Kbps
• Large differences in the distributions

Impact of application mix on per-flow throughput
TCP-based scenario
• AP with 85% web traffic
• AP with 80% p2p traffic
• AP with 50% web & 40% p2p traffic
[Figure panels: goodput, per-flow delay, jitter per flow]

Impact of application mix of AP traffic
[Figure: 50% web & 40% p2p; 80% p2p; 85% web]

Session-level flow related variation
In-session flow interarrival can be modeled with the same distribution for all building types but with different parameters
[Figure axis: mean in-session flow interarrival f]

Session-level flow size variation
[Figure axis: mean flow size f (bytes)]

Flow size vs. flow-interarrival on hourly throughput
TCP scenario
[Figure: flow size - flow interarrival combinations: avg flow size fixed & original flow interarrival; avg flow interarrivals fixed & original flow size; empirical]
Flow interarrivals have a slightly higher impact

Flow size vs. flow-interarrival on per-flow throughput
[Figure: flow size - flow interarrival combinations: avg flow size fixed & original flow interarrivals; original flow size & avg flow interarrivals fixed; original trace]
Flow size has higher impact

Per-flow statistics for hours that have produced the same aggregate download traffic

Our models persist for traffic generated during busy periods
Empirical trace: one hour of a hotspot AP with heavy workload conditions

Number of flows per session
Simplicity at the cost of higher loss of information
[Figure: number of flows per session]
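
A rough sketch of how the parameter/model table in this talk could drive synthetic trace generation: sessions arrive by a time-varying Poisson process, each session draws a BiPareto number of flows, in-session flow interarrivals are lognormal, and flow sizes are BiPareto. This is an illustration under stated assumptions, not the generator used for the results above. The BiPareto CCDF is taken in its commonly cited form S(x) = (x/k)^(-alpha) * ((x/k + c)/(1 + c))^(alpha - beta) for x >= k. Every numeric parameter, the hourly rate profile, the cap on flows per session, and all function names are hypothetical placeholders rather than fitted values from the traces.

```python
import math
import random


def bipareto_ccdf(x, alpha, beta, c, k):
    """CCDF of the BiPareto distribution for x >= k (assumed standard form)."""
    r = x / k
    return r ** (-alpha) * ((r + c) / (1.0 + c)) ** (alpha - beta)


def sample_bipareto(rng, alpha, beta, c, k):
    """Draw one value by inverting the CCDF numerically (bisection in log space)."""
    u = 1.0 - rng.random()  # uniform in (0, 1]
    lo = hi = k
    while bipareto_ccdf(hi, alpha, beta, c, k) > u:
        hi *= 2.0  # grow the bracket until the CCDF drops to u or below
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if bipareto_ccdf(mid, alpha, beta, c, k) > u:
            lo = mid
        else:
            hi = mid
    return hi


def session_arrivals(rng, rate_fn, rate_max, t_end):
    """Time-varying Poisson arrivals on [0, t_end) via thinning."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_max)  # candidate from a homogeneous process
        if t >= t_end:
            return arrivals
        if rng.random() * rate_max <= rate_fn(t):  # keep with prob rate(t)/rate_max
            arrivals.append(t)


def generate_trace(seed=1):
    """Return a time-sorted list of (start_time_s, session_id, flow_size_bytes)."""
    rng = random.Random(seed)
    hourly_rates = [20.0, 60.0, 40.0]  # hypothetical sessions per hour
    t_end = 3600.0 * len(hourly_rates)

    def rate_fn(t):
        return hourly_rates[int(t // 3600)] / 3600.0  # sessions per second

    rate_max = max(hourly_rates) / 3600.0
    max_flows = 200  # hypothetical cap, only to keep the example small
    flows = []
    for session_id, t0 in enumerate(session_arrivals(rng, rate_fn, rate_max, t_end)):
        # Number of flows in the session: BiPareto (placeholder parameters).
        n_flows = min(max_flows, round(sample_bipareto(rng, 0.5, 1.5, 5.0, 1.0)))
        t = t0
        for _ in range(n_flows):
            # Flow size in bytes: BiPareto (placeholder parameters).
            size = int(sample_bipareto(rng, 0.3, 1.2, 50.0, 100.0))
            flows.append((t, session_id, size))
            # In-session flow interarrival: lognormal (placeholder mu, sigma).
            t += rng.lognormvariate(0.0, 1.5)
    flows.sort()
    return flows


if __name__ == "__main__":
    trace = generate_trace(seed=1)
    print(len(trace), "flows; first three:", trace[:3])
```

Numerical inversion is used because the BiPareto CCDF has no closed-form inverse; thinning is one common way to sample a Poisson process with a piecewise-constant hourly rate. Replacing the placeholder parameters with values fitted per AP, per building, or network-wide would correspond to the spatial scales discussed in the talk.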