PingER: Methodology, Uses & Results Les Cottrell SLAC, Warren Matthews GATech Extending the Reach of Advanced Networking: Special International Workshop Arlington, VA., April 22,


PingER: Methodology, Uses & Results

Les Cottrell SLAC, Warren Matthews GATech

Extending the Reach of Advanced Networking: Special International Workshop

Arlington, VA., April 22, 2004

www.slac.stanford.edu/grp/scs/net/talk03/i2-method-apr04.ppt

Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP

Outline

• What is PingER
• World Internet performance trends
• Regions and Digital Divide
• Examples of use
• Challenges
• Summary of Uses

Methodology

• Use ubiquitous ping
• Every 30 minutes from monitoring site to target:
  – 1 ping to prime caches
  – by default send 11 x 100-byte pkts followed by 10 x 1000-byte pkts
• Low network impact + no software to install / configure / maintain at remote sites + no passwords / accounts needed = good for developing sites / regions
• Record loss & RTT (+ reorders, duplicates)
• Derive throughput, jitter, unreachability …
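The record-and-derive step can be sketched in Python. This is a minimal illustration, not PingER's own code: the `summarize` function, the `PingSummary` fields, the jitter definition (mean absolute difference between consecutive replies) and the unreachability rule (no reply to any ping in the set) are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class PingSummary:
    sent: int
    received: int
    loss_pct: float
    min_rtt_ms: Optional[float]
    avg_rtt_ms: Optional[float]
    max_rtt_ms: Optional[float]
    jitter_ms: Optional[float]  # assumed definition, see summarize()
    unreachable: bool           # assumed rule: every ping in the set was lost


def summarize(rtts_ms: Sequence[Optional[float]]) -> PingSummary:
    """Summarize one set of pings; None marks a ping that got no reply."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    received = len(replies)
    loss_pct = 100.0 * (sent - received) / sent if sent else 0.0
    if not replies:
        return PingSummary(sent, 0, loss_pct, None, None, None, None, sent > 0)
    # Assumption: jitter = mean absolute difference between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter = mean(diffs) if diffs else 0.0
    return PingSummary(sent, received, loss_pct, min(replies),
                       mean(replies), max(replies), jitter, False)


# Hypothetical set of ten 100-byte pings, two of them lost.
sample = [20.0, 22.0, None, 21.0, 25.0, None, 20.0, 24.0, 21.0, 23.0]
print(summarize(sample))
```

The priming ping is left out of the input, since it is sent only to warm caches.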

Architecture

[Diagram labels: archive sites at SLAC and FNAL (cache; WWW reports & data); ~35 monitoring sites, reached via HTTP; ~550 remote sites, reached via ping; 1 monitor host + 1 remote host = a pair]

• Hierarchical vs. full mesh

Regions Monitored

Monitoring sites in ~35 countries
• Recently added NIIT PK as monitoring site
• White = no host monitored in country
• Colors indicate regions
• Also have affinity groups (VOs), e.g. AMPATH, Silk Road, CMS, XIWT, and can select multiple groups

World Trends

• Increase in sites with Good (<1%) loss
• 25% increase in sites monitored
  – Big focus on Africa: 4 => 19 countries
  – Silk Road

Loss quality ratings seen from SLAC

[Chart; vertical axis 0 to 300; annotations: ICTP, Ping blocking, WSIS, 50%, 60%]

Loss quality bands: Dreadful >12%; V. poor >=5% & <12%; Poor >=2.5% & <5%; Acceptable >=1% & <2.5%; Good <1%
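The loss bands in the chart legend map directly onto a small lookup. The sketch below is illustrative only. The function name `loss_quality` is hypothetical, and because the legend leaves a loss of exactly 12% unassigned (">12%" versus "<12%"), the code places that value in "Dreadful" as an assumption.

```python
def loss_quality(loss_pct: float) -> str:
    """Map a packet-loss percentage to the quality rating used in the legend."""
    if not 0.0 <= loss_pct <= 100.0:
        raise ValueError("loss_pct must be between 0 and 100")
    if loss_pct < 1.0:
        return "Good"
    if loss_pct < 2.5:
        return "Acceptable"
    if loss_pct < 5.0:
        return "Poor"
    if loss_pct < 12.0:
        return "V. poor"
    # Assumption: exactly 12% is counted as Dreadful; the legend does not say.
    return "Dreadful"


for pct in (0.1, 1.0, 3.0, 7.5, 20.0):
    print(f"{pct:5.1f}% loss -> {loss_quality(pct)}")
```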

Trends

• S.E. Europe, Russia: catching up
• Latin Am., Mid East, China: keeping up
• India, Africa: falling behind

Derived throughput ~ MSS / (RTT * sqrt(loss))

[Plot annotations: Silk Road, NaukaNet/Gloriad, AMPath]

Current State – Aug ’03

throughput ~ MSS / (RTT * sqrt(loss))

• Within-region performance is better
  – E.g. Ca|EDU|GOV-NA, Hu-SE Eu, Eu-Eu, Jp-E Asia, Au-Au, Ru-Ru|Baltics
• Africa, Caucasus, Central & S. Asia all bad

Legend: Bad < 200 kbits/s (< DSL); Poor > 200, < 500 kbits/s; Good > 1000 kbits/s
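The derived-throughput formula and the legend bands can be tried out with a few lines of Python. This is a sketch under stated assumptions: the slide gives only a proportionality, so the code uses a constant of 1. The function names and the 1460-byte MSS in the example are illustrative choices. The slide does not label the 500 to 1000 kbits/s range or the exact boundary values, so the code reports those as unlabelled rather than guessing.

```python
import math


def derived_throughput_kbps(mss_bytes: float, rtt_ms: float,
                            loss_fraction: float) -> float:
    """Estimate throughput as MSS / (RTT * sqrt(loss)), in kbits/s.

    Assumption: proportionality constant of 1 (the slide shows only "~").
    loss_fraction is between 0 and 1, e.g. 0.01 for 1% loss.
    """
    if rtt_ms <= 0 or not 0.0 < loss_fraction <= 1.0:
        raise ValueError("need rtt_ms > 0 and 0 < loss_fraction <= 1")
    bits = mss_bytes * 8
    rtt_s = rtt_ms / 1000.0
    return bits / (rtt_s * math.sqrt(loss_fraction)) / 1000.0


def throughput_band(kbps: float) -> str:
    """Apply the slide's legend; values it does not cover stay unlabelled."""
    if kbps < 200:
        return "Bad"
    if 200 < kbps < 500:
        return "Poor"
    if kbps > 1000:
        return "Good"
    return "not labelled on the slide"


# Hypothetical path: 1460-byte MSS, 200 ms RTT, 1% loss -> about 584 kbits/s.
estimate = derived_throughput_kbps(1460, 200, 0.01)
print(round(estimate), throughput_band(estimate))
```

A zero-loss measurement has no finite estimate under this formula, which is why the sketch rejects `loss_fraction == 0` instead of returning a number.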

Examples of Use

• Need for constant upgrades
• Upgrades
• Filtering
• Pakistan

Usage Examples

Identify need to upgrade and effects
• BW increase by factor 300
• Multiple sites track
• Xmas & summer holiday
• Selecting ISPs for DSL/Cable services for home users
  – Monitor accessibility of routers etc. from site
  – Long term and changes
• Troubleshooting
  – Identify whether a reported problem is probably network related
  – Identify when it started and whether it is still happening or fixed
  – Look for patterns:
    • Step functions
    • Periodic behavior, e.g. due to congestion
    • Multiple sites with simultaneous problems, e.g. common problem link/router …
  – Provide quantitative information to ISPs
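Looking for a step function in a loss or RTT series can be sketched as a comparison of medians before and after each sample. This is only one simple way to do it and is not taken from the talk: the function name `find_step`, the window size and the `min_shift` threshold are all hypothetical choices.

```python
from statistics import median
from typing import Optional, Sequence


def find_step(series: Sequence[float], window: int = 5,
              min_shift: float = 1.0) -> Optional[int]:
    """Return the index where the series most clearly steps to a new level.

    For each candidate index, compare the median of the `window` samples
    before it with the median of the `window` samples starting at it.
    Returns None if the largest shift is smaller than `min_shift`.
    """
    best_index: Optional[int] = None
    best_shift = 0.0
    for i in range(window, len(series) - window + 1):
        before = median(series[i - window:i])
        after = median(series[i:i + window])
        shift = abs(after - before)
        if shift > best_shift:
            best_index, best_shift = i, shift
    return best_index if best_shift >= min_shift else None


# Hypothetical daily loss percentages with a jump starting at index 6.
loss = [0.2, 0.3, 0.1, 0.2, 0.3, 0.2, 4.8, 5.1, 5.0, 4.9, 5.2, 5.0]
print(find_step(loss, window=3))
```

Medians are used rather than means so that a single bad measurement does not register as a step.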

Russia Examples

• Russian losses improved by a factor of 5 in the last 2 years, due to multiple upgrades
• E.g. upgrade of the KEK-BINP link from 128kbps to 512kbps, May ’02: improved from a few % loss to ~0.1% loss

Usage Examples

Upgrades & ping filtering

[Figure: Median Packet Loss Seen From nbi.dk; series: To North America, To Western Europe; annotations: Ten-155 became operational on December 11; Smurf filters installed on NORDUnet’s US connection]

[Figure: Packet Loss between DESY and FNAL in February and March 2000; x-axis: Day of the Month; annotations: DFN closes Perryman POP and loses direct peering with ESnet; peering re-established via Dante at 60 Hudson]

• Peering problems, took a long time to identify/fix

Pakistan Example

• Big performance differences between sites, depending on ISP (at least 3 ISPs seen for Pakistan A&R sites)
• To NIIT (Rawalpindi):
  – Get about 300Kbps, possibly 380Kbps at best
  – Verified bottleneck appeared to be in Pakistan
  – There is often congestion (packet loss & extended RTTs) during busy periods each weekday
  – Video will probably be sensitive to packet loss, so it may depend on the time of day
  – H.323 (typically needs 384Kbps + 64Kbps) would appear to be marginal at best at any time
  – Requested upgrade to 1Mbps, and verified got it (Feb ’04)
• No peering in Pakistan between NIIT and NSC
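The H.323 judgement for the NIIT link is simple arithmetic, spelled out below as a sketch. The constant and function names are hypothetical, the slide does not say what the extra 64Kbps covers, and 1Mbps is taken as 1000Kbps for the comparison.

```python
H323_BASE_KBPS = 384   # from the slide: "typically needs 384Kbps + 64Kbps"
H323_EXTRA_KBPS = 64   # purpose of the extra 64Kbps is not stated on the slide
H323_NEEDED_KBPS = H323_BASE_KBPS + H323_EXTRA_KBPS  # 448Kbps in total


def h323_headroom_kbps(available_kbps: float) -> float:
    """Spare capacity left after an H.323 session; negative means a shortfall."""
    return available_kbps - H323_NEEDED_KBPS


# Measured before the upgrade (300, 380 at best) and after it (1Mbps as 1000).
for label, kbps in (("typical", 300), ("best case", 380), ("after upgrade", 1000)):
    print(f"{label}: {h323_headroom_kbps(kbps):+.0f} Kbps headroom")
```

Both pre-upgrade figures fall short of 448Kbps, which matches the "marginal at best" assessment, while the upgraded link leaves spare capacity.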

Challenges 1 of 2

• Ping blocking
  – Complete block easy to ID, then contact site to try to bypass; can be frustrating for 3rd world
  – Partial blocks trickier, compare with synack
• Effort:
  – Negligible for remote hosts
  – Monitoring host: < 1 day to install and configure, occasional updates to remote host tables and problem response
  – Archive host: 20% FTE, code stable, could do with upgrade, contact monitoring sites whose data is inaccessible
  – Analysis: your decision, usually for long term details download & use Excel
  – Troubleshooting:
    • Usually reactive: user reports, then look at PingER data
    • Working on automating alerts, data is available for download

Challenges 2 of 2

• Funding
  – DoE development/research funding ended 2003
  – Looking for alternate funding sources
• Sustain, maintain & extend databases & measurements to more countries
• Get measurements FROM & within developing regions
• New analyses, preparing & presenting reports
• Making contacts, coordinating efforts

Uses

• Near real time results:
  – Troubleshooting, detect problems, see when they occur
• Long term trends:
  – Set expectations, planning
  – Give sites/regions better idea of how good/bad things are
  – Input to policy and funding agencies, assist in deciding where help is needed and how to provide it
• Measure before & after upgrades
  – Is it working right, did we get our money’s worth

More Information

• PingER:
  – www-iepm.slac.stanford.edu/pinger/
• MonaLisa
  – monalisa.cacr.caltech.edu/
• GGF/NMWG
  – www-didc.lbl.gov/NMWG/
• ICFA/SCIC Network Monitoring report, Jan03
  – www.slac.stanford.edu/xorg/icfa/icfa-net-paper-dec02
• Monitoring the Digital Divide, CHEP03 paper
  – arxiv.org/ftp/physics/papers/0305/0305016.pdf
• Human Development Index
  – www.undp.org/hdr2003/pdf/hdr03_backmatter_2.pdf
• Network Readiness Index
  – www.weforum.org/site/homepublic.nsf/Content/Initiatives+subhome