Network Performance Measurements


Monitoring Internet connectivity of Research and Educational Institutions

Les Cottrell – SLAC/Stanford University

Prepared for the workshop on “Developing Country Access to On-line Scientific Publishing”, 4-5 October 2002, Trieste, Italy
http://www.slac.stanford.edu/grp/scs/net/talk/ictp-02.html

Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP

Outline

• Measurement project initially created for HEP to measure performance for various collaborations, extended to other physics & science collaborations
• Value for planning, troubleshooting, setting expectations, comparisons, setting SLAs etc.
• Methodology
• Results
  – Round trip times
  – Loss
• Summary

Measurement Architecture

• Uses existing ubiquitous Internet ping infrastructure, no tools to install
• Hierarchical vs. full mesh, each monitoring site chooses remote sites
• Lightweight
  – low network impact (100 bits/s/path)
  – no special machines
  – trivial to add monitored sites
• Runs continuously since 1995

[Diagram labels: WWW, HTTP, Ping; archive sites SLAC and HEPNRC (reports & data); monitoring sites with cache; remote sites; 1 monitor host/remote host pair]

PingER Measurement Methodology

• Measurement host admin chooses remote hosts of interest
  – Sends 21 pings every 30 mins to each chosen remote host
  – Records RTT, loss, jitter, unreachable, out of order …
  – Records data in local cache
• Archive host gathers data from measurement hosts regularly (at least daily)
  – Archives, analyzes and generates reports from data
  – Makes reports and data publicly available via the web
• Requirements:
  – Remote host: need a host accessible to pings, and a contact in case host does not respond (almost no effort)
  – Monitoring host: a low end host to make measurements, file space for cache, admin to install toolkit, choose remote hosts, build configuration file, respond to archivers in case unable to get data & keep it running (<<10% FTE)
  – Archive site: probably about 20% of an FTE
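A minimal sketch of how one set of pings to a remote host could be reduced to the recorded metrics. This is not the PingER toolkit code: the names `PingSummary` and `summarize_ping_set` are hypothetical, lost pings are assumed to be represented as `None`, and jitter and out-of-order detection are left out because the slide does not define them.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class PingSummary:
    """Per-set metrics (hypothetical record layout, not PingER's format)."""
    sent: int
    received: int
    loss_pct: float
    unreachable: bool
    rtt_min_ms: Optional[float]
    rtt_avg_ms: Optional[float]
    rtt_max_ms: Optional[float]


def summarize_ping_set(rtts_ms: Sequence[Optional[float]]) -> PingSummary:
    """Summarize one set of pings; None marks a ping that got no reply."""
    if not rtts_ms:
        raise ValueError("need at least one ping in the set")
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    received = len(replies)
    return PingSummary(
        sent=sent,
        received=received,
        loss_pct=100.0 * (sent - received) / sent,
        # Assumption: a host is "unreachable" when no ping in the set is answered.
        unreachable=(received == 0),
        rtt_min_ms=min(replies) if replies else None,
        rtt_avg_ms=mean(replies) if replies else None,
        rtt_max_ms=max(replies) if replies else None,
    )


if __name__ == "__main__":
    # 21 pings as in the slide: 18 answered in 100 ms, 3 lost.
    print(summarize_ping_set([100.0] * 18 + [None] * 3))
```

A monitoring host would append one such record per remote host every 30 minutes to its local cache, from which the archive host collects it.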

PingER deployment

• Measurements from
  – 34 monitors in 14 countries
  – Over 72 countries
  – Mainly A&R sites
  – Over 600 remote hosts
  – Over 3300 monitor-remote site pairs
  – Measurements go back to Jan-95
  – Reports on RTT, loss, reachability, jitter, reorders, duplicates …
• Countries monitored
  – Contain 78% of world population
  – 99% of online users of Internet

[Maps: Monitoring Sites; Remote Sites]

User interface 1/2

Choose: metric, monitoring site(s), remote site(s), time granularity
Shows colored values by time, allows downloading of results for further analysis

User interface 2/2: PingER Group History Table


PingWorld

Java applet at http://jas.freehep.org/demos/PingWorld/

Performance Results

Examples of the type of information that the monitoring can provide

History - Round Trip Time (RTT)

• Improving by 10-20% per year
• More direct paths
• Replacing satellites with land lines
  – Satellite >~550ms
• Faster lines & network equipment
• Lower limit: speed of light in fiber

[Plot annotations: Speed of light in fiber; Typical lower limit today ~ distance/(0.3 * (0.6 * c))]
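A small worked sketch of the "typical lower limit" expression, distance/(0.3 * (0.6 * c)). The slide does not say whether the distance is one-way or round-trip, nor what the 0.3 factor represents, so the function below just evaluates the expression for whatever distance is passed in; the names `fiber_fraction` and `path_factor` are hypothetical labels for the two constants.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def rtt_lower_limit_ms(distance_km: float,
                       fiber_fraction: float = 0.6,
                       path_factor: float = 0.3) -> float:
    """Evaluate distance / (path_factor * (fiber_fraction * c)) in milliseconds.

    fiber_fraction: speed of light in fiber as a fraction of c (0.6 on the slide).
    path_factor: the unexplained 0.3 factor from the slide, treated here as an
                 empirical scaling (an assumption).
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    effective_speed_km_s = path_factor * fiber_fraction * C_KM_PER_S
    return 1000.0 * distance_km / effective_speed_km_s


if __name__ == "__main__":
    for km in (1_000, 5_000, 10_000):
        print(f"{km:>6} km -> {rtt_lower_limit_ms(km):.1f} ms")
```

With the slide's constants the limit grows by roughly 18.5 ms per 1000 km, which is why a satellite hop (>~550 ms) stands out so clearly.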

RTT to world from US

• Note large number of satellite links (> 600ms dark red)
• Note reduction by Aug 2002

[Maps: Jan 2000; Aug 2002]

Impact of loss on applications

• Email – fairly insensitive to quality, may be delayed but keeps retrying for days and eventually gets through
• Web – a human is usually waiting but expectations are low; performance is often limited more by the server, and the human is present so can retry
• Bulk file transfer – unattended; if loss > 10-12%, connections can time out
• Interactive telnet, voice – very time & loss sensitive
  – E.g. telnet/ssh: loss of > 3% severely impacts typing ability; interactive voice is sensitive at lower losses

History - Loss

• Loss more critical than RTT
• Losses cause timeouts of typically seconds
• 40-50% improvement/yr
• Best networks below 0.1%
• Russia, SE Europe, China several years behind

History – Loss Quality

• Fewer sites have very poor to dreadful performance
• More have good performance (< 1%)

Loss to world from US

Using year 2000, fraction of world’s population/country from www.nua.ie/surveys/how_many_online/

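Losses: World by region, Jan ‘02

<1% = good, <2.5% = acceptable, <5% = poor, >5% = bad
Regions marked bad: Russia, S America
Regions marked poor: Balkans, M East, Africa, S Asia, Caucasus

[Table: % loss, Monitored Region \ Monitor Country (number of monitoring sites in parentheses). Cell alignment is lost; values are listed in extracted order after the labels that precede them, with "/" separating extracted lines.]

Monitored regions: COM, Canada, US, C America, Australasia, E Asia, Europe, NET, FSU-, Balkans, Mid East, Africa, Baltics, S Asia, Caucasus, S America, Russia, Avg, Pairs
Monitor countries: BR (1), CA (2), DK (1), DE (1), HU (1), IT (3), JP (2), RU (2), CH (1), UK (3), US (16), Avg

BR (1): 0.2 / 1.8 / 0.4 / 1.2 / 0.4 / 1.7
CA (2): 1.6 / 2.6 / 3.5 / 5.6 / 6.2 / 4.5 / 4.6 / 5.8
DK (1), DE (1), HU (1), IT (3): 0.3 / 0.3 0.5 9.0 0.3 / 0.2 0.3 8.0 0.1 / 1.0 1.1 9.0 0.9 / 0.3 0.5 5.4 0.4 / 1.0 1.3 8.0 1.6 / 0.5 9.8 0.5 / 1.4 3.0 8.5 2.8 / 1.5 12.0 1.2
JP (2), RU (2): 1.4 21.7 / 1.4 13.8 / 2.0 / 5.2 / 1.3 15.5 / 3.6 21.9 / 1.6 11.2 / 3.2 11.8 / 4.2 11.9
CH (1), UK (3): 0.7 0.7 / 0.3 1.3 / 0.8 / 1.5 1.4 / 1.1 1.0 / 0.7 0.8 / 4.3 1.2 / 2.0 2.5 / 2.0 1.9
US (16), Avg: 0.3 0.2 / 0.5 3.5 / 0.9 2.7 / 0.9 0.9 / 1.8 1.3 / 1.5 2.6 / 1.0 2.9 / 0.9 4.3 / 2.0 4.0 / 3.8 3.8 / 2.1 4.2 / 2.5 4.8
Baltics, S Asia: 1.6 / 5.3 / 7.3 / 0.8 2.3 7.7 2.2 / 0.1 3.1 9.2 3.0 / 3.5 10.8 / 3.9 17.9 / 4.8 2.1 / 1.5 3.1 / 3.9 4.3 / 3.0 4.9
Caucasus, S America, Russia, Avg: 24.1 11.3 / 7.5 / 6.9 / 0.6 0.9 6.7 12.9 / 7.7 23.0 / 35.9 24.1 22.2 13.4 23.8 21.7 13.6 / 2.8 2.4 9.8 3.7 / 0.7 / 3.9 13.8 / 9.3 1.1 / 3.1 3.2 / 3.2 3.2 / 6.6 9.5 / 8.7 24.1 12.7 18.3 / 2.8 4.4
Pairs: 64 144 54 67 70 203 190 114 209 192 1990

[Table: summary by region. Column labels: Region, Avg, Pairs, Av (NA + WEU]

Regions: COM, Canada, US, C America, Australasia, E Asia, Europe, NET, FSU-, Balkans, Mid East, Africa, Baltics, S Asia, Caucasus, S America, Russia
Values in extracted order: 0.27 / 0.74 / 0.88 2149 0.89 / 1.30 / 1.61 / 1.38 / 2.00 / 2.09 / 3.83 / 2.70 / 2.72 / 3.12 / 3.12 / 3.22 / 6.30 / 17.57 / 3.16
Final row in extracted order: 23 126 19 18 215 852 85 48 109 57 45 67 97 19 203 91 16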
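The loss quality bands from the legend above (<1% good, <2.5% acceptable, <5% poor, >5% bad) can be applied mechanically. The sketch below is illustrative only: `loss_quality` is a hypothetical name, and since the legend does not say which band boundary values such as exactly 5% fall into, each boundary is assumed to belong to the worse band.

```python
def loss_quality(loss_pct: float) -> str:
    """Map a packet loss percentage to the slide's quality band."""
    if not 0.0 <= loss_pct <= 100.0:
        raise ValueError("loss must be a percentage between 0 and 100")
    if loss_pct < 1.0:
        return "good"
    if loss_pct < 2.5:
        return "acceptable"
    if loss_pct < 5.0:
        return "poor"
    # Assumption: exactly 5% counts as bad (the legend only states "> 5%").
    return "bad"


if __name__ == "__main__":
    for pct in (0.27, 2.0, 3.83, 17.57):
        print(f"{pct:5.2f}% -> {loss_quality(pct)}")
```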

History - Throughput quality improvements from US

TCP BW < MSS/(RTT*sqrt(loss)) (1)

• 80% annual improvement ~ factor 10/4yr
• Factor 100 improvement in 8 years

(1) The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm, Mathis, Semke, Mahdavi, Ott, Computer Communication Review 27(3), July 1997
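As a rough illustration, the bound TCP BW < MSS/(RTT*sqrt(loss)) and the compounding behind "80% annual improvement ~ factor 10/4yr" can be evaluated directly. This is a sketch under stated assumptions: the function names are hypothetical, the bound is used exactly as written on the slide (no extra constant), and the MSS of 1460 bytes in the usage example is an assumed value, not one from the talk.

```python
import math


def tcp_bw_bound_bps(mss_bytes: float, rtt_s: float, loss_fraction: float) -> float:
    """Upper bound on TCP throughput in bits/s: MSS / (RTT * sqrt(loss))."""
    if mss_bytes <= 0 or rtt_s <= 0 or not 0.0 < loss_fraction <= 1.0:
        raise ValueError("need mss > 0, rtt > 0 and 0 < loss <= 1")
    return 8.0 * mss_bytes / (rtt_s * math.sqrt(loss_fraction))


def compound_factor(annual_improvement: float, years: int) -> float:
    """Total improvement factor after compounding an annual rate (0.8 = 80%/yr)."""
    return (1.0 + annual_improvement) ** years


if __name__ == "__main__":
    # Assumed example path: 1460-byte MSS, 100 ms RTT, 1% loss.
    print(f"bound: {tcp_bw_bound_bps(1460, 0.100, 0.01) / 1e6:.2f} Mbit/s")
    print(f"80%/yr over 4 years: x{compound_factor(0.8, 4):.1f}")
    print(f"80%/yr over 8 years: x{compound_factor(0.8, 8):.0f}")
```

Because loss enters under a square root, cutting loss by a factor of 4 only doubles the bound, while halving RTT doubles it directly. Compounding 80% per year gives about a factor of 10.5 in 4 years and about 110 in 8 years, consistent with the slide's rounded figures.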

Detailed example of improvements

• Increase of bandwidth by factor of 460 in 6 years, more than kept pace - factor of 50 times improvement in loss
• Note valleys when students on vacation

Summary - results

• Internet A&R connectivity performance is improving
  – RTT 10-20%/yr, loss 50%/yr, throughput 80%/yr
  – Reduced use of satellites, mainly used for new hard to get to areas (e.g. S. Russian Republics)
• China, S.E. Europe, Russia rate of change keeps up but several years behind
• India, S. America performance is where N. America & W. Europe were 4 – 5 years ago
• Africa limited continuous results (UCT & Wits. no longer respond): Uganda losses in last 2 years reduced from 10% to 3%, RTT fairly constant at 800ms.
• Improvements need constant investments to understand & improve

Summary - PingER

• Lightweight (100bps/host pair, 21 pings/30mins per pair)
• Very useful for inter-regional and poor links
• Easy to deploy (uses ubiquitous Internet ping infrastructure), however pings can be blocked
• Easy to deploy for monitoring of sites in developing countries
  – Remote sites ~ no effort (provide contact & host)
  – Monitoring site small effort: 1 day to download software, set up & configure (shared host), choose remote hosts to monitor, make data available for upload, check working; ongoing: respond to emails.
  – SLAC would be willing to assist
  – Data public so anyone can do analysis/presentation of data
  – Provide me (business card or email [email protected]) with contact and name of host to be monitored
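To put "100bps/host pair, 21 pings/30mins per pair" in perspective, the back-of-envelope arithmetic below converts the quoted rate into bytes. It is only a sketch: the slide gives neither the ping packet sizes nor whether the rate counts one or both directions, so the per-ping figure is just an average implied by the two quoted numbers, and all names are hypothetical.

```python
RATE_BPS = 100            # quoted average load per monitor/remote pair, bits/s
PINGS_PER_CYCLE = 21      # quoted pings per measurement cycle
CYCLE_SECONDS = 30 * 60   # one cycle every 30 minutes


def bytes_per_cycle(rate_bps: float = RATE_BPS,
                    cycle_s: float = CYCLE_SECONDS) -> float:
    """Bytes carried per pair during one 30-minute cycle at the quoted rate."""
    return rate_bps * cycle_s / 8.0


def avg_bytes_per_ping() -> float:
    """Average bytes per ping implied by the quoted rate and ping count."""
    return bytes_per_cycle() / PINGS_PER_CYCLE


def bytes_per_day(rate_bps: float = RATE_BPS) -> float:
    """Bytes carried per pair per day at the quoted rate."""
    return rate_bps * 86_400 / 8.0


if __name__ == "__main__":
    print(f"per cycle: {bytes_per_cycle():.0f} bytes")
    print(f"per ping (average): {avg_bytes_per_ping():.0f} bytes")
    print(f"per day: {bytes_per_day() / 1e6:.2f} MB")
```

At roughly 1 MB per pair per day, the load stays small even on the poor links where the measurements are most useful.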

Help

• Looking for better hosts to monitor & contacts in:
  – Albania, Armenia, Austria, Azerbaijan
  – Macedonia*, Turkey*, Yugoslavia
  – Colombia*, Venezuela*, Cuba, Mexico*
  – Pakistan*
  – Africa (apart from Egypt, Uganda & South Africa; n.b. according to http://www3.sn.apc.org/africa/afrmain.htm all 54 countries in Africa now have Internet access in capitals)
  – Note there are a few countries (about 5% of the world’s countries) that do not have full Internet connections and pay dearly by the byte.
• A couple of years ago these included: Afghanistan, Western Sahara, Christmas Island, S. Georgia, Marshall Islands, Myanmar, Montserrat, N. Korea, Pitcairn, St Vincent & Grenadines

More Information

• IEPM/PingER home site:
  – www-iepm.slac.stanford.edu/
• What we do, coverage, how to download (free) software, requirements for hosts, results, data download etc.
• Java demonstration from your computer
  – http://jas.freehep.org/demos/PingWorld/