Explaining Power Laws by Trade-Offs
Alex Fabrikant, Elias Koutsoupias, Milena Mihail, Christos Papadimitriou
DIMACS, Feb 13, 2002

Power laws in the Internet
• [Faloutsos³ 1999]: the degrees of the Internet topology are power law distributed
• Both autonomous systems graph and router graph
• Hop distances: ditto
• Eigenvalues: ditto (!??!)
• Model?

The world according to Zipf
• Power laws, Zipf's law, heavy tails, …
• "the signature of human activity"
• i-th largest is ~ i^(-a) (cities, words: a = 1)
• Equivalently: prob[X greater than x] ~ x^(-b)
• (compare with law of large numbers)

Models predicting power laws
• Size-independent growth ("the rich get richer")
• Preferential attachment
• Brownian motion in log
• Exponential arrival + exponential growth
• Copying (web graph)
• Carlson and Doyle 1999: Highly optimized tolerance (HOT)

Our model
• Node i attaches to the earlier node j that attains min_{j<i} [α·d_ij + hop_j]

hop_j
• Average hop distance from other nodes
• Maximum hop distance from other nodes
• Distance from center (first node)
NB: Resulting graph is a tree

Theorem:
• if α < const, then graph is a star: degree = n - 1
• if α > √n, then there is exponential concentration of degrees: prob(degree > x) < exp(-ax)
• otherwise, if const < α < √n, heavy tail: prob(degree > x) > x^(-a)

Also: why are files on the Internet power-law distributed?
• Suppose each data item i has "popularity" a_i
• Partition data items in files to minimize total cost
• Cost of each file: total popularity · size + overhead C
• Notice trade-off!
• From [CD99]

Files (continued)
• Suppose further that popularities of items are iid from distribution f
• Result: File sizes are power law distributed "for any reasonable" distribution f (exponential, Gaussian, uniform, power law, etc.)
• ([CD99] observe it for a few distributions)

Heuristically optimized tradeoffs
• Power law distributions seem to also come from tradeoffs between objectives (a signature of human activity?)
• Generalizes [CD99] (the other objective need not be reliability)
• cf. [Mandelbrot 1954]: power laws in language are due to a tradeoff between information and communication costs

PS: Eigenvalues of the Internet may be a corollary of the degrees phenomenon
Theorem: If a graph has largest degrees d_1, d_2, …, d_k and o(d_k) more edges, then with high probability its largest eigenvalues are within (1 + o(1)) of √d_1, √d_2, …, √d_k
(NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent!)
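
Sketch: simulating the growth model
The trade-off model from the "Our model" slide can be tried out with a short simulation. The sketch below is illustrative only and rests on assumptions the slides do not state: points arrive uniformly at random in the unit square, d_ij is Euclidean distance, the distance term carries a weight called alpha here, and hop_j is taken to be the hop distance from the center (first node), which is one of the three options listed for hop_j. The function name grow_tree is hypothetical.

```python
import math
import random


def grow_tree(n, alpha, seed=0):
    """Grow a tree on n random points.

    Node i attaches to the earlier node j (j < i) that minimizes
    alpha * d_ij + hop_j, where hop_j is the hop distance from node 0.
    Returns (parent, degree) lists indexed by node.
    """
    rng = random.Random(seed)
    # Assumption (not stated on the slides): uniform points in the unit square.
    points = [(rng.random(), rng.random()) for _ in range(n)]
    parent = [None] * n   # parent[0] stays None: node 0 is the center
    hops = [0] * n        # hop distance from each node to node 0
    degree = [0] * n
    for i in range(1, n):
        # Trade-off: geometric closeness versus centrality of the target.
        best_j = min(
            range(i),
            key=lambda j: alpha * math.dist(points[i], points[j]) + hops[j],
        )
        parent[i] = best_j
        hops[i] = hops[best_j] + 1
        degree[i] += 1
        degree[best_j] += 1
    return parent, degree


if __name__ == "__main__":
    n = 500
    for alpha in (0.5, 10.0, 1000.0):
        _, degree = grow_tree(n, alpha)
        print(f"alpha={alpha}: max degree = {max(degree)}")
```

Under these assumptions the star regime can be checked by hand: distances in the unit square are at most √2, so for alpha < 1/√2 attaching to the center always costs less than 1, while any other target costs at least its hop count of 1. Every node therefore attaches to node 0. Larger values of alpha make distance dominate, and the maximum degree drops.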