Rules of Thumb in Data Engineering Jim Gray UC Santa Cruz 7 May 2002 [email protected], http://research.Microsoft.com/~Gray/Talks/

Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb
Meta-Message:
Technology Ratios Matter
Price and Performance change.
If everything changes in the same way,
then nothing really changes.
If some things get much cheaper/faster than others,
then that is real change.
Some things are not changing much:
Cost of people
Speed of light
…
And some things are changing a LOT
Trends: Moore’s Law
Performance/price doubles every 18 months:
100x per decade.
Progress in the next 18 months = ALL previous progress:
new storage = sum of all old storage (ever);
new processing = sum of all old processing.
(E. coli doubles every 20 minutes!)
Trends: ops/s/$ Had Three Growth Phases
1890-1945: mechanical, relay (7-year doubling)
1945-1985: tube, transistor, … (2.3-year doubling)
1985-2000: microprocessor (1.0-year doubling)
[Chart: ops per second per $, 1880-2000, log scale 1.E-06 to 1.E+09, showing the three doubling rates.]
So: a problem
Suppose you have a ten-year compute job on the world’s fastest supercomputer.
What should you do?
Commit 250 M$ now?
Or program for 9 years first?
Software speedup: 2^6 = 64x
Moore’s-law speedup: 2^6 = 64x
So, a 4,000x speedup: spend 1 M$ (not 250 M$) on hardware,
and it runs in 2 weeks, not 10 years.
Homework problem: what is the optimum strategy?
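The homework can be sketched numerically. A minimal model under the slide's assumptions (a 10-year job, hardware speed doubling every 18 months): wait w years, buy the machine of that day, and the job runs 2^(w/1.5) times faster.

```python
def total_time(wait_years, job_years=10.0, doubling_years=1.5):
    """Wait, buy then-current hardware, then run the job at the sped-up rate."""
    speedup = 2.0 ** (wait_years / doubling_years)
    return wait_years + job_years / speedup

# Scan waiting times in steps of 0.01 years for the minimum total time.
best_wait = min((w / 100.0 for w in range(1001)), key=total_time)
print(f"wait {best_wait:.2f} years, done in {total_time(best_wait):.2f} years")
```

The optimum under this model is to wait about 3.3 years and finish in about 5.5 years total, far better than starting the run immediately (10 years) or programming for 9 years first.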
Disk TB Shipped per Year
Storage capacity is beating Moore’s law:
disk TB growth: 112%/year, vs Moore’s law: 58.7%/year.
1 k$/TB today (raw disk); 100 $/TB by end of 2007.
Since 1993: TB growth 112.3%/year, price decline 50.7%/year, revenue growth 7.5%/year.
[Chart: disk TB shipped per year, 1988-2000, log scale 1E+3 to 1E+7.]
Source: 1998 Disk Trend (Jim Porter), http://www.disktrend.com/pdf/portrpkg.pdf
Trends: Magnetic Storage Densities
Amazing progress; ratios have changed.
Improvements per year:
Capacity: 60%/y
Bandwidth: 40%/y
Access time: 16%/y
[Chart: magnetic disk parameters (tpi, kbpi, MBps, Gbpsi) vs time, 1984-2004, log scale 0.01 to 1,000,000.]
Trends: Density Limits
Bit density: the end is near!
Products: 23 Gbpsi
Lab: 50 Gbpsi
“Limit”: 60 Gbpsi (the superparamagnetic limit)
But the limit keeps rising, and there are alternatives:
NEMS, fluorescent?, holographic, DNA?
[Chart: density (b/µm² and Gb/in²) vs time, 1990-2008, showing the superparamagnetic limit and the wavelength limit (CD, DVD, ODD).]
Figure adapted from Franco Vitaliano, “The NEW new media: the growing attraction of nonmagnetic storage”, Data Storage, Feb 2000, pp 21-32, www.datastorage.com
Trends: promises
NEMS (Nano Electro Mechanical Systems)
(http://www.nanochip.com/); also Cornell, IBM, CMU, …
250 Gbpsi by using a tunneling electron microscope
Disk replacement
Capacity: 180 GB now, 1.4 TB in 2 years
Transfer rate: 100 MB/sec read & write
Latency: 0.5 ms
Power: 23 W active, 0.05 W standby
10 k$/TB now, 2 k$/TB in 2004
Consequence of Moore’s law:
Need an address bit every 18 months,
since Moore’s law gives you 2x more in 18 months.
RAM:
today we have 10 MB to 100 GB machines (24-36 bits of addressing);
in 9 years we will need 6 more bits: 30-42 bit addressing (4 TB RAM).
Disks:
today we have 10 GB to 100 TB file systems/DBs (33-47 bit file addresses);
in 9 years we will need 6 more bits: 40-53 bit file addresses (100 PB files).
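The address-bit arithmetic is easy to check: one address bit per capacity doubling, one doubling per 18 months. A small sketch using the slide's RAM figures:

```python
import math

def address_bits(capacity_bytes):
    """Bits needed to address every byte of a store this large."""
    return math.ceil(math.log2(capacity_bytes))

extra_bits = 9 / 1.5                 # 9 years ahead, one bit per 18 months
print(extra_bits)                    # 6.0 more bits
print(address_bits(100 * 2**30))     # a 100 GB machine today: 37 bits
print(address_bits(4 * 2**40))       # 4 TB of RAM in 9 years: 42 bits
```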
Architecture could change this
1-level store:
System/38, AS/400 have a 1-level store;
never re-uses an address;
needs 96-bit addressing today.
NUMAs and clusters:
willing to buy a 100 M$ computer? Then add 6 more address bits.
Only 1-level store pushes us beyond 64 bits.
Still, these are “logical” addresses;
64-bit physical will last many years.
Trends: Gilder’s Law:
3x bandwidth/year for 25 more years
Today:
40 Gbps per channel (λ);
12 channels per fiber (WDM): 500 Gbps;
32 fibers/bundle = 16 Tbps/bundle.
In the lab: 3 Tbps/fiber (400x WDM).
In theory: 25 Tbps per fiber.
1 Tbps = USA 1996 WAN bisection bandwidth.
Aggregate bandwidth doubles every 8 months!
Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb
How much storage do we need?
Soon everything can be recorded and indexed!
Most bytes will never be seen by humans.
Data summarization, trend detection, and anomaly detection are key technologies.
See Mike Lesk, How much information is there: http://www.lesk.com/mlesk/ksg97/ksg.html
See Lyman & Varian, How much information: http://www.sims.berkeley.edu/research/projects/how-much-info/
[Scale ladder from Kilo to Yotta: a book, a photo, a movie sit at the Mega/Giga end; all LoC books (words), all books multimedia, and everything recorded! span Tera through Yotta.]
(And going down: 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli.)
Storage Latency: How Far Away is the Data?
(access time in clock ticks, with distance analogies)
10^9: tape/optical robot: Andromeda (2,000 years)
10^6: disk: Pluto (2 years)
100: memory: Springfield (1.5 hr)
10: on-board cache: this campus (10 min)
2: on-chip cache: this room
1: registers: my head (1 min)
Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Two charts vs access time (10^-9 to 10^3 seconds). Size vs Speed: typical system capacities in bytes, from cache (~10^3) through main and secondary (disk) up to nearline tape (~10^15). Price vs Speed: $/MB, from offline/nearline tape (~10^-4) through disk and main up to cache (~10^2).]
Disks: Today
Disk is 18 GB to 180 GB
10-50 MBps
5k-15k rpm (6 ms to 2 ms rotational latency)
12 ms to 7 ms seek
1 k$/IDE-TB, 6 k$/SCSI-TB
For shared disks, most time is spent waiting in a queue for access to the arm/controller.
[Service-time diagram: wait, then seek, rotate, transfer.]
The Street Price of a Raw Disk TB: about 1 k$/TB
[Four scatter charts of price ($) vs raw disk unit size (GB), dated 12/1/1999, 9/1/2000, 9/1/2001, and 4/1/2002, each with linear fits for SCSI and IDE. The fitted $/GB slopes (y = 17.9x, 13x, 7.2x, 6.7x, 6x, 3.8x, 2.0x, and finally y = x) fall steadily from 1999 to 2002, with SCSI several times more expensive than IDE. The 4/1/2002 IDE fit of y = x means 1 $/GB: about 1 k$/TB raw.]
Standard Storage Metrics
Capacity:
RAM: MB and $/MB: today at 512 MB and 200 $/GB
Disk: GB and $/GB: today at 80 GB and 7 k$/TB
Tape: TB and $/TB: today at 40 GB and 7 k$/TB (nearline)
Access time (latency):
RAM: 1…100 ns
Disk: 5…15 ms
Tape: 30 second pick, 30 second position
Transfer rate:
RAM: 1-10 GB/s
Disk: 10-50 MB/s (arrays can go to 10 GB/s)
Tape: 5-15 MB/s (arrays can go to 1 GB/s)
New Storage Metrics: Kaps, Maps, SCAN
Kaps: how many kilobyte objects served per second
(the file server / transaction processing metric; this is the OLD metric)
Maps: how many megabyte objects served per second
(the multimedia metric)
SCAN: how long to scan all the data
(the data mining and utility metric)
And the cost versions: Kaps/$, Maps/$, TBscan/$.
For the Record
(good 2002 devices packaged in a system;
http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)

                     DRAM     DISK     TAPE robot
Unit capacity (GB)   1        80       80 x 100
Unit price ($)       100      500      20,000
$/GB                 500      6        3.5
Latency (s)          1.E-7    5.E-3    30
Bandwidth (MBps)     1000     40       6
Kaps                 9.E+5    199      3.E-2
Maps                 1.E+3    33.33    3.E-2
Scan time (s/TB)     1,000    2,000    333,333
$/Kaps               1.E-12   3.E-8    6.E-3
$/Maps               1.E-9    2.E-7    6.E-3
$/TBscan             $1.06    $0.13    $881

Tape slice is 8 TB with 1 DLT reader at 6 MBps per 100 tapes.
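The disk figures can be roughly re-derived from latency, bandwidth, and capacity alone. A sketch assuming service time = latency + transfer time, and that the disk's listed scan time is one sweep of the 80 GB unit:

```python
def storage_metrics(latency_s, bandwidth_mbps, capacity_gb):
    """Kaps, Maps, and time to scan the whole unit, from device parameters."""
    kaps = 1.0 / (latency_s + (1.0 / 1024) / bandwidth_mbps)  # 1 KB objects
    maps = 1.0 / (latency_s + 1.0 / bandwidth_mbps)           # 1 MB objects
    scan_s = capacity_gb * 1024 / bandwidth_mbps              # one full sweep
    return kaps, maps, scan_s

kaps, maps, scan_s = storage_metrics(5e-3, 40, 80)  # the 2002 disk column
print(f"{kaps:.0f} Kaps, {maps:.1f} Maps, {scan_s:.0f} s scan")
```

This lands near the table's 199 Kaps, 33.33 Maps, and roughly 2,000-second scan.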
For the Record
(good 2002 devices packaged in a system;
http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)
[Log-scale bar chart, 1.E-12 to 1.E+6: Kaps, Maps, scan time (s/TB), $/Kaps, $/Maps, and $/TBscan for DRAM, disk, and tape.]
Tape is 1 TB with 4 DLT readers at 5 MBps each.
Disk Changes
Disks got cheaper: 20 k$ -> 200 $:
$/Kaps etc. improved 100x (Moore’s law!), or even 500x;
a one-time event (went from mainframe prices to PC prices).
Disks got cooler (50x in a decade):
1990: 1 Kaps per 20 MB
2002: 1 Kaps per 1,000 MB
Disk scans take longer (10x per decade):
1990 disk: ~1 GB, 50 Kaps, 5-minute scan
2002 disk: ~160 GB, 160 Kaps, 1-hour scan
So backup/restore takes a long time (too long).
Storage Ratios Changed
10x better access time
10x more bandwidth
100x more capacity
Data 25x cooler (1 Kaps/20 MB vs 1 Kaps/GB)
4,000x lower media price
20x to 100x lower disk price
Scan takes 10x longer (3 min vs 1 hr)
RAM/disk media price ratio changed:
1970-1990: 100:1
1990-1995: 10:1
1995-1997: 50:1
today: 200:1 (~1 $/GB disk, 200 $/GB RAM)
[Charts: disk performance vs time, 1980-2000 (capacity in GB, seeks per second, bandwidth in MB/s), and storage price vs time (megabytes per kilo-dollar).]
More Kaps and Kaps/$ but…
Disk accesses got much less expensive: better disks, cheaper disks!
But disk arms are expensive, the scarce resource:
a 100 GB, 30 MB/s disk takes a 1-hour scan, vs 5 minutes in 1990.
[Chart: Kaps per disk and Kaps/$ over time, 1970-2000, log scale 1.E+0 to 1.E+6.]
Data on Disk Can Move to RAM in 10 years
[Chart: storage price vs time, megabytes per kilo-dollar, 1980-2000. The RAM and disk curves run parallel, 100:1 apart; at a 100x-per-decade price decline, that gap is 10 years.]
The “Absurd” 10x (= 4 year) Disk
1 TB, 100 MB/s, 200 Kaps
2.5 hr scan time (poor sequential access)
1 aps / 5 GB (VERY cold data)
It’s a tape!
Disk vs Tape
Disk:
160 GB, 40 MBps
4 ms seek time, 2 ms rotate latency
1 $/GB for drive, 1 $/GB for ctlrs/cabinet
60 TB/rack; 1 hour scan
Tape:
80 GB, 10 MBps
10 sec pick time, 30-120 second seek time
2 $/GB for media, 5 $/GB for drive+library
20 TB/rack; 1 week scan
Guesstimates: CERN: 200 TB; 3480 tapes; 2 col = 50 GB; rack = 1 TB = 8 drives.
The price advantage of tape is gone, and the performance advantage of disk is growing.
At 10 k$/TB, disk is competitive with nearline tape.
Caveat: tape vendors may innovate
Sony DTF-2 is 100 GB, 24 MBps, 30 second pick time.
So, 2x better. Prices not clear.
http://bpgprod.sel.sony.com/DTF/seismic/dtf2.html
It’s Hard to Archive a Petabyte
It takes a LONG time to restore it: at 1 GBps it takes 12 days!
Store it in two (or more) places online (on disk?): a geo-plex.
Scrub it continuously (look for errors).
On failure:
use the other copy until the failure is repaired,
refresh the lost copy from the safe copy.
Can organize the two copies differently (e.g.: one by time, one by space).
Auto Manage Storage
1980 rule of thumb:
a DataAdmin per 10 GB, a SysAdmin per mips.
2002 rule of thumb:
a DataAdmin per 5 TB;
a SysAdmin per 100 clones (varies with app).
Problem: 5 TB is >5 k$ today, 500 $ in a few years.
Admin cost >> storage cost!
Challenge: automate ALL storage admin tasks.
How to cool disk data:
Cache data in main memory
(see the 5-minute rule later in this presentation).
Fewer, larger transfers:
larger pages (512 B -> 8 KB -> 256 KB).
Sequential rather than random access:
random 8 KB IO is 1.5 MBps;
sequential IO is 30 MBps (a 20:1 ratio, and growing).
RAID1 (mirroring) rather than RAID5 (parity).
Stripes, Mirrors, Parity (RAID 0, 1, 5)
RAID 0: stripes
bandwidth.
RAID 1: mirrors, shadows, …
fault tolerance; reads faster, writes 2x slower.
RAID 5: parity
fault tolerance; reads faster; writes 4x or 6x slower.
[Block layouts: RAID 0 stripes 0,3,6,… / 1,4,7,… / 2,5,8,…; RAID 1 mirrors 0,1,2,… twice; RAID 5 rotates parity: 0,2,P2,… / 1,P1,4,… / P0,3,5,…]
RAID 10 (stripes of mirrors) Wins
“Wastes space, saves arms.”
RAID 5 (6 disks, 1 volume):
675 reads/sec, 210 writes/sec;
a write is 4 logical IOs: 2 seeks + 1.7 rotates;
saves space; performance degrades on failure.
RAID 1 (6 disks, 3 pairs):
750 reads/sec, 300 writes/sec;
a write is 2 logical IOs: 2 seeks + 0.7 rotates;
saves arms; performance improves on failure.
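The arm arithmetic can be sketched with an assumed ~125 random IOs per second per arm (my assumption; the measured numbers above differ slightly because of the extra rotational costs listed):

```python
# Back-of-envelope arm accounting for 6 disks.
ARMS, IOPS_PER_ARM = 6, 125           # 125 IOs/s/arm is an assumed figure

raid1_reads  = ARMS * IOPS_PER_ARM        # either mirror can serve a read
raid1_writes = ARMS * IOPS_PER_ARM // 2   # each write goes to both mirrors
raid5_reads  = ARMS * IOPS_PER_ARM        # reads spread across all arms
raid5_writes = ARMS * IOPS_PER_ARM // 4   # read + write of data and parity

print(raid1_reads, raid1_writes, raid5_reads, raid5_writes)
```

The 2:1 write-cost advantage of mirroring over parity falls straight out of the IO counts.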
Index Page Utility vs Page Size: the best index page size is ~16 KB
[Two charts: index page utility vs page size (2 KB to 128 KB), one by disk transfer rate, one by index element size.]

Utility by disk speed (16-byte entries); page size in KB:
          2     4     8     16    32    64    128
40 MB/s  0.65  0.74  0.83  0.91  0.97  0.99  0.94
10 MB/s  0.64  0.72  0.78  0.82  0.79  0.69  0.54
5 MB/s   0.62  0.69  0.73  0.71  0.63  0.50  0.34
3 MB/s   0.51  0.56  0.58  0.54  0.46  0.34  0.22
1 MB/s   0.40  0.44  0.44  0.41  0.33  0.24  0.16

Utility by index element size (10 MB/s disk); page size in KB:
          2     4     8     16    32    64    128
16 B     0.64  0.72  0.78  0.82  0.79  0.69  0.54
32 B     0.54  0.62  0.69  0.73  0.71  0.63  0.50
64 B     0.44  0.53  0.60  0.64  0.64  0.57  0.45
128 B    0.34  0.43  0.51  0.56  0.56  0.51  0.41
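One plausible model behind charts like these (an assumption for illustration, not necessarily the exact formula used for the tables): a page's benefit is the log2 of its fanout, and its cost is the time to read it.

```python
import math

def index_page_utility(page_kb, entry_bytes=16, latency_ms=10.0, mbps=10.0):
    """log2(fanout) gained per millisecond of page read time."""
    fanout = page_kb * 1024 / entry_bytes
    read_ms = latency_ms + page_kb / mbps    # KB / (MB/s) is roughly ms
    return math.log2(fanout) / read_ms

best = max([2, 4, 8, 16, 32, 64, 128], key=index_page_utility)
print(best)
```

For a 10 MB/s disk with 16-byte entries, this simple model also peaks at a 16 KB page: bigger pages buy more fanout, but past that point the transfer time grows faster than the log.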
Summarizing storage rules of thumb (1)
Moore’s law: 4x every 3 years, 100x more per decade;
implies 2 bits of addressing every 3 years.
Storage capacities increase 100x/decade.
Storage costs drop 100x per decade.
Storage throughput increases 10x/decade.
Data cools 10x/decade.
Disk page sizes increase 5x per decade.
Summarizing storage rules of thumb (2)
RAM:disk and disk:tape cost ratios are 100:1 and 1:1.
So, in 10 years, disk data can move to RAM,
since prices decline 100x per decade.
A person can administer a million dollars of disk storage: that is 1 TB - 100 TB today.
Disks are replacing tapes as backup devices.
You can’t backup/restore a petabyte quickly, so geoplex it.
Mirroring rather than parity, to save disk arms.
Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb
Standard Architecture (today)
[Diagram: processors on a system bus, bridged to PCI Bus 1 and PCI Bus 2.]
Amdahl’s Balance Laws
Parallelism law: if a computation has a serial part S and a parallel component P,
then the maximum speedup is (S+P)/S.
Balanced system law: a system needs a bit of IO per second per instruction per second:
about 8 MIPS per MBps.
Memory law: α = 1: the MB/MIPS ratio (called alpha) in a balanced system is 1.
IO law: programs do one IO per 50,000 instructions.
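The two quantitative laws make a simple checklist. A sketch that scores any configuration against the canonical ratios (the function name and report keys are mine):

```python
def balance_report(mips, io_mbps, ram_mb):
    """Compare a system to Amdahl's ratios: ~8 MIPS/MBps of IO, alpha ~ 1."""
    return {
        "MIPS per MBps of IO (target ~8)": mips / io_mbps,
        "alpha = MB per MIPS (target ~1)": ram_mb / mips,
    }

# Amdahl's canonical machine: 1 MIPS, one bit of IO per instruction
# (1 Mbps = 1/8 MBps), and 1 MB of RAM.
print(balance_report(mips=1, io_mbps=1 / 8, ram_mb=1))
```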
Amdahl’s Laws Valid 35 Years Later?
The parallelism law is algebra: so SURE!
Balanced system laws?
Look at TPC results (TPC-C, TPC-H) at http://www.tpc.org/
Some imagination is needed:
what’s an instruction (CPI varies from 1-3)? RISC, CISC, VLIW, … clocks per instruction, …
what’s an I/O?
TPC systems
Normalize for CPI (clocks per instruction):
TPC-C has about 7 ins/byte of IO; TPC-H has 3 ins/byte of IO.
TPC-H needs half as many disks (sequential vs random).
Both use 9 GB 10 krpm disks (they need arms, not bytes).

                    MHz/cpu  CPI  mips  KB/IO  IO/s/disk  Disks  Disks/cpu  MB/s/cpu  Ins/Byte of IO
Amdahl                 -      1     1     1        -        -        -          6            8
TPC-C (random)        550    2.1   262    8       100      397      50         40            7
TPC-H (sequential)    550    1.2   458   64       100      176      22        141            3
TPC systems: What’s alpha (= MB/MIPS)?
Hard to say:
Intel has 32-bit addressing (= 4 GB limit) and known CPI;
IBM, HP, Sun have a 64 GB limit and unknown CPI.
Look at both; guess the CPI for IBM, HP, Sun.
Alpha is between 1 and 6:

             Mips                  Memory   Alpha
Amdahl       1                     1        1
tpcC Intel   8 x 262 = 2 Gips      4 GB     2
tpcH Intel   8 x 458 = 4 Gips      4 GB     1
tpcC IBM     24 cpus ?= 12 Gips    64 GB    6
tpcH HP      32 cpus ?= 16 Gips    32 GB    2
Instructions per IO?
We know: 8 mips per MBps of IO.
So an 8 KB page is 64 K instructions,
and a 64 KB page is 512 K instructions.
But sequential IO has fewer instructions/byte (3 vs 7, TPC-H vs TPC-C),
so a 64 KB sequential page is ~200 K instructions.
Amdahl’s Balance Laws Revised
Laws right, just need “interpretation” (imagination?).
Balanced system law:
a system needs 8 MIPS/MBps of IO,
but the instruction rate must be measured on the workload:
sequential workloads have low CPI (clocks per instruction);
random workloads tend to have higher CPI.
Alpha (the MB/MIPS ratio) is rising from 1 to 6; this trend will likely continue.
One random IO per 50k instructions;
sequential IOs are larger: one sequential IO per 200k instructions.
PAP vs RAP (a y2k perspective)
Peak Advertised Performance vs Real Application Performance
(application and file system data flowing from the CPUs over the system bus, the PCI buses, and SCSI to the disks):
CPU: 2000 MHz x 4 = 8 Bips advertised; at 1-6 CPI, 500..2,000 mips real
System bus: 1600 MBps advertised, 500 MBps real
PCI: 133 MBps advertised, 90 MBps real
SCSI: 160 MBps advertised, 90 MBps real
Disks: 66 MBps advertised, 40 MBps real
Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb
Standard IO (Infiniband™) Next Year? Probably.
Replace PCI with something better;
will still need a mezzanine bus standard.
Multiple serial links directly from the processor.
Fast (10 GBps/link) for a few meters.
System Area Networks (SANs) ubiquitous (VIA morphs to Infiniband?).
Ubiquitous 10 GBps SANs in 5 years
1 Gbps Ethernet is a reality now (120 MBps).
Also FiberChannel, MyriNet, GigaNet, ServerNet, ATM, …
10 Gbps x4 WDM deployed now (OC192 = 1 GBps).
3 Tbps WDM working in the lab.
In 5 years, expect 10x. Wow!
[Diagram: today’s link speeds at 5, 20, 40, 80, and 120 MBps.]
Networking
WANs are getting faster than LANs:
G8 = OC192 = 9 Gbps is “standard”.
Link bandwidth improves 4x per 3 years.
Speed of light is fixed (60 ms round trip within the US).
Software stacks have always been the problem:
Time = SenderCPU + ReceiverCPU + bytes/bandwidth.
The CPU terms have been the problem for small (10 KB or less) messages.
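The time formula is easy to play with. A sketch with assumed 50 µs per-end CPU costs and a 100 MBps wire (all three numbers are illustrative, not measurements from the talk):

```python
def send_time_us(nbytes, sender_cpu_us=50.0, receiver_cpu_us=50.0,
                 bandwidth_mbps=100.0):
    """Time = SenderCPU + ReceiverCPU + bytes/bandwidth.
    1 MBps is one byte per microsecond, so the division yields us."""
    return sender_cpu_us + receiver_cpu_us + nbytes / bandwidth_mbps

print(send_time_us(1024))       # 1 KB: the fixed CPU terms dominate
print(send_time_us(1_000_000))  # 1 MB: the wire term dominates
```

At 1 KB only about 10 µs of the ~110 µs total is wire time, which is exactly why small messages expose the software stack.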
The Promise of SAN/VIA: 10x in 2 years
http://www.ViArch.org/
Yesterday:
10 MBps (100 Mbps Ethernet);
~20 MBps tcp/ip saturates 2 cpus;
round-trip latency ~250 µs.
Now:
wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet, …;
fast user-level communication:
tcp/ip ~ 100 MBps at 10% cpu;
round-trip latency is 15 µs;
1.6 Gbps demoed on a WAN.
[Chart: time (µs) to send 1 KB, split into sender cpu, receiver cpu, and transmit, for 100 Mbps, Gbps, and SAN.]
The Network Revolution
Networking folks are finally streamlining the LAN case (SAN):
offloading protocol to the NIC;
half-power point is 8 KB;
min round-trip latency is ~50 µs;
~3k instructions + 0.1 instructions/byte.
[Charts: round-trip time (µs) and bandwidth (MBps) vs data size for TCP, VIA-copy, VIA-direct, and LPC.]
High-Performance Distributed Objects over a System Area Network. Li, L.; Forin, A.; Hunt, G.; Wang, Y. MSR-TR-98-68
How much does wire-time cost?
Headline $/MB and time to move 1 MB:
Gbps Ethernet: 0.2 µ$, 10 ms
100 Mbps Ethernet: 0.3 µ$, 100 ms
OC12 (650 Mbps): 0.003 $, 20 ms
DSL: 0.0006 $, 25 s
POTS: 0.002 $, 200 s
Wireless: 0.80 $, 500 s

Details (amortizing a 3-year seat cost; 94,608,000 seconds in 3 years):
            Seat cost $/3y   Bandwidth B/s   $/MB     Time for 1 MB (s)
GbpsE            2,000         1.00E+08      2.E-07     0.010
100MbpsE           700         1.00E+07      7.E-07     0.100
OC12        12,960,000         5.00E+07      3.E-03     0.020
OC3          3,132,000         3.00E+06      1.E-02     0.333
T1              28,800         1.00E+05      3.E-03    10.000
DSL              2,300         4.00E+04      6.E-04    25.000
POTS             1,180         5.00E+03      2.E-03   200.000
Wireless             ?         2.00E+03      8.E-01   500.000
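The $/MB column is just the 3-year seat cost amortized over the megabytes the link can carry in 3 years:

```python
SECONDS_3Y = 94_608_000

def dollars_per_mb(seat_cost_3y, bytes_per_sec):
    """Amortize a 3-year link rental over the MB it can move in that time."""
    mb_in_3y = bytes_per_sec * SECONDS_3Y / 1e6
    return seat_cost_3y / mb_in_3y

print(f"{dollars_per_mb(2000, 1e8):.1e}")  # Gbps Ethernet: ~2e-07 $/MB
print(f"{dollars_per_mb(1180, 5e3):.1e}")  # POTS: ~2e-03 $/MB
```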
Data delivery costs 1$/GB today
Rent for “big” customers: 300 $/megabit per second per month.
Improved 3x in the last 6 years (!).
That translates to 1 $/GB at each end.
Overhead (routers, people, …) makes it 6 $/GB at each end.
You can mail a 160 GB disk for 20 $:
that’s 16x cheaper;
if overnight, it’s 4 MBps;
7 disks ~ 30 MBps (1/4 Gbps).
TeraScale SneakerNet: 7 x 160 GB ~ 1 TB.
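The effective bandwidth of shipped disks is just capacity over door-to-door time; a sketch assuming a 12-hour overnight transit (the transit time is my assumption):

```python
def sneakernet_mbps(capacity_gb, transit_hours):
    """Effective bandwidth of shipping disks: capacity over transit time."""
    return capacity_gb * 1000 / (transit_hours * 3600)

print(f"{sneakernet_mbps(160, 12):.1f}")      # one 160 GB disk: ~3.7 MBps
print(f"{sneakernet_mbps(7 * 160, 12):.1f}")  # a 7-disk, ~1 TB box: ~26 MBps
```

The box of seven disks beats most 2002 WAN links, at parcel-post prices.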
Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb
The Five Minute Rule
Trade DRAM for disk accesses:
cost of a disk access: PricePerDiskDrive / AccessesPerSecond;
cost of a DRAM page: PricePerMBofDRAM / PagesPerMB.
Break-even has two terms, a technology term and an economic term:

BreakEvenReferenceInterval =
(PagesPerMBofDRAM / AccessesPerSecondPerDisk) x (PricePerDiskDrive / PricePerMBofDRAM)

Page sizes grew to compensate for the changing ratios.
Now at 5 minutes for random IO, 10 seconds for sequential.
The 5 Minute Rule Derived
Let T = time between references to the page.
Cost of keeping the page in RAM: RAM_$_Per_MB / PagesPerMB.
Cost of fetching it on each reference: DiskPrice / (T x AccessesPerSecond).
Breakeven:
RAM_$_Per_MB / PagesPerMB = DiskPrice / (T x AccessesPerSecond)
T = (DiskPrice x PagesPerMB) / (RAM_$_Per_MB x AccessesPerSecond)
Plugging in the Numbers
BreakEvenReferenceInterval =
(PagesPerMBofDRAM / AccessesPerSecondPerDisk) x (PricePerDiskDrive / PricePerMBofDRAM)

            PPM / aps       disk$ / RAM$     break-even
Random      128/120 ~ 1     1000/3 ~ 300     ~300 s = 5 minutes
Sequential  1/30 ~ .03      ~300             ~10 seconds

The trend is toward longer times:
disk$ is not changing much, while RAM$ declines 100x per decade.
Hence the 5-minute & 10-second rules.
The 10 Instruction Rule
Spend up to 10 instructions per second to save 1 byte of RAM.
Cost of an instruction per second: I = ProcessorCost / (MIPS x LifeTime)
Cost of a byte: B = RAM_$_Per_B / LifeTime
Breakeven: N x I = B
N = B/I = (RAM_$_Per_B x MIPS) / ProcessorCost
~ (3E-6 x 5E8) / 500 = 3 ins/B for Intel
~ (3E-6 x 3E8) / 10 = 10 ins/B for ARM
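The break-even as code, checked against the Intel case:

```python
def break_even_ins_per_byte(ram_dollars_per_byte, ins_per_sec, processor_cost):
    """N = B / I = (RAM $/byte x instructions per second) / processor cost."""
    return ram_dollars_per_byte * ins_per_sec / processor_cost

# Intel case from the slide: 3E-6 $/byte RAM, 5E8 ips (500 mips), 500$ cpu.
print(f"{break_even_ins_per_byte(3e-6, 5e8, 500):.1f}")  # 3.0 ins/B
```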
Trading Storage for Computation
You can spend 10 bytes of RAM to save 1 instruction per second.
Rent for disk: 1 $/GB (forever).
A processor costs 10$ to 1,000$ per mips,
i.e. 10$ to 1,000$ for 100 TeraOps (one mips over its lifetime),
so about 1 $/TeraOp (down to a penny per TeraOp).
At 1 $/GB and 1 $/TeraOp:
1 GB ~ 1 TeraOp, 1 MB ~ 1 GigaOp, 1 KB ~ 1 MegaOp.
So save a 1 KB object on disk if it costs more than 10 ms to compute.
When to Cache Web Pages
Caching saves user time.
Caching saves wire time.
Caching costs storage.
Caching only works sometimes:
new pages are a miss;
stale pages are a miss.
Web Page Caching Saves People Time
Assume people cost 20 $/hour (or 0.2 $/hr ???).
Assume a 20% hit rate in the browser, 40% in the proxy.
Assume 3-second server time.
Caching saves people time:
28 $/year to 150 $/year of people time,
or 0.28 $ to 1.50 $/year at the low wage.

Connection  Cache    R_remote (s)  R_local (s)  Hit rate H  People savings (¢/page)
LAN         proxy    3             0.3          0.4         0.6
LAN         browser  3             0.1          0.2         0.3
Modem       proxy    5             2            0.4         0.7
Modem       browser  5             0.1          0.2         0.5
Mobile      proxy    13            10           0.4         0.7
Mobile      browser  13            0.1          0.2         1.4
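The savings column follows from hit rate, response times, and the wage. A sketch of that arithmetic:

```python
def savings_cents_per_page(r_remote_s, r_local_s, hit_rate, wage_per_hr=20.0):
    """People-time saved per page view, valued at a wage (20 $/hr here)."""
    saved_s = hit_rate * (r_remote_s - r_local_s)
    return saved_s * wage_per_hr / 3600 * 100

print(f"{savings_cents_per_page(3, 0.3, 0.4):.1f}")   # LAN proxy: 0.6 cents
print(f"{savings_cents_per_page(13, 0.1, 0.2):.1f}")  # mobile browser: 1.4 cents
```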
Web Page Caching Saves Resources
Wire cost is a penny (wireless) to 100 µ$ (LAN) per 10 KB.
Storage is 8 µ$/month.
Breakeven: wire cost = storage rent: 18 months to 300 years.
Add people cost: breakeven > 15 years;
with “cheap people” (0.2 $/hr): > 3 years.

A = $/10 KB download, B = $/10 KB storage per month, C = people cost of a download.
Network-only break-even time = A/B; with people time, (A + C)/B.
              A (download)   B (storage/mo)   A/B          C        (A+C)/B
Internet/LAN    1.E-04          8.E-06        18 months    $0.02    15 years
Modem           2.E-04          8.E-06        36 months    $0.03    21 years
Wireless        1.E-02          2.E-04        300 years    $0.07    >999 years
Caching
Disk caching:
5-minute rule for random IO;
10-second rule for sequential IO.
Web page caching: if the page will be re-referenced within
18 months (with free users) or
15 years (with valuable users),
then cache the page in the client/proxy.
Challenges:
guessing which pages will be re-referenced;
detecting stale pages (page velocity).
Meta-Message:
Technology Ratios Matter
Price and Performance change.
If everything changes in the same way,
then nothing really changes.
If some things get much cheaper/faster than others,
then that is real change.
Some things are not changing much:
Cost of people
Speed of light
…
And some things are changing a LOT
Outline
Moore’s Law and consequences
Storage rules of thumb
Balanced systems rules revisited
Networking rules of thumb
Caching rules of thumb