CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 30, 2000 Prof.


CS252
Graduate Computer Architecture
Lecture 1
Review of Technology Trends and
Cost/Performance
August 30, 2000
Prof. John Kubiatowicz
8/30/00
CS252/Kubiatowicz
Lec 1.1
Original
Big Fishes Eating Little Fishes
8/30/00
CS252/Kubiatowicz
Lec 1.2
1988 Computer Food Chain
[Figure: food chain of computer classes: Mainframe, Supercomputer, Minisupercomputer, Minicomputer, Workstation, PC, Massively Parallel Processors]
8/30/00
CS252/Kubiatowicz
Lec 1.3
1998 Computer Food Chain
[Figure: food chain of computer classes: Massively Parallel Processors, Minisupercomputer, Minicomputer, Mainframe, Server, Supercomputer, Workstation, PC]
Now who is eating whom?
CS252/Kubiatowicz
Lec 1.4
Why Such Change in 10 years?
• Performance
– Technology Advances
» CMOS VLSI dominates older technologies (TTL, ECL) in
cost AND performance
– Computer architecture advances improve the low end
» RISC, superscalar, RAID, …
• Price: Lower costs due to …
– Simpler development
» CMOS VLSI: smaller systems, fewer components
– Higher volumes
» CMOS VLSI: same development cost for 10,000 vs. 10,000,000 units
– Lower margins by class of computer, due to fewer services
• Function
– Rise of networking/local interconnection technology
8/30/00
CS252/Kubiatowicz
Lec 1.5
Technology Trends:
Microprocessor Capacity
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) vs. year (1970 to 2000), tracking Moore’s Law: i4004, i8080, i8086, i80286, i80386, i80486, Pentium]
“Graduation Window”:
Alpha 21264: 15 million
Pentium Pro: 5.5 million
PowerPC 620: 6.9 million
Alpha 21164: 9.3 million
Sparc Ultra: 5.2 million
CMOS improvements:
• Die size: 2X every 3 yrs
• Line width: halve / 7 yrs
8/30/00
CS252/Kubiatowicz
Lec 1.6
Memory Capacity
(Single Chip DRAM)
[Figure: bits per chip (log scale, 1,000 to 1,000,000,000) vs. year (1970 to 2000)]
year    size (Mb)    cyc time
1980    0.0625       250 ns
1983    0.25         220 ns
1986    1            190 ns
1989    4            165 ns
1992    16           145 ns
1996    64           120 ns
2000    256          100 ns
8/30/00
CS252/Kubiatowicz
Lec 1.7
Technology Trends
(Summary)
         Capacity         Speed (latency)
Logic    2x in 3 years    2x in 3 years
DRAM     4x in 3 years    2x in 10 years
Disk     4x in 3 years    2x in 10 years
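The trend figures above are easier to compare once converted to a common per-year rate. The following is a minimal sketch, assuming simple compound growth; the helper name `annual_rate` is illustrative and not from the lecture.

```python
def annual_rate(factor: float, years: float) -> float:
    """Compound annual growth rate for a `factor`x improvement over `years` years."""
    return factor ** (1.0 / years) - 1.0

# (improvement factor, years) taken from the summary table
trends = {
    "Logic capacity": (2, 3),
    "Logic speed": (2, 3),
    "DRAM capacity": (4, 3),
    "DRAM speed": (2, 10),
    "Disk capacity": (4, 3),
    "Disk speed": (2, 10),
}

rates = {name: annual_rate(f, y) for name, (f, y) in trends.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%} per year")
```

Under this assumption, DRAM and disk capacity grow at roughly 59% per year while their latency improves only about 7% per year, which is the gap the rest of the course keeps returning to.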
CS252/Kubiatowicz
Lec 1.8
Processor Performance
Trends
[Figure: relative performance (log scale, 0.1 to 1000) vs. year (1965 to 2000) for Supercomputers, Mainframes, Minicomputers, and Microprocessors]
8/30/00
CS252/Kubiatowicz
Lec 1.9
Processor Performance
(1.35X before, 1.55X now)
[Figure: performance (0 to 1200) vs. year (1987 to 1997), growing 1.54X/yr. Machines plotted: Sun-4/260, MIPS M/120, MIPS M/2000, IBM RS/6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21164/600]
CS252/Kubiatowicz
Lec 1.10
Performance Trends
(Summary)
• Workstation performance (measured in Spec
Marks) improves roughly 50% per year
(2X every 18 months)
• Improvement in cost performance estimated
at 70% per year
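As a quick check on these rates, an annual improvement can be converted to a doubling time. This is a small sketch assuming steady compound growth; `doubling_time_years` is an illustrative name, not something from the slides.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a steady compound `annual_growth` (0.5 means 50%)."""
    return math.log(2) / math.log(1.0 + annual_growth)

perf_months = 12 * doubling_time_years(0.50)       # workstation performance
cost_perf_months = 12 * doubling_time_years(0.70)  # cost/performance

print(f"50%/yr doubles in about {perf_months:.1f} months")
print(f"70%/yr doubles in about {cost_perf_months:.1f} months")
```

At exactly 50% per year the doubling time comes out a little over 20 months, so "2X every 18 months" is best read as a rounded rule of thumb.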
8/30/00
CS252/Kubiatowicz
Lec 1.11
Computer Architecture Is …
the attributes of a [computing] system as
seen by the programmer, i.e., the
conceptual structure and functional behavior,
as distinct from the organization of the data
flows and controls, the logic design, and the
physical implementation.
Amdahl, Blaauw, and Brooks, 1964
8/30/00
CS252/Kubiatowicz
Lec 1.12
Computer Architecture’s
Changing Definition
• 1950s to 1960s: Computer Architecture Course:
Computer Arithmetic
• 1970s to mid 1980s: Computer Architecture
Course: Instruction Set Design, especially ISA
appropriate for compilers
• 1990s: Computer Architecture Course:
Design of CPU, memory system, I/O system,
Multiprocessors, Networks
• 2010s: Computer Architecture Course: Self
adapting systems? Self organizing structures?
DNA Systems/Quantum Computing?
8/30/00
CS252/Kubiatowicz
Lec 1.13
Instruction Set Architecture
(ISA)
software
instruction set
hardware
8/30/00
CS252/Kubiatowicz
Lec 1.14
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers
(Manchester Mark I, IBM 700 series 1953)
Separation of Programming Model
from Implementation
High-level Language Based
(B5000 1963)
Concept of a Family
(IBM 360 1964)
General Purpose Register Machines
Complex Instruction Sets
(Vax, Intel 432 1977-80)
Load/Store Architecture
(CDC 6600, Cray 1 1963-76)
RISC
(MIPS, SPARC, HP-PA, IBM RS6000, . . . 1987)
8/30/00
CS252/Kubiatowicz
Lec 1.15
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility)
• Is used in many different ways (generality)
• Provides convenient functionality to higher levels
• Permits an efficient implementation at lower levels
[Figure: a single interface persisting over time, with several uses above it and successive implementations (imp 1, imp 2, imp 3) below it]
CS252/Kubiatowicz
Lec 1.16
Virtualization:
One of the lessons of RISC
• Integrated Systems Approach
– What really matters is the functioning of the complete system, i.e., hardware, runtime system, compiler, and operating system
– In networking, this is called the “End to End argument”
– Programmers care about high-level languages, debuggers, source-level object-oriented programming
• Computer architecture is not just about transistors,
individual instructions, or particular implementations
• Original RISC projects replaced complex
instructions with a compiler + simple instructions
• Logical Extension => Genetically adaptive runtime
systems enhanced by dynamic compilation running on
reconfigurable hardware? Perhaps.
8/30/00
CS252/Kubiatowicz
Lec 1.17
Computer Architecture Topics
[Figure: map of course topics, grouped by area]
Input/Output and Storage: Disks, WORM, Tape; RAID
Memory Hierarchy: L1 Cache, L2 Cache, DRAM; Coherence, Bandwidth, Latency; Emerging Technologies, Interleaving, Bus protocols
VLSI
Instruction Set Architecture: Addressing, Protection, Exception Handling
Pipelining and Instruction Level Parallelism: Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, Dynamic Compilation
Network Communication, Other Processors
CS252/Kubiatowicz
Lec 1.18
Computer Architecture Topics
[Figure: processor (P) and memory (M) nodes connected through a switch (S) and an Interconnection Network]
Processor-Memory-Switch
Multiprocessors: Shared Memory, Message Passing, Data Parallelism
Networks and Interconnections: Network Interfaces, Topologies, Routing, Bandwidth, Latency, Reliability
CS252/Kubiatowicz
Lec 1.19
CS 252 Course Focus
Understanding the design techniques, machine
structures, technology factors, evaluation
methods that will determine the form of
computers in 21st Century
[Figure: Computer Architecture at the center, surrounded by the areas that shape it]
Computer Architecture:
• Instruction Set Design
• Organization
• Hardware/Software Boundary
Surrounding areas: Technology, Applications, Programming Languages, Operating Systems, Parallelism, Measurement & Evaluation, Interface Design (ISA), Compilers, History
CS252/Kubiatowicz
Lec 1.20
Topic Coverage
Textbook: Hennessy and Patterson, Computer
Architecture: A Quantitative Approach, 2nd Ed., 1996.
Research Papers -- Handed out in class
• 1.5 weeks: Review: Fundamentals of Computer Architecture (Ch. 1), Instruction Set Architecture (Ch. 2), Pipelining (Ch. 3)
• 2.5 weeks: Pipelining, Interrupts, and Instruction Level Parallelism (Ch. 4), Vector Processors (Appendix B)
• 1.5 weeks: Dynamic Compilation. Data Speculation (papers). Complexity, design via genetic algorithms
• 1 week: Memory Hierarchy (Ch. 5)
• 1.5 weeks: Fault Tolerance, Input/Output and Storage (Ch. 6)
• 1.5 weeks: Networks and Interconnection Technology (Ch. 7)
• 1.5 weeks: Multiprocessors (Ch. 8 + Research papers + Culler book draft Chapter 1)
• 1 week: Quantum Computing, DNA Computing
8/30/00
CS252/Kubiatowicz
Lec 1.21
CS252: Staff
Instructor: Prof. John D. Kubiatowicz
Office: 673 Soda Hall, 643-6817, kubitron@cs
Office Hours: Thursday 1:30 - 3:00 or by appt.
(Contact Michael Granger, 642-4334, granger@cs, 676 Soda)
T.A.: Mark Whitney
Office: 464 Soda Hall, whitney@cs
TA Office Hours: Tuesday/Wednesday 11:00-12:00
Class: Wed, Fri, 1:00 - 2:30pm, 310 Soda Hall
Text: Computer Architecture: A Quantitative Approach, Second Edition (1996) (4th printing)
Web page: http://www.cs/~kubitron/courses/cs252-F00/
Lectures available online before 11:30 AM on the day of lecture
Newsgroup: ucb.class.cs252
Email: [email protected]
CS252/Kubiatowicz
Lec 1.22
Lecture style
• 1-Minute Review
• 20-Minute Lecture/Discussion
• 5-Minute Administrative Matters
• 25-Minute Lecture/Discussion
• 5-Minute Break (water, stretch)
• 25-Minute Lecture/Discussion
• Instructor will come to class early & stay after to answer questions
[Figure: attention vs. time, with marks at 20 min., the Break, and “In Conclusion, ...”]
CS252/Kubiatowicz
Lec 1.23
Grading
• 20% Homeworks (work in pairs)
• 35% Examinations (2 Midterms)
• 35% Research Project (work in pairs)
– Transition from undergrad to grad student
– Berkeley wants you to succeed, but you need to show
initiative
– pick topic
– meet 3 times with faculty/TA to see progress
– give oral presentation
– give poster session
– written report like conference paper
– 3 weeks work full time for 2 people
– Opportunity to do “research in the small” to help make
transition from good student to research colleague
• 10% Class Participation
8/30/00
CS252/Kubiatowicz
Lec 1.24
Quizzes
• Reduce the pressure of taking quizzes
– Only 2 Graded Quizzes:
Tentative: Wed Oct 18th and Wed Dec 6th
– Our goal: test knowledge vs. speed writing
– 3 hrs to take 1.5-hr test (5:30-8:30 PM, TBA location)
– A summary sheet is allowed at both mid-term quizzes
» Transfer ideas from book to paper
– Last chance Q&A: during class time on the day of the exam
• Students/Staff meet over free pizza/drinks at La Val’s:
Wed Oct 18th (8:30 PM) and Wed Dec 6th (8:30 PM)
8/30/00
CS252/Kubiatowicz
Lec 1.25
Research Paper Reading
• As graduate students, you are now researchers.
• Most information of importance to you will be in
research papers.
• Ability to rapidly scan and understand research
papers is key to your success.
• So: you will read lots of papers in this course!
– Quick 1 paragraph summaries will be due in class
– Important supplement to book.
– Will discuss papers in class
• Papers will be scanned and on web page.
8/30/00
CS252/Kubiatowicz
Lec 1.26
More Course Info
• Everything is on the course Web page:
www.cs.berkeley.edu/~kubitron/courses/cs252-F00
• Notes:
– Not sure of the state of textbooks at the Student Center.
– The course Web page includes a pointer to last term’s 152 home page. The “handout” page includes pointers to old 152 quizzes.
• Schedule:
– 2 Graded Quizzes: Wed Oct 18th and Wed Dec 6th
– Veteran’s Day: Friday Nov 10
– Thanksgiving Vacation: Thur Nov 23 - Sun Nov 26
– Oral Presentations: Tue/Wed Dec 12/13
– 252 Last lecture: Fri Dec 8
– 252 Poster Session: ???
– Project Papers/URLs due: Fri Dec 15th
• Project Suggestions: TBA
8/30/00
CS252/Kubiatowicz
Lec 1.27
Related Courses
CS 152 (Strong Prerequisite): How to build it; implementation details
Basic knowledge of the organization of a computer is assumed!
CS 252: Why, Analysis, Evaluation
CS 258: Parallel Architectures, Languages, Systems
CS 250: Integrated Circuit Technology from a computer-organization viewpoint
8/30/00
CS252/Kubiatowicz
Lec 1.28
Coping with CS 252
• Too many students with too varied backgrounds?
– Next Wednesday: Prerequisite exam
• Limiting Number of Students
– First priority is CS/EECS grad students taking prelims
– Second priority is N-th year CS/EECS grad students (breadth)
– Third priority is College of Engineering grad students
– Fourth priority is CS/EECS undergraduate seniors
(Note: 1 graduate course unit = 2 undergraduate course units)
– All other categories
• If not this semester, 252 is offered regularly
– Should be offered next term as well.
8/30/00
CS252/Kubiatowicz
Lec 1.29
Coping with CS 252
• Students with too varied backgrounds?
– In the past, CS grad students took written prelim exams on undergraduate material in hardware, software, and theory
– 1st 5 weeks reviewed background, helped 252, 262, 270
– Prelims were dropped => some unprepared for CS 252?
• In class exam on Wednesday September 2nd
– Doesn’t affect grade, only admission into class
– 2 grades: Admitted or audit/take CS 152 1st
– Improves your experience if we recapture a common background
• Review: Chapters 1-3, CS 152 home page, maybe “Computer Organization and Design (COD) 2/e”
– Chapters 1 to 8 of COD if you never took the prerequisite
– If you took a class, be sure COD Chapters 2, 6, 7 are familiar
– Copies in Bechtel Library on 2-hour reserve
– Last year’s exam on previous year’s web site (~kubitron/courses/cs252-F99)
CS252/Kubiatowicz
Lec 1.30
Computer Engineering
Methodology
[Figure: design cycle with three stages: Evaluate Existing Systems for Bottlenecks, Simulate New Designs and Organizations, Implement Next Generation System. Inputs to the cycle: Benchmarks, Technology Trends, Workloads, Implementation Complexity]
8/30/00
CS252/Kubiatowicz
Lec 1.34
Measurement and Evaluation
Architecture is an iterative process:
• Searching the space of possible designs
• At all levels of computer systems
[Figure: Design/Analysis loop. Creativity produces ideas; Cost/Performance Analysis sorts them into Good Ideas, Mediocre Ideas, and Bad Ideas]
CS252/Kubiatowicz
Lec 1.35
Measurement Tools
• Benchmarks, Traces, Mixes
• Hardware: Cost, delay, area, power
estimation
• Simulation (many levels)
– ISA, RT, Gate, Circuit
• Queuing Theory
• Rules of Thumb
• Fundamental “Laws”/Principles
8/30/00
CS252/Kubiatowicz
Lec 1.36
The Bottom Line:
Performance (and Cost)
Plane               DC to Paris    Speed       Passengers    Throughput (pmph)
Boeing 747          6.5 hours      610 mph     470           286,700
BAC/Sud Concorde    3 hours        1350 mph    132           178,200
• Time to run the task (ExTime)
– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns …
(Performance)
– Throughput, bandwidth
8/30/00
CS252/Kubiatowicz
Lec 1.37
The Bottom Line:
Performance (and Cost)
"X is n times faster than Y" means
$$ n = \frac{\mathrm{ExTime}(Y)}{\mathrm{ExTime}(X)} = \frac{\mathrm{Performance}(X)}{\mathrm{Performance}(Y)} $$
• Speed of Concorde vs. Boeing 747
• Throughput of Boeing 747 vs. Concorde
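The two comparisons can be worked out directly from the plane figures given earlier. A minimal sketch, with throughput taken as passengers times speed (passenger-miles per hour, pmph); the dictionary layout is just for illustration.

```python
planes = {
    "Boeing 747": {"hours": 6.5, "mph": 610, "passengers": 470},
    "Concorde": {"hours": 3.0, "mph": 1350, "passengers": 132},
}

def throughput_pmph(plane: dict) -> int:
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return plane["passengers"] * plane["mph"]

b747, concorde = planes["Boeing 747"], planes["Concorde"]

# Latency view: the Concorde finishes one trip sooner.
speed_ratio = concorde["mph"] / b747["mph"]
# Throughput view: the 747 moves more passenger-miles per hour.
throughput_ratio = throughput_pmph(b747) / throughput_pmph(concorde)

print(f"Concorde is {speed_ratio:.1f}x faster (speed)")
print(f"747 has {throughput_ratio:.1f}x the throughput")
```

Which plane is "faster" depends on whether the metric is time per task or tasks per unit time; the same distinction applies to machines.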
8/30/00
CS252/Kubiatowicz
Lec 1.38
Amdahl's Law
Speedup due to enhancement E:
$$ \mathrm{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E} $$
Suppose that enhancement E accelerates a fraction
F of the task by a factor S, and the remainder of
the task is unaffected
8/30/00
CS252/Kubiatowicz
Lec 1.39
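Amdahl’s Law
$$ \mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times \left[ (1 - \mathrm{Fraction}_{enhanced}) + \frac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}} \right] $$
$$ \mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \mathrm{Fraction}_{enhanced}) + \dfrac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}}} $$
Best you could ever hope to do:
$$ \mathrm{Speedup}_{maximum} = \frac{1}{1 - \mathrm{Fraction}_{enhanced}} $$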
CS252/Kubiatowicz
Lec 1.40
Amdahl’s Law
• Floating point instructions improved to run 2X;
but only 10% of actual instructions are FP
ExTimenew =
Speedupoverall =
8/30/00
CS252/Kubiatowicz
Lec 1.41
Amdahl’s Law
• Floating point instructions improved to run 2X;
but only 10% of actual instructions are FP
$$ \mathrm{ExTime}_{new} = \mathrm{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \mathrm{ExTime}_{old} $$
$$ \mathrm{Speedup}_{overall} = \frac{1}{0.95} = 1.053 $$
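The floating-point example can be reproduced with a few lines of code. This is a sketch of Amdahl's Law as stated on the earlier slide; the function names are illustrative.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the time runs `speedup_enhanced`x faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def amdahl_limit(fraction_enhanced: float) -> float:
    """Best possible speedup: the enhanced part takes no time at all."""
    return 1.0 / (1.0 - fraction_enhanced)

fp_speedup = amdahl_speedup(0.10, 2.0)  # FP is 10% of the time, made 2x faster
fp_limit = amdahl_limit(0.10)           # even infinitely fast FP cannot beat this

print(f"overall speedup: {fp_speedup:.3f}")
print(f"upper bound:     {fp_limit:.3f}")
```

Even an infinitely fast floating-point unit would give only about 1.11x here, because 90% of the execution time is untouched.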
CS252/Kubiatowicz
Lec 1.42
Metrics of Performance
[Figure: system layers paired with typical performance metrics]
Application: Answers per month, Operations per second
Programming Language
Compiler
ISA: (millions) of Instructions per second: MIPS; (millions) of (FP) operations per second: MFLOP/s
Datapath, Control: Megabytes per second
Function Units, Transistors, Wires, Pins: Cycles per second (clock rate)
8/30/00
CS252/Kubiatowicz
Lec 1.43
Aspects of CPU Performance
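$$ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} $$
                Inst Count    CPI    Clock Rate
Program         X
Compiler        X             (X)
Inst. Set.      X             X
Organization                  X      X
Technology                           X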
CS252/Kubiatowicz
Lec 1.44
Cycles Per Instruction
(Throughput)
“Average Cycles per Instruction”
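$$ \mathrm{CPI} = \frac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \frac{\text{Cycles}}{\text{Instruction Count}} $$
$$ \text{CPU time} = \text{Cycle Time} \times \sum_{j=1}^{n} \mathrm{CPI}_j \times I_j $$
“Instruction Frequency”
$$ \mathrm{CPI} = \sum_{j=1}^{n} \mathrm{CPI}_j \times F_j \quad \text{where} \quad F_j = \frac{I_j}{\text{Instruction Count}} $$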
Invest Resources where time is Spent!
8/30/00
CS252/Kubiatowicz
Lec 1.45
Example: Calculating CPI
Base Machine (Reg / Reg)
Op        Freq    Cycles    CPI(i)    (% Time)
ALU       50%     1         .5        (33%)
Load      20%     2         .4        (27%)
Store     10%     2         .2        (13%)
Branch    20%     2         .4        (27%)
Total                       1.5
Typical Mix
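The CPI and time-share columns follow from weighting each class's cycle count by its frequency. A short sketch of that calculation, using the instruction mix from this slide; the variable names are illustrative.

```python
# op -> (frequency, cycles per instruction), from the base machine table
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

# CPI = sum over classes of CPI_j * F_j
cpi = sum(freq * cycles for freq, cycles in mix.values())

# Share of execution time spent in each class
time_share = {op: freq * cycles / cpi for op, (freq, cycles) in mix.items()}

print(f"CPI = {cpi:.2f}")
for op, share in time_share.items():
    print(f"{op}: {share:.0%} of time")
```

ALU operations are half of the instructions but only a third of the time, which is why the slide says to invest resources where time is spent.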
CS252/Kubiatowicz
Lec 1.46
SPEC: System Performance
Evaluation Cooperative
• First Round 1989
– 10 programs yielding a single number (“SPECmarks”)
• Second Round 1992
– SPECInt92 (6 integer programs) and SPECfp92 (14 floating
point programs)
» Compiler Flags unlimited. March 93 of DEC 4000 Model 610:
spice: unix.c:/def=(sysv,has_bcopy,”bcopy(a,b,c)=memcpy(b,a,c)”
wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200
nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas
• Third Round 1995
– new set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point)
– “benchmarks useful for 3 years”
– Single flag setting for all programs: SPECint_base95, SPECfp_base95
CS252/Kubiatowicz
Lec 1.47
How to Summarize Performance
• Arithmetic mean (weighted arithmetic mean) tracks execution time:
$$ \frac{\sum_i T_i}{n} \quad \text{or} \quad \sum_i W_i \times T_i $$
• Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time:
$$ \frac{n}{\sum_i 1/R_i} \quad \text{or} \quad \frac{n}{\sum_i W_i/R_i} $$
• Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10)
• But do not take the arithmetic mean of normalized execution time, use the geometric mean:
$$ \left( \prod_j \frac{T_j}{N_j} \right)^{1/n} $$
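A small experiment shows why the arithmetic mean of normalized times is misleading. The sketch below uses hypothetical execution times for two programs on two machines (the numbers are invented for illustration, not measured data).

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def harmonic_mean(rates):
    return len(rates) / sum(1.0 / r for r in rates)

def geometric_mean(ratios):
    return math.prod(ratios) ** (1.0 / len(ratios))

# Hypothetical execution times in seconds for two programs.
times_a = [1.0, 1000.0]   # machine A
times_b = [10.0, 100.0]   # machine B

b_norm_to_a = [b / a for a, b in zip(times_a, times_b)]  # [10.0, 0.1]
a_norm_to_b = [a / b for a, b in zip(times_a, times_b)]  # [0.1, 10.0]

# Arithmetic mean: each machine looks about 5x slower than the other.
am_b = arithmetic_mean(b_norm_to_a)
am_a = arithmetic_mean(a_norm_to_b)

# Geometric mean: same answer whichever machine is the reference.
gm_b = geometric_mean(b_norm_to_a)
gm_a = geometric_mean(a_norm_to_b)

print(am_b, am_a)
print(gm_b, gm_a)
```

The geometric mean is consistent across reference machines, but unlike the arithmetic mean of raw times it does not predict total execution time.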
8/30/00
CS252/Kubiatowicz
Lec 1.48
SPEC First Round
• One program: 99% of time in single line of code
• New front-end compiler could improve
dramatically
[Figure: SPEC Perf (0 to 800) by benchmark: gcc, espresso, spice, doduc, nasa7, li, eqntott, matrix300, fpppp, tomcatv]
8/30/00
CS252/Kubiatowicz
Lec 1.49
Impact of Means on
SPECmark89 for IBM 550
             Ratio to VAX:     Time:             Weighted Time:
Program      Before   After    Before   After    Before   After
gcc          30       29       49       51       8.91     9.22
espresso     35       34       65       67       7.64     7.86
spice        47       47       510      510      5.69     5.69
doduc        46       49       41       38       5.81     5.45
nasa7        78       144      258      140      3.43     1.86
li           34       34       183      183      7.86     7.86
eqntott      40       40       28       28       6.68     6.68
matrix300    78       730      58       6        3.43     0.37
fpppp        90       87       34       35       2.97     3.07
tomcatv      33       138      20       19       2.01     1.94
Mean         54       72       124      108      54.42    49.99
             Geometric         Arithmetic        Weighted Arith.
             Ratio 1.33        Ratio 1.16        Ratio 1.09
CS252/Kubiatowicz
Lec 1.50
Performance Evaluation
• “For better or worse, benchmarks shape a field”
• Good products are created when we have:
– Good benchmarks
– Good ways to summarize performance
• Since sales are in part a function of performance relative to the competition, companies invest in improving the product as reported by the performance summary
• If benchmarks/summary are inadequate, then one must choose between improving the product for real programs vs. improving the product to get more sales;
Sales almost always wins!
• Execution time is the measure of computer
performance!
8/30/00
CS252/Kubiatowicz
Lec 1.51
Integrated Circuits Costs
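$$ \text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}} $$
$$ \text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}} $$
$$ \text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_Area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_Area}}} - \text{Test\_Die} $$
$$ \text{Die yield} = \text{Wafer\_yield} \times \left( 1 + \frac{\text{Defect\_Density} \times \text{Die\_area}}{\alpha} \right)^{-\alpha} $$
Die Cost goes roughly with die area^4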
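To get a feel for how die cost scales, the die-cost relations can be evaluated for a couple of die sizes. This is a sketch only: the wafer diameter, wafer cost, and defect density below are hypothetical inputs, and α = 3.0 is an assumed value for the yield exponent rather than a figure from the slide.

```python
import math

def dies_per_wafer(wafer_diam_cm: float, die_area_cm2: float, test_dies: int = 0) -> int:
    """Whole dies on a round wafer, subtracting edge loss and test dies."""
    wafer_area = math.pi * (wafer_diam_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2.0 * die_area_cm2)
    return math.floor(wafer_area / die_area_cm2 - edge_loss - test_dies)

def die_yield(defects_per_cm2: float, die_area_cm2: float,
              wafer_yield: float = 1.0, alpha: float = 3.0) -> float:
    """Fraction of good dies; alpha = 3.0 is an assumption for this sketch."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)

def die_cost(wafer_cost: float, n_dies: int, yield_fraction: float) -> float:
    return wafer_cost / (n_dies * yield_fraction)

# Hypothetical process: 20 cm wafer, $1000 per wafer, 1 defect per cm^2.
small = die_cost(1000.0, dies_per_wafer(20.0, 1.0), die_yield(1.0, 1.0))
large = die_cost(1000.0, dies_per_wafer(20.0, 2.0), die_yield(1.0, 2.0))

print(f"1 cm^2 die: ${small:.2f}")
print(f"2 cm^2 die: ${large:.2f}")
```

With these made-up inputs, doubling the die area raises the cost per good die by roughly a factor of four, since both the die count and the yield fall together.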
8/30/00
CS252/Kubiatowicz
Lec 1.52
Real World Examples
Chip           Metal     Line     Wafer    Defect    Area    Dies/    Yield    Die Cost
               layers    width    cost     /cm2      mm2     wafer
386DX          2         0.90     $900     1.0       43      360      71%      $4
486DX2         3         0.80     $1200    1.0       81      181      54%      $12
PowerPC 601    4         0.80     $1700    1.3       121     115      28%      $53
HP PA 7100     3         0.80     $1300    1.0       196     66       27%      $73
DEC Alpha      3         0.70     $1500    1.2       234     53       19%      $149
SuperSPARC     3         0.70     $1700    1.6       256     48       13%      $272
Pentium        3         0.80     $1500    1.5       296     40       9%       $417
– From "Estimating IC Manufacturing Costs,” by Linley Gwennap,
Microprocessor Report, August 2, 1993, p. 15
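The Die Cost column can be cross-checked against the other columns using die cost = wafer cost / (dies per wafer × yield). A minimal sketch using the figures from the table:

```python
# (chip, wafer cost $, dies per wafer, yield, listed die cost $)
chips = [
    ("386DX", 900, 360, 0.71, 4),
    ("486DX2", 1200, 181, 0.54, 12),
    ("PowerPC 601", 1700, 115, 0.28, 53),
    ("HP PA 7100", 1300, 66, 0.27, 73),
    ("DEC Alpha", 1500, 53, 0.19, 149),
    ("SuperSPARC", 1700, 48, 0.13, 272),
    ("Pentium", 1500, 40, 0.09, 417),
]

computed = {}
for name, wafer_cost, dies, yld, listed in chips:
    # Cost of one good die: spread the wafer cost over the dies that work.
    computed[name] = wafer_cost / (dies * yld)
    print(f"{name}: computed ${computed[name]:.2f}, listed ${listed}")
```

The computed values round to the listed die costs, and the spread from $4 to $417 is driven mostly by yield collapsing as die area grows.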
8/30/00
CS252/Kubiatowicz
Lec 1.53
Cost/Performance
What is Relationship of Cost to Price?
• Component Costs
• Direct Costs (add 25% to 40%) recurring costs: labor, purchasing, scrap, warranty
• Gross Margin (add 82% to 186%) nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
• Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
[Figure: List Price broken into its parts, with Avg. Selling Price marked below the discount]
Average Discount: 25% to 40% of list price
Gross Margin: 34% to 39% of list price
Direct Cost: 6% to 8% of list price
Component Cost: 15% to 33% of list price
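The markups compound, which is how a component cost ends up as only 15% to 33% of list price. A sketch of that chain, assuming each "add N%" applies to the running total from the previous step; the function name and the $100 starting cost are illustrative.

```python
def list_price(component_cost: float, direct: float,
               gross_margin: float, discount: float) -> float:
    """Apply the three markups in sequence to reach list price."""
    with_direct = component_cost * (1.0 + direct)              # + direct costs
    avg_selling_price = with_direct * (1.0 + gross_margin)     # + gross margin
    return avg_selling_price * (1.0 + discount)                # + average discount

# Low and high ends of the ranges quoted above, for $100 of components.
low = list_price(100.0, direct=0.25, gross_margin=0.82, discount=0.33)
high = list_price(100.0, direct=0.40, gross_margin=1.86, discount=0.66)

print(f"list price: ${low:.2f} to ${high:.2f}")
print(f"component share: {100.0 / high:.0%} to {100.0 / low:.0%}")
```

Under this reading, the component share of list price comes out at about 15% to 33%, matching the range in the breakdown.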
CS252/Kubiatowicz
Lec 1.54
Chip Prices (August 1993)
• Assume purchase 10,000 units
Chip           Area    Mfg.     Price    Multi-    Comment
               mm2     cost              plier
386DX          43      $9       $31      3.4       Intense Competition
486DX2         81      $35      $245     7.0       No Competition
PowerPC 601    121     $77      $280     3.6
DEC Alpha      234     $202     $1231    6.1       Recoup R&D?
Pentium        296     $473     $965     2.0       Early in shipments
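The multiplier column is simply price divided by manufacturing cost. A short sketch recomputing it from the table's figures:

```python
# chip -> (manufacturing cost $, price $) at 10,000 units, August 1993
chips = {
    "386DX": (9, 31),
    "486DX2": (35, 245),
    "PowerPC 601": (77, 280),
    "DEC Alpha": (202, 1231),
    "Pentium": (473, 965),
}

multipliers = {name: price / cost for name, (cost, price) in chips.items()}
for name, m in multipliers.items():
    print(f"{name}: {m:.1f}x")
```

The multiplier tracks market conditions rather than die size: the small 486DX2 carries the largest markup, while the large Pentium carries the smallest.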
CS252/Kubiatowicz
Lec 1.55
Summary: Price vs. Cost
[Figure: two bar charts for Mini, W/S, and PC, each stacking Component Costs, Direct Costs, Gross Margin, and Average Discount. First chart: share of list price (0% to 100%). Second chart: price as a multiple of component cost (0 to 5); values shown: 4.7, 3.5, 3.8, 2.5, 1.8, 1.5]
CS252/Kubiatowicz
Lec 1.56
Summary, #1
• Designing to Last through Trends
                Capacity         Speed
Logic           2x in 3 years    2x in 3 years
SPEC RATING:                     2x in 1.5 years
DRAM            4x in 3 years    2x in 10 years
Disk            4x in 3 years    2x in 10 years
• 6 yrs to graduate => 16X CPU speed, DRAM/Disk size
• Time to run the task
– Execution time, response time, latency
• Tasks per day, hour, week, sec, ns, …
– Throughput, bandwidth
• “X is n times faster than Y” means
$$ n = \frac{\mathrm{ExTime}(Y)}{\mathrm{ExTime}(X)} = \frac{\mathrm{Performance}(X)}{\mathrm{Performance}(Y)} $$
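The "16X in 6 years" figure follows from compounding the trend rates over the graduation window. A minimal sketch, assuming steady exponential growth; `growth` is an illustrative helper name.

```python
def growth(factor: float, period_years: float, elapsed_years: float) -> float:
    """Total improvement after `elapsed_years` if it is `factor`x every `period_years`."""
    return factor ** (elapsed_years / period_years)

years = 6  # time to graduate, per the slide
cpu_speed = growth(2, 1.5, years)      # SPEC rating: 2x in 1.5 years
dram_size = growth(4, 3, years)        # DRAM capacity: 4x in 3 years
dram_latency = growth(2, 10, years)    # DRAM speed: 2x in 10 years

print(f"CPU speed:    {cpu_speed:.0f}x")
print(f"DRAM size:    {dram_size:.0f}x")
print(f"DRAM latency: {dram_latency:.2f}x")
```

Over the same six years DRAM latency improves by only about 1.5x, so a design that is balanced today will not stay balanced.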
CS252/Kubiatowicz
Lec 1.57
Summary, #2
• Amdahl’s Law:
$$ \mathrm{Speedup}_{overall} = \frac{\mathrm{ExTime}_{old}}{\mathrm{ExTime}_{new}} = \frac{1}{(1 - \mathrm{Fraction}_{enhanced}) + \dfrac{\mathrm{Fraction}_{enhanced}}{\mathrm{Speedup}_{enhanced}}} $$
• CPI Law:
$$ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} $$
• Execution time is the REAL measure of computer performance!
• Good products are created when we have:
– Good benchmarks, good ways to summarize performance
• Die Cost goes roughly with die area^4
• Can PC industry support engineering/research investment?
CS252/Kubiatowicz
Lec 1.58