Transcript ENGS116 F04

ENGS 116 Lecture 1

ENGS 116 / COSC 107 Computer Architecture Introduction

Vincent H. Berk
September 24th, 2008

Reading for Friday: Chapter 1.1 – 1.4, Amdahl article
Reading for Monday: 1.5 – 1.11

Prerequisite Knowledge

• Assembly language programming
• Fundamentals of logic design
  - Combinational and sequential components (e.g., gates, multiplexers, decoders, ROMs, flip-flops, registers, RAMs)
• Processor Design
  - Instruction cycle, pipelining, branch prediction, exceptions
• Memory Hierarchy
  - Caches (direct-mapped, fully-associative, 2-way set associative), spatial locality, temporal locality, virtual memory, translation lookaside buffer (TLB)
• Input and Output
  - Polling, interrupts
• Multiprocessors

What is Computer Architecture?

Two viewpoints:
• Hardware designer’s viewpoint: CPUs, caches, buses, pipelines, physical memory, etc.
• Programmer’s viewpoint: instruction set – opcodes, addressing modes, registers, virtual memory, etc.

The study of architecture covers both instruction-set architectures and machine implementation organizations.

Computer Architecture Is ...

“The attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation.”

Amdahl, Blaauw, and Brooks, 1964

Computer Architecture’s Changing Definition

• 1950s to 1960s: Computer Architecture Course = Computer Arithmetic.

• 1970s to 1980s: Computer Architecture Course = Instruction Set Design, especially ISA appropriate for compilers.

• 1990s to 2000s: Computer Architecture Course = Design of CPU, memory system, I/O.

• 2000 to now: Computer Architecture Course = ILP, DLP, TLP, storage.

5 Generations of Electronic Computers (Hwang)

First generation (1945-54)
• Technology and architecture: Vacuum tubes and relay memories, CPU driven by PC and accumulator, fixed-point arithmetic.
• Software and applications: Machine/assembly languages, single user, no subroutine linkage, programmed I/O using CPU.
• Representative systems: ENIAC, Princeton IAS, IBM 701.

Second generation (1955-64)
• Technology and architecture: Discrete transistors and core memories, floating-point arithmetic, I/O processors, multiplexed memory access.
• Software and applications: HLL used with compilers, subroutine libraries, batch processing monitor.
• Representative systems: IBM 7030, CDC 1604, Univac LARC.

Third generation (1965-74)
• Technology and architecture: Integrated circuits (SSI/MSI), microprogramming, pipelining, cache and lookahead.
• Software and applications: Multiprogramming and time sharing OS, multiuser applications.
• Representative systems: IBM 360/370, CDC 6600, TI-ASC, PDP-8.

Fourth generation (1975-90)
• Technology and architecture: LSI/VLSI and semiconductor memory, multiprocessors, vector supercomputers, multicomputers.
• Software and applications: Multiprocessor OS, languages, compilers, and environments for parallel processing.
• Representative systems: VAX/900, Cray X/MP, IBM 3090, BBN TC2000.

Fifth generation (1991-present)
• Technology and architecture: ULSI/VHSIC processors, memory, and switches, high-density packaging, scalable architectures.
• Software and applications: Massively parallel processing, grand challenge applications, heterogeneous processing.
• Representative systems: IBM/MPP, Cray/MPP, TMC/CM-5, Intel Paragon.

Computer Tasks

• Desktop Computing, Lightweight Servers, Laptops
  - Price-performance (low cost)
  - Communication, Graphics
• Server Computing, Mainframe Systems
  - Specific performance, processing power, storage
  - Availability, Reliability
• Embedded Computers and DSPs
  - Power and Memory requirements
  - Lowest cost for required performance
  - Real-time or soft-real-time performance

Task of Computer Designer

• Determine which attributes are important for a new machine.

• Design a machine to meet functional requirements, price, power and performance goals.

Basic Computer Organization

[Figure: block diagram of the basic components: Processor (Control and Datapath), Memory, Input, Output.]

Computer Architecture Topics

[Figure: map of course topics. Labels as they appear on the slide:]
• Input/Output and Storage: Disks, WORM, Tape
• RAID
• Emerging Technologies
• Interleaving, Bus protocols
• Memory Hierarchy: SDRAM, L2/L3 Cache, L1 Cache
• Multi-Core Coherence, Bandwidth, Latency
• VLSI
• Instruction Set Architecture
• Addressing, Protection, Exception Handling
• Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation
• Pipelining, Instruction Level Parallelism, Thread Level Parallelism

Computer Architecture Topics

[Figure: processor-memory (P, M) nodes connected through a switch (S) to an interconnection network. Labels as they appear on the slide:]
• Multiprocessors: Shared Memory, Message Passing, Data Parallelism
• Processor-Memory-Switch
• Networks and Interconnections: Network Interfaces, Topologies, Routing, Bandwidth, Latency, Reliability

Course Focus

Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of computers in the 21st century.

[Figure: Computer Architecture (Instruction Set Design, Organization, Hardware) at the center, surrounded by: Technology, Parallelism, Applications, Operating Systems, Programming Languages, Interface Design (ISA), Measurement & Evaluation, History.]

Technology Trends

• Integrated circuit logic technology
  - transistor density (feature size)
  - transistor count
  - cycle speed
  - multiple cores
• Semiconductor DRAM
  - density
  - latency and bandwidth
• Magnetic disk technology
  - density
  - access time
• Network technology
  - bandwidth
  - latency

Scaling in ICs

• Feature size: minimum size of a single distinguishable/producible item on a chip die
  - 1971: 10 microns
  - 2001: 0.18 microns
  - 2003: 0.06 microns
  - 2006: 5 nanometers (0.005 microns)
• Complex relationships:
  - Transistor density increases quadratically with decrease in feature size
  - Reduction in feature size requires voltage reduction to maintain correct operation and reasonable reliability
• Scaling IC wiring:
  - Signal delay increases with product of resistance and capacitance
  - Shorter wires can be smaller
  - Smaller features have higher current leakage
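
The quadratic relationship between feature size and transistor density can be checked with a few lines of arithmetic. This is only a sketch: the function name `density_gain` is hypothetical, and it assumes ideal scaling, where the area of one transistor is proportional to the square of the feature size.

```python
def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Ideal transistor-density gain when the feature size shrinks.

    Assumption: density scales with 1 / (feature size)^2, i.e. each
    transistor occupies an area proportional to the feature size squared.
    """
    if old_feature_um <= 0 or new_feature_um <= 0:
        raise ValueError("feature sizes must be positive")
    return (old_feature_um / new_feature_um) ** 2


# Halving the feature size gives roughly four times the density.
print(density_gain(0.18, 0.09))

# 1971 (10 microns) to 2001 (0.18 microns), using the slide's figures.
print(round(density_gain(10.0, 0.18)))
```

Real processes fall short of this ideal, since wiring, leakage, and voltage limits (the other bullets on this slide) also constrain how densely transistors can be packed.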

Power Consumption of ICs

• Power requirements per transistor are proportional to load capacitance, frequency of switching, and the square of the voltage:

$$\text{Power} = \tfrac{1}{2} \times \text{Capacitance} \times \text{Voltage}^2 \times \text{Frequency switched}$$

• Switching frequency and transistor density increase faster than capacitance and voltage decrease, leading to increased power consumption and therefore more generated heat.
• A Pentium 4 consumes 135 watts of power, while the 8086 through the i386 did not even feature a heat sink.
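
The power relation on this slide is easy to experiment with. The sketch below is illustrative: the function names and the operating points (1 nF, 1.2 V, 2 GHz, and the 15% scaling) are hypothetical values, not measurements of any real chip.

```python
def dynamic_power(capacitance_f: float, voltage_v: float, freq_hz: float) -> float:
    """Dynamic switching power in watts: 1/2 * C * V^2 * f."""
    return 0.5 * capacitance_f * voltage_v ** 2 * freq_hz


def power_ratio(voltage_scale: float, freq_scale: float) -> float:
    """New power divided by old power when voltage and switching
    frequency are scaled and the capacitance is unchanged."""
    return voltage_scale ** 2 * freq_scale


# Hypothetical operating point: 1 nF switched load, 1.2 V, 2 GHz.
print(f"{dynamic_power(1e-9, 1.2, 2e9):.2f} W")

# Hypothetical 15% reduction in both voltage and frequency.
print(f"{power_ratio(0.85, 0.85):.3f} of the original power")
```

Because voltage enters squared, lowering it pays off more than lowering frequency by the same fraction, which is why feature-size reductions are usually paired with voltage reductions.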

Cost and Price

• Cost of manufacturing decreases over time: learning curve
• Learning curve is measured as an increase in yield
• Volume doubling leads to 10% reduction in cost
• Commodity products tend to decrease cost:
  - Volume
  - Competition
  - Efficiency
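
The "10% per doubling" rule of thumb compounds, which a short calculation makes concrete. This is a sketch under stated assumptions: the function name, the starting cost, and the volumes are hypothetical, and the 10% figure is applied uniformly to every doubling.

```python
import math


def cost_after_volume(initial_cost: float, initial_volume: float,
                      volume: float, reduction_per_doubling: float = 0.10) -> float:
    """Unit cost after production volume grows from initial_volume to volume.

    Assumption: each doubling of volume cuts unit cost by the same
    fraction (10% by default, the rule of thumb from the slide).
    """
    doublings = math.log2(volume / initial_volume)
    return initial_cost * (1.0 - reduction_per_doubling) ** doublings


# Hypothetical part: 100 per unit at 1,000 units, then volume grows 8x
# (three doublings).
print(f"{cost_after_volume(100.0, 1_000, 8_000):.2f}")
```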

Difference between Cost and Price

Wafers and Dies

• Chips are produced on round silicon disks (wafers)
• Dies are the actual chips, cut out from the wafer
• Testing occurs before cutting and after packaging

Yield and Cost

• However:
  - Wafers do not contain only chip dies; usually a large area, including several chip dies, is dedicated to test-equipment hook-up
  - Actual yield in mass-production chip fabs varies from 98% for DRAMs down to 1% for new processors

Yield and Cost

• Switch from 200mm to 300mm wafers:
  - Although 300mm wafers have lower yield than 200mm wafers, the overhead processing costs per wafer are high enough to make 300mm wafers more cost effective.
• Redundancy in dies:
  - Single transistors do fail during production, causing memory cells, pipeline stages, or control logic sections to fail
  - Redundancy is built into each die by introducing backup units
  - After testing, failed units are disabled and backup units are enabled by laser
  - This decreases the chance that a small flaw fails an entire die
  - Few companies give insight into their redundant circuitry numbers

Performance

Hwang: “The ideal performance of a computer system demands a perfect match between machine capability and program behavior.”

Machine capability – enhanced with better hardware technology, innovative architectural features, efficient resource management.

Program behavior – affected by algorithm design, data structures, language efficiency, programmer skill, compiler technology.

To improve software performance, need to understand how various hardware factors affect overall system performance!

Measuring Performance

• Key measure is time.

• Response time (execution time): Time between start and completion of a task.

• Throughput: total amount of work completed in a given time.

$$\frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instr count}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instr count}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
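
The three factors in this relation multiply directly into execution time. A minimal sketch follows; the function name `cpu_time` and the example workload (one billion instructions, 1.5 cycles per instruction, a 2 GHz clock) are hypothetical.

```python
def cpu_time(instr_count: int, cycles_per_instr: float, clock_cycle_s: float) -> float:
    """Seconds per program =
    (instructions / program) * (clock cycles / instruction) * (seconds / clock cycle).
    """
    return instr_count * cycles_per_instr * clock_cycle_s


# Hypothetical workload: 1e9 instructions, 1.5 cycles per instruction,
# 2 GHz clock (0.5 ns per cycle).
baseline = cpu_time(1_000_000_000, 1.5, 0.5e-9)
print(f"{baseline:.3f} s")

# Same program and clock, but a design change lowers cycles per instruction to 1.2.
improved = cpu_time(1_000_000_000, 1.2, 0.5e-9)
print(f"{improved:.3f} s")
```

Each factor is influenced by a different layer: the instruction count by the program, compiler, and instruction set; the cycles per instruction by the instruction set and machine organization; and the cycle time by the organization and the underlying technology.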

Comparing Design Alternatives

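“X is n times faster than Y” means

$$n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y}$$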
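
Under the usual textbook definition, n is the ratio of Y's execution time to X's, with performance taken as the reciprocal of execution time. The sketch below assumes that definition; the function name and the timings are hypothetical.

```python
def times_faster(exec_time_x_s: float, exec_time_y_s: float) -> float:
    """Return n such that "X is n times faster than Y".

    Assumes the standard definition: n = time(Y) / time(X), which equals
    performance(X) / performance(Y) when performance = 1 / execution time.
    """
    if exec_time_x_s <= 0 or exec_time_y_s <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y_s / exec_time_x_s


# Hypothetical timings: X finishes the task in 10 s, Y needs 15 s.
print(times_faster(exec_time_x_s=10.0, exec_time_y_s=15.0))
```

A result below 1 means X is actually the slower machine, so the argument order matters when comparing design alternatives.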

Benchmarking

• Real programs; e.g., compilers, photo editing.
• Modified or scripted real programs; e.g., compression algorithms.
• Kernels – small, key pieces from real programs; e.g., Livermore Loops, Linpack.
• Toy benchmarks – typically 10 to 100 lines of code, useful primarily for intro programming assignments; e.g., quicksort, prime numbers, encryption.
• Synthetic benchmarks – try to match average frequency of operations and operands for a set of programs; e.g., Whetstone, Dhrystone.
• Benchmark suites – collections of programs; e.g., SPEC CPU2000.