Transcript Slide 1
Moore’s Law in Microprocessors
Transistors on lead microprocessors double every 2 years.
[Figure: transistors (MT, log scale 0.001 to 1000) vs. year, 1970-2010. Data points: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, P6. Annotation: 2X growth in 1.96 years!]

Slide 2
Evolution in DRAM Chip Capacity
[Figure: Kbit capacity/chip (log scale) vs. year, 1980-2010 in 3-year steps. Data labels: 64; 256; 1,000; 4,000; 16,000; 64,000; 256,000; 1,000,000; 4,000,000; 16,000,000; 64,000,000. Process labels: 1.6-2.4 µm, 1.0-1.2 µm, 0.7-0.8 µm, 0.5-0.6 µm, 0.35-0.4 µm, 0.18-0.25 µm, 0.13 µm, 0.1 µm, 0.07 µm. Annotation: 4X growth every 3 years!]

Slide 3
Die Size Growth
Die size grows by 14% to satisfy Moore’s Law.
[Figure: die size (mm, log scale 1 to 100) vs. year, 1970-2010. Data points: 4004 through P6. Annotations: ~7% growth per year; ~2X growth in 10 years.]

Slide 4
Clock Frequency
Lead microprocessors frequency doubles every 2 years.
[Figure: frequency (MHz, log scale 0.1 to 10000) vs. year, 1970-2010. Data points: 4004 through P6. Annotation: 2X every 2 years. Courtesy, Intel.]

Slide 5
Power Dissipation
Lead microprocessors power continues to increase.
[Figure: power (Watts, log scale 0.1 to 100) vs. year, 1971-2000. Data points: 4004 through P6.]
Power delivery and dissipation will be prohibitive.

Slide 6
Power Density
[Figure: power density (W/cm2, log scale 1 to 10000) vs. year, 1970-2010. Data points: 4004 through P6. Reference levels: Hot Plate, Nuclear Reactor, Rocket Nozzle.]
Power density too high to keep junctions at low temp.

Slide 7
Design Productivity Trends
[Figure: logic transistors per chip (M) and productivity (K transistors per staff-month) vs. year, 1981-2009. Complexity growth rate: 58%/yr compounded. Productivity growth rate: 21%/yr compounded.]
Complexity outpaces design productivity.
Courtesy, ITRS Roadmap

Slide 8
SIA Roadmap

Year                  1999   2002      2005   2008   2011   2014
Feature size (nm)     180    130       100    70     50     35
Mtrans/cm2            7      14-26     47     115    284    701
Chip size (mm2)       170    170-214   235    269    308    354
Signal pins/chip      768    1024      1024   1280   1408   1472
Clock rate (MHz)      600    800       1100   1400   1800   2200
Wiring levels         6-7    7-8       8-9    9      9-10   10
Power supply (V)      1.8    1.5       1.2    0.9    0.6    0.6
High-perf power (W)   90     130       160    170    174    183
Battery power (W)     1.4    2.0       2.4    2.0    2.2    2.4

Slide 11
Design Abstraction Levels
[Figure: abstraction levels. Labels: SYSTEM, MODULE, GATE, CIRCUIT (Vin, Vout), DEVICE (G, S, D, n+).]

Slide 12
Major Design Challenges
• Microscopic issues
  – ultra-high speeds
  – power dissipation and supply rail drop
  – growing importance of interconnect
  – noise, crosstalk
  – reliability, manufacturability
  – clock distribution
• Macroscopic issues
  – time-to-market
  – design complexity (millions of gates)
  – high levels of abstractions
  – reuse and IP, portability
  – systems on a chip (SoC)
  – tool interoperability

Year   Tech. (µm)   Complexity   Frequency   3 Yr. Design Staff Size   Staff Costs
1997   0.35         13 M Tr.     400 MHz     210                       $90 M
1998   0.25         20 M Tr.     500 MHz     270                       $120 M
1999   0.18         32 M Tr.     600 MHz     360                       $160 M
2002   0.13         130 M Tr.    800 MHz     800                       $360 M

Slide 22
Programmable Logic Technologies
• Fuse and anti-fuse
  – Fuse makes or breaks link between two wires
  – Typical connections are 50-300 ohm
  – One-time programmable (testing before programming?)
  – Very high density
• EPROM and EEPROM
  – High power consumption
  – Typical connections are 2K-4K ohm
  – Fairly high density
• RAM-based
  – Memory bit controls a switch that connects/disconnects two wires
  – Typical connections are .5K-1K ohm
  – Can be programmed and re-programmed in the circuit
  – Low density

Slide 23
Altera EPLD (Erasable Programmable Logic Devices)
• Historical Perspective
  – PALs: same technology as programmed-once bipolar PROM
  – EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light
• Altera building block = MACROCELL
[Figure: macrocell, an 8 Product Term AND-OR Array + Programmable MUX's. Labels: CLK, Clk MUX, AND ARRAY, Output MUX, Q, pad, I/O Pin, Invert Control, F/B MUX, Programmable polarity, Seq. Logic Block, Programmable feedback.]

Slide 24
Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells.
Personalized by EPROM bits:
• Synchronous Mode: flipflop controlled by global clock signal; local signal computes output enable
• Asynchronous Mode: flipflop controlled by locally generated clock signal
+ Seq Logic: could be D, T positive or negative edge triggered
+ product term to implement clear function
[Figure: Clk MUX in each mode. Labels: Global CLK, OE/Local CLK, Clk MUX, EPROM Cell, Q.]

Slide 25
Actel Logic Module
Basic Module is a Modified 4:1 Multiplexer
[Figure: three 2:1 MUXes with data inputs D0-D3, select inputs SOA, SOB, S0, S1, and output Y.]
Example: Implementation of S-R Latch
[Figure: S-R latch built from 2:1 MUXes. Labels: S, R, "0", "1", Q.]

Slide 26
Actel Interconnect
[Figure: interconnection fabric. Labels: Logic Module, Horizontal Track, Vertical Track, Anti-fuse.]

Slide 27
Xilinx Programmable Gate Arrays
[Figure: array of CLBs surrounded by IOBs, with Wiring Channels between the CLBs.]
• CLB - Configurable Logic Block
  – 5-input, 1 output function
  – or 2 4-input, 1 output functions
  – optional register on outputs
• Built-in fast carry logic
• Can be used as memory
• Three types of routing
  – direct
  – general-purpose
  – long lines of various lengths
• RAM-programmable
  – can be reconfigured

Slide 28
[Figure: I/O block detail and CLBs around a Switch Matrix. Labels: Slew Rate Control, Passive Pull-Up, Pull-Down, Output Buffer, Input Buffer, Pad, Vcc, D, Q, CLB.]
[Figure: Programmable Interconnect, I/O Blocks (IOBs), and Configurable Logic Blocks (CLBs). CLB labels: C1-C4, H1, DIN, S/R, EC, G1-G4, F1-F4, K; F, G, and H Func. Gen.; S/R Control; flip-flops with D, Q, SD, RD, EC; outputs X and Y. IOB label: Delay.]

Slide 29
The Xilinx 4000 CLB

Slide 30
Xilinx 4000 Interconnect

Slide 31
Switch Matrix

Slide 32
Xilinx 4000 Interconnect Details

Slide 33
Computer-Aided Design
• Can't design FPGAs by hand
  – Way too much logic to manage, hard to make changes
• Hardware description languages
  – Specify functionality of logic at a high level
• Validation: high-level simulation to catch specification errors
  – Verify pin-outs and connections to other system components
  – Low-level to verify mapping and check performance
• Logic synthesis
  – Process of compiling HDL program into logic gates and flip-flops
• Technology mapping
  – Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs)

Slide 34
CAD Tool Path (cont’d)
• Placement and routing
  – Assign logic blocks to functions
  – Make wiring connections
• Timing analysis - verify paths
  – Determine delays as routed
  – Look at critical paths and ways to improve
• Partitioning and constraining
  – If design does not fit or is unroutable as placed, split into multiple chips
  – If design is too slow, prioritize critical paths, fix placement of cells, etc.
  – Few tools to help with these tasks exist today
• Generate programming files - bits to be loaded into chip for configuration

Slide 35
Xilinx CAD Tools
• Verilog (or VHDL) used to specify logic at a high level
  – Combine with schematics, library components
• Synopsys
  – Compiles Verilog to logic
  – Maps logic to the FPGA cells
  – Optimizes logic
• Xilinx APR - automatic place and route (simulated annealing)
  – Provides controllability through constraints
  – Handles global signals
• Xilinx Xdelay - measure delay properties of mapping and aid in iteration
• Xilinx XACT - design editor to view final mapping results
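
The slides name simulated annealing as the method behind automatic place and route but give no details. The following is a minimal Python sketch of annealing-based placement under stated assumptions, and it is not the Xilinx APR algorithm. It assumes a rectangular grid of identical slots, a cost equal to the total half-perimeter wirelength of all nets, moves that swap the contents of two slots, and a geometric cooling schedule. The names `wirelength` and `anneal_placement` and all parameter values are hypothetical.

```python
import math
import random


def wirelength(placement, nets):
    """Total half-perimeter wirelength: sum of bounding-box sizes of all nets."""
    total = 0
    for net in nets:
        rows = [placement[cell][0] for cell in net]
        cols = [placement[cell][1] for cell in net]
        total += (max(rows) - min(rows)) + (max(cols) - min(cols))
    return total


def anneal_placement(cells, nets, rows, cols, seed=0,
                     t_start=5.0, t_end=0.01, alpha=0.95, moves_per_temp=200):
    """Place cells on a rows x cols grid (hypothetical helper, assumed cost model).

    Returns (best_placement, best_cost, initial_cost).
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    slots = [(r, c) for r in range(rows) for c in range(cols)]
    if len(cells) > len(slots):
        raise ValueError("design does not fit on the grid")

    # occupant[i] is the cell sitting in slots[i], or None for an empty slot.
    occupant = list(cells) + [None] * (len(slots) - len(cells))
    rng.shuffle(occupant)

    def current_placement():
        return {cell: slots[i] for i, cell in enumerate(occupant) if cell is not None}

    cost = wirelength(current_placement(), nets)
    initial_cost = cost
    best, best_cost = current_placement(), cost

    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            i, j = rng.sample(range(len(slots)), 2)
            occupant[i], occupant[j] = occupant[j], occupant[i]  # trial swap
            new_cost = wirelength(current_placement(), nets)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t) so the search can escape local minima.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = current_placement(), cost
            else:
                occupant[i], occupant[j] = occupant[j], occupant[i]  # undo swap
        t *= alpha  # geometric cooling
    return best, best_cost, initial_cost


if __name__ == "__main__":
    demo_cells = ["a", "b", "c", "d", "e", "f"]
    demo_nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("a", "f")]
    placement, final_cost, start_cost = anneal_placement(demo_cells, demo_nets, 3, 3)
    print("initial wirelength:", start_cost)
    print("final wirelength:", final_cost)
    print(placement)
```

In this sketch a high starting temperature lets the placer accept many cost-increasing swaps, and the acceptance probability shrinks as the temperature falls. A real tool would also honor user constraints and treat global signals separately, as the slide notes.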