ECE 448 Lecture 7 FPGA Devices ECE 448 – FPGA and ASIC Design with VHDL George Mason University.


ECE 448
Lecture 7
FPGA Devices
ECE 448 – FPGA and ASIC Design with VHDL
George Mason University
Reading
Required
• P. Chu, FPGA Prototyping by VHDL Examples
Chapter 2.2, FPGA
Recommended
• S. Brown and Z. Vranesic, Fundamentals of Digital
Logic with VHDL Design
Chapter 3.6.5 Field-Programmable Gate Arrays
Recommended Reading
Xilinx, Inc.
Spartan-3E FPGA Family
Module 1:
• Introduction
• Features
• Architectural Overview
• Package Marking
Module 2:
• Configurable Logic Block (CLB)
and Slice Resources
• Dedicated Multipliers
Required Reading
Xilinx, Inc.
Spartan-3 Generation FPGA User Guide
Extended Spartan-3A, Spartan-3E, and Spartan-3
FPGA Families
Chapter 5 Using Configurable Logic Blocks (CLBs)
Chapter 6 Using Look-Up Tables as Distributed
RAM
Chapter 7 Using Look-Up Tables as Shift Registers
(SRL16) [up to Library Primitives]
Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves
What is an FPGA?
[Figure: FPGA layout showing configurable logic blocks, block RAMs, and I/O blocks.]
Which Way to Go?
ASICs
• High performance
• Low power
• Low cost in high volumes
FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability
Other FPGA Advantages
• The manufacturing cycle for an ASIC is very costly and lengthy, and engages a lot of manpower
• Mistakes not detected at design time have a large impact on development time and cost
• FPGAs are perfect for rapid prototyping of digital circuits
• Easy upgrades, as in the case of software
• Unique applications
  • reconfigurable computing
Major FPGA Vendors
SRAM-based FPGAs
• Xilinx, Inc.: ~51% of the market
• Altera Corp.: ~34% of the market
  (Xilinx and Altera together: ~85%)
• Lattice Semiconductor
• Atmel
Flash & antifuse FPGAs
• Actel Corp. (Microsemi SoC Products Group)
• Quick Logic Corp.
Xilinx

• Primary products: FPGAs and the associated CAD software
  • Programmable Logic Devices
  • ISE Alliance and Foundation Series Design Software
• Main headquarters in San Jose, CA
• Fabless* Semiconductor and Software Company
  • UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
  • Seiko Epson (Japan)
  • TSMC (Taiwan)
  • Samsung (Korea)
Xilinx FPGA Families
High-performance families
• Virtex (220 nm)
• Virtex-E, Virtex-EM (180 nm)
• Virtex-II (130 nm)
• Virtex-II PRO (130 nm)
• Virtex-4 (90 nm)
• Virtex-5 (65 nm)
• Virtex-6 (40 nm)
• Virtex-7 (28 nm)
Low-cost family
• Spartan/XL – derived from XC4000
• Spartan-II – derived from Virtex
• Spartan-IIE – derived from Virtex-E
• Spartan-3 (90 nm)
• Spartan-3E (90 nm) – logic optimized
• Spartan-3A (90 nm) – I/O optimized
• Spartan-3AN (90 nm) – non-volatile
• Spartan-3A DSP (90 nm) – DSP optimized
• Spartan-6 (45 nm)
• Artix-7 (28 nm)
CLB Structure
General structure of an FPGA
[Figure: programmable logic blocks and programmable interconnect.]
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Xilinx Spartan 3E CLB
[Figure: an array of configurable logic blocks (CLBs). Each CLB is drawn with four slices, and each slice with two logic cells.]
CLB Slice = 2 Logic Cells
[Figure: slice diagram. Two look-up tables (inputs G4 to G1 and F4 to F1, output O), each followed by carry & control logic and a storage element (D, Q, CK, EC, S, R). Slice signals: COUT, YB, Y, BY, SR, XB, X, F5IN, CIN, CLK, CE.]
Xilinx Multipurpose LUT (MLUT)
[Figure: a 4-input LUT can be used as a 16 x 1 ROM (logic), a 16 x 1 RAM, or a 16-bit shift register (SR).]
CLB Slice Structure
• Each slice contains two sets of the following:
  • Four-input LUT
    • Any 4-input logic function,
    • or 16-bit x 1 sync RAM (SLICEM only)
    • or 16-bit shift register (SLICEM only)
  • Carry & Control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  • Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control
Multipurpose Look-Up Table (MLUT)
[Figure: slice diagram with two look-up tables, carry & control logic, and storage elements, as on the "CLB Slice = 2 Logic Cells" slide.]
MLUT as 16x1 ROM
[Figure: multipurpose LUT modes (16-bit SR, 16 x 1 RAM, 16 x 1 ROM for logic); this slide covers the ROM mode.]
LUT (Look-Up Table) in the Basic ROM Mode
• Look-up tables are the primary elements for logic implementation
• Each LUT can implement any function of 4 inputs

Two example LUT contents (each y column is one 16-bit LUT configuration):

x1 x2 x3 x4 | y (example 1) | y (example 2)
 0  0  0  0 |       1       |       0
 0  0  0  1 |       1       |       1
 0  0  1  0 |       1       |       0
 0  0  1  1 |       1       |       0
 0  1  0  0 |       1       |       0
 0  1  0  1 |       1       |       1
 0  1  1  0 |       1       |       0
 0  1  1  1 |       1       |       1
 1  0  0  0 |       1       |       0
 1  0  0  1 |       1       |       1
 1  0  1  0 |       1       |       0
 1  0  1  1 |       1       |       0
 1  1  0  0 |       0       |       1
 1  1  0  1 |       0       |       1
 1  1  1  0 |       0       |       0
 1  1  1  1 |       0       |       0

In example 1 the output depends only on x1 and x2: y = 0 exactly when x1 = x2 = 1.
[Figure: a LUT block with inputs x1, x2, x3, x4 and output y, shown next to an equivalent gate with inputs x1, x2.]
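The ROM mode can be sketched in VHDL as a constant indexed by the four inputs. This is only an illustrative model, not code from the lecture: the entity name lut4_rom and the bit ordering of the port x are assumptions, and the constant holds the 16 output bits of the second example truth table on this slide.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: one 4-input LUT used as a 16 x 1 ROM.
entity lut4_rom is
  port (
    x : in  std_logic_vector(3 downto 0);  -- assumed order: x(3) = x1 ... x(0) = x4
    y : out std_logic
  );
end entity lut4_rom;

architecture rtl of lut4_rom is
  -- LUT contents, indexed 0 to 15 in truth-table row order.
  constant INIT : std_logic_vector(0 to 15) := "0100010101001100";
begin
  -- The inputs act as a ROM address; the stored bit is the function value.
  y <= INIT(to_integer(unsigned(x)));
end architecture rtl;
```

Changing only the constant changes the implemented function, which is why any function of 4 inputs costs the same single LUT.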
5-Input Functions implemented using two LUTs
• One CLB slice can implement any function of 5 inputs
• Logic function is partitioned between two LUTs
• F5 multiplexer selects LUT
[Figure: two LUTs (ROM/RAM, address inputs A4 to A1) feeding the F5 multiplexer, which is controlled by BX and drives output X.]
5-Input Functions implemented using two LUTs

X5 X4 X3 X2 X1 | Y
 0  0  0  0  0 | 0
 0  0  0  0  1 | 1
 0  0  0  1  0 | 0
 0  0  0  1  1 | 0
 0  0  1  0  0 | 1
 0  0  1  0  1 | 1
 0  0  1  1  0 | 0
 0  0  1  1  1 | 0
 0  1  0  0  0 | 1
 0  1  0  0  1 | 0
 0  1  0  1  0 | 0
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 0  1  1  0  1 | 1
 0  1  1  1  0 | 1
 0  1  1  1  1 | 1
 1  0  0  0  0 | 0
 1  0  0  0  1 | 0
 1  0  0  1  0 | 0
 1  0  0  1  1 | 0
 1  0  1  0  0 | 0
 1  0  1  0  1 | 0
 1  0  1  1  0 | 0
 1  0  1  1  1 | 1
 1  1  0  0  0 | 0
 1  1  0  0  1 | 1
 1  1  0  1  0 | 0
 1  1  0  1  1 | 1
 1  1  1  0  0 | 0
 1  1  1  0  1 | 1
 1  1  1  1  0 | 0
 1  1  1  1  1 | 0

[Figure: the upper half of the table (X5 = 0) maps to one LUT and the lower half (X5 = 1) to the other; a multiplexer combines the two LUT outputs into OUT.]
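The partitioning can be sketched in VHDL as two 16-bit constants plus a 2-to-1 multiplexer. This is an illustrative model under stated assumptions: the entity name func5, the signal names, and the port bit ordering are hypothetical, and the two constants are the X5 = 0 and X5 = 1 halves of the Y column in the table on this slide.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: a 5-input function built from two 4-input LUTs.
entity func5 is
  port (
    x : in  std_logic_vector(4 downto 0);  -- assumed order: x(4) = X5 ... x(0) = X1
    y : out std_logic
  );
end entity func5;

architecture rtl of func5 is
  constant LUT_LO : std_logic_vector(0 to 15) := "0100110010011111";  -- rows with X5 = 0
  constant LUT_HI : std_logic_vector(0 to 15) := "0000000101010100";  -- rows with X5 = 1
  signal f_lo, f_hi : std_logic;
begin
  -- Both LUTs see the same four low-order inputs.
  f_lo <= LUT_LO(to_integer(unsigned(x(3 downto 0))));
  f_hi <= LUT_HI(to_integer(unsigned(x(3 downto 0))));
  -- Plays the role of the F5 multiplexer: the fifth input picks one LUT output.
  y <= f_hi when x(4) = '1' else f_lo;
end architecture rtl;
```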
MLUT as 16x1 RAM
[Figure: multipurpose LUT modes (16-bit SR, 16 x 1 RAM, 16 x 1 ROM for logic); this slide covers the RAM mode.]
Distributed RAM
• CLB LUT configurable as distributed RAM
  • A single LUT equals a 16x1 RAM
  • Two LUTs implement single- and dual-port RAMs
  • Cascade LUTs to increase RAM size
• Synchronous write
• Synchronous/asynchronous read
  • Accompanying flip-flops used for synchronous read

Primitive | LUTs | Inputs                               | Outputs
RAM16X1S  |  1   | D, WE, WCLK, A0 to A3                | O
RAM32X1S  |  2   | D, WE, WCLK, A0 to A4                | O
RAM16X2S  |  2   | D0, D1, WE, WCLK, A0 to A3           | O0, O1
RAM16X1D  |  2   | D, WE, WCLK, A0 to A3, DPRA0 to DPRA3 | SPO, DPO
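A behavioral description with a synchronous write and an asynchronous read matches the RAM16X1S behavior listed above. The sketch below is illustrative only: the entity name dist_ram16x1 is hypothetical, the port names are lowercase versions of the primitive's pins, and whether a given synthesis tool actually maps it to a LUT-based RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 16 x 1 single-port RAM, written in a style
-- that synthesis tools commonly map to distributed (LUT) RAM.
entity dist_ram16x1 is
  port (
    wclk : in  std_logic;
    we   : in  std_logic;
    a    : in  std_logic_vector(3 downto 0);
    d    : in  std_logic;
    o    : out std_logic
  );
end entity dist_ram16x1;

architecture rtl of dist_ram16x1 is
  signal mem : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Synchronous write: data is stored on the rising clock edge.
  process (wclk)
  begin
    if rising_edge(wclk) then
      if we = '1' then
        mem(to_integer(unsigned(a))) <= d;
      end if;
    end if;
  end process;

  -- Asynchronous read: the output follows the address combinationally.
  o <= mem(to_integer(unsigned(a)));
end architecture rtl;
```

Registering o in a clocked process would give the synchronous read mentioned above, using the flip-flop that accompanies the LUT.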
MLUT as 16-bit Shift Register (SRL16)
[Figure: multipurpose LUT modes (16-bit SR, 16 x 1 RAM, 16 x 1 ROM for logic); this slide covers the shift-register mode.]
Shift Register
• Each LUT can be configured as a shift register
  • Serial in, serial out
• Dynamically addressable delay up to 16 cycles
• For programmable pipeline
• Cascade for greater cycle delays
• Use CLB flip-flops to add depth
[Figure: a LUT in shift-register mode drawn as a chain of D flip-flops sharing CE and CLK; serial input IN, and output OUT selected by DEPTH[3:0].]
Using Multipurpose Look-Up Tables
in the Shift Register Mode (SRL16)
Inferred from behavioral description in VHDL
for shift-registers with
- one serial input, one serial output
- no reset, no set
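A behavioral shift register that satisfies the conditions above (one serial input, one serial output, no set or reset) might look like the following sketch. The entity and port names are hypothetical, and actual mapping to an SRL16 depends on the synthesis tool and its options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: 16-stage serial-in, serial-out shift register.
entity shift16 is
  port (
    clk  : in  std_logic;
    ce   : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity shift16;

architecture rtl of shift16 is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Only a clock enable is used; there is deliberately no set or reset.
      if ce = '1' then
        sr <= sr(14 downto 0) & din;
      end if;
    end if;
  end process;

  -- Only the last stage is observed, so the register has a single serial output.
  dout <= sr(15);
end architecture rtl;
```

Adding a reset or reading several intermediate stages in parallel would typically force the tool to fall back to individual flip-flops.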
Cascading LUT Shift Registers into Shift
Registers Longer than 16 bits
Shift Register
[Figure: two parallel 64-bit data paths. Operation A (4 cycles) followed by Operation B (8 cycles) takes 12 cycles; the path through Operation C takes 3 cycles, giving a 9-cycle imbalance.]
• Register-rich FPGA
  • Allows for addition of pipeline stages to increase throughput
• Data paths must be balanced to keep desired functionality
Logic Cell = ½ of a CLB Slice
CLB Slice = 2 Logic Cells
Examples:
Determine the amount of Spartan 3 resources needed to implement a given circuit
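Circuit 1: Top level
[Figure: top-level schematic. Labels: m, w, run, clk; registers R0 to R15; a block F with inputs a, b, c, d and output y.]
Circuit 1: F – function
[Figure: schematic of F with inputs a, b, c, d and output y. Blocks shown: a 2-to-4 decoder (inputs w1, w0, En; outputs y3 to y0), a <<<3 block (inputs x3 to x0; outputs y3 to y0), a block with inputs numbered 0 to 7, and a full adder (inputs x, y, cin; outputs s, cout). Internal signals: e, f, g, h.]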
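Circuit 2: Top level
[Figure: top-level schematic. Labels: z, run, clk; registers R0 to R15; a block F with inputs a, b, c, d, e and output y.]
Circuit 2: F – function
[Figure: schematic of F with inputs a, b, c, d, e and output y. Blocks shown: a priority encoder (inputs w3 to w0; outputs y1, y0, z), a >>2 block (inputs x3 to x0; outputs y3 to y0), a block with inputs numbered 0 to 7, and a half adder (inputs x, y; outputs s, cout). Internal signals: f, g, h, i.]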
Carry & Control Logic
[Figure: slice diagram with two look-up tables, carry & control logic, and storage elements, as on the "CLB Slice = 2 Logic Cells" slide.]
Full-adder
[Figure: full-adder block FA with inputs x, y, cin and outputs cout, s.]
x + y + cin = (cout s)_2

x y cin | cout s
0 0  0  |  0   0
0 0  1  |  0   1
0 1  0  |  0   1
0 1  1  |  1   0
1 0  0  |  0   1
1 0  1  |  1   0
1 1  0  |  1   0
1 1  1  |  1   1
Full-adder
Alternative implementations

x y | cout | s
0 0 |  0   | cin
0 1 | cin  | NOT cin
1 0 | cin  | NOT cin
1 1 |  1   | cin
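Full-adder
Alternative implementations
Implementation used to generate fast carry logic in Xilinx FPGAs

x y | cout
0 0 |  y
0 1 | cin
1 0 | cin
1 1 |  y

p = x ⊕ y
g = y
s = p ⊕ cin = x ⊕ y ⊕ cin
[Figure: a LUT with inputs A2 = x and A1 = y produces p on its output D; p controls a carry multiplexer that selects g (input 0) or Cin (input 1) to form Cout, and an XOR gate combines p with Cin to form S.]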
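The propagate/generate formulation above can be written out directly in VHDL. This is an illustrative sketch with a hypothetical entity name, not the vendor's primitive-level description; it only mirrors the equations p = x ⊕ y, g = y, and the multiplexer that chooses between g and cin.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: one full-adder bit in carry-multiplexer form.
entity fa_carry_mux is
  port (
    x, y, cin : in  std_logic;
    s, cout   : out std_logic
  );
end entity fa_carry_mux;

architecture rtl of fa_carry_mux is
  signal p : std_logic;  -- propagate
begin
  p <= x xor y;          -- computed in the LUT
  s <= p xor cin;        -- dedicated XOR gate

  -- Carry multiplexer: when p = '1' the incoming carry propagates,
  -- otherwise x = y and the carry out equals g = y.
  cout <= cin when p = '1' else y;
end architecture rtl;
```

Checking the four (x, y) rows of the table confirms the multiplexer form: for x = y the carry out is y regardless of cin, and for x /= y it equals cin.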
Carry & Control Logic in Spartan 3 FPGAs
LUT
Hardwired (fast) logic
Critical Path for an Adder Implemented Using Xilinx Spartan 3/Spartan 3E FPGAs
Number and Length of Carry Chains for Spartan 3E FPGAs
• Bottom Operand Input to Carry Out Delay: TOPCYF = 0.9 ns for Spartan 3
• Carry Propagation Delay: tBYP = 0.2 ns for Spartan 3
• Carry Input to Top Sum Combinational Output Delay: TCINY = 1.2 ns for Spartan 3
Fast Carry Logic
• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources
[Figure: carry logic routing running through a column of CLBs from LSB to MSB.]
Accessing Carry Logic
• All major synthesis tools can infer carry logic for arithmetic functions
  • Addition (SUM <= A + B)
  • Subtraction (DIFF <= A - B)
  • Comparators (if A < B then…)
  • Counters (count <= count + 1)
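As a minimal illustration of the first case, the adder below uses the "+" operator from ieee.numeric_std, which is the kind of description a synthesis tool would normally map onto the carry chain. The entity name add8 and the 8-bit width are arbitrary choices for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 8-bit unsigned adder with a 9-bit result.
entity add8 is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    sum  : out std_logic_vector(8 downto 0)   -- bit 8 is the carry out
  );
end entity add8;

architecture rtl of add8 is
begin
  -- Operands are widened by one bit so the carry out is not lost.
  sum <= std_logic_vector(resize(unsigned(a), 9) + resize(unsigned(b), 9));
end architecture rtl;
```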
Embedded Multipliers
RAM Blocks and Multipliers in Xilinx
FPGAs
[Figure: FPGA floorplan with columns of RAM blocks and multipliers placed among the logic blocks.]
Combinational and Registered Multiplier
Dedicated Multiplier Block
Interface of a Dedicated Multiplier
3 Ways to Use Dedicated Hardware
• Three (3) ways to use dedicated
(embedded) hardware
– Inference
– Instantiation
– CORE Generator
Inferred Multiplier
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult18x18 is
  generic (
    word_size   : natural := 17;
    signed_mult : boolean := true);
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(1*word_size-1 downto 0);
    b   : in  std_logic_vector(1*word_size-1 downto 0);
    c   : out std_logic_vector(2*word_size-1 downto 0));
end entity mult18x18;

architecture infer of mult18x18 is
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- The product is registered; the generic selects signed or unsigned arithmetic.
      if signed_mult then
        c <= std_logic_vector(signed(a) * signed(b));
      else
        c <= std_logic_vector(unsigned(a) * unsigned(b));
      end if;
    end if;
  end process;
end architecture infer;
Unsigned vs. Signed Multiplication
         | Unsigned              | Signed
Operands | 1111 x 1111 (15 x 15) | 1111 x 1111 (-1 x -1)
Product  | 11100001 (225)        | 00000001 (1)
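The same two products can be checked in simulation with the signed and unsigned types from ieee.numeric_std. The small self-checking testbench below is a sketch; the entity name and the constant name are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench: the same bit pattern multiplied as unsigned and as signed.
entity mult_sign_demo is
end entity mult_sign_demo;

architecture sim of mult_sign_demo is
begin
  check : process
    constant OPERAND : std_logic_vector(3 downto 0) := "1111";
    variable pu : unsigned(7 downto 0);
    variable ps : signed(7 downto 0);
  begin
    pu := unsigned(OPERAND) * unsigned(OPERAND);  -- 15 x 15
    ps := signed(OPERAND) * signed(OPERAND);      -- (-1) x (-1)

    assert to_integer(pu) = 225
      report "unsigned product mismatch" severity failure;
    assert to_integer(ps) = 1
      report "signed product mismatch" severity failure;
    wait;  -- stop the process after the checks
  end process check;
end architecture sim;
```

The type conversion alone decides the interpretation: the operand bits are identical in both cases, only the product differs.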
Forcing a particular implementation in VHDL
Synthesis tool: Xilinx XST
attribute MULT_STYLE : string;
attribute MULT_STYLE of mult18x18 : entity is "block";
Allowed values of the attribute:
block – dedicated multiplier
lut – LUT-based multiplier
pipe_block – pipelined dedicated multiplier
pipe_lut – pipelined LUT-based multiplier
auto – automatic choice by the synthesis tool
CORE Generator
FPGA Block RAM
Block RAM
[Figure: Spartan-3 dual-port block RAM with Port A and Port B.]
• Most efficient memory implementation
  • Dedicated blocks of memory
• Ideal for most memory requirements
  • 4 to 36 memory blocks in Spartan 3E
  • 18 kbits = 18,432 bits per block (16 kbits without parity bits)
  • Use multiple blocks for larger memories
• Builds both single and true dual-port RAMs
• Synchronous write and read (different from distributed RAM)
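The synchronous read is what distinguishes a block-RAM style description from the distributed-RAM case. The following single-port sketch is illustrative only: the entity name, port names, and the 1024 x 16 organization are assumptions, and whether a tool places it in block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 1024 x 16 single-port RAM with registered read.
entity ram_1kx16 is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(9 downto 0);
    din  : in  std_logic_vector(15 downto 0);
    dout : out std_logic_vector(15 downto 0)
  );
end entity ram_1kx16;

architecture rtl of ram_1kx16 is
  type ram_t is array (0 to 1023) of std_logic_vector(15 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(unsigned(addr))) <= din;   -- synchronous write
      end if;
      -- Synchronous read: the output is updated only on the clock edge
      -- (on a write, the old word at addr is returned).
      dout <= ram(to_integer(unsigned(addr)));
    end if;
  end process;
end architecture rtl;
```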
Spartan-3E Block RAM Amounts
Block RAM can have various configurations (port aspect ratios)

Configuration | Data width | Address range
16k x 1       | 1          | 0 to 16,383
8k x 2        | 2          | 0 to 8,191
4k x 4        | 4          | 0 to 4,095
2k x (8+1)    | 8+1        | 0 to 2,047
1024 x (16+2) | 16+2       | 0 to 1,023
Block RAM Port Aspect Ratios
Single-Port Block RAM
DO[w-p-1:0]
DI[w-p-1:0]
Dual-Port Block RAM
DOA[wA-pA-1:0]
DIA[wA-pA-1:0]
DOB[wB-pB-1:0]
DIB[wB-pB-1:0]
Input/Output Blocks (IOBs)
Basic I/O Block Structure
[Figure: I/O block with three flip-flops (D, Q, EC, SR) sharing Clock and Set/Reset: a three-state path (Three-State Control, Three-State FF Enable), an output path (Output, Output FF Enable), and an input path (Direct Input, Registered Input, Input FF Enable).]
IOB Functionality
• IOB provides interface between the package pins and CLBs
• Each IOB can work as uni- or bi-directional I/O
• Outputs can be forced into high impedance
• Inputs and outputs can be registered
  • advised for high-performance I/O
• Inputs can be delayed
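A bidirectional pin with a high-impedance state and registered input, output, and enable can be described as in the sketch below. The entity and signal names are hypothetical, and whether the registers are actually packed into the IOB depends on the tool and its constraints.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: one bidirectional pin with registered paths.
entity bidir_pin is
  port (
    clk  : in    std_logic;
    oe   : in    std_logic;   -- '1' drives the pin, '0' releases it
    dout : in    std_logic;   -- value to drive onto the pin
    din  : out   std_logic;   -- value sampled from the pin
    pad  : inout std_logic
  );
end entity bidir_pin;

architecture rtl of bidir_pin is
  signal dout_r, oe_r, din_r : std_logic := '0';
begin
  -- Register the output value, the three-state control, and the input.
  process (clk)
  begin
    if rising_edge(clk) then
      dout_r <= dout;
      oe_r   <= oe;
      din_r  <= pad;
    end if;
  end process;

  -- Three-state driver: 'Z' puts the output into high impedance.
  pad <= dout_r when oe_r = '1' else 'Z';
  din <= din_r;
end architecture rtl;
```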
Spartan-3E Family Attributes
Spartan-3E FPGA Family Members
FPGA Nomenclature
FPGA device present on the Digilent Basys2 board: XC3S100E-4CP132
• XC3S...E: Spartan 3E family
• 100: 100 k equivalent logic gates
• -4: speed grade (-4 = standard performance)
• CP: package type
• 132: 132 pins