ECE 448 Lecture 6 FPGA Devices & FPGA Design Flow ECE 448 – FPGA and ASIC Design with VHDL George Mason University.
ECE 448
Lecture 6
FPGA Devices
& FPGA Design Flow
ECE 448 – FPGA and ASIC Design with VHDL
George Mason University
Required reading (1)
• P. Chu, FPGA Prototyping by VHDL Examples
Chapter 2.2, FPGA
• S. Brown and Z. Vranesic, Fundamentals of Digital
Logic with VHDL Design
Chapter 3.6.5 Field-Programmable Gate Arrays
Required Reading (2)
Xilinx, Inc.
Spartan-3 FPGA Family
Module 1:
• Introduction
• Features
• Architectural Overview
• Package Marking
Module 2:
• CLB Overview
Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time consuming fabrication in semiconductor foundry
FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves
What is an FPGA?
[Figure: FPGA layout showing Configurable Logic Blocks, I/O Blocks, and Block RAMs]
Which Way to Go?
ASICs
• High performance
• Low power
• Low cost in high volumes
FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability
Other FPGA Advantages
• Manufacturing cycle for ASIC is very costly,
lengthy and engages lots of manpower
• Mistakes not detected at design time have
large impact on development time and cost
• FPGAs are perfect for rapid prototyping of
digital circuits
• Easy upgrades like in case of software
• Unique applications
• reconfigurable computing
Major FPGA Vendors
SRAM-based FPGAs
• Xilinx, Inc.
• Altera Corp.
(Xilinx and Altera together share about 90% of the market)
• Atmel
• Lattice Semiconductor
Flash & antifuse FPGAs
• Actel Corp.
• QuickLogic Corp.
The Programmable Marketplace
Q1 Calendar Year 2005
[Figure: two pie charts of market share, "PLD Segment" and "FPGA Sub-Segment". Vendors shown: Xilinx, Altera, Lattice, Actel, QuickLogic (2%), Other (2%), All Others. Other percentages shown: 58%, 33%, 51%, 31%, 11%, 7%, 5%.]
Two dominant suppliers, indicating a maturing market
Source: Company reports
Latest information available; computed on a 4-quarter rolling basis
Xilinx
Primary products: FPGAs and the associated CAD software
• Programmable Logic Devices
• ISE Alliance and Foundation Series Design Software
Main headquarters in San Jose, CA
Fabless* Semiconductor and Software Company; foundries:
• UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
• Seiko Epson (Japan)
• TSMC (Taiwan)
• Samsung (Korea)
Xilinx FPGA Families
Old families
• XC3000, XC4000, XC5200
• Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
High-performance families
• Virtex (220 nm)
• Virtex-E, Virtex-EM (180 nm)
• Virtex-II (130 nm)
• Virtex-II PRO (130 nm)
• Virtex-4 (90 nm)
• Virtex-5 (65 nm)
• Virtex-6 (40 nm) coming in 2009
Low Cost Family
• Spartan/XL – derived from XC4000
• Spartan-II – derived from Virtex
• Spartan-IIE – derived from Virtex-E
• Spartan-3 (90 nm)
• Spartan-3E (90 nm) – logic optimized
• Spartan-3A (90 nm) – I/O optimized
• Spartan-3AN (90 nm) – non-volatile
• Spartan-3A DSP (90 nm) – DSP optimized
• Spartan-6 (45 nm) – coming in 2009
CLB Structure
General structure of an FPGA
Programmable
interconnect
Programmable
logic blocks
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Xilinx CLB
[Figure: array of configurable logic blocks (CLBs); each CLB contains four slices, and each slice contains two logic cells]
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Xilinx CLB Slice
[Figure: a slice containing two logic cells (LC); each logic cell has a 4-input LUT (also usable as 16x1 RAM or 16-bit SR), a MUX, and a register (REG)]
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
CLB Slice Structure
• Each slice contains two sets of the
following:
• Four-input LUT
• Any 4-input logic function,
• or 16-bit x 1 sync RAM (SLICEM only)
• or 16-bit shift register (SLICEM only)
• Carry & Control
• Fast arithmetic logic
• Multiplier logic
• Multiplexer logic
• Storage element
• Latch or flip-flop
• Set and reset
• True or inverted inputs
• Sync. or async. control
LUT (Look-Up Table) Functionality
x1 x2 x3 x4 | y (first example) | y (second example)
 0  0  0  0 |         1         |         0
 0  0  0  1 |         1         |         1
 0  0  1  0 |         1         |         0
 0  0  1  1 |         1         |         0
 0  1  0  0 |         1         |         0
 0  1  0  1 |         1         |         1
 0  1  1  0 |         1         |         0
 0  1  1  1 |         1         |         1
 1  0  0  0 |         1         |         0
 1  0  0  1 |         1         |         1
 1  0  1  0 |         1         |         0
 1  0  1  1 |         1         |         0
 1  1  0  0 |         0         |         1
 1  1  0  1 |         0         |         1
 1  1  1  0 |         0         |         0
 1  1  1  1 |         0         |         0
[Figure: LUT block with inputs x1, x2, x3, x4 and output y]
• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
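A 4-input LUT behaves like a 16x1 memory addressed by its inputs. The following is a minimal VHDL sketch of that idea, using the y column of the second truth table above (0,1,0,0,0,1,0,1,0,1,0,0,1,1,0,0) and assuming x1 is the most significant address bit. The entity name lut4_sketch and the constant name INIT are hypothetical, chosen for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: models one 4-input LUT as a 16-entry table.
entity lut4_sketch is
  port (
    x1, x2, x3, x4 : in  std_logic;
    y              : out std_logic
  );
end entity lut4_sketch;

architecture table of lut4_sketch is
  -- Table contents, indexed 0 to 15 by the value of x1 x2 x3 x4
  -- (x1 = MSB, x4 = LSB), copied from the second truth table.
  constant INIT : std_logic_vector(0 to 15) := "0100010101001100";
  signal addr   : unsigned(3 downto 0);
begin
  addr <= x1 & x2 & x3 & x4;       -- form the table address
  y    <= INIT(to_integer(addr));  -- look up the output value
end architecture table;
```

Changing the function only changes the constant, not the structure, which is why any function of 4 inputs fits in one LUT.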
5-Input Functions implemented using
two LUTs
• One CLB Slice can implement any function of 5 inputs
• Logic function is partitioned between two LUTs
• F5 multiplexer selects LUT
[Figure: two LUTs (each usable as LUT, ROM, or RAM; address inputs A1 to A4 driven by F1 to F4; ports WS, DI, D) feeding the F5 multiplexer, which is controlled by BX; other labels: nBX, GXOR, G, X, F5]
5-Input Functions implemented using two LUTs
X5 X4 X3 X2 X1 | Y
 0  0  0  0  0 | 0
 0  0  0  0  1 | 1
 0  0  0  1  0 | 0
 0  0  0  1  1 | 0
 0  0  1  0  0 | 1
 0  0  1  0  1 | 1
 0  0  1  1  0 | 0
 0  0  1  1  1 | 0
 0  1  0  0  0 | 1
 0  1  0  0  1 | 0
 0  1  0  1  0 | 0
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 0  1  1  0  1 | 1
 0  1  1  1  0 | 1
 0  1  1  1  1 | 1
 1  0  0  0  0 | 0
 1  0  0  0  1 | 0
 1  0  0  1  0 | 0
 1  0  0  1  1 | 0
 1  0  1  0  0 | 0
 1  0  1  0  1 | 0
 1  0  1  1  0 | 0
 1  0  1  1  1 | 1
 1  1  0  0  0 | 0
 1  1  0  0  1 | 1
 1  1  0  1  0 | 0
 1  1  0  1  1 | 1
 1  1  1  0  0 | 0
 1  1  1  0  1 | 1
 1  1  1  1  0 | 0
 1  1  1  1  1 | 0
[Figure: two LUTs whose outputs are combined into the output OUT]
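The partitioning of a 5-input function between two LUTs can be sketched in VHDL as follows. This is an illustrative sketch only: it assumes the truth table above, with one table holding the rows where X5 = 0 and the other the rows where X5 = 1, and X5 acting as the multiplexer select. The entity name func5_sketch and the constant names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: a 5-input function built from two 4-input tables.
entity func5_sketch is
  port (
    x : in  std_logic_vector(5 downto 1);  -- x(5) = X5 ... x(1) = X1
    y : out std_logic
  );
end entity func5_sketch;

architecture two_luts of func5_sketch is
  -- Each table is indexed 0 to 15 by the value of X4 X3 X2 X1.
  constant LUT_X5_0 : std_logic_vector(0 to 15) := "0100110010011111";  -- rows with X5 = 0
  constant LUT_X5_1 : std_logic_vector(0 to 15) := "0000000101010100";  -- rows with X5 = 1
  signal lut0_out, lut1_out : std_logic;
begin
  -- Both tables see the same four low-order inputs.
  lut0_out <= LUT_X5_0(to_integer(unsigned(x(4 downto 1))));
  lut1_out <= LUT_X5_1(to_integer(unsigned(x(4 downto 1))));

  -- The fifth input selects which table drives the output
  -- (the role the F5 multiplexer plays in the slice).
  y <= lut1_out when x(5) = '1' else lut0_out;
end architecture two_luts;
```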
Xilinx Multipurpose LUT
16-bit SR
16 x 1 RAM
4-input LUT
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Simplified view of a Xilinx Logic Cell
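[Figure: logic cell with inputs a, b, c, d, e; a 4-input LUT (also usable as 16x1 RAM or 16-bit SR) with output y; a mux; a flip-flop with output q; control inputs clock, clock enable, set/reset]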
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Distributed RAM
• CLB LUT configurable as Distributed RAM
• A single LUT equals 16x1 RAM
• Two LUTs Implement Single and Dual-Port RAMs
• Cascade LUTs to increase RAM size
• Synchronous write
• Synchronous/Asynchronous read
• Accompanying flip-flops used for synchronous read
[Figure: distributed RAM primitives and their ports. One LUT: RAM16X1S (D, WE, WCLK, A0 to A3; output O). Two LUTs: RAM32X1S (D, WE, WCLK, A0 to A4; output O), or RAM16X2S (D0, D1, WE, WCLK, A0 to A3; outputs O0, O1), or RAM16X1D (D, WE, WCLK, A0 to A3, DPRA0 to DPRA3; outputs SPO, DPO)]
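A behavioral description with a synchronous write and an asynchronous read is the usual way to describe such a 16x1 memory in VHDL. The sketch below is illustrative: the entity name ram16x1s_sketch is hypothetical, the port names only mirror the RAM16X1S labels in the figure, and whether a given synthesis tool maps this code onto a LUT-based distributed RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 16x1 single-port RAM, synchronous write, asynchronous read.
entity ram16x1s_sketch is
  port (
    wclk : in  std_logic;                     -- write clock
    we   : in  std_logic;                     -- write enable
    a    : in  std_logic_vector(3 downto 0);  -- address
    d    : in  std_logic;                     -- data in
    o    : out std_logic                      -- data out
  );
end entity ram16x1s_sketch;

architecture rtl of ram16x1s_sketch is
  signal mem : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Synchronous write: the addressed bit is updated on the rising clock edge.
  write_proc : process (wclk)
  begin
    if rising_edge(wclk) then
      if we = '1' then
        mem(to_integer(unsigned(a))) <= d;
      end if;
    end if;
  end process write_proc;

  -- Asynchronous read: the output follows the address without a clock.
  o <= mem(to_integer(unsigned(a)));
end architecture rtl;
```

Registering o in a clocked process would give the synchronous read mentioned in the bullets, using the accompanying flip-flop.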
Shift Register
• Each LUT can be configured as shift register
• Serial in, serial out
• Dynamically addressable delay up to 16 cycles
• For programmable pipeline
• Cascade for greater cycle delays
• Use CLB flip-flops to add depth
[Figure: LUT shown as a chain of D flip-flops with clock enable (D, CE, Q); inputs IN, CE, CLK; tap select DEPTH[3:0]; output OUT]
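A dynamically addressable shift register of this kind can be sketched in VHDL as a 16-bit shift chain with a selectable tap. The entity name srl16_sketch and the port names are hypothetical. In this sketch a depth value of n gives a delay of n + 1 enabled clock cycles, so the range 0 to 15 covers delays of 1 to 16 cycles; whether a tool implements the code in a single LUT depends on the tool and target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 16-stage shift register with a dynamically selected tap.
entity srl16_sketch is
  port (
    clk   : in  std_logic;
    ce    : in  std_logic;                     -- clock enable
    din   : in  std_logic;                     -- serial input
    depth : in  std_logic_vector(3 downto 0);  -- tap select
    dout  : out std_logic                      -- serial output
  );
end entity srl16_sketch;

architecture rtl of srl16_sketch is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Shift one position on every enabled rising clock edge; no reset is used.
  shift_proc : process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        sr <= sr(14 downto 0) & din;
      end if;
    end if;
  end process shift_proc;

  -- Tap selection: sr(0) is the newest bit, sr(15) the oldest.
  dout <= sr(to_integer(unsigned(depth)));
end architecture rtl;
```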
Shift Register
[Figure: two parallel data paths, each labeled 64. Operation A (4 cycles) followed by Operation B (8 cycles) takes 12 cycles; Operation C takes 3 cycles, leaving a 9-cycle imbalance]
• Register-rich FPGA
• Allows for addition of pipeline stages to increase throughput
• Data paths must be balanced to keep desired functionality
Carry & Control Logic
[Figure: SLICE diagram with two look-up tables (inputs G1 to G4 and F1 to F4, output O), each followed by a Carry & Control Logic block and a storage element (D, Q, CK, EC, S, R); carry chain from CIN to COUT; other signals: Y, YB, X, XB, F5IN, BY, SR, CLK, CE]
Fast Carry Logic
• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
• Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources
[Figure: carry logic routing running through the CLBs from LSB to MSB]
Accessing Carry Logic
All major synthesis tools can infer carry logic for arithmetic functions
• Addition (SUM <= A + B)
• Subtraction (DIFF <= A - B)
• Comparators (if A < B then…)
• Counters (count <= count + 1)
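The four constructs listed above can be written as ordinary VHDL arithmetic, leaving the mapping onto carry logic to the synthesis tool. The sketch below is illustrative: the entity name arith_sketch, the 8-bit width, and the port names are assumptions, and it uses ieee.numeric_std rather than the std_logic_unsigned package that appears elsewhere in this lecture.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: arithmetic written so a tool can infer carry chains.
entity arith_sketch is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;             -- synchronous reset for the counter
    a, b   : in  unsigned(7 downto 0);
    sum    : out unsigned(7 downto 0);
    diff   : out unsigned(7 downto 0);
    a_lt_b : out std_logic;
    count  : out unsigned(7 downto 0)
  );
end entity arith_sketch;

architecture rtl of arith_sketch is
  signal count_reg : unsigned(7 downto 0) := (others => '0');
begin
  sum    <= a + b;                      -- addition (result wraps at 8 bits)
  diff   <= a - b;                      -- subtraction
  a_lt_b <= '1' when a < b else '0';    -- comparator

  -- Counter: incremented on every rising clock edge.
  counter_proc : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count_reg <= (others => '0');
      else
        count_reg <= count_reg + 1;
      end if;
    end if;
  end process counter_proc;

  count <= count_reg;
end architecture rtl;
```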
Input/Output Blocks (IOBs)
Basic I/O Block Structure
[Figure: I/O block with three registered paths, each using a D flip-flop (D, Q, EC, SR): the three-state path (Three-State, Three-State FF Enable, Three-State Control), the output path (Output, Output FF Enable), and the input path (Direct Input, Registered Input, Input FF Enable); Clock and Set/Reset signals]
IOB Functionality
• IOB provides interface between the
package pins and CLBs
• Each IOB can work as uni- or bi-directional
I/O
• Outputs can be forced into High Impedance
• Inputs and outputs can be registered
• advised for high-performance I/O
• Inputs can be delayed
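The bullets above (bi-directional operation, high impedance, registered inputs and outputs) correspond to a common VHDL pattern for a bidirectional pin. The sketch below is illustrative only: the entity name iob_sketch and all port names are hypothetical, and whether the flip-flops are actually placed inside the IOB depends on the tool and its options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: registered, three-state, bidirectional pin.
entity iob_sketch is
  port (
    clk      : in    std_logic;
    oe       : in    std_logic;  -- '1' drives the pad, '0' releases it
    data_out : in    std_logic;  -- value to drive onto the pad
    data_in  : out   std_logic;  -- registered value read from the pad
    pad      : inout std_logic   -- package pin
  );
end entity iob_sketch;

architecture rtl of iob_sketch is
  signal out_reg, oe_reg, in_reg : std_logic := '0';
begin
  -- Register the output, the three-state control, and the input.
  reg_proc : process (clk)
  begin
    if rising_edge(clk) then
      out_reg <= data_out;
      oe_reg  <= oe;
      in_reg  <= pad;
    end if;
  end process reg_proc;

  -- Three-state driver: high impedance when the output is disabled.
  pad     <= out_reg when oe_reg = '1' else 'Z';
  data_in <= in_reg;
end architecture rtl;
```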
Other Components of Spartan 3 FPGAs
RAM Blocks and Multipliers in Xilinx
FPGAs
RAM blocks
Multipliers
Logic blocks
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
A simple clock tree
Clock
tree
Flip-flops
Special clock
pin and pad
Clock signal from
outside world
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Digital Clock Manager (DCM)
Clock signal from
outside world
Clock
Manager
etc.
Daughter clocks
used to drive
internal clock trees
or output pins
Special clock
pin and pad
The Design Warrior’s Guide to FPGAs
Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Spartan-3 Family Attributes
Spartan-3 FPGA Family Members
FPGA Nomenclature
FPGA device present on the RC10 board
XC3S1500-4FG320
• XC3S: Spartan 3 family
• 1500: 1500 k = 1.5 M equivalent logic gates
• -4: speed grade (-4 = standard performance)
• FG320: package type, 320 pins
Celoxica RC10 FPGA Board
FPGA Design Flow
Design flow (1)
Specification (Lab Experiments)
Design and implement a simple unit that speeds up encryption with an RC5-similar cipher with a fixed key set on the 8031 microcontroller. Unlike in Experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..
VHDL description (Your Source Files)
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity RC5_core is
    port(
        clock, reset, encr_decr: in std_logic;
        data_input: in std_logic_vector(31 downto 0);
        data_output: out std_logic_vector(31 downto 0);
        out_full: in std_logic;
        key_input: in std_logic_vector(31 downto 0);
        key_read: out std_logic
    );
end RC5_core;
Functional simulation
Synthesis
Post-synthesis simulation
Design flow (2)
Implementation
Timing simulation
Configuration
On chip testing
Tools used in FPGA Design Flow
• Design: VHDL code
• Functionally verified VHDL code is the input to synthesis
• Synthesis (Xilinx XST or Synplicity Synplify Pro): produces the netlist
• Implementation (Xilinx ISE): produces the bitstream
Synthesis
Synthesis Tools
Synplify Pro
Xilinx XST
… and others
Logic Synthesis
VHDL description
Circuit netlist
architecture MLU_DATAFLOW of MLU is
    signal A1: STD_LOGIC;
    signal B1: STD_LOGIC;
    signal Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
    A1 <= A when (NEG_A='0') else
          not A;
    B1 <= B when (NEG_B='0') else
          not B;
    Y  <= Y1 when (NEG_Y='0') else
          not Y1;
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;
    with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
end MLU_DATAFLOW;
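The architecture above refers to an entity MLU that is not shown on the slide. A matching declaration might look like the sketch below; the port list is an assumption inferred from the signal names used in MLU_DATAFLOW, with every port taken to be a single std_logic bit.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed entity declaration for the MLU_DATAFLOW architecture;
-- the ports are inferred from the names used in the architecture body.
entity MLU is
  port (
    NEG_A : in  std_logic;  -- '1' inverts operand A
    NEG_B : in  std_logic;  -- '1' inverts operand B
    NEG_Y : in  std_logic;  -- '1' inverts the result
    A     : in  std_logic;
    B     : in  std_logic;
    L1    : in  std_logic;  -- operation select, high bit
    L0    : in  std_logic;  -- operation select, low bit
    Y     : out std_logic
  );
end MLU;
```

Some VHDL-93 tools reject the selector expression (L1 & L0) because the type of the concatenation is ambiguous; assigning it first to an intermediate std_logic_vector(1 downto 0) signal is a common workaround.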
Circuit netlist (RTL view)
Mapping
LUT0
LUT4
LUT1
FF1
LUT5
LUT2
FF2
LUT3
RTL view in Synplify Pro
General logic structures (for example comparator, incrementer, MUX) can be recognized in RTL view.
Crossprobing between RTL view and code
Each port, net or block can be chosen by mouse click from the browser or directly from the RTL View.
Double-clicking on an element shows its source code.
Reverse crossprobing is also possible: if a section of code is marked, the corresponding element of the RTL View is marked too.
Technology View in Synplify Pro
Technology view is a mapped RTL view. It can be opened by pressing the corresponding toolbar button [icon not shown] or by double-clicking the “.srm” file.
As in the case of “RTL View”, the same buttons [icons not shown] can be used here.
Two additional buttons are enabled:
- show critical path
- open timing analyst
Pay attention: technology view is usually large and presented on a number of sheets.
Technology view is presented using device primitives.
[Figure label: ports, nets and blocks browser]
Viewing critical path
Critical path can be viewed by pressing the "show critical path" button [icon not shown].
Delay values are written near each component of the path.
Timing Analyst
Timing analyst is opened by pressing the "open timing analyst" button [icon not shown].
Timing analyst makes it possible to analyze different paths in the design.
Timing analyst can be opened only from Technology View.
Implementation
• After synthesis the entire implementation
process is performed by FPGA vendor
tools
Translation
• Synthesis produces the circuit netlist in EDIF (Electronic Design Interchange Format), accompanied by an NCF (Native Constraint File)
• Timing constraints, entered with the Constraint Editor or a text editor, are stored in a UCF (User Constraint File)
• Translation takes these files and produces the NGD (Native Generic Database) file
Pin Assignment
[Figure: ports of the design LAB2 (CLOCK, CONTROL(0) to CONTROL(2), RESET, SEGMENTS(0) to SEGMENTS(6)) assigned to FPGA package pins (H3, K2, G5, B10, P10, H2, H6, H5, K3, H1, K4, G4)]
Mapping
LUT0
LUT4
LUT1
FF1
LUT5
LUT2
FF2
LUT3
Placing
FPGA
CLB SLICES
Routing
FPGA
Programmable Connections
Configuration
• Once a design is implemented, you must create a
file that the FPGA can understand
• This file is called a bitstream: a BIT file (.bit extension)
• The BIT file can be downloaded directly to the
FPGA, or can be converted into a PROM file
which stores the programming information
Two main stages of the
FPGA Design Flow
Synthesis
• RTL Synthesis (technology independent)
- Code analysis
- Derivation of main logic constructions
- Technology independent optimization
- Creation of “RTL View”
• Map (technology dependent)
- Mapping of extracted logic structures to device primitives
- Technology dependent optimization
- Application of “synthesis constraints”
- Netlist generation
- Creation of “Technology View”
Implementation
• Place & Route
- Placement of generated netlist onto the device
- Choosing best interconnect structure for the placed design
- Application of “physical constraints”
• Configure
- Bitstream generation
- Burning device
Report files
Map report header
Release 8.1i Map I.24
Xilinx Mapping Report File for Design 'Lab3Demo'
Design Information
------------------
Command Line   : c:\Xilinx\bin\nt\map.exe -p 3S1500FG320-4 -o map.ncd -pr b -k 4 -cm area -c 100 Lab3Demo.ngd Lab3Demo.pcf
Target Device  : xc3s1500
Target Package : fg320
Target Speed   : -4
Mapper Version : spartan3 -- $Revision: 1.34 $
Mapped Date    : Tue Feb 13 17:04:54 2007
Map report
Design Summary
--------------
Number of errors:   0
Number of warnings: 0
Logic Utilization:
  Number of Slice Flip Flops:   30 out of 26,624   1%
  Number of 4 input LUTs:       38 out of 26,624   1%
Logic Distribution:
  Number of occupied Slices:    33 out of 13,312   1%
    Number of Slices containing only related logic:  33 out of 33  100%
    Number of Slices containing unrelated logic:      0 out of 33    0%
      *See NOTES below for an explanation of the effects of unrelated logic
Total Number 4 input LUTs:      62 out of 26,624   1%
  Number used as logic:         38
  Number used as a route-thru:  24
Number of bonded IOBs:          10 out of 221      4%
  IOB Flip Flops:               7
Number of GCLKs:                 1 out of 8       12%
Place & route report
Asterisk (*) preceding a constraint indicates it was not met.
This may be due to a setup or hold violation.
--------------------------------------------------------------------------------------------------------------
Constraint                                                          | Requested | Actual  | Logic  | Absolute | Number of
                                                                    |           |         | Levels | Slack    | errors
--------------------------------------------------------------------------------------------------------------
* TS_CLOCK = PERIOD TIMEGRP "CLOCK" 5 ns HIGH 50%                   | 5.000ns   | 5.140ns | 4      | -0.140ns | 5
--------------------------------------------------------------------------------------------------------------
TS_gen1Hz_Clock1Hz = PERIOD TIMEGRP "gen1Hz_Clock1Hz" 5 ns HIGH 50% | 5.000ns   | 4.137ns | 2      | 0.863ns  | 0
--------------------------------------------------------------------------------------------------------------
Post layout timing report
Clock to Setup on destination clock CLOCK
---------------+---------+---------+---------+---------+
               | Src:Rise| Src:Fall| Src:Rise| Src:Fall|
Source Clock   |Dest:Rise|Dest:Rise|Dest:Fall|Dest:Fall|
---------------+---------+---------+---------+---------+
CLOCK          |    5.140|         |         |         |
---------------+---------+---------+---------+---------+
Timing summary:
---------------
Timing errors: 9  Score: 543
Constraints cover 574 paths, 0 nets, and 187 connections
Design statistics:
   Minimum period:   5.140ns (Maximum frequency: 194.553MHz)
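The numbers in the timing reports are related by simple arithmetic. As a worked check, using only values printed in the reports (the symbol names are chosen here for illustration): the maximum frequency is the reciprocal of the minimum period, and the slack of the CLOCK constraint is the requested period minus the actual period.

```latex
f_{\max} = \frac{1}{T_{\min}} = \frac{1}{5.140\ \text{ns}} \approx 194.553\ \text{MHz}
\qquad
\text{slack} = T_{\text{requested}} - T_{\text{actual}} = 5.000\ \text{ns} - 5.140\ \text{ns} = -0.140\ \text{ns}
```

The negative slack is why the TS_CLOCK constraint is marked with an asterisk: the design would need a clock period of at least 5.140 ns to meet timing.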