ECE 448 Lecture 20 ASIC Front-End Design ECE 448 – FPGA and ASIC Design with VHDL George Mason University.

ECE 448
Lecture 20
ASIC Front-End Design
ECE 448 – FPGA and ASIC Design with VHDL
George Mason University
Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves
FPGAs vs. ASICs
ASICs
• High performance
• Low power
• Low cost (but only in high volumes)
FPGAs
• Off-the-shelf
• Low development costs
• Short time to the market
• Reconfigurability
ASIC Design Example – Factoring circuit/GMU
[Figure: factoring circuit, with Global Memory and Local Memory labeled]
ASIC 130 nm vs. Virtex II 6000
Factoring/GMU
Area of Xilinx Virtex II 6000 FPGA: 19.68 mm x 19.80 mm (estimation by R.J. Lim Fong, MS Thesis, VPI, 2004)
Area of an ASIC with equivalent functionality: 2.7 mm x 2.82 mm
Area ratio: 51x
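The 51x figure can be checked directly from the die dimensions quoted on this slide. The following is a minimal sketch in Python; the helper name `area_mm2` is an illustrative choice, and the only inputs are the four dimensions printed above.

```python
def area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular die in square millimetres."""
    return width_mm * height_mm


# Dimensions as quoted on the slide.
fpga_area = area_mm2(19.68, 19.80)  # estimated Virtex II 6000 die
asic_area = area_mm2(2.7, 2.82)     # 130 nm ASIC with equivalent functionality
ratio = fpga_area / asic_area

print(f"FPGA area: {fpga_area:.2f} mm^2")
print(f"ASIC area: {asic_area:.2f} mm^2")
print(f"Area ratio: {ratio:.0f}x")
```

Rounded to the nearest integer, the ratio agrees with the 51x shown on the slide.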
ASICs vs. FPGAs
Source: I. Kuon and J. Rose (University of Toronto), "Measuring the Gap Between FPGAs and ASICs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, Feb. 2007.
ASICs vs. FPGAs
23 representative circuits implemented using FPGAs and ASICs
- computer arithmetic (booth, cordic18, cordic8, etc.)
- digital signal processing (rs_encoder, fir3, fir24, etc.)
- communications (ethernet, mac1, atm, etc.)
- cryptography (des_area, des_perf, aes, aes192, etc.)
- scientific computations (molecular, raytracer, etc.)
Simplified ASIC Design Flow
Front-End Design
• Synthesis
• Timing Analysis
Back-End Design
• Floorplanning
• Placement
• Clock Tree Synthesis
• Routing
• Design for Manufacturing
Major ASIC Toolsets
Cadence
Magma
Simplified ASIC Design Flow
Front-End Design
• Synthesis (Synopsys Design Compiler)
• Timing Analysis (Synopsys PrimeTime)
Back-End Design (Synopsys Astro)
• Floorplanning
• Placement
• Clock Tree Synthesis
• Routing
• Design for Manufacturing
A Complete Placed and Routed Chip
[Figure: a complete placed and routed chip, with an IP block labeled]
What is “Physical Layout”?
[Figure: the same circuit in transistor (device) view and in physical (layout) view, with VDD, GND, IN, OUT, PMOS and NMOS labeled]
Physical Layout – Topography of devices and interconnects, made up of polygons that represent different layers of material (diffusion, polysilicon, metal, contact, etc.)
Process of Device Fabrication
• Devices are fabricated vertically on a silicon substrate wafer by layering different materials in specific locations and shapes on top of each other
• Each of many process masks defines the shapes and locations of a specific layer of material (diffusion, polysilicon, metal, contact, etc.)
• Mask shapes, derived from the layout view, are transformed to silicon via photolithographic and chemical processes
[Figure: layout or mask (aerial) view next to the wafer (cross-sectional) view of the silicon substrate]
Wafer Representation of Layout Polygons
[Figure: aerial (layout) view and wafer cross-sectional view, with a 0.25 um dimension and PMOS, NMOS, Input, Output, VDD and GND labeled]
Front-End Design Flow
Simplified RTL Synthesis
• Write RTL HDL code
• Simulate the HDL; if the result is not OK, revise the code and simulate again
• Synthesize the RTL code to gates, producing a gate-level netlist
• Check whether the constraints are met; if not, iterate
• Run gate-level testing; if the result is not OK, iterate
• Proceed with back-end processing
VHDL vs. Verilog
VHDL                    Verilog
Government developed    Commercially developed
Ada based               C based
Strongly typed          Mildly typed
Difficult to learn      Easy to learn
More powerful           Less powerful
Logic Synthesis
VHDL description
Circuit netlist
architecture MLU_DATAFLOW of MLU is
    signal A1: STD_LOGIC;
    signal B1: STD_LOGIC;
    signal Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
    A1 <= A when (NEG_A = '0') else
          not A;
    B1 <= B when (NEG_B = '0') else
          not B;
    Y  <= Y1 when (NEG_Y = '0') else
          not Y1;
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;
    with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
end MLU_DATAFLOW;
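When comparing a synthesized netlist against the RTL, it can help to have a small software reference model of the same function. The sketch below is one possible Python model of the MLU dataflow description above, restricted to the values 0 and 1 (the other STD_LOGIC values are not modeled); the function name `mlu` and its argument order are illustrative assumptions, not part of the original design files.

```python
def mlu(a: int, b: int, neg_a: int, neg_b: int, neg_y: int, l1: int, l0: int) -> int:
    """Single-bit reference model of the MLU; every argument is 0 or 1."""
    a1 = a ^ neg_a            # A1 <= A when NEG_A = '0' else not A
    b1 = b ^ neg_b            # B1 <= B when NEG_B = '0' else not B
    select = (l1 << 1) | l0   # L1 & L0 read as a 2-bit selector
    if select == 0b00:
        y1 = a1 & b1          # MUX_0: and
    elif select == 0b01:
        y1 = a1 | b1          # MUX_1: or
    elif select == 0b10:
        y1 = a1 ^ b1          # MUX_2: xor
    else:
        y1 = 1 - (a1 ^ b1)    # MUX_3: xnor
    return y1 ^ neg_y         # Y <= Y1 when NEG_Y = '0' else not Y1


# Example: with NEG_Y = 1 and select "00" the unit behaves as a NAND gate.
nand_table = [mlu(a, b, 0, 0, 1, 0, 0) for a in (0, 1) for b in (0, 1)]
print(nand_table)
```

Sweeping all 128 input combinations of such a model gives a truth table that a gate-level simulation could be checked against.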
Basic Synthesis Flow
Synthesis using Design Compiler
Script Language: TCL – Tool Command Language
• Created by John Ousterhout of UC Berkeley
• Scripting language
  • Makes it very simple to automate routine tasks
• Extension language
  • Used to customize tools with user/company-specific applications
• Nearly all modern EDA tools have a TCL interface
• Very simple to learn and use
TCL References
• Practical Programming in Tcl and Tk, by Brent B. Welch and Ken Jones
• Tcl/Tk in a Nutshell, by Paul Raines and Jeff Tranter
Synthesis script (1)
designer = "Pawel Chodowiec"
company = "George Mason University"
search_path = "./opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a"
link_library = "* tcb013ghptc.db" /* Typical case library */
target_library = "tcb013ghptc.db"
symbol_library = "tcb013ghp.sdb"
/* Directory configuration */
src_directory = ~/exam1/vhdl/
report_directory = ~/exam1/reports/
db_directory = ~/exam1/db/
Synthesis script (2)
/* Packages can only be read */
read_file -format vhdl -rtl src_directory + "components.vhd"
blocks = {regne, upcount, RAM_16Xn_DISTRIBUTED, exam1}
foreach (block, blocks) {
    block_source = src_directory + block + ".vhd"
    read_file -format vhdl -rtl block_source
    analyze -format vhdl -lib WORK block_source
}
current_design block
/* All commands now apply to the entity "exam1" */
Synthesis script (3)
uniquify
/* Creates unique instances of multiply referenced entities */
link
check_design
/* Checks the current design for consistency */
/*******************************************/
/* apply block attributes and constraints */
/*******************************************/
create_clock -period 10 clk
/* Defines the port "clk" of the current design as the clock for the design.
   Period = 10 ns, 50% duty cycle.
   Use the -waveform option to define a duty cycle other than 50%. */
set_operating_conditions NCCOM
/* Normal Case Commercial Operating Conditions */
Synthesis script (4)
/***************************************************/
/* Apply these constraints to the top-level entity*/
/***************************************************/
set_max_fanout 100 block
set_clock_latency 0.1 find(clock, "clk")
set_clock_transition 0.01 find(clock, "clk")
set_clock_uncertainty -setup 0.1 find(clock, "clk")
set_clock_uncertainty -hold 0.1 find(clock, "clk")
set_load 0 all_outputs()
set_input_delay 1.0 -clock clk -max all_inputs()
set_output_delay -max 1.0 -clock clk all_outputs()
set_wire_load_model -library tcb013ghptc -name "TSMC8K_Fsg_Conservative"
Wireload model basics (1)
Wireload model basics (2)
Synthesis script (5)
set_dont_touch block
compile -map_effort medium
change_names -rules vhdl
vhdlout_architecture_name = "sort_syn"
vhdlout_use_packages = {"IEEE.std_logic_1164"}
write -f db -hierarchy -output db_directory + "exam1.db"
/*write -f vhdl -hierarchy -output db_directory + "exam1_syn.vhd"*/
report -area > report_directory + "exam1.report_area"
report -timing -all > report_directory + "exam1.report_timing"
Results of synthesis
Area report after synthesis (1)
report_area
Information: Updating design information... (UID-85)
****************************************
Report : area
Design : exam1
Version: V-2003.12-SP1
Date   : Tue Nov 15 20:39:06 2005
****************************************
Library(s) Used:
    tcb013ghptc (File: /opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a/tcb013ghptc.db)
Area report after synthesis (2)
Number of ports:          75
Number of nets:           346
Number of cells:          107
Number of references:     28
Combinational area:       10593.477539
Noncombinational area:    14295.521484
Net Interconnect area:    undefined (Wire load has zero net area)
Total cell area:          24888.976562
Total area:               undefined
Critical Path (1)
• Critical Path – the longest path from outputs of registers to inputs of registers
[Figure: a register (D/Q) driving combinational logic with delay t_logic into a second register (D/Q); signals in, out and clk]
t_Critical = t_FF-P + t_logic + t_FF-setup
Critical Path (2)
• Min. Clock Period = Length of the Critical Path
• Max. Clock Frequency = 1 / Min. Clock Period
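The two relations above, combined with the critical path formula tCritical = tFF-P + tlogic + tFF-setup, can be turned into a short calculation. This is a minimal sketch in Python; the function names and the three delay values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def min_clock_period_ns(t_ff_p: float, t_logic: float, t_ff_setup: float) -> float:
    """Critical path length: flip-flop propagation + logic delay + setup time (ns)."""
    return t_ff_p + t_logic + t_ff_setup


def max_clock_frequency_mhz(period_ns: float) -> float:
    """Maximum clock frequency in MHz for a clock period given in ns."""
    return 1000.0 / period_ns


# Hypothetical delays in ns, not taken from any report in this lecture.
period = min_clock_period_ns(t_ff_p=0.25, t_logic=4.5, t_ff_setup=0.25)
f_max = max_clock_frequency_mhz(period)
print(f"min period = {period} ns, max frequency = {f_max} MHz")
```

With these assumed delays the critical path is 5 ns long, which corresponds to a 200 MHz maximum clock frequency.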
Clock Jitter
• The rising edge of the clock does not occur precisely periodically
• May cause faults in the circuit
[Figure: clk waveform]
Clock Skew
• The rising edge of the clock does not arrive at the clock inputs of all flip-flops at the same time
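A small numeric model can show how skew shifts the setup and hold checks on a register-to-register path. The sketch below is written in Python under one common sign convention, which is an assumption here: skew is the clock arrival time at the capturing flip-flop minus the arrival time at the launching flip-flop. All names and delay values are hypothetical.

```python
def min_period_with_skew(t_cq: float, t_logic_max: float, t_setup: float, skew: float) -> float:
    """Smallest clock period that still satisfies the setup check (ns).

    Positive skew (capture clock arrives later) relaxes the setup check.
    """
    return t_cq + t_logic_max + t_setup - skew


def hold_slack_with_skew(t_cq: float, t_logic_min: float, t_hold: float, skew: float) -> float:
    """Hold slack (ns); a negative value indicates a hold violation.

    Positive skew (capture clock arrives later) tightens the hold check.
    """
    return t_cq + t_logic_min - t_hold - skew


# Hypothetical delays in ns.
for skew in (-0.5, 0.0, 0.5):
    period = min_period_with_skew(0.25, 4.5, 0.25, skew)
    hold = hold_slack_with_skew(0.25, 0.125, 0.125, skew)
    print(f"skew={skew:+.2f}  min period={period:.2f}  hold slack={hold:+.2f}")
```

In this model a late capture clock shortens the minimum period but can create a hold violation, which is one reason skew has to be accounted for in both checks.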
[Figure: register-to-register circuits with delay elements; labels in, out, clk, D, Q and delay]
Timing report after synthesis (1)
****************************************
Report : timing
-path full
-delay max
-max_paths 1
Design : exam1
Version: V-2003.12-SP1
Date : Tue Nov 15 20:39:06 2005
****************************************
Operating Conditions: NCCOM Library: tcb013ghptc
Wire Load Model Mode: segmented
Timing report after synthesis (2)
Startpoint: in_addr(1) (input port clocked by clk)
Endpoint: RegSUM/Q_reg[34]
          (rising edge-triggered flip-flop clocked by clk)
Path Group: clk
Path Type: max

Des/Clust/Port          Wire Load Model            Library
------------------------------------------------------------
exam1                   TSMC8K_Fsg_Conservative    tcb013ghptc
RAM_16Xn_DISTRIBUTED    ZeroWireload               tcb013ghptc
exam1_DW01_cmp2_32_0    ZeroWireload               tcb013ghptc
exam1_DW01_cmp2_32_1    ZeroWireload               tcb013ghptc
exam1_DW01_add_35_0     ZeroWireload               tcb013ghptc
regne_1                 ZeroWireload               tcb013ghptc
regne_2                 ZeroWireload               tcb013ghptc
regne_n35               ZeroWireload               tcb013ghptc
Timing report after synthesis (3)
Point                                        Incr     Path
-------------------------------------------------------------
clock clk (rise edge)                        0.00     0.00
clock network delay (ideal)                  0.10     0.10
input external delay                         1.00     1.10 f
in_addr(1) (in)                              0.00     1.10 f
U98/Z (CKMUX2D1)                             0.13     1.23 f
Memory/ADDR[1] (RAM_16Xn_DISTRIBUTED)        0.00     1.23 f
Memory/U41/ZN (INVD1)                        0.08     1.31 r
Memory/U343/Z (OR3D1)                        0.10     1.41 r
Memory/U338/ZN (INVD2)                       0.20     1.61 f
Memory/U40/ZN (MOAI22D0)                     0.17     1.78 f
Memory/U350/Z (OR4D1)                        0.26     2.03 f
Memory/DATA_OUT[0] (RAM_16Xn_DISTRIBUTED)    0.00     2.03 f
Timing report after synthesis (4)
add_96xplusxplus/B[0] (exam1_DW01_add_35_0)    0.00     2.03 f
add_96xplusxplus/U9/Z (AN2D0)                  0.12     2.15 f
add_96xplusxplus/U1_1/CO (CMPE32D1)            0.10     2.25 f
add_96xplusxplus/U1_2/CO (CMPE32D1)            0.10     2.34 f
add_96xplusxplus/U1_3/CO (CMPE32D1)            0.10     2.44 f
add_96xplusxplus/U1_4/CO (CMPE32D1)            0.10     2.54 f
add_96xplusxplus/U1_5/CO (CMPE32D1)            0.10     2.63 f
add_96xplusxplus/U1_6/CO (CMPE32D1)            0.10     2.73 f
add_96xplusxplus/U1_7/CO (CMPE32D1)            0.10     2.82 f
add_96xplusxplus/U1_8/CO (CMPE32D1)            0.10     2.92 f
add_96xplusxplus/U1_9/CO (CMPE32D1)            0.10     3.02 f
add_96xplusxplus/U1_10/CO (CMPE32D1)           0.10     3.11 f
add_96xplusxplus/U1_11/CO (CMPE32D1)           0.10     3.21 f
add_96xplusxplus/U1_12/CO (CMPE32D1)           0.10     3.31 f
add_96xplusxplus/U1_13/CO (CMPE32D1)           0.10     3.40 f
add_96xplusxplus/U1_14/CO (CMPE32D1)           0.10     3.50 f
Timing report after synthesis (5)
add_96xplusxplus/U1_15/CO (CMPE32D1)           0.10     3.60 f
add_96xplusxplus/U1_16/CO (CMPE32D1)           0.10     3.69 f
add_96xplusxplus/U1_17/CO (CMPE32D1)           0.10     3.79 f
add_96xplusxplus/U1_18/CO (CMPE32D1)           0.10     3.88 f
add_96xplusxplus/U1_19/CO (CMPE32D1)           0.10     3.98 f
add_96xplusxplus/U1_20/CO (CMPE32D1)           0.10     4.08 f
add_96xplusxplus/U1_21/CO (CMPE32D1)           0.10     4.17 f
add_96xplusxplus/U1_22/CO (CMPE32D1)           0.10     4.27 f
add_96xplusxplus/U1_23/CO (CMPE32D1)           0.10     4.37 f
add_96xplusxplus/U1_24/CO (CMPE32D1)           0.10     4.46 f
add_96xplusxplus/U1_25/CO (CMPE32D1)           0.10     4.56 f
add_96xplusxplus/U1_26/CO (CMPE32D1)           0.10     4.66 f
add_96xplusxplus/U1_27/CO (CMPE32D1)           0.10     4.75 f
add_96xplusxplus/U1_28/CO (CMPE32D1)           0.10     4.85 f
add_96xplusxplus/U1_29/CO (CMPE32D1)           0.10     4.94 f
add_96xplusxplus/U1_30/CO (CMPE32D1)           0.10     5.04 f
add_96xplusxplus/U1_31/CO (CMPE32D1)           0.10     5.14 f
Timing report after synthesis (6)
add_96xplusxplus/U7/Z (AN2D0)                     0.10     5.24 f
add_96xplusxplus/U5/Z (AN2D0)                     0.08     5.32 f
add_96xplusxplus/U4/Z (CKXOR2D0)                  0.15     5.47 f
add_96xplusxplus/SUM[34] (exam1_DW01_add_35_0)    0.00     5.47 f
RegSUM/R[34] (regne_n35)                          0.00     5.47 f
RegSUM/U32/Z (AO21D0)                             0.11     5.57 f
RegSUM/Q_reg[34]/D (EDFQD1)                       0.00     5.57 f
data arrival time                                          5.57
Timing report after synthesis (7)
clock clk (rise edge)             10.00    10.00
clock network delay (ideal)        0.10    10.10
clock uncertainty                 -0.10    10.00
RegSUM/Q_reg[34]/CP (EDFQD1)       0.00    10.00 r
library setup time                -0.12     9.88
data required time                          9.88
-------------------------------------------------
data required time                          9.88
data arrival time                          -5.57
-------------------------------------------------
slack (MET)                                 4.31
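The bottom of the timing report can be reproduced by hand from the numbers it lists. The sketch below, in Python, recomputes the data required time and the slack from the report values; the variable names are illustrative, and the MET/VIOLATED labelling is a simplified stand-in for what the tool prints.

```python
# Values copied from the timing report above (all in ns).
clock_period = 10.00          # clock clk (rise edge) at the capturing flip-flop
clock_network_delay = 0.10    # ideal clock network delay
clock_uncertainty = 0.10      # setup uncertainty, subtracted from the required time
library_setup_time = 0.12     # setup time of the EDFQD1 flip-flop
data_arrival_time = 5.57      # arrival at RegSUM/Q_reg[34]/D

data_required_time = (clock_period + clock_network_delay
                      - clock_uncertainty - library_setup_time)
slack = data_required_time - data_arrival_time
status = "MET" if slack >= 0 else "VIOLATED"

print(f"data required time  {data_required_time:.2f}")
print(f"data arrival time  -{data_arrival_time:.2f}")
print(f"slack ({status})  {slack:.2f}")
```

A positive slack of 4.31 ns means the path meets the 10 ns clock constraint with margin to spare.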
Static Timing Analysis
Static Timing Analysis Review
• Tools calculate all paths from each sequential start point to each sequential end point.
• The worst-case (slowest) path is used for setup analysis, and the best-case (fastest) path is used for hold analysis.
• All paths are considered for design rule checking.
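The idea of finding the slowest and fastest path between two sequential points can be illustrated on a tiny timing graph. The following Python sketch is not how any particular tool is implemented; it assumes an acyclic graph in which every node eventually reaches the end point, and the node names and delays are hypothetical.

```python
# Hypothetical timing graph: node -> list of (successor, delay in ns).
GRAPH = {
    "reg1/Q": [("U1", 0.25), ("U2", 0.5)],
    "U1": [("U3", 0.25)],
    "U2": [("U3", 0.25), ("reg2/D", 1.0)],
    "U3": [("reg2/D", 0.25)],
}


def path_delays(graph, start, end):
    """Return (shortest, longest) total delay over all paths from start to end."""
    memo = {}

    def visit(node):
        if node == end:
            return 0.0, 0.0
        if node not in memo:
            # Combine each outgoing edge delay with the best and worst
            # delays already known for the successor node.
            results = [(delay + visit(succ)[0], delay + visit(succ)[1])
                       for succ, delay in graph[node]]
            memo[node] = (min(r[0] for r in results), max(r[1] for r in results))
        return memo[node]

    return visit(start)


shortest, longest = path_delays(GRAPH, "reg1/Q", "reg2/D")
print(f"hold analysis uses the fastest path: {shortest} ns")
print(f"setup analysis uses the slowest path: {longest} ns")
```

Memoizing the result per node keeps the work proportional to the number of edges instead of the number of distinct paths, which is what makes this kind of analysis practical on large netlists.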
Review of Setup and Hold Checks
False and Multicycle paths
• False path
  • Very slow signals that are not used under normal conditions, such as reset or test mode enable, are classified as false paths
• Multicycle path
  • Paths that take more than one clock cycle are known as multicycle paths
  • You have to define the multicycle paths in the analyzer; the tool takes those constraints into account when synthesizing
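The effect of declaring a multicycle path can be seen in the setup check: the required time moves out by whole clock periods. The sketch below is a simplified Python model under stated assumptions; the default clock values are borrowed from the timing report earlier in this lecture, the 14 ns arrival time is hypothetical, and the function name `setup_slack` is illustrative. In Synopsys tools such exceptions are typically declared with the `set_multicycle_path` and `set_false_path` commands.

```python
def setup_slack(arrival_ns: float, cycles: int = 1, period_ns: float = 10.0,
                network_delay_ns: float = 0.10, uncertainty_ns: float = 0.10,
                setup_ns: float = 0.12) -> float:
    """Setup slack (ns) for a path allowed `cycles` clock periods to settle."""
    required = cycles * period_ns + network_delay_ns - uncertainty_ns - setup_ns
    return required - arrival_ns


# Hypothetical slow path whose data arrives 14 ns after the launching edge.
single_cycle = setup_slack(14.0, cycles=1)
two_cycle = setup_slack(14.0, cycles=2)
print(f"checked as a single-cycle path: slack = {single_cycle:.2f} ns")
print(f"declared as a 2-cycle path:     slack = {two_cycle:.2f} ns")
```

Without the exception the tool would report a violation and spend effort speeding up logic that, by design, has two cycles to settle.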
Multicycle path - Example
Optimization criteria
Degrees of freedom and possible trade-offs
speed
area
power
testability
Degrees of freedom and possible trade-offs
speed
latency
area
throughput
VHDL Coding for Synthesis
Recommended rules for Synthesis
• When implementing combinational paths, do not use hierarchy
• Register all outputs
• Do not implement glue logic between blocks; partition them well
• Separate designs on functional boundaries
• Keep blocks to a reasonable size
Avoid hierarchical combinational blocks
[Figure: path from reg1 to reg2 passing through Combinatorial Logic1, Logic2 and Logic3, spread across Block A, Block B and Block C]
Not recommended design practice
The path between reg1 and reg2 is divided between three different blocks
Due to hierarchical boundaries, optimization of the combinational logic cannot be achieved
Synthesis tools (Synopsys) maintain the integrity of the I/O ports, so combinational optimization cannot be achieved between blocks (unless "grouping" is used)
Recommended way to handle Combinational Paths
[Figure: reg1, the combined Combinatorial Logic1 & Logic2 & Logic3, and reg2, in Block A and Block C]
Recommended practice
All the combinational circuitry is grouped in the same block that has its output connected to the destination flip-flop
It allows the optimal minimization of the combinational logic during synthesis
Allows simplified description of the timing interface
Register all outputs
[Figure: Block X and Block Y with registers reg1, reg2 and reg3]
Simplifies the synthesis design environment: inputs to each individual block arrive with the same relative delay (caused by wire delays)
There is no real need to specify output requirements, since paths start at flip-flop outputs
Take care of fanout: as a rule of thumb, keep the fanout to 16 (dependent on the technology and the components being driven by the output)
NO GLUE LOGIC between blocks
[Figure: Top level with Block X and Block Y, registers reg1 and reg3]
No glue logic between blocks, no matter what the temptation
Under time pressure, a bug may be found that could be fixed simply by adding some glue logic. RESIST THE TEMPTATION!
At this level in the hierarchy, this implementation will not allow the glue logic to be absorbed within any lower-level block.
Separate design with different goals
[Figure: Top level with a time-critical path driving reg1 and slow logic driving reg3]
reg1 may be driven by a time-critical function, hence will have different optimization constraints
reg3 may be driven by slow logic, hence no need to constrain it for speed
Optimization based on design requirements
[Figure: Top level with a speed-optimized block (time-critical path, reg1) and an area-optimized block (slow logic, reg3)]
• Use different entities to partition design blocks
• Allows different constraints during synthesis to optimize for area or speed or both
Separate FSM from random logic
• Separation of the FSM and the random logic allows you to use FSM-optimized synthesis
[Figure: Top level with an FSM block (reg1), for which an FSM optimization tool is used, and a random logic block (reg3), for which standard optimization techniques are used]
Maintain a reasonable block size
• Partition your design such that each block is between 1000 and 10000 gates (this is strictly tool and technology dependent)
• The larger the blocks, the longer the run time, so quick iterations cannot be done