Release by Aug-Sept - Computer Engineering

Flex-Cell Optimization
A Paradigm Shift in High-Performance Cell-Based Design
Slide 1
Dec 1, 2003
Copyright, 1999 - 2003 © Zenasis Technologies, Inc.
The Power-User Dilemma
[Figure: design styles plotted by Cost / TTM against Speed, Power, Area]
• ASIC/COT, FPGA: “Results aren’t good enough!”
  – Team=10, 400 MHz, 9 Months
• Custom: “Takes too long!”
  – Team=400, 3 GHz, 3 Years
• Flex-Cell Opt
  – Team=10, 520 MHz, 6 Months
Slide 2
The Timing Dilemma
• Design team clock target: 350 MHz
• Post-logic-synthesis / post-placement STA
  – Only 300 MHz – Problem!!
• Options
  – Design change
    • Rewrite RTL – Tapeout delay!!
  – Better technology
    • Smaller geometry – Tapeout delay and NRE cost!!
    • Low-k technology – Yield hit!!
  – Better tools
    • Flex-Cell Optimization
      – Custom-design benefits in std-cell flow
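The gap between the 350 MHz target and the 300 MHz STA result is easier to judge as a clock period and a slack. The short sketch below only converts the two frequencies quoted on the slide; the function and variable names are illustrative.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

target_mhz = 350.0    # design team clock target (from the slide)
achieved_mhz = 300.0  # post-synthesis / post-placement STA result (from the slide)

target_period = period_ns(target_mhz)
achieved_period = period_ns(achieved_mhz)

# Negative slack: the critical path is longer than the target period.
slack_ns = target_period - achieved_period

# Fraction of the critical-path delay that must be removed to hit the target.
required_delay_cut = 1.0 - achieved_mhz / target_mhz

print(f"target period   = {target_period:.3f} ns")
print(f"achieved period = {achieved_period:.3f} ns")
print(f"slack           = {slack_ns:.3f} ns")
print(f"required delay reduction = {required_delay_cut:.1%}")
```

Closing the gap therefore means removing roughly one seventh of the critical-path delay, which is the scale of improvement the rest of the deck targets.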
Slide 3
Root of the Problem
• Various past studies, including a special session at DAC 2000
• Std-cell based design shows “an order of magnitude” lower performance than custom, at the same process node
  – Architecture
  – Fixed cell library
  – Layout
• Fixed cell library can account for as much as 25% of the performance shortfall
Slide 4
Rich vs Smart
• Simply creating a “richer” cell library does not solve the problem
  – Too many cells hinder automated optimization
  – Missing design-specific context information
  – Well-known matching problems for larger cells
• Custom-crafted cells, for a specific design, can inject large timing gains late in the design cycle
• Compute-intensive process
  – Transistor netlist optimization
  – Cell layout creation
  – View generation
Slide 5
Flex-Cell Optimization -- Concept
[Figure: Flex-Cell Opt shown at the meeting point of the Logical Level, the Physical Level, and the Transistor Level]
Optimization at Gate, Transistor & Physical Levels
Slide 6
Prior Work
• Manual custom-crafting of cells is well established
  – Tactical cells: every high-performance design project uses some
• Automated transistor-level netlist creation/optimization
  – Fishburn, Dunlop (1985): TILOS, transistor sizing
  – Gavrilov et al. (1997): Library-less synthesis
  – Kanecko, Tian (1998): Concurrent cell generation and mapping of digital logic
  – Liu, Abraham (1999): Transistor-level synthesis of combinational logic
Slide 7
Flex-Cell Optimization Targets
• Eliminate deficiency due to fixed cell library
  – Boost performance by 15% - 25%
• Close aggressive timing in days
• Retain proven existing cell-based design flow
• Use high-yield process, still get performance
• Minimal increase in die-size or power
• Get custom-design performance from std-cell-based flow
Slide 8
Key Steps
• Post synthesis netlist
• STA
• Cluster formation
• Flex-cell (custom-crafted) creation
• Gate-level optimization
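The first three steps (netlist, STA, cluster formation) can be sketched on a toy netlist. This is a minimal illustration under stated assumptions, not the tool's algorithm: the netlist, the gate delays, the longest-path timing model, and the rule of grouping up to three consecutive critical-path gates are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical gate-level netlist: gate -> (delay in ns, list of fanin gates).
NETLIST = {
    "g1": (0.10, []),
    "g2": (0.12, []),
    "g3": (0.20, ["g1", "g2"]),
    "g4": (0.15, ["g3"]),
    "g5": (0.05, ["g2"]),
    "g6": (0.18, ["g4", "g5"]),
}

def arrival_times(netlist):
    """Longest-path arrival time at each gate output (a minimal STA)."""
    graph = {gate: fanins for gate, (_, fanins) in netlist.items()}
    arrival, worst_fanin = {}, {}
    for gate in TopologicalSorter(graph).static_order():
        delay, fanins = netlist[gate]
        pred = max(fanins, key=lambda f: arrival[f], default=None)
        worst_fanin[gate] = pred
        arrival[gate] = delay + (arrival[pred] if pred is not None else 0.0)
    return arrival, worst_fanin

def critical_path(arrival, worst_fanin):
    """Trace back from the latest-arriving gate along its worst fanins."""
    gate = max(arrival, key=arrival.get)
    path = []
    while gate is not None:
        path.append(gate)
        gate = worst_fanin[gate]
    return path[::-1]

def form_clusters(path, max_gates=3):
    """Group consecutive critical-path gates into flex-cell candidates."""
    return [path[i:i + max_gates] for i in range(0, len(path), max_gates)]

arrival, worst_fanin = arrival_times(NETLIST)
path = critical_path(arrival, worst_fanin)
clusters = form_clusters(path)
print("critical path:", " -> ".join(path))
print("clusters:", clusters)
```

Each cluster would then be handed to flex-cell creation, and the resulting cell swapped back into the netlist before gate-level optimization.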
Slide 9
[Figure: a gate-level cluster with inputs a, b, c, d (4 Cells, 22 Transistors, 9 Wires) collapsed into a single flex-cell (1 Cell, 13 Transistors, 6 Wires)]
Flex-Cell Optimization with Physicals
• Physically-aware STA
  – Placement aware
    • Congestion
    • Blockage
  – Multiple levels of accuracy for route info
    • Steiner estimates
    • Global route
    • Detailed route**
• Physically-driven optimization
  – Physically-aware clustering and mapping
  – Physically-aware gate-level optimizations
  – Low disturbance to existing placement
  – Incremental legalization of placement
  – Incremental re-computation of routes/estimates
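To illustrate the coarsest end of the "levels of accuracy" list, the sketch below estimates net length from placement alone using half-perimeter wirelength (HPWL). HPWL is a stand-in chosen for brevity, not the estimator named on the slide: it is a lower bound on rectilinear Steiner tree length and matches it exactly for nets with two or three pins. The pin coordinates and the capacitance-per-unit-length constant are hypothetical.

```python
# Assumed wire capacitance per unit length (fF/um); illustrative value only.
CAP_PER_UM_FF = 0.2

def hpwl(pins):
    """Half-perimeter wirelength of the bounding box around a net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def wire_cap_ff(pins):
    """Lumped wire capacitance estimate derived from the length estimate."""
    return hpwl(pins) * CAP_PER_UM_FF

# Hypothetical placed pin locations (um) for one three-pin net.
net = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0)]
length_um = hpwl(net)
cap_ff = wire_cap_ff(net)
print(f"estimated length = {length_um} um, estimated cap = {cap_ff:.2f} fF")
```

Because the estimate depends only on pin positions, it can be recomputed incrementally for just the nets touched when a cluster is replaced by a flex-cell.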
Slide 10
Sample Flex-Cell
[Figure: a gate-level cluster with inputs a, b, c, d and output y (Before), its Tx-level view, and the custom-crafted flex-cell after Tx-level optimization (After); the critical path is a -> y in both]

                   Before                 After
                   (Gate-Level Cluster)   (Custom-Crafted Flex-Cell)
Rise (critical)    0.26 ns                0.12 ns
Fall (critical)    0.31 ns                0.10 ns
# Cells            4                      1
# Transistors      22                     13
Path depth         3 levels               2 levels
# nets             9                      7
Slide 11
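The before/after figures for the sample flex-cell are easier to compare as relative reductions. The sketch below only recomputes ratios from the numbers quoted on the slide; the dictionary keys are illustrative names.

```python
# Values quoted on the slide for the gate-level cluster and the flex-cell.
before = {"rise_ns": 0.26, "fall_ns": 0.31, "cells": 4,
          "transistors": 22, "path_depth": 3, "nets": 9}
after = {"rise_ns": 0.12, "fall_ns": 0.10, "cells": 1,
         "transistors": 13, "path_depth": 2, "nets": 7}

def reduction(old, new):
    """Relative reduction going from old to new, as a fraction of old."""
    return (old - new) / old

reductions = {key: reduction(before[key], after[key]) for key in before}

for key, frac in reductions.items():
    print(f"{key:12s} {before[key]:>5} -> {after[key]:>5}  ({frac:.0%} lower)")
```

On these numbers the critical rise and fall delays both drop by more than half, while the transistor count falls by about 40%.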
Transistor-Level Optimization
[Flowchart; boxes listed in transcript order. The two "(No)" branches loop back to earlier steps, but their targets are not recoverable from the transcript]
• Input: cluster of standard cells, various context-specific constraints for this cluster, other real-life constraints like process, etc.
• Map to transistor-level candidate netlists for Flex-Cell
• Transistor sizing
• Fast (pre-layout) characterization
• Create transistor-level netlist with systematic redundancy, if permitted
• Meets requirements? (No: loop back)
• Layout synthesis
• Post-layout characterization (detailed characterization)
• Meets requirements? (No: loop back)
• Output: set of candidate Flex-Cells
• Various interfaces to evaluate and fit Flex-Cell into standard-cell based design flow
Slide 12
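The two "Meets requirements?" checks in the flow above suggest an iterate-until-pass loop: size against a fast pre-layout estimate, then confirm against a slower post-layout one. The sketch below is a toy rendering of that control flow only. The delay models, the 15% parasitic penalty, the width step, and the choice to tighten the pre-layout target after a post-layout failure are all assumptions, since the transcript does not show where the "(No)" branches lead.

```python
def pre_layout_delay(width):
    """Toy fast estimate (ns): drive-dependent term plus intrinsic delay."""
    return 0.45 / width + 0.04

def post_layout_delay(width):
    """Toy detailed estimate (ns): assumes layout parasitics add 15%."""
    return pre_layout_delay(width) * 1.15

def size_for(target_ns, width=1.0, step=0.25, max_width=16.0):
    """Upsize until the fast pre-layout estimate meets the target."""
    while pre_layout_delay(width) > target_ns and width < max_width:
        width += step
    return width

def generate_flex_cell(required_ns, max_iters=10):
    """Return a width that passes both checks, or None if none is found."""
    target = required_ns
    for _ in range(max_iters):
        width = size_for(target)
        if pre_layout_delay(width) > target:
            return None               # first check fails even at max width
        if post_layout_delay(width) <= required_ns:
            return width              # second check passes: candidate found
        target *= 0.9                 # assumed recovery: tighten and re-size
    return None

width = generate_flex_cell(required_ns=0.20)
print("chosen width:", width)
```

Keeping the cheap estimate in the inner loop and the expensive one in the outer loop reflects the slide's distinction between fast and detailed characterization.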
Key Issues
• Judicious mix of gate-level and transistor-level optimization
• Judicious mix of discrete and continuous transistor sizing
• Effective use of transistor-level restructuring
• Fast and accurate transistor-level simulation
  – 50x to 100x faster than Spice
• Accurate estimation of parasitics given transistor-level netlist
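One way to read the "mix of discrete and continuous transistor sizing" bullet is that a continuous optimizer proposes widths which are then mapped onto the widths a layout generator can actually build. The sketch below shows that mapping step only, under assumptions: the width grid, the transistor names, and the continuous solution are hypothetical, and rounding up is one possible policy rather than the tool's.

```python
import bisect

# Hypothetical discrete width grid (um) supported by a layout generator.
ALLOWED_WIDTHS = [0.5, 0.75, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0, 8.0]

def snap_up(width, grid=ALLOWED_WIDTHS):
    """Smallest grid width that is at least the continuous width."""
    i = bisect.bisect_left(grid, width)
    if i == len(grid):
        raise ValueError(f"width {width} exceeds the largest allowed width")
    return grid[i]

# Hypothetical result of a continuous sizing pass (um).
continuous = {"M1": 1.32, "M2": 2.0, "M3": 0.61}
discrete = {name: snap_up(w) for name, w in continuous.items()}

# Extra total width paid for moving onto the grid.
overhead = sum(discrete.values()) / sum(continuous.values()) - 1.0
print(discrete)
print(f"total width overhead = {overhead:.1%}")
```

Rounding up keeps at least the drive strength the continuous pass asked for, at the cost of extra area and input load, so timing would normally be re-checked after snapping.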
Slide 13
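Impact On a Sample Critical Path
[Figure: stage delays along the Original Critical Path and along the Optimized Path, in which two groups of gates are replaced by Flex-Cell 1 and Flex-Cell 2; 21% Improvement]
Delay labels in transcript order (assignment to individual stages not recoverable):
• Following “Original Critical Path”: 0.29, 0.25, 0.07, 0.11, 0.18, 0.14, 0.04, 0.15, 0.36, 0.20, 0.24
• Following “Optimized Path”: 0.07, 0.82, 0.04, 0.20, 1.04
Slide 14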
Results
(ZenTime)
• 38K+ instance design
• 16% performance boost
– 297 MHz --> 344 MHz
• Implemented in a 0.13um process
• Added 132 flex-cells, 5,927 instances
• Without increasing power or area
Slide 15
Impact on Global Timing
• Initial frequency: 297 MHz
• Final frequency: 344 MHz
Slide 16
Timing Optimization Results

Design    Orig   Opt    Improv  Flex-Cells  Flex-Cell  Size      Technology  Process
          (MHz)  (MHz)  (%)     Created     Insts      (#insts)              Corner
Circuit1  297    345    16%     132         5927       38,130    0.13um      slow
Circuit2  250    277    11%     103         4900       62,801    0.13um      slow
Circuit3  248    279    13%     133         5113       160,610   0.13um      slow
Circuit4  251    294    17%     150         2050       21,814    0.18um      slow
Circuit5  187    219    18%     165         3821       33,940    0.13um      typical
Circuit6  167    193    16%     49          183        18,265    0.18um      typical
Circuit7  562    641    14%     160         2469       9,048     0.13um      typical

Row-group labels on the slide: “Design with physicals (def, sdf, …)” and “Design with wire loads” (assignment to rows not recoverable from the transcript)
Slide 17
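The improvement column can be cross-checked from the two frequency columns. The sketch below recomputes it from the Orig and Opt values in the table; because the frequencies are printed as whole MHz, a recomputed figure can differ from the stated one by up to about a percentage point.

```python
# (design, original MHz, optimized MHz, improvement % as stated on the slide)
RESULTS = [
    ("Circuit1", 297, 345, 16),
    ("Circuit2", 250, 277, 11),
    ("Circuit3", 248, 279, 13),
    ("Circuit4", 251, 294, 17),
    ("Circuit5", 187, 219, 18),
    ("Circuit6", 167, 193, 16),
    ("Circuit7", 562, 641, 14),
]

def speedup_pct(orig_mhz, opt_mhz):
    """Frequency improvement as a percentage of the original frequency."""
    return 100.0 * (opt_mhz - orig_mhz) / orig_mhz

computed = {name: speedup_pct(orig, opt) for name, orig, opt, _ in RESULTS}

for name, orig, opt, stated in RESULTS:
    print(f"{name}: {computed[name]:.1f}% computed, {stated}% stated")

average = sum(computed.values()) / len(computed)
print(f"average improvement = {average:.1f}%")
```

The recomputed values all fall within a point of the stated column, and their average sits just below the lower end of the 15% - 25% target range quoted earlier in the deck.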
I/O & Design Flow
[Figure: design flow with Flex-Cell Opt and its inputs and outputs; labels listed in transcript order]
• Inputs (box labels: Design, Library, Constraints)
  – library.lib, library.lef, library.cdl
  – netlist.v, netlist.def
  – constr.sdc, netlist.set_load, netlist.sdf
  – tech.bsim3
• Flow stages: Front-end Design, Physical Synthesis, Interface, Flex-Cell Opt, Detailed Route, Extraction & Verification, GDSII (Back-end Design)
• Flex-Cell Opt blocks: Clustering, Cont. Sizing, Discrete Sizing, Physical Timing, Timing, Gate-level Opt., Flex-Cell Factory
• Outputs
  – flex-cell.cdl, flex-cell.est.lib, flex-cell.est.lef
  – opt_netlist.v, opt_netlist.def
Slide 18
Automated Flex-Cell Generation
Tool Suite and Flow
[Figure: tool flow from sized netlists to generated cell views; groupings below follow the transcript order]
• Inputs: Sized spice netlists, Cell Architecture
• Generated views
  – Layout: gds, lef, ant. lef
  – Spice: lumpedC.sp, distrRC.sp
  – Functional: eqn.v, mos.v
  – Timing / Power: .lib, .db, .tlf
  – Noise/glitch: .lib, ??
  – Reports
Slide 19
Summary
• New dimension in optimization of cell-based designs
• Essential to find the “right balance” between gate-level and transistor-level optimization
• Better design quality, higher runtime
• Timing, Area, Power no longer a simple tradeoff
  – Possible to improve more than one, simultaneously
• Many challenges
  – Lots of research opportunities!!
Slide 20
The History of Methodology Shifts
[Figure: progression of design methodologies; labels in transcript order: Netlist optimization, Flex-cell synthesis, Physical synthesis, Logic synthesis, Netlist schematic, Physical optimization, Flex-cell optimization]
Slide 21