Transcript Slides

International Symposium on Physical Design 2010

Skew Management of NBTI Impacted Gated Clock Trees

Ashutosh Chakraborty

and David Z. Pan

ECE Department, University of Texas at Austin

1

Outline

• Background: Clock Gating & NBTI Effect
• Problem: Skew due to NBTI in gated clock
• Previous Works
• Proposed Solution
• Results

2

Clock Gating

• Very popular low-power technique
• Freeze ("gate") the clock to an inactive module
  › Needs: a signal indicating whether a module is inactive
  › Needs: a way to use this signal to freeze the clock
• Inactivity deduced by checking input permutations
  › Example: OPCODE for adder? Freeze the multiplier clock
  › RTL simulation and ON/OFF set manipulation help

3

Clock Gating (2)

• Duration of gating determined by many factors
  › Gating aggressiveness, input data statistics
• How to stop the clock signal?
  › Use a NAND/NOR/AND/OR gate
  › One input: regular clock signal
  › Other input: inactivity/activity signal

[Figure: gating gate with inputs CLK and "Active?" and output CLK_OUT]
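A minimal Python sketch of the gating gate described above, assuming ideal gates and single-bit signals. The function names `nand_gate` and `nor_gate` and the sample waveform are illustrative and do not come from the slides. It shows that a NAND holds its output at 1 when its enable input is 0, while a NOR holds its output at 0 when its disable input is 1.

```python
def nand_gate(clk: int, active: int) -> int:
    """NAND-based clock gate: output is NOT(clk AND active)."""
    return 1 - (clk & active)


def nor_gate(clk: int, inactive: int) -> int:
    """NOR-based clock gate: output is NOT(clk OR inactive)."""
    return 1 - (clk | inactive)


clock = [0, 1, 0, 1, 0, 1]

# Module active: both gates pass the (inverted) clock.
nand_running = [nand_gate(c, 1) for c in clock]
nor_running = [nor_gate(c, 0) for c in clock]

# Module inactive: NAND output is frozen at 1, NOR output is frozen at 0.
nand_frozen = [nand_gate(c, 0) for c in clock]
nor_frozen = [nor_gate(c, 1) for c in clock]
```

Both gates invert the clock while the module is active, so a real tree would account for that polarity; the sketch only illustrates the freeze behavior.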

4

Example Clock Tree

[Figure: example clock tree from the CLK source to the FLOPS]

5

Minimize Clock Gating Elements

[Figure: clock tree from CLK to FLOPS with three gated subtrees, gating probabilities 40%, 20%, and 30%]

6

Implementation using NANDs

[Figure: clock tree from CLK to FLOPS with NAND gating elements, GATE: 40%, GATE: 20%, GATE: 30%]

7

NBTI Effect

Negative Bias Temperature Instability

• Occurs when PMOS is negatively biased (V_GS < 0)
• Reason:
  › V_GS < 0 causes Si-H bond breaking
  › Need higher V_G to invert the channel
• Effects:
  › ∆V_TH = +100 mV (10 years)
  › 30% increase in inverter delay [Kumar et al., DAC 2007]

[Figure: PMOS device with source (S) and drain (D)] [Alam et al., Micro. Reliab. 2005]

8

NBTI Effect (2)

• Proportional to negative bias duration (~t^n)
• For PMOS in standard cells,
  › V_GS < 0 → V_G < V_DD → input to cell = logic LOW
  › Thus, logic LOW feeding a cell causes NBTI
  › Differing LOW probability → different degradation
• Define SP0 = probability of a signal being LOW
  › Higher SP0 → more NBTI degradation

9

Outline

• Background: NBTI & Clock Gating
• Problem: Skew due to NBTI in gated clock
• Previous Works
• Proposed Solution
• Results

10

SP0 Difference due to Clock Gating

[Figure: clock tree driven by CLK. Ungated branches have SP0=50% (larger V_TH); the branch behind the gating element with GATE: 30% has SP0=35% (lower V_TH). Skew?]

• Using a NAND gate reduces SP0 at the output
• Using a NOR gate increases SP0 at the output
• In both cases, a ∆V_TH mismatch will exist!
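The SP0 values quoted on this slide can be checked with a small waveform count. The sketch below is illustrative: it assumes a 50% duty-cycle clock, ideal inverting NAND/NOR gates, and gating that covers whole clock cycles. The names `sp0`, `CYCLES`, and `GATED_CYCLES` are not from the slides.

```python
def sp0(samples):
    """Fraction of samples that are logic LOW."""
    return samples.count(0) / len(samples)


CYCLES = 10
GATED_CYCLES = 3  # 30% gating, as in the slide's example

clk, nand_out, nor_out = [], [], []
for cycle in range(CYCLES):
    gated = cycle < GATED_CYCLES
    for level in (0, 1):  # each cycle: one LOW half, one HIGH half
        clk.append(level)
        # NAND output is held at 1 while gated, otherwise inverted clock.
        nand_out.append(1 if gated else 1 - level)
        # NOR output is held at 0 while gated, otherwise inverted clock.
        nor_out.append(0 if gated else 1 - level)

print(sp0(clk), sp0(nand_out), sp0(nor_out))
```

Under these assumptions the NAND output is LOW in 7 of 20 half-cycles (35%, matching the slide) and the NOR output in 13 of 20 (65%), against 50% for the ungated clock.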

Problems due to ∆V_TH mismatch?

• Clock skew can degrade significantly!
• Up to 2.5X increase in skew [Chakraborty et al., DATE 2009]
  › Large variation due to difference in nominal values
  › Will lead to timing violations and circuit failure

12

Outline

• Background of NBTI & Clock Gating
• Problem: Skew due to NBTI in gated clock
• Previous Works
• Proposed Solution
• Results

13

Previous Works

• 2003: US patent 6651230 [John Cohn et al.]
  › Essentially overdesign by tightening the skew bound.
  › There is a limit to how far the skew constraint can be tightened.
• 2009: DATE 09 [Chakraborty et al.]
  › First runtime compensation for NBTI in clock trees
  › At runtime, choose NAND or NOR to drive
  › Aims to equalize all signal probabilities (of clock nets)
    » Power penalty? Routing?

14

Previous Works (2)

[Figure: CLK and GATE feed both a NOR gate (gated at 0) and a NAND gate (gated at 1); SELECT chooses which one drives CLK_OUT]

If { GATE = FALSE } CLK_OUT = CLK
Else If { SELECT = 0 } CLK_OUT = 0
Else CLK_OUT = 1

15
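The behavior stated on this slide can be written as a small Python function. This is only a behavioral sketch of the slide's three-way rule; the function name `clk_out` and the argument types are assumptions, and the real cell in the DATE 2009 work may differ in polarity and timing.

```python
def clk_out(clk: int, gate: bool, select: int) -> int:
    """Runtime-selectable gating: pass the clock, or freeze it at 0 or 1."""
    if not gate:
        return clk          # not gated: clock passes through
    if select == 0:
        return 0            # gated at 0 (NOR path)
    return 1                # gated at 1 (NAND path)


# While gated, SELECT alone decides the frozen level, whatever CLK does.
frozen_low = [clk_out(c, True, 0) for c in (0, 1, 0, 1)]
frozen_high = [clk_out(c, True, 1) for c in (0, 1, 0, 1)]
passing = [clk_out(c, False, 0) for c in (0, 1, 0, 1)]
```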

Outline

• Background of NBTI & Clock Gating
• Problem: Skew due to NBTI in gated clock
• Previous Works
• Proposed Solution
• Results

16

Main Idea

• NAND gate reduces SP0 at its output
• NOR gate increases SP0 at its output
• SP0 impacts the delay of the cell being driven
• Need to reduce the delay difference at the sinks
• Multiple levels of clock gating elements
  › Can we selectively choose NAND/NOR at the right places so that, even if SP0 differs within the tree, the delay difference is minimized by the time the sinks are reached?

17

Proposed Solution

• At design time (i.e., statically), determine the NAND or NOR choice for each gating-enabled buffer
  › Objective: minimize skew after NBTI aging
• Benefits:
  › No hardware penalty w.r.t. regular clock gating
  › No glitches due to SELECT signal switching
  › No extra routing overhead

18

Our Optimization Flow

• Symbolic SP0 Propagation
• SP0-Aware Delay Characterization
• Symbolic Arrival Time Computation
• Skew Minimization Formulation
• Solve

19

Propagate SP0 in Clock Tree

• For a gating probability of G and an input SP0 of S, output SP0 for the NAND or NOR choice:
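The formula shown on this slide did not survive in the transcript. The sketch below is one plausible closed form, derived under stated assumptions rather than taken from the slides: ideal inverting gates, gating independent of the clock phase, a gated NAND output held at 1, and a gated NOR output held at 0. The function names and the binary choice variable `x` (1 = NAND, 0 = NOR) are illustrative.

```python
def sp0_nand(g: float, s: float) -> float:
    """NAND: output is LOW only when not gated and the input is HIGH."""
    return (1 - g) * (1 - s)


def sp0_nor(g: float, s: float) -> float:
    """NOR: output is LOW when gated, or when not gated and the input is HIGH."""
    return g + (1 - g) * (1 - s)


def sp0_symbolic(x: int, g: float, s: float) -> float:
    """Single expression covering both choices; linear in x (x=1 NAND, x=0 NOR)."""
    return (1 - g) * (1 - s) + (1 - x) * g


# Check against the earlier slide: S = 0.5, G = 0.3 gives SP0 = 35% for NAND.
print(sp0_nand(0.3, 0.5), sp0_nor(0.3, 0.5))
```

With G = 0.5 and S = 0.5 the combined expression reduces to 0.75 - 0.5 * x, which has the same shape as the SP0 arguments in the delay expression example later in the deck.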

20

Example: SP0 Propagation

21

Delay Characterization

• NBTI impacts T_RISE; T_FALL is unchanged
• T_RISE characterization w.r.t. SP needed
• Conducted SPICE simulations to obtain:

[Figure: characterization plot, x-axis: Input SP0]

22

Example [Delay Expression]

DINV(0.5) + ( X2 * DNAND(0.5) + X2’ * DNOR(0.5) ) + ( X4 * DNAND(0.72 - X2 * 0.5) + X4’ * DNOR(0.75 - X2 * 0.5) )

Can the expressions of Delay and SP become unmanageable as we traverse down the clock tree?

Like: X1*X2*X3’*X4*X6’…

24

Observations

• Lemma 1: SP0 of any gate is at most a linear function of Xi.
  › No multiplication of Xi in the SP expression.
• Lemma 2: Delay expression is at most a quadratic function of Xi.
  › X1*X2 possible. Not X1*X2*X3 etc.
• Thus, delay/SP0 expressions remain at most quadratic functions of Xi.
  › If Xi binary, quadratic => linear transformation
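The slides do not say which quadratic-to-linear transformation was used. A common choice for a product of two binary variables, shown here as an assumption, replaces X1*X2 with a new binary variable Z constrained by Z <= X1, Z <= X2, and Z >= X1 + X2 - 1. The short Python check below enumerates all cases to confirm that these constraints leave exactly one feasible Z, equal to the product.

```python
from itertools import product


def linearization_feasible(x1: int, x2: int, z: int) -> bool:
    """Linear constraints that force z to equal x1 * x2 for binary inputs."""
    return z <= x1 and z <= x2 and z >= x1 + x2 - 1


feasible_z = {}
for x1, x2 in product((0, 1), repeat=2):
    # Collect every binary z that satisfies the linear constraints.
    feasible_z[(x1, x2)] = [z for z in (0, 1) if linearization_feasible(x1, x2, z)]

for (x1, x2), zs in feasible_z.items():
    print(x1, x2, "->", zs)
```

Each X1*X2 term adds one auxiliary variable and three constraints, which is why limiting the delay expression to quadratic terms keeps the ILP manageable.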

25

ILP Formulation

• Minimize: MAX - MIN // both dummy variables
• Subject to:
  • Arrival Time(Sink i) <= MAX for all i
  • Arrival Time(Sink i) >= MIN for all i
  • MAX >= 0
  • MIN >= 0
  • Xi ∈ {0, 1}

[Figure: arrival times of the sinks bounded by Max and Min]
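To make the objective concrete, the sketch below minimizes MAX - MIN on a toy tree by exhaustive enumeration instead of an ILP solver. Everything numeric here is made up for illustration: the tree shape, the gating probabilities, and the linear delay model are assumptions, and `sp0_out` assumes ideal inverting gates. Only the objective (largest minus smallest sink arrival time, over binary NAND/NOR choices) follows the slide.

```python
from itertools import product

# Hypothetical toy tree: each sink lists the gating elements on its path.
GATING_PROB = [0.4, 0.2, 0.3, 0.5, 0.25]
SINK_PATHS = [[0, 1], [0, 2], [3, 4]]
SOURCE_SP0 = 0.5


def sp0_out(x, g, s):
    """Output SP0 of an ideal inverting gate; x=1 is NAND, x=0 is NOR."""
    return (1 - g) * (1 - s) + (1 - x) * g


def cell_delay(x, s):
    """Made-up aged delay (ps) of a gate whose input has SP0 = s."""
    base = 18.0 if x == 1 else 20.0
    return base + 6.0 * s


def arrival_times(choice):
    times = []
    for path in SINK_PATHS:
        s, t = SOURCE_SP0, 0.0
        for gate in path:
            t += cell_delay(choice[gate], s)
            s = sp0_out(choice[gate], GATING_PROB[gate], s)
        times.append(t + 15.0 + 6.0 * s)  # made-up sink buffer delay
    return times


def skew(choice):
    times = arrival_times(choice)
    return max(times) - min(times)  # MAX - MIN from the slide


n = len(GATING_PROB)
best = min(product((0, 1), repeat=n), key=skew)
print("best choice:", best, "skew (ps):", round(skew(best), 3))
print("all NAND:", round(skew((1,) * n), 3), "all NOR:", round(skew((0,) * n), 3))
```

Enumeration visits 2^n assignments, so it only works for a handful of gating elements; an ILP formulation handed to a solver is what makes benchmark-sized trees tractable.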

26

Experimental Setup

• Generated balanced clock trees (skew = 0)
  › 9K to 350K sinks
  › Buffers at all branching points
• Picked 2% of buffers as gating enabled
• Assigned 20% to 70% gating probability
• Clock source input SP = 0.5
• SPICE netlist from 45nm Nangate library
• C++ for SP propagation & ILP writing
• Mathematica to reduce. CPLEX to solve.

27

Benchmarks

Name   Depth   Fanout   # Buffers   # Sinks   # Gated
A      7       4        22k         87k       331
B      8       3        10k         8k        144
C      9       3        29k         26k       426
D      8       4        88k         349k      1251
E      9       3        29k         26k       430
F      8       3        10k         9k        138
G      8       4        87k         349k      1267
H      7       4        22k         87k       326

28

Outline

• Background of NBTI & Clock Gating
• Problem: Skew due to NBTI in gated clock
• Previous Works
• Proposed Solutions
• Results

29

Results

• Age the circuit to 10 years
• Calculated skew for four cases
  › Choose NAND/NOR based on our formulation
  › Choose all NAND gates
  › Choose all NOR gates
  › Try 10 random assignments, pick the best

30

Results (contd)

Name   Solver Time (s)   OUR Skew (ps)   All NAND (ps)   All NOR (ps)   10 Rand. (ps)
A      0.14              2.80            4.41            9.02           7.24
B      0.06              2.18            3.23            5.84           4.96
C      1.41              4.13            6.4             9.28           7.05
D      0.81              3.03            5.04            9.74           6.21
E      0.12              2.76            5.46            10.21          7.04
F      0.09              3.94            6.21            12.23          11.82
G      0.47              3.88            6.75            13.07          10.58
H      0.09              2.59            3.91            8.44           5.38

[Slide annotation: 1.33X]

• Significantly tightens the skew budget

31

Conclusions

• Proposed choosing NAND/NOR gating at design time to minimize skew degradation.
• Optimal (ILP) results show 55% and 120% lower skew than the all-NAND/all-NOR cases.
• Random + pick-best results reduce skew by 20% and 80% over the all-NAND/all-NOR cases.
• Fast. Log(n) binary variables.
• Future work:
  › ILP is NP-complete. Some other formulation.
  › How ICGs can be handled.

32

Thank you.

Questions?

33