A 256 Kbits L-TAGE branch predictor
André Seznec
IRISA/INRIA/HIPEAC
André Seznec
Caps Team
IRISA/INRIA
Directly derived from:
A case for (partially) tagged branch predictors, A. Seznec and P. Michaud, JILP, Feb. 2006
+ Tricks:
• Loop predictor
• Kernel/user histories
TAGE:
TAgged GEometric history length predictors
The genesis
Back around 2003
2bcgskew was state-of-the-art, but it was lagging behind neural-inspired predictors on a few benchmarks
Just wanted to get the best of both behaviors and maintain:
Reasonable implementation cost:
• Use only global history
• Medium number of tables
In-time response
The basis: A multiple length global history predictor
[Figure: five tables T0 to T4, associated with history lengths L(0) to L(4); how to combine their outputs is left open (?)]
GEometric History Length predictor
The set of history lengths forms a geometric series:
$L(0) = 0$
$L(i) = \alpha^{i-1} L(1)$
{0, 2, 4, 8, 16, 32, 64, 128}
What is important: L(i)-L(i-1) is drastically increasing
Capture correlation on very long histories
Most of the storage for short history !!
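The series above can be sketched in a few lines of Python. This is a minimal illustration, not the submission's code: the function name `geometric_history_lengths` and the rounding by `+ 0.5` are assumptions made for the example.

```python
def geometric_history_lengths(num_tables, l1, alpha):
    """Return [L(0), ..., L(num_tables - 1)] with L(0) = 0 and
    L(i) = alpha**(i - 1) * L(1), rounded to the nearest integer.

    The rounding rule is an assumption of this sketch.
    """
    lengths = [0]  # L(0) = 0: the first table uses no history
    for i in range(1, num_tables):
        lengths.append(int(alpha ** (i - 1) * l1 + 0.5))
    return lengths


# The series shown on the slide corresponds to L(1) = 2 and alpha = 2.
print(geometric_history_lengths(8, 2, 2.0))
```

With these parameters the gap L(i)-L(i-1) doubles at every step, which is why only a few tables are needed to reach long histories.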
Combining multiple predictions ?
Classical solution:
Use of a meta predictor
“wasting” storage !?!
choosing among 5 or 10 predictions ??
Neural inspired predictors, Jimenez and Lin 2001
Use an adder tree instead of a meta-predictor
Partial matching
Use tagged tables and the longest matching history
Chen et al 96, Michaud 2005
CBP-1 (2004): OGEHL
Final computation through a sum
[Figure: tables T0 to T4, associated with history lengths L(0) to L(4), feed an adder (∑); Prediction = Sign of the sum]
12 components 3.670 misp/KI
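The "sum then sign" computation can be illustrated as follows. This is a simplified sketch: the name `ogehl_predict`, the use of signed counter values, and the choice to treat a sum of zero as taken are assumptions, and any bias term or threshold-based update of the real predictor is left out.

```python
def ogehl_predict(counters):
    """Combine the signed counters read from all tables.

    Returns (taken, total): the prediction is the sign of the sum.
    Treating total == 0 as taken is an assumption of this sketch.
    """
    total = sum(counters)
    return total >= 0, total


# One counter per table; strongly negative counters outvote weak positive ones.
print(ogehl_predict([3, -1, -4, 1]))
```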
TAGE
Geometric history length + PPM-like
+ optimized update policy
[Figure: a tagless base predictor indexed with pc, plus tagged tables indexed with hashes of pc and h[0:L1], h[0:L2], h[0:L3]; each tagged entry holds ctr, tag and u; the tag comparisons (=?) drive a chain of multiplexers that selects the final prediction]
[Figure: tag match example with one Miss and two Hits; one hitting component provides Pred, the other provides Altpred]
Prediction computation
General case:
Longest matching component provides the prediction
Special case:
Newly allocated entries (weak Ctr) cause many mispredictions
On many applications, Altpred is then more accurate than Pred
Property dynamically monitored through a single 4-bit counter
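The general and special cases above can be sketched as one selection function. Several details are assumptions made for illustration: the name `tage_predict`, signed 3-bit counters in -4..3 with -1 and 0 as the weak values, and a 4-bit monitoring counter read as "use Altpred" when it is at or above 8. How the monitoring counter itself is updated is not shown.

```python
def tage_predict(base_taken, tagged_ctrs, use_alt_on_na, threshold=8):
    """Select the final prediction.

    tagged_ctrs: one item per tagged table, ordered from shortest to longest
    history; None on a tag miss, else the signed 3-bit counter (-4..3).
    use_alt_on_na: 4-bit monitoring counter (0..15), assumed encoding.
    """
    hits = [c for c in tagged_ctrs if c is not None]
    if not hits:
        return base_taken            # no tag match: tagless base predictor
    provider = hits[-1]              # longest matching history
    pred = provider >= 0
    # Altpred: next longest match, or the base predictor if there is none.
    altpred = (hits[-2] >= 0) if len(hits) > 1 else base_taken
    weak = provider in (-1, 0)       # typical of a newly allocated entry
    if weak and use_alt_on_na >= threshold:
        return altpred
    return pred


# Longest match is weak not-taken; the monitor says Altpred is more reliable.
print(tage_predict(True, [None, 2, None, -1], use_alt_on_na=8))
```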
TAGE update policy
General principle:
Minimize the footprint of the prediction.
Just update the longest history matching component and allocate at most one entry on mispredictions
A tagged table entry
Ctr: 3-bit prediction counter
U: 2-bit useful counter
Was the entry recently useful ?
Tag: partial tag
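As a rough illustration of the three fields, a tagged entry could be modeled as below. The class name `TaggedEntry`, the helper `update_ctr`, and the signed encoding of Ctr (sign gives the direction, -1 and 0 are the weak states) are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class TaggedEntry:
    ctr: int = 0  # 3-bit prediction counter, assumed signed: -4..3
    tag: int = 0  # partial tag
    u: int = 0    # 2-bit useful counter: 0..3


def update_ctr(ctr, taken, bits=3):
    """Saturating update of a signed prediction counter."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return min(ctr + 1, hi) if taken else max(ctr - 1, lo)


entry = TaggedEntry(tag=0x2A)
entry.ctr = update_ctr(entry.ctr, taken=False)
print(entry)
```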
Updating the U counter
If (Altpred ≠ Pred) then
• Pred = taken : U= U + 1
• Pred ≠ taken : U = U - 1
Graceful aging:
Periodic shift of all U counters
• implemented through the reset of a single bit
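The update rule and the aging step can be sketched as follows. Here "Pred = taken" is read as "Pred matches the actual branch outcome". The names `update_u` and `age_all`, and modeling the periodic shift as a right shift by one, are assumptions; the single-bit reset mentioned on the slide is a hardware detail that this sketch does not reproduce.

```python
U_MAX = 3  # 2-bit useful counter


def update_u(u, pred, altpred, outcome):
    """Update U only when the provider and the alternate prediction differ."""
    if pred != altpred:
        if pred == outcome:
            return min(u + 1, U_MAX)  # the entry was useful
        return max(u - 1, 0)          # the entry was harmful
    return u


def age_all(u_counters):
    """Graceful aging, modeled here as halving every U counter."""
    return [u >> 1 for u in u_counters]


print(update_u(1, pred=True, altpred=False, outcome=True))
print(age_all([3, 2, 1, 0]))
```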
Allocating a new entry on a misprediction
Find a single “useless” entry with a longer history:
Privilege the smallest possible history
• To minimize footprint
But not too much
• To avoid ping-pong phenomena
Initialize Ctr as weak and U as zero
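One possible reading of this allocation policy is sketched below. The names `choose_allocation` and `new_entry` are hypothetical, and the coin-flip scheme that favors shorter histories without always picking the shortest is an assumption, not necessarily the probabilities used in the submission. What happens when no useless entry exists is left out.

```python
import random


def choose_allocation(u_values, provider, rng=random):
    """Pick the table in which to allocate after a misprediction.

    u_values[j]: U counter of the entry the branch maps to in tagged table j
    (tables ordered by increasing history length).
    provider: index of the providing table, or -1 for the base predictor.
    Returns a table index, or None if no useless (U == 0) entry is found.
    """
    candidates = [j for j in range(provider + 1, len(u_values))
                  if u_values[j] == 0]
    if not candidates:
        return None
    # Favor short histories, but not always, to avoid ping-pong effects.
    for j in candidates[:-1]:
        if rng.random() < 0.5:
            return j
    return candidates[-1]


def new_entry(tag, taken):
    """Fresh entry: weak Ctr in the direction of the outcome, U = 0."""
    return {"tag": tag, "ctr": 0 if taken else -1, "u": 0}


print(choose_allocation([1, 0, 2, 0], provider=0, rng=random.Random(1)))
```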
Improve the global history
Address + conditional branch history:
path confusion on short histories
Address + path:
Direct hashing leads to path confusion
1. Represent all branches in branch history
2. Also use path history (1 bit per branch, limited to 16 bits)
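The two histories can be sketched as shift registers held in Python integers. The function name `update_histories`, the choice of the low-order address bit as the path bit, and the 640-bit cap on the branch history are assumptions made for the example.

```python
PATH_BITS = 16  # path history limited to 16 bits


def update_histories(ghist, phist, pc, taken, max_len=640):
    """Shift one bit per branch into both histories.

    ghist: branch history, one direction bit for every branch
           (conditional or not).
    phist: path history, one address bit per branch.
    """
    ghist = ((ghist << 1) | int(taken)) & ((1 << max_len) - 1)
    phist = ((phist << 1) | (pc & 1)) & ((1 << PATH_BITS) - 1)
    return ghist, phist


g, p = 0, 0
for pc, taken in [(0x401, True), (0x40A, False), (0x413, True)]:
    g, p = update_histories(g, p, pc, taken)
print(bin(g), bin(p))
```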
Design tradeoff for CBP2 (1)
13 components:
Bring the best accuracy on distributed traces
• 8 components not very far behind !
History length:
Min = 4, Max = 640
Could use any Min in [2, 6] and any Max in [300, 2000]
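Given Min and Max, a geometric series between them can be derived as sketched below. This assumes 12 tagged tables (the 13 components minus the tagless base predictor) and rounding to the nearest integer; the function name `lengths_from_min_max` is hypothetical, and the exact lengths of the submission may differ.

```python
def lengths_from_min_max(num_tagged, min_len, max_len):
    """Geometric series from min_len to max_len over num_tagged tables."""
    ratio = max_len / min_len
    return [int(min_len * ratio ** (i / (num_tagged - 1)) + 0.5)
            for i in range(num_tagged)]


for i, length in enumerate(lengths_from_min_max(12, 4, 640), start=1):
    print(f"T{i}: {length} bits of history")
```

Changing Min or Max within the ranges quoted above only changes the common ratio, which is consistent with the claim that accuracy is not very sensitive to these two parameters.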
Design tradeoff for CBP2 (2)
Tag width tradeoff:
(destructive) false match is better tolerated on shorter history
7 bits on T1 to 15 bits on T12
Tuning the number of table entries:
Smaller number for very long histories
Smaller number for very short histories
Adding a loop predictor
The loop predictor captures the number of iterations of a loop
When it successively encounters the same number of iterations 4 times, the loop predictor provides the prediction.
Advantages:
Very reliable
Small storage budget: 256 52-bit entries
Complexity ?
Might be difficult to manage speculative iteration numbers on
deep pipelines
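The behavior of a single loop predictor entry can be sketched as below. This is a non-speculative toy model: the class name `LoopEntry`, its field names, and the convention that the loop branch is taken on every iteration and not taken on exit are assumptions. Tags, the 52-bit entry layout, and speculative iteration counts are left out.

```python
class LoopEntry:
    CONFIDENT = 4  # same iteration count seen 4 times in a row

    def __init__(self):
        self.trip_count = 0   # taken outcomes seen before the last exit
        self.current = 0      # taken outcomes seen so far in this execution
        self.confidence = 0

    def predict(self):
        """Return True/False once confident, None otherwise."""
        if self.confidence < self.CONFIDENT:
            return None
        return self.current < self.trip_count

    def update(self, taken):
        if taken:
            self.current += 1
            if self.current > self.trip_count:
                self.confidence = 0          # loop ran longer than recorded
            return
        if self.current == self.trip_count:
            self.confidence = min(self.confidence + 1, self.CONFIDENT)
        else:
            self.trip_count = self.current   # learn the new count
            self.confidence = 1
        self.current = 0


entry = LoopEntry()
for _ in range(4):                  # four executions with 3 iterations each
    for outcome in (True, True, True, False):
        entry.update(outcome)
print(entry.predict())              # confident: first iteration of next run
```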
Using a kernel history and a user history
Traces mix user and kernel activities:
Kernel activity after exception
• Global history pollution
Solution: use two separate global histories
User history is updated only in user mode
Kernel history is updated in both modes
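The two update rules can be sketched as follows. The class name `SplitHistory` and the selection rule in `current` (kernel history in kernel mode, user history otherwise) are assumptions; the slide only states how the two histories are updated. History length limits are omitted.

```python
class SplitHistory:
    def __init__(self):
        self.user = 0
        self.kernel = 0

    def update(self, taken, in_kernel):
        bit = int(taken)
        # Kernel history is updated in both modes.
        self.kernel = (self.kernel << 1) | bit
        # User history is updated only in user mode.
        if not in_kernel:
            self.user = (self.user << 1) | bit

    def current(self, in_kernel):
        """History used for prediction (assumed selection rule)."""
        return self.kernel if in_kernel else self.user


h = SplitHistory()
h.update(True, in_kernel=False)
h.update(False, in_kernel=True)   # kernel activity does not pollute user history
h.update(True, in_kernel=False)
print(bin(h.user), bin(h.kernel))
```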
L-TAGE submission accuracy
(distributed traces)
3.314 misp/KI
Reducing L-TAGE complexity
Included 241.5 Kbits TAGE predictor:
3.368 misp/KI
Loop predictor beneficial only on gzip:
Might not be worth the extra complexity
Using fewer tables
8 components 256 Kbits TAGE predictor:
3.446 misp/KI
TAGE prediction computation time ?
3 successive steps:
Index computation
Table read
Partial match + multiplexor
Does not fit in a single cycle:
But can be ahead pipelined !
Ahead pipelining a global history branch predictor (principle)
Initiate branch prediction X+1 cycles in advance to
provide the prediction in time
Use information available:
• X-block ahead instruction address
• X-block ahead history
To ensure accuracy:
Use intermediate path information
Practice
[Figure: blocks A, B and C, annotated with bc and Ha]
Ahead pipelined TAGE:
4 parallel prediction computations
3-branch ahead pipelined
8 component 256 Kbits TAGE
3.552 misp/KI
A final case for the Geometric History Length predictors
delivers state-of-the-art accuracy
uses only global information:
Very long history: 300+ bits !!
can be ahead pipelined
many effective design points
OGEHL or TAGE
Nb of tables, history lengths
The End