SCALIP: Scalable IP for systolic applications


Floorplanning of Pipelined Array (FoPA) Modules using Sequence Pairs

Matt Moe, Herman Schmit

Outline

• Pipelined Arrays
• Previous Sequence Pair work
• Sequence Pair additions
• Results

System of the Future

• Soft IP cores
  – hardware accelerators
  – pipelined arrays

[Figure: system blocks labeled Cryptography, Control, Microprocessor, Memory, and Signal Processing]

Pipelined Arrays

• Systolic architecture
• Easy to compile to
• Fast throughput after synthesis
• Structure lost during physical design

[Figure: a pipelined array, with labels "pipeline stage" and "array adjacent pipeline stages"]

Physical Design of Pipelined Arrays

• Maintain structure
• One pipeline stage = one floorplan module
• Use floorplanning tools to create placement constraints

[Figure: blocks of logic mapped to floorplan modules, with labels "floorplan module" and "array adjacent modules"]

How do you maintain the structure?

• If all modules were the same size, the solutions would be trivial

[Figure: example arrangements of equal-size modules numbered 1 to 9]
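One trivial arrangement for equal-size modules can be sketched as a serpentine (boustrophedon) grid, in which every other row is reversed so that consecutive pipeline stages always share an edge. The function name and the requirement that the module count be a multiple of the column count are assumptions made for this illustration; the slides show the arrangements only as a figure.

```python
def serpentine_grid(n_modules, columns):
    """Lay out modules 1..n_modules row by row, reversing every other row
    so that consecutive modules are always neighbors in the grid."""
    if n_modules % columns != 0:
        # Assumption of this sketch: only full rows are handled.
        raise ValueError("n_modules must be a multiple of columns")
    rows = []
    for start in range(1, n_modules + 1, columns):
        row = list(range(start, start + columns))
        if len(rows) % 2 == 1:
            row.reverse()  # odd rows run right to left
        rows.append(row)
    return rows


for row in serpentine_grid(9, 3):
    print(row)
# [1, 2, 3]
# [6, 5, 4]
# [7, 8, 9]
```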

More interesting problem…

• Modules vary in size
• Wire congestion
  – Created by non-adjacency of modules
  – Forces extra area usage

[Figure: example floorplan of twelve modules of varying size, numbered 0 to 11]

Classic Simulated Annealing of Sequence Pairs

• Sequence Pair
  – Floorplan representation that describes directional constraints between every possible pair of blocks
  – Large design space
• H. Murata et al., “VLSI Module Placement Based on Rectangle Packing by the Sequence Pair,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 12, pp. 1518-1524, December 1996.
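As a minimal sketch of how a sequence pair encodes a directional constraint for every pair of blocks, the function below follows the convention of the Murata et al. paper cited above: a block that precedes another in both sequences lies to its left, and a block that precedes in the first sequence but follows in the second lies above it. The function name and the three-block example pair are hypothetical.

```python
def relation(gamma_pos, gamma_neg, a, b):
    """Return the position of block a relative to block b implied by
    the sequence pair (gamma_pos, gamma_neg)."""
    before_pos = gamma_pos.index(a) < gamma_pos.index(b)
    before_neg = gamma_neg.index(a) < gamma_neg.index(b)
    if before_pos and before_neg:
        return "left of"
    if not before_pos and not before_neg:
        return "right of"
    if before_pos:
        return "above"  # a first in gamma_pos, b first in gamma_neg
    return "below"


# Hypothetical three-block example.
pair = ("abc", "bac")
for a, b in [("a", "b"), ("a", "c"), ("b", "c")]:
    print(a, relation(*pair, a, b), b)
```

Because every pair of blocks receives exactly one of the four relations, n blocks admit (n!)^2 sequence pairs, which is the large design space the slide refers to.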

Classic Swap Move

[Figure: example floorplans of blocks A, B, C, and D illustrating the classic swap move]
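A swap move of the kind used in classic sequence-pair annealing can be sketched as follows: two blocks are chosen at random and exchanged in the first sequence, or in both. The function name, the `both` flag, and the use of a seeded `random.Random` are assumptions of this sketch, not details taken from the slides.

```python
import random


def swap_move(gamma_pos, gamma_neg, rng, both=False):
    """Return a new sequence pair in which two randomly chosen blocks
    are exchanged in gamma_pos, and also in gamma_neg if both=True."""
    a, b = rng.sample(gamma_pos, 2)

    def swapped(seq):
        out = list(seq)
        i, j = out.index(a), out.index(b)
        out[i], out[j] = out[j], out[i]
        return out

    new_pos = swapped(gamma_pos)
    new_neg = swapped(gamma_neg) if both else list(gamma_neg)
    return new_pos, new_neg


rng = random.Random(0)
print(swap_move(list("ABCD"), list("DACB"), rng))
```

A swap can change the relative position of the two chosen blocks with respect to many other blocks at once, so a single move may scatter modules that were previously adjacent.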

Sequence pair: (ABCD, DACB)

[Figure: Oblique Constraint Graph and Oblique Connectivity Graph for blocks A, B, C, and D]
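To make the pair (ABCD, DACB) concrete, the sketch below packs it into coordinates with a simple quadratic-time longest-path evaluation. It assumes the Murata convention (a block earlier in both sequences lies to the left; a block later in the first sequence and earlier in the second lies below). The block sizes are hypothetical, chosen only so the example packs tightly; the slides do not give dimensions.

```python
def pack(gamma_pos, gamma_neg, sizes):
    """Return {block: (x, y)} lower-left corners for a sequence pair.
    sizes maps each block to (width, height)."""
    pos_index = {b: i for i, b in enumerate(gamma_pos)}
    x, y = {}, {}
    placed = []
    # Every block to the left of or below b appears before b in gamma_neg,
    # so processing in gamma_neg order places predecessors first.
    for b in gamma_neg:
        bx = by = 0
        for a in placed:
            w, h = sizes[a]
            if pos_index[a] < pos_index[b]:
                bx = max(bx, x[a] + w)  # a is left of b
            else:
                by = max(by, y[a] + h)  # a is below b
        x[b], y[b] = bx, by
        placed.append(b)
    return {b: (x[b], y[b]) for b in gamma_neg}


# Hypothetical (width, height) values for the four blocks.
sizes = {"A": (2, 3), "B": (2, 1), "C": (2, 2), "D": (4, 1)}
print(pack("ABCD", "DACB", sizes))
# {'D': (0, 0), 'A': (0, 1), 'C': (2, 1), 'B': (2, 3)}
```

With these sizes, D spans the bottom, A sits at the upper left, and B rests on top of C at the right, filling a 4 by 4 bounding box.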

FoPA Delete / Insert Move

[Figure: example floorplans of blocks A, B, C, and D illustrating the FoPA delete / insert move]
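The slides present the delete / insert move only as a figure. One plausible reading, sketched here under that assumption, is that a single block is removed from both sequences and reinserted at new positions while the relative order of all other blocks is preserved. The function name and the uniformly random choice of insertion points are hypothetical.

```python
import random


def delete_insert_move(gamma_pos, gamma_neg, rng):
    """Remove one randomly chosen block from both sequences and reinsert
    it at random positions; all other blocks keep their relative order."""
    block = rng.choice(gamma_pos)

    def reinsert(seq):
        out = [b for b in seq if b != block]
        # The block may land back in its old slot, giving a no-op move.
        out.insert(rng.randrange(len(out) + 1), block)
        return out

    return reinsert(gamma_pos), reinsert(gamma_neg)


rng = random.Random(1)
print(delete_insert_move(list("ABCD"), list("DACB"), rng))
```

Unlike a swap, this move disturbs only the relations that involve the moved block, which is consistent with the goal of keeping the rest of the array structure intact.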

Restricted Delete / Insert Move

[Figure: example floorplans of blocks A, B, C, and D illustrating the restricted delete / insert move]

This looks better…

• All logically array adjacent elements are adjacent in the floorplan
• Reduced wire congestion

[Figure: resulting floorplan of modules 0 to 11]
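The adjacency property stated above can be checked mechanically. The sketch below assumes a one-dimensional pipeline in which stage i is array adjacent to stage i + 1, and treats two modules as adjacent in the floorplan when their rectangles share a boundary segment of nonzero length. The rectangle format (x, y, width, height), the function names, and the coordinates are all hypothetical.

```python
def abut(r1, r2):
    """True if rectangles (x, y, w, h) share an edge segment of nonzero length."""
    x1, y1, w1, h1 = r1
    x2, y2, w2, h2 = r2
    x_overlap = min(x1 + w1, x2 + w2) - max(x1, x2)
    y_overlap = min(y1 + h1, y2 + h2) - max(y1, y2)
    side_by_side = (x1 + w1 == x2 or x2 + w2 == x1) and y_overlap > 0
    stacked = (y1 + h1 == y2 or y2 + h2 == y1) and x_overlap > 0
    return side_by_side or stacked


def broken_links(rects):
    """rects[i] is the rectangle of pipeline stage i. Return the stage
    pairs (i, i + 1) whose rectangles do not abut."""
    return [(i, i + 1) for i in range(len(rects) - 1)
            if not abut(rects[i], rects[i + 1])]


# Hypothetical four-stage floorplan in which every consecutive pair abuts.
good = [(0, 1, 2, 3), (2, 3, 2, 1), (2, 1, 2, 2), (0, 0, 4, 1)]
print(broken_links(good))  # []

# Swapping the first two stage assignments leaves stage 0 away from stage 1's successor.
bad = [(2, 3, 2, 1), (0, 1, 2, 3), (2, 1, 2, 2), (0, 0, 4, 1)]
print(broken_links(bad))
```

A count of broken links like this could serve as one term in a floorplanning cost, although the cost metric actually used is described in the paper rather than in these slides.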

Floorplanning Results

• Block sizes created from fastest synthesized designs
• Each point represents the best score from 10 annealing runs
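The best-of-10 protocol can be sketched as below. Here `anneal_once` is a hypothetical callable standing in for one complete annealing run; it takes a seeded random generator and returns a (score, floorplan) tuple. Treating a lower score as better is also an assumption of this sketch.

```python
import random


def best_of_runs(anneal_once, runs=10, seed=0):
    """Run the annealer `runs` times with different seeds and keep the
    result with the lowest score."""
    results = [anneal_once(random.Random(seed + i)) for i in range(runs)]
    return min(results, key=lambda result: result[0])


# Toy stand-in for an annealing run: returns a random score and no floorplan.
def toy_anneal(rng):
    return rng.random(), None


score, floorplan = best_of_runs(toy_anneal)
print(score)
```

Seeding each run separately keeps the experiment repeatable while still sampling different annealing trajectories.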

Floorplan Utilization

[Chart: floorplan utilization (axis from 97.0% to 100.0%) for Classic and FoPA on the 1-D DCT, IDEA(short), IDEA(long), 2-D DCT, and LDPC designs]

Longest Wire Length

[Chart: longest wire length (axis from 0 to 14) for Classic and FoPA on the 1-D DCT, IDEA(short), IDEA(long), 2-D DCT, and LDPC designs]

Results after Placement and Routing

• Floorplans used as constraints in Monterey Design Systems' Dolphin
• Iteratively expand floorplans by 1% until routable
• Delay reported by Dolphin
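The expansion loop can be sketched as follows. The slide does not say whether the 1% applies to area or to each dimension, so this sketch assumes each step adds 1% of the original floorplan area. `is_routable` is a hypothetical callback standing in for a full place-and-route attempt; it is not a Dolphin API.

```python
def expand_until_routable(base_area, is_routable, step_percent=1, max_percent=100):
    """Grow the floorplan in steps of step_percent of the original area
    until is_routable(area) succeeds. Returns (added_percent, area)."""
    for added in range(0, max_percent + 1, step_percent):
        # Integer percentages avoid accumulating floating-point error.
        area = base_area * (100 + added) / 100
        if is_routable(area):
            return added, area
    raise RuntimeError("design not routable within max_percent expansion")


# Toy stand-in: pretend the design routes once the area reaches 1150 units.
added, area = expand_until_routable(1000, lambda a: a >= 1150)
print(added, area)  # 15 1150.0
```

The returned percentage corresponds to the "added area" figure reported for each design.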

Added Area

[Chart: added area (axis from 0.00% to 80.00%) for Unfloorplanned, Classic, and FoPA on the 1-D DCT, IDEA(short), IDEA(long), and 2-D DCT designs]

Added Delay

[Chart: added delay (axis from 0.00% to 30.00%) for Unfloorplanned, Classic, and FoPA on the 1-D DCT, IDEA(short), IDEA(long), and 2-D DCT designs]

Average Placed and Routed Results

              Unfloorplanned   Classic   Classic+LSP   FoPA
Added Area    19.95%           41.84%    37.96%        15.00%
Added Delay   12.60%           16.96%    15.51%        8.50%

Conclusions

• New restricted move set
  – Creates better placement of modules during floorplanning synthesis
  – Creates smaller and faster designs after placement and routing
• In paper
  – New wire length model
  – Cost metric