Coarse Grain Reconfigurable Architectures

Data-stream-based Computing, Enabling Technology for Reconfigurable Computing
Reiner Hartenstein*, Kaiserslautern University of Technology (TU Kaiserslautern)
*) IEEE fellow
Seminar by Prof. Dr. José Camargo da Costa
Friday, November 22, 2002, 17.00 hrs., ENE, UnB, Brasilia
>> Microelectronics History
Xputer Lab
TU Kaiserslautern
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
© 2002, [email protected]
http://hartenstein.de

The History of Paradigm Shifts
“Mainstream Silicon Application is switching every 10 Years”
“The Programmable System-on-a-Chip is the next wave”
[Figure: timeline in 10-year steps (1957, 1967, 1977, 1987, 1997, 2007), alternating between standard and custom products: TTL; LSI, MSI; µproc., memory; ASICs, accel’s. The 1st Design Crisis and the 2nd Design Crisis are marked.]

The Impact of Makimoto’s Paradigm Shifts
Dr. Makimoto: FPL 2000 keynote
• Personalization (CAD) before fabrication
• Software Industry’s Secret of Success: procedural personalization via RAM-based Machine Paradigm
• structural personalization: RAM-based, before run time
• Repeat Success Story by new Machine Paradigm!
[Figure: timeline 1957 to 2007, alternating between standard and custom products: TTL; LSI, MSI; µproc., memory; ASICs, accel’s.]

The next EDA Industry Revolution
“The Programmable System-on-a-Chip is the next wave” (Makimoto’s 3rd wave)
Von Neumann does not support Morphware.
EDA industry paradigm switching every 7 years (McKinsey Curves) [Keutzer / Newton]:
• 1978: Transistor entry: Applicon, Calma, CV ...
• 1985: Schematics entry: Daisy, Mentor, Valid ...
• 1992: Synthesis (HDLs): Cadence, Synopsys ...
• 1999: (Co-) Compilation: data-stream-based DPAs [Hartenstein]

Ubiquitous embedded systems
• 20 billion µprocessors (2001), > 90% in embedded systems
• 10 times more programmers will write embedded applications than computer software by 2010
• That’s where our graduates will go
Embedded systems means:
• hardware / software co-design
• configware / software co-design
• hardware / configware / software co-design

Embedded Systems Requirement: Hardware/Configware and Software as Alternatives
[Figure: algorithm partitioning between Hardware, Configware and Software, with three outcomes: Hardw/Configw only; Software only; Software & Hardw/Configw.]

>> fine grain and coarse grain Morphware

Top 4 FPGA Manufacturers 2000
Top 4 PLD Manufacturers 2000 (total: $3.7 Bio):
• Xilinx 42%
• Altera 37%
• Lattice 15%
• Actel 6%
• [Dataquest] > $7 billion by 2003.
• fastest growing segment of semiconductor market
• FPGAs going into every type of application – also SoC
You do not need specific silicon!
[Figure: NRE and mask cost [dataquest] and mask set cost [eASIC], cost in mio $ (axis 1 to >30), and no. of masks (12 12 16 20 26 28 30) versus feature size (0.8 0.6 0.35 0.25 0.18 0.15 0.13 0.1 0.07).]

Configware and EDA as the Key Enabler
• Growing no. of independent configware houses (soft IP core vendors) and design services provide libraries of "pre-fabricated" re-usable IP cores
• Emerging separate EDA software market: FPGA synthesis [2001: Dataquest]:
  – Synplicity 57%
  – Mentor 37%
  – Synopsys 7%

Throughput vs. Efficiency
[Figure: MOPS / mW (log scale, 0.001 to 1000) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07 µ). T. Claasen et al.: ISSCC 1999; *) R. Hartenstein: ISIS 1997. Insets: 1 Bit CLB, comparing the area used by the application with the resources needed for reconfigurability; wiring by abutment: 32 Bit example.]

Commercial rDPAs
• XPU family (IP cores), XPU128: PACT AG., Munich, http://pactcorp.com
• ACM: Quicksilver Tech

SNN filter KressArray Mapping Example
http://kressarray.de
• array size: 10 x 16 = 160 rDPUs
[Figure: mapping of the SNN filter onto the array, with data streams. Legend: rDPU not used; rDPU used for routing only (rout thru only); operator and routing; port location marker; backbus connect; backbus connect not used.]

KressArray Family generic Fabrics: a few examples
http://kressarray.de
• Select Function Repertory
• Select mode, number, width of NNports
• select Nearest Neighbour (NN) Interconnect: an example
• rout-through only, or rout-through and function
• more NNports: rich Rout Resources
• Examples of 2nd Level Interconnect: layouted over rDPU cell, no separate routing areas!
[Figure: rDPU examples labelled 2, 4, 8, 16, 24, 32.]

Antimatter of Computing is available
• Using FPGAs (fine grain morphware) has been just Logic Synthesis on a strange platform
• Coarse Grain rDPAs (Reconfigurable Computing): a fundamental Paradigm Shift
• up several abstraction levels

>> Anti Matter of Computing

The anti universe
• Paul Dirac predicted a complete anti universe consisting of antimatter
• “There are regions in the universe, which consist of antimatter ..... But there are asymmetries”
• when a particle hits its antiparticle, both are converted into energy: Annihilation
• We are not aware that there is a new area in computing sciences which consists of antimatter of computing
• Reconfigurable Computing is made from this antimatter: data-stream-based computing

anti particles
• 1928: Paul Dirac: “there should be an anti electron having positive charge” (Nobel Prize 1933)
• 1932: Carl David Anderson detected this “positron” in cosmic radiation (Nobel Prize 1936)
• 1954: new accelerators: cyclotron, like Berkeley’s Bevatron
• 1955: Owen Chamberlain et al. create anti proton on Bevatron
• 1956: anti neutron created on Bevatron
• 1965: creation of a deuterium anti nucleus at CERN
• 1995: hydrogen anti atom created at CERN – by forcing positron and anti proton to merge by very low energy.
[Figure: hydrogen and anti hydrogen.]

Matter & Antimatter: Atom and Anti Atom
• The World of Matter machine paradigm: the Atom
• Anti Matter machine paradigm: Anti Atom
[Figure: atom and anti atom with opposite charges (+ / -).]

Matter & Antimatter of Informatics: Machine and Anti Machine
• Machine paradigm: “von Neumann” (CPU)
• Anti Machine paradigm (DPU): novel compilation techniques
• 1936: 1st electronic computer (Konrad Zuse)
• 1946: v. N. machine paradigm
• 1971: 1st microprocessor (Ted Hoff)
• 1979: “data streams” (systolic array: Kung / Leiserson)
• 1990: anti machine paradigm published
• 1995: rDPA / DPSS (supersystolic: Rainer Kress)

Matter vs. antimatter: CPU vs. DPU
• CPU: instruction sequencer and Data Path, driven by an instruction stream
• DPU (Data Path Unit): Data Path driven by data streams
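
The contrast between an instruction-stream-driven CPU and a data-stream-driven DPU can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the Xputer design: the two-opcode instruction set, the function names `run_cpu` and `run_dpu`, and the sample memory contents are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical mini instruction: (opcode, operand address)
Instr = Tuple[str, int]


def run_cpu(program: List[Instr], memory: List[int]) -> int:
    """Instruction-stream style: a program counter selects each operation."""
    acc, pc = 0, 0
    while pc < len(program):
        op, addr = program[pc]
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        pc += 1  # the program counter is what gets manipulated
    return acc


def run_dpu(operator: Callable[[int, int], int],
            addresses: Iterable[int],
            memory: List[int]) -> int:
    """Data-stream style: the operator is fixed beforehand (configured);
    only a data counter moves and streams operands through it."""
    acc = 0
    for addr in addresses:  # the data counter is what gets manipulated
        acc = operator(acc, memory[addr])
    return acc


memory = [3, 1, 4, 1, 5]
cpu_sum = run_cpu(
    [("load", 0), ("add", 1), ("add", 2), ("add", 3), ("add", 4)], memory
)
dpu_sum = run_dpu(lambda a, b: a + b, range(len(memory)), memory)
```

Both calls compute the same sum. In the first, the work is described by a list of instructions; in the second, by one fixed operator plus an address sequence.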

heavy anti atoms: DPA = DPU array
• coherent data streams spinning around
[Figure: DPAs drawn as arrays of DPUs.]

Parallelism by Concurrency
• independent instruction streams: difficult ...

>> Anti Machine and its Resources

Dichotomy of machine paradigms
[Figure: on one side, a CPU with instruction sequencer, fed by an instruction stream from memory (M); on the other side, an (r)DPU or (r)DPA ((r)DPU Array), fed by data streams from memories (M, asM) with address generators.]

Terminology: DPU versus CPU ...
• DPU: data path unit
• DPA: DPU array
• GA: gate array
• rDPU: reconfigurable DPU
• rDPA: reconfigurable DPA
• rGA: reconfigurable GA
• DPU is no CPU: there is nothing central - like in a DPA
[Figure: CPU shown as a DPU with an instruction sequencer, next to DPU, rDPU and rDPA.]

Terminology: Digital System Platforms clearly distinguished
platform | source running on it | machine paradigm
hardware | (not running on it) | none
morphware, fine grain: rGA (FPGA) | configware | none
morphware, coarse grain: rDPU, rDPA | configware | none
reconfigurable data stream processor | flowware & configware | anti machine
data stream processor (hardwired) | flowware | anti machine
instruction stream processor | software | von Neumann machine

flowware defines ....
• ... which data item at which time at which port
• flowware manipulates the data counter(s) ...
• ... software manipulates the program counter
[Figure: input data streams and output data streams of a DPA, plotted over time and port #; the data items (x) at neighbouring ports are staggered in time.]
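
The idea that flowware fixes "which data item at which time at which port" can be sketched as a table keyed by (time, port). This is a minimal sketch under assumptions: the one-step skew per port is only one possible schedule, suggested by the staggered streams in the figure, and the names `skewed_schedule` and `render` are hypothetical.

```python
from typing import Dict, List, Tuple

# (time step, port number) -> data item
Schedule = Dict[Tuple[int, int], str]


def skewed_schedule(streams: List[List[str]]) -> Schedule:
    """Assumed schedule: item k of the stream at port p arrives at time k + p."""
    schedule: Schedule = {}
    for port, items in enumerate(streams):
        for k, item in enumerate(items):
            schedule[(k + port, port)] = item
    return schedule


def render(schedule: Schedule, ports: int) -> List[str]:
    """One text row per port; '-' marks a time step without a data item."""
    last = max(t for t, _ in schedule)
    return [
        " ".join(schedule.get((t, p), "-") for t in range(last + 1))
        for p in range(ports)
    ]


streams = [["a0", "a1", "a2"], ["b0", "b1", "b2"], ["c0", "c1", "c2"]]
schedule = skewed_schedule(streams)
rows = render(schedule, len(streams))
```

Each row of `rows` is the time line of one port, which mirrors the "- - x x x" rows of the slide.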

Configware / Flowware Compilation
[Figure: high level source program → intermediate → mapper → configware (for the rDPA, the r. Data Path Array) and scheduler → flowware (for the data sequencer: address generators, asM). A wrapper and memories (M) with data streams surround the rDPA.]

... for a Stream-based Soft Machine
[Figure: Compiler, Scheduler, Memory (data memory) split into memory banks, Sequencers (data stream generator), rDPA, “instructions”.]

JPEG zigzag scan pattern
Flowware language example (MoPL)

*> Declarations
EastScan is
  step by [1,0]
end EastScan;
SouthScan is
  step by [0,1]
end SouthScan;
NorthEastScan is
  loop 8 times until [*,1]
    step by [1,-1]
  endloop
end NorthEastScan;
SouthWestScan is
  loop 8 times until [1,*]
    step by [-1,1]
  endloop
end SouthWestScan;
HalfZigZag is
  EastScan
  loop 3 times
    SouthWestScan
    SouthScan
    NorthEastScan
    EastScan
  endloop
end HalfZigZag;

goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)

[Figure: the zigzag path over the x / y pixel map; each scan segment is driven by a data counter.]
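
To see which addresses the MoPL scan pattern visits, the sketch below replays it in Python. It is a hedged illustration, not a MoPL implementation: the class `DataCounter` and the helper names are hypothetical, the `until` test is assumed to be checked after each step, and `uturn` is assumed to mean replaying the recorded steps of HalfZigZag in reverse order.

```python
from typing import Callable, List, Tuple

Pos = Tuple[int, int]  # [x, y], 1-based like PixMap[1,1]


class DataCounter:
    """Hypothetical data counter that records every step it takes."""

    def __init__(self, x: int, y: int) -> None:
        self.pos: Pos = (x, y)
        self.steps: List[Tuple[int, int]] = []
        self.trace: List[Pos] = [(x, y)]

    def step(self, dx: int, dy: int) -> None:
        self.pos = (self.pos[0] + dx, self.pos[1] + dy)
        self.steps.append((dx, dy))
        self.trace.append(self.pos)

    def step_until(self, dx: int, dy: int,
                   stop: Callable[[Pos], bool], limit: int = 8) -> None:
        """'loop 8 times until ...': assumed to test the condition after each step."""
        for _ in range(limit):
            self.step(dx, dy)
            if stop(self.pos):
                break


def east_scan(c: DataCounter) -> None:
    c.step(1, 0)


def south_scan(c: DataCounter) -> None:
    c.step(0, 1)


def north_east_scan(c: DataCounter) -> None:
    c.step_until(1, -1, lambda p: p[1] == 1)   # until [*,1]


def south_west_scan(c: DataCounter) -> None:
    c.step_until(-1, 1, lambda p: p[0] == 1)   # until [1,*]


def half_zigzag(c: DataCounter) -> None:
    east_scan(c)
    for _ in range(3):
        south_west_scan(c)
        south_scan(c)
        north_east_scan(c)
        east_scan(c)


def uturn(c: DataCounter, steps: List[Tuple[int, int]]) -> None:
    """Assumption: uturn replays the given steps in reverse order."""
    for dx, dy in reversed(steps):
        c.step(dx, dy)


def jpeg_zigzag() -> List[Pos]:
    c = DataCounter(1, 1)        # goto PixMap[1,1]
    half_zigzag(c)
    half = list(c.steps)         # steps taken by HalfZigZag only
    south_west_scan(c)
    uturn(c, half)
    return c.trace


trace = jpeg_zigzag()
```

Under these assumptions `trace` holds 64 distinct positions of the 8 x 8 block, starting at [1,1] and ending at [8,8].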

GAG Slider Model
[Figure: Generic Address Generator (GAG) built from a Limit Stepper (L0), an Address Stepper (A) and a Base Stepper (B0); the sliders B0, A and L0 move between floor and ceiling.]

GAG Slider Operation Demo Example
[Figure: address range between floor and ceiling for the x and y dimensions, with slider labels F, C, B0, B, A, L0 and L.]

GAG Complex Sequencer Implementation
• GAU: Generic Addressing Unit
• all published in 1990
[Figure: GAUs and GAGs; each GAG combines a Limit Slider (L0), an Address Stepper (A) and a Base Slider (B0); also shown: VLIW stack, SDS.]

Generic Sequence Examples
• atomic scan
• linear scan
• video scan
• -90º rotated video scan
• -45º rotated (mirx (v scan))
• sheared video scan
• non-rectangular video scan
• zigzag video scan
• spiral scan
• feed-back-driven scans
• perfect shuffle
[Figure: example scan patterns a) to g), generated by a GAU with Limit Slider (L0), Address Stepper (A) and Base Slider (B0).]
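
Two of the generic sequences listed above, the linear scan and the video scan, can be sketched as simple address generators. This is an illustrative sketch only, not the GAG hardware: the parameters `base`, `step` and `limit` are hypothetical names that only loosely correspond to the base, address stepper and limit of the slider model, and the row-major memory layout in `window_addresses` is an assumption.

```python
from typing import Iterator, List, Tuple


def linear_scan(base: int, step: int, limit: int) -> Iterator[int]:
    """Yield base, base + step, ... while the address stays below limit (step > 0)."""
    addr = base
    while addr < limit:
        yield addr
        addr += step


def video_scan(width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Row-by-row 2-D scan built from two nested linear scans (y outer, x inner)."""
    for y in linear_scan(0, 1, height):
        for x in linear_scan(0, 1, width):
            yield (x, y)


def window_addresses(base: int, row_stride: int,
                     width: int, height: int) -> List[int]:
    """Linear memory addresses of a video scan over a window,
    assuming a row-major layout with row_stride words per row."""
    return [base + y * row_stride + x for x, y in video_scan(width, height)]


linear = list(linear_scan(4, 2, 10))
video = list(video_scan(2, 2))
window = window_addresses(100, 16, 3, 2)
```

The other listed patterns (rotated, sheared, zigzag, spiral) would differ only in how the two coordinates are stepped, which is the point of a generic generator.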

Storage scheme optimization: scanline unrolling
MoM anti machine architecture
• scan pattern (high level sequencing)
• intra scan window accesses (low level sequencing)
[Figure: scan window example with handle positions in the x / y plane; access patterns (r, r/w, w/r) on memory Bank a and Bank b for the initial design, after inner scan line loop unrolling, after scan line unrolling, and for the final design with hardw. level access optim.]

>> Problems to be solved

What is the trend ?
• Data-stream-based Computing is heading for mainstream
  – 1979 “data streams” (Kung / Leiserson)
  – 1997 SCCC (LANL) Streams-C Configurable Computing
  – SCORE (UCB) Stream Computations Organized for Reconfigurable Execution
  – ASPRC (UCB) Adapting Software Pipelining for Reconfigurable Computing
  – 2000 Bee (UCB), ...
  – Most stream-based multimedia systems, etc.
  – Many other areas ....
• vN is needed for embedded systems, OS, compilers, Sauerkraut software, non-performance-critical applications, others ….
• vN is obsolete for massive parallelism, except some special application areas
• Anti machine is the way to go for massive parallelism, also data-intensive applications
• Morphware is the way for high performance with short product life cycles, unstable standards

Conclusion: all knowledge needed is available
• machine paradigm
• languages
• hw / sw partitioning methodology
• compilation techniques
• anti architectural resources
• sequencing methodology: hw & sw
• parallel memory IP core and module generator vendors
• anything else needed
courses / embedded tutorials:
• DATE, Munich, 2001
• ASP-DAC, Yokohama, 2001
• SBCCI, Brasilia, 2001
full day courses:
• Univ. Montpellier 1998
• Nokia / Univ. Tampere, Finland, 2002
• CNRS Paris France, 2002
• UnB, Brasilia, 2002
10 keynotes 2001 / 2002
5 invited talks 2001 / 2002

Main problems to be solved
• Lack of qualified users and implementers
• Each programmer should have qualified awareness on dichotomy and morphware
• curricular innovations are urgently needed
• this dichotomy is completely ignored by our CS curricula
[Figure: computing in time versus computing in space (systolic arrays etc.), with migration by re-timing and other transformations.]

CS education .....
• Hardware / Software Co-Design?
• Configware / Software Co-Design?
[Figure: a hardware person (structural) and a software person (procedural).]

Annihilation?
• avoidable by careful methodology

… However, current CS Education …. is based on the Submarine Model
• This model disables ...
• Brain usage: procedural-only
• Hardware invisible: under the surface
[Figure: layers from top to bottom: Algorithm; procedural high level Programming Language; Assembly Language; Hardware.]

Hardware and Software as Alternatives
• Brain Usage: both Hemispheres (procedural and structural)
[Figure: algorithm partitioning between Hardware, Configware and Software, with three outcomes: Hardw/Configw only; Software only; Software & Hardw/Configw.]

The Dominance of the Submarine Model ...
• ... indicates, that our CS education system produces zillions of mentally disabled Persons
• … completely disabled to cope with solutions other than software only
• It’s time to attack the software faculty dictatorship. Get involved!
[Figure: submarine model; (procedural) above the surface, Hardware below: structurally disabled.]

Antimatter Search ?
• in EE & CS we do not need to search

thank you for your patience