Coarse Grain Reconfigurable Architectures

MSE 2005
International Conference on
Microelectronic Systems Education,
June 12, 2005, Anaheim, USA
in conjunction with DAC
Reiner Hartenstein
TU Kaiserslautern
Reconfigurable Computing
(RC) being Mainstream:
Torpedoed by Education
Apropos "reconfigurable": avoid confusing terminology
soft hardware → morphware [DARPA]
programming morphware: software → configware
programming data streams: software → flowware
© 2005, [email protected]
http://hartenstein.de
Programmer Education
for microelectronic systems
we also need “programmers”
but our CS curricula are obsolete:
the requirements of our
labor market are ignored
this is one of the reasons
for declining enrolment
young people find molecular
biology more fascinating
>> The Wrong Roadmap <<
• The Wrong Roadmap
• Our Curricula are obsolete
• The overdue new Basic Model
• Coarse Grain vs. Fine Grain
• Lobbying for RC Education
http://www.uni-kl.de
Objectives
of RC are acceleration,
flexibility, low cost, and
low power dissipation,
mainly in:
Supercomputing, HPC
(High Performance Computing)
Embedded Systems
N.N.: "Innovation
for Prosperity" (1)
“ ......
High performance computing [HPC] has been and will continue
to be a key ingredient in America’s innovation capacity. It
accelerates the innovation process by shrinking “time-to-insight” and “time-to-solution” for both discovery and
invention. ..... ”
N.N.: „Innovation
for Prosperity“ (2)
in high performance
computer architecture, however,
the innovation process has been
massively slowed down for a long time,
time-to-insight stalled
for more than a decade.
N.N.: „Innovation
for Prosperity“ (3)
impressive: all these respectable sponsors
the HPC Initiative:
how to force the wrong roadmap
onto the private economy
… understand only this parallelism solution:
the instruction-stream-based approach (von Neumann bottlenecks)
the data-stream-based approach has no von Neumann bottleneck
path of least resistance*:
avoiding a paradigm shift
*) [Michel Dubois]
Most researchers seem never to stop working on
sophisticated solutions for marginal improvements ...
... continuously ignoring methodologies promising
speed-ups by orders of magnitude ....
... continuing to bang their heads
against the memory wall
blinders to ignore the impact of morphware
instead of
They ignore that reconfigurability
has become mainstream
FPGAs: the fastest growing segment of
the microelectronics market
~ 6 billion US-$
found by Google:
term: FPGA
no. of links: ~ 1,580,000
By the way ...
... the oldest and largest conference in the field:
15th International Conference
on Field-Programmable Logic,
Reconfigurable Computing
and Applications (FPL)
http://fpl.org
August 24 – 26, 2005, Tampere, Finland
364 submissions!
... going into every type of application
Ignoring Reconfigurable Computing (RC)
is the completely wrong roadmap …
this has recently begun to change thanks to a
(growing) minority in the HPC area.
massively higher performance is obtained
by a fundamental paradigm shift.
Cray XD1
Ignoring Reconfigurable Computing
is the completely wrong roadmap …
… has delayed the time to insight
by more than a decade.
>> Our Curricula are obsolete <<
• The Wrong Roadmap
• Our Curricula are obsolete
• The overdue new Basic Model
• Coarse Grain vs. Fine Grain
• Lobbying for RC Education
Speech by Bill Gates at a summit meeting
of US state governors:
"American high schools are obsolete."
"Our high schools - even working exactly as designed - cannot teach our kids what they need to know today."
"The high schools of today teach kids about today's
computers like on a 50-year-old mainframe."
"Our high schools were designed 50 years ago
to meet the needs of another age."
"Without re-design for the needs of the 21st century,
we will keep limiting - even ruining - the lives of millions of Americans every year."
carved out of stone
The most important cultural revolution
since the invention of text characters:
it's not the mainframe
It is the Microchip!
Speech by Bill Gates at a summit meeting
of US state governors:
"American high schools are obsolete."
Find & replace: high schools → CS curricula
.... yields:
R. H. at MSE 2005 (and earlier):
"Our CS curricula are obsolete."
Our CS departments - even working exactly as designed - cannot teach our students what they need to know today.
The universities of today teach students about today's
computers like on a 50-year-old mainframe.
The basic paradigm was designed 50 years ago
to meet the needs of another age.
Without re-design for the needs of the 21st century, we
will keep limiting - even ruining - the lives of our graduates.
path of least resistance …
… also by academic curriculum committees
Computing Curricula 2004 (1)
Computing Curricula 2004 (2)
Joint Task Force for Computing Curricula 2004
Russell Shackelford, chair, CC2004 Task Force, chair, ACM Education Board.
James H. Cross II, Auburn University, past VP IEEE Computer Society’s EAB*.
Gordon Davies (retired) , U.K.’s Open University.
John Impagliazzo , Hofstra University
Richard LeBlanc, Georgia Tech, Vice Chair ACM Education Board, a Team Chair for
ABET’s Computing Accreditation Commission,
Barry Lunt, Brigham Young University
Andrew McGettrick, University of Strathclyde, Glasgow
Robert Sloan, Univ. of Illinois at Chicago, member, EAB IEEE Computer Society.
Heikki Topi , Bentley College, Waltham, MA.
*) Educational Activities Board
Computing Curricula 2004 (3)
2.2.1. Computing Curricula 2004 (4)
Within all 48 pages the term "reconfigurable" is not found, nor its synonyms.
The areas of configware and morphware are completely missing.
2.2.1. Computing Curricula 2004 (5)
… how it should be:
morphware and configware added (CONFIGWARE, MORPHWARE)
Computing Curricula 2004 (6)
The term "embedded systems" is almost ignored.
problem space seen: how it is
Computing Curricula 2004 (7)
problem space seen, with configware and morphware added:
Configware Methods
Computer Hardware and Morphware Architecture
obsolete (1)
general recommendations
are obsolete
CE recommendations
are obsolete
CE curricula
2.3.1. Computer Engineering (1)
Computing Curricula 2004 (8)
As published:
Computer engineering is concerned with the design and construction
of computers, and computer based systems. It involves the study of
hardware, software, communications, and the interaction among them.
Its curriculum focuses on the theories, principles, and practices of
relevant areas of traditional electrical engineering and mathematics,
and applies them to the problems of designing computers and the
many kinds of computer-based devices.
How it should be:
Computer engineering is concerned with the design and construction
of computers, and computer based systems. It involves the study of
hardware, morphware, software, configware, communications, and
the interaction among them.
CE curricula
2.3.1. Computer Engineering (2)
Computing Curricula 2004 (9)
As published:
Computer engineering students study the design of digital
hardware systems, including computers, communications systems,
and devices that contain computers. They also study software
development with a focus on the software used within and between
digital devices (not the software programs directly used by
computer users). The emphasis of the curriculum is on hardware
more than software, and it has a very strong engineering flavor.
How it should be:
Computer engineering students study the design of digital
hardware/morphware systems, including computers, communications
systems, and devices that contain computers. They also study software
and configware development with a focus on software and configware
used within and between digital devices (not the software programs
directly used by computer users). The emphasis of the curriculum is on
hardware/morphware more than software, and it has a very strong
engineering flavor.
CE curricula
2.3.1. Computer Engineering (3)
Computing Curricula 2004 (10)
As published:
Currently, a dominant area within computing engineering is embedded
systems, the development of devices that have software components
embedded in hardware. For example, devices such as cell phones, digital
audio players, digital video recorders, alarm systems, x-ray machines,
and laser surgical tools all require integration of hardware and
embedded software, and are all the result of computer engineering.
How it should be:
Currently, a dominant area within computing engineering is embedded
systems, the development of devices that have software and configware
components embedded in hardware and morphware. For example, devices
such as cell phones, digital audio players, digital video recorders, alarm
systems, x-ray machines, and laser surgical tools all require integration
of hardware, morphware, and embedded software, as well as embedded
configware, and are all the result of computer engineering.
Embedded Software
99% of all microprocessors are
used within embedded systems
the code for embedded software
doubles every 10 months
most programmers write embedded applications
> 90% by the year 2010
typical CS graduates are
not qualified for this labor market
obsolete (2)
general recommendations
are obsolete
CS recommendations
are obsolete
obsolete
von Neumann's
monopoly
inside curricula
is obsolete
>> The overdue new Basic Model <<
• The Wrong Roadmap
• Our Curricula are obsolete
• The overdue new Basic Model
• Coarse Grain vs. Fine Grain
• Lobbying for RC Education
3rd machine model became mainstream
[Figure: timeline 1957 to 2007 of three machine models: mainframe age (mainframe, instruction-stream-based), computer age / PC age (µProc. with accelerator), morphware age (µProc. with reconfigurable accelerator); further labels: compile, design, datastream-based]
modern FPGA bestsellers:
The new model is reality:
FPGA fabrics, together with
several µprocessors,
several memory banks,
and other IP cores,
on the same COTS microchip
Nick Tredennick’s Paradigm Shifts
explain the differences
Software Engineering (CPU):
  resources: fixed
  algorithm: variable
  1 programming source needed: software
Configware Engineering:
  resources: variable
  algorithm: variable
  2 programming sources needed: configware and flowware
Compilation: Software vs. Configware
Software Engineering: source program → software compiler → software code
Configware Engineering: source "program" → configware compiler (mapper: placement & routing; data scheduler) → configware code and flowware code
Terminology clean-up
Programming sources:
Configware: for configuring morphware (primarily non-von Neumann)
Flowware: for scheduling data streams (primarily non-von Neumann)
Software: for scheduling instruction streams (von Neumann)
>> Coarse Grain vs. Fine Grain <<
• The Wrong Roadmap
• Our Curricula are obsolete
• The overdue new Basic Model
• Coarse Grain vs. Fine Grain
• Lobbying for RC Education
coarse-grained reconfigurability
FPGAs are fine-grained by using ~ 1 bit
wide CLBs (configurable logic blocks)
coarse-grained reconfigurable platforms
use multi-bit wide PUs, e. g. ALU-like …
… are much more area-efficient than FPGAs
coarse grain reconfigurable
rDPU = reconfigurable datapath unit
is not a CPU
has no program counter
rDPA = reconfigurable datapath array
= array of rDPUs
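The rDPU / rDPA idea above can be sketched in a few lines of Python. This is only an illustrative toy model, not PACT's or any vendor's API: the names `RDPU`, `build_rdpa`, and `stream_through` are hypothetical, "configware" is reduced to a list of operators, and "flowware" to the input data stream. The point it tries to show is that each unit has a fixed, configured operator and no program counter; data simply streams through the array.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class RDPU:
    """Toy model of one reconfigurable datapath unit (hypothetical).

    It holds one configured operator and no program counter: it applies
    that operator to every item arriving on its input stream.
    """
    operator: Callable[[int], int]

    def run(self, stream: Iterable[int]) -> Iterator[int]:
        for item in stream:
            yield self.operator(item)


def build_rdpa(configware: List[Callable[[int], int]]) -> List[RDPU]:
    """Configure the array: one operator per rDPU (assumed linear pipe)."""
    return [RDPU(op) for op in configware]


def stream_through(rdpa: List[RDPU], flowware: Iterable[int]) -> List[int]:
    """Push the scheduled data stream through all rDPUs in order."""
    stream: Iterable[int] = flowware
    for unit in rdpa:
        stream = unit.run(stream)
    return list(stream)


# Configware: multiply by 3, add 1, keep the low 8 bits.
configware = [lambda x: x * 3, lambda x: x + 1, lambda x: x & 0xFF]
rdpa = build_rdpa(configware)
result = stream_through(rdpa, [1, 2, 100])
print(result)  # [4, 7, 45]
```

Changing the behaviour means changing `configware` (the structure), not writing an instruction stream; a real rDPA would of course be a two-dimensional array with routing, which this sketch leaves out.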
(r)DPA
commercial rDPA example:
PACT XPP - XPU128
XPP128 rDPA
• Full 32 or 24 Bit Design, working silicon
• 2 Configuration Hierarchies
• Evaluation Board available, and
• XDS Development Tool with Simulator
[Figure: XPP128 array of rDPUs (PAE core with ALU, Ctrl, CFG); buses not shown. © PACT AG, http://pactcorp.com]
rDPA (coarse grain) vs. FPGA (fine grain)
Status: ~1998
[Figure: rough comparison in orders of magnitude. Performance (MOPS/mW): µProc, DSP, FPGA, rDPA, hardwired. Area efficiency (transistors/chip): µProc, commodity FPGA, rDPA, hardwired.]
in special cases much higher acceleration factors!
>> Lobbying for RC Education <<
• The Wrong Roadmap
• Our Curricula are obsolete
• The overdue new Basic Model
• Coarse Grain vs. Fine Grain
• Lobbying for RC Education
growing awareness …
…that the impact of morphware
means a fundamental paradigm shift
demonstrated by Google:
term: Reconfigurable Computing
no. of links: ~ 68,400
more fascinating
Dual paradigm CS & CE curricula
including RC already for freshmen …
… could be more fascinating
to bring enrolment up again,
provide the qualifications needed
could make the qualifications
more offshoring-resistant
Lobbying for RC education
Special interest group with IEEE Computer Society
Launch proposals to EAB, IEEE Computer Society
Push for a Special Issue of COMPUTER magazine
Meetings ?
mid-July, Massachusetts Av, Washington, DC ?
end of August, FPL, Tampere, Finland ?
other proposals ?
get involved !
give me your business card
send me an e-mail
thank you
END
From Software to Configware Industry
Software
Industry
Growing Configware Industry
Repeat Success Story by
a 2nd Machine Paradigm !
Software Industry’s
Secret of Success
Procedural
personalization
via RAM-based 1)
.
2) Machine Paradigm
structural personalization: RAM-based anti machine
[Figure: timeline 1957 to 2007: computer age (PC age) with µProc., morphware age with rDPA; further label: compile]
Software / Configware Co-Compilation
Juergen Becker’s CoDe-X, 1996
[Figure: high level PL source → Partitioner (with Analyzer / Profiler) → SW compiler ("vN" machine paradigm) → SW code, and → CW compiler (anti machine paradigm) → CW code and FW code; Resource Parameters: supporting different platforms]
speed-up examples from 2004 & earlier
key issue: algorithmic cleverness
platform | application example | speed-up factor | method
PACT Xtreme 4-by-4 array [2003] | 16 tap FIR filter | x16 MOPS/mW | straight forward
MoM anti machine with DPLA* [1983] | grid-based DRC**, 1-metal 1-poly nMOS***, 256 reference patterns | > x1000 (computation time) | multiple aspects
CPU 2 FPGA [FPGA 2004] | migrate several simple application examples | x7 – x46 (compute time) | hi level synthesis
DSP 2 FPGA [Xilinx 2004] 2) | from fastest DSP: 10 gMACs to 1 teraMAC | x100 (compute time) | not spec.
*) DPLA: MPC fabr. via E.I.S. multi univ. project
**) Design Rule Check
***) for 10-metal 3-poly cMOS expected: >> x10,000
2) Wim Roelandts
section of a major pipe network on rDPU
hypothetical branching example
to illustrate time-to-space migration
S = R + (if C then A else B endif);
[Figure: rDPU datapath with inputs R, B, A and condition C = 1; an adder (+) produces S; clock 200 MHz (5 nanosec)]
simple conservative CPU example
                                memory cycles   nanoseconds
if C then read A
  read instruction              1               100
  instruction decoding
  read operand*                 1               100
  operate & reg. transfers
if not C then read B
  read instruction              1               100
  instruction decoding
add & store
  read instruction              1               100
  instruction decoding
  operate & reg. transfers
  store result                  1               100
total                           5               500
*) if no intermed. storage in register file
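The arithmetic behind this comparison can be replayed in a short Python sketch. The 100 ns memory cycle, the five memory cycles, and the 5 ns (200 MHz) clock come from the slide; the function names and the assumption that the rDPU delivers one result per clock step are illustrative assumptions, not measured data.

```python
MEMORY_CYCLE_NS = 100  # from the slide: 100 ns per memory cycle
RDPU_CLOCK_NS = 5      # from the slide: 200 MHz clock, i.e. 5 ns


def cpu_version(r, a, b, c):
    """Instruction-stream version; counts memory cycles as on the slide."""
    cycles = 0
    cycles += 1  # read instruction ("if C then read A")
    cycles += 1  # read operand (no intermediate storage in register file)
    operand = a if c else b
    cycles += 1  # read instruction ("if not C then read B")
    cycles += 1  # read instruction ("add & store")
    s = r + operand
    cycles += 1  # store result
    return s, cycles * MEMORY_CYCLE_NS


def rdpu_version(r, a, b, c):
    """Datapath version: a multiplexer feeding an adder.

    Assumption: the configured pipe delivers one result per clock step.
    """
    s = r + (a if c else b)
    return s, RDPU_CLOCK_NS


s_cpu, t_cpu = cpu_version(10, 3, 4, True)
s_rdpu, t_rdpu = rdpu_version(10, 3, 4, True)
ratio = t_cpu / t_rdpu
print(s_cpu, t_cpu, s_rdpu, t_rdpu, ratio)  # 13 500 13 5 100.0
```

Under these assumptions both versions compute the same S, but the instruction-stream version spends 500 ns on memory cycles while the datapath version needs one 5 ns step, a factor of 100 for this toy case.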
moving data around inside the Earth Simulator
Crossbar weight: 220 t, 3000 km of cable
5120 Processors, 5000 pins each
ES 20: TFLOPS
data are moved around by software
i.e. by memory-cycle-hungry instruction
streams which fully hit the memory wall
(slower than CPU clock by 2 orders of magnitude)
extremely
unbalanced
stolen from Bob Colwell