Transactors - University of California, Berkeley
SMASH: The C++ Layer
Krste Asanovic
MIT Computer Science and Artificial Intelligence Laboratory
http://cag.csail.mit.edu/scale
RAMP Retreat, UC Berkeley
January 11, 2007
SMASH: SiMulation And SyntHesis
Goal:
One framework for both architectural exploration and chip design and verification
Approach:
High-level design discipline where the design is expressed as a network of transactors (transactional actors)
Transactors (aka units) are refined down to RTL implementations
Design structure is preserved during refinement
From my perspective, RDL & RAMP are pieces of SMASH
Transactor Anatomy
Transactor unit comprises:
Architectural state (registers + RAMs)
Input queues and output queues connected to other units
Transactions (guarded atomic actions on state and queues)
Scheduler (selects next ready transaction to run)
[Figure: a transactor, with input queues feeding transactions chosen by a scheduler, operating on architectural state and writing to output queues]
Advantages
Handles non-deterministic inputs
Allows concurrent operations on mutable state within unit
Natural representation for formal verification
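The anatomy above can be made concrete with a small standalone sketch. It does not use the SMASH library: `Fifo`, `Transaction`, and `Merge` are hypothetical names chosen for illustration, and the round-robin scheduler is just one possible policy. The unit merges two input queues into one output queue, so which input gets served depends on what has arrived, a simple case of non-deterministic inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical bounded FIFO standing in for a transactor queue.
struct Fifo {
    explicit Fifo(std::size_t cap) : capacity(cap) {}
    bool enqRdy() const { return items.size() < capacity; }
    bool deqRdy() const { return !items.empty(); }
    int first() const { assert(deqRdy()); return items.front(); }
    void enq(int v) { assert(enqRdy()); items.push_back(v); }
    void deq() { assert(deqRdy()); items.pop_front(); }
    std::size_t capacity;
    std::deque<int> items;
};

// A transaction pairs a guard with an action that runs atomically
// once the guard holds.
struct Transaction {
    std::function<bool()> guard;
    std::function<void()> action;
};

// Hypothetical two-input merge unit.
struct Merge {
    Fifo inA{4}, inB{4}, out{8};     // input and output queues
    int forwarded = 0;               // architectural state
    std::size_t next = 0;            // scheduler state (round-robin pointer)
    std::vector<Transaction> xacts;

    Merge() {
        xacts.push_back(Transaction{
            [this] { return inA.deqRdy() && out.enqRdy(); },
            [this] { out.enq(inA.first()); inA.deq(); ++forwarded; }});
        xacts.push_back(Transaction{
            [this] { return inB.deqRdy() && out.enqRdy(); },
            [this] { out.enq(inB.first()); inB.deq(); ++forwarded; }});
    }
    // The lambdas capture `this`, so copying would leave dangling guards.
    Merge(const Merge&) = delete;
    Merge& operator=(const Merge&) = delete;

    // Scheduler: fire at most one ready transaction per tick,
    // starting from the one after the last that fired.
    bool tick() {
        for (std::size_t i = 0; i < xacts.size(); ++i) {
            std::size_t k = (next + i) % xacts.size();
            if (xacts[k].guard()) {
                xacts[k].action();
                next = (k + 1) % xacts.size();
                return true;
            }
        }
        return false;
    }
};
```

Calling `tick()` until it returns false drains both inputs. Because each action only runs after its guard has been checked, no transaction ever sees a partially updated queue or counter.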
RAMP Design Framework Overview
Target System: the machine being emulated
[Figure: target system of CPUs and DRAM connected by an interconnect network]
• Describe structure as transactor netlist in RAMP Description Language (RDL)
• Describe behavior of each leaf unit in favorite language (Verilog, VHDL, Bluespec, C/C++, Java)
Host Platforms: systems that run the emulation or simulation
[Figure: RDL compiled to FPGA emulation on a BEE2 host platform (2VP70 FPGAs), or to software simulation on a workstation host platform]
• Can have part of target mapped to FPGA emulation and part mapped to software simulation
SMASH/C++ is the way to write leaf units in C++, either
for use with RDL or for standalone C++ simulation
What’s in SMASH/C++?
A C++ class library plus conventions for writing transactor leaf units
– These should work within an RDL-generated C++ harness
In addition, libraries for channels, configuration, and parameter passing to support standalone C++ elaboration and simulation
Also, can convert HDL modules into C++ units for co-simulation
– Verilog -> C++ using either Verilator or Tenison VTOC
– Bluespec -> C++ using Bluespec Csim
Why C++? I thought RAMP was FPGAs?
Initial design in C++, eventually mapped into RTL
Much faster to spin C++ design than to spin FPGA design
Hardware verification needs golden model
Some units might only ever be software
– Power/temperature models
– Disk models
SMASH/C++ Code Example – Leaf Unit
struct Increment : public smash::IUnit_LeafImpl {
  // Parameters
  static const smash::Parameter<int> inc_amount;

  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  // Look up the increment amount (second argument is presumably the default)
  void elaborate( smash::ParameterList& plist )
  {
    m_inc = plist.get( Increment::inc_amount, 1 );
  }

  // Try the transactions in order; at most one fires per tick
  bool tick()
  {
    if ( xactInc() ) return true;
    else if ( xactBumpInc() ) return true;
    return false;
  }

private:
  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Private state
  int m_inc;

  // Private transactions…
};
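The slides do not show how `smash::ParameterList` stores or resolves parameters. The sketch below is one plausible scheme, not the SMASH implementation: `IntParam` and `ParamList` are hypothetical stand-ins, restricted to `int` values, that key each value by a unit's hierarchical path plus the parameter name and fall back to a caller-supplied default.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for smash::Parameter<int>: a named key.
struct IntParam {
    std::string name;
};

// Hypothetical stand-in for smash::ParameterList (int values only).
class ParamList {
public:
    // Record a value for one parameter of one unit instance.
    void set(const std::string& unitPath, const IntParam& p, int value) {
        values_[std::make_pair(unitPath, p.name)] = value;
    }

    // Return the recorded value, or `fallback` if none was set.
    int get(const std::string& unitPath, const IntParam& p,
            int fallback) const {
        auto it = values_.find(std::make_pair(unitPath, p.name));
        return it == values_.end() ? fallback : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, int> values_;
};
```

With this scheme, a unit whose path has no entry simply picks up the default, which is the behavior the `plist.get( Increment::inc_amount, 1 )` call in the leaf unit appears to rely on.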
Example Leaf Unit Transactions
// Forward a non-zero message, adding the current increment
bool xactInc()
{
  bool xactIncFired
    = m_in.deqRdy() && m_out.enqRdy()
      && (m_in.first() != 0);
  if ( !xactIncFired )
    return false;
  m_out.enq( m_in.first() + m_inc );
  m_in.deq();
  return true;
}

// Consume a zero message and bump the increment by one
bool xactBumpInc()
{
  bool xactBumpIncFired
    = m_in.deqRdy() && (m_in.first() == 0);
  if ( !xactBumpIncFired )
    return false;
  m_inc += 1;
  m_in.deq();
  return true;
}
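To see what these two transactions do to a stream of messages, the sketch below runs them against a stand-in queue. `MockPort`, `MockIncrement`, and `runIncrement` are hypothetical names; the real `smash::InputPort` and `smash::OutputPort` are not shown in the slides, so only the five calls used above (`enqRdy`, `deqRdy`, `first`, `enq`, `deq`) are modeled, with plain `int` messages in place of `IntMsg`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the SMASH port classes.
struct MockPort {
    std::deque<int> q;
    std::size_t capacity = 16;
    bool enqRdy() const { return q.size() < capacity; }
    bool deqRdy() const { return !q.empty(); }
    int first() const { assert(deqRdy()); return q.front(); }
    void enq(int v) { assert(enqRdy()); q.push_back(v); }
    void deq() { assert(deqRdy()); q.pop_front(); }
};

// Same guards and actions as the slide's Increment unit.
struct MockIncrement {
    MockPort in, out;
    int inc = 1;  // plays the role of m_inc with its default of 1

    bool xactInc() {
        if (!(in.deqRdy() && out.enqRdy() && in.first() != 0)) return false;
        out.enq(in.first() + inc);
        in.deq();
        return true;
    }
    bool xactBumpInc() {
        if (!(in.deqRdy() && in.first() == 0)) return false;
        inc += 1;
        in.deq();
        return true;
    }
    bool tick() { return xactInc() || xactBumpInc(); }
};

// Feed `inputs` through a single unit and collect everything it emits.
std::vector<int> runIncrement(const std::vector<int>& inputs) {
    MockIncrement unit;
    std::vector<int> outputs;
    std::size_t i = 0;
    while (i < inputs.size() || unit.in.deqRdy()) {
        if (i < inputs.size() && unit.in.enqRdy()) unit.in.enq(inputs[i++]);
        unit.tick();
        while (unit.out.deqRdy()) {
            outputs.push_back(unit.out.first());
            unit.out.deq();
        }
    }
    return outputs;
}
```

For the input sequence 1, 2, 0, 3, 4, 0, 1, 2, 3, 4, a single unit starting with an increment of 1 yields 2, 3, 5, 6, 4, 5, 6, 7: each zero is swallowed and raises the increment applied to every later message.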
SMASH/C++ Example: Structural Unit
struct IncPipe : public smash::IUnit_StructuralImpl {
  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    // Register ports, child units, and channels under their instance names
    regPort   ( "in",      &m_in      );
    regUnit   ( "incA",    &m_incA    );
    regChannel( "inc2inc", &m_inc2inc );
    regUnit   ( "incB",    &m_incB    );
    regPort   ( "out",     &m_out     );
    elaborateChildUnits( plist );

    // Connect child units and channels
    smash::connect( m_in, m_incA.in() );
    smash::connect( m_incA.out(), m_inc2inc, m_incB.in() );
    smash::connect( m_incB.out(), m_out );
  }

private:
  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Child units and channels
  Increment m_incA;
  Increment m_incB;
  smash::SimpleChannel<IntMsg> m_inc2inc;
};
[Figure: IncPipe, with InputPort "in" -> Incrementer "incA" -> SimpleChannel "inc2inc" -> Incrementer "incB" -> OutputPort "out"]
SMASH/C++ Example: Simulation Loop
#include <iostream>

int main( int argc, char* argv[] ) {
  // Toplevel channels and unit
  smash::SimpleChannel<IntMsg> iChannel( "iChannel", 32, 3, 7 );
  smash::SimpleChannel<IntMsg> oChannel( "oChannel", 32, 3, 7 );
  IncPipe incPipe; incPipe.setName( "top" );

  // Set some parameters and elaborate the design
  smash::ParameterList plist;
  plist.set( "top.incB", Increment::inc_amount, 2 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::bandwidth, 32 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::latency, 3 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::buffering, 7 );
  incPipe.elaborate( plist );

  // Connect the toplevel channels to the toplevel unit
  smash::connect( iChannel, incPipe.in() );
  smash::connect( incPipe.out(), oChannel );

  // Simulation loop
  int testInputs[] = { 1, 2, 0, 3, 4, 0, 1, 2, 3, 4 };
  int inputIndex = 0;
  for ( int cycle = 0; cycle < 20; cycle++ ) {
    if ( iChannel.enqRdy() && (inputIndex < 10) )
      iChannel.enq( IntMsg(testInputs[inputIndex++]) );
    if ( oChannel.deqRdy() ) {
      std::cout << oChannel.first() << std::endl;
      oChannel.deq();
    }
    // Always tick units before channels
    incPipe.tick();   // Hierarchical tick
    iChannel.tick();
    oChannel.tick();
  }
}
[Figure: "iChannel" -> IncPipe "top" (Incrementer "incA" -> SimpleChannel "inc2inc" -> Incrementer "incB") -> "oChannel"]
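The constructor arguments 32, 3, 7 appear to correspond to the channel's bandwidth, latency, and buffering parameters, but the slides do not define their semantics. The sketch below is one possible reading of latency and buffering only (bandwidth is left out): `LatencyChannel` and `cyclesUntilVisible` are hypothetical, not the real `smash::SimpleChannel`. A message becomes visible to the consumer a fixed number of channel ticks after it is enqueued, and at most `buffering` messages may be in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical channel model with fixed latency and bounded buffering.
class LatencyChannel {
public:
    LatencyChannel(int latency, std::size_t buffering)
        : latency_(latency), buffering_(buffering), now_(0) {}

    bool enqRdy() const { return inFlight_.size() < buffering_; }
    void enq(int value) {
        assert(enqRdy());
        // Stamp the message with the cycle at which it becomes visible.
        inFlight_.push_back(Entry{value, now_ + latency_});
    }
    bool deqRdy() const {
        return !inFlight_.empty() && inFlight_.front().readyAt <= now_;
    }
    int first() const { assert(deqRdy()); return inFlight_.front().value; }
    void deq() { assert(deqRdy()); inFlight_.pop_front(); }

    // Advance the channel's notion of time by one cycle.
    void tick() { ++now_; }

private:
    struct Entry { int value; long readyAt; };
    int latency_;
    std::size_t buffering_;
    long now_;
    std::deque<Entry> inFlight_;
};

// Count how many channel ticks pass before an enqueued message is visible.
int cyclesUntilVisible(int latency) {
    LatencyChannel ch(latency, 4);
    ch.enq(42);
    int cycles = 0;
    while (!ch.deqRdy()) {
        ch.tick();
        ++cycles;
    }
    return cycles;
}
```

In this model, with a latency of at least one, ticking every unit before any channel means all units in a cycle observe the same channel contents: messages enqueued during the cycle only become visible after the channel's clock advances. That is one plausible motivation for the "units before channels" rule, not a statement of how SMASH implements it.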
Why didn’t we just use SystemC?
If you’re asking, you haven’t read the SystemC standard
Ugly semantics
Too many ways of doing the same thing
Fundamental assumption is that host is sequential
– SMASH/C++ designed to support parallel hosts
Even worse, simulator is a global object (can’t have two
engines in one executable)
In industry, architects use SystemC, but hardware designers ignore it when building chips
Issues
Need to figure out flexible type system and bindings from
RDL into C++/Bluespec/Verilog
Need to figure out common (across C++/RDL/Bluespec)
interfaces/syntax for
– Elaboration
– Configuration
– Debugging
– Monitoring/Tracing