Transactors - University of California, Berkeley


SMASH: The C++ Layer
Krste Asanovic
[email protected]
MIT Computer Science and Artificial Intelligence Laboratory
http://cag.csail.mit.edu/scale
RAMP Retreat, UC Berkeley
January 11, 2007
SMASH: SiMulation And SyntHesis
Goal:
• One framework for both architectural exploration and chip design and verification
Approach:
• High-level design discipline where design expressed as network of transactors (transactional actor)
• Transactors (aka units) refined down to RTL implementations
• Design structure preserved during refinement
• From my perspective, RDL & RAMP are pieces of SMASH
Transactor Anatomy
Transactor unit comprises:
• Architectural state (registers + RAMs)
• Input queues and output queues connected to other units
• Transactions (guarded atomic actions on state and queues)
• Scheduler (selects next ready transaction to run)
Advantages
• Handles non-deterministic inputs
• Allows concurrent operations on mutable state within unit
• Natural representation for formal verification
[Figure: a transactor, with input queues feeding transactions selected by a scheduler, operating on architectural state and feeding output queues]
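The anatomy above can be sketched in plain C++. This is a minimal illustration, not SMASH code: `Fifo`, `Transaction`, `Accumulator`, and `runAccumulator` are hypothetical names invented here, and the scheduling policy (fire the first ready transaction, at most one per tick) is an assumption chosen for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical bounded FIFO standing in for an input or output queue.
struct Fifo {
  explicit Fifo(std::size_t capacity) : cap(capacity) {}
  bool enqRdy() const { return q.size() < cap; }
  bool deqRdy() const { return !q.empty(); }
  void enq(int v) { assert(enqRdy()); q.push_back(v); }
  int first() const { assert(deqRdy()); return q.front(); }
  void deq() { assert(deqRdy()); q.pop_front(); }
  std::deque<int> q;
  std::size_t cap;
};

// A transaction is a guard plus an action; the action runs only if the guard holds.
struct Transaction {
  std::function<bool()> guard;
  std::function<void()> action;
};

// Hypothetical transactor: keeps a running sum of nonzero inputs; a zero resets it.
struct Accumulator {
  Fifo in{4};
  Fifo out{4};
  int sum = 0;  // architectural state
  std::vector<Transaction> xacts;

  Accumulator() {
    // Transaction 0: a zero input clears the sum and produces no output.
    xacts.push_back({[this] { return in.deqRdy() && in.first() == 0; },
                     [this] { sum = 0; in.deq(); }});
    // Transaction 1: a nonzero input is added and the new sum is sent out.
    xacts.push_back({[this] { return in.deqRdy() && out.enqRdy() && in.first() != 0; },
                     [this] { sum += in.first(); out.enq(sum); in.deq(); }});
  }
  Accumulator(const Accumulator&) = delete;  // the lambdas capture `this`

  // Scheduler: fire the first ready transaction, at most one per tick.
  bool tick() {
    for (auto& x : xacts) {
      if (x.guard()) { x.action(); return true; }
    }
    return false;
  }
};

// Drives the unit: feed one input per cycle when possible, tick, drain the output.
std::vector<int> runAccumulator(const std::vector<int>& inputs) {
  Accumulator acc;
  std::vector<int> outputs;
  std::size_t next = 0;
  while (next < inputs.size() || acc.in.deqRdy() || acc.out.deqRdy()) {
    if (next < inputs.size() && acc.in.enqRdy()) acc.in.enq(inputs[next++]);
    acc.tick();
    if (acc.out.deqRdy()) { outputs.push_back(acc.out.first()); acc.out.deq(); }
  }
  return outputs;
}
```

Because each guard checks queue readiness before its action touches a queue, an action either runs to completion or does not run at all, which is the sense in which the transactions are atomic.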
RAMP Design Framework Overview
Target System: the machine being emulated
• Describe structure as transactor netlist in RAMP Description Language (RDL)
• Describe behavior of each leaf unit in favorite language (Verilog, VHDL, Bluespec, C/C++, Java)
Host Platforms: systems that run the emulation or simulation
• Can have part of target mapped to FPGA emulation and part mapped to software simulation
[Figure: a target system of CPUs, an interconnect network, and DRAM; RDL compiled to FPGA emulation on the BEE2 host platform (2VP70 FPGAs) and RDL compiled to software simulation on a workstation host platform]
SMASH/C++ is the way to write leaf units in C++, either
for use with RDL or for standalone C++ simulation
What’s in SMASH/C++?
• A C++ class library plus conventions for writing transactor leaf units
– These should work within an RDL-generated C++ harness
• In addition, libraries for channels, configuration, and parameter-passing code to support standalone C++ elaboration and simulation
• Also, can convert HDL modules into C++ units for co-simulation
– Verilog -> C++ using either Verilator or Tenison VTOC
– Bluespec -> C++ using Bluespec Csim
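To make the parameter-passing idea concrete, here is a small sketch of a store keyed by a hierarchical unit path plus a parameter name, with a default used when nothing was configured. It mirrors the shape of calls such as `plist.get(Increment::inc_amount, 1)` in the slides, but `ParamStore`, its string keys, and `sumOfIncrements` are assumptions made for illustration; the real `smash::ParameterList` interface is not shown in this deck beyond those calls.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical parameter store keyed by (unit path, parameter name).
class ParamStore {
 public:
  void set(const std::string& path, const std::string& name, int value) {
    assert(!name.empty());  // a parameter needs a name
    values_[std::make_pair(path, name)] = value;
  }

  // Returns the configured value, or `fallback` when nothing was set.
  int get(const std::string& path, const std::string& name, int fallback) const {
    auto it = values_.find(std::make_pair(path, name));
    return it == values_.end() ? fallback : it->second;
  }

 private:
  std::map<std::pair<std::string, std::string>, int> values_;
};

// Usage: "top.incB" is configured to 2, while "top.incA" falls back to 1.
int sumOfIncrements() {
  ParamStore plist;
  plist.set("top.incB", "inc_amount", 2);
  return plist.get("top.incA", "inc_amount", 1) +
         plist.get("top.incB", "inc_amount", 1);
}
```

Keying on the full path lets two instances of the same unit type carry different settings, and the fallback keeps unconfigured units usable.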
Why C++? I thought RAMP was FPGAs?
• Initial design in C++, eventually mapped into RTL
• Much faster to spin C++ design than to spin FPGA design
• Hardware verification needs golden model
• Some units might only ever be software
– Power/temperature models
– Disk models
SMASH/C++ Code Example – Leaf Unit
struct Increment : public smash::IUnit_LeafImpl {
  // Parameters
  static const smash::Parameter<int> inc_amount;

  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    m_inc = plist.get( Increment::inc_amount, 1 );
  }

  bool tick()
  {
    if ( xactInc() ) return true;
    else if ( xactBumpInc() ) return true;
    return false;
  }

 private:
  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Private state
  int m_inc;

  // Private transactions…
Example Leaf Unit Transactions
  // Fires on a nonzero input when the output has room:
  // forwards the input plus the current increment.
  bool xactInc()
  {
    bool xactIncFired
      = m_in.deqRdy() && m_out.enqRdy()
        && (m_in.first() != 0);
    if ( !xactIncFired )
      return false;
    m_out.enq( m_in.first() + m_inc );
    m_in.deq();
    return true;
  }

  // Fires on a zero input: consumes it and bumps the increment by one.
  bool xactBumpInc()
  {
    bool xactBumpIncFired
      = m_in.deqRdy() && (m_in.first() == 0);
    if ( !xactBumpIncFired )
      return false;
    m_inc += 1;
    m_in.deq();
    return true;
  }
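To see what the two transactions do to a stream of values, the following standalone sketch re-creates the unit with `std::queue` in place of the smash ports. `IncrementModel` and `traceIncrement` are hypothetical names. The queues here are unbounded, so the `enqRdy()` check from the slide is always true and is omitted.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Stand-in for the slide's Increment unit; std::queue replaces the smash ports
// (an assumption made so the sketch compiles on its own).
struct IncrementModel {
  std::queue<int> in;
  std::queue<int> out;
  int inc = 1;  // same default as plist.get(Increment::inc_amount, 1)

  // Nonzero input: forward input + inc.
  bool xactInc() {
    if (in.empty() || in.front() == 0) return false;
    out.push(in.front() + inc);
    in.pop();
    return true;
  }

  // Zero input: consume it and bump the increment.
  bool xactBumpInc() {
    if (in.empty() || in.front() != 0) return false;
    inc += 1;
    in.pop();
    return true;
  }

  // At most one transaction fires per tick, as in the slide's tick().
  bool tick() { return xactInc() || xactBumpInc(); }
};

// Feeds all inputs, ticks until nothing can fire, and returns the outputs in order.
std::vector<int> traceIncrement(const std::vector<int>& inputs) {
  IncrementModel unit;
  for (int v : inputs) unit.in.push(v);
  while (unit.tick()) {}
  assert(unit.in.empty());  // with unbounded output, every input is consumed
  std::vector<int> outputs;
  while (!unit.out.empty()) {
    outputs.push_back(unit.out.front());
    unit.out.pop();
  }
  return outputs;
}
```

For the input sequence `1, 2, 0, 3, 4, 0, 1, 2, 3, 4`, this model yields `2, 3, 5, 6, 4, 5, 6, 7`: each zero is absorbed and raises the increment for the values that follow. The two guards are mutually exclusive (`first() != 0` versus `first() == 0`), so the order in which `tick()` tries them does not affect the result.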
SMASH/C++ Example: Structural Unit
struct IncPipe : public smash::IUnit_StructuralImpl {
  // Port functions
  smash::InputPort<IntMsg>& in() { return m_in; }
  smash::OutputPort<IntMsg>& out() { return m_out; }

  void elaborate( smash::ParameterList& plist )
  {
    regPort   ( "in",      &m_in      );
    regUnit   ( "incA",    &m_incA    );
    regChannel( "inc2inc", &m_inc2inc );
    regUnit   ( "incB",    &m_incB    );
    regPort   ( "out",     &m_out     );
    elaborateChildUnits( plist );

    // Connect child units and channels
    smash::connect( m_in, m_incA.in() );
    smash::connect( m_incA.out(), m_inc2inc, m_incB.in() );
    smash::connect( m_incB.out(), m_out );
  }

 private:
  // Ports
  smash::InputPort<IntMsg> m_in;
  smash::OutputPort<IntMsg> m_out;

  // Child units and channels
  Increment m_incA;
  Increment m_incB;
  smash::SimpleChannel<IntMsg> m_inc2inc;
};
[Figure: IncPipe, with InputPort "in" feeding Incrementer "incA", SimpleChannel "inc2inc", Incrementer "incB", and OutputPort "out"]
SMASH/C++ Example: Simulation Loop
#include <iostream>

int main( int argc, char* argv[] ) {
  // Toplevel channels and unit
  smash::SimpleChannel<IntMsg> iChannel( "iChannel", 32, 3, 7 );
  smash::SimpleChannel<IntMsg> oChannel( "oChannel", 32, 3, 7 );
  IncPipe incPipe;
  incPipe.setName( "top" );

  // Set some parameters and elaborate the design
  smash::ParameterList plist;
  plist.set( "top.incB", Increment::inc_amount, 2 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::bandwidth, 32 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::latency, 3 );
  plist.set( "top.inc2inc", smash::SimpleChannel<IntMsg>::buffering, 7 );
  incPipe.elaborate( plist );

  // Connect the toplevel channels to the toplevel unit
  smash::connect( iChannel, incPipe.in() );
  smash::connect( incPipe.out(), oChannel );

  // Simulation loop
  int testInputs[] = { 1, 2, 0, 3, 4, 0, 1, 2, 3, 4 };
  int inputIndex = 0;
  for ( int cycle = 0; cycle < 20; cycle++ ) {
    if ( iChannel.enqRdy() && (inputIndex < 10) )
      iChannel.enq( IntMsg(testInputs[inputIndex++]) );
    if ( oChannel.deqRdy() ) {
      std::cout << oChannel.first() << std::endl;
      oChannel.deq();
    }
    // Always tick units before channels
    incPipe.tick();   // Hierarchical tick
    iChannel.tick();
    oChannel.tick();
  }
}
[Figure: SimpleChannel "iChannel" feeding IncPipe "top" (Incrementer "incA", SimpleChannel "inc2inc", Incrementer "incB"), which feeds SimpleChannel "oChannel"]
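The slides configure channels with bandwidth, latency, and buffering but do not show how a channel behaves. As a rough sketch only, the following models one plausible reading: a message becomes visible to the receiver after `latency` channel ticks, and `buffering` caps the number of messages held. `DelayChannel` and `cyclesUntilVisible` are hypothetical, bandwidth is not modeled, and none of this is the actual `smash::SimpleChannel` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical channel model with a fixed delivery latency and bounded buffering.
class DelayChannel {
 public:
  DelayChannel(int latency, std::size_t buffering)
      : latency_(latency), buffering_(buffering) {}

  // Sender side: room is available while fewer than `buffering` messages are held.
  bool enqRdy() const { return slots_.size() < buffering_; }
  void enq(int value) { assert(enqRdy()); slots_.push_back({value, latency_}); }

  // Receiver side: the oldest message is visible once its delay has elapsed.
  bool deqRdy() const { return !slots_.empty() && slots_.front().remaining <= 0; }
  int first() const { assert(deqRdy()); return slots_.front().value; }
  void deq() { assert(deqRdy()); slots_.pop_front(); }

  // Advance one cycle: every message still in flight moves one step closer.
  void tick() {
    for (auto& s : slots_) {
      if (s.remaining > 0) --s.remaining;
    }
  }

 private:
  struct Slot { int value; int remaining; };
  int latency_;
  std::size_t buffering_;
  std::deque<Slot> slots_;
};

// Counts the channel ticks needed before a freshly enqueued message can be dequeued.
int cyclesUntilVisible(int latency) {
  DelayChannel ch(latency, 7);
  ch.enq(42);
  int cycles = 0;
  while (!ch.deqRdy()) { ch.tick(); ++cycles; }
  return cycles;
}
```

Under this model, ticking units before channels means a message enqueued during a cycle is already counted by that cycle's channel tick. Whether SMASH defines latency this way is not stated in the slides.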
Why didn’t we just use SystemC?
• If you’re asking, you haven’t read the SystemC standard
• Ugly semantics
• Too many ways of doing the same thing
• Fundamental assumption is that host is sequential
– SMASH/C++ designed to support parallel hosts
• Even worse, simulator is a global object (can’t have two engines in one executable)
• In industry, architects use SystemC, hardware designers ignore it when building chips
Issues
• Need to figure out flexible type system and bindings from RDL into C++/Bluespec/Verilog
• Need to figure out common (across C++/RDL/Bluespec) interfaces/syntax for
– Elaboration
– Configuration
– Debugging
– Monitoring/Tracing