IEC Class Software for Embedded Systems

CSE 291 Winter 2009
The FPGA Ecosystem
Rajesh Gupta
University of California, San Diego
Moore’s Law
• Die size grows by 14% to satisfy Moore’s Law
• Transistors on lead microprocessors double every 2 years
• Lead microprocessors frequency doubles every 2 years
[Figure: four log-scale plots versus year (1970-2010) for Intel processors from the 4004 to the P6. Transistors (MT): 2X growth in 1.96 years! Die size (mm): ~7% growth per year, ~2X growth in 10 years. Frequency (MHz): 2X every 2 years. Power (Watts). Courtesy, Intel]
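The growth figures quoted on the plots can be cross-checked with compound-growth arithmetic. The sketch below is only an illustration; the function names are assumptions introduced here, not part of the slides.

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a fixed compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)

def annual_growth_from_doubling(years_to_double):
    """Compound annual growth rate implied by a given doubling time."""
    return 2 ** (1 / years_to_double) - 1

# ~7% growth per year: doubling takes about 10.2 years (the "~2X in 10 years" label)
print(round(doubling_time_years(0.07), 1))
# 2X growth in 1.96 years: about 42.4% growth per year
print(round(annual_growth_from_doubling(1.96) * 100, 1))
```

The same two functions relate any "X% per year" claim to its doubling time.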
The ITRS: Tao of Scaling
http://public.itrs.net
2007
• 0.065 micron
• 6.7 GHz on chip clock
• 9 wiring levels
• 600-3000 pins
• Vdd=0.7-1.1V
• 3.5W / 104W / 190W
• DRAM: 4.29 Gb/chip, 183 mm^2, 2.35 Gb/cm^2
• MPU: 386 Mtrans/chip, 140 mm^2, 276.1 Mtrans/cm^2
Source: Ken Yang, UCLA
Design Abstraction Levels
[Figure: abstraction levels from highest to lowest: SYSTEM, MODULE, GATE, CIRCUIT (Vin, Vout), DEVICE (G, S, D, n+ regions)]
Adapted from Irwin & Narayanan’s slides from PSU. Copyright 2002 J. Rabaey et al.
Design Process
• Conceptualization: function & structure
  – HLM, behavioral modeling
• Architecture: structure and organization
  – microarchitectural implementation
• Logical implementation: gates, modules
  – logic synthesis, logic verification, static timing analysis
• Circuit implementation: transistors
  – circuit simulations
• Physical design, verification
  – floorplanning, placement, routing, dynamic timing analysis
Many Implementation Choices
• Microprocessors
• Domain-specific processors
  – DSP
  – Network processors
  – Microcontrollers
• ASIPs
• Reconfigurable SoC
• FPGA
• Gate-array
• ASIC
[Figure: the choices compared on speed, power and cost, against volume (high to low)]
E.g. Degree of Customization of Processor Architecture
• The architecture of the computation engine used to implement desired functionality
• Processor does not have to be programmable
  – “Processor” not equal to general-purpose processor
Desired functionality:
  total = 0
  for i = 1 to N loop
    total += M[i]
  end loop
[Figure: three controller/datapath organizations for this loop.
  General-purpose (“software”): control logic and state register, IR, PC, register file, general ALU, program memory holding the assembly code for the loop, data memory.
  Application-specific: control logic and state register, IR, PC, registers, custom ALU, program memory holding the assembly code for the loop, data memory.
  Single-purpose (“hardware”): control logic, state register, index and total registers, adder (+), data memory.]
[Adapted from Embedded Systems Design: A Unified Hardware/Software Introduction. Copyright 2000 Vahid & Givargis]
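As a rough illustration of the single-purpose (“hardware”) organization, the summation loop can be written as a controller (a state register plus control logic) driving a datapath (the index and total registers and an adder). This is a minimal sketch under assumptions: the state names INIT, TEST, ADD and DONE and the one-state-per-cycle timing are hypothetical, not taken from the slide.

```python
def single_purpose_sum(M):
    """Controller + datapath for: total = 0; for each i: total += M[i]."""
    index, total = 0, 0      # datapath registers
    state = "INIT"           # controller state register
    cycles = 0
    while state != "DONE":
        if state == "INIT":          # clear the datapath registers
            index, total = 0, 0
            state = "TEST"
        elif state == "TEST":        # control logic compares index with N
            state = "ADD" if index < len(M) else "DONE"
        elif state == "ADD":         # adder in the datapath, then bump index
            total += M[index]
            index += 1
            state = "TEST"
        cycles += 1
    return total, cycles

print(single_purpose_sum([3, 1, 4, 1, 5]))  # (14, 12)
```

There is no program memory and no instruction register here: the “program” is fixed in the control logic, which is what makes the processor single-purpose.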
General-purpose Microprocessors
• Programmable device used in a variety of applications
  – Also known as “microprocessor”
• Features
  – Program memory
  – General datapath with large register file and general ALU
• User benefits
  – Low time-to-market and NRE costs
  – High flexibility
• “Pentium” the most well-known, but there are hundreds of others
[Figure: controller (control logic and state register, IR, PC) and datapath (register file, general ALU), with program memory holding the assembly code for the loop, and data memory]
Application-specific Instruction Processors, ASIP
• Programmable processor optimized for a particular class of applications having common characteristics
  – Compromise between general-purpose and single-purpose processors
• Features
  – Program memory
  – Optimized datapath
  – Special functional units
• Benefits
  – Some flexibility, good performance, size and power
[Figure: controller (control logic and state register, IR, PC) and datapath (registers, custom ALU), with program memory holding the assembly code for the loop, and data memory]
Single-purpose ‘Processors,’ or ASIC
• Digital circuit designed to execute exactly one program
  – a.k.a. coprocessor, accelerator or peripheral
• Features
  – Contains only the components needed to execute a single program
  – No program memory
• Benefits
  – Fast
  – Low power
  – Small size
[Figure: controller (control logic, state register) and datapath (index and total registers, adder), with data memory]
E.g. ASIC
ASIC Features
Area: 4.6 mm x 5.1 mm
Speed: 20 MHz @ 10 Mcps
Technology: HP 0.5 µm
Power: 16 mW - 120 mW (mode dependent)
@ 20 MHz, 3.3 V
Avg. Acquisition Time: 10 ms to 300 ms
• A direct sequence spread spectrum (DSSS) radio receiver ASIC (UCLA)
The Implementation Choice is Important
The Co-design Ladder
• In the past:
  – Hardware and software design technologies were very different
  – Recent maturation of synthesis enables a unified view of hardware and software
• Hardware/software “codesign”
[Figure: the co-design ladder. Both sides start from sequential program code (e.g., C, VHDL) and end at an implementation.
  Software side: compilers (1960's, 1970's) -> assembly instructions -> assemblers, linkers (1950's, 1960's) -> machine instructions -> microprocessor plus program bits: “software”.
  Hardware side: behavioral synthesis (1990's) -> register transfers -> RT synthesis (1980's, 1990's) -> logic equations / FSM's -> logic synthesis (1970's, 1980's) -> logic gates -> VLSI, ASIC, or PLD implementation: “hardware”.]
The choice of hardware versus software for a particular function is simply a tradeoff among various
design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no
fundamental difference between what hardware or software can implement.
Map from Behavior to Architecture
[Vincentelli]
Four Phases in Creating a Chip
Implementation Choices
Digital Circuit Implementation Approaches
• Custom
• Semicustom
  – Cell-based
    · Standard Cells
    · Compiled Cells
    · Macro Cells
  – Array-based
    · Pre-diffused (Gate Arrays)
    · Pre-wired (FPGA's)
Adapted from Digital Integrated Circuits (2nd Edition). Copyright 2002 J. Rabaey et al.
Transition to Automation and Regular Structures
[Figure: die photos of the Intel 4004 (‘71), Intel 8080, Intel 8085, Intel 80286 and Intel 80486. Courtesy Intel]
Cell-based Design (or standard cells)
Routing channel requirements are reduced by presence of more interconnect layers
Standard Cell - Example
3-input NAND cell
(from ST Microelectronics):
C = Load capacitance
T = input rise/fall time
Automatic Cell Generation
[Figure: stages of automatic cell generation: initial transistor geometries -> placed transistors -> routed cell -> compacted cell -> finished cell. Courtesy Acadabra]
MacroModules
25632 (or 8192 bit) SRAM
Generated by hard-macro module generator
21
Adapted from Digital Integrated Circuits
(2nd
Edition). Copyright 2002 J. Rabaey et al."
“Soft” MacroModules
Synopsys DesignCompiler
“Intellectual Property”
A Protocol Processor for Wireless
Semicustom Design Flow
[Figure: semicustom design flow. Main flow: Design Capture (HDL) -> Logic Synthesis -> Floorplanning -> Placement -> Routing -> Tape-out, moving from behavioral through structural to physical descriptions. Pre-Layout Simulation, Circuit Extraction and Post-Layout Simulation feed the Design Iteration loop.]
Late-Binding Implementation
Digital Circuit Implementation Approaches
• Custom
• Semicustom
  – Cell-based
    · Standard Cells
    · Compiled Cells
    · Macro Cells
  – Array-based (highlighted on this slide)
    · Pre-diffused (Gate Arrays)
    · Pre-wired (FPGA's)
Gate Array — Sea-of-gates
[Figure: sea-of-gates layout with rows of uncommitted cells separated by routing channels, showing polysilicon, metal, possible contacts, and the VDD and GND rails. An uncommitted cell is shown next to a committed cell (4-input NOR) with inputs In1, In2, In3, In4 and output Out.]
Sea-of-gate Primitive Cells
[Figure: PMOS/NMOS primitive cells, one using oxide-isolation and one using gate-isolation]
Prewired Arrays
Classification of prewired arrays (or field-programmable devices):
• Based on Programming Technique
  – Fuse-based (program-once)
  – Non-volatile EPROM based
  – RAM based
• Programmable Logic Style
  – Array-Based
  – Look-up Table
• Programmable Interconnect Style
  – Channel-routing
  – Mesh networks
Antifuse
• Normally high resistance (> 100 MΩ)
  – on application of appropriate voltage, the antifuse is changed permanently to a low resistance structure (200-500 Ω)
Array-Based Programmable Logic
• PLA: programmable AND array, programmable OR array
• PROM: fixed AND array, programmable OR array
• PAL: programmable AND array, fixed OR array
[Figure: the three array structures with inputs I0-I5 (I0-I3 for the PROM) and outputs O0-O3; one marker indicates a programmable connection, another a fixed connection]
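The AND-plane/OR-plane structure can be sketched in a few lines. This is a hedged illustration, not a model of any particular device: the data layout (a dictionary per product term, a list of term indices per output) and the example programming are assumptions made for the sketch.

```python
def pla(inputs, and_plane, or_plane):
    """Evaluate a PLA: a programmable AND plane followed by a programmable OR plane."""
    # Each product term maps an input index to the value it requires
    # (1 = true input, 0 = complemented input); unlisted inputs are don't-cares.
    products = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    # Each output ORs together the product terms listed for it.
    return [int(any(products[p] for p in terms)) for terms in or_plane]

# Hypothetical programming: O0 = I0 XOR I1, O1 = I0 AND I1
AND_PLANE = [{0: 1, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 1}]
OR_PLANE = [[0, 1], [2]]

for i0 in (0, 1):
    for i1 in (0, 1):
        print(i0, i1, pla([i0, i1], AND_PLANE, OR_PLANE))
```

In this picture a PROM corresponds to fixing `and_plane` to all minterms and programming only `or_plane`, while a PAL fixes `or_plane` and programs only `and_plane`.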
Programming a PROM
[Figure: PROM array with inputs X0, X1, X2 and outputs f0 and f1 (the other two outputs are not used, NA); dots mark the programmed nodes]
2-input mux as programmable logic block
The mux selects F = A when S = 0 and F = B when S = 1. Wiring A, B and S to constants or to the signals X and Y gives these configurations (' denotes complement):
  A  B  S  |  F
  0  0  0  |  0
  0  X  1  |  X
  0  Y  1  |  Y
  0  Y  X  |  XY
  X  0  Y  |  XY'
  Y  0  X  |  X'Y
  Y  1  X  |  X + Y
  1  0  X  |  X'
  1  0  Y  |  Y'
  1  1  1  |  1
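The mux configurations can be checked exhaustively. The sketch below assumes the usual 2-input multiplexer behavior (F = A when S = 0, F = B when S = 1) and uses one reading of the slide's configuration table; the dictionary layout and names are illustrative only.

```python
from itertools import product

def mux(a, b, s):
    """2-input multiplexer: F = A when S = 0, F = B when S = 1."""
    return b if s else a

# name -> (wiring of A, B, S in terms of x and y, function it should realize)
CONFIGS = {
    "0":   (lambda x, y: mux(0, 0, 0), lambda x, y: 0),
    "X":   (lambda x, y: mux(0, x, 1), lambda x, y: x),
    "Y":   (lambda x, y: mux(0, y, 1), lambda x, y: y),
    "XY":  (lambda x, y: mux(0, y, x), lambda x, y: x & y),
    "XY'": (lambda x, y: mux(x, 0, y), lambda x, y: x & (1 - y)),
    "X'Y": (lambda x, y: mux(y, 0, x), lambda x, y: (1 - x) & y),
    "X+Y": (lambda x, y: mux(y, 1, x), lambda x, y: x | y),
    "X'":  (lambda x, y: mux(1, 0, x), lambda x, y: 1 - x),
    "Y'":  (lambda x, y: mux(1, 0, y), lambda x, y: 1 - y),
    "1":   (lambda x, y: mux(1, 1, 1), lambda x, y: 1),
}

def verify():
    """True when every wiring matches its target function for all x, y."""
    return all(wired(x, y) == expected(x, y)
               for wired, expected in CONFIGS.values()
               for x, y in product((0, 1), repeat=2))

print(verify())  # True
```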
Logic Cell of Actel Fuse-Based FPGA
[Figure: three 2-input multiplexers. One selects between A and B under SA, one selects between C and D under SB, and the output multiplexer, controlled from S0 and S1, selects between the two to produce Y.]
Look-up Table Based Logic Cell
[Figure: a memory addressed by the inputs In1 and In2 drives the output Out]
  In1 In2 | Out
   0   0  |  0
   0   1  |  1
   1   0  |  1
   1   1  |  0
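A look-up table cell is just a small memory whose address is formed from the logic inputs, so "programming" the cell means loading its truth table. A minimal sketch, assuming the first input is the most significant address bit (the class name and bit ordering are choices made for this illustration):

```python
class LUT:
    """n-input look-up table: the configuration bits are the truth table."""

    def __init__(self, n_inputs, bits):
        assert len(bits) == 2 ** n_inputs, "need one bit per input combination"
        self.bits = list(bits)

    def __call__(self, *inputs):
        addr = 0
        for bit in inputs:            # first input is the most significant bit
            addr = (addr << 1) | bit
        return self.bits[addr]

# Contents 0, 1, 1, 0 for addresses 00, 01, 10, 11: a 2-input XOR
xor = LUT(2, [0, 1, 1, 0])
print([xor(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```

Loading different bits into the same structure yields any other 2-input function, for example `LUT(2, [0, 0, 0, 1])` for AND.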
LUT-Based Logic Cell
Xilinx 4000 Series
[Figure: logic cell with control inputs C1....C4, data inputs D1-D4 and F1-F4 feeding blocks that each compute a logic function of their inputs, a further function block H, and output multiplexers controlled by the configuration program (configuration bits). Courtesy Xilinx]
Array-Based Programmable Wiring
[Figure: cells whose input/output pins connect to horizontal and vertical tracks through interconnect points; a programmed interconnection is marked]
Mesh-based Interconnect Network
[Figure: mesh of switch boxes and connect boxes, with the interconnect points marked. Courtesy DeHon and Wawrzynek]
Transistor Implementation of Mesh
[Figure courtesy DeHon and Wawrzynek]
Hierarchical Mesh Network
• Use overlaid mesh to support longer connections
• Reduced fanout and reduced resistance
[Figure courtesy DeHon and Wawrzynek]
EPLD Block Diagram
[Figure: primary inputs feeding macrocells. Courtesy Altera]
Altera MAX
[Figure from Smith97]
Altera MAX Interconnect Architecture
• Array-based (MAX 3000-7000): LABs (LAB1, LAB2, ..., LAB6) connected through the PIA, with delay t PIA
• Mesh-based (MAX 9000): LABs connected through row channels and column channels
[Figure courtesy Altera]
Field-Programmable Gate Arrays
Fuse-based
[Figure: standard-cell like floorplan with rows of logic modules separated by routing channels, vertical routes, I/O buffers on all four sides, and a program/test/diagnostics block]
Xilinx 4000 Interconnect Architecture
[Figure: a CLB and its routing resources: single, double, quad and long lines, global lines, global clock, direct connect and carry chain, each annotated with the number of lines of that type. Courtesy Xilinx]
RAM-based FPGA
Xilinx XC4000ex
Courtesy Xilinx
Heterogeneous Programmable Platforms
Xilinx Virtex-II Pro
• FPGA fabric
• Embedded memories
• Embedded PowerPC
• Hardwired multipliers
• High-speed I/O
Courtesy Xilinx
SOC as a heterogeneous computing substrate
[Figure: SOC block diagram. A microprocessor core runs the real-time operating system, user interface and controller process code; a programmable processor core and a programmable DSP core (running DSP code) each have a host/bus interface and a memory interface; ASIC blocks, a multi-ported memory and bus control sit on the system bus; a CODEC, serial I/O and an analog interface connect to the outside.]
Experimental Side of Putting Things Together
Design
• Goal of design is to take an ‘idea’ and build something that performs a certain function
• Such ‘idea’ to ‘implementation’ never happens directly
  – We go through ‘models’ that allow us to reason about properties
  – May also be used by implementers to explore alternatives for cost, performance
• MODELS are key to formalization of the design
  – And its process.
Model of Computation
• A ‘model’ is an abstraction of a ‘description’
  – (Sometimes, a model is also used as a replica of a ‘description’)
• This abstraction is defined using some ‘terms’
  – If the terms are graphical -> graphical model
  – If the terms are mathematical -> formal model
• Generally, terms and their relationships are devised to allow syntactical support for expressing important concepts
• If done right, a MOC
  – supports important concepts of an application domain through use of right terms
  – is clear and unambiguous to allow anyone to replicate/simulate intended behavior
  – is compositional: compositions can be validated with less effort
Compositional View of SOCs: Model of Computation
• A system consists of components
• Important questions to ask when dealing with components
  – What is a component? (Component ontology)
    · States? Processes? Threads? Differential equations? Constraints? Objects? …
  – What knowledge do components share? (Epistemology)
    · Time? Name spaces? Signals? State?
  – What do components communicate? (Lexicon)
    · Objects? Transfer of control? Data structures? Strings?...
  – How do components communicate? (Protocols)
    · Events? Rendezvous? Message Passing? CT Signals? Streams? Method Calls? …
• A MOC makes it easier to reason through these questions
  – Start with a model of a machine, define its behavior (as
Characteristics of Common MOCs
• Finite State Machines
  – State is summary of past, Finite number of states
  – No concurrency, no explicit time specification
• Data-Flow
  – Partial order of actions/events
  – Concurrency, determinate, support streams (data, computation)
• Discrete-event models
  – Global notion of time, causality
Finite State Machines (FSMs)
• Functional decomposition into states of operation
  – Useful for control functions, protocols
• Properties of FSMs
  – Good for specifying sequential control.
  – Not Turing complete.
  – More amenable to formal analysis.
• Typical domains of application
  – Control-intensive tasks.
  – Protocols (Telecom, cache-coherency, bus, ...)
• Many variants of the formulation
  – Differ in communication, determinism, ...
FSM Example: Seat Belt Alarm Control
• Informal Specification
  – If the driver
    · turns on the key, and
    · does not fasten the seat belt within 5 seconds
  – then sound the alarm
    · for 5 seconds, or
    · until the driver fastens the seat belt
    · or until the driver turns off the key
[Figure: state diagram with states OFF, WAIT and ALARM.
  OFF -> WAIT: KEY_ON => START_TIMER
  WAIT -> OFF: KEY_OFF or BELT_ON
  WAIT -> ALARM: 5_SECONDS_UP => ALARM_ON
  ALARM -> OFF: 10_SECONDS_UP or BELT_ON or KEY_OFF => ALARM_OFF]
No explicit condition => implicit self-loop in the current state
Finite State Machine: Example + Definition
FSM = (Inputs, Outputs, States, InitialState, NextState, Outs)
• Inputs = {KEY_ON, KEY_OFF, BELT_ON, BELT_OFF, 5_SECONDS_UP, 10_SECONDS_UP}
• Outputs = {START_TIMER, ALARM_ON, ALARM_OFF}
• States = {OFF, WAIT, ALARM}
• InitialState = OFF
• NextState: CurrentState, Inputs -> NextState
  – NextState: 2^Inputs x S -> S, where 2^Inputs is the set of all subsets of Inputs
  – e.g., NextState(WAIT, {KEY_OFF}) = OFF
  – All inputs other than KEY_OFF are implicitly absent
• Outs (function): CurrentState, Inputs -> Outputs
  – Outs: 2^Inputs x S -> 2^Outputs
[Figure: state diagram of the seat belt alarm controller with states OFF, WAIT and ALARM]
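The tuple definition maps directly onto code. The sketch below is one possible encoding of the seat belt alarm controller, not the author's implementation: the transition table layout is an assumption, and the first matching row wins when several inputs are present at once, which is a determinizing choice the slides do not specify.

```python
# (current state, triggering inputs, next state, outputs)
TRANSITIONS = [
    ("OFF",   {"KEY_ON"},                              "WAIT",  {"START_TIMER"}),
    ("WAIT",  {"KEY_OFF", "BELT_ON"},                  "OFF",   set()),
    ("WAIT",  {"5_SECONDS_UP"},                        "ALARM", {"ALARM_ON"}),
    ("ALARM", {"10_SECONDS_UP", "BELT_ON", "KEY_OFF"}, "OFF",   {"ALARM_OFF"}),
]

def step(state, inputs):
    """NextState and Outs combined: (state, set of present inputs) -> (state, outputs)."""
    for src, triggers, dst, outputs in TRANSITIONS:
        if src == state and triggers & inputs:
            return dst, set(outputs)
    return state, set()   # no explicit condition: implicit self-loop

state = "OFF"             # InitialState
for present in [{"KEY_ON"}, set(), {"5_SECONDS_UP"}, {"BELT_ON"}]:
    state, outs = step(state, present)
    print(state, sorted(outs))
```

The trace visits WAIT, WAIT, ALARM and OFF, emitting START_TIMER, nothing, ALARM_ON and ALARM_OFF in turn.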
Non-deterministic Finite State Machines
• A finite state machine is said to be non-deterministic when
  – The NextState and Output functions may be RELATIONs (instead of functions).
  – e.g., NextState(WAIT, {KEY_OFF, END_TIMER_5}) = {{OFF}, {ALARM}}
• Non-determinism can be used to model
  – unspecified behavior
    · incomplete specification
  – unknown behavior
    · e.g., the environment model
    · Driver can be modeled as single state FSM with outputs {KEY_ON, KEY_OFF, BELT_ON}
  – abstraction
    · (the abstraction may result in insufficient detail to identify
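When NextState is a relation, the natural return type is a set of possible successors instead of a single state. A minimal sketch covering only the WAIT state, with the input name END_TIMER_5 taken from the example on the slide; everything else about the encoding is an assumption.

```python
def next_states(state, inputs):
    """NextState as a relation: returns every state the machine may move to."""
    successors = set()
    if state == "WAIT":
        if inputs & {"KEY_OFF", "BELT_ON"}:
            successors.add("OFF")
        if "END_TIMER_5" in inputs:
            successors.add("ALARM")
    return successors or {state}   # nothing enabled: stay in the current state

print(sorted(next_states("WAIT", {"KEY_OFF", "END_TIMER_5"})))  # ['ALARM', 'OFF']
```

A simulator or model checker then has to explore each member of the returned set, which is how unspecified or unknown behavior is accounted for.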
Concurrency and FSM
• Significant model change: treat it as a ‘collection’
• Fundamental assumption: all FSMs change states together (synchronicity)
  – System state is a cartesian product
  – State space can be reduced by constrained compositions
    · E.g., sequential composition: output of one machine is input of another
• A cleaner way to extend FSM model?
  – Hierarchy
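The effect of a constrained composition on the state space can be seen on a toy example. The sketch below composes two hypothetical one-bit machines synchronously, with the output of the first as the input of the second, and enumerates the reachable part of the cartesian product; the machines and function names are invented for the illustration.

```python
def compose(step_a, step_b):
    """Synchronous sequential composition: both step together, B reads A's output."""
    def step(state):
        a, b = state
        a_next, a_out = step_a(a)
        b_next = step_b(b, a_out)
        return (a_next, b_next)
    return step

def reachable(step, init):
    """States of the product machine actually reachable from init."""
    seen, frontier = set(), [init]
    while frontier:
        s = frontier.pop()
        if s not in seen:
            seen.add(s)
            frontier.append(step(s))
    return seen

toggle = lambda a: (1 - a, a)   # flips every step; output is its current state
follow = lambda b, x: x         # stores the value it receives

states = reachable(compose(toggle, follow), (0, 0))
print(sorted(states))           # [(0, 0), (0, 1), (1, 0)]
```

The full cartesian product has four states, but only three are reachable here because the second machine is constrained by the first.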
Discrete Event Models
• Action, Events
• Notion of global time
  – Events can happen anytime asynchronously
  – Though it is not fundamental: time progress can be captured by ‘special’ events
• A system consists of components with input events and output events
  – Also, referred to as ‘primary events’.
• Component is evaluated in response to input events
  – Evaluation leads to events at the output
• A discrete event simulator is a program that specifies how components are evaluated
  – Components at a time (‘clock-driven’)
  – Event at a time (‘event-driven’)
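An event-driven kernel can be sketched with a priority queue ordered by global time: pop the earliest event, evaluate the component it targets, and schedule whatever output events that evaluation produces. The event format (time, target, value) and the two example components are assumptions made for this sketch.

```python
import heapq

def simulate(initial_events, components, until):
    """Event-at-a-time simulation; returns the processed events in time order."""
    queue = list(initial_events)          # entries are (time, target, value)
    heapq.heapify(queue)
    log = []
    while queue:
        time, target, value = heapq.heappop(queue)
        if time > until:
            break
        log.append((time, target, value))
        # Evaluating a component yields (delay, next target, value) output events.
        for delay, nxt, out in components[target](value):
            heapq.heappush(queue, (time + delay, nxt, out))
    return log

components = {
    "inv":   lambda v: [(2, "probe", 1 - v)],   # inverter with a delay of 2
    "probe": lambda v: [],                      # observes, produces no events
}
log = simulate([(0, "inv", 0), (5, "inv", 1)], components, until=10)
print(log)
```

A clock-driven simulator would instead evaluate every component at each tick, whether or not it had a pending input event.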
Reactive (Real-time) Systems
• Reactive Systems
  – “React” to events
    · e.g., in the external environment, other subsystems
  – Suited for modeling “non-terminating” interactions
    · e.g., operating systems, interrupt handlers, process control systems.
  – Often subject to external timing constraints
    · “real-time”
• Synchronous Reactive Systems
  – Synchrony associates ‘clock’ to a model
  – All ‘synchronous events’ happen simultaneously
  – Clock is a ‘simplification’ or abstraction of time in models
  – Between clocks, any amount of time can pass
Four useful MOCs
• Discrete Event (DE)
  – Timed models, suitable for modeling digital hardware
  – But can be very general (define what is an event and what happens to it)
• Finite State Machines
  – Variants and extensions: StateCharts, StarCharts
• Synchronous Reactive Models
  – Synchrony assumption useful for safety critical embedded systems (instantaneous reactions)
    · (Convert timing relations to causal ordering)
  – A program is logically correct if it is deterministic and reactive
  – Verifying that a program is causal is a challenge
    · Want one and only one solution for each configuration of inputs
  – Assume “constructive causality” to make it work
    · Still a lot better than multi-level time (delta) models
• Dataflow Process Networks
  – Signal processing applications
Compositional Correctness
• Build “Complete” System Models
– That include the application and system software
– Adapt, control and debug applications
– Explore the full potential of SOC architectural platforms
• e.g., by exploring applications, networking and communication
subsystems together
• Composition challenges
– Language support for multiple MOCs not enough
– Model composability may not be guaranteed
• E.g., composition of synchronous models may not be closed
• Like connecting two FSMs can lead to combinational cycles
– solutions like: delta steps (VHDL), acyclic composition (Lustre), reactions as fixed points (Esterel)
Going Across MOC: Ptolemy Approach
• Encapsulate each description in a MOC in a “domain”
• Inter-domain simulations achieved through domain encapsulation
  – Define semantics of every such encapsulation carefully, conservatively (and yet with some efficiency)
• The “event horizon”
  – Couple timed, untimed domains
Network Architecture Modeling: NS2
• Developed under the Virtual Internet Testbed (VINT) project (UCB, LBL, USC/ISI, Xerox PARC)
• Captures network nodes, topology and provides efficient event driven simulations with a number of “schedulers”
• Interpreted interface for
  – network configuration, simulation setup
  – using existing simulation kernel objects such as predefined network links
• Simulation model in C++ for
  – packet processing
  – changing models of existing simulation kernel classes, e.g., using a special queuing discipline.
NS2 Simulations
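A 4-node system with 2 “agents”, a traffic generator
• “Agents” are network endpoints where network-layer packets are constructed or consumed.
# Create the simulator and write the event trace to out.tr
set ns [new Simulator]
set f [open out.tr w]
$ns trace-all $f
# Create the four nodes
set n0 [$ns node]
set n1 [$ns node]
set n2 [$ns node]
set n3 [$ns node]
# n0 and n1 connect to n2, which connects to n3 over a slower link
$ns duplex-link $n0 $n2 5Mb 2ms DropTail
$ns duplex-link $n1 $n2 5Mb 2ms DropTail
$ns duplex-link $n2 $n3 1.5Mb 10ms DropTail
# UDP agent on n0, fed by a constant bit rate (CBR) traffic generator
set udp0 [new Agent/UDP]
$ns attach-agent $n0 $udp0
set cbr0 [new Application/Traffic/CBR]
$cbr0 attach-agent $udp0
..
# Schedule the finish procedure at t = 3.0, then start the simulation
$ns at 3.0 "finish"
proc finish {} {
…
}
$ns run
[Figure: topology with n0 (UDP) and n1 (TCP, ftp) linked to n2, and n2 linked to n3 (Sink)]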
NS2 Usage: LAN nodes
• LAN and wireless links are inherently different from PTP links due to sharing and contention properties of LANs
  – a network consisting of PTP links alone cannot capture LAN contention properties
  – a special node is provided to specify LANs
• LanNode captures functionality of three lowest layers in the protocol stack, namely: link, MAC and physical layers.
  – Specifies objects to be created for LL, INTF, MAC and Physical channels.
  – Example:
    $ns make-lan <nodelist> <bw> <delay> <LL> <ifq> <MAC> <channel> <phy>
    $ns make-lan "$n1 $n2" $bw $delay LL Queue/DropTail Mac/Csma/Cd
  – Creates a LAN with basic link-layer, drop-tail queue and CSMA/CD medium access control.
[Figure: nodes n1, n2 and n3 connected directly, and the same nodes connected through a LAN node]
The LAN node collects all the objects shared on the LAN.
Network Stack simulation for LAN nodes in ns
Objects used in LAN nodes. Each of the underlying classes can be specialized for a given simulation.
[Figure: node1, node2 and node3 each have a queue (Q), an LL object and a MAC object; the MAC objects share the Channel (Phy) and the MAC classifier]
• Channel object simulates the shared medium and supports the medium access mechanisms of the MAC objects on the sending side.
• On the receiving side, MAC classifier is responsible for delivering and optionally replicating packets to the receiving MAC objects.
Putting things together…
[Figure: the parts of a complete system model: ASIC hardware; network; processor(s) and memories; system software (OS, middleware, application software). Source: Virtio Corp.]
Time Granularity in Models
[Figure: models A to F placed on two axes, communication (vertical) and computation (horizontal), each running from untimed through approximate-timed to cycle-timed]
A. "Specification model", "Untimed functional models"
B. "Component-assembly model", "Architecture model", "Timed functional model"
C. "Bus-arbitration model", "Transaction model"
D. "Bus-functional model", "Communication model", "Behavior level model"
E. "Cycle-accurate computation model"
F. "Implementation model", "Register transfer model"
• Models B, C, D and E could be classified as TLMs.
Source: Daniel Gajski, UC Irvine.
Hardware-software co-simulation
• Verification of the functionality of a system consisting of
both hardware and software (as early as possible in the
design cycle).
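[Figure: a processor model and a custom hardware model connected by a communication layer, each with several options]
Processor Model
• BFM
• ISA
• CAM
• TAM
Communication
• Tightly coupled
• Loosely coupled
• One process
• Multi-process
Custom Hardware Model
• Functional
• Behavioral
• RTL
• Gate
• Transistor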
Processor Models
• Four types of models
  – Bus-functional models
  – Instruction-set models
  – Cycle-accurate models
  – Timing accurate models
[Figure: processor model composed of a BFM and an ISM]
Bus-functional Models
• Can only execute bus transactions
• Can be used to check how peripherals interact with the processor bus
• Available in different degrees of timing accuracy
  – Cycle-accurate
  – Phase-accurate
  – Full timing (nanosecond) accurate
• Very popular in hardware design
[Figure: a BFM turns the transaction “Read from 0xff00” into activity on the CLK, ADDRESS, CE, DATA and R/W signals]
Instruction-set (ISA) Models
• Basic ISA Model
  – Model only the effect of instruction execution on registers and memory
  – Not processor pipeline
  – Fast, used in embedded software models
• Cycle-accurate ISA
  – Model the processor pipeline and instruction execution in a cycle-accurate manner
  – Provides accurate cycle counts for instruction execution
  – 1.2-5X slower
[Figure: the instructions mov r0, r1; add r0, r2, r3; st r0, (r5) moving through the Fetch, Decode, Execute and Memory stages, with the register file]
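A basic ISA model only needs to track the effect of each instruction on registers and memory. The sketch below interprets the three instructions from the slide; the operand order (destination first for mov and add, source then address register for st) is an assumption, since the slide does not define the instruction set.

```python
def run(program, regs, mem):
    """Basic instruction-set model: no pipeline, no timing, just state updates."""
    for op, *args in program:
        if op == "mov":                 # mov rd, rs      : rd <- rs
            rd, rs = args
            regs[rd] = regs[rs]
        elif op == "add":               # add rd, ra, rb  : rd <- ra + rb
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "st":                # st rs, (ra)     : mem[ra] <- rs
            rs, ra = args
            mem[regs[ra]] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs, mem

program = [("mov", "r0", "r1"), ("add", "r0", "r2", "r3"), ("st", "r0", "r5")]
regs = {"r0": 0, "r1": 7, "r2": 2, "r3": 3, "r5": 0x10}
regs, mem = run(program, regs, {})
print(regs["r0"], mem)   # 5 {16: 5}
```

A cycle-accurate ISA model would add the pipeline stages and a cycle counter around the same state updates, which is where the extra simulation cost comes from.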
Processor Models
• ISA Processor Model
  – ISA Model + Cycle-accurate BFM
  – Cycle accurate bus transactions but not cycle accurate instruction execution
  – Fastest useful processor model
• Cycle-accurate Processor Model
  – Cycle-accurate ISA + Cycle-accurate BFM
  – Cycle accurate instruction execution and bus transactions
  – Slower than ISA processor model but still popular.
[Figure: processor model composed of a BFM and an ISM]
Timing-accurate Models
• Correctly models the processor behavior at the nanosecond accurate level
• Is usually generated from a gate-level netlist of the processor
• Slow (could be 3 to 5 orders of magnitude slower than cycle-accurate processor models)
• Seldom used
Typical Usage Models
• System architects looking at hardware/software tradeoffs
• ASIC developers wanting a fast and easy way to test out the hardware running actual code
• Software developers testing H/W drivers and RTOS on hardware (HDL) models
• Software developers testing application code with an RTOS on the “real” hardware (i.e. evaluation board)
• Distributed application developers
  – SensorSIM, TOSSIM