Transcript Slide 1

y = x^2
Did you consider the data types for x and y?
 Floating point? Fixed point? Complex?
 Are x and y the same or different?

Did you consider the sizes of x and y?
 Are x and y the same or different?

Do you care?



Algorithms are different from
implementations
Algorithms are relatively easy to re-use and
to port to different technologies
Implementations are often too specific to a
particular technology to re-use


Purpose-designed to rapidly develop and test
high-performance computer applications
High-level graphical design capture system
 Leverages object-oriented programming (OOP)
techniques
 Intuitively capture natively parallel operations
Tune algorithms for maximum performance or
minimum resources (or somewhere between)
 Fast and efficient solver that directly maps to
logic
 Adaptable to various CPU/FPGA boards and
systems


Integrated, immersive, test-on-the-fly
simulation and hardware test integration
 Rapidly develop and prototype on x86 platform
 Interact with real application running in hardware


Leverages FPGA vendor tools to place and
route designs
Azido designs can be ported to other
hardware architectures with the least amount
of effort, time, and change to the design

EDIF Translation/Conversion
• EDIF -> Azido
• Azido -> EDIF

API Calls
 COM/Active-X module interfaces










Graphic
Drag-and-drop
Interactive,
instantaneous
debugging
Object-oriented
Technology independent
Inherently parallel
Flexible data sizes,
types, and precision
Recursion
Overloading
Underloading





Correct by construction
Code re-use
Sequential vs. parallel
trade-offs
Polymorphism
IP re-use

Descendant of
Star Bridge Systems’
Viva®

Star Bridge HAL300
Hypercomputer
 200 FPGAs
 720 memory channels

Kirchhoff pre-stack
time migration
Constants:
L = 24", W = 3", P = 20 lbs
ρ = 0.097 lbs/in^3

Constraint:
Stress_allowed = 40K lbs/in^2
Find thickness, d, to minimize Weight = ρ × L × W × d
where
Stress = 6PL / (W d^2) ≤ Stress_allowed
d chosen 1023 times
Azido Results: d = 0.156" (0.155 exact)
Minimum weight = 1.09 lbs (1.082 exact)
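The beam problem above is small enough to check by hand. The following is a minimal Python sketch, not the Azido design itself: it computes the closed-form thickness from the stress constraint and then runs a brute-force sweep over 1023 candidate thicknesses. The candidate grid (multiples of 1/1024 inch) and all function names are assumptions made for illustration; the slide does not say how the 1023 values of d were chosen.

```python
import math

# Constants from the slide (inches, pounds).
L_IN = 24.0                 # beam length
W_IN = 3.0                  # beam width
P_LBS = 20.0                # applied load
RHO = 0.097                 # density, lbs/in^3
STRESS_ALLOWED = 40_000.0   # lbs/in^2


def stress(d):
    """Bending stress for thickness d: 6PL / (W d^2)."""
    return 6.0 * P_LBS * L_IN / (W_IN * d * d)


def weight(d):
    """Beam weight for thickness d: rho * L * W * d."""
    return RHO * L_IN * W_IN * d


# Closed form: the lightest feasible beam sits exactly on the stress limit.
d_exact = math.sqrt(6.0 * P_LBS * L_IN / (W_IN * STRESS_ALLOWED))

# Brute-force sweep over 1023 candidates. The grid (k/1024 inch) is an
# assumption for this sketch, not something stated on the slide.
STEP = 1.0 / 1024.0
candidates = [k * STEP for k in range(1, 1024)]
feasible = [d for d in candidates if stress(d) <= STRESS_ALLOWED]
d_best = min(feasible, key=weight)

print(f"exact: d = {d_exact:.3f} in, weight = {weight(d_exact):.3f} lbs")
print(f"sweep: d = {d_best:.4f} in, weight = {weight(d_best):.3f} lbs")
```

Because weight grows linearly with d, the lightest feasible candidate is simply the smallest d that still satisfies the stress limit, so the sweep lands within one grid step of the closed-form answer.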
[Figure: Rapid Development Environment. Stage 1: Azido® (Design Entry / Algorithm Implementation, Synthesis) produces EDIF. Stage 2: Vendor tools produce the Bit Map for the FPGA-based Hardware (Build In Hardware).]
Azido® includes:
• Device Drivers
• Interface to C/COM
• System simulation
• Test harness
• System harness
[Figure: development flowchart. Steps: START; Load x86 System Description; Design Algorithm & Implementation; Azido® synthesis; Functional Test & Simulation; decision "Pass?"; Load FPGA System Description; Azido® synthesis; Xilinx PAR; decision "Xilinx Timing, Area?"; END/RUN. Each decision has a YES branch and a NO branch.]
Software Abstraction
Layer
BCS Wrapper
Azido Design
BCS=Behavioral Communication System
Programmed in Azido itself










Quickly Building an Example System
User Interface
Azido Primitives and Transports
Data Sets and Types
Object Recursion
CoreLib: Core Library
Data Flow Control and Synchronization
System Descriptions
Interactive Debugging
Next Steps
Machine vision and tracking system



A simple binary counter with variable count
value
Build and test in x86
Implement and test on Xilinx Spartan-3
XC3S4000 FPGA (Opal Kelly board)
http://www.opalkelly.com/products/xem3050/


Aptina CMOS imager
Directly connects to back
of XEM3050 board
[Figure: machine vision pipeline blocks: CMOS Imager (Aptina Camera), Frame Grabber, Bayer Pattern to RGB Converter, White Balance, Histogram, RGB to Gray Converter, Segmentation, Blob Filter, Blob Feature Extraction, Write Frame.]


Camera output is not traditional RGB
Uses Bayer pattern (more green sensitivity)
Interpolating Red and Blue: Four cases
Bayer Pattern
Interpolating Green: Two cases
www.siliconimaging.com/RGB%20Bayer.htm
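The interpolation cases listed above can be sketched with plain bilinear averaging. This is a hedged Python illustration, not the Azido converter: it assumes an RGGB layout (red at even row and even column), handles interior pixels only, and uses hypothetical function names. The two green cases are the red and blue sites; the four red/blue cases are the two kinds of green site plus red-at-blue and blue-at-red.

```python
def bayer_color(row, col):
    """Color of the sensor site at (row, col) for an assumed RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def _mean(values):
    return sum(values) / len(values)


def interpolate_rgb(raw, row, col):
    """Bilinear (R, G, B) estimate at an interior pixel of a Bayer mosaic."""
    here = raw[row][col]
    cross = _mean([raw[row - 1][col], raw[row + 1][col],
                   raw[row][col - 1], raw[row][col + 1]])
    diagonal = _mean([raw[row - 1][col - 1], raw[row - 1][col + 1],
                      raw[row + 1][col - 1], raw[row + 1][col + 1]])
    horizontal = _mean([raw[row][col - 1], raw[row][col + 1]])
    vertical = _mean([raw[row - 1][col], raw[row + 1][col]])

    site = bayer_color(row, col)
    if site == "R":
        # Green from the four edge neighbors, blue from the four diagonals.
        return (here, cross, diagonal)
    if site == "B":
        # Green from the four edge neighbors, red from the four diagonals.
        return (diagonal, cross, here)
    if row % 2 == 0:
        # Green site in a red row: red is left/right, blue is above/below.
        return (horizontal, here, vertical)
    # Green site in a blue row: red is above/below, blue is left/right.
    return (vertical, here, horizontal)


# A flat-colored scene: every interior pixel should recover the same RGB.
flat = {"R": 100, "G": 150, "B": 200}
raw = [[flat[bayer_color(r, c)] for c in range(4)] for r in range(4)]
print(interpolate_rgb(raw, 1, 1))
```

A flat-colored test image is a convenient sanity check because every one of the six cases must return the same triple.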
A Brisk Walk Through the Environment
Azido: User Interface
[Figure: toolbar groups: Project Options, Sheet Options, Build Options, View Options, Zoom Options, System Options.]
 Add/Delete/View sheets within project
 Name/Rename sheets
Project Editor
 Drag/Drop objects onto sheet
 Examples on object use
 Contains polymorphic modules and objects
to be used in designs and applications, e.g.
Logic Gates, Registers, Shifters, Math
operators
Library Editor
 Red Text – Error
 Black Text – Warning
 Blue Text – Information
 Green Text – User Trap
 Error Log file placed in Project directory
Message Tab
 Search for objects within CoreLib using (Ctrl + F)
 Search is based on a literal character match
 No wildcard characters allowed
 Search window displays all the locations where the object is
present within the project
 Navigate to the object by double-clicking the entry in the
search window
Search Tab
 Displays the various overloads available for an object.
 Ctrl + Double click an object to view its overloads
Overloads Tab
I2ADL Editor
(Implementation Independent Algorithm
Description Language)
 Create/Edit and document applications and objects in Azido™
 Add/Edit functionality of sheets and objects
 Create/Modify library components
 View/Edit/Create
datasets to be used in
Azido ™
Dataset Editor
 View child datasets of
parent datasets
 Modify color codes for
different datasets
 View/Edit resources within
FPGA-based System
Resource Editor
 Displays the resources
used by a particular
application build
 Define Build options and
locations of build files within the
project directory
System Editor
 Define different options to the
vendor’s PAR tool for the
design
 Starting point for creating and
editing system descriptions
 Adjust the clock
speed depending
on the constraints
and complexity of
the algorithm
being implemented
 Objects and libraries created in Azido support high
clock speeds, removing one more barrier for an
application designer.
Start with the Basics
Primitive library loaded
automatically
 Basic bit-oriented Boolean
functions

 AND
 OR
 INVERT


“Atomic”-level functions
Cannot be broken down
further
Primitives
[Figure: physics analogy. Primitives: u = Up Quark, d = Down Quark, e = Electron. These build a Proton (u u d) and a Neutron (u d d), then Atoms, Molecules, and "The Works".]

A peer to the other primitives
Transport

Moves data from a source to one or more places




As simple as a data move operation
As complex as a COM interface call
Moves bits, buses, scalars, vectors, lists, etc.
Interprocess communication, even between chips and
systems
How Bits Become Systems




Azido allows clean separation between
algorithm and implementation
Bit: The usual definitions (0/1, off/on,
false/true)
Variant: Unknown data type at time of
construction; specified or resolved during
synthesis
List: Collections of two or more data sets;
modeled on Lisp, Prolog
Implementation may
differ, depending on
the incoming data set
 Data set is defined
during Azido synthesis

y = x^2 = x × x
Algorithm remains the
same, regardless of the
data type
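As a rough analogy in Python (not Azido), the sketch below keeps the algorithm y = x × x fixed while the data type and size vary. `Fraction` stands in for a fixed-point value, and `square_wrapped` is a hypothetical helper that models storing the result in an unsigned field of a given width; both are assumptions made for illustration.

```python
from fractions import Fraction


def square(x):
    """The algorithm: y = x * x. Nothing here depends on the type of x."""
    return x * x


def square_wrapped(x, bits):
    """Same algorithm, with the result kept in an unsigned field of `bits` bits."""
    return (x * x) % (1 << bits)


print(square(12))                # integer
print(square(1.5))               # floating point
print(square(Fraction(3, 4)))    # exact rational, standing in for fixed point
print(square(2 + 3j))            # complex
print(square_wrapped(200, 8))    # result wraps in an 8-bit field
print(square_wrapped(200, 16))   # a 16-bit field holds the full product
```

The last two calls show why the size question matters: the same input gives a different answer once the output width is too small for the product.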
Bit

1
Processors have fixed, pre-defined data sizes
• 8-, 16-, 32-, 64-bit
• IEEE floating point
  ▪ Single precision
  ▪ Double precision
Consequently, so do
• programming languages
• operating systems
[Figure: Byte (char), bits 7 to 0: 1 0 0 1 1 0 1 0. Integer, bits 15 to 0: 1 0 0 1 ... 0 1 0.]

Field-Programmable Gate
Array (FPGA) architecture




Logic
Memory
Programmable connections
I/O
User-defined application
architecture
 Flexible, adaptable
 Inherently parallel
 Flexible data widths

STATIC
Mandated by the fixed architectures of processor-based systems
Examples:
• Nibble, Byte
• Int, Dint, Qint
• Word, Dword, Qword
• Float, Double
• Fix16, Fix32
• COM-based types

DYNAMIC
Provides the freedom and richness available in FPGA architectures
Five predefined dynamic sets
• Complex
• Fixed-point
• Floating
• Signed
• List
Must always convert back to static when communicating with fixed architecture
[Figure: data set compositions. Complex = Real + Imaginary (each may be any data set, shown as Variant). Fixed = Sign, Whole Number, Fraction. Floating = Sign, Exponent, Mantissa.]

MSB First (MSB):

LSB First (LSB):
Same binary information,
ordered differently
 Available in a variety of sizes
 Specify during synthesis


Binary Tree (BIN):
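The three orderings can be illustrated in Python with the byte pattern 1 0 0 1 1 0 1 0 used earlier in the deck. This is only a sketch: the function names are hypothetical, and treating the Binary Tree layout as nested halves (Byte, Nibble, Dbit, Bit) is an assumption based on the data set names used elsewhere in these slides.

```python
def msb_first(value, width):
    """Bits of value, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def lsb_first(value, width):
    """Bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def binary_tree(bits):
    """Split a power-of-two-length bit list into nested halves down to bits."""
    if len(bits) == 1:
        return bits[0]
    half = len(bits) // 2
    return (binary_tree(bits[:half]), binary_tree(bits[half:]))


value = 0x9A  # 1 0 0 1 1 0 1 0
print(msb_first(value, 8))
print(lsb_first(value, 8))
print(binary_tree(msb_first(value, 8)))
```

All three results carry the same eight bits; only the arrangement differs.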
Manipulating Data Sets
Exposer: Decompose a data set into its constituent
data sets
 Collector: Compose data sets into a single data set

[Figure: a Collector composes Real and Imaginary (Variant) into Complex; an Exposer decomposes Complex into Real and Imaginary.]
For those familiar with C, think of a data set as a
struct:
struct Complex {
    variant Real;
    variant Imaginary;
};
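The struct analogy can be carried one step further. The Python sketch below models a Collector and an Exposer for the Complex data set; the class and function names are hypothetical, and `Any` plays the role of Variant, so a member may itself be another data set.

```python
from typing import Any, NamedTuple


class Complex(NamedTuple):
    """A data set with two members; each member may be any data set."""
    real: Any
    imaginary: Any


def collector(real, imaginary):
    """Compose two data sets into a single Complex data set."""
    return Complex(real, imaginary)


def exposer(value):
    """Decompose a Complex data set into its constituent data sets."""
    return value.real, value.imaginary


z = collector(1.5, -2)
print(exposer(z))

# Members are untyped, so data sets can nest.
nested = collector(collector(1, 2), 3.5)
print(exposer(nested))
```

An Exposer applied to the output of a Collector returns the original members, which is the property the INVERT walkthrough relies on.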
Variant
Second Verse, Same as the First
INVERT BIT FUNCTION




From Primitive Objects
Bit input and output
Atomic-level primitive
Cannot be further reduced
Bit
INVERT VARIANT FUNCTION



From CoreLib/Gates
Variant input and output
Must be further reduced
during synthesis
Variant
INVERT calls itself:
Recursion!
Variant
Output recast to the
same type as the Input
Variant

What happens if you compile
the following design for the
Byte data set?
[Figure sequence: the Variant INVERT asks "At Bit level?" at each step. If No, it calls INVERT again on the exposed data sets; if Yes, DONE. For a Byte input the recursion descends Byte, Nibble, Dbit, Bit.]
Now resolve the output Collectors!
[Figure sequence: resolving output Collectors. The Variant outputs resolve in reverse order, Bit, Dbit, Nibble, Byte, until the top-level output is a Byte.]
Done!
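The recursion walked through above can be mimicked in a few lines of Python. This is a sketch under assumptions, not Azido's synthesis: a Bit is modeled as the integer 0 or 1, and every larger data set as a tuple of smaller ones, so a Byte is two Nibbles, each two Dbits, each two Bits. The helper `flatten` is hypothetical and exists only to print the result.

```python
def invert(data):
    """Invert a data set of any shape.

    At Bit level the primitive INVERT is applied. Anything larger is
    exposed into its members, each member is inverted by the same
    function, and the results are collected back into the same shape.
    """
    if isinstance(data, int):
        return data ^ 1
    return tuple(invert(member) for member in data)


def flatten(data):
    """List the bits of a nested data set in order."""
    if isinstance(data, int):
        return [data]
    return [bit for member in data for bit in flatten(member)]


# A Byte holding 1 0 0 1 1 0 1 0, MSB first.
byte = (((1, 0), (0, 1)), ((1, 0), (1, 0)))
print(flatten(invert(byte)))
```

Nothing in `invert` mentions a width, which is why the same definition also handles a Nibble, a 128-bit value, or an irregular List.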

The same INVERT function works for other
data sets
 From 1 to 128 bits
 Complex
 Fixed-Point
 Floating-Point
 Lists
CoreLib Simplifies Life

Reusable, flexible CoreLib library adds higher-complexity functions







Math
Control
Data Conversion
Memory
Grammatical operations
Polymorphic—supports various data sets and types
Other advanced libraries available from developers
 DSP, etc.


Library objects adapt to input and output data sets
Example: Same multiply object handles scalars,
vectors (Lists), floating point, fixed point, etc.

Asynchronous Operations

Not clocked
No data flow control (coming up)


Right-click
Right-click on an
element to reveal …
 Documentation
 Ports
 Description
 Attributes


Synchronous, clocked
Data flow control (GDBW)
Both asynchronous and synchronous
with flow control
 Minimum and Maximum functions

From 2 to 16 inputs
If in Tree view …
If in Name view …
Right click, Sort by Name
Right click, Sort by Tree
Synchronization and Data Flow Control

Go-Done-Busy-Wait (GDBW) mechanism
controls the data flow, synchronization, and
data sequencing of an application
 Fork and join
 Spring synchronization
 Back-pressure with differential latency
compensation
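[Figure: Functions #1, #2, and #3 in a chain. Each Done output drives the next function's Go input; each Busy output drives the previous function's Wait input.]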
Function #1 has new data to propagate
Downstream Function #2 is not busy and can
accept new data
If unable to complete in a single clock cycle, Function #2
signals Busy to upstream Function #1 until it is Done
Upstream Function #1 must not send new Go data until
Function #2 is no longer Busy
When finished, Function #2 signals Done to Function #3
Function #2 also is no longer Busy and accepts new Go data
from Function #1
Even if Function #1 and #2 have new data, they must wait
until Function #3 is no longer Busy
“Backpressure” or “Spring synchronization”
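The handshake described above can be approximated with a small cycle-based simulation. The Python sketch below is an assumption-laden model, not Azido's implementation: the `Stage` class, the `run` function, and the latencies are all hypothetical. A stage is Busy from the moment it accepts data on Go until its result is taken; an upstream stage holding a finished result waits while the downstream stage is Busy.

```python
class Stage:
    """One function block in a Go-Done-Busy-Wait chain (illustrative model)."""

    def __init__(self, name, latency, func):
        self.name = name
        self.latency = latency   # clock cycles needed per item
        self.func = func
        self.item = None         # result being worked on or held
        self.remaining = 0       # cycles of work left

    @property
    def busy(self):
        return self.item is not None

    @property
    def done(self):
        return self.item is not None and self.remaining == 0

    def go(self, item):
        if self.busy:
            raise RuntimeError(f"{self.name} is Busy; upstream must wait")
        self.item = self.func(item)
        self.remaining = self.latency

    def tick(self):
        if self.item is not None and self.remaining > 0:
            self.remaining -= 1


def run(stages, inputs, max_cycles=1000):
    """Clock the chain until every input has left the last stage."""
    pending = list(inputs)
    outputs = []
    for _ in range(max_cycles):
        last = stages[-1]
        if last.done:
            outputs.append(last.item)
            last.item = None
        # Walk downstream to upstream so a stage freed in this cycle
        # can accept new data in the same cycle.
        for up, down in reversed(list(zip(stages, stages[1:]))):
            if up.done and not down.busy:
                down.go(up.item)
                up.item = None
        if pending and not stages[0].busy:
            stages[0].go(pending.pop(0))
        for stage in stages:
            stage.tick()
        if not pending and not any(s.busy for s in stages):
            break
    return outputs


chain = [
    Stage("#1", latency=1, func=lambda x: x + 1),
    Stage("#2", latency=3, func=lambda x: x * x),
    Stage("#3", latency=1, func=lambda x: -x),
]
print(run(chain, [1, 2, 3]))
```

In this run the slow middle stage throttles the whole chain: Function #1 finishes early and then holds its result until Function #2 is free, which is the back-pressure behavior the slides describe.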

Synchronization
 Multiple data paths
 Different process latency through different paths
Synchronize all Dones
by creating a List
Only GO-DONE chain shown.
BUSY-WAIT removed for clarity.
Add
Data Path 1
Synchronize
Square Root
Data Path 2

Pipelining
 Maximize efficiency
 Ensure data stability
Widgets provide visibility
into design behavior both
during simulation (x86)
and real-time hardware
 Runtime widget provides
ability to single-step
through design using
“UseManualClk” and
“ManualClk” controls


Equivalent function = overload
 Name space
 Number of inputs


Symbol substitution : constraint wavefront :
chain rule
Underloading: Moving from the general to
the specific
 System Description provides specific options
 Redefines objects to work as system implements
them
Azido’s “Device Drivers”
[Figure: an Azido Application executes on both types of systems: x86 and FPGA-based (XEM3050, XEM6110-LX45).]
Describes the system on which
the Azido design will execute
 Compute resources and types
 Communication
… and others
 Essentially, the “device driver”




Use the x86 SD to initially design
and test the application
Almost every object in CoreLib has
an equivalent x86 SD for fast,
interactive simulation
Executes on the processor and
provides accurate simulation of
the design, ensuring successful place-and-route during synthesis
Contains objects and system-level
implementations mapped to specific
components and primitives within FPGA
system
 All Library objects and components contain
equivalent descriptions for each FPGA SD
 Different SDs can be created using Azido for
different FPGA-based systems from other
vendors

Runtime Widget Interface

All signals visible and interactive
 Attach input and output horns to transports
 Set traps on output horns to generate a message
visible in the message window



Interactively verify functionality using the x86
system description
Message window indicates nature of
error/warning and the location of object
where the error took place
Error messages logged within Azido folder

ScrollBar
 Default widget


TextBox
SpinEdit
 Input only


Button
Graphs
 Output only

Memo
 Output only

Decimal/Hex
Output
SET WIDGET TYPE



Right-click on widget
Select Change To …
…then select widget type
DECIMAL/HEX OPTIONS


Right-click on widget
Select Toggle Hex Display
Right-click
Right-click
SET WIDGET OPTION
Right-click on input or output
horn
 Click on cell under Attributes
 Select Widget from drop list
 Click on adjacent cell under
Values
 Select widget type from drop
list
 Click OK to accept
 Must resynthesize to see effect

Right-click

There are two attributes which facilitate debugging in Azido
 Global
 Trap



The Global attribute is used to create an output which is not
necessarily recognized as part of the footprint of the object.
It can also be used to create an implicit connection
between the input of one object and the output of another.
The Trap attribute is placed along with a message. The
message is displayed whenever data is propagated to that
output.
These attributes, when placed within the hierarchy of an
object, can be used to observe outputs from
within the hierarchy.
Viewing low-level signal with ease

Purpose:
 To demonstrate debugging
within Azido
Getting Started with Azido


Request access to Azido Beta site
Download and install Azido development
environment
 Azido runtime for Windows (and virtual machines)
 Xilinx ISE (optional, but required for hardware
interaction)


Learn about Azido
Tell us what you think
(but we need the thumb drives back!)








Data Sets
Graphical Representations
Exposers and Collectors
Context
Separation of Size and Type
Type=context size variant
Variant exposer collector = data grammar
Cast / convert /context
Data Set Color Coding
Variant
List
Fixed
Compiled as Byte data set
QUESTION: Why do we need to convert the Fixed data type to a List
in this example?
ANSWER:
The Fixed data type does not exist in the x86
architecture. Convert it to a List for display.
From 2 to 36
connections
Packs or Unpacks
bit lists into smaller,
equal-sized lists
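A minimal Python sketch of the pack and unpack idea follows. The function names and the choice to raise an error on uneven splits are assumptions for illustration; the slide states only that bit lists are split into smaller, equal-sized lists.

```python
def unpack(bits, parts):
    """Split a bit list into `parts` smaller lists of equal size."""
    if parts < 1 or len(bits) % parts != 0:
        raise ValueError("bit list does not divide into equal-sized parts")
    size = len(bits) // parts
    return [bits[i:i + size] for i in range(0, len(bits), size)]


def pack(parts):
    """Join equal-sized bit lists back into one bit list."""
    return [bit for part in parts for bit in part]


byte = [1, 0, 0, 1, 1, 0, 1, 0]
nibbles = unpack(byte, 2)
print(nibbles)
print(pack(nibbles))
```

Packing the unpacked lists returns the original bit list, so the two operations are inverses of each other.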