Transcript Document

EEE515J1
ASICs and DIGITAL DESIGN
Lecture 6: Data Processors and
Control Units
Ian McCrum
Room 5D03B
Tel: 90 366364 voice mail on 6th ring
Email: [email protected]
Web site: http://www.eej.ulst.ac.uk
Last changed 01/11/04@18:00
www.eej.ulster.ac.uk/~ian/modules/EEE515J1/
EEE515J1_L6-1/21
Designing Larger Digital Systems:
•
We have seen how designing Finite state machines (FSMs) is relatively
straightforward once the state diagram or design specification is
drawn.
•
Together with combinational logic these design methods will stand you
in good stead.
•
Of course there are problems that would be rather large or tedious to
solve using these methods such as a system with a large number of
inputs or one with a large variety of actions or steps to be
performed.
•
We can modify the FSM approach.
•
Having one FSM send inputs to and receive outputs from another FSM is a
useful technique; such cascaded or coupled FSMs are found in real
designs.
•
The design techniques used will depend on whether the two FSMs have
synchronous clocks.
•
If not, then the system is an asynchronous one and will use handshake
and control signals to effect synchronisation between the machines.
•
We will not dwell on such machines here except to note that
testing asynchronous systems is difficult and error-prone, and can give a
design which is difficult to modify late in the design cycle.
The Algorithmic State Machine method
• Other modifications to the basic FSM method might add
memory such as stack or heap structures and have state
machines route data to and from these memory
structures.
• A more general approach is described below.
• Another alternative is to use a computer or
microprocessor system and write software.
• Actually a computer is just an instance of a digital
system and the stored program concept on which its
application is based is similar to the design method
below so it should come as no surprise that if you can
master the method below you will understand how
computers actually work, and could even design your
own CPU.
The ASM Method
•
Instead of concentrating on simply moving from state to state
we can decompose our problem into a number of sections.
•
If we must process input data and can identify simple
operations to be performed on the data then we can sequence
and control the flow of data to and from each data processing
block using FSM design methods.
•
Thus we partition our system into a “DATA PROCESSOR” and a
“CONTROL LOGIC” section.
•
The data processor has functional blocks that “do something”
to the incoming data or locally generated data such as a
count of items processed.
•
A good design rule is that each functional block should do
one thing and be easily described. It might be a counter, an
adder, a comparator or a shift register. It could even be a
complete ALU.
•
The Control Logic sends control signals to each block and
receives status signals or information about the data but not
the data itself. Many choices can be made by the designer but
as a rule this partition gives an easily designed, easily
tested and easily modified system.
The ASM Method
[Figure: block diagram of the ASM partition. Input Data enters the
DATA PROCESSOR (simple blocks, each of which does a single, simple,
easily expressed function) and leaves it as Output Data. External
Inputs (only a few and preferably synchronised to the system clock)
enter the CONTROL LOGIC (actually a FSM; receiving inputs and deciding
what sequences of outputs to generate). Control Signals run from the
Control Logic to the Data Processor; Status Signals run back.]
•
An ALU or Arithmetic Logic Unit typically has 2 data inputs and a
data output, all 8 or 9 bits wide. It also has 3 or 4 inputs to
indicate what to do. The 3 bit binary number 000…111 might specify
F=A+B, A-B, B-A, A and B, A or B and maybe F=A, F=B and F=11111111.
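The ALU described on this slide can be sketched behaviourally in a few lines of Python. This is only an illustration: the function name `alu`, the particular assignment of the codes 000 to 111 to operations, and the truncation of the result to `width` bits (ignoring any ninth carry bit) are assumptions, not something the slide fixes.

```python
def alu(op: int, a: int, b: int, width: int = 8) -> int:
    """Return F for a 3-bit operation code (the encoding is an assumption)."""
    mask = (1 << width) - 1
    operations = {
        0b000: a + b,   # F = A + B
        0b001: a - b,   # F = A - B
        0b010: b - a,   # F = B - A
        0b011: a & b,   # F = A and B
        0b100: a | b,   # F = A or B
        0b101: a,       # F = A
        0b110: b,       # F = B
        0b111: mask,    # F = all ones (11111111 for 8 bits)
    }
    # Keep only `width` bits, as a fixed-width data output would.
    return operations[op & 0b111] & mask


print(alu(0b000, 200, 100))  # the 9-bit sum 300 truncated to 8 bits
```

The control logic would drive the three `op` lines; the data inputs never pass through the control logic itself.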
Example of ASM method
• Averaging 16 numbers each of 8 bits in size
• Method 1: use eight 8-bit adders to add 8 pairs of numbers; this
gives eight 9-bit numbers (worst case)
• Use four 9-bit adders to give four 10-bit answers
• Use two 10-bit adders to give two 11-bit answers
• Finally use an 11-bit adder giving a 12-bit answer. We
can use a trick to “divide by 16”: simply use the 8
leftmost bits of the 12-bit number, akin to shifting
right 4 bits; this is division by 2^4.
• This is obviously most wasteful of space, but achieves
a reasonably fast answer, 4 add-times.
• Actually adders are slow, though there are a number of
special techniques to speed up addition, cf. carry-lookahead adders.
• Clearly a more space-efficient system would be to do
the calculation the way humans would do it. Use a
running total and add sequentially, i.e. use one adder
and pass the data through it one number at a time.
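The two approaches on this slide can be compared with a small behavioural sketch. The function names `average_tree` and `average_sequential` are made up for illustration, and Python integers stand in for the 9, 10, 11 and 12 bit adder outputs; the point is only that both routes give the same answer and that dropping the 4 rightmost bits divides by 16.

```python
def average_tree(values):
    """Method 1: layers of 8, 4, 2 and 1 adders, then drop the low 4 bits."""
    assert len(values) == 16
    layer = list(values)
    while len(layer) > 1:
        # Each pass models one layer of adders working in parallel.
        layer = [layer[i] + layer[i + 1] for i in range(0, len(layer), 2)]
    return layer[0] >> 4  # keep the 8 leftmost bits of the 12-bit sum


def average_sequential(values):
    """One adder and a running total, one number at a time."""
    assert len(values) == 16
    total = 0
    for v in values:
        total += v
    return total >> 4


samples = [255] * 16
print(average_tree(samples), average_sequential(samples))
```

The tree needs 15 adders but only 4 add-times; the sequential version needs one adder and 16 add-times.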
Example of ASM method
[Figure: data processor and ASM chart for the averager. Data
processor blocks and signals: ADDER (DATA IN, ADD), REGISTER (STROBE,
CLEAR), resettable COUNTER (CLEAR, COUNT), DETECT 16 (EQ16), CLOCK and
the DATAVALID output. The ASM chart has states S0 to S6; S (START) is
tested in S0 and EQ16 is tested in S5.]
State equations
S0.D := S0./S + S2
S1.D := S0.S
S2.D := S5.EQ16
S3.D := S1 + S6
S4.D := S3
S5.D := S4
S6.D := S5./EQ16
Output equations
CLEAR = S1
ADD = S3
STROBE = S5
COUNT = S6
DATAVALID = S2
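The state and output equations above can be checked by stepping them in software. The sketch below models only the control logic: `S` and `EQ16` are supplied as inputs on each clock, and the helper names `next_state`, `outputs`, `active` and `trace` are invented for this illustration. It assumes the machine powers up in S0.

```python
STATES = ("S0", "S1", "S2", "S3", "S4", "S5", "S6")


def next_state(cur, S, EQ16):
    """Apply the one-hot D-input equations to a dict of state bits."""
    return {
        "S0": (cur["S0"] and not S) or cur["S2"],
        "S1": cur["S0"] and S,
        "S2": cur["S5"] and EQ16,
        "S3": cur["S1"] or cur["S6"],
        "S4": cur["S3"],
        "S5": cur["S4"],
        "S6": cur["S5"] and not EQ16,
    }


def outputs(cur):
    """Output equations: each control signal is simply one state bit."""
    return {"CLEAR": cur["S1"], "ADD": cur["S3"], "STROBE": cur["S5"],
            "COUNT": cur["S6"], "DATAVALID": cur["S2"]}


def active(cur):
    """Return the name of the single hot state (one-hot check included)."""
    names = [n for n in STATES if cur[n]]
    assert len(names) == 1, "one-hot property violated"
    return names[0]


def trace(inputs):
    """Clock the machine once per (S, EQ16) pair, starting in S0."""
    cur = {n: n == "S0" for n in STATES}
    visited = [active(cur)]
    for S, EQ16 in inputs:
        cur = {n: bool(v) for n, v in next_state(cur, S, EQ16).items()}
        visited.append(active(cur))
    return visited


# Idle one clock, start, go round the add loop twice, then see EQ16.
print(trace([(0, 0), (1, 0)] + [(0, 0)] * 7 + [(0, 1), (0, 0)]))
```

The printed sequence shows the loop S3, S4, S5, S6 repeating until EQ16 is seen in S5, after which the machine passes through S2 (DATAVALID) and returns to S0.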
Signals to the outside world
• Several unanswered problems remain with the previous design
– Exactly when the input arrives
– The datavalid pulse is only available for a short time
– It would be “better” (“cheaper”?) to use a countdown
counter.
• Often when doing an initial ASM design, the interface to the
outside world (or the next machine in the chain) is not given
much attention.
• A typical, useful approach is to provide handshake lines to
allow flow control. Thus
[Figure: timing diagrams of handshake schemes. Signals shown: Data
out, DATA VALID, REQUEST (REQ), STROBE and ack.
Sender driven: o/p data, then o/p strobe, keep it high until ack is
seen from far end.
RECEIVER driven: wait for REQUEST i/p, then o/p data, then o/p
DATAVALID, often just a timed pulse, a low-high-low.]
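The sender-driven scheme can be sketched as a sequence of signal changes. This is a rough model under stated assumptions: the function name `sender_driven_transfer` is hypothetical, the receiver is assumed to latch the data as soon as it sees STROBE high, and ack is assumed to return low once STROBE has dropped (the slide only says STROBE is held high until ack is seen).

```python
def sender_driven_transfer(items):
    """Step through a STROBE/ack handshake for each item.

    Returns the items the receiver latched and a trace of
    (data_out, strobe, ack) tuples, one per signal change.
    """
    strobe, ack = 0, 0
    received, trace = [], []
    for item in items:
        data_out = item                      # sender outputs the data first
        trace.append((data_out, strobe, ack))
        strobe = 1                           # then raises STROBE
        trace.append((data_out, strobe, ack))
        received.append(data_out)            # receiver latches while STROBE is high
        ack = 1                              # and answers with ack
        trace.append((data_out, strobe, ack))
        strobe = 0                           # sender has seen ack, drops STROBE
        trace.append((data_out, strobe, ack))
        ack = 0                              # assumed: receiver releases ack
        trace.append((data_out, strobe, ack))
    return received, trace


got, signals = sender_driven_transfer([3, 7])
print(got)
for row in signals:
    print(row)
```

Because each side waits for the other, the transfer works whatever the relative speeds of the two machines, which is the flow control the slide refers to.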
ASM machines demand synchronous logic
• Even simple latches are best driven in a synchronous
manner. Applying a “latch” or “strobe” signal to the
clock inputs of a register (e.g. 8 D-type flip-flops)
will work, but a more testable circuit results
if the master clock goes to every component.
• Thus the D-types spend most of their time in a “held”
state and only “load data” when the strobe signal is
high.
• This is easily achieved by adding multiplexors.
[Figure: a register clocked directly by strobe, compared with a
register clocked by the master clock with strobe controlling a
multiplexor.]
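The multiplexor arrangement can be modelled in a few lines. In this sketch the class name `EnabledRegister` and its `clock` method are illustrative only; each call to `clock` stands for one edge of the master clock, and the multiplexor chooses between new data and the register's own output.

```python
class EnabledRegister:
    """A register of D-types clocked on every master clock edge.

    A multiplexor in front of the D inputs selects new data when strobe
    is high and the current Q otherwise, so the register "holds".
    """

    def __init__(self, width=8):
        self.mask = (1 << width) - 1
        self.q = 0

    def clock(self, d, strobe):
        mux_out = d if strobe else self.q   # the added multiplexor
        self.q = mux_out & self.mask        # D-types load on every edge
        return self.q


reg = EnabledRegister()
reg.clock(0x5A, strobe=1)         # strobe high: data is loaded
reg.clock(0xFF, strobe=0)         # strobe low: old value recirculates
print(hex(reg.q))
```

The clock line itself is never gated, which is what keeps the circuit fully synchronous.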
Using a CLOCK
•
The role of the clock is very important in the ASM method.
•
As has been said before, having everything synchronised to a single clock can
ease testing and last minute design modifications.
•
In very large systems you will find systems that use two phase clocks where
the rising edge is used by one section of a system and the next section uses
the falling edge.
•
Or latches are provided to isolate adjacent sections.
•
Multiphase clocks exist, a 4 phase solution allows “the soldiers all to march
in step”.
•
Very large fast systems will have problems routing a clock signal from one
edge of a chip to the other and several solutions exist to fix this.
•
Often the designer will lay down the clock distribution network before adding
other gates.
•
A matrix of equal delay buffers may allow distribution with a low timing skew
across chip.
•
Also used today is local generation of the clock and a system of phase locking
(cf. www.altera.com for a description of their DPLL cells). This can also
allow the clock frequency off-chip to be much lower than the clock on the
chip; the phase locking can be done at a submultiple of the clock frequency.
I first saw this on a Transputer chip where the chip internally worked at 20 MHz
but you only needed to supply the chip with a 5 MHz oscillator. The PCB layout
was less critical and the emitted RF noise was much less with this approach.
You may be aware it is used a lot in modern PC CPU design; sometimes the
internal clocks run at 3.5 times the external clocks! (cf.
www.tomshardwareguide.com)
Synchronous Control signals:
•
A key to initial ASM designs is to have very strict
synchronisation. This rule has even prompted some TTL companies
to bring out two versions of their chips; the 74161 and 74163
counters are identical except that the RESET action is
asynchronous on the 74161 but synchronous on the 74163.
•
Once you are familiar with the method and have a dozen designs
under your belt you may relax this strict rule somewhat.
•
Chips such as counters and shift registers can undertake various
control actions; the RESET, LOAD, PRESET, DIRECTION controls for
a counter are all VERBS of ACTION. An important part of the
method is to recognise that whilst your control logic may assert
these control inputs they are NOT acted upon until the next
clock pulse. Thus the ACTION is not taken until the clock pulse.
This makes the design diagrams easier to follow.
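The idea that a control input is asserted now but acted upon only at the next clock pulse can be shown with a small model of a counter. The class `SyncCounter`, its attribute names and the priority order RESET, then LOAD, then COUNT are assumptions made for this sketch rather than the behaviour of any particular chip.

```python
class SyncCounter:
    """A counter whose control inputs only take effect on clock()."""

    def __init__(self, width=4):
        self.mask = (1 << width) - 1
        self.q = 0
        # Control inputs: setting these changes nothing by itself.
        self.reset = 0
        self.load = 0
        self.count = 0
        self.data = 0

    def clock(self):
        """One clock pulse: act on whatever controls are asserted now."""
        if self.reset:
            self.q = 0
        elif self.load:
            self.q = self.data & self.mask
        elif self.count:
            self.q = (self.q + 1) & self.mask
        return self.q


c = SyncCounter()
c.count = 1
for _ in range(3):
    c.clock()
c.reset = 1          # RESET asserted...
print(c.q)           # ...but the count is unchanged until the clock pulse
c.clock()
print(c.q)
```

An asynchronous RESET would instead clear `q` the moment the input was asserted, which is exactly what makes such designs harder to reason about.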
The Design Method
•
There are two main steps both graphical in nature; a block
diagram of the data processor and the ASM chart describing
the sequence of data operations to be performed. Different
problems sometimes lend themselves to applying these in
different orders. The data processor is a block diagram or
circuit diagram where each block is a simple functional
circuit. As a guide each block should be available as a TTL
chip but if you have little experience of the TTL family a
further guide should be to ensure that it performs a single,
easily explained task. Each block should be simple to design
such as a combinational problem or a very simple FSM.
•
All control signals MUST be synchronous. Combinational
circuits such as ADDERS might have a synchronous ADD control
signal or you can just assume the answer pops out the bottom
of the adder. You must ensure that the propagation delays of
each data processor block do not cause problems; if these are
all much faster than the clock then there will be no problem.
It is possible to insert dummy states into the Control logic
to wait for answers to appear, or we must complicate our
system by adding status signals, e.g. “ADDER_COMPLETE”.
The Design Method continued
•
The ASM chart is comprised of boxes of just three types.
•
It superficially resembles a programming flowchart. There is one crucial
difference; Programming Flowcharts are read sequentially from the top of
the page to the bottom, if there is only one CPU then this also
represents the time behaviour of the program.
•
Obviously in a hardware circuit with a couple of counters the counting of
one counter does not wait for the counting of another. Both pieces of
hardware operate at the same time, concurrently.
•
In fact the different parts of the Data Processor in an ASM all operate
at the same time. If we have a section of an ASM chart where a counter is
told to count, an input is tested and an output is generated then these
actions will all be scheduled to happen at the same time.
•
Of course it will take the next clock pulse to action the events.
•
Each “state” in an ASM chart has only one output box.
•
It may have a number of input testing boxes and output boxes conditional
on some inputs but there must only be one main output box per state.
•
All arrows arriving at that state must go through this box.
•
We label the state by labelling that output box but be clear where the
dotted lines that form the boundary of our state lie, see Figure 2
overleaf.
The Design Method continued
[Figure 2: Different shapes of an ASM. The state box S0 (state code
0001) contains A <- A+1 and R1 <- 0; decision boxes test F and E; a
conditional output box, taken when E = 1, contains R2 <- 11111111.]
• Note some texts will name the state inside a bubble shown as a
dotted circle. Here I have listed the state S0, with a state code
of 0001. (I will use one-hot codes for the state code but there is
no reason why a more efficient code couldn’t be used.)
• When “in” state zero you are in all boxes inside the dotted line
simultaneously, depending on input conditions. Thus the single bit
input “E” is tested at the same time as the single bit input “F” is
tested. The PRESET or LOAD_ALL_ONES control signal of the 8 bit
register R2 is asserted if E is high; it flickers if E flickers,
but of course we should try to use synchronous inputs where
possible. The Adder (or counter?) A is to increment and the RESET
signal of R1 is asserted.
• Maybe you see now why all control signals are only activated on a
clock pulse. All these control signals are set or cleared but NO
action takes place until the clock pulse arrives that will take
the machine to its next state, down one of the three arrows
exiting the box.
The Design Method continued
•
One consequence of this method is that if a test is
made on entering a state then it is based on the old
values of the inputs.
•
If the state alters an input then we must be most careful. If the
conditional boxes above tested the counter/adder A then the state would exit
depending on the old value of A, despite A altering as we left the
state.
•
It is a good idea not to test a signal in the same state as you
attempt to alter it.
•
It is easy to add “dummy” states (empty state boxes) to cause a one
clock cycle delay and this can decouple the two effects. It is usually
a good idea to avoid two tests within one state.
•
These rules or guidelines can be broken but adherence will increase
the likelihood that the system will work!
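The “old value” hazard can be seen in a short simulation. The sketch below is a hypothetical two-variant machine, not one from the lecture: in one variant A is incremented and tested in the same state, in the other a dummy TEST state separates the two. The function name `run` and the state names are assumptions.

```python
def run(test_in_same_state, limit=3):
    """Increment A until a test for A == limit passes; return the final A."""
    a = 0
    state = "COUNT"
    while state != "DONE":
        if state == "COUNT":
            next_a = a + 1  # A <- A+1 is asserted; it happens at the clock edge
            if test_in_same_state:
                # The decision box sees the OLD value of A.
                next_state = "DONE" if a == limit else "COUNT"
            else:
                next_state = "TEST"
        else:
            # TEST: a dummy state that alters nothing, so A is already updated.
            next_a = a
            next_state = "DONE" if a == limit else "COUNT"
        a, state = next_a, next_state  # the clock pulse
    return a


print(run(test_in_same_state=True))   # A overshoots the limit by one
print(run(test_in_same_state=False))  # A stops at the limit
```

The extra state costs one clock per iteration but removes the off-by-one behaviour.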
Counting ‘1’s in a 16 bit word.
The previous example was extremely abstract; a more typical application
follows. We begin with an English description of the problem.
“A system is needed that will count the number of ones in a 16 bit word.
The design should be easily modified for a 32 bit word.”
This is a nice example because, as in real life, there are many possible
solutions. The good designer will reject all but one of these, and the one that is
picked will be picked for a good reason! Here we will adopt an ASM method to
illustrate the design method. Speed of response or cost may push a real
designer to different conclusions.
Solution 1a: a large combinational circuit fed from Register R1
containing the word.
Solution 1b: create a 4 bit cell and iterate the answer. Adders
will be needed to combine the four outputs and this will be a
slower, but easier to design solution.
The answer will be between zero and 16 inclusive. This needs 5 bits
to represent it (00000…10000).
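Solution 1b can be sketched as follows. Here the 4 bit cell is modelled as a 16-entry lookup table, which is one possible way of describing a small combinational circuit; the names `ones_in_nibble` and `count_ones_cells` are illustrative assumptions.

```python
def ones_in_nibble(nibble):
    """The 4 bit cell: number of ones in a 4-bit value (0 to 4)."""
    table = (0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4)
    return table[nibble & 0xF]


def count_ones_cells(word, width=16):
    """Use one cell per 4 bits and add the cell outputs together."""
    total = 0
    for shift in range(0, width, 4):
        total += ones_in_nibble(word >> shift)  # the combining adders
    return total


print(count_ones_cells(0xA5A5))
print(count_ones_cells(0xFFFFFFFF, width=32))
```

Moving to a 32 bit word only needs more copies of the same cell and a wider adder, which is why this version is easier to design than one large circuit.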
Solution 2: Shift Register and adder…
• This will demonstrate the ASM method quite nicely. Note that
the two solutions trade space and time. The pure combinational
approach is fastest but largest.
• We will use a shift register and shift each bit out in turn; if it is
a ‘1’ we will increment a counter.
• As is often the case we need to know when to stop. This could
be done by having a loop counter keep track of how many
shifts we had done. Beginners usually set up a counter to go
from zero (or 1) to 16. This may be out by one and a
comparator is needed.
• Experienced ASM designers (and programmers) preload a
counter with 15 and decrement to zero or find an alternative.
• Here we will use a clever trick to save time. By shifting zeros
into our word as we shift our data out we can test for all zeros
to exit our loop.
• In the case where there are few ones this may give an
impressive speed advantage, at the disadvantage that the
execution time of our machine varies according to the input
data; that is not always allowed.
[Figure: initial sketch of the Data Processor. Register R1 containing
the word, with SHIFT and LOAD controls; a simple combinational circuit
(NOR gate) that detects ALL_ZEROS; a Counter with COUNT and LOAD
controls and ‘1’ on each of its parallel data inputs.]
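The all-zeros exit trick described in the bullets above can be tried out behaviourally before any hardware is drawn. In this sketch the function name `count_ones_shift` is made up, and shifting the most significant bit out first is an assumption; the slide does not say which end of the register is shifted out.

```python
def count_ones_shift(word, width=16):
    """Shift bits out one at a time, stopping when the register is all zeros.

    Returns (number of ones, number of shifts performed).
    """
    mask = (1 << width) - 1
    r1 = word & mask
    count = 0
    shifts = 0
    while r1 != 0:                     # the ALL_ZEROS detector ends the loop
        bit_out = (r1 >> (width - 1)) & 1
        r1 = (r1 << 1) & mask          # a zero is shifted in behind the data
        count += bit_out               # increment the counter if it was a '1'
        shifts += 1
    return count, shifts


print(count_ones_shift(0x8000))  # a single one at the top: very few shifts
print(count_ones_shift(0x0001))  # a single one at the bottom: all 16 shifts
```

The two calls show the point made on the slide: the answer is always right, but the number of clock cycles depends on the data.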
Solution 2: Shift Register and adder…
[Figure: the Data Processor with its Control Logic. Blocks and signals
shown: the input S to the Control Logic (implementing the ASM chart
below); Register R1 containing the word, with SHIFT and LOAD controls;
a simple combinational circuit (NOR gate) that detects ALL_ZEROS; a
Counter with COUNT and LOAD controls and ‘1’ on each of its parallel
data inputs; a D-type flip-flop (D, Q).]
Figure 9: The Data Processor, one way of solving the problem;
alternatively, leave out the D-type flip-flop. Not shown here is how
the answer is read from the counter and how the input is wired up to
the shift register’s parallel data inputs.
Solution 2: Shift Register and adder…
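The one-hot equations for this machine are as follows…
T0.d = T0 * /S + T1 * Z
T1.d = T3 * E + T0 * S
T2.d = T1 * /Z + T3 * /E
T3.d = T2 ; this causes a one clock delay between altering E and
testing E.
Also the control signals are
LOAD = T0 * S
COUNT = T1
SHIFT = T2
[ASM chart: T0 (INITIAL STATE) tests S; if S = 1 then R1 <- INPUT
(LOAD) and R2 <- ‘1111’ (LOAD) and the next state is T1, otherwise the
machine stays in T0. T1 (COUNT) tests Z; Z = 1 returns to T0, Z = 0
goes to T2. T2 (SHIFT) goes to T3. T3 (DUMMY STATE) tests E; E = 1
goes to T1, E = 0 goes to T2.]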
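The equations can be exercised by a clock-by-clock simulation. The sketch below makes several assumptions that the slides leave open and that are labelled in the comments: the most significant bit is shifted out first, E is the bit captured as the register shifts, R2 is a counter wide enough to hold the answer (5 bits for a 16 bit word) and is preloaded with all ones so that the first COUNT brings it to zero, and S is high for the first clock only. The function name `count_ones_asm` is hypothetical.

```python
def count_ones_asm(word, width=16, counter_bits=5):
    """Step the one-hot machine T0..T3; return (count, clocks used)."""
    word_mask = (1 << width) - 1
    count_mask = (1 << counter_bits) - 1
    t0, t1, t2, t3 = True, False, False, False   # start in T0
    r1 = r2 = e = 0
    s = 1                                        # assumed: one-clock START pulse
    clocks = 0
    while True:
        z = r1 == 0                              # ALL_ZEROS status signal
        # Control signals asserted in the present state.
        load, count, shift = t0 and s, t1, t2
        # Next-state (D input) equations from the slide.
        n0 = (t0 and not s) or (t1 and z)
        n1 = (t3 and e) or (t0 and s)
        n2 = (t1 and not z) or (t3 and not e)
        n3 = t2
        # The clock pulse: every asserted action happens now.
        if load:
            r1 = word & word_mask
            r2 = count_mask                      # assumed: preload with all ones
        if count:
            r2 = (r2 + 1) & count_mask
        if shift:
            e = (r1 >> (width - 1)) & 1          # assumed: MSB shifted out into E
            r1 = (r1 << 1) & word_mask           # zero shifted in
        t0, t1, t2, t3 = bool(n0), bool(n1), bool(n2), bool(n3)
        s = 0
        clocks += 1
        if t0:                                   # back in the initial state
            return r2, clocks


print(count_ones_asm(0x8000))
print(count_ones_asm(0xA5A5)[0])
```

Tracing a word with a single one at the top shows the path T0, T1, T2, T3, T1, T0, with the dummy state T3 giving E time to settle before it is tested.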
Try the tut questions!
See the file ASMTUTS.pdf on the website.
The only “trick” to some of them is the use of a pipeline, a line of
registers to allow access to older data…
I’ll do a DSP pipeline design on the board; it’s not hard. Remember
real ADCs will need to be given an SC control signal and will return
an EOC status signal. These stand for START_CONVERSION and
END_OF_CONVERSION.