Transcript Slide 1

EEE515J1
ASICs and DIGITAL DESIGN
Lecture 7: CPUs; The SHC1, Simple Hypothetical CPU #1
Ian McCrum
Room 5B18
Tel: 90 366364 voice mail on 6th ring
Email: [email protected]
Web site: http://www.eej.ulst.ac.uk
(old archive http://tigger.engj.ulst.ac.uk/~ddij23 )
Last changed
30/11/07@18:00
30/11/07
www.eej.ulster.ac.uk/~ian/modules/EEE515J1/
files
1
Common ASM DATA PROCESSOR blocks

[Block diagram: Input Data enters the DATA PROCESSOR and Output Data leaves it; External Inputs enter the CONTROL LOGIC; Control Signals run from the CONTROL LOGIC to the DATA PROCESSOR and Status Signals run back]
DATA PROCESSOR: simple blocks, each of which does a single, simple, easily expressed function.
CONTROL LOGIC: actually an FSM; receiving inputs and deciding what sequences of outputs to generate.
External Inputs: only a few, and preferably synchronised to the system clock.
We find we often get data from the outside world or an internal storage register, process it in some way, and put the result back into an internal register or send it to the outside world.
2
More common DATA PROCESSOR blocks
[Diagram 1: COUNTER (RESETTABLE) with CLEAR, COUNTUP and CLOCK inputs; a DETECT block compares the count with 16 and outputs EQ16]
[Diagram 2: REGISTER (load number or constant) with LOADSTARTVALUE; COUNTER (RESETTABLE) with LOAD, COUNTDOWN and CLOCK inputs; a DETECT block outputs EQ zero]
In designing ASM machines we often need to repeat a set of operations a number of times. Hence we will often have counters and some means of detecting when a count is reached (or counters that count down and a zero detector: a NOR gate!).
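The count-down-and-detect-zero idea can be sketched in a few lines of Python. This is only an illustrative model, not part of the lecture material: the class name `DownCounter`, the helper `nor_zero_detect` and the 4-bit width are assumptions made for the example.

```python
def nor_zero_detect(value: int, width: int = 4) -> int:
    """Model a wide NOR gate: output is 1 only when every counter bit is 0."""
    bits = [(value >> i) & 1 for i in range(width)]
    return 0 if any(bits) else 1


class DownCounter:
    """Loadable down-counter (hypothetical model, 4 bits wide by default)."""

    def __init__(self, width: int = 4):
        self.width = width
        self.count = 0

    def clock(self, load: int = 0, start_value: int = 0, countdown: int = 0) -> None:
        """One clock edge: LOAD takes priority over COUNTDOWN."""
        mask = (1 << self.width) - 1
        if load:
            self.count = start_value & mask
        elif countdown:
            self.count = (self.count - 1) & mask

    @property
    def zero(self) -> int:
        return nor_zero_detect(self.count, self.width)


def repeat_n_times(n: int) -> int:
    """Load n, then count down until the zero detector fires; return passes made."""
    counter = DownCounter()
    counter.clock(load=1, start_value=n)
    passes = 0
    while not counter.zero:
        passes += 1                 # the repeated set of operations would go here
        counter.clock(countdown=1)
    return passes
```

Loading the start value once and testing a single `zero` line is what keeps the controlling FSM small: it only needs one status input to decide whether to loop again.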
3
More general purpose data processing block

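[Diagram: two versions of the same block. In each, input registers (REG) with LOAD and CLEAR lines feed an arithmetic block, and a result REGISTER has LOADRESULT and CLEARRESULT lines. One version uses an ADDER with an ADD line; the other uses an ALU with ALU Function code lines]
ALU can output A+B, A-B, B-A, A, B, A AND B, A OR B, A XOR B, NOT A, NOT B using 4 Function code lines. It can also output STATUS bits Z, C, N, V (see 74F181 datasheet).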
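A rough behavioural model of such an ALU is sketched below. It is an assumption-laden illustration: the 8-bit width, the use of operation names instead of real 4-bit function codes, and the carry convention for subtraction (carry = 1 means no borrow) are choices made for this example and are not taken from the 74F181 datasheet.

```python
MASK = 0xFF  # assumed 8-bit data path


def add_with_flags(x: int, y: int, carry_in: int = 0):
    """Add two 8-bit values; return (result, carry out, signed overflow)."""
    full = x + y + carry_in
    result = full & MASK
    carry = full >> 8
    # Overflow: operands have the same sign bit but the result's sign differs.
    overflow = 1 if (~(x ^ y) & (x ^ result) & 0x80) else 0
    return result, carry, overflow


def alu(fn: str, a: int, b: int):
    """Return (result, flags) where flags holds the Z, C, N and V status bits."""
    a &= MASK
    b &= MASK
    carry = overflow = 0
    if fn == "A+B":
        result, carry, overflow = add_with_flags(a, b)
    elif fn == "A-B":  # two's complement subtraction: A + NOT B + 1
        result, carry, overflow = add_with_flags(a, ~b & MASK, 1)
    elif fn == "B-A":
        result, carry, overflow = add_with_flags(b, ~a & MASK, 1)
    else:
        result = {
            "A": a,
            "B": b,
            "A AND B": a & b,
            "A OR B": a | b,
            "A XOR B": a ^ b,
            "NOT A": ~a & MASK,
            "NOT B": ~b & MASK,
        }[fn]
    flags = {"Z": int(result == 0), "C": carry, "N": result >> 7, "V": overflow}
    return result, flags
```

Ten operations fit comfortably in 4 function code lines, which allow up to 16 codes.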

We could add the blocks on the left to every digital machine we design...
This is the start of designing a “general purpose digital machine” - a CPU
4
The SHC01 (see SHC01.pdf)





The minimum to do useful work. It has many areas that can be improved: it only has one accumulator (and a temporary register). It cannot, as it stands, implement subroutines or even indexed memory accesses. It has only 8-bit data and address buses.
Has a PROGRAM ROM where every instruction code (OPCODE) and operand is stored; the program starts at address zero.
Requires 22 control signals emitted in the correct order for everything to work.
Allows up to 16 microinstructions for each OPCODE loaded into the IR (Instruction Register).
See the fetch-execute tables and microcode tables to see how this machine works (the .pdf on the website/handout in class).
5
[Block diagram of the SHC01, showing: DATA BUS (8 bits), ADDRESS BUS (8 bits), ACCA, MDR, ALU with control lines C[2..0], RESULT, IR, PC with increment input i, MAR, PROGRAM ROM, DATA RAM, INPUT BUFFER, OUTPUT REG, and the CONTROL UNIT ROM with its LAT latch. Blocks are marked 'S' where they have a strobe input and 'E' where they have an output enable.]
CONTROL UNIT ROM: 13 ADDRESSES, 22 DATA OUTPUTS, i.e.
{ ACCAS, MDRS, RESULTS, RESULTE, IRS, PCS, PCI, PCE, MARS, MARE, ALU[C2..C0], ROME, RAMS, RAME, INPE, OUTS, LAT[d3..d0] }
Hence the ROM is 2^13 x 22 bits in size.
The control unit ROM outputs signals to:
Strobe data into a register (using the 'S' lines)
Enable outputs from registers or buffers (the 'E' lines)
Control the function of the ALU (C2, C1 and C0)
Increment the PC (the 'I' line)
Supply a 4 bit number to the LAT latch (this causes the ROM to switch to, typically, the next microinstruction)
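One way to picture a 22-bit microinstruction is as a word with one bit per control signal. The sketch below packs the signals named on this slide into an integer. The bit ordering (ACCAS as the most significant bit, LAT d0 as the least) and the helper name `control_word` are assumptions for illustration only; the handout's microcode tables define the real layout.

```python
# Signal names from the slide, in an assumed most-significant-first order.
SIGNALS = [
    "ACCAS", "MDRS", "RESULTS", "RESULTE", "IRS", "PCS", "PCI", "PCE",
    "MARS", "MARE", "C2", "C1", "C0", "ROME", "RAMS", "RAME",
    "INPE", "OUTS", "d3", "d2", "d1", "d0",
]
BIT = {name: len(SIGNALS) - 1 - i for i, name in enumerate(SIGNALS)}


def control_word(*active: str, alu: int = 0, lat: int = 0) -> int:
    """Build one microinstruction: named lines set to 1, plus ALU and LAT fields."""
    word = 0
    for name in active:
        word |= 1 << BIT[name]
    word |= (alu & 0b111) << BIT["C0"]   # 3-bit ALU function field C2..C0
    word |= (lat & 0b1111) << BIT["d0"]  # 4-bit next-step number d3..d0
    return word


ADDRESS_LINES = 13
rom_bits = 2 ** ADDRESS_LINES * len(SIGNALS)  # total size of the control ROM
```

For example, `control_word("PCE", lat=1)` models a step that puts the PC on the address bus and asks the latch to move to step 1.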
6
POWER UP SEQUENCE and fetch-execute of first instruction (assumes immediate ADD)
FETCH-EXECUTE TABLE
{ In the RESULT column, ROUND BRACKETS MEAN “CONTENTS OF” }

Step #  ACTION               RESULT                           COMMENT
0       PCE=1                (PC)->AB                         PUT PROGRAM COUNTER CONTENTS ONTO ADDRESS BUS
1       ROME=1               (ROM)->DB                        READ THE PROGRAM ROM. OPCODE NOW ON DATA BUS
2       PCI=1,PCE=1,IRS=1    (PC)+1->PC, (PC)->AB, (DB)->IR   POINT PC AT OPERAND, AND READ THE ROM; ITS CONTENTS GO INTO THE IR
3       ROME=1               (ROM)->DB                        ADDRESS BUS SETTLES WITH NEW VALUE; THE ADDRESS OF THE OPERAND
4       MDRS=1               (DB)->MDR                        PUT IT IN THE MDR
5       ALU=ADD              ALU=(ACC)+(MDR)                  EXECUTE THE INSTRUCTION
6       RESULTS=1            (ALU)->RESULT
7       RESULTE=1            (RESULT)->DB                     PUT ANSWER ONTO DATA BUS
8       ACCS=1               (DB)->ACC                        AND INTO ACC.
9       PCI=1
10      PCE=1,ROME=1
11      IRS=1,PCI=1,PCE=1

Steps 9 to 11: THESE ARE PART OF THE NEXT FETCH-EXECUTE TABLE
Do examine the 5 page handout carefully – check the microcode tables that implement the above
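The register transfers in the fetch-execute table can be traced with a small Python model. This is a sketch under stated assumptions: the opcode value `0x01` for an immediate ADD is invented for the example, buses and registers are modelled as plain dictionary entries, and timing details (such as the bus settling time) are ignored.

```python
def run_add_immediate(rom, acc=0):
    """Trace steps 0 to 9 of the fetch-execute table for one immediate ADD."""
    s = {"PC": 0, "IR": 0, "MDR": 0, "ACC": acc & 0xFF,
         "RESULT": 0, "AB": 0, "DB": 0}

    s["AB"] = s["PC"]                        # step 0: PCE,   (PC)->AB
    s["DB"] = rom[s["AB"]]                   # step 1: ROME,  (ROM)->DB
    s["IR"] = s["DB"]                        # step 2: IRS,   (DB)->IR
    s["PC"] = (s["PC"] + 1) & 0xFF           #         PCI,   (PC)+1->PC
    s["AB"] = s["PC"]                        #         PCE,   (PC)->AB
    s["DB"] = rom[s["AB"]]                   # step 3: ROME,  (ROM)->DB
    s["MDR"] = s["DB"]                       # step 4: MDRS,  (DB)->MDR
    alu_out = (s["ACC"] + s["MDR"]) & 0xFF   # step 5: ALU=ADD
    s["RESULT"] = alu_out                    # step 6: RESULTS
    s["DB"] = s["RESULT"]                    # step 7: RESULTE
    s["ACC"] = s["DB"]                       # step 8: ACCS,  (DB)->ACC
    s["PC"] = (s["PC"] + 1) & 0xFF           # step 9: PCI, start of the next fetch
    return s


ADD_IMMEDIATE = 0x01                 # hypothetical opcode value
program_rom = [ADD_IMMEDIATE, 5]     # opcode at address 0, operand at address 1
final_state = run_add_immediate(program_rom, acc=3)
```

After the run, the accumulator holds 3 + 5 and the PC points at the next opcode, ready for steps 10 and 11.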
7
Improving the SHC01
1)
Use a REGISTER BANK
[Diagram: REGISTER BANK (8 registers) with write strobe WRS, REG_WRITE_ADDRESS[2..0], REGA_READ_ADDRESS[2..0] and REGB_READ_ADDRESS[2..0]; its A and B outputs feed the ALU (C2, C1, C0), whose RESULT register has S and E lines]
The register bank needs 10 control signals instead of 2, but the control logic can be altered to make this efficient: take bits direct from the IR to the register address lines. Suits larger machines, 16 bits and above.
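The 10 control signals are the WRS strobe plus three 3-bit addresses. A minimal Python sketch of the idea follows; the 16-bit register width and the position of the three address fields in the low 9 bits of the IR are assumptions made for illustration, not the layout of any real machine.

```python
class RegisterBank:
    """Eight registers with one write port and two read ports (A and B)."""

    def __init__(self):
        self.regs = [0] * 8

    def read(self, rega_addr: int, regb_addr: int):
        """Return the values on the A and B outputs."""
        return self.regs[rega_addr & 7], self.regs[regb_addr & 7]

    def write(self, write_addr: int, value: int, wrs: int = 1) -> None:
        """Store value only when the write strobe WRS is active."""
        if wrs:
            self.regs[write_addr & 7] = value & 0xFFFF


def decode_fields(ir: int):
    """Split an assumed IR layout ...|write(3)|readA(3)|readB(3) into its fields."""
    return (ir >> 6) & 7, (ir >> 3) & 7, ir & 7


bank = RegisterBank()
bank.write(1, 20)
bank.write(2, 22)
ir = (3 << 6) | (1 << 3) | 2              # "R3 = R1 + R2" in the assumed layout
dest, src_a, src_b = decode_fields(ir)
a, b = bank.read(src_a, src_b)
bank.write(dest, a + b)                   # the ALU is modelled as a plain add
```

Because the three addresses come straight from IR bits, the control unit ROM only has to supply WRS, which is why the extra signals need not make the ROM wider.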
8
Improving the SHC01
2)
Use a bigger ALU or 3) a secondary ALU
[Diagram: REG bank feeding the ALU (with ALU Function code lines) and a Secondary ALU (e.g. a MULTIPLIER)]
9
Improving the SHC01
4)
Improve memory addressing capability
[Diagram: MAR (lines MARS, MARE) and PC (lines S, I, E) supplying addresses to ROM and RAM]
Extra MAR control lines:
(a) MARI    Increment
(b) MARII   Double Increment
(c) MARD    Decrement
(d) MARDD   Double Decrement
(e) MARCLR  Reset (to access address zero)
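A behavioural sketch of a MAR with these extra lines is given below. The line names come from the slide; the priority order (clear, then strobe, then stepping) and the 8-bit wrap-around are assumptions made for the example.

```python
class MAR:
    """Memory address register with increment, decrement and clear lines."""

    def __init__(self):
        self.value = 0

    def clock(self, *, MARS=0, data=0, MARI=0, MARII=0, MARD=0, MARDD=0, MARCLR=0):
        """Apply one clock edge with the given control lines active."""
        if MARCLR:
            self.value = 0                      # (e) reset to address zero
        elif MARS:
            self.value = data & 0xFF            # normal strobe from the data bus
        else:
            step = MARI + 2 * MARII - MARD - 2 * MARDD   # (a) to (d)
            self.value = (self.value + step) & 0xFF      # assumed 8-bit address


mar = MAR()
mar.clock(MARS=1, data=0x10)
mar.clock(MARII=1)      # double increment, e.g. to skip a two-byte item
after_double = mar.value
mar.clock(MARD=1)
after_decrement = mar.value
```

Stepping the MAR in place avoids a trip through the ALU and data bus each time the program walks through consecutive memory locations.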
10
Improving the SHC01
5)
Add a second MAR
[Diagram: DATA BUS, MAR1, PC (lines S, I, E), MAR2, ADDRESS BUS, ROM and RAM]
If you have a source and destination address in external RAM it makes sense to have two address pointers within the CPU.
MAR2 will need the usual S and E lines; it makes sense to also add others (c.f. previous slide).
11
Improving the SHC01
6)
Add a second ALU – to allow calculated addresses
[Diagram: DATA BUS, MAR1, PC, MAR2, TEMPREG2, a REG and a Secondary ALU (a simple adder), ADDRESS BUS, ROM and RAM]
12
Now to optimise the Control unit.
It currently needs 13 inputs and 22 outputs
If implemented as a large ROM it needs
2^13 * 22 bits = 180,224 bits
13
MICROPROGRAMMING
[Diagram: the CONTROL UNIT ROM CONTAINING MICROCODE is addressed by the STATUS BIT FROM ALU, the INSTRUCTION REGISTER (8 bits) and the 4 BIT LATCH (4 bits, clocked by clk). 18 outputs go to all 'S' and 'E' control signals, to the ALU C2, C1 and C0 control lines, to the AS and BS strobe lines and to the PC increment line (PCI); 4 outputs go to the latch.]
On powerup the IR and LATCH are at zero, so the first address presented at the inputs of the MICROCODE ROM is X 0000-0000 0000 (X is the status bit from the ALU).
The first thing to do is put the PC’s contents onto the address bus.
Next, enable the PROGRAM ROM’s outputs (onto the data bus).
Next, the IR is strobed: the first real opcode is now in the IR and the ROM has a new address … depending on what that opcode is!
The microcode performs a “microjump” to the new microcode.
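How the 13-bit microcode ROM address is formed, and why strobing the IR causes a "microjump", can be sketched as follows. The bit order (status bit first, then the 8 IR bits, then the 4 latch bits) follows the "X 0000-0000 0000" notation on this slide; the function name and the example opcode `0x2A` are made up for illustration.

```python
def microcode_address(status: int, ir: int, latch: int) -> int:
    """Concatenate status (1 bit), IR (8 bits) and latch (4 bits) into 13 bits."""
    return ((status & 1) << 12) | ((ir & 0xFF) << 4) | (latch & 0xF)


# Power-up: IR and latch are zero, so microcode starts at address 0
# (or 0x1000 if the ALU status bit happens to be 1).
powerup_address = microcode_address(0, 0x00, 0)

# Once the IR is strobed with a real opcode, the same latch value selects a
# completely different region of the ROM: this is the microjump.
after_fetch = microcode_address(0, 0x2A, 3)    # hypothetical opcode 0x2A, step 3

highest_address = microcode_address(1, 0xFF, 0xF)
```

Each opcode therefore owns a block of 16 consecutive ROM locations, one per latch value, which matches the limit of 16 microinstructions per opcode.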
14
Improving the CONTROL UNIT of SHC01
1)
Replace LAT with “MICROPROGRAM COUNTER”
[Diagram: as on the previous slide, the INSTRUCTION REGISTER (8 bits), the STATUS BIT FROM ALU and the 4 bit LATCH (4 bits, clocked by clk) address the CONTROL UNIT ROM CONTAINING MICROCODE, but only 2 lines now run from the ROM to the latch]
If we use just the microorders “COUNT” and “RESET” this saves 2 outputs from the control unit, so its new size is 2^13 X 20 (...163,840 bits...)
Actually we can remove the need for “RESET” if we complicate the microcode. Its new size is then 2^13 X 19 (...155,648 bits...)
It is even possible to have “COUNT” as a default option and remove the need for it as well. At this stage the microcode becomes hard to follow, so this step is left until the very end, when a number of obfuscating optimisations can be carried out.
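A microprogram counter driven by just COUNT and RESET might behave as in the sketch below. The class name, the priority of RESET over COUNT and the 4-bit wrap-around are assumptions for the example.

```python
class MicroprogramCounter:
    """4-bit step counter that replaces the LAT latch."""

    def __init__(self):
        self.value = 0

    def clock(self, count: int = 0, reset: int = 0) -> None:
        """One clock edge: RESET returns to step 0, COUNT moves to the next step."""
        if reset:
            self.value = 0
        elif count:
            self.value = (self.value + 1) & 0xF


mpc = MicroprogramCounter()
for _ in range(3):
    mpc.clock(count=1)      # steps 0 -> 1 -> 2 -> 3
step_after_three = mpc.value
mpc.clock(reset=1)          # last microinstruction of an opcode
step_after_reset = mpc.value
```

The trade-off is that microcode can no longer jump to an arbitrary step: it can only move on by one or start again, which is usually all a simple fetch-execute sequence needs.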
15
Improving the CONTROL UNIT of SHC01
2)
Look for redundancy in the control signals - PCE/MARE
[Diagram: PC and MAR, with the PCE line shown]
Drop MARE and use an inverter wired to PCE, since we see that PCE and MARE are never '1' at the same time and it does no harm to have one of these at '1' all the time (“00” is not used).
This saves an output; the CU ROM is now 2^13 X 18 (...147,456 bits...)
3)
Look for redundancy in the control signals – mutually exclusive 'S' lines
It so happens that we never activate more than one S line at a time, so we can use a decoder. There are times when no S lines are active, so it is convenient to use a 3:8 decoder and provide 7 S lines with a 3 bit number emitted from the control unit ROM.
CU ROM is 2^13 X 14 (...114,688 bits...)
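Both tricks are small pieces of combinational logic and can be modelled directly. In this sketch the function names are hypothetical, and reserving decoder code 0 for "no strobe active" is an assumption about how the unused eighth output would be handled.

```python
def decode_strobes(code: int):
    """3:8 decoder: code 0 raises no S line, codes 1..7 raise exactly one of 7."""
    outputs = [1 if code == n else 0 for n in range(8)]
    return outputs[1:]      # decoder output 0 is left unconnected


def address_enables(pce: int):
    """MARE is no longer a ROM output: it is PCE passed through an inverter."""
    return {"PCE": pce, "MARE": 1 - pce}


idle = decode_strobes(0)
third_line = decode_strobes(3)
```

Seven separate S outputs shrink to a 3-bit field (saving 4 ROM columns) and MARE disappears altogether (saving 1 more).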
16
Improving the CONTROL UNIT of SHC01
4)
Look for redundancy in the control signals - NANOMEMORY
[Diagram: the INSTRUCTION REGISTER (8 bits), the 4 BIT LATCH (4 bits, clocked by clk) and the STATUS BIT FROM ALU address the CONTROL UNIT ROM CONTAINING lookup number of MICROCODE; its 7-bit output addresses the NANOMEMORY (labelled "5 inputs and 24 outputs" on the slide); a further line is labelled 12]
Although the CU ROM could output many different patterns, if we analyse the complete set of microcode we might discover, for example, that we only need 100 different emissions. Hence we use a “LOOKUP TABLE” to generate these. The CU ROM outputs a number between 0 and 99 and the NANOMEMORY emits the required wide microinstruction.
CU ROM is 2^13 X 7 = 57,344 and NANOMEMORY is 2^7 X 14 = 1,792, giving a total of
(...59,136 bits...)
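The two-level lookup can be illustrated by splitting a table of wide microinstructions into a narrow index ROM plus a table of unique patterns. This is a generic sketch of the idea, not the SHC01's actual microcode: the function names and the tiny 3-bit example words are invented for the example.

```python
def split_into_nanomemory(microcode):
    """Return (cu_rom, nanomemory): indices per address, and the unique words."""
    nanomemory = []
    index_of = {}
    cu_rom = []
    for word in microcode:
        if word not in index_of:            # first time this pattern is seen
            index_of[word] = len(nanomemory)
            nanomemory.append(word)
        cu_rom.append(index_of[word])       # store the short lookup number
    return cu_rom, nanomemory


def emitted_word(cu_rom, nanomemory, address):
    """What the control signals see: CU ROM output used as a nanomemory address."""
    return nanomemory[cu_rom[address]]


def index_width(unique_patterns: int) -> int:
    """Number of CU ROM output bits needed to select one of the patterns."""
    return max(1, (unique_patterns - 1).bit_length())


wide_microcode = [0b101, 0b011, 0b101, 0b000, 0b011]   # toy 3-bit control words
cu_rom, nanomemory = split_into_nanomemory(wide_microcode)
```

With 100 distinct emissions the lookup number needs 7 bits, which is why the CU ROM narrows to 7 outputs while the nanomemory has only 2^7 rows.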
17
Improving the CONTROL UNIT of SHC01
5)
Only provide the opcodes actually wanted – probably less than 254
Although the CU ROM could provide many different opcodes, such a simple architecture may only need 50 or so opcodes. We can keep IR7 and IR6 low all the time, and hence only apply 6 bits to the ROM from the IR.
[Diagram: as on the previous slide, but only 6 bits run from the INSTRUCTION REGISTER to the CONTROL UNIT ROM CONTAINING lookup number of MICROCODE; the 4 BIT LATCH (clocked by clk), the STATUS BIT FROM ALU, the 7-bit link to the NANOMEMORY (labelled "5 inputs and 24 outputs") and the line labelled 12 are unchanged]
CU ROM is 2^11 X 7 = 14,336 and NANOMEMORY is 2^7 X 14 = 1,792, giving a total of
(...16,128 bits...)
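The ROM sizes quoted across the control unit improvements are all of the form 2^(address lines) x (data outputs), so they can be recomputed with a short Python check. The step labels are paraphrased from the slides; the helper name `rom_bits` is an assumption.

```python
def rom_bits(address_lines: int, data_outputs: int) -> int:
    """Size in bits of a ROM with the given number of address and data lines."""
    return 2 ** address_lines * data_outputs


single_rom_steps = [
    ("original control unit", 13, 22),
    ("microprogram counter (COUNT and RESET)", 13, 20),
    ("RESET removed", 13, 19),
    ("MARE replaced by inverted PCE", 13, 18),
    ("7 S lines replaced by 3:8 decoder", 13, 14),
]
sizes = {label: rom_bits(a, d) for label, a, d in single_rom_steps}

nanomemory_bits = rom_bits(7, 14)
with_nanomemory = rom_bits(13, 7) + nanomemory_bits      # 8 IR bits used
with_six_ir_bits = rom_bits(11, 7) + nanomemory_bits     # only 6 IR bits used

for label, bits in sizes.items():
    print(f"{label}: {bits:,} bits")
print(f"CU ROM + nanomemory: {with_nanomemory:,} bits")
print(f"CU ROM (6 IR bits) + nanomemory: {with_six_ir_bits:,} bits")
```

Removing an address line halves the ROM, while removing an output only trims one column, which is why dropping IR7 and IR6 gives the largest single saving.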
18
Improving the CONTROL UNIT of SHC01
6)
Use fields in the IR to drive control signals directly
[Diagram: a 2-bit field of the INSTRUCTION REGISTER bypasses the CONTROL UNIT ROM and goes directly to, e.g., the ALU fn or REG bank addresses]
Although more common in bigger machines (e.g. 16 bits), we can divide the IR into fields and “wire” them directly to parts of the CPU, bypassing the CU and saving space there.
If a field in the IR is used as a “MODE” field it can drive multiplexors and switches to route the other IR fields to different parts of the CPU.
This is used in, for example, the PDP11 to allow fields to be used to drive the ALU or the ADDRESS calculation sections.
At this point the architecture (and microcode) become complicated - and beyond the course!
19
Summary

Be able to sketch a typical CPU
Be able to sketch a typical CONTROL UNIT
Be able to work out FETCH-EXECUTE tables for simple (explained) instructions
Be able to write out a MICROCODE table, including whatever steps are required at powerup to get the machine going
Be able to suggest architectural improvements to the CPU
Be able to sketch CONTROL UNIT improvements and calculate the resulting savings in ROM sizes.
20